Asynchronous HTTP Interface

The asynchronous HTTP interface is a non-blocking HTTP API for transcribing long audio into text.

To use this API, follow these steps:

  1. Create a speech recognition job
  2. Poll to check the status of the speech recognition job and retrieve the results

Creating the job in step 1 works almost the same way as a request to the synchronous HTTP interface. The only difference is how logging options are specified.

Endpoint

Unlike the synchronous HTTP interface, the base endpoint is the same regardless of whether logging is enabled or not.

https://acp-api-async.amivoice.com/v1/recognitions

1. Creating a Job

Endpoint:

POST https://acp-api-async.amivoice.com/v1/recognitions

Request

Requests are sent in the same way as for the synchronous HTTP interface. For details, please see Request in the synchronous HTTP interface reference.

About d parameters

The d parameters of the synchronous HTTP interface can also be set here. The following table lists the parameters that are valid only for the asynchronous HTTP interface.

Parameter Name | Value | Description
loggingOptOut | True or False | Specifies whether logging is disabled. When set to True, the system does not save logs during the session. The default is False.
contentId | Any string | Any user-defined string. It is included in the status and result responses during that session. Not set by default.
compatibleWithSync | True or False | Result format compatibility. When set to True, results are formatted in a way compatible with the synchronous HTTP interface. The default is False.
speakerDiarization | True or False | Enables speaker diarization. The default is False.
diarizationMinSpeaker | int | Minimum estimated number of speakers in the audio. Valid only when speaker diarization is enabled. Must be 1 or higher. The default is 1.
diarizationMaxSpeaker | int | Maximum estimated number of speakers in the audio. Valid only when speaker diarization is enabled. Must be greater than or equal to diarizationMinSpeaker. The default is 10.
sentimentAnalysis | True or False | Enables sentiment analysis. The default is False.
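
As a minimal sketch of how these options could be assembled, the helper below builds a d parameter string and checks the diarization constraints from the table. It assumes the d value is a space-separated list of key=value pairs with URL-encoded values, as described in the synchronous HTTP interface reference; build_d_parameter and the sample contentId are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def build_d_parameter(options):
    """Join options into a d string (assumed format: space-separated key=value)."""
    min_spk = options.get("diarizationMinSpeaker")
    max_spk = options.get("diarizationMaxSpeaker")
    # Constraints taken from the parameter table.
    if min_spk is not None and min_spk < 1:
        raise ValueError("diarizationMinSpeaker must be 1 or higher")
    if max_spk is not None and max_spk < (min_spk if min_spk is not None else 1):
        raise ValueError("diarizationMaxSpeaker must be >= diarizationMinSpeaker")

    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = "True" if value else "False"
        # URL-encode the value so spaces cannot break the key=value list.
        parts.append(f"{key}={quote(str(value), safe='')}")
    return " ".join(parts)


d = build_d_parameter({
    "loggingOptOut": True,
    "contentId": "meeting 2024-01",  # made-up value
    "speakerDiarization": True,
    "diarizationMinSpeaker": 2,
    "diarizationMaxSpeaker": 4,
})
print(d)
```

Validating the speaker counts on the client side is a design choice in this sketch; it surfaces an invalid combination before any audio is uploaded.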

Response

The successful response is in JSON format and contains the following values:

Key | Description
sessionid | Job ID for the user's speech recognition request.
text | Always returns "...".

Example

{"sessionid":"017ac8786c5b0a0504399999","text":"..."}

The failure response is in JSON format and contains the following values:

Key | Description
results | Array (1 element)
results[].tokens | Array (empty)
results[].tags | Array (empty)
results[].rulename | String (empty)
results[].text | String (empty)
text | String (empty)
code | Single-character code representing the result. Please see Response Codes and Messages.
message | String describing the error content. Please see Response Codes and Messages.

Example

{
"results": [{ "tokens": [], "tags": [], "rulename": "", "text": "" }],
"text": "",
"code": "-",
"message": "received illegal service authorization"
}
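
The two response shapes can be told apart by the presence of sessionid. The following is a small sketch under that assumption; parse_create_response is a hypothetical helper name, and the JSON strings are the examples shown in this section.

```python
import json


def parse_create_response(body):
    """Return the job ID from a create-job response, or raise on failure."""
    data = json.loads(body)
    if "sessionid" in data:
        return data["sessionid"]
    # Failure responses carry a result code and a message instead of a job ID.
    raise RuntimeError(
        f"job creation failed (code={data.get('code')!r}): {data.get('message')}"
    )


ok_body = '{"sessionid":"017ac8786c5b0a0504399999","text":"..."}'
print(parse_create_response(ok_body))
```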

2. Checking Job Status, Retrieving Results

Endpoint:

GET https://acp-api-async.amivoice.com/v1/recognitions/{session_id}

Request

Request Parameters

Parameter Name | Required | Transmission Method | Description
session_id | Yes | Path parameter | Specify the ID obtained in the response when creating the job.

Authentication

Please specify the APPKEY in the Authorization header.

Authorization: Bearer {APPKEY}
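
As an illustration, the status request can be prepared with Python's standard urllib.request module. This sketch only builds the request object and does not send it; build_status_request and the placeholder APPKEY are hypothetical.

```python
import urllib.request

BASE_URL = "https://acp-api-async.amivoice.com/v1/recognitions"


def build_status_request(session_id, appkey):
    """Build a GET request for the job status with a Bearer Authorization header."""
    url = f"{BASE_URL}/{session_id}"
    headers = {"Authorization": f"Bearer {appkey}"}
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_status_request("017ac8786c5b0a0504399999", "YOUR_APPKEY")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would perform the actual call.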

Response

If the request is successful, it returns the status of the speech recognition request and associated information. If it fails, it returns an HTTP response code other than 200. The status in case of success takes one of the following 5 values:

status | Description
queued | The job is registered in the queue.
started | The job has been taken out of the queue and the execution environment is being prepared.
processing | The job is being executed.
completed | Results have been obtained from the speech recognition process. If code is an empty string, meaning the speech recognition was successful, the results are included in segments.
error | An error occurred when trying to execute the job or during job execution. The error details are stored in error_message.
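
Because completed and error are the only states in which the job has finished, a polling loop can stop on either of them. The sketch below assumes a caller-supplied fetch_status function that returns the decoded JSON body; the 10-second interval and the limit of 360 polls are arbitrary illustrative values, not recommendations from this reference.

```python
import time

TERMINAL_STATES = {"completed", "error"}


def wait_for_job(fetch_status, interval_sec=10.0, max_polls=360, sleep=time.sleep):
    """Call fetch_status() until the job reaches completed or error."""
    for attempt in range(max_polls):
        body = fetch_status()
        if body["status"] in TERMINAL_STATES:
            return body
        # queued, started, processing: wait before asking again.
        if attempt < max_polls - 1:
            sleep(interval_sec)
    raise TimeoutError("job did not finish within the polling budget")


# Simulated sequence of responses, for illustration only.
responses = iter([
    {"status": "queued"},
    {"status": "started"},
    {"status": "processing"},
    {"status": "completed", "segments": []},
])
result = wait_for_job(lambda: next(responses), sleep=lambda _: None)
print(result["status"])
```

Injecting fetch_status and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.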

Information Included in the Response

The following table summarizes the information included in each state: queued, started, processing, completed, and error. The columns q, s, p, c, and e stand for the initial letter of each state. A means the key is always included. O means the key is included if that information is available.

Key | qspce | Description
status | AAAAA | Job status. One of: queued, started, processing, completed, error.
audio_md5 | AAO | MD5 checksum value of the received audio file.
audio_size | AAO | Size of the received audio file.
content_id | OOOOO | Value of contentId set by the user at the time of request.
service_id | AAAA | User name ID.
segments | A | Results of the speech recognition process. Speech recognition results per utterance.
utteranceid | A | Results of the speech recognition process. Recognition result information ID.
text | A | Results of the speech recognition process. Overall recognition result text combining all "recognition results for speech segments".
code | A | Results of the speech recognition process. Single-character code representing the result.
message | A | Results of the speech recognition process. String describing the error content.
error_message | A | String describing the error content.

Details of the completed (speech recognition result) response

When the status is completed, it returns the speech recognition results in JSON format. Unlike the synchronous HTTP interface, the recognition results are placed under segments on a per-utterance basis.

Key | Description
segments | Speech recognition results per utterance
segments[].results | Array of "recognition results for speech segments"
segments[].results[].confidence | Confidence (value between 0 and 1. 0: low confidence, 1: high confidence)
segments[].results[].starttime | Speech start time (0 is the beginning of the audio data)
segments[].results[].endtime | Speech end time (0 is the beginning of the audio data)
segments[].results[].tags | Unused (empty array)
segments[].results[].rulename | Unused (empty string)
segments[].results[].text | Recognition result text
segments[].results[].tokens | Array of morphemes of the recognition result text
segments[].results[].tokens[].written | Notation of the morpheme (word)
segments[].results[].tokens[].confidence | Confidence of the morpheme (likelihood of the recognition result)
segments[].results[].tokens[].starttime | Start time of the morpheme (0 is the beginning of the audio data)
segments[].results[].tokens[].endtime | End time of the morpheme (0 is the beginning of the audio data)
segments[].results[].tokens[].spoken | Reading of the morpheme
segments[].results[].tokens[].label | Speaker label (speaker0, speaker1, ...). Included in the result only when speakerDiarization=True is specified at the time of request.
utteranceid | Recognition result information ID
text | Overall recognition result text combining all "recognition results for speech segments"
code | Single-character code representing the result. Please see Response Codes and Messages.
message | String describing the error content. Please see Response Codes and Messages.
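
To show how the per-utterance layout might be consumed, the sketch below walks segments, then results, then tokens, and collects one row per recognition result. The nesting is assumed from the key list in this section, the sample body is made-up illustrative data rather than real API output, and using the first token's label as the speaker of the whole utterance is a simplification.

```python
def list_utterances(body):
    """Return (starttime, endtime, speaker, text) for each recognition result."""
    rows = []
    for segment in body.get("segments", []):
        for result in segment.get("results", []):
            # label is only present when speakerDiarization=True was requested.
            labels = [t["label"] for t in result.get("tokens", []) if "label" in t]
            speaker = labels[0] if labels else None
            rows.append((result["starttime"], result["endtime"], speaker, result["text"]))
    return rows


# Made-up sample body, for illustration only.
sample = {
    "status": "completed",
    "code": "",
    "segments": [
        {"results": [{
            "confidence": 0.9, "starttime": 250, "endtime": 1200,
            "tags": [], "rulename": "", "text": "hello",
            "tokens": [{"written": "hello", "confidence": 0.9,
                        "starttime": 250, "endtime": 1200,
                        "spoken": "hello", "label": "speaker0"}],
        }]},
    ],
}
for row in list_utterances(sample):
    print(row)
```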

Details of the error response

This is an example of a response when speech recognition processing fails. For each value, please see Information Included in the Response.

Example

{
"status": "error",
"audio_md5":"40f59fe5fc7745c33b33af44be43f6ad",
"audio_size":306980,
"service_id":"{YOUR_SERVICE_ID}",
"session_id":"017c25ec12c00a304474a999",
"error_message": "ERROR: Failed to transcribe in recognition process - amineth_result=0, amineth_code='o', amineth_message='recognition result is rejected because confidence is below the threshold'"
}
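
A finished job can therefore end in three ways: status error, status completed with a non-empty code, or status completed with an empty code. The following sketch maps a terminal response to a success flag and a detail string; final_outcome is a hypothetical helper, and the error body is abbreviated from the example above.

```python
def final_outcome(body):
    """Map a terminal status response to (ok, detail)."""
    status = body["status"]
    if status == "error":
        return False, body.get("error_message", "")
    if status == "completed":
        # An empty code means the speech recognition was successful.
        if body.get("code", "") == "":
            return True, body.get("text", "")
        return False, body.get("message", "")
    raise ValueError(f"job is not finished yet: {status}")


error_body = {
    "status": "error",
    "session_id": "017c25ec12c00a304474a999",
    "error_message": "ERROR: Failed to transcribe in recognition process",
}
print(final_outcome(error_body))
```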

Error Response

If the call to the endpoint for checking job status and retrieving results fails, it returns an HTTP status code other than 200. The response body is JSON data containing the following information:

Parameter Name | Description
errorCode | Response code.
errorMessage | Error message.

Example:

{
"errorCode":401,
"errorMessage":"Invalid authorization header format"
}

Status Codes and Error Messages

The status codes and error messages in case of call failure are as follows:

HTTP Status Code | Error Message | Description
401 | No app_key | APPKEY is not set.
401 | No authorization header | Authorization header is not set.
401 | Invalid authorization header format | Authorization header format is invalid.
401 | Failed to authorize for the app_key | Authentication with the specified APPKEY failed.
404 | Specified session_id is not found | The job with the specified session_id was not found. If this error occurs even though the session_id is correct, the possible causes are that the data retention period has passed, or that a user other than the one who made the request is trying to retrieve the job status or results.
500 | - | Internal error. Please contact us.
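
As a sketch of how a client might surface these failures, the helper below turns an HTTP status code and response body into a readable message. describe_api_error and the hint texts are hypothetical; only errorCode, errorMessage, and the listed status codes come from this reference.

```python
import json


def describe_api_error(http_status, body):
    """Turn a failed status-check response into a readable message."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        data = {}
    if not isinstance(data, dict):
        data = {}
    message = data.get("errorMessage", "")
    if http_status == 401:
        hint = "check the APPKEY and the Authorization header"
    elif http_status == 404:
        hint = "check the session_id, the data retention period, and which user created the job"
    else:
        hint = "internal or unexpected error"
    return f"HTTP {http_status}: {message} ({hint})"


example = '{"errorCode":401,"errorMessage":"Invalid authorization header format"}'
print(describe_api_error(401, example))
```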