Response Codes and Messages

This section explains the responses when processing fails.

HTTP Interface

Synchronous HTTP and Asynchronous HTTP Speech Recognition Job Creation Request

When a speech recognition request fails, the response code and an error message indicating the cause are set in the top-level code and message fields of the speech recognition result. For details, see the table in Response Codes and Messages Details.

Example:

{
  "results": [
    {
      "tokens": [],
      "tags": [],
      "rulename": "",
      "text": ""
    }
  ],
  "text": "",
  "code": "-",
  "message": "received illegal service authorization"
}

For synchronous HTTP, when recognition processing is successful, code and message will be empty strings ("").

Example:

...
"utteranceid": "20220602/14/018122d65d370a30116494c8_20220602_141442",
"text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
"code": "",
"message": ""
...
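As a minimal sketch of the rule above, a client can treat a synchronous HTTP response as failed whenever the top-level code is not an empty string. The function name recognition_failed is hypothetical, and the sketch assumes the response body is the JSON shown in the examples.

```python
import json


def recognition_failed(response_text):
    """Return (failed, code, message) for a synchronous HTTP response body.

    Assumption: the body is JSON with top-level "code" and "message" fields,
    and both are empty strings when recognition succeeded.
    """
    result = json.loads(response_text)
    code = result.get("code", "")
    message = result.get("message", "")
    return code != "", code, message
```

A caller would typically log code and message together, because the code alone is a single character.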

For asynchronous HTTP, when the job creation request is successful, code and message are not returned.

Example:

{"sessionid":"017ac8786c5b0a0504399999","text":"..."}
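Because a successful job creation response carries no code or message, one way to handle it is to look for a non-empty code first and only then read sessionid. This is a sketch under that assumption; parse_job_creation is a hypothetical helper, not part of any AmiVoice library.

```python
import json


def parse_job_creation(response_text):
    """Return the sessionid from an asynchronous job creation response.

    Assumption: a failed request carries a non-empty top-level "code",
    while a successful one omits "code" and "message" entirely.
    """
    body = json.loads(response_text)
    code = body.get("code")
    if code:
        message = body.get("message", "")
        raise RuntimeError(f"job creation failed: code={code!r}, message={message!r}")
    return body["sessionid"]
```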

Asynchronous HTTP Result and Status Retrieval

For asynchronous HTTP, even if the speech recognition request succeeds, processing may be interrupted while the speech recognition job is running. In this case, when you retrieve the result and status, status is error and error_message contains a message indicating the cause of the failure. error_message may include a response code (amineth_code) and an output message (amineth_message) from the speech recognition process. For the meaning of response codes and messages, see the table in Response Codes and Messages Details.

Example:

{
  "status": "error",
  "audio_md5": "40f59fe5fc7745c33b33af44be43f6ad",
  "audio_size": 306980,
  "service_id": "{YOUR_SERVICE_ID}",
  "session_id": "017c25ec12c00a304474a999",
  "error_message": "ERROR: Failed to transcribe in recognition process - amineth_result=0, amineth_code='o', amineth_message='recognition result is rejected because confidence is below the threshold'"
}
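The response code is embedded in the error_message string, so a client has to extract it before it can look it up in the table. The sketch below uses a regular expression whose pattern is inferred only from the single example above. The exact format of error_message is an assumption, and extract_amineth is a hypothetical name.

```python
import re

# Assumed layout, taken from the example: amineth_code='o', amineth_message='...'
_AMINETH_PATTERN = re.compile(
    r"amineth_code='(?P<code>[^']*)',\s*amineth_message='(?P<message>[^']*)'"
)


def extract_amineth(error_message):
    """Return (code, message) if both fields are present, otherwise None."""
    match = _AMINETH_PATTERN.search(error_message)
    if match is None:
        return None
    return match.group("code"), match.group("message")
```

Since error_message only "may include" these fields, the None case should be handled as an error with an unknown cause.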

Client Errors

When the response code in the Response Codes and Messages Details table is >, o, +, -, or %, it is a client error. Please note that if the cause of the error is not resolved, retrying will yield the same result.

Other errors are likely to be infrastructure issues, so please wait for a while and then retry.
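The retry guidance above can be sketched as a small decision function. The set of client error codes is copied from the text. The name should_retry and the choice to treat an empty code as success are assumptions made for illustration.

```python
# Client error codes as listed in this section.
CLIENT_ERROR_CODES = {">", "o", "+", "-", "%"}


def should_retry(code):
    """Return True if waiting and retrying is a reasonable reaction to code."""
    if code == "":
        return False  # empty code means success, nothing to retry
    # Client errors repeat until the cause is fixed; other codes are
    # likely infrastructure issues and may succeed later.
    return code not in CLIENT_ERROR_CODES
```

For example, should_retry("b") is True (server busy), while should_retry("-") is False (invalid APPKEY).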

Audio Data Transmission Failure (Response Code >)

Cause

This error occurs in the following cases:

  • When the request sent via the synchronous or asynchronous HTTP interface does not contain audio data

Countermeasure

If this error occurs, please check the following:

  • Whether audio data is being sent
  • Whether the sent audio is not an empty file (zero-byte file)
  • Whether the body of the sent container format contains audio data

Even if audio data is sent, if it contains no speech, o is returned, as described below.
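The first two checks can be made before the request is sent. The following sketch only verifies that the file exists and is not zero bytes. It does not inspect the container body, and audio_file_is_sendable is a hypothetical helper name.

```python
import os


def audio_file_is_sendable(path):
    """Return True if path is an existing regular file with at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```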

Reject (Response Code o)

Cause

This response is returned in the following cases:

  • No speech could be detected in the audio.
  • Speech was detected, but the confidence of the recognition result was below the confidence threshold.
  • Speech was detected, but the recognition result contained no characters that could be output (see the second note below).

note

Regardless of which of the above causes applies, the message is always "recognition result is rejected because confidence is below the threshold".

note

There are no characters that can be output in the speech recognition result in the following cases:

  • All audio was recognized as fillers like "あー" or "えーと" and automatically deleted
  • All audio was estimated to be noise

However, when keepFillerToken is set to 1, fillers will be output.

AmiVoice API transcribes audio with a two-stage pipeline: speech detection followed by speech recognition. If no speech is detected, speech recognition is not performed. The speech detection stage uses a deep learning model, not just volume, to judge whether a sound is a human voice. Even so, audio that passes detection may ultimately be estimated as noise during speech recognition.

Countermeasure

If this error occurs, please check the following:

Whether the sent audio data contains speech

Please check the sent audio data. For example, speech cannot be detected from the following audio data. There may be a program bug or a problem with the recording system.

  • Silence
  • Contains only noise
  • For stereo audio, speech is only in the second channel (only the first channel of multi-channel audio is the target of recognition. Please see Stereo in the audio format)

Also, speech may not be detected from unclear and low-volume audio, such as when recording very far from the sound source.
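For WAV files with a header, the stereo case can be spotted before sending by reading the channel count. This is a sketch using Python's standard wave module. It assumes a WAV container, and wav_channel_count is a hypothetical name.

```python
import wave


def wav_channel_count(wav_file):
    """Return the number of channels of a WAV file (path or binary file object)."""
    with wave.open(wav_file, "rb") as reader:
        return reader.getnchannels()
```

If the count is greater than 1, it is worth confirming that the speech is in the first channel, since only that channel is recognized.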

Whether an incorrect audio format is specified in the request

When sending audio without headers, if an incorrect audio format is specified at the time of request, speech detection and speech recognition processing will not be performed correctly. Please check if the audio format in the request is correctly set for the sent audio data.

When speaking correctly using rule grammar

(If you are not using rule grammar, you can ignore this case.) If you receive this error even though you are speaking correctly, you can lower the confidence threshold. However, lowering the confidence threshold makes it more likely that incorrect utterances are accepted.

tip

In some cases, such as when the actually sent audio does not contain speech, it may not be necessary to consider o as an error.
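One way to apply this tip is to map o to an empty transcript instead of raising an error. The sketch below assumes a parsed synchronous HTTP result (a dict with top-level code, message, and text). The name transcript_or_empty is hypothetical.

```python
def transcript_or_empty(result):
    """Return the transcript, treating a reject (code "o") as empty text."""
    code = result.get("code", "")
    if code == "":
        return result.get("text", "")
    if code == "o":
        # Reject: no speech, low confidence, or nothing to output.
        return ""
    raise RuntimeError(
        f"recognition failed: code={code!r}, message={result.get('message', '')!r}"
    )
```

Whether this is appropriate depends on the application. If silence is unexpected, o should still be reported.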

Response Codes and Messages Details

code | message | Description
+ | received unsupported audio format | Received audio data in an unsupported audio data format
- | received illegal service authorization | Received an invalid APPKEY (service authentication key string)
! | failed to connect to recognizer server | Communication failure within the speech recognition server (failed to connect to DSRM or DSRS)
> | failed to send audio data to recognizer server | Communication failure within the speech recognition server (failed to send audio data to DSRS). Please also see Audio Data Transmission Failure.
< | failed to receive recognition result from recognizer server | Communication failure within the speech recognition server (failed to receive recognition results from DSRS)
# | received invalid recognition result from recognizer server | Communication failure within the speech recognition server (invalid format of recognition results received from DSRS)
$ | timeout occurred while receiving audio data from client | A no-communication timeout occurred while receiving audio data from the client
% | received too large audio data from client | The number of bytes of audio data received from the client is too large (does not occur with the WebSocket interface)
o | recognition result is rejected because confidence is below the threshold | Recognition failed because the overall confidence of the recognition result was below the confidence threshold. This error is also returned when no speech could be detected in the entire received audio data, or when all results were fillers and there were no recognition results to return. Please also see Reject.
b | recognition result is rejected because recognizer server is busy | Recognition failed because the speech recognition server is busy
x | recognition result is rejected because grammar files are not loaded | Recognition failed because the dictionary is not loaded
c | recognition result is rejected because the recognition process is cancelled | Recognition failed because a recognition process interruption request was made
? | recognition result is rejected because fatal error occurred in recognizer server | Recognition failed because a fatal error occurred during recognition on the speech recognition server
^ | invalid parameter (...) | An invalid parameter was specified. Only for the Asynchronous HTTP Interface.