Asynchronous HTTP Interface
The asynchronous HTTP interface is a non-blocking HTTP API for transcribing long audio files.
To use this API, follow these steps:
- Create a speech recognition job
- Poll to check the status of the speech recognition job and retrieve the results
About the New Version
We released a new version of the asynchronous HTTP interface (referred to as "v2") on July 15, 2025, which significantly speeds up speech recognition processing compared to the previous version. The recognition accuracy is the same as the conventional asynchronous HTTP interface (v1). We recommend using v2 unless there is a specific reason to use v1.
- The conventional v1 is scheduled to be discontinued around July 31, 2026. If you are currently using it, please migrate to v2. (Note that the end date may change.)
- Currently, sentiment analysis functionality is not available in v2. If you need to use sentiment analysis, please use v1.
Features
The main features of v2 are as follows. Previously, the entire audio was processed serially, so the processing time was proportional to the length of the audio. In v2, the audio is divided into segments of a certain length, considering the boundaries of utterances, and processed in parallel, completing the process faster than before.
| Item | Conventional (v1) | New (v2) |
|---|---|---|
| Processing method | Serial processing of entire audio | Parallel processing by dividing into segments of certain length (considering utterance boundaries) |
| Approximate processing time | Proportional to audio length (about 0.3-1 times) | Almost constant time for audio data up to about 90 minutes (e.g., 90 minutes of audio ≈ about 5-6 minutes) |
| Speaker diarization | Proportional to audio length (about 5%) | Proportional to audio length (about 5%) |
- If speaker diarization is performed, it will take about the same amount of time as before, in addition to the speech recognition processing.
- Currently, the speech recognition processing time is almost constant for audio data up to about 90 minutes in length, and gradually increases for longer data. However, processing time may be longer depending on server congestion and other factors.
- For short audio data, the processing time may not be faster, or in some cases, may be slightly slower.
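As a rough illustration of the v1 figures in the table above, the following sketch turns "about 0.3-1 times the audio length" and "about 5% for speaker diarization" into a time range. The function name is hypothetical, and the result is only an approximation: actual times vary with server congestion, and the function does not model v2.

```python
from typing import Tuple


def estimate_v1_minutes(audio_minutes: float, diarization: bool = False) -> Tuple[float, float]:
    """Return a rough (low, high) processing-time range in minutes for v1.

    Uses the approximate ratios from the comparison table:
    recognition takes about 0.3-1 times the audio length, and
    speaker diarization adds about 5% of the audio length.
    """
    low = 0.3 * audio_minutes
    high = 1.0 * audio_minutes
    if diarization:
        extra = 0.05 * audio_minutes
        low += extra
        high += extra
    return low, high
```

For 60 minutes of audio with speaker diarization, this gives a range of roughly 21 to 63 minutes for v1.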
How to Use
Only the endpoint changes; everything else remains the same as v1.
(v1) https://acp-api-async.amivoice.com/v1/recognitions
(v2) https://acp-api-async.amivoice.com/v2/recognitions
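Because only the version segment of the URL differs, an application can keep the version in one place. The sketch below is a minimal helper under that assumption; the names `recognitions_endpoint` and `job_url` are hypothetical and not part of the API.

```python
BASE_URL = "https://acp-api-async.amivoice.com"


def recognitions_endpoint(version: str = "v2") -> str:
    """Return the job-creation endpoint for the given interface version."""
    if version not in ("v1", "v2"):
        raise ValueError(f"unsupported version: {version!r}")
    return f"{BASE_URL}/{version}/recognitions"


def job_url(session_id: str, version: str = "v2") -> str:
    """Return the URL used to check the status of a job."""
    return f"{recognitions_endpoint(version)}/{session_id}"
```

Switching an existing v1 client to v2 then amounts to changing the single `version` argument.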
Differences from the Conventional Version
In v2, responses to the following requests have changed:
- a. If there were no speech segments in the submitted audio data, v1 would succeed in job registration and return a result indicating no results after processing, but v2 will return a failure at the time of job registration.
- b. When requesting unsupported audio files or setting invalid parameters for speaker diarization, the error response will now also include "text" and "utteranceid".
Example of case b: v1
{
"results": [{ "tokens": [], "tags": [], "rulename": "", "text": "" }],
"code": "+",
"message": "received unsupported audio format"
}
v2
{
"text": "",
"utteranceid":"test_data",
"results": [{"tokens": [], "tags": [], "rulename": "", "text": ""}],
"code": "+",
"message": "received unsupported audio format"
}
How to Use
1. Create a speech recognition job
To create a job, set the request parameters in the same way as the synchronous HTTP interface, and send the request to the asynchronous HTTP interface endpoint.
For v2 (new version)
POST https://acp-api-async.amivoice.com/v2/recognitions
For v1 (conventional version)
POST https://acp-api-async.amivoice.com/v1/recognitions
The following explanation is primarily for v2. If you're using v1, please replace the endpoint accordingly. For example, to send a speech recognition request for the test.wav file with logging disabled using the curl command:
curl https://acp-api-async.amivoice.com/v2/recognitions \
-F u={APP_KEY} \
-F d="grammarFileNames=-a-general loggingOptOut=True" \
-F a=@test.wav
While the endpoint differs from the synchronous HTTP interface, the method for setting request parameters is the same.
Some parameters, such as sentiment analysis, are only supported by the asynchronous HTTP interface.
Unlike the synchronous HTTP interface, logging or no logging is specified by request parameters, not by the endpoint.
Logging is enabled by default. For no logging, specify loggingOptOut=True in the d parameter.
In case of success
The successful response includes a sessionid. This is the job ID for the user's speech recognition request and is used to check the job status and obtain results.
The text field is always "...".
Example
{"sessionid":"017ac8786c5b0a0504399999","text":"..."}
In case of failure
The failed response does not include a sessionid. You can determine the cause of failure from the code and message.
Please see Response codes and messages.
Example
{
"results": [{ "tokens": [], "tags": [], "rulename": "", "text": "" }],
"text": "",
"code": "-",
"message": "received illegal service authorization"
}
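The success and failure cases above can be told apart by the presence of sessionid. The following is a minimal sketch of that check, assuming the response body has already been parsed into a dict; `JobCreationError` and `extract_session_id` are hypothetical names used only for illustration.

```python
class JobCreationError(Exception):
    """Raised when the job-creation response has no sessionid."""


def extract_session_id(body: dict) -> str:
    """Return the job ID from a job-creation response, or raise on failure."""
    session_id = body.get("sessionid")
    if session_id:
        return session_id
    # Failed responses carry the cause in `code` and `message`.
    code = body.get("code", "")
    message = body.get("message", "unknown error")
    raise JobCreationError(f"{message} ({code})")
```

Raising an exception keeps the failure details (code and message) together, so the caller can log them or look them up in the response code table.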
2. Check the status of the speech recognition job and retrieve results
After successfully sending a speech recognition request, check the job status and poll until the status becomes completed or error.
Retrieving the job status
Jobs are executed sequentially on the server side. To check the job status or retrieve results, query the result retrieval endpoint GET /v2/recognitions/{session_id}.
Set the sessionid to the job ID obtained when creating the job. Specify the authentication information (authorization) of the request parameters in the Authorization header.
When executing with curl, do the following. Here, assume the sessionid is 017c25ec12c00a304474a999.
curl -H "Authorization: Bearer {APPKEY}" \
https://acp-api-async.amivoice.com/v2/recognitions/017c25ec12c00a304474a999
queued status
Immediately after sending the request, the status will be in the queued state.
{"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"queued"}
started status
When the job is taken from the queue, the status becomes started.
It takes anywhere from a few tens of seconds to several minutes to change from queued to started status, depending on server congestion and other factors.
{"service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"started"}
processing status
When the actual speech recognition process begins, the status becomes processing. Please see the sample below. Here, the results are formatted with line breaks for readability.
You can use the size of the audio received by the API (audio_size) and the MD5 checksum (audio_md5) to verify that the audio was transmitted correctly.
The time it takes to go from processing to completed depends on the length of the audio. In v2, the audio data is divided into chunks of a certain length for processing, and it generally takes about 0.5 to 1.5 times as long as each chunk's duration. The number of chunks and the length of each chunk depend on the length of the audio data and the duration of the speech it contains. If the speech duration is relatively short, the audio is divided into chunks of about 1 minute in length.
Note that if speaker diarization or sentiment analysis is also performed, additional processing time will be required, so it will take longer than mentioned above.
{
"audio_md5":"40f59fe5fc7745c33b33af44be43f6ad",
"audio_size":306980,
"service_id":"{YOUR_SERVICE_ID}",
"session_id":"017c25ec12c00a304474a999",
"status":"processing"
}
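The transmission check described above can be sketched as follows. This assumes that audio_size is the size of the uploaded file in bytes and that audio_md5 is the lowercase hexadecimal MD5 digest of the same bytes; both are assumptions drawn from the sample response, and the function names are hypothetical.

```python
import hashlib
from typing import Tuple


def audio_fingerprint(data: bytes) -> Tuple[int, str]:
    """Return (size in bytes, MD5 hex digest) for the local audio data."""
    return len(data), hashlib.md5(data).hexdigest()


def upload_matches(status: dict, data: bytes) -> bool:
    """Compare a job-status response against the locally read audio bytes."""
    size, md5 = audio_fingerprint(data)
    return status.get("audio_size") == size and status.get("audio_md5") == md5
```

In practice, `data` would be the bytes read from the audio file that was sent, and `status` the parsed response of a job in the processing state or later.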
completed status
When speech recognition is complete, the status becomes completed. At this time, you can obtain the speech recognition results in the results and segments of the response. The results are stored on the server for a certain period after the speech recognition server processing is completed. For the retention period, please see "Speech recognition result retention period" in the limitations of the asynchronous HTTP interface.
For details on the response including recognition results, please see Speech recognition results.
If you access results that have been deleted after a certain period, you will get a 404 NOT FOUND error. For errors, please see the Error response list in the reference.
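As a minimal sketch of reading a completed response, the helpers below pull the overall text and the per-segment texts out of the parsed JSON. They assume the field layout shown in the sample output on this page (a top-level text and a segments list whose items each have a text); the function names are hypothetical.

```python
from typing import List


def completed_text(result: dict) -> str:
    """Return the full recognized text of a completed job."""
    status = result.get("status")
    if status != "completed":
        raise ValueError(f"job is not completed (status={status!r})")
    return result.get("text", "")


def segment_texts(result: dict) -> List[str]:
    """Return the recognized text of each segment, in order."""
    return [segment.get("text", "") for segment in result.get("segments", [])]
```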
error status
If speech recognition fails for some reason, the status becomes error. In this case, the cause of the error is set in error_message.
Example:
{
"status": "error",
"audio_md5":"40f59fe5fc7745c33b33af44be43f6ad",
"audio_size":306980,
"service_id":"{YOUR_SERVICE_ID}",
"session_id":"017c25ec12c00a304474a999",
"error_message": "ERROR: Failed to transcribe in recognition process - amineth_result=0, amineth_code='o', amineth_message='recognition result is rejected because confidence is below the threshold'"
}
The error_message may include amineth_code='{response code}' and amineth_message='{error message}'. For details, please see the table in the response codes and messages details.
In particular, if the error message includes amineth_code='o', there is a problem with the client's request method or audio file, so retrying will yield the same result. For details, please see "Reject (Response code=o)". For errors other than 'o', it's likely an issue with the AmiVoice API infrastructure, so please wait a while before retrying.
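The retry guidance above can be sketched as a small parser for error_message. The regular expressions assume the `amineth_code='...'` and `amineth_message='...'` layout shown in the example; the function names are hypothetical, and treating a message without any code as retryable is an assumption rather than documented behavior.

```python
import re
from typing import Optional, Tuple

_CODE_RE = re.compile(r"amineth_code='([^']*)'")
_MESSAGE_RE = re.compile(r"amineth_message='([^']*)'")


def parse_error_message(error_message: str) -> Tuple[Optional[str], Optional[str]]:
    """Extract (response code, message) from error_message, if present."""
    code = _CODE_RE.search(error_message)
    message = _MESSAGE_RE.search(error_message)
    return (
        code.group(1) if code else None,
        message.group(1) if message else None,
    )


def should_retry(error_message: str) -> bool:
    """Return False for code 'o' (retrying gives the same result), else True."""
    code, _ = parse_error_message(error_message)
    return code != "o"
```

When `should_retry` returns True, the caller would still wait a while before resubmitting, as described above.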
Content ID
You can freely set a string in the contentId of the d parameter when making a request. For example, by setting information such as an ID issued by the application, a file name, or user information, you can later obtain this information as part of the recognition result.
For example, to send a request setting the file name as the contentId using the curl command:
curl https://acp-api-async.amivoice.com/v2/recognitions \
-F u={APP_KEY} \
-F d="grammarFileNames=-a-general loggingOptOut=True contentId=test.wav" \
-F a=@test.wav
When retrieving the job status or results, content_id will be included as follows:
{"content_id":"test.wav","service_id":"{YOUR_SERVICE_ID}","session_id":"017c25ec12c00a304474a999","status":"queued"}
Sample Code
Here is Python sample code demonstrating the typical flow of the asynchronous HTTP interface.
Request Parameters
An AmiVoice API APPKEY is required for execution. Set your AmiVoice API APPKEY in the following line:
app_key = 'TODO: Please set APPKEY here'
Decide on the options to set in the d parameter. Here, we'll set the following:
- Engine: General purpose (grammarFileNames=-a-general)
- Logging: None (loggingOptOut=True)
- Content ID: File name (contentId=filename)
- Speaker diarization: Enabled (speakerDiarization=True)
- Number of speakers: Max=Min=2 (diarizationMinSpeaker=2, diarizationMaxSpeaker=2)
- Sentiment analysis: Disabled (sentimentAnalysis=False)
Currently, sentiment analysis is not available in v2.
If you want to use sentiment analysis, please use v1 and set sentimentAnalysis=True.
domain = {
'grammarFileNames': '-a-general',
'loggingOptOut': 'True',
'contentId': filename,
'speakerDiarization': 'True',
'diarizationMinSpeaker': '2',
'diarizationMaxSpeaker': '2',
'sentimentAnalysis': 'False',
...
We're also registering two words. profileId is commented out, so the registered words will only be valid for this session. For details, please see Word Registration.
#'profileId': 'test',
'profileWords': 'wwww よんこだぶる|www2 とりぷるだぶる',
}
URL encode the values of the key-value pairs to be set in the d parameter. In Python, we use urllib.parse.quote.
params = {
'u': app_key,
'd': ' '.join([f'{key}={urllib.parse.quote(value)}' for key, value in domain.items()]),
}
logger.info(params)
params will look like this:
{'u': 'XXXX', 'd': 'grammarFileNames=-a-general loggingOptOut=True contentId=www-2.wav speakerDiarization=True diarizationMinSpeaker=2 diarizationMaxSpeaker=2 sentimentAnalysis=False profileWords=wwww%20%E3%82%88%E3%82%93%E3%81%93%E3%81%A0%E3%81%B6%E3%82%8B%7Cwww2%20%E3%81%A8%E3%82%8A%E3%81%B7%E3%82%8B%E3%81%A0%E3%81%B6%E3%82%8B'}
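To make the encoding step easier to inspect, here is a minimal sketch that builds the d parameter and also decodes it back into a dict. `build_d` mirrors the join-and-quote expression used in the sample; `parse_d` is a hypothetical helper for debugging and is not something the API requires.

```python
import urllib.parse


def build_d(options: dict) -> str:
    """Join key=value pairs with spaces, URL-encoding each value."""
    return " ".join(
        f"{key}={urllib.parse.quote(value)}" for key, value in options.items()
    )


def parse_d(d: str) -> dict:
    """Inverse of build_d: split on spaces and decode each value."""
    pairs = (item.split("=", 1) for item in d.split(" ") if item)
    return {key: urllib.parse.unquote(value) for key, value in pairs}
```

Encoding matters because the pairs are separated by spaces: a value such as the profileWords string contains spaces itself, and `urllib.parse.quote` turns each of them into `%20` so the pair stays intact.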
Request to Create a Speech Recognition Job
Send the above params and the audio file via HTTP POST. For readability in describing HTTP communication, we'll use the HTTP client library requests in this sample.
request_response = requests.post(
    url=endpoint,
    data={key: value for key, value in params.items()},
    files={'a': (filename, open(filename, 'rb').read(), 'application/octet-stream')}
)
Check if the call was successful using the HTTP status code. Also, check if the job creation was successful by verifying the existence of sessionid in the response.
if request_response.status_code != 200:
    logger.error(f'Failed to request - {request_response.content}')
    exit(1)
request = request_response.json()
if 'sessionid' not in request:
    logger.error(f'Failed to create job - {request["message"]} ({request["code"]})')
    exit(2)
logger.info(request)
When job creation is successful, you'll get a response like this. Use the sessionid included in the response to check the job status and retrieve results.
{'sessionid': '01838d9535080a304474a07f', 'text': '...'}
Checking the Speech Recognition Job Status
Send an HTTP GET request to recognitions/{sessionid}. Poll until the status in the response becomes completed or error. Here, we're checking the result every 10 seconds.
while True:
    # HTTP GET request to `recognitions/{sessionid}`
    result_response = requests.get(
        url=f'{endpoint}/{request["sessionid"]}',
        headers={'Authorization': f'Bearer {app_key}'}
    )
    if result_response.status_code == 200:
        result = result_response.json()
        if 'status' in result and (result['status'] == 'completed' or result['status'] == 'error'):
            # If the `status` in the response is `completed` or `error`, format and output the result
            print(json.dumps(result, ensure_ascii=False, indent=4))
            exit(0)
        else:
            # If the `status` in the response is not `completed` or `error`, the job is still running
            # So, wait a bit (10 seconds here) before checking the status again
            logger.info(result)
            time.sleep(10)
    else:
        # If the HTTP response code is not 200, log the response body and exit
        logger.error(f'Failed. Response is {result_response.content}')
        exit(3)
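The loop above polls indefinitely. As a variation, the sketch below adds an overall timeout and takes the status lookup as a function argument, so the same loop can be used with any HTTP client. The name `poll_job`, the `fetch_status` callback, and the default timeout of one hour are all assumptions chosen for illustration, not values defined by the API.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "error"}


def poll_job(
    fetch_status: Callable[[], dict],
    interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until the job is completed or in error.

    Raises TimeoutError if the next wait would pass the overall timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {result.get('status')!r} after {timeout} seconds")
        sleep(interval)
```

With requests, `fetch_status` could be a small function that sends the GET request shown above and returns `response.json()`.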
Code
Here's the complete Python code we've discussed so far.
import time
import json
import urllib.parse
import logging
import requests
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(message)s")
endpoint = 'https://acp-api-async.amivoice.com/v2/recognitions'
app_key = 'TODO: Please set APPKEY here'
filename = 'www-2.wav'
# Request parameters
domain = {
    'grammarFileNames': '-a-general',
    'loggingOptOut': 'True',
    'contentId': filename,
    'speakerDiarization': 'True',
    'diarizationMinSpeaker': '2',
    'diarizationMaxSpeaker': '2',
    'sentimentAnalysis': 'False',
    #'profileId': 'test',
    'profileWords': 'wwww よんこだぶる|www2 とりぷるだぶる',
}
# URL-encode each value and join the key=value pairs with spaces
params = {
    'u': app_key,
    'd': ' '.join([f'{key}={urllib.parse.quote(value)}' for key, value in domain.items()]),
}
logger.info(params)
# Send job request
request_response = requests.post(
    url=endpoint,
    data={key: value for key, value in params.items()},
    files={'a': (filename, open(filename, 'rb').read(), 'application/octet-stream')}
)
if request_response.status_code != 200:
    logger.error(f'Failed to request - {request_response.content}')
    exit(1)
request = request_response.json()
if 'sessionid' not in request:
    logger.error(f'Failed to create job - {request["message"]} ({request["code"]})')
    exit(2)
logger.info(request)
# Check status every 10 seconds until results are ready
while True:
    # HTTP GET request to `recognitions/{sessionid}`
    result_response = requests.get(
        url=f'{endpoint}/{request["sessionid"]}',
        headers={'Authorization': f'Bearer {app_key}'}
    )
    if result_response.status_code == 200:
        result = result_response.json()
        if 'status' in result and (result['status'] == 'completed' or result['status'] == 'error'):
            # If the `status` in the response is `completed` or `error`, format and output the result
            print(json.dumps(result, ensure_ascii=False, indent=4))
            exit(0)
        else:
            # If the `status` in the response is not `completed` or `error`, the job is still running
            # So, wait a bit (10 seconds in this case) before checking the status again
            logger.info(result)
            time.sleep(10)
    else:
        # If the HTTP response code is not 200, log the response body and exit
        logger.error(f'Failed. Response is {result_response.content}')
        exit(3)
How to Run
Make sure Python3 is installed on your system.
Install the required library:
pip install requests
Download the sample audio file (www-2.wav) and copy it to the same directory as the program.
This is an audio file of someone saying "トリプル・ダブルは、バスケットボールの記録に関する用語です。" In the sample code, we register the word "www2" for the pronunciation "とりぷるだぶる", so you can confirm that this is working effectively.
To run the sample program, save the code above as sample.py and execute the following from the command line:
python sample.py
The execution result will be as follows:
$ python sample.py
2025-06-24 11:13:03,932 {'u': 'XXXX', 'd': 'grammarFileNames=-a-general loggingOptOut=True contentId=www-2.wav speakerDiarization=True diarizationMinSpeaker=2 diarizationMaxSpeaker=2 sentimentAnalysis=False profileWords=wwww%20%E3%82%88%E3%82%93%E3%81%93%E3%81%A0%E3%81%B6%E3%82%8B%7Cwww2%20%E3%81%A8%E3%82%8A%E3%81%B7%E3%82%8B%E3%81%A0%E3%81%B6%E3%82%8B'}
2025-06-24 11:13:03,948 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:05,253 https://acp-api-async.amivoice.com:443 "POST /v2/recognitions HTTP/1.1" 200 55
2025-06-24 11:13:05,253 {'text': '...', 'sessionid': '01979fb5d7a30a305c0094c0'}
2025-06-24 11:13:05,253 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:05,551 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 113
2025-06-24 11:13:05,583 {'status': 'queued', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:13:15,585 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:15,779 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 114
2025-06-24 11:13:15,779 {'status': 'started', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:13:25,784 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:25,957 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 114
2025-06-24 11:13:25,957 {'status': 'started', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:13:35,970 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:36,150 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 114
2025-06-24 11:13:36,150 {'status': 'started', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:13:46,163 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:46,410 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 114
2025-06-24 11:13:46,413 {'status': 'started', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:13:56,415 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:13:56,582 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 114
2025-06-24 11:13:56,582 {'status': 'started', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2025-06-24 11:14:06,589 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:06,775 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:06,775 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:14:16,786 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:16,931 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:16,932 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:14:26,933 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:27,088 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:27,088 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:14:37,092 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:37,250 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:37,250 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:14:47,258 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:47,409 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:47,409 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:14:57,414 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:14:57,577 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:14:57,577 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:15:07,585 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:15:07,786 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:15:07,787 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:15:17,788 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:15:17,951 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 184
2025-06-24 11:15:17,951 {'status': 'processing', 'session_id': '01979fb5d7a30a305c0094c0', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2025-06-24 11:15:27,958 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2025-06-24 11:15:28,299 https://acp-api-async.amivoice.com:443 "GET /v2/recognitions/01979fb5d7a30a305c0094c0 HTTP/1.1" 200 1729
{
"status": "completed",
"session_id": "01979fb5d7a30a305c0094c0",
"service_id": "{YOUR_SERVICE_ID}",
"content_id": "www-2.wav",
"audio_size": 270444,
"audio_md5": "fd7d144824e8a5982d3aaa4cda5358a8",
"segments": [
{
"results": [
{
"tokens": [
{
"written": "www2",
"confidence": 1.0,
"starttime": 1620,
"endtime": 2548,
"spoken": "とりぷるだぶる",
"label": "speaker0"
},
{
"written": "は",
"confidence": 1.0,
"starttime": 2548,
"endtime": 2804,
"spoken": "は",
"label": "speaker0"
},
{
"written": "バスケットボール",
"confidence": 1.0,
"starttime": 2836,
"endtime": 3956,
"spoken": "ばすけっとぼーる",
"label": "speaker0"
},
{
"written": "の",
"confidence": 1.0,
"starttime": 3956,
"endtime": 4052,
"spoken": "の",
"label": "speaker0"
},
{
"written": "記録",
"confidence": 1.0,
"starttime": 4052,
"endtime": 4420,
"spoken": "きろく",
"label": "speaker0"
},
{
"written": "に",
"confidence": 1.0,
"starttime": 4420,
"endtime": 4532,
"spoken": "に",
"label": "speaker0"
},
{
"written": "関する",
"confidence": 1.0,
"starttime": 4532,
"endtime": 5060,
"spoken": "かんする",
"label": "speaker0"
},
{
"written": "用語",
"confidence": 1.0,
"starttime": 5060,
"endtime": 5412,
"spoken": "ようご",
"label": "speaker0"
},
{
"written": "です",
"confidence": 0.96,
"starttime": 5412,
"endtime": 5940,
"spoken": "です",
"label": "speaker0"
},
{
"written": "。",
"confidence": 0.96,
"starttime": 5940,
"endtime": 6244,
"spoken": "_",
"label": "speaker0"
}
],
"confidence": 1.0,
"starttime": 1300,
"endtime": 6244,
"tags": [],
"rulename": "",
"text": "www2はバスケットボールの記録に関する用語です。"
}
],
"text": "www2はバスケットボールの記録に関する用語です。"
}
],
"utteranceid": "20250624/11/01979fb6df530a30677b39d0_20250624_111411",
"text": "www2はバスケットボールの記録に関する用語です。",
"code": "",
"message": ""
}
When sentiment analysis is enabled in v1
$ python sample.py
2022-12-06 15:01:03,336 {'u': 'XXXX', 'd': 'grammarFileNames=-a-general loggingOptOut=True contentId=www-2.wav speakerDiarization=True diarizationMinSpeaker=2 diarizationMaxSpeaker=2 sentimentAnalysis=True profileWords=wwww%20%E3%82%88%E3%82%93%E3%81%93%E3%81%A0%E3%81%B6%E3%82%8B%7Cwww2%20%E3%81%A8%E3%82%8A%E3%81%B7%E3%82%8B%E3%81%A0%E3%81%B6%E3%82%8B'}
2022-12-06 15:01:03,345 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:04,117 https://acp-api-async.amivoice.com:443 "POST /v1/recognitions HTTP/1.1" 200 55
2022-12-06 15:01:04,119 {'sessionid': '0184e605ff170a306b8f9c96', 'text': '...'}
2022-12-06 15:01:04,122 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:04,309 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:04,312 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:01:14,328 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:14,517 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:14,519 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:01:24,523 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:24,718 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:24,721 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:01:34,728 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:34,886 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:34,888 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:01:44,940 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:45,114 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:45,118 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:01:55,124 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:01:56,735 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 112
2022-12-06 15:01:56,736 {'status': 'queued', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:02:06,743 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:02:06,940 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 113
2022-12-06 15:02:06,942 {'status': 'started', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:02:16,948 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:02:17,108 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 113
2022-12-06 15:02:17,109 {'status': 'started', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav'}
2022-12-06 15:02:27,114 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:02:27,281 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 183
2022-12-06 15:02:27,283 {'status': 'processing', 'session_id': '0184e605ff170a306b8f9c96', 'service_id': '{YOUR_SERVICE_ID}', 'content_id': 'www-2.wav', 'audio_size': 270444, 'audio_md5': 'fd7d144824e8a5982d3aaa4cda5358a8'}
2022-12-06 15:02:37,290 Starting new HTTPS connection (1): acp-api-async.amivoice.com:443
2022-12-06 15:02:37,476 https://acp-api-async.amivoice.com:443 "GET /v1/recognitions/0184e605ff170a306b8f9c96 HTTP/1.1" 200 2481
{
"status": "completed",
"session_id": "0184e605ff170a306b8f9c96",
"service_id": "{YOUR_SERVICE_ID}",
"content_id": "www-2.wav",
"audio_size": 270444,
"audio_md5": "fd7d144824e8a5982d3aaa4cda5358a8",
"segments": [
{
"results": [
{
"tokens": [
{
"written": "www2",
"confidence": 1,
"starttime": 1620,
"endtime": 2548,
"spoken": "とりぷるだぶる",
"label": "speaker0"
},
{
"written": "は",
"confidence": 1,
"starttime": 2548,
"endtime": 2788,
"spoken": "は",
"label": "speaker0"
},
{
"written": "バスケットボール",
"confidence": 1,
"starttime": 2916,
"endtime": 3956,
"spoken": "ばすけっとぼーる",
"label": "speaker0"
},
{
"written": "の",
"confidence": 0.99,
"starttime": 3956,
"endtime": 4052,
"spoken": "の",
"label": "speaker0"
},
{
"written": "記録",
"confidence": 1,
"starttime": 4052,
"endtime": 4404,
"spoken": "きろく",
"label": "speaker0"
},
{
"written": "に",
"confidence": 1,
"starttime": 4404,
"endtime": 4532,
"spoken": "に",
"label": "speaker0"
},
{
"written": "関する",
"confidence": 1,
"starttime": 4532,
"endtime": 5060,
"spoken": "かんする",
"label": "speaker0"
},
{
"written": "用語",
"confidence": 1,
"starttime": 5060,
"endtime": 5412,
"spoken": "ようご",
"label": "speaker1"
},
{
"written": "です",
"confidence": 0.96,
"starttime": 5412,
"endtime": 5940,
"spoken": "です",
"label": "speaker0"
},
{
"written": "。",
"confidence": 0.8,
"starttime": 5940,
"endtime": 6196,
"spoken": "_",
"label": "speaker0"
}
],
"confidence": 1,
"starttime": 1300,
"endtime": 6196,
"tags": [],
"rulename": "",
"text": "www2はバスケットボールの記録に関する用語です。"
}
],
"text": "www2はバスケットボールの記録に関する用語です。"
}
],
"utteranceid": "20221206/15/0184e60741170a30522339d0_20221206_150225[nolog]",
"text": "www2はバスケットボールの記録に関する用語です。",
"code": "",
"message": "",
"sentiment_analysis": {
"segments": [
{
"starttime": 1680,
"endtime": 2860,
/* Sentiment Parameters */
},
{
"starttime": 3520,
"endtime": 4900,
/* Sentiment Parameters */
}
]
}
}
The text field contains the recognized text, "www2はバスケットボールの記録に関する用語です。", which shows that the speech content has been correctly recognized. Also, "トリプルダブル" has been converted to "www2", confirming that the word registration took effect. For details on the results, please see Speech Recognition Results.
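When speaker diarization is enabled, each token in the output above carries a label such as speaker0. The following sketch groups consecutive tokens with the same label into turns. It assumes the segments / results / tokens layout shown in the sample output and joins the written forms without spaces, which suits Japanese text; `speaker_turns` is a hypothetical name.

```python
from typing import List, Tuple


def speaker_turns(result: dict) -> List[Tuple[str, str]]:
    """Merge consecutive tokens that share a speaker label into (label, text) turns."""
    turns: List[List[str]] = []
    for segment in result.get("segments", []):
        for recognition in segment.get("results", []):
            for token in recognition.get("tokens", []):
                label = token.get("label", "")
                written = token.get("written", "")
                if turns and turns[-1][0] == label:
                    # Same speaker as the previous token: extend the current turn.
                    turns[-1][1] += written
                else:
                    turns.append([label, written])
    return [(label, text) for label, text in turns]
```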
Troubleshooting
received illegal service authorization
If you see the following message, it's possible that the AmiVoice API APPKEY has not been set:
2022-10-11 10:10:44,928 Failed to create job - received illegal service authorization (-)
Please check if the APPKEY is set and correct in the following part of the code:
app_key = 'TODO: Please set APPKEY here'
Please also see the Request Parameters section on this page.
No such file or directory: 'www-2.wav'
If you see the following message, the audio file does not exist in the execution directory:
FileNotFoundError: [Errno 2] No such file or directory: 'www-2.wav'
Please download the sample audio file (www-2.wav) and copy it to the directory where you're running the command. After confirming the file exists, please try running the command again.
Other Documentation
- For the API reference, please see Asynchronous HTTP Interface.