
Introduction

AmiVoice API is a speech recognition API that converts audio into text. When you send audio to AmiVoice API, it returns the spoken content as text. This allows you to create speech-enabled applications such as meeting transcription or voice dialogue systems.

Figure. Overview of AmiVoice API

Quick Start

1. Get your APPKEY

Register as a user and find your APPKEY in the [Connection Info] section on the My Page dashboard. Set it as an environment variable with the following command.

export AMIVOICE_APPKEY=your_appkey_here
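
If you later call the API from a script rather than the shell, the same environment variable can be read at startup. The following is a minimal Python sketch; the helper name `load_appkey` is hypothetical and not part of AmiVoice API.

```python
import os


def load_appkey(env=os.environ):
    """Return the APPKEY stored in AMIVOICE_APPKEY, or raise if it is unset."""
    # Hypothetical helper: only the variable name comes from the export command.
    appkey = env.get("AMIVOICE_APPKEY", "").strip()
    if not appkey:
        raise RuntimeError("AMIVOICE_APPKEY is not set; export it before running.")
    return appkey
```

Failing early with a clear message avoids sending a request with an empty `u` parameter.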

2. Prepare an audio file

Prepare an audio file to transcribe. To try the API right away, you can use the sample audio file (test.wav).

For supported audio file formats, see Audio Formats.
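
Before comparing your own recording against the Audio Formats page, it can help to check what the file actually contains. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files only; the function name `describe_wav` and the keys of the returned dictionary are illustrative choices, and the sketch makes no claim about which values AmiVoice API accepts.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            # getsampwidth() returns bytes per sample.
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

For example, `describe_wav("test.wav")` reports the channel count, sample rate, and bit depth of the file.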

3. Run speech recognition

Run the following. Replace test.wav with the path to your audio file.

curl https://acp-api.amivoice.com/v1/recognize \
  -F d=-a-general \
  -F u="$AMIVOICE_APPKEY" \
  -F a=@test.wav | jq
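
The curl command sends a multipart/form-data POST with three parts: `d`, `u`, and `a`. As a rough illustration of the same request from a program, here is a minimal Python sketch that uses only the standard library. The helper names `build_multipart` and `recognize` are hypothetical, the file part's `application/octet-stream` content type is an assumption, and the parts are written in the same order as the `-F` options.

```python
import json
import os
import urllib.request
import uuid

ENDPOINT = "https://acp-api.amivoice.com/v1/recognize"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (content_type_header, body_bytes). The file part is written last.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields:
        chunks.append(f"--{boundary}\r\n".encode("ascii"))
        chunks.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8")
        )
        chunks.append(value.encode("utf-8") + b"\r\n")
    chunks.append(f"--{boundary}\r\n".encode("ascii"))
    chunks.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode("utf-8")
    )
    # Assumption: a generic binary content type for the audio part.
    chunks.append(b"Content-Type: application/octet-stream\r\n\r\n")
    chunks.append(file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


def recognize(audio_path, appkey, engine="-a-general"):
    """POST the audio file to the endpoint and return the parsed JSON response."""
    with open(audio_path, "rb") as f:
        audio = f.read()
    content_type, body = build_multipart(
        [("d", engine), ("u", appkey)], "a", os.path.basename(audio_path), audio
    )
    request = urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Calling `recognize("test.wav", os.environ["AMIVOICE_APPKEY"])` performs the request and returns the response as a Python dictionary.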

4. Check the result

On success, you will receive JSON like the following. The text field contains the transcribed text.

{
  "results": [
    {
      "tokens": [ ... ],
      "confidence": 0.998,
      "starttime": 250,
      "endtime": 8794,
      "text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。"
    }
  ],
  "utteranceid": "20220602/14/018122d637320a301bc194c9_20220602_141433",
  "text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
  "code": "",
  "message": ""
}

For details about the response, see Speech Recognition Results.
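
To show how the fields in the sample response might be consumed, here is a minimal Python sketch. The function name `summarize_result` is hypothetical. It assumes that a non-empty `code` indicates a failure described by `message`, which the sample above suggests but does not confirm; check Speech Recognition Results for the authoritative semantics.

```python
def summarize_result(response):
    """Return the overall text and per-result details from a parsed response."""
    # Assumption: "code" is empty on success and non-empty on failure.
    if response.get("code"):
        raise RuntimeError(
            f"recognition failed: {response['code']} {response.get('message', '')}"
        )
    segments = [
        {
            "text": result.get("text", ""),
            "confidence": result.get("confidence"),
            "starttime": result.get("starttime"),
            "endtime": result.get("endtime"),
        }
        for result in response.get("results", [])
    ]
    return {
        "text": response.get("text", ""),
        "utteranceid": response.get("utteranceid"),
        "segments": segments,
    }
```

The top-level `text` is usually all a transcription script needs; the entries under `results` add the confidence and timing values for each result.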

Next Steps

For detailed API usage, refer to the following guides.

Advanced Features