User Guide
You can obtain speech recognition results by connecting to the speech recognition server's endpoint over HTTP or WebSocket and sending audio data together with request parameters. This guide explains how to use the AmiVoice API, and is intended for developers building applications with it.
Basic Functions
A client application that performs speech recognition with the AmiVoice API generally needs to do the following:
- Obtain audio data from a recording device or network
- Convert audio data to a compatible format (not necessary if using a supported audio format)
- Send audio data to the speech recognition API endpoint
- Receive speech recognition results
- Interpret and use the speech recognition results (e.g., display them as on-screen captions, infer intent to generate voice bot responses, or feed them into summarization such as meeting minutes)
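The steps above can be sketched as a small pipeline. This is only a structural outline, not the AmiVoice API itself: `RecognitionPipeline`, `FakeServer`, and every callable name are hypothetical placeholders, and the fake server simply echoes chunk sizes instead of performing recognition. A real client would replace `send` and `receive` with HTTP or WebSocket calls to the endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class RecognitionPipeline:
    """Wires the five client-side steps together (all names are hypothetical)."""

    capture: Callable[[], Iterable[bytes]]  # step 1: obtain audio data
    convert: Callable[[bytes], bytes]       # step 2: convert to a supported format
    send: Callable[[bytes], None]           # step 3: send audio to the endpoint
    receive: Callable[[], List[str]]        # step 4: receive recognition results
    handle: Callable[[str], None]           # step 5: interpret and use the results

    def run(self) -> None:
        # Send every captured chunk after converting it.
        for chunk in self.capture():
            self.send(self.convert(chunk))
        # Then hand each result to the application-specific handler.
        for text in self.receive():
            self.handle(text)


class FakeServer:
    """In-memory stand-in for the speech recognition server (not the real API)."""

    def __init__(self) -> None:
        self.received: List[bytes] = []

    def send(self, audio: bytes) -> None:
        self.received.append(audio)

    def receive(self) -> List[str]:
        # Pretend each chunk produced one result.
        return [f"<{len(chunk)} bytes recognized>" for chunk in self.received]


if __name__ == "__main__":
    server = FakeServer()
    pipeline = RecognitionPipeline(
        capture=lambda: [b"\x00\x01", b"\x02\x03\x04"],  # dummy audio chunks
        convert=lambda chunk: chunk,                      # already a supported format
        send=server.send,
        receive=server.receive,
        handle=print,                                     # e.g. show as a caption
    )
    pipeline.run()
```

Keeping each step as a separate callable makes it easy to swap the transport (HTTP or WebSocket) or the result handler without touching the rest of the client.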
The following is an overview of the interaction between the client program and the speech recognition server.
A. Appropriate Use of Interface
The AmiVoice API provides three speech recognition interfaces. This section describes the features and intended use cases of each so that you can choose the appropriate one.
B. Request Method
To obtain speech recognition results, you configure various settings in the request to the server and send the audio data.
- For the items that need to be set in a request, please see Request Parameters.
- For supported audio data, please see Audio Format.
- For available speech recognition engines and supported languages, please see Speech Recognition Engines.
How you send a request differs between HTTP and WebSocket, so each interface is explained in turn.
For how the server logs the data you send and the speech recognition results, please see Logging.
C. Handling of Results
The speech recognition server returns the text transcribed from the audio you sent. For details on the other information returned along with the text, please see Speech Recognition Results. For error handling, please see Response Codes and Messages.
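As a rough illustration of handling a result, the sketch below extracts the transcribed text from a response body and raises an exception on failure. It assumes the response is JSON with `text`, `code`, and `message` fields, and that a non-empty `code` signals an error; these field names and that convention are assumptions made for this example, so confirm them against Speech Recognition Results and Response Codes and Messages. `RecognitionError` and `extract_text` are hypothetical names.

```python
import json


class RecognitionError(Exception):
    """Raised when the response reports an error (hypothetical helper)."""


def extract_text(response_body: str) -> str:
    """Return the transcribed text, or raise if the response carries an error.

    Assumes a JSON object with "text", "code", and "message" keys; verify the
    actual response format in the reference pages before relying on this.
    """
    data = json.loads(response_body)
    code = data.get("code", "")
    if code:
        # Assumed convention: a non-empty code means the request failed.
        raise RecognitionError(f"{code}: {data.get('message', '')}")
    return data.get("text", "")


if __name__ == "__main__":
    sample = '{"text": "hello world", "code": "", "message": ""}'
    print(extract_text(sample))
```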
Advanced Features and More
This part covers how to make better use of the AmiVoice API when developing applications, along with client libraries, sample programs, and limitations.
D. AmiVoice API Features
We explain various features of the AmiVoice API.
E. Client Libraries
We introduce client libraries that make it easy to use the AmiVoice API from various programming languages.
F. Sample Programs
We introduce sample programs that use the AmiVoice API, written in various programming languages.
G. Limitations
We explain limitations you should be aware of when using the AmiVoice API.