User Guide

To obtain speech recognition results, connect to the speech recognition server's endpoint over HTTP or WebSocket and send audio data together with request parameters. This guide explains how to use the AmiVoice API for developers building applications with it.

Basic Functions

Generally, client applications performing speech recognition using the AmiVoice API need to implement the following:

Obtain audio data from a recording device or network
Convert audio data to a compatible format (not necessary if using a supported audio format)
Send audio data to the speech recognition API endpoint
Receive speech recognition results
Interpret and use the speech recognition results (for example, display them as on-screen captions, extract intent to generate voice bot responses, or feed them into summarization such as meeting minutes)
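
The five steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a prescribed client design: the `Transport` callable, the 16 kHz 16-bit mono WAV check, and the top-level `text` field in the reply are all illustrative assumptions, so please check Audio Format and Speech Recognition Results for the actual specifications. The network step is injected as a function so that the flow can be run without a server.

```python
import io
import json
import wave
from typing import Callable

# Hypothetical transport: takes audio bytes, returns the server reply as JSON text.
Transport = Callable[[bytes], str]


def make_silent_wav(seconds: int = 1, rate: int = 16000) -> bytes:
    """Step 1 (obtain audio): stand-in for a recording device or network source."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * (rate * seconds))
    return buf.getvalue()


def is_supported(wav_bytes: bytes, rate: int = 16000) -> bool:
    """Step 2 (format check): ASSUMES 16-bit mono WAV at `rate` is acceptable."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return (w.getnchannels() == 1
                and w.getsampwidth() == 2
                and w.getframerate() == rate)


def run_pipeline(audio: bytes, transport: Transport) -> str:
    if not is_supported(audio):
        raise ValueError("audio must be converted to a supported format first")
    reply = transport(audio)         # Steps 3 and 4: send audio, receive the reply
    result = json.loads(reply)
    return result.get("text", "")    # Step 5: ASSUMED top-level "text" field


def fake_transport(audio: bytes) -> str:
    """Offline stand-in for the real HTTP or WebSocket call."""
    return json.dumps({"text": "hello"})


if __name__ == "__main__":
    print(run_pipeline(make_silent_wav(), fake_transport))
```

In a real client, `fake_transport` would be replaced by a function that performs the HTTP or WebSocket exchange; keeping it separate makes the rest of the flow easy to test.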

The following is an overview of the interaction between the client program and the speech recognition server.

Figure. AmiVoice API Overview

A. Appropriate Use of Interface

The AmiVoice API provides three speech recognition interfaces. This section describes the features and intended use cases of each to help you choose the appropriate one.

Appropriate Use of Interface

B. Request Method

To obtain speech recognition results, a request to the server must specify several settings and include the audio data.

For the items that must be set in a request, please see Request Parameters.
For supported audio data, please see Audio Format.
For available speech recognition engines and supported languages, please see Speech Recognition Engines.

The way requests are sent differs between HTTP and WebSocket, so each interface is explained separately.

For how the server logs sent data and speech recognition results, please see Logging.
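
As an illustration of combining settings and audio into one HTTP request, the sketch below builds a `multipart/form-data` body using only the Python standard library. It does not contact any server. The field names `u` (application key), `d` (engine and other settings) and `a` (audio), the value `grammarFileNames=-a-general`, and the choice to place the audio part last are assumptions made for this example; please confirm the actual names and ordering in Request Parameters.

```python
import uuid
from typing import Dict, Optional, Tuple


def build_multipart(fields: Dict[str, str], audio_field: str, audio: bytes,
                    boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Return (Content-Type header value, request body) for a form upload."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Text fields first, in the order given.
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # Audio part last (an assumption of this sketch).
    header = (f"--{boundary}\r\n"
              f'Content-Disposition: form-data; name="{audio_field}"; '
              f'filename="audio"\r\n'
              f"Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
    parts.append(header + audio + b"\r\n")
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


# Hypothetical values, for illustration only.
content_type, body = build_multipart(
    {"u": "YOUR_APPKEY", "d": "grammarFileNames=-a-general"},
    audio_field="a",
    audio=b"RIFF",          # placeholder bytes, not a real audio file
    boundary="demo",
)
```

The returned `content_type` would be sent as the `Content-Type` header and `body` as the request payload, for example with `urllib.request.Request`.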

C. Handling of Results

The speech recognition server returns text transcribed from the sent audio. For details on the information returned in addition to the text, please see Speech Recognition Results. For error handling, please see Response Codes and Messages.
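
A client usually separates successful results from errors before using the text. The sketch below shows one way to do that in Python. The JSON layout it assumes (top-level `text`, `code` and `message` fields, with an empty `code` meaning success, and a `results` list containing `confidence` and `tokens` with a `written` form) and the sample values, including the error code `E1`, are hypothetical placeholders; the authoritative layout is in Speech Recognition Results and Response Codes and Messages.

```python
import json
from typing import List, NamedTuple


class RecognitionError(Exception):
    """Raised when the reply carries a non-empty error code."""


class Recognition(NamedTuple):
    text: str
    confidence: float
    words: List[str]


def parse_reply(reply: str) -> Recognition:
    data = json.loads(reply)
    code = data.get("code", "")
    if code:  # ASSUMPTION: an empty code means success
        raise RecognitionError(f"{code}: {data.get('message', '')}")
    # Use the first result segment for word-level details, if any.
    first = (data.get("results") or [{}])[0]
    words = [tok.get("written", "") for tok in first.get("tokens", [])]
    return Recognition(
        text=data.get("text", ""),
        confidence=float(first.get("confidence", 0.0)),
        words=words,
    )


# Hypothetical replies, for illustration only.
SAMPLE_OK = json.dumps({
    "code": "", "message": "", "text": "hello world",
    "results": [{"confidence": 0.9,
                 "tokens": [{"written": "hello"}, {"written": "world"}]}],
})
SAMPLE_ERROR = json.dumps({"code": "E1", "message": "demo failure", "text": ""})
```

Raising an exception on error keeps the calling code simple: the caller handles `RecognitionError` in one place and otherwise works with a plain `Recognition` value.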

Advanced Features and More

This part covers information for making fuller use of the AmiVoice API when developing applications, along with client libraries, sample programs, and limitations.

Figure. AmiVoice API Overview

D. AmiVoice API Features

We explain various features of the AmiVoice API.

E. Client Libraries

We introduce client libraries for easily using the AmiVoice API from various languages.

Client Libraries

F. Sample Programs

We introduce sample programs in various programming languages using the AmiVoice API.

Sample Programs

G. Limitations

We explain limitations to be aware of when using the AmiVoice API.

Limitations
