Speech Recognition Engines
AmiVoice API provides multiple speech recognition engines tailored to different languages and use cases. Selecting the engine best suited to the audio you want to recognize can improve recognition accuracy. This section describes the languages the engines support, the types of engines, and key points for choosing the appropriate one.
List of Speech Recognition Engines
The available engines are grouped by type: End to End and Hybrid. See also Difference between End to End and Hybrid.
End to End
These are new-generation speech recognition engines.
| Language | Engine Name | Supported Sampling Rates (Hz) | Connection Engine Name |
|---|---|---|---|
| Japanese | 日本語E2E_汎用 | 8k / 16k | -a2-ja-general |
| Chinese | 中国語E2E_汎用 | 8k / 16k | -a2-zh-general |
| Multilingual | 多言語E2E_汎用 | 8k / 16k | -a2-multi-general |
| Japanese | 日本語E2E_汎用バッチ | 8k / 16k | -a2b-ja-general |
| Chinese | 中国語E2E_汎用バッチ | 8k / 16k | -a2b-zh-general |
| Multilingual | 多言語E2E_汎用バッチ | 8k / 16k | -a2b-multi-general |
- The multilingual engine transcribes audio that contains multiple languages, rendering each portion in its own language. The supported languages are Japanese, English, and Chinese.
- Batch engines are optimized for batch processing, where response speed is not critical. Use them when accuracy is the priority. In particular, specify a batch engine when using the asynchronous HTTP interface.
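As a rough illustration of the selection rule above, the following minimal sketch maps a language and a batch flag to a connection engine name. The connection engine names are copied from the End to End table; the language keys (`"ja"`, `"zh"`, `"multi"`), the `E2E_ENGINES` dictionary, and the `select_e2e_engine` function are hypothetical names introduced only for this example and are not part of AmiVoice API.

```python
# Connection engine names copied from the End to End table.
# The (language, batch) keys are an assumption made for this sketch.
E2E_ENGINES = {
    ("ja", False): "-a2-ja-general",
    ("zh", False): "-a2-zh-general",
    ("multi", False): "-a2-multi-general",
    ("ja", True): "-a2b-ja-general",
    ("zh", True): "-a2b-zh-general",
    ("multi", True): "-a2b-multi-general",
}


def select_e2e_engine(language: str, batch: bool = False) -> str:
    """Return the End to End connection engine name for a language.

    Pass batch=True when response speed is not critical and accuracy
    is the priority, for example with the asynchronous HTTP interface.
    """
    try:
        return E2E_ENGINES[(language, batch)]
    except KeyError:
        raise ValueError(
            f"No End to End engine for language={language!r}, batch={batch}"
        ) from None


if __name__ == "__main__":
    # Real-time style use: the non-batch Japanese engine.
    print(select_e2e_engine("ja"))
    # Asynchronous use: the batch multilingual engine.
    print(select_e2e_engine("multi", batch=True))
```

Keeping the table in one dictionary means an unsupported combination fails early with a `ValueError` instead of sending an unknown engine name to the service.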
Hybrid
These are speech recognition engines optimized for various domains.
| Language | Engine Name | Language Model | Supported Sampling Rates (Hz) | Connection Engine Name |
|---|---|---|---|---|
| Japanese | 会話_汎用 | General | 8k / 16k | -a-general |
| Japanese | 会話_医療 | Medical Conference | 16k | -a-medical |
| Japanese | 会話_金融 | Finance | 16k | -a-bizfinance |
| Japanese |