
Components of Word Registration

When registering a word, you can specify its "notation", "pronunciation", and "class". Of these, "notation" and "pronunciation" are required. The following explains each component.

| Item | Description | Required | Example |
|---|---|---|---|
| Notation | The string obtained as a result of speech recognition when the word is spoken. | Yes | AmiVoice |
| Pronunciation | Information representing how the word is pronounced. The method of describing the pronunciation differs for each language. | Yes | あみぼいす |
| Class | A classification used to specify the category or type of the word. This classification allows the speech recognition system to distinguish words with the same pronunciation used in different contexts. Classes are defined for each engine, and API users cannot add classes. | No | 固有名詞 |

The English engine does not support word registration.

Overview of Word Registration

For example, if you want to register the word "パレオパラドキシア" because it's not being recognized, register the notation and pronunciation pair as follows. Separate the notation and pronunciation with a space. If you also want to set a class, please see How to Set Class.

パレオパラドキシア ぱれおぱらどきしあ
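
As a minimal sketch of the format above, the following Python helper joins a notation and a pronunciation with a single half-width space. The function name `format_word` is hypothetical and is not part of the AmiVoice API. How the resulting string is sent to the engine is outside the scope of this sketch.

```python
def format_word(notation: str, pronunciation: str) -> str:
    """Build one registration entry: notation, a half-width space, pronunciation.

    Hypothetical helper for illustration only; it performs no validation.
    """
    return f"{notation} {pronunciation}"


entry = format_word("パレオパラドキシア", "ぱれおぱらどきしあ")
print(entry)  # prints: パレオパラドキシア ぱれおぱらどきしあ
```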
info

Setting multiple pronunciations for the same notation

You can set multiple pronunciations for one notation.

For example, you can register the notation "AMI" with both the pronunciation "あみ" and the pronunciation "あどばんすとめでぃあ".

AMI あみ
AMI あどばんすとめでぃあ

Setting the same pronunciation for multiple notations

You can set the same pronunciation for multiple different notations. This does not cause an error, but which notation is output is undefined, so doing this intentionally is not recommended.

For example, you can set notations like "AMI" and "AmiVoice" for the pronunciation "あみ".

AMI あみ
AmiVoice あみ
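
Because a pronunciation shared by several notations gives an undefined result, it can help to check a word list before registering it. The following is a sketch only: `find_shared_pronunciations` is a hypothetical helper, and the input is assumed to be a list of (notation, pronunciation) pairs. Multiple pronunciations for one notation are allowed and are not reported.

```python
from collections import defaultdict


def find_shared_pronunciations(entries):
    """Return {pronunciation: [notations]} for pronunciations used by 2+ notations."""
    by_pronunciation = defaultdict(list)
    for notation, pronunciation in entries:
        # Record each notation once per pronunciation, keeping input order.
        if notation not in by_pronunciation[pronunciation]:
            by_pronunciation[pronunciation].append(notation)
    return {p: n for p, n in by_pronunciation.items() if len(n) > 1}


entries = [
    ("AMI", "あみ"),
    ("AMI", "あどばんすとめでぃあ"),  # second pronunciation for "AMI": allowed
    ("AmiVoice", "あみ"),             # same pronunciation as "AMI": ambiguous
]
print(find_shared_pronunciations(entries))  # prints: {'あみ': ['AMI', 'AmiVoice']}
```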

Notation

The "notation" is the string you want to output for the spoken audio.

Special Characters Usable in Notation

Among the characters that can be used in the notation, there are symbols that have special functions.

| Character | Character Name | Description |
|---|---|---|
| _ | Underscore | Symbol that is output as a space in speech recognition results |
note

It is not possible to output an underscore (_) as a speech recognition result.

Characters That Cannot Be Registered in Notation

Strings containing the following characters cannot be registered in the notation.

  • | (vertical bar)
  • Space
  • : (colon)
tip

You cannot use spaces in the notation you register. However, if you use an underscore (_) in the notation, it is output as a space in the speech recognition results.

For example, if you want to output "Advanced Media" when "あみ" is spoken, register the word as "Advanced_Media あみ".

Advanced_Media あみ
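
The notation rules above can be sketched as a small pre-check. The helper names `validate_notation` and `to_notation` are hypothetical. The sketch assumes that "space" means the half-width space, and it only covers the three characters listed in this section; the engine may apply further rules.

```python
# Characters listed above as not registrable in a notation.
FORBIDDEN_IN_NOTATION = {"|": "vertical bar", " ": "space", ":": "colon"}


def validate_notation(notation: str) -> str:
    """Raise ValueError if the notation contains a forbidden character."""
    for char, name in FORBIDDEN_IN_NOTATION.items():
        if char in notation:
            raise ValueError(f"notation must not contain a {name}: {notation!r}")
    return notation


def to_notation(display_text: str) -> str:
    """Convert the text you want in the result into a registrable notation."""
    if "_" in display_text:
        # An underscore in a notation is always output as a space.
        raise ValueError("an underscore cannot be output as a recognition result")
    return validate_notation(display_text.replace(" ", "_"))


print(to_notation("Advanced Media") + " あみ")  # prints: Advanced_Media あみ
```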

Pronunciation

"Pronunciation" refers to how the word is pronounced (how it's spoken).

How to Describe Pronunciation for Each Language

The method of describing pronunciation differs for each language. The following explains the description method for each language.

Japanese

For Japanese, describe using hiragana or katakana.

Chinese

For Chinese, describe using pinyin with tones represented by numbers. For example, "我们" should be described as "wo3men5".

我们 wo3men5

Korean

For Korean, describe using Hangul.

Special characters for pronunciation

Among the characters that can be used in the pronunciation, some symbols have special functions. The following shows the special characters that can be used for each language.

Japanese

| Character | Character name | Description |
|---|---|---|
| . | Half-width period | Symbol for syllable separation and suppression of long vowels |
| _ | Half-width underscore | Symbol representing silence |
tip
  • The AmiVoice Tech Blog explains how to use these special characters, particularly the half-width period for pronunciation. For details, please see the following:
    【For intermediate users】About automatic conversion of word pronunciation in AmiVoice

  • By using an underscore (_), even if there's a slight silence in the middle of a word, it's more likely to be recognized as a continuous word.
    For example, if you register "AmiVoice青果店 あみぼいす_せいかてん", even if there's a momentary pause between "あみぼいす" and "せいかてん", it's more likely to be recognized as "AmiVoice青果店".

AmiVoice青果店 あみぼいす_せいかてん

Chinese and Korean

| Character | Character name | Description |
|---|---|---|
| _ | Half-width underscore | Symbol representing silence |
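
The per-language rules in this section can be approximated with a rough client-side check. This is a sketch under stated assumptions, not the engine's actual validation: the language keys ("ja", "zh", "ko"), the function name, and the exact Unicode ranges are all chosen here for illustration, and the engine may accept or reject strings that this check does not.

```python
import re

# Assumed character sets per language (illustrative, not taken from the engine).
PRONUNCIATION_PATTERNS = {
    # Hiragana, katakana, the long-vowel mark "ー", plus "." and "_".
    "ja": re.compile(r"[\u3041-\u3096\u30A1-\u30FA\u30FC._]+"),
    # Pinyin syllables, each followed by a tone digit 1-5, plus "_".
    "zh": re.compile(r"(?:[a-zü]+[1-5]|_)+"),
    # Hangul syllables, plus "_".
    "ko": re.compile(r"[\uAC00-\uD7A3_]+"),
}


def looks_like_valid_pronunciation(language: str, pronunciation: str) -> bool:
    """Return True if the whole string uses only the characters assumed above."""
    return PRONUNCIATION_PATTERNS[language].fullmatch(pronunciation) is not None


print(looks_like_valid_pronunciation("ja", "あみぼいす_せいかてん"))  # prints: True
print(looks_like_valid_pronunciation("zh", "wo3men5"))                # prints: True
print(looks_like_valid_pronunciation("zh", "women"))                  # prints: False
```

A check like this only catches obvious mistakes, such as a missing tone number or Latin letters in a Japanese pronunciation, before the word is sent for registration.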

Class

In AmiVoice API, a classification used to specify the category or type of a word is called a class. Classes allow the speech recognition system to distinguish words with the same pronunciation used in different contexts. Classes are defined for each speech recognition engine. For example, in the case of the "会話_汎用" engine (-a-general), the following classes are defined. For details, please see the list of class names for Japanese language models of the speech recognition engine.

  • 固有名詞
  • 名前
  • 名前(名)
  • 駅名
  • 地名
  • 会社名
  • 部署名
  • 役職名
  • 記号
  • 括弧開き
  • 括弧閉じ
  • 元号
note
  • In the "会話_汎用" engine, the 名前 class represents surnames, and 名前(名) represents first names.
  • If a non-existent class name is specified, it will be treated as if no class name was specified.

For example, if you specify a word as the "名前" class, that word will be more easily recognized in contexts where personal names are spoken. Conversely, it will be less likely to be recognized in contexts other than where personal names are spoken, which can reduce problems of incorrect recognition of words with the same pronunciation in different contexts. If there is a class that fits the word you are trying to register, please try to set the class whenever possible.

tip

If a full name is still not recognized well after you register it as a word, try one of the following strategies:

  1. Split the full name into surname and given name, and register the surname in the 名前 class and the given name in the 名前(名) class.
    This makes the name easier to recognize even when there is silence or a filler between the surname and given name. On the other hand, if other homophonic names are also registered, misrecognition becomes more likely. For example, if you register "山田" in the 名前 class and both "太郎" and "太朗" with the pronunciation "たろう" in the 名前(名) class, the utterance "やまだたろう" may be recognized as "山田太朗" even though you wanted "山田太郎".

  2. Insert an underscore (_), which represents silence, between the surname and given name in the pronunciation.
    This makes the name easier to recognize even when there is a slight silence between the surname and given name. However, the name will not be recognized correctly if there is a filler between them.
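
The two strategies above can be sketched as entry builders. Both function names are hypothetical. The class names 名前 and 名前(名) are the ones listed for the "会話_汎用" engine, so other engines may need different names. The second helper leaves the class unset, because the text above does not say which class a full name should use.

```python
def split_name_entries(surname, surname_pron, given, given_pron):
    """Strategy 1: register surname and given name separately, each with its class."""
    return [
        f"{surname} {surname_pron} 名前",
        f"{given} {given_pron} 名前(名)",
    ]


def joined_name_entry(surname, surname_pron, given, given_pron):
    """Strategy 2: one entry whose pronunciation allows a short silence via "_"."""
    return f"{surname}{given} {surname_pron}_{given_pron}"


for line in split_name_entries("山田", "やまだ", "太郎", "たろう"):
    print(line)
print(joined_name_entry("山田", "やまだ", "太郎", "たろう"))
```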

How to set a class

The class is specified after the "notation" and "pronunciation", separated from the pronunciation by a space. For example, if the station name "アソーク駅" is not recognized and you want to specify the class name 駅名, write it as follows:

アソーク駅 あそーくえき 駅名
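
To illustrate the layout of an entry, the following sketch parses one line into its components, treating the class as optional. `WordEntry` and `parse_word_line` are hypothetical names. The sketch assumes fields are separated by exactly one half-width space and that class names contain no spaces, which holds for the class names listed above.

```python
from typing import NamedTuple, Optional


class WordEntry(NamedTuple):
    notation: str
    pronunciation: str
    class_name: Optional[str] = None  # None when no class is given


def parse_word_line(line: str) -> WordEntry:
    """Split "notation pronunciation [class]" into its fields."""
    parts = line.split(" ")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"expected 'notation pronunciation [class]': {line!r}")
    return WordEntry(*parts)


print(parse_word_line("アソーク駅 あそーくえき 駅名"))
print(parse_word_line("パレオパラドキシア ぱれおぱらどきしあ"))
```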

Special Word Registration

There are special types of word registration that are only supported in some engines.

Filler Words

The "音声入力_氏名" engine is used for recognizing only names, and the "音声入力_住所" engine is used for recognizing only addresses. These engines do not have filler words preset, but if words other than names or addresses are also spoken, users can register filler words themselves as needed.

The classes that can be used for filler words are as follows:

| Class Name | Description |
|---|---|
| フィラー(文頭) | Class used when you want to insert words like "えー", "わたしは", "ぼくは" before a full name or surname, or words like "えーと", "住所は" before an address |
| フィラー(文末) | Class used when you want to insert words like "です" or "ともうします" after a full name or first name, or words like "です" after an address |

When registering filler words, please enclose the notation in half-width percent signs (%). For the pronunciation, write the word's pronunciation as usual without adding percent signs. For example, if you want to register the word "あのー" in the "フィラー(文頭)" class, write it as follows:

%あのー% あのー フィラー(文頭)
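
The filler format can be sketched as a small helper that wraps only the notation in percent signs and rejects any class other than the two listed above. The name `format_filler` is hypothetical and is not part of the AmiVoice API.

```python
FILLER_CLASSES = ("フィラー(文頭)", "フィラー(文末)")


def format_filler(word: str, pronunciation: str, filler_class: str) -> str:
    """Build a filler entry: "%notation% pronunciation class"."""
    if filler_class not in FILLER_CLASSES:
        raise ValueError(f"not a filler class: {filler_class!r}")
    # Only the notation is enclosed in "%"; the pronunciation is written as usual.
    return f"%{word}% {pronunciation} {filler_class}"


print(format_filler("あのー", "あのー", "フィラー(文頭)"))
# prints: %あのー% あのー フィラー(文頭)
```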
tip

In "フィラー(文頭)" or "フィラー(文末)", you can register not only general filler words like "えー" or "あのー", but also words (short phrases) that are used only at the beginning or end of a sentence and that you want to treat as fillers rather than names or addresses.

For example, suppose that in the "音声入力_氏名" engine you register "私は" and "です" as fillers for the beginning and end of the sentence respectively.

%私は% わたしは フィラー(文頭)
%です% です フィラー(文末)

Now suppose you say "わたしは、やまだあみです".

わたしは、やまだあみです

The recognition result for this utterance is as follows, with the words registered as fillers automatically removed:

ヤマダアミ