s Command Packet / s Command Response Packet
The s command packet and s command response packet are paired. When you notify the server of the start of voice data transmission with the s command, the server returns an s command response packet.
If the s command response packet is a single "s" character, the start request succeeded, and you can begin supplying voice data with p command packets. When all voice data has been sent, signal the end of transmission with an e command packet. If the connection is still open after that, you can start over from the s command packet.
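The packet order described above can be sketched as a small state tracker. This is a minimal sketch rather than part of the API: the class and state names are hypothetical, and it assumes the client may send a new s command right after an e command or after a failed start request, which this section does not spell out.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # connected, no transmission in progress
    WAITING = auto()    # s sent, response packet not yet received
    STREAMING = auto()  # s accepted; p and e packets are allowed


class SessionFlow:
    """Tracks which command packet may be sent next (hypothetical helper)."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def send(self, command: str) -> None:
        if command == "s" and self.state is State.IDLE:
            self.state = State.WAITING
        elif command == "p" and self.state is State.STREAMING:
            pass  # voice data may be sent repeatedly
        elif command == "e" and self.state is State.STREAMING:
            # Assumption: a new s command may follow immediately.
            self.state = State.IDLE
        else:
            raise RuntimeError(f"{command!r} not allowed in state {self.state.name}")

    def receive_s_response(self, packet: str) -> None:
        if self.state is not State.WAITING:
            raise RuntimeError("unexpected s command response packet")
        # A bare "s" means success; anything else carries an error message.
        self.state = State.STREAMING if packet == "s" else State.IDLE
```

Calling `send("p")` before a successful s response raises an error, which makes ordering mistakes visible early in client code.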
s Command Packet
Starts voice data transmission. In addition to notifying the start of transmission, you need to send the format of the voice to be transmitted, the speech recognition engine (connection engine name) you want to use, authentication information (APPKEY), and other parameters.
Format
Type TEXT
s <audio_format> <grammar_file_names>
s <audio_format> <grammar_file_names> <key>=<value> ...
The delimiter between s and each parameter block is a single space. Details of each parameter are explained below.
<audio_format>
Specifies the audio format to be sent. This parameter is required. For specifiable format names, please see About Audio Formats in the usage guide.
<grammar_file_names>
Specifies the connection engine name. This parameter is required.
<key>
The following key strings can be specified for <key>. If <value> contains spaces, enclose the value in double quotes "".
<key> | Description |
---|---|
authorization | Set authorization to the [APPKEY] listed on your My Page, or to a one-time APPKEY obtained from the One-time APPKEY Issuance API. This parameter is required. |
profileId | The ID of the user-specific data file (profile) in which user-specific words are registered. Only that user can use the specified profile. Profiles are stored in user-specific areas, so names never collide with those of other users. * For profileId, specify a string consisting of alphanumeric characters, "-" (hyphen), and "_" (underscore). Strings starting with "__" (two underscores) are reserved by the speech recognition engine, so do not specify them. When you register words on the word registration screen of My Page, a profile (My Word List) with the same name as the Service ID (automatically generated from the User ID) shown in the connection information on My Page is automatically created and saved on the server. To use this profile (My Word List) for speech recognition, set profileId to the Service ID prefixed with a colon ":". (Example) If the Service ID is "aiueo12345", set profileId to ":aiueo12345". Word registration: the easiest way to register words is the word registration screen on My Page; with this method you rarely need the profileWords parameter described below. For the relationship between profiles (My Word List) created on the word registration screen and profileWords, see "profileId and profileWords" on this page. When specifying both profileId and profileWords, specify profileId first. |
profileWords | Set the notation and reading of the words you want to register. Write each word as "notation<space>reading". To register multiple words, separate them with "|". When sending, enclose the entire value of the profileWords parameter in double quotes "". * To specify a class name, use "notation<space>reading<space>class name". |
keepFillerToken | To include filler words (such as "あー" or "えー") in the recognition result string, specify keepFillerToken=1. There are various filler words, but each is enclosed in "%" at the beginning and end of its notation. (Example) %あー% %えー% %おー% %えーっと% If keepFillerToken=1 is not specified, filler words are removed from the recognition result string. |
segmenterProperties | Parameters for speech detection. They enable or disable speaker diarization and adjust its results. Multiple parameters can be set, separated by spaces: segmenterProperties="key1=value1 key2=value2..." The following parameters can be set. useDiarizer: setting 1 enables speaker diarization. Specifiable values: 0 or 1. Default value: 0. diarizerAlpha: controls how readily new speakers appear. The larger the value, the more readily new speakers appear; the smaller the value, the less readily. Effective only when useDiarizer=1. Specifiable values: 0 or more. Recommended range: 1e-100 to 1e50. Default value: 1. diarizerTransitionBias: controls how readily the speaker switches. The larger the value, the more readily speakers switch; the smaller the value, the less readily. Effective only when useDiarizer=1. Specifiable values: 0 or more and less than 1. Recommended range: 1e-150 to 1e-10. Default value: 1e-40 (1e-20 for 8k). |
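As an illustration of the format and quoting rules above, the following Python sketch assembles an s command string. The function names are hypothetical and not part of any SDK. It assumes values never contain a double quote, since escaping is not described in this section.

```python
def format_param(key: str, value: str) -> str:
    """Format one <key>=<value> block, quoting where the rules require it."""
    # profileWords is always quoted; other values only when they contain a space.
    if key == "profileWords" or " " in value:
        value = f'"{value}"'
    return f"{key}={value}"


def build_s_command(audio_format: str, engine: str, params: dict[str, str]) -> str:
    """Build an s command packet (hypothetical helper)."""
    if "authorization" not in params:
        raise ValueError("authorization is required")
    # profileId must come before profileWords when both are specified.
    # sorted() is stable, so all other keys keep their original order.
    rank = {"profileId": 0, "profileWords": 1}
    keys = sorted(params, key=lambda k: rank.get(k, 2))
    blocks = [format_param(k, params[k]) for k in keys]
    # Every block is separated by a single space.
    return " ".join(["s", audio_format, engine, *blocks])


command = build_s_command(
    "MSB16K",
    "-a-general",
    {
        "authorization": "XXXXXXXXXXXXXX",
        "segmenterProperties": "useDiarizer=1 diarizerAlpha=1e-20",
    },
)
print(command)
```

The diarizerAlpha value here is only a placeholder inside the recommended range, not a tuning recommendation.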
s command to use words registered on My Page
s MSB16K -a-general profileId=:<UserID> authorization=XXXXXXXXXXXXXX
s command to temporarily register words for this session
s MSB16K -a-general profileWords="AMI あみ|AmiVoice あみぼいす" authorization=XXXXXXXXXXXXXX
In the above example, profileId is not specified, and the two words in "AMI あみ|AmiVoice あみぼいす" are registered. As long as you keep sending voice data within this session, these words are used in the recognition process. After the e command packet is sent (end of session), the words become invalid and are not saved.
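The profileWords value used in this example can be generated from a word list. The sketch below is one possible way to do it; the function name is hypothetical, and rejecting spaces and "|" inside a field is an assumption made here so that the result stays unambiguous, not a documented rule.

```python
def build_profile_words(entries) -> str:
    """Join (notation, reading) or (notation, reading, class_name) tuples.

    Returns the unquoted profileWords value; the caller adds the double quotes.
    """
    parts = []
    for entry in entries:
        if len(entry) not in (2, 3):
            raise ValueError(f"expected 2 or 3 fields: {entry!r}")
        for field in entry:
            # Assumption: fields must be non-empty and free of separators.
            if not field or " " in field or "|" in field:
                raise ValueError(f"invalid field: {field!r}")
        parts.append(" ".join(entry))
    return "|".join(parts)


words = build_profile_words([("AMI", "あみ"), ("AmiVoice", "あみぼいす")])
print(f'profileWords="{words}"')
```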
s command to create a custom profile and register words
s MSB16K -a-general profileId=test profileWords="AMI あみ|AmiVoice あみぼいす" authorization=XXXXXXXXXXXXXX
In the above example, profileId is "test", and the registered words are "AMI あみ|AmiVoice あみぼいす" (2 words).
*You cannot manage custom profiles on the word registration screen of My Page.
How to save words to a custom profile
After sending the s command with profileId and profileWords added, send the e command packet without sending voice data. When the stop of voice data transmission is accepted, the words are saved to the specified profileId.
To save and register words in a profile, you need to send all the words you want to save and register each time. If you want to add different words later after registering words once, you need to send all the words to be added along with the previously registered words.
Saving words to a profile is a "complete replacement". It is the user's responsibility to know what words are currently registered in the profile and what words have been registered in the past.
- With word registration on My Page, you can check the currently registered words on the screen, and users can choose to add from CSV files or completely replace.
- For connections that save words to a custom profile, do not add ":" (colon) to the beginning of profileId. If you add a colon to the beginning, the words will not be saved to that profile, so please be careful.
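Because saving is a complete replacement, a client that wants to add words has to resend the words it saved earlier together with the new ones. A minimal sketch of that bookkeeping follows, assuming the client keeps its own copy of the saved list; the function name is hypothetical.

```python
def merged_profile_words(saved, additions) -> str:
    """Combine previously saved and new (notation, reading) pairs.

    dict.fromkeys removes exact duplicates while keeping first-seen order.
    """
    merged = list(dict.fromkeys([*saved, *additions]))
    return "|".join(f"{notation} {reading}" for notation, reading in merged)


saved = [("AMI", "あみ")]
additions = [("AmiVoice", "あみぼいす"), ("AMI", "あみ")]
print(merged_profile_words(saved, additions))
```

The result is the full list to send as profileWords, so the earlier entry survives the replacement.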
How to use registered words
To perform speech recognition using words registered in a profile, specify the profileId of the profile where the words were registered, with ":" (colon) added to the beginning, when sending the s command packet.
If multiple members are using the profile and profileId is specified without ":" at the beginning, recognition accuracy may decrease.
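The difference between the profileId used for saving (no colon) and the one used for recognition (leading colon) is easy to get wrong, so it can help to derive both from a single profile name. The helpers below are hypothetical, and they assume "alphanumeric" means ASCII letters and digits.

```python
import re

# Allowed characters: alphanumerics, "-" and "_" (ASCII is an assumption).
_PROFILE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def check_profile_name(name: str) -> str:
    if not _PROFILE_NAME.fullmatch(name):
        raise ValueError(f"invalid characters in profile name: {name!r}")
    if name.startswith("__"):
        raise ValueError("names starting with '__' are reserved by the engine")
    return name


def profile_id_for_saving(name: str) -> str:
    """profileId value for a connection that saves words: no leading colon."""
    return check_profile_name(name)


def profile_id_for_recognition(name: str) -> str:
    """profileId value for a connection that uses saved words: leading colon."""
    return ":" + check_profile_name(name)


print(profile_id_for_saving("test"), profile_id_for_recognition("test"))
```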
How to delete registered words
To delete registered words, specify profileId and then a single space for profileWords when sending the s command.
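On the wire, a single-space value has to be quoted, so the parameter becomes profileWords=" ". The following sketch builds such a command; the function name is hypothetical. This section does not say whether an e command packet without voice data must follow, as it must when saving words.

```python
def build_delete_words_command(
    audio_format: str, engine: str, profile_id: str, appkey: str
) -> str:
    """Build an s command that clears the words in a profile (hypothetical helper)."""
    # profileId comes first, then profileWords holding exactly one space.
    return (
        f"s {audio_format} {engine} "
        f'profileId={profile_id} profileWords=" " '
        f"authorization={appkey}"
    )


print(build_delete_words_command("MSB16K", "-a-general", "test", "XXXXXXXXXXXXXX"))
```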
s command to use a custom profile for speech recognition
s MSB16K -a-general profileId=:test authorization=XXXXXXXXXXXXXX
s Command Response Packet
This is sent from the server to the client in response to the s command.
Format
Type TEXT
Response packet for successful start request
If the start request is successful, a single s character is returned.
s
Response packet for failed start request
If the start request fails, an error message is returned after s with a single space in between.
s <error_message>
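The two response forms above can be told apart with a few lines of code. This is a minimal sketch with a hypothetical function name; it returns None on success and the error message on failure.

```python
from typing import Optional


def parse_s_response(packet: str) -> Optional[str]:
    """Return None if the start request succeeded, else the error message."""
    if packet == "s":
        return None
    if packet.startswith("s "):
        # The error message follows "s" and a single space.
        return packet[2:]
    raise ValueError(f"not an s command response packet: {packet!r}")


error = parse_s_response("s can't verify service authorization")
print(error if error is not None else "started")
```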
Error Messages
Client Errors
These errors are caused by incorrect request parameters or authentication information in the s command. Please correct the request and resend it.
Error Message | Content |
---|---|
s received unsupported audio format | There was an error in the specified audio format. |
s can't verify service authorization | Authentication failed. This is due to one of the following reasons: - APPKEY is not set - The set APPKEY is incorrect - Accessed from an IP address not allowed by the One-time APPKEY |
s service authorization has expired: <expirationTime> <expiresIn> | The expiration time limited by the One-time APPKEY has passed. |
s can't connect to recognizer server | Authentication failed. The One-time APPKEY is invalid. |
s can't connect to recognizer server (can't find available servers) | Connection failed because an appropriate engine could not be found from the specified connection engine name or audio format. |
s can't start feeding audio data to recognizer server | The process to start sending audio data failed due to an error in the specified segmenter parameters. |
s can't connect to recognizer server (can't find available servers because requested dictation grammar file name is invalid) | Connection failed because the specified connection engine name was invalid. |
Server Errors
These are errors that may rarely occur due to infrastructure system failures. Please wait for a while and resend the request.
Error Message | Content |
---|---|
s can't connect to recognizer server (can't connect to server) | Could not connect to the speech recognition server. |
s can't connect to recognizer server (can't find available servers because all requested servers are busy) | Could not connect because all appropriate speech recognition servers for the specified connection engine name or audio format were busy. |
s can't connect to recognizer server (can't send data) | Connection failed due to communication error between servers in the infrastructure system. |
s can't connect to recognizer server (can't receive data) | Connection failed due to communication error between servers in the infrastructure system. |
s can't connect to recognizer server (disconnected by force) | Connection failed due to communication error between servers in the infrastructure system. |
Errors Due to Limitations
These occur when limitations are violated. Please retry from the s command request.
Error Message | Content |
---|---|
s session timeout occurred | A session timeout occurred. This occurs when the maximum session time in the limitations is exceeded. The server has initiated the disconnection process. |
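Since each of the three groups calls for a different reaction (fix the request, wait and resend, or retry from the s command), a client may want to classify the error text. The sketch below matches on the message strings listed in the tables above. The enum and function names are hypothetical, and matching on message text is an assumption of this sketch, not a documented contract.

```python
from enum import Enum


class ErrorKind(Enum):
    CLIENT = "fix the request and resend"
    SERVER = "wait a while and resend"
    LIMIT = "retry from the s command"
    UNKNOWN = "not listed in the tables"


# Parenthesized details that appear only in the Server Errors table.
SERVER_DETAILS = (
    "(can't connect to server)",
    "(can't find available servers because all requested servers are busy)",
    "(can't send data)",
    "(can't receive data)",
    "(disconnected by force)",
)

CLIENT_PREFIXES = (
    "received unsupported audio format",
    "can't verify service authorization",
    "service authorization has expired",
    "can't connect to recognizer server",
    "can't start feeding audio data to recognizer server",
)


def classify_s_error(packet: str) -> ErrorKind:
    message = packet[2:] if packet.startswith("s ") else packet
    if message.startswith("session timeout occurred"):
        return ErrorKind.LIMIT
    # Check server details first: they share a prefix with client errors.
    if message.endswith(SERVER_DETAILS):
        return ErrorKind.SERVER
    if message.startswith(CLIENT_PREFIXES):
        return ErrorKind.CLIENT
    return ErrorKind.UNKNOWN


print(classify_s_error("s can't connect to recognizer server (can't send data)").name)
```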