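WebSocket Interface

Once a WebSocket connection is established, you send speech recognition requests as text messages and receive the responses incrementally. This lets you stream audio data, such as live microphone input, a piece at a time and obtain recognition results as they become available.

The processing steps are as follows:

  1. Establish a WebSocket connection
  2. Send a speech recognition request
  3. Send audio data
  4. Receive status events
  5. Receive speech recognition results
  6. Signal the end of audio data
  7. Close the WebSocket connection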
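Note

A client library is available that hides the details of the WebSocket interface and makes it easy to build real-time speech recognition applications. For how to use it, see "Using the real-time speech recognition library Wrp".

After establishing the WebSocket connection, an application that streams audio is implemented by sending commands (s, p, e), receiving the command responses, and handling the events (S, E, C, U, A) that the server sends as it processes the audio. The general flow is as follows:

Figure. Commands and events

The sections below walk through the implementation step by step. For details on the commands and command responses (s, p, e), the events sent by the server (S, E, C, U, A), and their contents, see "Streaming responses".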

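Usage

1. Establish a WebSocket connection

Connect to the speech recognition server over WebSocket. You choose whether to allow log saving by connecting to one of the following two endpoints:

wss://acp-api.amivoice.com/v1/     (logs are saved)
wss://acp-api.amivoice.com/v1/nolog/ (logs are not saved)

For details on log saving, see "Log saving".

Communication with the server uses text messages. This guide uses Python for its examples, but in any other language you can perform real-time speech recognition the same way: establish a WebSocket connection, then send and receive text messages.

The examples use Python's websocket-client library to simplify WebSocket handling, and connect to the log-saving endpoint of the AmiVoice API WebSocket interface. on_open is called when the WebSocket connection to the server is established, and on_message is called when a message is received from the server. The following steps add processing to these handlers as the communication with the speech recognition server is explained.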

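import websocket
import logging


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(threadName)s %(message)s")


def on_open(ws):
    # Called once the WebSocket connection is established
    logger.info("open")

def on_message(ws, message):
    # Called for every message received from the server
    logger.info(f"message: {message}")

def on_close(ws, *args):
    # *args absorbs the close status code and message that newer
    # websocket-client versions pass to this callback
    logger.info("close")


ws = websocket.WebSocketApp('wss://acp-api.amivoice.com/v1/',
                            on_open=on_open,
                            on_message=on_message,
                            on_close=on_close)
ws.run_forever()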
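Caution

When the underlying infrastructure is under very heavy load, the WebSocket connection can fail on rare occasions. If this happens, retry several times until the connection succeeds.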
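
The retry advice above can be sketched as a small helper. This is only an illustration: connect_with_retry, its parameters, and the exponential backoff delays are assumptions made for this example and are not part of the AmiVoice API.

```python
import time


def connect_with_retry(connect, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is reached.

    connect is any zero-argument callable that raises an exception on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 0.5 s, 1 s, 2 s, ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))
```

One possible use is connect_with_retry(lambda: websocket.create_connection('wss://acp-api.amivoice.com/v1/')), since create_connection raises when the connection cannot be opened. The attempt count and delays should be tuned to your application.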

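2. Send a speech recognition request

Once the WebSocket connection succeeds, send the s command. The s command has the following format:

s <audio_format> <grammar_file_names> <key>=<value> ...

audio_format specifies the audio format of the audio to be sent in this session. grammar_file_names specifies the connection engine name, one of the request parameters. The authentication information (authorization) is then set in <key>=<value> form, as in authorization={APPKEY}. Other request parameters can also be set in <key>=<value> form.

As an example, consider transcribing the audio file bundled with the sample (test.wav) with the general-purpose engine (-a-general). Because this audio file has a 16 kHz sampling rate and is a wav container file, specify 16K for audio_format. For details, see "Audio files with a header". For grammar_file_names, specify -a-general, the general-purpose engine. Add the following code to on_open, the handler called when the WebSocket connection is established: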

APPKEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

def on_open(ws):
    logger.info("open")
    # Request recognition of 16 kHz audio with the general-purpose engine
    command = f"s 16K -a-general authorization={APPKEY}"
    logger.info(f"send> {command}")
    ws.send(command)

To set other request parameters, append them to the s command in the form <key1>=<value1> <key2>=<value2>.... Here, two parameters are added to the s command line: keepFillerToken, so that filler words such as "あのー" and "えーっと" are kept in the output, and resultUpdatedInterval, which changes the interval at which update events are sent to 1 second.

def on_open(ws):
    logger.info("open")
    command = f"s 16K -a-general authorization={APPKEY} keepFillerToken=1 resultUpdatedInterval=1000"
    logger.info(f"send> {command}")
    ws.send(command)

If a value set in a request parameter contains spaces, enclose the value in double quotes, as in "value". For example, to set several parameters in segmenterProperties:

segmenterProperties="useDiarizer=1 diarizerAlpha=1e-20 diarizerTransitionBias=1e-10"

on_open then looks like this:

def on_open(ws):
    logger.info("open")
    command = f"s 16K -a-general authorization={APPKEY} segmenterProperties=\"useDiarizer=1 diarizerAlpha=1e-20 diarizerTransitionBias=1e-10\""
    logger.info(f"send> {command}")
    ws.send(command)

For details, see "s command packet" in the reference documentation.
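
Because the quoting rule is easy to get wrong when the command is assembled by hand, a small builder can help. The sketch below is illustrative: build_s_command is a hypothetical helper, and doubling embedded double quotes is an assumption extrapolated from how the full sample in this document treats profileWords.

```python
def build_s_command(audio_format, grammar_file_names, **params):
    """Assemble an s command line, quoting values that contain whitespace."""
    parts = ["s", audio_format, grammar_file_names]
    for key, value in params.items():
        value = str(value)
        if value == "":
            continue  # skip parameters that are left unset
        if any(ch.isspace() for ch in value):
            # Assumption: embedded double quotes are doubled, mirroring how the
            # full sample treats profileWords
            value = '"' + value.replace('"', '""') + '"'
        parts.append(f"{key}={value}")
    return " ".join(parts)
```

For example, build_s_command("16K", "-a-general", authorization=APPKEY, keepFillerToken=1) produces the same command line as the second on_open shown above, apart from the omitted resultUpdatedInterval.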

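Response

When the server accepts the speech recognition request, it returns a text message: the s command response packet.

On success
s
On failure

The response consists of s, a single half-width space, and then an error message. For the types of errors, see "s command response packet" in the reference documentation.

s <error message>

Example:

s received unsupported audio format
Caution

When the underlying infrastructure is under very heavy load, the following error may be returned on rare occasions. If this happens, retry several times until the s command succeeds. For details on this error message, see "Server errors" under "s command packet" in the reference documentation. See also "State transitions of the client program" later in this document.

s can't connect to recognizer server (can't connect to server)

To handle the response to the s command, add the following code to on_message: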

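def on_message(ws, message):
    # The first character is the packet type; any payload follows a space
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        if content != "":
            # A non-empty payload is an error message
            logger.error(content)
            return
        # The s command succeeded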
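
To separate the retryable "server busy" error from other failures, the response can be classified before deciding what to do next. This is a minimal sketch: parse_packet and s_command_result are hypothetical helper names, and matching the error by prefix is an assumption based on the example message shown above.

```python
def parse_packet(message):
    """Split a text packet into (event, content)."""
    event = message[:1]
    content = message[2:].rstrip()
    return event, content


def s_command_result(message):
    """Classify an s command response as 'ok', 'retry' or 'error'."""
    event, content = parse_packet(message)
    if event != "s":
        raise ValueError(f"not an s response: {message!r}")
    if content == "":
        return "ok"
    # Assumption: prefix match, because the message carries a parenthesised suffix
    if content.startswith("can't connect to recognizer server"):
        return "retry"
    return "error"
```

A caller could resend the s command on 'retry' and report the error message on 'error'.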

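3. Send audio data

After the s command succeeds, you can send the audio. Binary audio data is sent with the p command, which has the following format:

p<audio_data>

audio_data is the binary audio data. It must match the audio format specified in the s command at the start of the session.

Information

Send audio data that matches the audio format specified in the s command. A mismatched format does not produce an error, but the response may take a very long time or the recognition result may be incorrect.

Note
  • A single p command can carry at most 16 MB of audio data. If the data is larger, split it.
  • Audio data may be split at any position; there is no need to respect wav chunk boundaries or the frame boundaries of mp3, ogg, opus, and so on.
  • The format of the data cannot be changed partway through. To change the audio format, end the session with the e command and send a new speech recognition request starting from s. The same applies to audio files with a header: end with the e command after each file, then start again from s for the next file.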
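
The size limit and the freedom to split at any byte offset can be sketched as a generator of p packets. iter_p_packets and MAX_AUDIO_BYTES are names invented for this example, and reading "16 MB" as 16,000,000 bytes is a deliberately conservative assumption.

```python
MAX_AUDIO_BYTES = 16_000_000  # assumption: conservative reading of "16 MB"


def iter_p_packets(audio, chunk_size=MAX_AUDIO_BYTES):
    """Yield binary p command packets, each carrying at most chunk_size bytes."""
    if not 0 < chunk_size <= MAX_AUDIO_BYTES:
        raise ValueError("chunk_size must be between 1 and MAX_AUDIO_BYTES")
    # Any byte offset is a valid split point, so plain slicing is enough
    for offset in range(0, len(audio), chunk_size):
        yield b"p" + audio[offset:offset + chunk_size]
```

Each yielded packet can be passed to ws.send with opcode=websocket.ABNF.OPCODE_BINARY, as the sample code in this step does.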

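In the handler for the s command response, send the audio file bundled with the sample (test.wav). Add the following to on_message. To simulate recording from a microphone, the data is sent at the same pace as real time, which is why sleep is used.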

import time
import threading


def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        if content != "":
            logger.error(content)
            return

        # The s command succeeded
        # Because the request succeeded, send the audio file data to the server
        # (filename and audio_block_size are defined in the full sample code)
        def send_audio(*args):
            with open(filename, mode='rb') as file:
                buf = file.read(audio_block_size)
                while buf:
                    logger.info("send> p [..({} bytes)..]".format(len(buf)))
                    ws.send(b'p' + buf,
                            opcode=websocket.ABNF.OPCODE_BINARY)
                    buf = file.read(audio_block_size)
                    # test.wav is 16-bit, 16 kHz audio, so 32,000 bytes per second. Sending 16,000 bytes and then waiting 0.5 seconds matches real time
                    time.sleep(0.5)
            logger.info("send> e")
            # After all audio has been sent, send the e command
            ws.send('e')
        threading.Thread(target=send_audio).start()
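
The 0.5 second wait follows from the audio format, and the same arithmetic gives the wait for other block sizes. The helper below is a sketch with an invented name (chunk_interval); it assumes uncompressed PCM and ignores the few bytes taken by the wav header.

```python
def chunk_interval(chunk_bytes, sample_rate_hz=16000, bits_per_sample=16, channels=1):
    """Return how many seconds of audio chunk_bytes of raw PCM represents."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return chunk_bytes / bytes_per_second
```

With the values used in this guide, chunk_interval(16000) evaluates to 16000 / 32000 = 0.5 seconds.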

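4. Receive status events

After audio is sent, the following events arrive: G, utterance start (S), utterance end (E), and start of speech recognition processing (C). G is an event that conveys information generated on the server side; ignore it. The S and E events carry a time in milliseconds relative to the start of the audio, which is 0.

For example, the events received from the server when the audio file (test.wav) is sent are as follows:

G
S 250
C
E 8800

To handle status events, add the following code to on_message. The code added here has no visible effect; if you want to use the utterance start and end times, add appropriate processing.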

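def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        pass  # ...omitted (see step 3)...
    elif event == 'G':
        # Server-generated information; ignore
        pass
    elif event == 'S':
        # Utterance start, in ms from the start of the audio
        starttime = int(content)
    elif event == 'E':
        # Utterance end, in ms from the start of the audio
        endtime = int(content)
    elif event == 'C':
        # Speech recognition processing started
        pass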
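Note

If no speech can be detected in the audio data, these events are not delivered, and neither are speech recognition result events. Check the following possible causes:

  • The audio contains no speech at all, or the volume is very low. Check that the recording system is not muted and that the volume is set appropriately.
  • The audio format does not match the audio data. Check the audio format.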
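
If the utterance times from the S and E events are needed, one option is to pair them up as they arrive. The class below is a sketch for illustration only; UtteranceTimes and its methods are hypothetical names, not part of any AmiVoice library.

```python
class UtteranceTimes:
    """Collect (start_ms, end_ms) pairs from S and E status events."""

    def __init__(self):
        self.spans = []
        self._start = None

    def feed(self, message):
        event, content = message[:1], message[2:].rstrip()
        if event == "S":
            self._start = int(content)
        elif event == "E" and self._start is not None:
            # Pair the utterance end with the most recent start
            self.spans.append((self._start, int(content)))
            self._start = None
```

Feeding the four example events shown in this step (G, S 250, C, E 8800) leaves spans equal to [(250, 8800)].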

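5. Receive speech recognition results

The speech recognition server reports interim results as U events. Because resultUpdatedInterval=1000 was set in the s command here, a U event arrives roughly once per second. When processing finishes and the result is finalized, an A event arrives. For details on the results, see "Speech recognition results". For the full sequence of results for the test.wav audio, see the results of the sample code.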

import json


def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        pass  # ...omitted (see step 3)...
    elif event == 'G':
        pass
    elif event == 'S':
        starttime = int(content)
    elif event == 'E':
        endtime = int(content)
    elif event == 'C':
        pass
    elif event == 'U':
        # Interim result (JSON)
        raw = json.loads(content) if content else ''
    elif event == 'A' or event == 'R':
        # A carries the finalized result (JSON)
        raw = json.loads(content) if content else ''
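
Once the JSON payload is parsed, the recognized text and the individual tokens can be pulled out. The sketch below uses only field names that appear in the sample log in this document (results, tokens, written, confidence, text); summarize_result itself is a hypothetical helper.

```python
import json


def summarize_result(content):
    """Return (text, tokens) from the JSON payload of a U or A event."""
    data = json.loads(content)
    # U events carry only "written"; A events also carry "confidence"
    tokens = [
        (token["written"], token.get("confidence"))
        for result in data.get("results", [])
        for token in result.get("tokens", [])
    ]
    return data.get("text", ""), tokens
```

In on_message this could be called as summarize_result(content) inside the 'U' and 'A' branches when content is not empty.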

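6. Signal the end of audio data

When all audio has been sent, send the e command to end the speech recognition session.

e

In the following code, the e command is sent after all the audio file data has been sent: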

def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        if content != "":
            logger.error(content)
            return

        # The s command succeeded
        # Because the request succeeded, send the audio file data to the server
        def send_audio(*args):
            with open(filename, mode='rb') as file:
                buf = file.read(audio_block_size)
                while buf:
                    logger.info("send> p [..({} bytes)..]".format(len(buf)))
                    ws.send(b'p' + buf,
                            opcode=websocket.ABNF.OPCODE_BINARY)
                    buf = file.read(audio_block_size)
                    # test.wav is 16-bit, 16 kHz audio, so 32,000 bytes per second. Sending 16,000 bytes and then waiting 0.5 seconds matches real time
                    time.sleep(0.5)
            # After all audio has been sent, send the e command
            logger.info("send> e")
            ws.send('e')
        threading.Thread(target=send_audio).start()

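After the e command is sent, the speech recognition server processes all the audio it has received, returns all the results, and then returns the response to the e command. Depending on the length of the audio sent, this can take some time, so wait for the e command response to make sure you have received every result.

Caution

To cope with unexpected situations such as communication failures or server delays, set appropriate communication timeouts so that the application keeps working properly even if no response arrives from the speech recognition server.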
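
One way to apply such a timeout is to record when the last packet arrived and check that timestamp periodically. The following is a sketch under assumptions: ResponseDeadline is a hypothetical class, and a suitable timeout value depends on your application and audio length.

```python
import time


class ResponseDeadline:
    """Track whether the server has been silent for longer than timeout seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def touch(self):
        # Call from on_message whenever any packet arrives
        self.last_seen = self.clock()

    def expired(self):
        return self.clock() - self.last_seen > self.timeout
```

A background thread could poll expired() after the e command has been sent and call ws.close() when it returns True, so the application does not wait indefinitely.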

7. Close the WebSocket connection

Here, the WebSocket connection is closed when the response to the e command is received.

def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        pass  # ...omitted (see step 3)...
    elif event == 'G':
        pass
    elif event == 'S':
        starttime = int(content)
    elif event == 'E':
        endtime = int(content)
    elif event == 'C':
        pass
    elif event == 'U':
        raw = json.loads(content) if content else ''
    elif event == 'A' or event == 'R':
        raw = json.loads(content) if content else ''
    elif event == 'e':
        # All results have been received; close the connection
        logger.info("close>")
        ws.close()

Sample code

The complete Python code is shown below.

websocket-sample.py
import time
import websocket
import json
import threading
import logging


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(threadName)s %(message)s")

server = 'wss://acp-api.amivoice.com/v1/'
filename = 'test.wav'
codec = "16K"
audio_block_size = 16000
grammar_file_names = "-a-general"
options = {
    "profileId" : "",
    "profileWords" : "",
    "keepFillerToken": "",
    "resultUpdatedInterval" : "1000",
    "authorization" : 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
}


def on_open(ws):
    logger.info("open")
    def start(*args):
        # Build the s command, skipping options that are left empty
        command = "s {} {}".format(codec, grammar_file_names)
        for k, v in options.items():
            if v != "":
                if k == 'profileWords':
                    # Quote the value and double any embedded double quotes
                    v = '"' + v.replace('"', '""') + '"'
                command += f" {k}={v}"
        logger.info(f"send> {command}")
        ws.send(command)
    threading.Thread(target=start).start()


def on_message(ws, message):
    event = message[0]
    content = message[2:].rstrip()
    logger.info(f"message: {event} {content}")

    if event == 's':
        if content != "":
            # A non-empty payload is an error message
            logger.error(content)
            return

        # The s command succeeded: stream the audio file at real-time pace
        def send_audio(*args):
            with open(filename, mode='rb') as file:
                buf = file.read(audio_block_size)
                while buf:
                    logger.debug("send> p [..({} bytes)..]".format(len(buf)))
                    ws.send(b'p' + buf,
                            opcode=websocket.ABNF.OPCODE_BINARY)
                    buf = file.read(audio_block_size)
                    time.sleep(0.5)
            # After all audio has been sent, send the e command
            logger.info("send> e")
            ws.send('e')
        threading.Thread(target=send_audio).start()

    elif event == 'G':
        pass
    elif event == 'S':
        starttime = int(content)
    elif event == 'E':
        endtime = int(content)
    elif event == 'C':
        pass
    elif event == 'U':
        raw = json.loads(content) if content else ''
    elif event == 'A' or event == 'R':
        raw = json.loads(content) if content else ''
    elif event == 'e':
        logger.info("close>")
        ws.close()


def on_close(ws, *args):
    # *args absorbs the close status code and message that newer
    # websocket-client versions pass to this callback
    logger.info("close")


logger.info("open> {}".format(server))
ws = websocket.WebSocketApp(server,
                            on_open=on_open,
                            on_message=on_message,
                            on_close=on_close)
ws.run_forever()

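Running

Run the sample as follows:

$ python websocket-sample.py

Results

The operation log produced when the audio file (test.wav) is sent is shown below. Each line shows the time in milliseconds since the program started, the thread name, and the message.

The Thread-1 lines show audio being sent (send> p [..(16000 bytes)..]), and you can confirm that an interim result (message: U) is obtained roughly once per second. At the end, the finalized result (message: A) is obtained as well. Note that in this log the s command was sent with audio_format LSB16K, whereas the sample code above specifies 16K.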

4 MainThread open> wss://acp-api.amivoice.com/v1/
94 MainThread open
94 MainThread send> s LSB16K -a-general resultUpdatedInterval=1000 authorization={APPKEY}
133 MainThread message: s
134 Thread-1 send> p [..(16000 bytes)..]
174 MainThread message: G
637 Thread-1 send> p [..(16000 bytes)..]
668 MainThread message: S 250
668 MainThread message: C
1139 Thread-1 send> p [..(16000 bytes)..]
1639 Thread-1 send> p [..(16000 bytes)..]
2144 Thread-1 send> p [..(16000 bytes)..]
2647 Thread-1 send> p [..(16000 bytes)..]
3148 Thread-1 send> p [..(16000 bytes)..]
3174 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2..."}
3649 Thread-1 send> p [..(16000 bytes)..]
4153 Thread-1 send> p [..(16000 bytes)..]
4179 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u3072\u3068\u3068\u304d"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u3072\u3068\u3068\u304d..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u3072\u3068\u3068\u304d..."}
4656 Thread-1 send> p [..(16000 bytes)..]
5157 Thread-1 send> p [..(16000 bytes)..]
5184 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a..."}
5658 Thread-1 send> p [..(16000 bytes)..]
6159 Thread-1 send> p [..(16000 bytes)..]
6187 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3"},{"written":"\u3092"},{"written":"\u6301"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u6301..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u6301..."}
6660 Thread-1 send> p [..(16000 bytes)..]
7161 Thread-1 send> p [..(16000 bytes)..]
7185 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3"},{"written":"\u3092"},{"written":"\u5b9f\u73fe"},{"written":"\u3057"},{"written":"\u3001"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001..."}
7662 Thread-1 send> p [..(16000 bytes)..]
8164 Thread-1 send> p [..(16000 bytes)..]
8199 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3"},{"written":"\u3092"},{"written":"\u5b9f\u73fe"},{"written":"\u3057"},{"written":"\u3001"},{"written":"\u8c4a\u304b"},{"written":"\u306a"},{"written":"\u672a\u6765"},{"written":"\u3092"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092..."}
8668 Thread-1 send> p [..(16000 bytes)..]
9171 Thread-1 send> p [..(16000 bytes)..]
9188 MainThread message: E 8800
9190 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3"},{"written":"\u3092"},{"written":"\u5b9f\u73fe"},{"written":"\u3057"},{"written":"\u3001"},{"written":"\u8c4a\u304b"},{"written":"\u306a"},{"written":"\u672a\u6765"},{"written":"\u3092"},{"written":"\u5275\u9020"},{"written":"\u3057\u3066"},{"written":"\u3044\u304f"},{"written":"\u3053\u3068"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068..."}
9390 MainThread message: U {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2"},{"written":"\u306f"},{"written":"\u3001"},{"written":"\u4eba"},{"written":"\u3068"},{"written":"\u6a5f\u68b0"},{"written":"\u3068"},{"written":"\u306e"},{"written":"\u81ea\u7136"},{"written":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3"},{"written":"\u3092"},{"written":"\u5b9f\u73fe"},{"written":"\u3057"},{"written":"\u3001"},{"written":"\u8c4a\u304b"},{"written":"\u306a"},{"written":"\u672a\u6765"},{"written":"\u3092"},{"written":"\u5275\u9020"},{"written":"\u3057\u3066"},{"written":"\u3044\u304f"},{"written":"\u3053\u3068"},{"written":"\u3092"},{"written":"\u76ee\u6307\u3057"},{"written":"\u307e\u3059"},{"written":"\u3002"},{"written":"..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068\u3092\u76ee\u6307\u3057\u307e\u3059\u3002..."}],"text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068\u3092\u76ee\u6307\u3057\u307e\u3059\u3002..."}
9471 MainThread message: A {"results":[{"tokens":[{"written":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2","confidence":1.00,"starttime":522,"endtime":1578,"spoken":"\u3042\u3069\u3070\u3093\u3059\u3068\u3081\u3067\u3043\u3042"},{"written":"\u306f","confidence":1.00,"starttime":1578,"endtime":1866,"spoken":"\u306f"},{"written":"\u3001","confidence":0.74,"starttime":1866,"endtime":2026,"spoken":"_"},{"written":"\u4eba","confidence":1.00,"starttime":2026,"endtime":2314,"spoken":"\u3072\u3068"},{"written":"\u3068","confidence":1.00,"starttime":2314,"endtime":2426,"spoken":"\u3068"},{"written":"\u6a5f\u68b0","confidence":1.00,"starttime":2426,"endtime":2826,"spoken":"\u304d\u304b\u3044"},{"written":"\u3068","confidence":1.00,"starttime":2826,"endtime":2954,"spoken":"\u3068"},{"written":"\u306e","confidence":1.00,"starttime":2954,"endtime":3082,"spoken":"\u306e"},{"written":"\u81ea\u7136","confidence":1.00,"starttime":3082,"endtime":3434,"spoken":"\u3057\u305c\u3093"},{"written":"\u306a","confidence":1.00,"starttime":3434,"endtime":3530,"spoken":"\u306a"},{"written":"\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3","confidence":1.00,"starttime":3530,"endtime":4378,"spoken":"\u3053\u307f\u3085\u306b\u3051\u30fc\u3057\u3087\u3093"},{"written":"\u3092","confidence":1.00,"starttime":4378,"endtime":4458,"spoken":"\u3092"},{"written":"\u5b9f\u73fe","confidence":1.00,"starttime":4458,"endtime":4922,"spoken":"\u3058\u3064\u3052\u3093"},{"written":"\u3057","confidence":1.00,"starttime":4922,"endtime":5434,"spoken":"\u3057"},{"written":"\u3001","confidence":1.00,"starttime":5434,"endtime":5546,"spoken":"_"},{"written":"\u8c4a\u304b","confidence":1.00,"starttime":5546,"endtime":5994,"spoken":"\u3086\u305f\u304b"},{"written":"\u306a","confidence":1.00,"starttime":5994,"endtime":6090,"spoken":"\u306a"},{"written":"\u672a\u6765","confidence":1.00,"starttime":6090,"endtime":6490,"spoken":"\u307f\u3089\u3044"},{"written":"\u3092","confidence":1.00,"starttime":6490,"endtime":6554,"spoken":"\u3092"},{"written":"\u5275\u9020","confidence":0.93,"starttime":6554,"endtime":7050,"spoken":"\u305d\u3046\u305e\u3046"},{"written":"\u3057\u3066","confidence":0.99,"starttime":7050,"endtime":7210,"spoken":"\u3057\u3066"},{"written":"\u3044\u304f","confidence":1.00,"starttime":7210,"endtime":7418,"spoken":"\u3044\u304f"},{"written":"\u3053\u3068","confidence":1.00,"starttime":7418,"endtime":7690,"spoken":"\u3053\u3068"},{"written":"\u3092","confidence":1.00,"starttime":7690,"endtime":7722,"spoken":"\u3092"},{"written":"\u76ee\u6307\u3057","confidence":0.77,"starttime":7722,"endtime":8090,"spoken":"\u3081\u3056\u3057"},{"written":"\u307e\u3059","confidence":0.76,"starttime":8090,"endtime":8506,"spoken":"\u307e\u3059"},{"written":"\u3002","confidence":0.82,"starttime":8506,"endtime":8794,"spoken":"_"}],"confidence":0.998,"starttime":250,"endtime":8794,"tags":[],"rulename":"","text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068\u3092\u76ee\u6307\u3057\u307e\u3059\u3002"}],"utteranceid":"20220620/ja_ja-amivoicecloud-16k-user01@01817dce7ba30a301ccf8536-0620_061133","text":"\u30a2\u30c9\u30d0\u30f3\u30b9\u30c8\u30fb\u30e1\u30c7\u30a3\u30a2\u306f\u3001\u4eba\u3068\u6a5f\u68b0\u3068\u306e\u81ea\u7136\u306a\u30b3\u30df\u30e5\u30cb\u30b1\u30fc\u30b7\u30e7\u30f3\u3092\u5b9f\u73fe\u3057\u3001\u8c4a\u304b\u306a\u672a\u6765\u3092\u5275\u9020\u3057\u3066\u3044\u304f\u3053\u3068\u3092\u76ee\u6307\u3057\u307e\u3059\u3002","code":"","message":""}
9495 MainThread message: G
9672 Thread-1 send> p [..(2980 bytes)..]
10174 Thread-1 send> e
10225 MainThread message: e
10225 MainThread close>

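Keeping the session alive

When audio is sent in real time, the server disconnects the session if no speech is detected for 600 seconds (that is, if silent audio keeps being sent). In that case you receive the following p command response packet:

p can't feed audio data to recognizer server

In addition, the server closes the session if there is no communication for 60 seconds. In that case you receive the following e command response packet:

e timeout occurred while recognizing audio data from client

After receiving either of these responses, reconnect and send the audio again.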
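
A client can detect these two responses in on_message and trigger its reconnect logic. The check below is a minimal sketch: session_ended is a hypothetical helper, and matching by prefix rather than exact equality is an assumption in case the server appends further detail.

```python
SESSION_ENDED_PACKETS = (
    "p can't feed audio data to recognizer server",
    "e timeout occurred while recognizing audio data from client",
)


def session_ended(message):
    """Return True if the packet is one of the two session-ending responses."""
    # Assumption: prefix match, in case the server appends further detail
    return message.rstrip().startswith(SESSION_ENDED_PACKETS)
```

A plain e response with no error message returns False, so normal session completion is not mistaken for a timeout.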

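State transitions of the client program

The state of the client program changes as follows, according to the commands it sends and the responses it receives.

Other documentation

  • For the sequence of commands and responses and details of the responses, see "Streaming responses".
  • For the API reference, see "WebSocket interface".
  • A client library (Wrp) is available that wraps the communication handling and procedures of the WebSocket interface in a class library, so you can build a speech recognition application simply by implementing the interfaces it requires. Start with "Using the real-time speech recognition library Wrp".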