Packets and State Transitions

Packet List

The packets exchanged between client and server in the WebSocket speech recognition protocol are as follows:

Packet Name        | Related State       | Description
-------------------|---------------------|---------------------------------------------
s command packet   | Audio Supply        | Command to start sending audio data
s command response | Audio Supply        | Response to start sending audio data command
p command packet   | Audio Supply        | Command to send audio data
p command response | Audio Supply        | Response to send audio data command
e command packet   | Audio Supply        | Command to stop sending audio data
e command response | Audio Supply        | Response to stop sending audio data command
S event packet     | Utterance Detection | Notification of utterance start detection
E event packet     | Utterance Detection | Notification of utterance end detection
C event packet     | Speech Recognition  | Notification of recognition process start
U event packet     | -                   | Notification during recognition process
A/R event packet   | Speech Recognition  | Notification of recognition process results
G event packet     | -                   | Notification of action results within server
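
As a rough illustration, the packet list can be held in a small lookup table so that client code can tell which state transition a packet affects. This is only a sketch: the `PACKETS` dictionary and the `related_state` helper are hypothetical names, and splitting "A/R" into two separate keys is an assumption made here for convenience. A command and its response share one key because they share a packet name.

```python
from typing import Optional

# Packet name -> (related state, description), copied from the packet list.
# None stands for the "-" entries. Splitting "A/R" into two keys is an
# assumption of this sketch, not something the packet list states.
PACKETS = {
    "s": ("Audio Supply", "start sending audio data"),
    "p": ("Audio Supply", "send audio data"),
    "e": ("Audio Supply", "stop sending audio data"),
    "S": ("Utterance Detection", "utterance start detected"),
    "E": ("Utterance Detection", "utterance end detected"),
    "C": ("Speech Recognition", "recognition process started"),
    "U": (None, "recognition process in progress"),
    "A": ("Speech Recognition", "recognition process results"),
    "R": ("Speech Recognition", "recognition process results"),
    "G": (None, "action results within server"),
}


def related_state(name: str) -> Optional[str]:
    """Return the state transition a packet belongs to, or None if it has none."""
    return PACKETS[name][0]
```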

State Transition List

The WebSocket speech recognition protocol defines the following three state transitions:

  • Audio Supply State Transition
  • Utterance Detection State Transition
  • Speech Recognition State Transition

1. Audio Supply State Transition

The states representing the status of audio data supply from client to server are as follows:

Audio Supply State Transition Table

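Packet Name                  | 0 Initialized [Initial State] | 1 starting  | 2 started       | 3 providing     | 4 ending
-----------------------------|-------------------------------|-------------|-----------------|-----------------|------------
s command packet             | Start supply → 1              | (ERROR)     | (ERROR)         | (ERROR)         | (ERROR)
s command response (success) |                               | (OK) → 2    |                 |                 |
s command response (failure) |                               | (ERROR) → 0 |                 |                 |
p command packet             | (ERROR)                       | (ERROR)     | Supplying → 3   | Supplying → 3   | (ERROR)
p command response (success) |                               |             |                 | (OK) → 3        |
p command response (failure) |                               |             |                 | (ERROR) → 0     |
e command packet             | (ERROR)                       | (ERROR)     | Stop supply → 4 | Stop supply → 4 | (ERROR)
e command response (success) |                               |             |                 |                 | (OK) → 0
e command response (failure) |                               |             |                 |                 | (ERROR) → 0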
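
The transition table above can be sketched as a small table-driven state machine. This is a minimal illustration, not a reference implementation: the `Supply` enum, the `step` function, the `ProtocolError` exception, and the input labels such as `"s ok"` are all hypothetical names. It also assumes that each response is received in the state its command led to (for example, the s command response arrives in state 1 starting).

```python
from enum import IntEnum


class Supply(IntEnum):
    """Audio supply states, numbered as in the transition table."""
    INITIALIZED = 0
    STARTING = 1
    STARTED = 2
    PROVIDING = 3
    ENDING = 4


class ProtocolError(Exception):
    """Raised when a packet is not allowed in the current state."""


# (current state, input) -> next state.
# Input labels are hypothetical: "s"/"p"/"e" are command packets,
# "<x> ok" / "<x> fail" are the matching responses.
TRANSITIONS = {
    (Supply.INITIALIZED, "s"): Supply.STARTING,
    (Supply.STARTING, "s ok"): Supply.STARTED,
    (Supply.STARTING, "s fail"): Supply.INITIALIZED,
    (Supply.STARTED, "p"): Supply.PROVIDING,
    (Supply.PROVIDING, "p"): Supply.PROVIDING,
    (Supply.PROVIDING, "p ok"): Supply.PROVIDING,
    (Supply.PROVIDING, "p fail"): Supply.INITIALIZED,
    (Supply.STARTED, "e"): Supply.ENDING,
    (Supply.PROVIDING, "e"): Supply.ENDING,
    (Supply.ENDING, "e ok"): Supply.INITIALIZED,
    (Supply.ENDING, "e fail"): Supply.INITIALIZED,
}


def step(state: Supply, packet: str) -> Supply:
    """Return the next state, or raise ProtocolError for an (ERROR) cell."""
    try:
        return TRANSITIONS[(state, packet)]
    except KeyError:
        raise ProtocolError(
            f"{packet!r} is not valid in state {state.name}"
        ) from None
```

For example, feeding `"s"`, `"s ok"`, `"p"`, `"p ok"`, `"e"`, `"e ok"` through `step` starting from `Supply.INITIALIZED` ends back in `Supply.INITIALIZED`, while sending `"p"` in the initial state raises `ProtocolError`.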

Audio Supply State Transition Diagram

2. Utterance Detection State Transition

The states representing the status of utterance detection are as follows:

Utterance Detection State Transition Table

Packet Name    | 6 not-detecting [Initial State] | 7 detecting
---------------|---------------------------------|------------
S event packet | → 7                             |
E event packet |                                 | → 6

Utterance Detection State Transition Diagram

3. Speech Recognition State Transition

The states representing the status of speech recognition processing are as follows:

Speech Recognition State Transition Table

Packet Name      | 8 not-recognizing [Initial State] | 9 recognizing
-----------------|-----------------------------------|--------------
C event packet   | → 9                               |
A/R event packet |                                   | → 8
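
The two event-driven tables (utterance detection and speech recognition) are small enough to track with two flags on the client side. The sketch below is illustrative only: `EventStates` and `on_event` are hypothetical names, and treating "A/R" as two event names `"A"` and `"R"` is an assumption. The tables leave some cells blank (for example, an S event while already detecting); this sketch simply sets the flag again in that case, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class EventStates:
    """Tracks the utterance detection and speech recognition states."""
    detecting: bool = False     # False = 6 not-detecting, True = 7 detecting
    recognizing: bool = False   # False = 8 not-recognizing, True = 9 recognizing

    def on_event(self, name: str) -> None:
        """Apply one event packet, identified by its name in the packet list."""
        if name == "S":
            self.detecting = True
        elif name == "E":
            self.detecting = False
        elif name == "C":
            self.recognizing = True
        elif name in ("A", "R"):
            self.recognizing = False
        elif name in ("U", "G"):
            # No related state in the packet list, so nothing changes here.
            pass
        else:
            raise ValueError(f"unknown event packet: {name!r}")
```

Because the two flags are updated independently, the sketch makes no claim about the order in which the server emits these events.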

Speech Recognition State Transition Diagram