Sentiment Analysis
Overview
What is Sentiment Analysis
Sentiment analysis is a feature that analyzes the speaker's voice and outputs the emotions it estimates. AmiVoice API uses ES Japan's ESAS (Emotional Signature Analysis Solution) sentiment analysis engine to return sentimental parameters from voice. ESAS is a sentiment analysis engine tuned for Japan, based on the latest sentiment analysis engine "LVA7" provided by Nemesysco, an Israeli company. For technical background, please see ES Japan's website.
About the API
In AmiVoice API, you can obtain the sentimental parameters output by ESAS by setting the option parameter sentimentAnalysis to True when making a speech recognition request. For voice segments that ESAS determines to be speech, it outputs an array of 20 sentimental parameters approximately once every 2 seconds.
The results are returned in sentiment_analysis within the response.
"sentiment_analysis": {
"segments": [
{
"starttime": 10,
"endtime": 20,
/* Sentimental parameters */
},
{
"starttime": 100,
"endtime": 200,
/* Sentimental parameters */
},
/* Omitted */
]
},
For sentimental parameters, please see the List of Sentimental Parameters and Meaning and Interpretation of Sentimental Parameters below. Also, the parameter names in the response can be obtained using the API.
Sentiment analysis is only supported for the asynchronous HTTP interface.
How to Use
Request
To obtain sentimental parameters, set sentimentAnalysis=True in the d parameter when making a speech recognition request.
For example, to recognize the audio file included in the AmiVoice API sample program with the general engine using curl, without sentiment analysis, run the following command:
curl https://acp-api-async.amivoice.com/v1/recognitions \
-F d="grammarFileNames=-a-general" \
-F u={APP_KEY} \
-F a=@test.wav
To enable the sentiment analysis feature, modify it as follows:
curl https://acp-api-async.amivoice.com/v1/recognitions \
-F d="grammarFileNames=-a-general sentimentAnalysis=True" \
-F u={APP_KEY} \
-F a=@test.wav
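The two curl commands above differ only in the value of d. As a minimal Python sketch of assembling that value: the helper name build_d_parameter is hypothetical (it is not part of AmiVoice API), and the code only builds the form fields. Sending them as a multipart request, together with the audio file, is left to whichever HTTP client you use.

```python
def build_d_parameter(engine: str, sentiment_analysis: bool = False) -> str:
    """Build the space-separated `d` value used in the curl examples.

    `build_d_parameter` is a hypothetical helper, not an AmiVoice API function.
    """
    parts = [f"grammarFileNames={engine}"]
    if sentiment_analysis:
        # The documentation writes the value with a capital T.
        parts.append("sentimentAnalysis=True")
    return " ".join(parts)


# Form fields corresponding to the -F options of the curl command.
# "{APP_KEY}" is a placeholder for your own key; the audio file (field `a`)
# is attached separately by the HTTP client.
fields = {
    "d": build_d_parameter("-a-general", sentiment_analysis=True),
    "u": "{APP_KEY}",
}
print(fields["d"])
```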
Response
If the request is successful, please check the job status periodically until the results are obtained. For details, please see "2. Check the status of the speech recognition job and retrieve the results" in the asynchronous HTTP interface documentation.
The following explains the result response when the sentiment analysis feature is enabled.
When Successful
The results of sentiment analysis are returned in sentiment_analysis.
For voice that the sentiment analysis engine determines to be speech, it outputs 20 sentimental parameters approximately once every 2 seconds. In sentiment_analysis.segments, these sentimental parameters are returned as an array in chronological order. Each element contains the start time (starttime) and end time (endtime), expressed in milliseconds relative to the start of the audio (0), and the sentimental parameters for that interval.
Example of response:
{
"audio_md5": "40f59fe5fc7745c33b33af44be43f6ad",
"audio_size": 306980,
"code": "",
"message": "",
"segments": [
{
"results": [
{
"confidence": 0.998,
"endtime": 8794,
"rulename": "",
"starttime": 250,
"tags": [],
"text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
"tokens": [
{
"confidence": 1,
"endtime": 1578,
"spoken": "あどばんすとめでぃあ",
"starttime": 522,
"written": "アドバンスト・メディア"
},
{
"confidence": 1,
"endtime": 1834,
"spoken": "は",
"starttime": 1578,
"written": "は"
},
/* Omitted */
]
}
],
"text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。"
}
],
"sentiment_analysis": {
"segments": [
{
"starttime": 10,
"endtime": 20,
/* Sentimental parameters */
},
{
"starttime": 100,
"endtime": 200,
/* Sentimental parameters */
},
/* Omitted */
]
},
"service_id": "user01",
"session_id": "018160cbe43f0a304474a999",
"status": "completed",
"text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
"utteranceid": "20220614/15/018160cc54ee0a3044b539d0_20220614_150012"
}
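As a rough illustration of reading this structure, the Python sketch below walks sentiment_analysis.segments and converts the millisecond timestamps to seconds. The function name iter_sentiment_segments and the key param_a in the sample are hypothetical; the real parameter keys are the ones returned by the API and are not assumed here, so every key other than starttime and endtime is simply treated as a sentimental parameter.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_sentiment_segments(
    response: Dict[str, Any],
) -> Iterator[Tuple[float, float, Dict[str, Any]]]:
    """Yield (start_seconds, end_seconds, parameters) for each segment."""
    segments = response.get("sentiment_analysis", {}).get("segments", [])
    for segment in segments:
        # Everything except the two time fields is treated as a parameter.
        params = {
            key: value
            for key, value in segment.items()
            if key not in ("starttime", "endtime")
        }
        yield segment["starttime"] / 1000.0, segment["endtime"] / 1000.0, params


# Hypothetical sample data: "param_a" stands in for a real parameter key.
sample = {
    "sentiment_analysis": {
        "segments": [
            {"starttime": 250, "endtime": 2250, "param_a": 12},
            {"starttime": 2250, "endtime": 4250, "param_a": 18},
        ]
    }
}

for start, end, params in iter_sentiment_segments(sample):
    print(f"{start:.2f}s to {end:.2f}s: {params}")
```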
When Sentiment Analysis Results are Empty
If the audio is too short to analyze emotions, segments in sentiment_analysis will be an empty array, as follows:
"sentiment_analysis": {
"segments": []
}
This can happen when the submitted data contains no audio, or when the volume is too low for sentiment analysis processing. Please check the volume of the submitted data.
For example, you can check the volume using an application that can display it as a waveform, such as Audacity. A waveform that peaks at about 0.05, on a scale where the maximum value of the signal is 1, is too quiet and needs to be a bit louder. As a guide, adjust the recording volume so that it exceeds about 0.1.
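The same check can be approximated in code. The following Python sketch measures the peak amplitude of a WAV file on the 0 to 1 scale described above, using only the standard library. It assumes 16-bit PCM audio; the function names and the synthesized test tone are illustrative and are not part of AmiVoice API.

```python
import array
import io
import math
import sys
import wave

MIN_PEAK = 0.1  # guide value from the text above


def peak_level(wav_file) -> float:
    """Return the peak absolute amplitude on a 0..1 scale.

    Assumes 16-bit PCM; `wav_file` is a path or a binary file object.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    return max(abs(s) for s in samples) / 32768.0


def make_tone(amplitude: float, seconds: float = 0.5, rate: int = 16000) -> io.BytesIO:
    """Synthesize a 440 Hz mono test tone in memory (demo data only)."""
    values = array.array(
        "h",
        (
            int(amplitude * 32767 * math.sin(2 * math.pi * 440 * n / rate))
            for n in range(int(seconds * rate))
        ),
    )
    if sys.byteorder == "big":
        values.byteswap()
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(values.tobytes())
    buf.seek(0)
    return buf


for amplitude in (0.05, 0.2):
    peak = peak_level(make_tone(amplitude))
    verdict = "ok" if peak > MIN_PEAK else "too quiet"
    print(f"peak {peak:.3f}: {verdict}")
```

For a real recording, pass the file path instead of the synthesized tone, for example peak_level("test.wav").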
When Failed
If the sentiment analysis processing fails for some reason, code, error_code, and error_message are set in sentiment_analysis as follows:
Example:
"sentiment_analysis":{"code":400,"error_code":"BAD_REQUEST","error_message":"file format is invalid"}
List of Error Responses
The error responses are as follows:
code | error_code | error_message | Description |
---|---|---|---|
400 | BAD_REQUEST | file format is invalid | Audio file doesn't exist or the format is invalid |
500 | INTERNAL_SERVER_ERROR | internal server error has occurred | Internal server error |
500 | INTERNAL_SERVER_ERROR | internal server error has occurred before processing sentiment-analyzer | Internal server error |
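The three outcomes described above (successful, empty, failed) can be told apart from the shape of sentiment_analysis. The Python sketch below is one possible way to do that. The function name sentiment_status and the returned labels are hypothetical, and the "absent" case assumes that the field is simply missing when sentiment analysis was not requested, which the text above does not state.

```python
from typing import Any, Dict, Tuple


def sentiment_status(response: Dict[str, Any]) -> Tuple[str, str]:
    """Return (status, detail); status is 'ok', 'empty', 'failed' or 'absent'."""
    result = response.get("sentiment_analysis")
    if result is None:
        # Assumption: the field is missing when the feature was not requested.
        return "absent", "no sentiment_analysis field in the response"
    if "error_code" in result:
        detail = "{} {}: {}".format(
            result.get("code"), result["error_code"], result.get("error_message", "")
        )
        return "failed", detail
    if not result.get("segments"):
        return "empty", "no segments; check audio length and volume"
    return "ok", "{} segments".format(len(result["segments"]))


failed = {
    "sentiment_analysis": {
        "code": 400,
        "error_code": "BAD_REQUEST",
        "error_message": "file format is invalid",
    }
}
print(sentiment_status(failed))
print(sentiment_status({"sentiment_analysis": {"segments": []}}))
```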
Retrieving the List of Sentimental Parameters
The list of 20 sentimental parameters included in the response can be obtained by sending a GET request to https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json.
Example:
$ curl -H "Authorization: Bearer {APPKEY}" \
https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json
For details on this API, please see the Sentiment Analysis API in the reference. The meaning of the sentimental parameters is explained in the following chapters.
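For reference, the curl call above could be prepared in Python with the standard library as sketched below. The function name build_parameter_list_request is hypothetical. The code only constructs the request; the commented-out lines show how it might be sent, and decoding the body as UTF-8 is an assumption, since the response format is described in the reference rather than here.

```python
import urllib.request

PARAMETERS_URL = (
    "https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json"
)


def build_parameter_list_request(app_key: str) -> urllib.request.Request:
    """Build the GET request with the Bearer token used in the curl example."""
    return urllib.request.Request(
        PARAMETERS_URL,
        headers={"Authorization": f"Bearer {app_key}"},
        method="GET",
    )


# "{APPKEY}" is a placeholder, as in the curl example.
request = build_parameter_list_request("{APPKEY}")
print(request.get_method(), request.full_url)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     body = resp.read().decode("utf-8")  # assumed encoding
```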
About Sentimental Parameters
List of Sentimental Parameters
There are 20 sentimental parameters. The following table summarizes the minimum value, maximum value, and whether a larger value tends to be positive or negative for each parameter.
Parameter Name | Min | Max | Value Trend |
---|---|---|---|
Energy | 0 | 100 | - |
Stress | 0 | 100 | Negative |
Emotional Balanced Logical | 1 | 500 | - |
Concentration | 0 | 100 | - |
Expectation | 0 | 100 | - |
Excitement | 0 | 30 | Positive |
Hesitation | 0 | 30 | - |
Uncertainty | 0 | 30 | - |
Thinking | 0 | 100 | - |
Imagination | 0 | 30 | - |
Confusion | 0 | 30 | Negative (*1) |
Passion | 0 | 30 | Positive |
Brain Activity | 0 | 100 | - |
Confidence | 0 | 30 | - |
Aggression Anger | 0 | 30 | Negative (*2) |
Atmosphere Conversation Trend | -100 | 100 | - |
Agitation | 0 | 30 | Negative (*1, *2) |
Joy | 0 | 30 | Positive (*2) |
Dissatisfaction | 0 | 30 | Negative (*3) |
Extreme Fluctuation | 0 | 30 | - |
- All values are of integer type (int).
- For the correspondence between the parameter names in the JSON included in the response and the parameter names in the above table, please see Retrieving the List of Sentimental Parameters.
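The ranges in the table can be used as a sanity check on parsed values. The Python sketch below copies the minimum and maximum of each parameter from the table and reports values that are not integers or fall outside their range. The dictionary is keyed by the display names in the table, which is an assumption made for illustration: the JSON keys in an actual response may differ and should be mapped first. The function name out_of_range is hypothetical.

```python
# (min, max) per parameter, copied from the table above.
# Keys are the table's display names, not necessarily the JSON keys.
PARAMETER_RANGES = {
    "Energy": (0, 100),
    "Stress": (0, 100),
    "Emotional Balanced Logical": (1, 500),
    "Concentration": (0, 100),
    "Expectation": (0, 100),
    "Excitement": (0, 30),
    "Hesitation": (0, 30),
    "Uncertainty": (0, 30),
    "Thinking": (0, 100),
    "Imagination": (0, 30),
    "Confusion": (0, 30),
    "Passion": (0, 30),
    "Brain Activity": (0, 100),
    "Confidence": (0, 30),
    "Aggression Anger": (0, 30),
    "Atmosphere Conversation Trend": (-100, 100),
    "Agitation": (0, 30),
    "Joy": (0, 30),
    "Dissatisfaction": (0, 30),
    "Extreme Fluctuation": (0, 30),
}


def out_of_range(values):
    """Return the names whose value is not an int within the documented range."""
    bad = []
    for name, value in values.items():
        if name not in PARAMETER_RANGES:
            continue  # unknown names are ignored in this sketch
        low, high = PARAMETER_RANGES[name]
        if not isinstance(value, int) or not low <= value <= high:
            bad.append(name)
    return bad


print(out_of_range({"Energy": 50, "Joy": 31, "Atmosphere Conversation Trend": -100}))
```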
About Value Trends
- Positive: Parameters that are more likely to suggest a positive reaction as the value increases
- Negative: Parameters that are more likely to suggest a negative reaction as the value increases
- Others (-): Parameters whose meaning changes depending on the size of the value, or parameters that are not necessarily easy to classify as positive/negative
*1: Classified as negative regardless of the size of the number.
*2: It rarely takes a value greater than 0, so any value greater than 0 can be considered high.
*3: When it occurs simultaneously with Aggression Anger, it can be interpreted as a tendency to "vent" by expressing anger. For example, it's often seen in cases of shouting.
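Note *3 describes a co-occurrence that is easy to look for in code. The Python sketch below returns the positions of segments in which both Dissatisfaction and Aggression Anger are above 0. Treating "occurs" as "greater than 0" is an assumption based on note *2, the keys are the display names from the table rather than confirmed JSON keys, and the function name venting_segments is hypothetical.

```python
def venting_segments(segments):
    """Return indexes of segments where both parameters are above 0.

    Assumptions: display names are used as keys, and "occurring" means > 0.
    """
    return [
        index
        for index, segment in enumerate(segments)
        if segment.get("Dissatisfaction", 0) > 0
        and segment.get("Aggression Anger", 0) > 0
    ]


# Hypothetical sample values for illustration only.
sample_segments = [
    {"Dissatisfaction": 0, "Aggression Anger": 0},
    {"Dissatisfaction": 12, "Aggression Anger": 0},
    {"Dissatisfaction": 15, "Aggression Anger": 4},
]
print(venting_segments(sample_segments))
```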
Meaning and Interpretation of Sentimental Parameters
This section explains the meaning and interpretation of sentimental parameters. The average and variation of sentimental parameters, and the thresholds for determining whether a value is higher or lower than usual, differ depending on your environment. Therefore, please collect samples for a certain period to make judgments.
It's recommended to first focus on the important indicators: Energy, Stress, and Emotional-Balanced-Logical.
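Because averages and thresholds depend on your environment, one way to follow this advice is to compute a baseline from collected samples first. The Python sketch below calculates the mean and population standard deviation per parameter and flags a value as unusually high when it exceeds the mean by k standard deviations. The function names and the default k = 2.0 are hypothetical choices for illustration, not values recommended by ESAS, and the keys are display names rather than confirmed JSON keys.

```python
from statistics import mean, pstdev


def baseline(samples):
    """Return {name: (mean, population standard deviation)} over all samples."""
    by_name = {}
    for sample in samples:
        for name, value in sample.items():
            by_name.setdefault(name, []).append(value)
    return {name: (mean(vals), pstdev(vals)) for name, vals in by_name.items()}


def is_unusually_high(value, stats, k=2.0):
    """True if value exceeds mean + k * stdev (k is a hypothetical threshold)."""
    average, deviation = stats
    return value > average + k * deviation


# Hypothetical sample values collected over some period.
collected = [
    {"Energy": 10, "Stress": 5},
    {"Energy": 14, "Stress": 5},
    {"Energy": 12, "Stress": 5},
]
stats = baseline(collected)
print(stats["Energy"], is_unusually_high(16, stats["Energy"]))
```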
Energy
This is the most fundamental indicator among the sentimental parameters. The meaning and interpretation change depending on the range of values, with high ranges suggesting emotional excitement and being very energetic. On the other hand, low ranges suggest a lack of interest (boredom), lack of empathy, lack of sleep (drowsiness), or poor physical condition. It often takes values near 0 and is a parameter that changes relatively gradually.
Value Trends
Value Range | Description |
---|---|
0 to 10 | Low Range. Suggests depression, boredom, fatigue, sadness. Note: if low values are detected continuously, it may also suggest a decrease in willingness to work |
11 to 20 | Suggests comfortable conversation |
21 to 40 | Suggests excitement in conversation. Note: not necessarily a positive indicator; the excitement may be accompanied by negative dissatisfaction or anger |
41 to 100 | High Range. Suggests emotional excitement, being very energetic. Note: not necessarily a positive indicator; there's a possibility of anger, etc. |
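The table maps directly onto a small lookup. The Python sketch below returns a band label for an Energy value; the function name energy_band and the short labels are shorthand invented for this example, while the range boundaries are the ones in the table.

```python
def energy_band(value: int) -> str:
    """Map an Energy value (0..100) to the band described in the table."""
    if not 0 <= value <= 100:
        raise ValueError("Energy is documented as 0 to 100")
    if value <= 10:
        return "low"          # depression, boredom, fatigue, sadness
    if value <= 20:
        return "comfortable"  # comfortable conversation
    if value <= 40:
        return "excited"      # excitement, not necessarily positive
    return "high"             # emotional excitement, possibly anger


for sample_value in (5, 15, 30, 70):
    print(sample_value, energy_band(sample_value))
```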
Stress
This is a fundamental indicator among the sentimental parameters that shows stress and mental load. High ranges indicate high stress and mental load. Basically, it takes values near 0 and is a parameter that changes relatively gradually.
Value Trends
High Range: Suggests that the speaker is carrying fundamental stress and mental load
- If high values are detected continuously throughout the week or month regardless of the nature of the conversation, it is highly likely that the speaker is carrying fundamental mental load, which suggests a decrease in willingness to work or a risk of leaving the job
- If it stays above the average call value continuously, the conversation tends not to lead to positive decisions or outcomes (for example, in business negotiations, purchases, or contract applications)
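"Higher than the average call value and continuous" can be approximated by looking for a run of consecutive segments above a baseline. The Python sketch below does this for a list of per-segment values. The function names and the default of 5 consecutive segments (roughly 10 seconds at one segment every 2 seconds) are hypothetical; the text above does not define how long "continuously" is.

```python
def longest_run_above(values, threshold):
    """Length of the longest run of consecutive values above threshold."""
    longest = current = 0
    for value in values:
        current = current + 1 if value > threshold else 0
        longest = max(longest, current)
    return longest


def is_sustained(values, threshold, min_segments=5):
    """True if values stay above threshold for min_segments in a row.

    min_segments=5 is a hypothetical choice, not a documented value.
    """
    return longest_run_above(values, threshold) >= min_segments


# Hypothetical per-segment Stress values and call average.
stress_values = [3, 9, 9, 10, 11, 12, 4]
call_average = 8
print(longest_run_above(stress_values, call_average))
print(is_sustained(stress_values, call_average))
```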
Emotional-Balanced-Logical
This is an indicator that shows whether what is said is driven more by emotional (emotive) thinking or by logical (calm) thinking. The range in which the value falls indicates whether the statement is emotional (emotive) or logical (calm).
When conducting various analyses, it can be used to classify speakers into three sentimental types and consider parameters for each type.
Value Trends
Value Range | Description |
---|---|
1 to 65 | Logical |
65 to 85 | Balanced |
85 to 500 | Emotional (Emotive) |
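A classification along these lines could look like the Python sketch below. Note that the table lists 65 and 85 in two ranges each, so the boundary handling here is an assumption: boundary values are assigned to the lower band (65 is Logical, 85 is Balanced), following the convention of the Energy table. The function name ebl_type is hypothetical.

```python
def ebl_type(value: int) -> str:
    """Classify an Emotional-Balanced-Logical value (1..500).

    Assumption: the overlapping boundaries 65 and 85 go to the lower band.
    """
    if not 1 <= value <= 500:
        raise ValueError("Emotional-Balanced-Logical is documented as 1 to 500")
    if value <= 65:
        return "Logical"
    if value <= 85:
        return "Balanced"
    return "Emotional"


for sample_value in (40, 70, 120):
    print(sample_value, ebl_type(sample_value))
```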
Concentration
This is an indicator that shows the degree of concentration of the speaker. It's a parameter that can rise from a value near the lower limit to a value near the upper limit, or fall from a value near the upper limit to a value near the lower limit in a short time.
Value Trends
- If it's higher than the average call value and occurs continuously, it may also suggest an important point for the speaker
- Normally, in conversations at call centers, etc. (especially business negotiations), an increase in "concentration" can be interpreted as a desirable trend
Expectation
This is an indicator that shows expectation for something to happen. In high ranges, there's a possibility of both positive and negative expectations. It basically occurs at low values, but sometimes exceptionally high values can occur.
Value Trends
- High Range Positive
  - Suggests expectation of how one wants to be perceived by the other person
  - If it's higher than the average call value and occurs continuously, it suggests a willingness to be helpful to the other person, i.e., a forward-leaning response
- High Range Negative
  - Suggests exaggeration, induction, or manipulation
  - High range ≠ Lying, but suggests intentional statements with underlying intentions
Excitement
This is an indicator that shows the degree of excitement or elation. It basically takes about the same value (around 15) and changes relatively gradually, but occasionally exceptionally high values occur.
Value Trends
- Used as an indicator to consider the degree of willingness for conversation, such as arousing interest
- Affects important decision-making such as product purchases
Hesitation
This is an indicator that shows the degree of hesitation in speaking. It basically takes about the same value (around 15) and is a parameter that changes relatively gradually.
Value Trends
- When higher than the call average and persistent, it suggests hesitation in speaking, feeling guilty or ashamed. Like stress, it has a negative effect on decision-making.
Uncertainty
This indicator shows lack of confidence. It typically takes similar values (around 15) and changes relatively gradually, but can rise from near the lower limit to near the upper limit, or fall from near the upper limit to near the lower limit in a short time.
Value Trends
- When higher than the call average and persistent, it suggests lack of confidence, anxiety, lack of understanding or situation awareness, or distrust. Like stress, it has a negative effect on decision-making.
Thinking
This indicator shows the state of speaking while thinking. It often takes values near 0 and changes relatively gradually.
Value Trends
- When higher than the call average and persistent, it suggests thinking while speaking, the possibility of constructive conversation, or the possibility that the conversation is complex
- When high and persistent for both speakers, it suggests the possibility of difficulties in reaching agreement or mutual understanding in the conversation
Imagination
This indicator shows the degree of expanding images through memory or imagination. In high ranges, it suggests that imagination is at work in explaining things or in how to convey things. It is also used to estimate whether the basis of statements is from facts and memories or from imagination. It often takes values near 0 and changes relatively gradually.
Value Trends
- High Range Positive
  - Suggests trying to understand the situation in response to the other person's statements
  - In discussions, etc., it is desirable to be active (frequently detected)
- High Range Negative
  - Suggests the possibility of evasion or struggling to answer
Confusion
This indicator shows a state of confusion. It often takes values near 0 and changes relatively gradually.
Passion
This indicator shows whether one is showing interest from the bottom of their heart. It often takes values near 0 and changes relatively gradually.
Brain Activity
This indicator shows the overall activation of brain activity, mainly used for research purposes.
Value Trends
- An indicator mainly intended for use in research
Confidence
This indicator shows the level of confidence.
Value Trends
- High Range
  - Also suggests content that has been decided by the speaker.
Aggression Anger
This indicator suggests the speaker's aggressiveness. It often takes values near 0 and changes relatively gradually.
Value Trends
- A parameter under verification that requires caution in handling, as it may erroneously detect different moods as anger
Atmosphere Conversation Trend
This is a supplementary indicator to increase the credibility of interpretation when viewed in combination with the interpretation of other parameters.
Agitation
This indicator shows dissatisfaction or sadness. It often takes values near 0 and changes relatively gradually. It rarely becomes greater than 0.
Joy
This indicator shows satisfaction or joy. It often takes values near 0 and changes relatively gradually. It rarely becomes greater than 0.
Value Trends
- Due to various psychological characteristics in conversation, it is sometimes detected even in situations such as arguments (quarrels)
- There are cases where it is detected (at the segment unit level) as a result of a person expressing and venting their anger by talking back to the other person when feeling angry during an argument
Dissatisfaction
This indicator shows a combination of high stress and high levels of dissatisfaction. It does not change frequently and tends to have the same value continuously.
Value Trends
- Suggests a (negative) deviation from prior expectations
Extreme Fluctuation
This indicator shows the extremity of movement in overall emotions. It often takes values near 0 and changes relatively gradually.
Value Trends
- Used to make more accurate considerations based on the occurrence of other parameters, not alone. Example: If high stress is detected along with this parameter, it may indicate a situation where emotional, impulsive anger is occurring.