
Sentiment Analysis

Overview

What is Sentiment Analysis

Sentiment analysis is a feature that analyzes a speaker's voice and outputs the speaker's emotions. AmiVoice API uses ESAS (Emotional Signature Analysis Solution), the sentiment analysis engine from ES Japan, to return sentimental parameters from voice. ESAS is a sentiment analysis engine tuned for Japan and based on "LVA7", the latest sentiment analysis engine provided by the Israeli company Nemesysco. For technical background, please see ES Japan's website.

About the API

In AmiVoice API, you can obtain the sentimental parameters output by ESAS by setting the option parameter sentimentAnalysis to True when making a speech recognition request. For each voice segment that ESAS determines to be speech, it outputs an array of 20 sentimental parameters approximately once every 2 seconds.

The results are returned in sentiment_analysis within the response.

    "sentiment_analysis": {
      "segments": [
        {
          "starttime": 10,
          "endtime": 20,
          /* Sentimental parameters */
        },
        {
          "starttime": 100,
          "endtime": 200,
          /* Sentimental parameters */
        },
        /* Omitted */
      ]
    },

For details on the sentimental parameters, please see List of Sentimental Parameters and Meaning and Interpretation of Sentimental Parameters below. The parameter names used in the response can also be retrieved through the API.

note

Sentiment analysis is only supported for the asynchronous HTTP interface.

How to Use

Request

To obtain sentimental parameters, set sentimentAnalysis=True in the d parameter when making a speech recognition request.

For example, to run speech recognition with the general engine on the audio file included in the AmiVoice API sample program, without sentiment analysis, execute the following curl command:

curl https://acp-api-async.amivoice.com/v1/recognitions \
-F d="grammarFileNames=-a-general" \
-F u={APP_KEY} \
-F a=@test.wav

To enable the sentiment analysis feature, modify it as follows:

curl https://acp-api-async.amivoice.com/v1/recognitions \
-F d="grammarFileNames=-a-general sentimentAnalysis=True" \
-F u={APP_KEY} \
-F a=@test.wav
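
The same form fields can be assembled in a script. The following is a minimal sketch, assuming (as in the curl examples above) that the d parameter is a space-separated list of key=value options. The function names build_d_parameter and build_form_fields are hypothetical and not part of the AmiVoice API; the sketch only prepares the text fields and does not send the request or attach the audio.

```python
def build_d_parameter(engine: str = "-a-general", sentiment_analysis: bool = True) -> str:
    """Build the value of the `d` form field as space-separated key=value options.

    Assumption: the option format mirrors the curl examples in this document.
    """
    options = [f"grammarFileNames={engine}"]
    if sentiment_analysis:
        options.append("sentimentAnalysis=True")
    return " ".join(options)


def build_form_fields(app_key: str, sentiment_analysis: bool = True) -> dict:
    """Return the text form fields (`d` and `u`); the audio field `a` is added by the caller."""
    return {"d": build_d_parameter(sentiment_analysis=sentiment_analysis), "u": app_key}


if __name__ == "__main__":
    print(build_form_fields("{APP_KEY}"))
```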

Response

If the request is successful, please check the job status periodically until the results are obtained. For details, please see "2. Check the status of the speech recognition job and retrieve the results" in the asynchronous HTTP interface documentation.

The following explains the result response when the sentiment analysis feature is enabled.

When Successful

The results of sentiment analysis are obtained in sentiment_analysis.

For audio that the sentiment analysis engine determines to be speech, it outputs 20 sentimental parameters approximately once every 2 seconds. These are returned in sentiment_analysis.segments as an array in chronological order. Each element contains the start time (starttime) and end time (endtime) in milliseconds relative to the start of the audio (0), together with the sentimental parameters for that interval.

Example of response:

{
  "audio_md5": "40f59fe5fc7745c33b33af44be43f6ad",
  "audio_size": 306980,
  "code": "",
  "message": "",
  "segments": [
    {
      "results": [
        {
          "confidence": 0.998,
          "endtime": 8794,
          "rulename": "",
          "starttime": 250,
          "tags": [],
          "text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
          "tokens": [
            {
              "confidence": 1,
              "endtime": 1578,
              "spoken": "あどばんすとめでぃあ",
              "starttime": 522,
              "written": "アドバンスト・メディア"
            },
            {
              "confidence": 1,
              "endtime": 1834,
              "spoken": "は",
              "starttime": 1578,
              "written": "は"
            },
            /* Omitted */
          ]
        }
      ],
      "text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。"
    }
  ],
  "sentiment_analysis": {
    "segments": [
      {
        "starttime": 10,
        "endtime": 20,
        /* Sentimental parameters */
      },
      {
        "starttime": 100,
        "endtime": 200,
        /* Sentimental parameters */
      },
      /* Omitted */
    ]
  },
  "service_id": "user01",
  "session_id": "018160cbe43f0a304474a999",
  "status": "completed",
  "text": "アドバンスト・メディアは、人と機械との自然なコミュニケーションを実現し、豊かな未来を創造していくことを目指します。",
  "utteranceid": "20220614/15/018160cc54ee0a3044b539d0_20220614_150012"
}
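
Once the response has been parsed into a dictionary, the segments can be walked in order. The sketch below is illustrative only: the function name iter_sentiment_segments is hypothetical, and because the actual JSON key names of the sentimental parameters are not listed here, it treats every key other than starttime and endtime as a parameter. The key hypothetical_param in the sample data is a placeholder, not a real parameter name.

```python
from typing import Any, Dict, Iterator, Tuple

# Keys that carry timing information rather than sentimental parameters.
TIME_KEYS = {"starttime", "endtime"}


def iter_sentiment_segments(
    response: Dict[str, Any],
) -> Iterator[Tuple[int, int, Dict[str, Any]]]:
    """Yield (starttime_ms, endtime_ms, parameters) for each sentiment segment."""
    analysis = response.get("sentiment_analysis") or {}
    for segment in analysis.get("segments", []):
        # Assumption: everything except the two time keys is a sentimental parameter.
        params = {k: v for k, v in segment.items() if k not in TIME_KEYS}
        yield segment["starttime"], segment["endtime"], params


# Placeholder data shaped like the example response above.
sample = {
    "sentiment_analysis": {
        "segments": [
            {"starttime": 10, "endtime": 20, "hypothetical_param": 5},
            {"starttime": 100, "endtime": 200, "hypothetical_param": 7},
        ]
    }
}

if __name__ == "__main__":
    for start_ms, end_ms, params in iter_sentiment_segments(sample):
        print(f"{start_ms}-{end_ms} ms: {params}")
```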

When Sentiment Analysis Results are Empty

If the audio is too short for emotions to be analyzed, segments in sentiment_analysis will be an empty array as follows:

"sentiment_analysis": {
  "segments": []
}

This can also happen when the submitted data contains no audio, or when the volume is too low for sentiment analysis processing. Please check the volume of the submitted data.

For example, you can check the volume with an application that displays audio as a waveform, such as Audacity. The following figure shows a volume level that is too low: it is about 0.05 on a scale where the maximum value of the signal is 1, which is slightly too quiet. As a guide, adjust the recording volume so that it exceeds about 0.1.

Figure. About Volume Level for Sentiment Analysis
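
The volume check can also be scripted instead of inspected visually. The following is a rough sketch under several assumptions: the file is 16-bit PCM WAV, the peak absolute amplitude is used as the measure of "volume level", and the 0.1 guide value above is applied to that peak. The function names peak_level and is_loud_enough are hypothetical.

```python
import array
import sys
import wave


def peak_level(path: str) -> float:
    """Return the peak absolute amplitude of a 16-bit PCM WAV file, scaled to 0..1."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    return max(abs(s) for s in samples) / 32768.0


def is_loud_enough(path: str, threshold: float = 0.1) -> bool:
    """Assumption: compare the peak against the ~0.1 guide value from the text."""
    return peak_level(path) > threshold
```

A file whose peak is around 0.05 would be reported as too quiet by this check, matching the example in the figure.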

When Failed

If the sentiment analysis processing fails for some reason, code, error_code, and error_message are set in sentiment_analysis as follows:

Example:

"sentiment_analysis":{"code":400,"error_code":"BAD_REQUEST","error_message":"file format is invalid"}

List of Error Responses

The error responses are as follows:

| code | error_code | error_message | Description |
|---|---|---|---|
| 400 | BAD_REQUEST | file format is invalid | Audio file doesn't exist or the format is invalid |
| 500 | INTERNAL_SERVER_ERROR | internal server error has occurred | Internal server error |
| 500 | INTERNAL_SERVER_ERROR | internal server error has occurred before processing sentiment-analyzer | Internal server error |
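
The three outcomes described above (success, empty result, failure) can be told apart by inspecting sentiment_analysis. This is a minimal sketch; the function name sentiment_status and the status labels are hypothetical, and the extra "absent" case is an assumption for responses where sentiment analysis was not requested.

```python
from typing import Any, Dict, Tuple


def sentiment_status(response: Dict[str, Any]) -> Tuple[str, str]:
    """Classify the sentiment analysis part of a response as (status, detail)."""
    analysis = response.get("sentiment_analysis")
    if analysis is None:
        # Assumption: the field is missing when sentimentAnalysis was not enabled.
        return "absent", "no sentiment_analysis field in the response"
    if "error_code" in analysis:
        detail = f'{analysis.get("code")} {analysis["error_code"]}: {analysis.get("error_message", "")}'
        return "failed", detail
    if not analysis.get("segments"):
        return "empty", "no segments; check that the audio contains speech and is loud enough"
    return "ok", f'{len(analysis["segments"])} segment(s)'


if __name__ == "__main__":
    failed = {
        "sentiment_analysis": {
            "code": 400,
            "error_code": "BAD_REQUEST",
            "error_message": "file format is invalid",
        }
    }
    print(sentiment_status(failed))
```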

Retrieving the List of Sentimental Parameters

The list of 20 sentimental parameters included in the response can be obtained by sending a GET request to https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json.

Example:

$ curl -H "Authorization: Bearer {APPKEY}" \
https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json

For details on this API, please see the Sentiment Analysis API in the reference. The meaning of the sentimental parameters is explained in the following sections.
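
The same GET request can be issued from a script. The sketch below uses only the Python standard library and mirrors the curl example above; the function names are hypothetical, and since the response schema is not described here, the parsed JSON is returned as-is without assuming its structure.

```python
import json
import urllib.request

PARAMETERS_URL = (
    "https://acp-dsrpp.amivoice.com/v1/sentiment-analysis/ja/result-parameters.json"
)


def build_parameters_request(app_key: str) -> urllib.request.Request:
    """Build the GET request with the Bearer authorization header."""
    return urllib.request.Request(
        PARAMETERS_URL,
        headers={"Authorization": f"Bearer {app_key}"},
        method="GET",
    )


def fetch_parameters(app_key: str, timeout: float = 10.0):
    """Send the request and return the decoded JSON body (schema not assumed)."""
    with urllib.request.urlopen(build_parameters_request(app_key), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```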

About Sentimental Parameters

List of Sentimental Parameters

There are 20 sentimental parameters. The following table summarizes the minimum value, maximum value, and whether a larger value tends to be positive or negative for each parameter.

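| Parameter Name | Min | Max | Value Trend |
|---|---|---|---|
| Energy | 0 | 100 | - |
| Stress | 0 | 100 | Negative |
| Emotional Balanced Logical | 1 | 500 | - |
| Concentration | 0 | 100 | - |
| Expectation | 0 | 100 | - |
| Excitement | 0 | 30 | Positive |
| Hesitation | 0 | 30 | - |
| Uncertainty | 0 | 30 | - |
| Thinking | 0 | 100 | - |
| Imagination | 0 | 30 | - |
| Confusion | 0 | 30 | Negative (*1) |
| Passion | 0 | 30 | Positive |
| Brain Activity | 0 | 100 | - |
| Confidence | 0 | 30 | - |
| Aggression Anger | 0 | 30 | Negative (*2) |
| Atmosphere Conversation Trend | -100 | 100 | - |
| Agitation | 0 | 30 | Negative (*1,*2) |
| Joy | 0 | 30 | Positive (*2) |
| Dissatisfaction | 0 | 30 | Negative (*3) |
| Extreme Fluctuation | 0 | 30 | - |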
  • All values are of integer type (int).
  • For the correspondence between the parameter names in the JSON included in the response and the parameter names in the above table, please see Retrieving the List of Sentimental Parameters.
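
The ranges in the table can be kept as a lookup for sanity-checking parsed values. This is only a sketch: it is keyed by the display names from the table, not by the JSON key names in the response (which must be mapped separately), and the function name out_of_range is hypothetical.

```python
# (min, max) per parameter, copied from the table above.
# Assumption: keys are the table's display names, not the response's JSON keys.
PARAMETER_RANGES = {
    "Energy": (0, 100),
    "Stress": (0, 100),
    "Emotional Balanced Logical": (1, 500),
    "Concentration": (0, 100),
    "Expectation": (0, 100),
    "Excitement": (0, 30),
    "Hesitation": (0, 30),
    "Uncertainty": (0, 30),
    "Thinking": (0, 100),
    "Imagination": (0, 30),
    "Confusion": (0, 30),
    "Passion": (0, 30),
    "Brain Activity": (0, 100),
    "Confidence": (0, 30),
    "Aggression Anger": (0, 30),
    "Atmosphere Conversation Trend": (-100, 100),
    "Agitation": (0, 30),
    "Joy": (0, 30),
    "Dissatisfaction": (0, 30),
    "Extreme Fluctuation": (0, 30),
}


def out_of_range(values: dict) -> dict:
    """Return the entries that are not integers within the documented range."""
    problems = {}
    for name, value in values.items():
        if name not in PARAMETER_RANGES:
            continue  # unknown names are ignored in this sketch
        low, high = PARAMETER_RANGES[name]
        if not (isinstance(value, int) and low <= value <= high):
            problems[name] = value
    return problems
```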

About Value Trends

  • Positive: Parameters that are more likely to suggest a positive reaction as the value increases
  • Negative: Parameters that are more likely to suggest a negative reaction as the value increases
  • Others (-): Parameters whose meaning changes depending on the size of the value, or parameters that are not necessarily easy to classify as positive/negative
  • *1: Classified as negative regardless of the size of the number.
  • *2: It's rare to take a value greater than 0, so if it's greater than 0, it can be considered high.
  • *3: When it occurs simultaneously with Aggression Anger, it can be interpreted as a tendency to "vent" as it expresses anger. For example, it's often seen in cases of shouting.

Meaning and Interpretation of Sentimental Parameters

This section explains the meaning and interpretation of sentimental parameters. The average and variation of sentimental parameters, and the thresholds for determining whether a value is higher or lower than usual, differ depending on your environment. Therefore, please collect samples for a certain period to make judgments.

tip

It's recommended to first focus on the important indicators: Energy, Stress, and Emotional-Balanced-Logical.
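
Because thresholds depend on your environment, one possible approach is to derive a baseline from values collected over a period and compare new values against it. The sketch below is an assumption about how that could be done, not a method prescribed by the engine: the function names and the choice of "mean plus k standard deviations" with k = 2 are illustrative.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def baseline(samples: Sequence[float]) -> Tuple[float, float]:
    """Return (average, population standard deviation) of collected values."""
    return mean(samples), pstdev(samples)


def is_unusually_high(value: float, samples: Sequence[float], k: float = 2.0) -> bool:
    """Assumption: 'higher than usual' means above average + k * standard deviation."""
    avg, spread = baseline(samples)
    return value > avg + k * spread


if __name__ == "__main__":
    collected = [10, 12, 11, 13, 9]  # made-up values for one parameter
    print(baseline(collected))
    print(is_unusually_high(20, collected))
```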

Energy

This is the most fundamental indicator among the sentimental parameters. The meaning and interpretation change depending on the range of values, with high ranges suggesting emotional excitement and being very energetic. On the other hand, low ranges suggest a lack of interest (boredom), lack of empathy, lack of sleep (drowsiness), or poor physical condition. It often takes values near 0 and is a parameter that changes relatively gradually.

Value Trends

| Value Range | Description |
|---|---|
| 0 to 10 | Low range. Suggests depression, boredom, fatigue, or sadness. *If low values are detected continuously, it may also suggest a decrease in willingness to work. |
| 11 to 20 | Suggests comfortable conversation. |
| 21 to 40 | Suggests excitement in conversation. *Not necessarily a positive indicator; the excitement may be accompanied by negative dissatisfaction or anger. |
| 41 to 100 | High range. Suggests emotional excitement or being very energetic. *Not necessarily a positive indicator; it may reflect anger, etc. |
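
The Energy ranges above translate directly into a lookup. A minimal sketch follows; the function name energy_band and the short band labels are hypothetical shorthand for the descriptions in the table.

```python
def energy_band(value: int) -> str:
    """Map an Energy value (0 to 100) to the band described in the table above."""
    if not 0 <= value <= 100:
        raise ValueError("Energy is documented as ranging from 0 to 100")
    if value <= 10:
        return "low"          # depression, boredom, fatigue, sadness
    if value <= 20:
        return "comfortable"  # comfortable conversation
    if value <= 40:
        return "excited"      # excitement, not necessarily positive
    return "high"             # emotional excitement, possibly anger


if __name__ == "__main__":
    for v in (5, 15, 30, 80):
        print(v, energy_band(v))
```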

Stress

This is a fundamental indicator among the sentimental parameters that shows stress and mental load. High ranges indicate high stress and mental load. Basically, it takes values near 0 and is a parameter that changes relatively gradually.

Value Trends

High Range: Suggests that the speaker is carrying fundamental stress and mental load

  • If high values are detected continuously throughout the week or month regardless of the nature of the conversation, there's a high possibility that the speaker is carrying fundamental mental load, which suggests a decrease in willingness to work or a risk of leaving the job
  • If it's higher than the average call value and occurs continuously, the conversation tends not to lead to positive decisions or responses (for example, in business negotiations, purchases, or contract applications)
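
Several parameters are interpreted by whether they stay above the call average for a sustained period. One way to check this is to look for the longest run of consecutive segments above the average, as sketched below. The function names and the minimum run length of 3 segments are assumptions for illustration, not values taken from the engine.

```python
from typing import Sequence


def longest_run_above_average(values: Sequence[float]) -> int:
    """Length of the longest run of consecutive values above the call average."""
    if not values:
        return 0
    avg = sum(values) / len(values)
    longest = current = 0
    for v in values:
        if v > avg:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_persistently_high(values: Sequence[float], min_run: int = 3) -> bool:
    """Assumption: 'occurs continuously' means at least min_run consecutive segments."""
    return longest_run_above_average(values) >= min_run


if __name__ == "__main__":
    stress = [5, 5, 20, 22, 25, 5, 5]  # made-up per-segment values for one call
    print(longest_run_above_average(stress), is_persistently_high(stress))
```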

Emotional-Balanced-Logical

This is an indicator that shows the balance between whether the content is tied to emotional (emotive) thinking or logical (calm) thinking. Depending on the occurrence range, it indicates whether the statement is emotional (emotive) or logical (calm).

When conducting various analyses, it can be used to classify speakers into three sentimental types and consider parameters for each type.

Value Trends

| Value Range | Description |
|---|---|
| 1 to 65 | Logical |
| 65 to 85 | Balanced |
| 85 to 500 | Emotional (Emotive) |
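
The three sentimental types can be assigned with a small helper. Note that the ranges above share their boundary values (65 and 85), and the text does not say which type those values belong to. The sketch below assumes, purely for illustration, that each boundary value belongs to the higher range; the function name ebl_type is hypothetical.

```python
def ebl_type(value: int) -> str:
    """Classify an Emotional-Balanced-Logical value (1 to 500) into a type.

    Assumption: 65 counts as Balanced and 85 counts as Emotional, because the
    documented ranges overlap at those values.
    """
    if not 1 <= value <= 500:
        raise ValueError("Emotional-Balanced-Logical is documented as ranging from 1 to 500")
    if value < 65:
        return "Logical"
    if value < 85:
        return "Balanced"
    return "Emotional"


if __name__ == "__main__":
    for v in (30, 70, 120):
        print(v, ebl_type(v))
```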

Concentration

This is an indicator that shows the degree of concentration of the speaker. It's a parameter that can rise from a value near the lower limit to a value near the upper limit, or fall from a value near the upper limit to a value near the lower limit in a short time.

Value Trends

  • If it's higher than the average call value and occurs continuously, it may also suggest an important point for the speaker
  • Normally, in conversations at call centers, etc. (especially business negotiations), an increase in "concentration" can be interpreted as a desirable trend

Expectation

This is an indicator that shows expectation for something to happen. In high ranges, there's a possibility of both positive and negative expectations. It basically occurs at low values, but sometimes exceptionally high values can occur.

Value Trends

  • High Range Positive

    • Suggests expectation of how one wants to be perceived by the other person
    • If it's higher than the average call value and occurs continuously, it suggests a willingness to be helpful to the other person, i.e., a forward-leaning response
  • High Range Negative

    • Suggests exaggeration, induction, or manipulation
    • High range ≠ Lying, but suggests intentional statements with underlying intentions

Excitement

This is an indicator that shows the degree of excitement or elation. It basically takes about the same value (around 15) and changes relatively gradually, but occasionally exceptionally high values occur.

Value Trends

  • Used as an indicator to consider the degree of willingness for conversation, such as arousing interest
  • Affects important decision-making such as product purchases

Hesitation

This is an indicator that shows the degree of hesitation in speaking. It basically takes about the same value (around 15) and is a parameter that changes relatively gradually.

Value Trends

  • When higher than the call average and persistent, it suggests hesitation in speaking, feeling guilty or ashamed. Like stress, it has a negative effect on decision-making.

Uncertainty

This indicator shows lack of confidence. It typically takes similar values (around 15) and changes relatively gradually, but can rise from near the lower limit to near the upper limit, or fall from near the upper limit to near the lower limit in a short time.

Value Trends

  • When higher than the call average and persistent, it suggests lack of confidence, anxiety, lack of understanding or situation awareness, or distrust. Like stress, it has a negative effect on decision-making.

Thinking

This indicator shows the state of speaking while thinking. It often takes values near 0 and changes relatively gradually.

Value Trends

  • When higher than the call average and persistent, it suggests thinking while speaking, the possibility of constructive conversation, or the possibility that the conversation is complex
  • When high and persistent for both speakers, it suggests the possibility of difficulties in reaching agreement or mutual understanding in the conversation

Imagination

This indicator shows the degree of expanding images through memory or imagination. In high ranges, it suggests that imagination is at work in explaining things or in how to convey things. It is also used to estimate whether the basis of statements is from facts and memories or from imagination. It often takes values near 0 and changes relatively gradually.

Value Trends

  • High Range Positive

    • Suggests trying to understand the situation in response to the other person's statements
    • In discussions, etc., it is desirable to be active (frequently detected)
  • High Range Negative

    • Suggests the possibility of evasion or struggling to answer

Confusion

This indicator shows a state of confusion. It often takes values near 0 and changes relatively gradually.

Passion

This indicator shows whether one is showing interest from the bottom of their heart. It often takes values near 0 and changes relatively gradually.

Brain Activity

This indicator shows the overall activation of brain activity, mainly used for research purposes.

Value Trends

  • An indicator mainly intended for use in research

Confidence

This indicator shows the level of confidence.

Value Trends

  • High Range
    • Also suggests content that has been decided by the speaker.

Aggression Anger

This indicator suggests the speaker's aggressiveness. It often takes values near 0 and changes relatively gradually.

Value Trends

  • A parameter under verification that requires caution in handling, as it may erroneously detect different moods as anger

Atmosphere Conversation Trend

This is a supplementary indicator to increase the credibility of interpretation when viewed in combination with the interpretation of other parameters.

Agitation

This indicator shows dissatisfaction or sadness. It often takes values near 0 and changes relatively gradually. It rarely becomes greater than 0.

Joy

This indicator shows satisfaction or joy. It often takes values near 0 and changes relatively gradually. It rarely becomes greater than 0.

Value Trends

  • Due to various psychological characteristics in conversation, it is sometimes detected even in situations such as arguments (quarrels)
  • There are cases where it is detected (at the segment unit level) as a result of a person expressing and venting their anger by talking back to the other person when feeling angry during an argument

Dissatisfaction

This indicator shows a combination of high stress and high levels of dissatisfaction. It does not change frequently and tends to have the same value continuously.

Value Trends

  • Suggests a (negative) deviation from prior expectations

Extreme Fluctuation

This indicator shows the extremity of movement in overall emotions. It often takes values near 0 and changes relatively gradually.

Value Trends

  • Not used on its own; it is used together with other parameters to refine their interpretation. Example: If high stress is detected along with this parameter, it may indicate a situation where emotional, impulsive anger is occurring.