Name: Real-time Recognition of Speech Emotion for Human-robot Interaction
Start: 2024-10-09T14:00:00-0400
End: 2024-10-09T16:00:00-0400

Exhibits+ badges provide access to the ADAM Audio Immersive Room, the Genelec Immersive Room, Tech Tours, and the presentations on the Main Stage.

All Access badges provide access to all content in the Program (Tech Tours still require registration).

Wednesday October 9, 2024 2:00pm - 4:00pm EDT

Poster

In this paper, we propose a novel method for real-time speech emotion recognition (SER) tailored for human-robot interaction. Traditional SER techniques, which analyze entire utterances, often struggle in real-time scenarios due to their high latency. To overcome this challenge, the proposed method breaks speech down into short, overlapping segments and uses a soft voting mechanism to aggregate emotion probabilities in real time. The method is applied to an SER model that uses the pre-trained wav2vec 2.0 for feature extraction and a convolutional network for emotion classification. Its performance was evaluated on the KEMDy19 dataset, a Korean emotion dataset focusing on four key emotions: anger, happiness, neutrality, and sadness. Applying the real-time method with a segment duration of 0.5 or 3.0 seconds resulted in a relative reduction in unweighted accuracy of 10.61% or 5.08%, respectively, compared with processing entire utterances. In exchange, the real-time factor (RTF) improved significantly.
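
The segment-and-vote idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hop length, and the use of a running mean as the soft vote are assumptions made for illustration, and the per-segment probabilities would in practice come from a classifier such as the wav2vec 2.0 plus convolutional network model described above. The RTF helper uses the usual definition (processing time divided by audio duration, lower is better).

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

# The four emotion classes named in the abstract.
EMOTIONS = ["anger", "happiness", "neutrality", "sadness"]


def split_segments(
    samples: Sequence[float], segment_len: int, hop_len: int
) -> Iterator[Sequence[float]]:
    """Yield fixed-length windows; hop_len < segment_len makes them overlap.

    Lengths are in samples. The hop length is an assumption; the abstract
    only states that segments are short and overlapping.
    """
    if segment_len <= 0 or hop_len <= 0:
        raise ValueError("segment_len and hop_len must be positive")
    if len(samples) <= segment_len:
        # Audio shorter than one window: emit it as a single segment.
        yield samples
        return
    for start in range(0, len(samples) - segment_len + 1, hop_len):
        yield samples[start:start + segment_len]


def soft_vote_stream(
    segment_probs: Iterable[Sequence[float]],
) -> Iterator[Tuple[str, List[float]]]:
    """After each segment, yield the current label and averaged probabilities.

    Soft voting is sketched here as a running mean of the per-segment
    class probabilities, so a decision is available after every segment.
    """
    totals = [0.0] * len(EMOTIONS)
    for count, probs in enumerate(segment_probs, start=1):
        if len(probs) != len(EMOTIONS):
            raise ValueError("expected one probability per emotion class")
        totals = [t + p for t, p in zip(totals, probs)]
        mean = [t / count for t in totals]
        best = max(range(len(mean)), key=mean.__getitem__)
        yield EMOTIONS[best], mean


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; values below 1 keep up with input."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

With 16 kHz audio, a 0.5-second segment would correspond to `segment_len=8000`. Because `soft_vote_stream` is a generator, a robot could act on the label after each segment instead of waiting for the end of the utterance.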

Speakers

AES Show 2024 NY

Jimin Jun

Hong Kook Kim
