Emotion detection in Brazilian Portuguese is less studied than in English. We benchmarked a large language model (Mistral 24B), a language-specific transformer model (BERTimbau), and the lexicon-based EmoAtlas for classifying emotions in Brazilian Portuguese text, with a focus on eight emotions derived from Plutchik’s model. Evaluation covered four corpora: 4000 stock-market tweets, 1000 news headlines, 5000 GoEmotions Reddit comments translated by LLMs, and 2000 DeepSeek-generated headlines. While BERTimbau achieved the highest average scores (accuracy 0.876, precision 0.529, and recall 0.423), an overlap with Mistral (accuracy 0.831, precision 0.522, and recall 0.539) and notable performance variability suggest there is no single top performer. Both transformer-based models outperformed the lexicon-based EmoAtlas (accuracy 0.797) but required up to 40 times more computational resources. We also introduce a novel “emotional fingerprinting” methodology that uses a synthetically generated dataset to probe emotional alignment; it revealed an imperfect overlap in the emotional representations of the models. While LLMs deliver higher overall scores, EmoAtlas offers superior interpretability and efficiency, making it a cost-effective alternative. This work delivers the first quantitative benchmark for interpretable emotion detection in Brazilian Portuguese, with open datasets and code to foster research in multilingual natural language processing.
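
To make the reported metrics concrete, the following is a minimal sketch of how accuracy, precision, and recall could be computed per emotion and then averaged over Plutchik's eight emotions. It is not the authors' evaluation code. The function names (`per_emotion_scores`, `macro_average`), the set-of-labels data layout, and the choice of an unweighted (macro) average are all assumptions made for illustration; the abstract does not state how the averages were formed.

```python
from typing import Dict, List, Set

# Plutchik's eight basic emotions.
EMOTIONS = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]


def per_emotion_scores(gold: List[Set[str]],
                       pred: List[Set[str]]) -> Dict[str, Dict[str, float]]:
    """Score each emotion as a separate present/absent decision per text.

    Hypothetical helper: `gold` and `pred` hold one set of emotion labels
    per text, in the same order.
    """
    scores: Dict[str, Dict[str, float]] = {}
    for emotion in EMOTIONS:
        tp = fp = fn = tn = 0
        for g, p in zip(gold, pred):
            in_gold, in_pred = emotion in g, emotion in p
            if in_gold and in_pred:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_gold:
                fn += 1
            else:
                tn += 1
        total = tp + fp + fn + tn
        scores[emotion] = {
            "accuracy": (tp + tn) / total if total else 0.0,
            # Undefined ratios (no predictions / no gold labels) count as 0.0.
            "precision": tp / (tp + fp) if (tp + fp) else 0.0,
            "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        }
    return scores


def macro_average(scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Unweighted mean of each metric across emotions (an assumed choice)."""
    metrics = ["accuracy", "precision", "recall"]
    return {m: sum(s[m] for s in scores.values()) / len(scores)
            for m in metrics}


if __name__ == "__main__":
    # Toy labels invented for this example; not taken from the paper's corpora.
    gold = [{"joy"}, {"fear"}, set(), {"joy", "trust"}]
    pred = [{"joy"}, {"anger"}, set(), {"joy"}]
    print(macro_average(per_emotion_scores(gold, pred)))
```

Under this setup each emotion is absent from most texts, so true negatives dominate and accuracy can sit well above precision and recall. That is one possible reading of the gap between the accuracy and precision figures quoted above, offered here as an illustration rather than as the paper's explanation.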

Benchmarking Psychological Lexicons and Large Language Models for Emotion Detection in Brazilian Portuguese / Domingues Aparecido, Thales David; Carrillo, Alexis; Camargo, Chico Q.; Stella, Massimo. - In: AI. - ISSN 2673-2688. - 6:10(2025). [10.3390/ai6100249]

Benchmarking Psychological Lexicons and Large Language Models for Emotion Detection in Brazilian Portuguese

Carrillo, Alexis; Stella, Massimo
2025-01-01

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/466413