Benchmarking Psychological Lexicons and Large Language Models for Emotion Detection in Brazilian Portuguese

  • Thales David Domingues Aparecido
  • Alexis Carrillo
  • Chico Q. Camargo
  • Massimo Stella

Research output: Contribution to journal › Article › peer-review

Abstract

Emotion detection in Brazilian Portuguese is less studied than in English. We benchmarked a large language model (Mistral 24B), a language-specific transformer model (BERTimbau), and the lexicon-based EmoAtlas for classifying emotions in Brazilian Portuguese text, with a focus on eight emotions derived from Plutchik’s model. Evaluation covered four corpora: 4000 stock-market tweets, 1000 news headlines, 5000 GoEmotions Reddit comments translated by LLMs, and 2000 DeepSeek-generated headlines. While BERTimbau achieved the highest average scores (accuracy 0.876, precision 0.529, and recall 0.423), an overlap with Mistral (accuracy 0.831, precision 0.522, and recall 0.539) and notable performance variability suggest there is no single top performer; however, both transformer-based models outperformed the lexicon-based EmoAtlas (accuracy 0.797) but required up to 40 times more computational resources. We also introduce a novel “emotional fingerprinting” methodology using a synthetically generated dataset to probe emotional alignment, which revealed an imperfect overlap in the emotional representations of the models. While LLMs deliver higher overall scores, EmoAtlas offers superior interpretability and efficiency, making it a cost-effective alternative. This work delivers the first quantitative benchmark for interpretable emotion detection in Brazilian Portuguese, with open datasets and code to foster research in multilingual natural language processing.
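
The reported accuracies (around 0.8 to 0.9) sit well above the precision and recall values (around 0.4 to 0.5). One reading, which is an assumption and not stated in the abstract, is that each of the eight emotions is scored as a separate yes/no label and the scores are then averaged, so the many true negatives raise accuracy but leave precision and recall unchanged. The sketch below illustrates that effect on toy data; the function names, the English emotion labels, and the example labels are hypothetical and are not taken from the paper's code or datasets.

```python
from typing import List, Sequence, Set, Tuple

# Plutchik's eight basic emotions (label spelling here is an assumption).
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]


def binary_metrics(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) for one emotion scored as yes/no."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    accuracy = (tp + tn) / len(gold)
    # Convention assumed here: undefined precision/recall counts as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def macro_metrics(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Average the per-emotion metrics over all eight emotions."""
    totals = [0.0, 0.0, 0.0]
    for emotion in EMOTIONS:
        g = [int(emotion in labels) for labels in gold]
        p = [int(emotion in labels) for labels in pred]
        for i, value in enumerate(binary_metrics(g, p)):
            totals[i] += value
    acc, prec, rec = (t / len(EMOTIONS) for t in totals)
    return acc, prec, rec


# Toy example: 10 texts, only 2 of which truly express the emotion.
gold_joy = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_joy = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(binary_metrics(gold_joy, pred_joy))  # (0.8, 0.5, 0.5)
```

In the toy case the classifier gets only one of the two positive texts right and raises one false alarm, yet accuracy is still 0.8 because seven of the ten texts are easy negatives. Under this assumption, precision and recall say more than accuracy when comparing the three systems.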

Original language: English
Article number: 249
Journal: AI (Switzerland)
Volume: 6
Issue number: 10
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 by the authors.

Keywords

  • Brazilian Portuguese
  • EmoAtlas
  • emotional profiling
  • large language models
  • text analysis
