Authors
Shahiki Tash Moein
Armenta Segura Jesús Jorge
Ahani Zahra
Kolesnikova Olga
Sidorov Grigori
Gelbukh Alexander
Title LIDOMA at HOMO-MEX2023@IberLEF: Hate Speech Detection Towards the Mexican Spanish-Speaking LGBT+ Population. The Importance of Preprocessing Before Using BERT-Based Models
Type Conference
Subtype Proceedings
Description 2023 Iberian Languages Evaluation Forum, IberLEF 2023
Abstract Hate speech targeting LGBT+ individuals is a deeply ingrained problem with wide-ranging consequences, including substance abuse disorders and discrimination. These concerns are particularly amplified in Mexico. In this paper, we present our submission to the first track of HOMO-MEX: Hate Speech Detection towards the Mexican Spanish-Speaking LGBT+ Population. We explore the dataset and employ transformer architectures, which have demonstrated significant efficacy in similar sentiment analysis tasks. Specifically, we use BERT-based models and show the importance of preprocessing by reaching the last place in the competition with a Macro F1 score of 0.73. The source code to reproduce our results can be found at https://github.com/moeintash72. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Notes CEUR Workshop Proceedings, v. 3496
Place Jaén
Country Spain
No. of pages
Vol. / Ch.
Start 2023-09-26
End
ISBN/ISSN