Authors
Butt Sabur
Ashraf Noman
Sidorov Grigori
Gelbukh Alexander
Title Sexism Identification using BERT and Data Augmentation - EXIST2021
Type Conference
Subtype Proceedings
Description 2021 Iberian Languages Evaluation Forum, IberLEF 2021
Abstract Sexism is defined as discrimination against females of all ages. We have seen a rise of sexism on social media platforms, manifesting itself in many forms. The paper presents the best-performing machine learning and deep learning algorithms, as well as BERT results, on the “sEXism Identification in Social neTworks (EXIST 2021)” shared task. The task incorporates a multilingual dataset containing both Spanish and English tweets. The multilingual nature of the dataset and the inconsistencies of social media text make it a challenging problem. Considering these challenges, the paper focuses on pre-processing techniques and data augmentation to boost results on various machine learning and deep learning methods. We achieved an F1 score of 78.02% on the sexism identification task (task 1) and an F1 score of 49.08% on the sexism categorization task (task 2).
Notes CEUR Workshop Proceedings
Location Virtual, online
Country Spain
Pages 381-389
Vol. / Ch. v. 2943
Start 2021-09-21
End
ISBN/ISSN