Title |
Sexism identification using BERT and data augmentation - EXIST2021 |
Type |
Conference |
Subtype |
Proceedings paper |
Description |
2021 Iberian Languages Evaluation Forum, IberLEF 2021 |
Abstract |
Sexism is defined as discrimination against females of all ages. We have seen a rise of sexism on social media platforms, manifesting itself in many forms. The paper presents the best-performing machine learning and deep learning algorithms, as well as BERT results, on the “sEXism Identification in Social neTworks (EXIST 2021)” shared task. The task incorporates a multilingual dataset containing both Spanish and English tweets. The multilingual nature of the dataset and the inconsistencies of social media text make it a challenging problem. Considering these challenges, the paper focuses on pre-processing techniques and data augmentation to boost results on various machine learning and deep learning methods. We achieved an F1 score of 78.02% on the sexism identification task (task 1) and an F1 score of 49.08% on the sexism categorization task (task 2). |
Notes |
CEUR Workshop Proceedings |
Location |
Virtual, online |
Country |
Spain |
Pages |
381-389 |
Vol. / Ch. |
v. 2943 |
Start |
2021-09-21 |
End |
|
ISBN/ISSN |
|