Abstract |
This paper addresses the identification of hate speech directed at the LGBT+ community. The study covers two tasks: Track 1, a multi-class task that identifies LGBT+phobic content in tweets, and Track 2, a multi-label task that detects fine-grained hate speech indicating different types of LGBT+phobia. Pre-processing and oversampling techniques are applied to address data imbalance. Performance is evaluated with the macro-averaged F1 measure. The results show that transformer-based approaches, such as BERT and RoBERTa, are effective at identifying hate speech directed at the LGBT+ community. The study also highlights the challenges posed by data imbalance, order bias, and limited training data, which can bias model performance and hinder the model's ability to learn the underlying patterns in the data. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
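The macro-average F1 measure named in the abstract weights every class equally, which is why it is a common choice when classes are imbalanced. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the labels `"neg"` and `"pos"` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: 9 "neg" tweets, 1 "pos" tweet,
# and a classifier that always predicts the majority class.
y_true = ["neg"] * 9 + ["pos"]
y_pred = ["neg"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy case accuracy is 0.9, while the macro-average F1 is below 0.5 because the minority class contributes an F1 of zero. This illustrates why the measure penalizes a model that ignores rare classes.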