Abstract |
The open exchange of hate speech, insults, derogatory remarks, and obscenities on social media platforms can undermine objective discourse, facilitate radicalization by spreading propaganda, and expose people to danger. People targeted by such offensive and hateful content often experience physiological effects as a result. In this work, we present the models we submitted to HASOC 2023 for detecting hate speech and offensive content in two Indo-Aryan languages, Gujarati and Sinhala. Although both are considered low-resource languages, our models detected hate speech with commendable accuracy after being fine-tuned on language-specific hate speech datasets. We fine-tuned two transformer models, DistilBERT and mBERT, and show that they are effective at detecting hate speech in Indo-Aryan texts. mBERT achieved a macro F1-score of 0.6 on the Sinhala text classification task and a higher score of 0.8 on the Gujarati one. © 2023 Copyright for this paper by its authors. |
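
For readers unfamiliar with the metric reported above, the macro F1-score is the unweighted mean of the per-class F1 scores, so a minority class counts as much as the majority class. The following is a minimal sketch of that computation in plain Python. The function name `macro_f1`, the label strings `"HOF"` and `"NOT"`, and the toy predictions are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (not taken from the paper's datasets)
y_true = ["HOF", "HOF", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["HOF", "NOT", "NOT", "NOT", "NOT", "HOF"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy case the per-class F1 scores are 0.5 and 0.75, which average to 0.625, while plain accuracy would be 4/6. The gap illustrates why macro F1 is a common choice for imbalanced tasks such as hate speech detection.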