Authors
Yigezu Mesay Gemeda
Bade Girma Yohannis
Tonja Atnafu Lambebo
Kolesnikova Olga
Sidorov Grigori
Title Bilingual Word-Level Language Identification for Omotic Languages
Type Conference
Subtype Proceedings paper
Description 11th EAI International Conference on Advancement of Science and Technology, ICAST 2023
Abstract Language identification is the task of determining the language or languages of a given text. In many real-world scenarios, text may contain more than one language, particularly in multilingual communities. Bilingual language identification (BLID) is the task of identifying and distinguishing between two languages in a given text. This paper presents BLID for two languages spoken in the southern part of Ethiopia, namely Wolaita and Gofa. The lexical similarities and differences between the two languages make the identification task challenging. To address this challenge, we ran experiments with several approaches. Of these, the combination of a BERT-based pretrained language model and an LSTM performed best, with an F1-score of 0.72 on the test set. The work may help in tackling unwanted social media issues and provides a foundation for further research in this area. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Notes DOI 10.1007/978-3-031-64151-0_5
Place Bahir Dar
Country Ethiopia
Pages 63-77
Vol. / Ch.
Start 2023-08-25
End 2023-08-27
ISBN/ISSN 9783031641503