Authors
Ta Hoang Thang
Rahman Abu Bakar Siddiqur
Gelbukh Alexander
Title THANGCIC at PoliticEs 2022: Term-based BERT for Extracting Political Ideology from Spanish Author Profiling
Type Conference
Subtype Proceedings
Description 2022 Iberian Languages Evaluation Forum, IberLEF 2022
Abstract This paper presents our participation in the task of detecting the gender, profession, and political ideology of Spanish users from their tweets, from both a binary and a multi-class perspective. The task plays an important role in identifying the political ideology of parties and politicians, especially newly emerging ones. This may support related tasks such as predicting election outcomes or influencing citizens' decisions through propagation systems. For each user, we extracted the most popular terms from a set of his/her tweets as features, then used them as input data for training, which applied a transfer learning setup on pre-trained BERT models. We suggest our quick method as a baseline for the task; its highest macro-averaged F1 was 72.72%. In detail, we obtained an F1 of 69.14% for gender, 81.47% for profession, 75.76% for binary ideology, and 64.51% for multiclass ideology. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Notes CEUR Workshop Proceedings, v. 3202
Place Coruña
Country Spain
No. of pages
Vol. / Ch.
Start 2022-09-20
End
ISBN/ISSN