Authors
Gelbukh Alexander
Rahman Abu Bakar Siddiqur
Ta Hoang Thang
Title Paraphrase Identification: Lightweight Effective Methods Based Features from Pre-trained Models
Type Conference
Subtype Proceedings paper
Description 2022 Iberian Languages Evaluation Forum, IberLEF 2022
Abstract In this paper, we work on Paraphrase Identification in Mexican Spanish (PAR-MEX) at the sentence level. We introduce two lightweight methods, linear regression and a multilayer perceptron, trained on features extracted from pre-trained models. A rule of thumb based on pair similarity is used to filter noise from the positive examples. We obtained a best F1 of 88.67%, which points to the effectiveness of traditional methods when supported by pre-trained models. In the challenge, our result ranked fourth in the organizers' results table. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Notes CEUR Workshop Proceedings, v. 3202
Place Coruña
Country Spain
No. of pages
Vol. / Ch.
Start 2022-09-20
End
ISBN/ISSN