SABER

Authors
Sidorov Grigori

Title	Cic-IPN@INLi2018: Indian native language identification
Type	Conference
Subtype	Proceedings
Description	10th Working Notes of FIRE - Forum for Information Retrieval Evaluation, FIRE-WN 2018
Abstract	In this paper, we describe the CIC-IPN submissions to the shared task on Indian Native Language Identification (INLI 2018). We use the Support Vector Machines algorithm trained on numerous feature types: word, character, part-of-speech tag, and punctuation mark n-grams, as well as character n-grams from misspelled words and emotion-based features. The features are weighted using the log-entropy scheme. Our team achieved 41.8% accuracy on test set 1 and 34.5% accuracy on test set 2, ranking 3rd in the official INLI shared task scoring. © 2018 CEUR-WS. All Rights Reserved.
Notes	CEUR Workshop Proceedings, v. 2266
Place	Gandhinagar
Country	India
Pages	82-88
Vol. / Chap.
Start	2018-12-06
End	2018-12-09
ISBN/ISSN