SABER

Authors
Markov Ilia
Gómez Adorno Helena Montserrat
Sidorov Grigori

Title	Language- and Subtask-Dependent Feature Selection and Classifier Parameter Tuning for Author Profiling. Notebook for PAN at CLEF 2017
Type	Conference
Subtype	Proceedings
Description	18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017
Abstract	We present the CIC’s approach to the Author Profiling (AP) task at PAN 2017. This year's task consists of two subtasks: gender identification and language variety identification in English, Spanish, Portuguese, and Arabic. We use typed and untyped character n-grams, word n-grams, and non-textual features (domain names). We experimented with various feature representations (binary, raw frequency, normalized frequency, log-entropy weighting, tf-idf), machine-learning algorithms (liblinear and libSVM implementations of Support Vector Machines (SVM), multinomial naive Bayes, an ensemble classifier, meta-classifiers), and frequency threshold values. We adjusted the system configuration for each language and subtask.
Notes
Location	Dublin
Country	Ireland
No. of pages	7 pp.
Vol. / Ch.
Start	2017-09-11
End	2017-09-14
ISBN/ISSN
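
The abstract mentions untyped character n-grams, frequency thresholds, and several feature representations. The following is a minimal sketch of those three ideas in plain Python, not the authors' implementation: the function names, the n-gram length of 3, the threshold of 2, and the choice to normalize by the sum of retained counts are all assumptions made for illustration. Typed n-grams, log-entropy weighting, tf-idf, and the classifiers are not covered.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping (untyped) character n-grams of a text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocabulary(docs, n=3, min_freq=2):
    """Keep n-grams whose total corpus frequency reaches min_freq.

    min_freq plays the role of a frequency threshold (hypothetical value).
    """
    totals = Counter()
    for doc in docs:
        totals.update(char_ngrams(doc, n))
    return sorted(g for g, count in totals.items() if count >= min_freq)


def vectorize(doc, vocab, n=3, scheme="raw"):
    """Represent one document over vocab with the chosen scheme."""
    counts = Counter(char_ngrams(doc, n))
    raw = [counts[g] for g in vocab]
    if scheme == "raw":
        return raw
    if scheme == "binary":
        return [1 if c else 0 for c in raw]
    if scheme == "normalized":
        # Assumption: normalize by the sum of retained n-gram counts.
        total = sum(raw)
        return [c / total if total else 0.0 for c in raw]
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    docs = ["hello world", "hello there"]
    vocab = build_vocabulary(docs, n=3, min_freq=2)
    for scheme in ("binary", "raw", "normalized"):
        print(scheme, vectorize(docs[0], vocab, scheme=scheme))
```

Vectors of this kind could then be passed to a linear classifier; tuning the n-gram length, threshold, and scheme separately per language and subtask corresponds to the per-configuration adjustment the abstract describes.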