Abstract |
In this paper, we present a lexical resource for preprocessing texts published in social networks. It was developed for four languages: English, Spanish, Dutch, and Italian. The resource contains dictionaries of slang words, abbreviations, contractions, and emoticons commonly used in social networks. The dictionaries were used to preprocess tweets from the corpus of the author profiling task (PAN 2015). The results show that using the proposed dictionaries helps to improve the efficiency of classifiers for the author profiling task. |