Abstract
In this article, we consider the problem of supervised morphological analysis using an approach that differs from those widely used in industry. We describe a new lemmatization method based on machine learning, specifically on regression analysis algorithms trained on an open grammatical dictionary of the Russian language. We compare the results with existing applications currently used for lemmatization in Russian-language NLP tasks. The proposed method shows potential for further development: it achieves comparable quality while using a relatively simple machine learning algorithm, and it is not rule-based, so it requires no manual work. The source code for our lemmatizer is publicly available.