SABER

Authors
Gelbukh Alexander
Cendejas Castro Eduardo Antonio
Barceló Alonso Grettel
Sidorov Grigori

Title	Incorporating Linguistic Information to Statistical Word-Level Alignment
Type	Conference
Subtype	SCOPUS
Description	Lecture Notes in Computer Science
Abstract	Parallel texts are enriched by alignment algorithms, thus establishing a relationship between the structures of the languages involved. Depending on the alignment level, the enrichment can be performed on the paragraphs, sentences, or words of the content expressed in the source language and its translation. There are two main approaches to word-level alignment: statistical and linguistic. Because languages differ in their grammar rules, statistical algorithms usually give lower precision. For this reason, such algorithms are generally developed for a specific language pair using linguistic techniques. This paper presents a hybrid alignment system based on the combination of the two traditional approaches. It provides user-friendly configuration and is adaptable to the computational environment. The system uses linguistic resources and procedures such as identification of cognates, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms.
Notes	14th Iberoamerican Conference on Pattern Recognition, CIARP 2009; Code 83218; ISBN: 3642102670;978-364210267-7
Place	Guadalajara, Jalisco
Country	Mexico
Pages	387-394
Vol. / Chap.	5856
Start	2009-11-15
End	2009-11-18
ISBN/ISSN	3642102670;978-36421