SABER

Authors
Sidorov Grigori
Posadas Durán Juan Pablo Francisco
Jiménez Salazar Héctor
Chanona Hernández Liliana

Title	A New Combined Lexical and Statistical based Sentence Level Alignment Algorithm for Parallel Texts
Type	Journal
Sub-type	CONACYT
Description	INTERNATIONAL JOURNAL OF COMPUTATIONAL LINGUISTICS AND APPLICATIONS
Abstract	Parallel text alignment is an active research area in Natural Language Processing. In this paper, we propose a method for sentence-level alignment of parallel texts that combines lexical and statistical information. The alignment procedure uses dynamic programming. We ran our experiments on Spanish and English texts. We use lexical information from a bilingual Spanish-English dictionary, together with sentence length measured in words and in characters. The proposed method was tested on a corpus of fiction texts, where multiple alignments, omissions, and insertions are more frequent than in other types of text. We obtained better results than the standard Vanilla aligner, which uses a purely statistical approach.
Notes
Place
Country
Pages	257-263
Vol. / Chap.	Vol. 2, No. 1-2
Start	2011-12-01
End
ISBN/ISSN