Authors
Zhila Alisa
Gelbukh Alexander
Title Open Information Extraction for Spanish Language based on Syntactic Constraints
Type Conference
Subtype Proceedings
Description Second Workshop on Natural Language Processing for Social Media (SocialNLP
Abstract Open Information Extraction (Open IE) serves for the analysis of vast amounts of texts by extraction of assertions, or relations, in the form of tuples ⟨argument 1; relation; argument 2⟩. Various approaches to Open IE have been designed to perform in a fast, unsupervised manner. All of them require language specific information for their implementation. In this work, we introduce an approach to Open IE based on syntactic constraints over POS tag sequences targeted at Spanish language. We describe the rules specific for Spanish language constructions and their implementation in EXTRHECH, an Open IE system for Spanish. We also discuss language-specific issues of implementation. We compare EXTRHECH’s performance with that of REVERB, a similar Open IE system for English, on a parallel dataset and show that these systems perform at a very similar level. We also compare EXTRHECH’s performance on a dataset of grammatically correct sentences against its performance on a dataset of random texts extracted from the Web, drastically different in their quality from the first dataset. The latter experiment shows robustness of EXTRHECH on texts from the Web.
Notes Association for Computational Linguistics ** Drive: Open-information_2014
Place Baltimore, Maryland
Country United States
Pages 78–85
Vol. / Ch.
Start 2014-06-22
End
ISBN/ISSN