Authors
Gelbukh Alexander
Title Soft Cardinality: A Parameterized Similarity Function for Text Comparison
Type Conference
Subtype Proceedings
Description The First Joint Conference on Lexical and Computational Semantics. Collocated with NAACL-HLT 2012; The Association Of Computational Linguistics Anthology Network
Abstract We present an approach for the construction of text similarity functions using a parameterized resemblance coefficient in combination with a softened cardinality function called soft cardinality. Our approach provides a consistent and recursive model, varying levels of granularity from sentences to characters. Therefore, our model was used to compare sentences divided into words, and in turn, words divided into q-grams of characters. Experimentally, we observed that a performance correlation function in a space defined by all parameters was relatively smooth and had a single maximum achievable by "hill climbing." Our approach used only surface text information, a stop-word remover, and a stemmer to tackle the semantic text similarity task 6 at SEMEVAL 2012. The proposed method ranked 3rd (average), 5th (normalized correlation), and 15th (aggregated correlation) among 89 systems submitted by 31 teams.
Remarks
Location Montreal
Country Canada
Pages 449-453
Vol. / Ch. Vol. 1
Start 2012-06-07
End 2012-06-08
ISBN/ISSN 978-1-937284-22-0
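
Illustrative sketch (not part of the record). The abstract describes comparing sentences as sets of words and words as sets of character q-grams, using a softened cardinality. The record gives no formulas, so the following is only a minimal sketch under stated assumptions: soft cardinality is taken in its commonly cited form, the sum over elements of 1 / (sum of pairwise similarities raised to a power p); word similarity is assumed to be the Dice coefficient over unpadded q-grams; and a plain Jaccard-style ratio stands in for the paper's parameterized resemblance coefficient. Stop-word removal and stemming are omitted. All function names and default parameter values are hypothetical.

```python
def qgrams(word, q=2):
    """Set of character q-grams of a word (no padding; an assumption)."""
    if len(word) < q:
        return {word}
    return {word[i:i + q] for i in range(len(word) - q + 1)}


def word_sim(a, b, q=2):
    """Dice coefficient over q-gram sets (assumed word-level similarity)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def soft_cardinality(words, p=2.0, q=2):
    """Assumed form: sum_i 1 / sum_j sim(w_i, w_j)**p.

    Identical elements count as one; fully dissimilar elements count
    once each; similar elements count as somewhere in between.
    """
    return sum(
        1.0 / sum(word_sim(a, b, q) ** p for b in words)
        for a in words
    )


def soft_jaccard(s1, s2, p=2.0, q=2):
    """Jaccard-style ratio on soft cardinalities (a simplification,
    not the paper's parameterized coefficient)."""
    a = list(dict.fromkeys(s1.lower().split()))  # unique words, order kept
    b = list(dict.fromkeys(s2.lower().split()))
    union = list(dict.fromkeys(a + b))
    card_a = soft_cardinality(a, p, q)
    card_b = soft_cardinality(b, p, q)
    card_union = soft_cardinality(union, p, q)
    if card_union == 0:
        return 0.0
    # Soft intersection inferred from |A| + |B| - |A u B|.
    return (card_a + card_b - card_union) / card_union


if __name__ == "__main__":
    print(soft_jaccard("the cats sat", "the cat sat"))
```

In this sketch "cats" and "cat" share most of their bigrams, so the two sentences score well above the 0.5 that a crisp word-level Jaccard would give them. The exponent p and the q-gram size q are the kind of parameters the abstract reports tuning by hill climbing; the values used here are arbitrary.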