Abstract |
Sentiment analysis of unstructured natural language text has recently received considerable attention from the research community. Within biologically inspired machine learning approaches, finding good feature sets is particularly challenging yet very important. In this paper, we focus on this fundamental issue of the sentiment analysis task. Specifically, we use concepts as features and present a concept extraction algorithm, based on a novel concept parser scheme, that extracts semantic features by exploiting semantic relationships between words in natural language text. Additional conceptual information about each concept is obtained from the ConceptNet ontology: concepts extracted from the text are sent as queries to ConceptNet to retrieve their semantics. We select important concepts and eliminate redundant ones using the Minimum Redundancy and Maximum Relevance feature selection technique. The selected concepts are then used to build a machine learning model that classifies a given document as positive or negative. We evaluate our concept extraction approach on a benchmark movie review dataset provided by Cornell University and on product review datasets covering books, DVDs, and electronics. Comparative experimental results show that our proposed approach to sentiment analysis outperforms existing state-of-the-art methods. |
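The feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes binary concept-presence features and uses the common greedy "difference" form of Minimum Redundancy Maximum Relevance (relevance to the label minus mean redundancy with already-selected features). The function names, the toy concept names, and the toy data are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, labels, k):
    """Greedily pick up to k feature names by relevance minus mean redundancy.

    `features` maps a (hypothetical) concept name to its per-document column.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected, remaining = [], set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): 8 documents, label 1 = positive, 0 = negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "great":      [1, 1, 1, 0, 0, 0, 0, 0],
    "great_film": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of "great"
    "boring":     [0, 0, 0, 0, 0, 1, 1, 0],
}
chosen = mrmr_select(features, labels, 2)
```

On this toy data the duplicate concept is skipped in favor of the less relevant but non-redundant one, so `chosen` is `["great", "boring"]`. The selected columns would then be passed to whatever classifier is used downstream.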