Abstract |
This research investigates how well various computational models and feature sets classify reviews of scientific texts into distinct categories. Using a combination of Word Space Models (WSM) and the Linguistic Inquiry and Word Count (LIWC) dictionary, the study first categorizes reviews into five classes and then simplifies the scheme to a binary one (‘accept’ and ‘reject’). Although the feature sets are relatively simple, the binary classification approach notably improves on a basic baseline that indiscriminately assigns every review to the most populous category. We obtain a recall of 0.758, compared with 0.585 for the majority-class baseline and 0.62 and 0.66 for BERT and RoBERTa, respectively. This performance can be considered significant given the diverse and subjective nature of the review content, which was contributed by 80 distinct individuals, each with a unique writing style and evaluative criteria. This work contributes to explainable AI (XAI) through linguistic analysis, revealing, for example, that a minimal subset of features, specifically two of the seventy provided by LIWC, can yield insightful distinctions in review classifications (0.649 recall). The analysis further identifies specific lexemes, such as ‘not’, ‘since’ and ‘had’, which offer deeper insights into the linguistic constructs employed by reviewers. Copyright Author(s) 2025. |