Abstract |
In this work, multi-document extractive summaries are obtained using supervised learning algorithms on a well-known dataset (DUC 2002). The methodology has three steps: the pre-processing step, which filters irrelevant words and reduces the vocabulary using stemming; the representation step, which transforms sentences into vectors; and the classification step, which selects sentences for the summary. The last step is crucial because it determines the relevance of each sentence according to the information included in the embeddings. We found that classifier performance is not related to summary quality, mainly because the classifier's goal is not aligned with the summarizer's goal: the classifier is based on selecting whole sentences, while summarization is evaluated on n-grams, for example with ROUGE-n. This mismatch is therefore relevant when comparing performance across different works in the state of the art. © 2020, Springer Nature Switzerland AG. |
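
The gap between sentence-level classification and n-gram evaluation can be illustrated with a minimal sketch of ROUGE-n recall. This is a simplified illustration rather than the official ROUGE toolkit or the authors' evaluation setup: it assumes whitespace tokenization, lowercasing, and a single reference summary, and the function names `ngrams` and `rouge_n_recall` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-n recall: clipped n-gram overlap / reference n-grams.

    Assumption: plain whitespace tokenization and lowercasing, with no
    stemming or stopword removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it appears
    # in the candidate (clipped counts).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# A sentence that a classifier would count as a "wrong" selection can still
# share most of its n-grams with the reference summary.
reference = "the storm closed the airport"
selected = "the airport was closed by the storm"
unigram_recall = rouge_n_recall(selected, reference, n=1)
bigram_recall = rouge_n_recall(selected, reference, n=2)
```

In this toy case the selected sentence is not identical to the reference, so a whole-sentence classifier metric would treat it as a miss, yet it covers every reference unigram and half of the reference bigrams. That is the kind of divergence between classifier accuracy and ROUGE-n scores the abstract describes.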