Abstract
In this research, we address the supervised problem of composer classification from a Natural Language Processing perspective. Starting from digital symbolic music files, we build two representations: a class representation and another based on MIDI pitches. We use n-grams to build feature vectors of musical compositions based on their harmonic content. To this end, we extract n-grams of size 1 to 4 in the harmonic direction, differentiating between all possible subsets of instruments. We populate a term-frequency matrix with the composition vectors and classify them by means of a Support Vector Machine (SVM) classifier. Different classification models are evaluated, e.g., by applying feature filters and varying hyperparameters such as the TF-IDF formula. The results show that n-grams based on MIDI pitches perform slightly better overall than n-grams based on the class representation, although the best result of each representation is identical. Some of our best models reach accuracies that exceed previous state-of-the-art results on a well-known dataset of string quartets by Haydn and Mozart. © 2024 Instituto Politecnico Nacional. All rights reserved.
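As an illustration of the feature-extraction step summarized above, the following minimal Python sketch shows one possible reading of harmonic n-grams. It is not the authors' implementation. It assumes that each composition has already been reduced to a list of time slices with one MIDI pitch per instrument (`None` for a rest), and that an n-gram of size n is the tuple of simultaneous pitches sounded by a given subset of n instruments. The function names `harmonic_ngrams` and `term_frequency_matrix`, the slice format, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def harmonic_ngrams(slices, max_n=4):
    """Count vertical n-grams of size 1..max_n for one composition.

    Assumption: `slices` is a list of time slices, each a tuple holding one
    MIDI pitch number per instrument, or None where the instrument rests.
    """
    counts = Counter()
    for time_slice in slices:
        voices = range(len(time_slice))
        for n in range(1, min(max_n, len(time_slice)) + 1):
            # Every subset of n instruments yields its own n-gram, so the
            # same pitches played by different instruments stay distinct.
            for subset in combinations(voices, n):
                symbols = tuple(time_slice[v] for v in subset)
                if None in symbols:
                    continue  # skip subsets that include a resting voice
                counts[(subset, symbols)] += 1
    return counts


def term_frequency_matrix(compositions, max_n=4):
    """Build a shared vocabulary and one raw term-frequency row per piece."""
    per_piece = [harmonic_ngrams(slices, max_n) for slices in compositions]
    vocabulary = sorted(set().union(*per_piece))
    matrix = [[counts[term] for term in vocabulary] for counts in per_piece]
    return vocabulary, matrix


# Toy data (invented for illustration): two "quartet" excerpts.
piece_a = [(60, 64, 67, 72), (60, 64, 67, 72)]
piece_b = [(62, 65, 69, None)]
vocabulary, matrix = term_frequency_matrix([piece_a, piece_b])
```

Each row of `matrix` is a raw term-frequency vector. In a full pipeline these rows would be reweighted with a TF-IDF variant and passed to an SVM classifier; both steps are omitted here.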