Using Musical and Statistical Analysis of the Predominant Melody of the Voice to Create datasets from a Database of Popular Brazilian Hit Songs
DOI:
https://doi.org/10.5753/jidm.2022.2336
Keywords:
MUSIC INFORMATION RETRIEVAL, HIT SONGS, NON-HIT SONGS, BRAZILIAN MUSIC DATASETS, MUSICAL SEMANTIC INFORMATION DATASETS
Abstract
This work describes the creation and optimization of a large set of features extracted from a database of 882 popular Brazilian hit and non-hit songs, covering 2014 to May 2019. From this database we created four datasets of musical features. The first comprises 3215 statistical features. The second, third and fourth are new: they are derived from the predominant melody of the voice, for which no similar databases were previously available for study. The second dataset represents the time-frequency spectrogram of the singer’s voice during the first 90 seconds of each song. The third results from a statistical analysis of the predominant melody of the voice. The fourth, the most distinctive, results from a musical semantic analysis of the predominant melody of the voice, which yields a table of the most frequent melodic sequences in each song. The datasets contain only Brazilian songs and cover a limited, contemporary period. They are intended to encourage the study of Machine Learning techniques that require musical information, and the extracted features can support future studies in Music and Computer Science.
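As a rough illustration of what a table of the most frequent melodic sequences could look like, the following Python sketch quantizes a pitch contour to MIDI notes and counts recurring note runs. It is not the authors' procedure: the function names, the handling of unvoiced frames, the sequence length of three notes, and the toy contour are all assumptions made for this example.

```python
import math
from collections import Counter


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to the nearest MIDI note (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def contour_to_notes(contour_hz):
    """Quantize voiced frames to MIDI notes and merge consecutive repeats.

    Assumption of this sketch: frames with a non-positive frequency are
    unvoiced and are simply skipped.
    """
    notes = []
    for f in contour_hz:
        if f <= 0:
            continue
        n = hz_to_midi(f)
        if not notes or notes[-1] != n:
            notes.append(n)
    return notes


def most_frequent_sequences(notes, length=3, top=5):
    """Count every run of `length` consecutive notes; return the `top` most common."""
    grams = Counter(
        tuple(notes[i:i + length]) for i in range(len(notes) - length + 1)
    )
    return grams.most_common(top)


# Hypothetical toy contour in Hz: A4, A4, B4, C5, unvoiced, A4, B4, C5
contour = [440.0, 440.0, 493.88, 523.25, 0.0, 440.0, 493.88, 523.25]
notes = contour_to_notes(contour)
top_sequences = most_frequent_sequences(notes, length=3, top=1)
print(notes)          # [69, 71, 72, 69, 71, 72]
print(top_sequences)  # [((69, 71, 72), 2)]
```

In practice the contour would come from a melody extraction step applied to the audio, and the sequence length and note quantization would need to be chosen to match the analysis at hand.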