Two Meta-learning approaches for noise filter algorithm recommendation


  • Pedro B. Pio University of Brasilia
  • Adriano Rivolli Federal University of Technology – Paraná, Brazil (UTFPR)
  • André C. P. L. F. de Carvalho University of São Paulo
  • Luís P. F. Garcia University of Brasilia



Keywords: Meta-Learning, Noise Detection, Preprocessing, Machine Learning, Algorithm Recommendation, Ranking


Preprocessing techniques can increase the predictive performance, or even allow the use, of Machine Learning (ML) algorithms. This occurs because many of these techniques, such as noise removal or filtering, can improve the quality of a dataset. However, it is not simple to identify which preprocessing techniques to apply to a given dataset. This work presents two approaches to recommend a noise filtering technique using meta-learning. Meta-learning is an automated machine learning (AutoML) method that can, based on a set of features extracted from a dataset, induce a meta-model able to predict the most suitable technique to be applied to a new dataset. The first approach returns a ranking of the noise filtering techniques using regression models. The second sequentially applies multiple meta-models to decide the most suitable noise filtering technique for a particular dataset. For both approaches, we extract the meta-features from synthetic datasets and use as the meta-label the F1-score obtained by different ML algorithms when applied to these datasets. For the experiments, eight noise filtering techniques were used. The experimental results indicated that the ranking approach achieved a higher performance gain than the baseline, while the second approach obtained higher predictive performance. The ranking-based approach also placed the best algorithm in the top-3 positions with high predictive accuracy.
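The ranking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the filter names (`filter_a`, `filter_b`, `filter_c`), the two-dimensional meta-features, the toy F1 values, and the use of a k-nearest-neighbour regressor as the regression model are all assumptions made for illustration. One regressor per filter predicts the F1-score expected on a new dataset, and the filters are then ordered by predicted score.

```python
from math import dist

def knn_predict(train_X, train_y, x, k=3):
    """Predict a score for x as the mean target of its k nearest training rows."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def rank_filters(meta_X, meta_y, new_x, k=3):
    """Return filter names ordered from highest to lowest predicted F1-score."""
    predicted = {
        name: knn_predict(meta_X, scores, new_x, k)
        for name, scores in meta_y.items()
    }
    return sorted(predicted, key=predicted.get, reverse=True)

def in_top_n(ranking, best, n=3):
    """Check whether the truly best filter appears in the first n positions."""
    return best in ranking[:n]

# Hypothetical meta-base: one meta-feature vector per training dataset.
meta_X = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.2), (0.9, 0.1)]

# Hypothetical meta-labels: F1-score reached after applying each filter
# to each training dataset (same order as meta_X).
meta_y = {
    "filter_a": [0.90, 0.88, 0.60, 0.58],
    "filter_b": [0.70, 0.72, 0.85, 0.87],
    "filter_c": [0.50, 0.52, 0.55, 0.53],
}

# Rank the filters for a new dataset described by its meta-features.
ranking = rank_filters(meta_X, meta_y, (0.15, 0.85), k=2)
print(ranking)
```

Any regression model could replace the nearest-neighbour predictor here; the ranking step only needs one predicted score per filter. The helper `in_top_n` mirrors the top-3 check mentioned in the abstract.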








How to Cite

Pio, P. B., Rivolli, A., de Carvalho, A. C. P. L. F., & Garcia, L. P. F. (2024). Two Meta-learning approaches for noise filter algorithm recommendation. Journal of Information and Data Management, 15(1), 132–141.



Best Papers of KDMiLe 2022 - Extended Papers