Synthetic Minority Over-sampling Technique for detecting Malicious Traffic targeting Internet of Things' devices

Authors

DOI:

https://doi.org/10.5753/jisa.2025.5251

Keywords:

IoT, machine learning, malicious traffic, over-sampling

Abstract

This study proposes a multiclass machine learning approach for detecting 34 distinct types of cyberattacks in Internet of Things (IoT) traffic using the CICIoT2023 dataset. We evaluate the performance of lightweight classifiers (Bernoulli Naive Bayes, Decision Tree, Random Forest, and XGBoost) under highly imbalanced conditions. To address class imbalance and improve minority-class detection, we apply the Synthetic Minority Over-sampling Technique (SMOTE). In addition, we tune hyperparameters with RandomizedSearchCV and assess model performance using macro-averaged recall, precision, and F1-score. Experimental results demonstrate that XGBoost and Random Forest, when optimized and combined with SMOTE, consistently achieve high and balanced detection rates across all classes. These findings suggest that both models are applicable to real-world IoT intrusion detection scenarios, particularly in resource-constrained environments.
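The two ideas the abstract relies on, SMOTE interpolation and macro-averaged metrics, can be illustrated with a minimal sketch. This is not the authors' pipeline, which is not shown on this page. It is a standard-library illustration under stated assumptions: the function names `smote` and `macro_scores`, the Euclidean distance, and the default `k=5` are hypothetical choices made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples for one minority class.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def macro_scores(y_true, y_pred):
    """Return (precision, recall, F1), each an unweighted mean over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Macro averaging matters under imbalance because every class counts equally. With 8 samples of class 0 and 2 of class 1, a model that always predicts class 0 is 80% accurate, yet `macro_scores([0]*8 + [1]*2, [0]*10)` gives a macro recall of only 0.5, since the minority class is never detected. In practice, oversampling should be applied to the training split only, so that synthetic points do not leak into evaluation.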

Published

2025-06-19

How to Cite

Duarte, J. D., Bispo, G. D., Rodrigues, G. A. P., Serrano, A. L. M., Saiki, G. M., & Gonçalves, V. P. (2025). Synthetic Minority Over-sampling Technique for detecting Malicious Traffic targeting Internet of Things’ devices. Journal of Internet Services and Applications, 16(1), 332–344. https://doi.org/10.5753/jisa.2025.5251

Section

Research article