Zero-Day Ransomware Family Detection Based on Printable Character Analysis and Machine Learning
DOI: https://doi.org/10.5753/reic.2025.6021
Keywords: Ransomware, Static analysis, Printable characters, Zero-day detection, Scalability
Abstract
This study proposes a static-analysis method for detecting zero-day ransomware families by extracting printable characters from Windows binary files. The method uses a soft-voting ensemble classifier that combines three machine learning techniques: Adaptive Boosting (ADB), Extra-Trees (EXT), and Logistic Regression (LR). To evaluate the approach, we built a dataset of 2,675 binary samples (ransomware and goodware). The training set includes 1,023 samples from 25 relevant ransomware families and 1,134 goodware samples, while the test set consists of 385 samples from 15 recent ransomware families and 133 goodware samples. In the Detection of New Ransomware Families (DNRF) evaluation, the method achieved 95.88% accuracy, 90.50% precision, 100% recall, and 94.74% F-measure, with an average analysis and prediction time of 0.45 seconds. These results highlight the method’s potential as an additional layer of protection for antivirus systems, particularly on devices with limited hardware resources. Our method advances zero-day ransomware detection by offering a more resilient solution that can be applied in real time.
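The two ideas named in the abstract, printable-character extraction and soft voting, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the minimum run length of 4, the character-frequency features, the helper names, and the example probabilities are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Runs of at least 4 printable ASCII bytes (0x20-0x7E).
# The length threshold is an assumption, not a value from the paper.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_printable_strings(data: bytes) -> List[str]:
    """Return every run of printable ASCII characters found in a binary blob."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(data)]


def char_frequencies(strings: List[str]) -> Dict[str, float]:
    """Relative frequency of each character across all extracted strings.

    Hypothetical feature representation, used here only as an example.
    """
    counts = Counter("".join(strings))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def soft_vote(probabilities: List[float], threshold: float = 0.5) -> Tuple[float, bool]:
    """Average the per-model ransomware probabilities and apply a threshold."""
    mean = sum(probabilities) / len(probabilities)
    return mean, mean >= threshold


# A fabricated byte sequence standing in for a Windows binary.
sample = (
    b"\x00\x01MZ\x90\x00This program cannot be run in DOS mode."
    b"\x00\xff\xfeYOUR FILES ARE ENCRYPTED\x00ab\x00"
)
strings = extract_printable_strings(sample)
features = char_frequencies(strings)

# Made-up probabilities, as three trained models (e.g. ADB, EXT, LR) might emit.
score, is_ransomware = soft_vote([0.9, 0.6, 0.3])
print(strings, round(score, 2), is_ransomware)
```

With trained models, scikit-learn's `VotingClassifier` with `voting="soft"` performs the same averaging over each estimator's `predict_proba` output, so the hand-written `soft_vote` above serves only to make the arithmetic explicit.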
License
Copyright 2025 The authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
