Prediction of defects in Smart Contracts applying Deep Learning with Solidity metrics

Rogério de J. Oliveira; Edson M. Lucas; Gustavo Barbosa Libotte

doi:10.5753/jbcs.2025.4495

Authors

Rogério de J. Oliveira Rio de Janeiro State University, IPRJ-UERJ https://orcid.org/0000-0002-4142-0138
Edson M. Lucas Rio de Janeiro State University, IPRJ-UERJ https://orcid.org/0000-0001-5918-624X
Gustavo Barbosa Libotte Rio de Janeiro State University, IPRJ-UERJ https://orcid.org/0000-0002-4583-6026

Keywords:

Smart Contracts, Software Defects Prediction, Code Metrics, Machine Learning, Blockchain

Abstract

Smart Contracts are autonomous, self-executing programs that carry out agreements without the need for intermediaries. Like any software, they are susceptible to defects, which can lead to vulnerabilities that attackers exploit. Software defect prediction is a well-studied research area, but applying prediction models to Smart Contract metrics remains underexplored. This study evaluates whether deep learning models used to predict defects in traditional software produce equivalent results when applied to Smart Contract-specific metrics. Machine learning models were applied to four data sets, and their performance was evaluated using Precision, Recall, F-score, Area Under the Curve (AUC), Precision-Recall Curve (PRC), and Matthews Correlation Coefficient (MCC). This approach complements traditional formal verification methods, which, although accurate, are often slower and less adaptable to emerging vulnerabilities. By employing deep learning, the model enables faster and more cost-effective analysis of large volumes of Smart Contracts. Unlike conventional techniques that rely on expert-defined rules and require substantial computational resources, this model offers scalable and continuous monitoring. Consequently, the research provides a complementary solution that can significantly enhance the security of the Smart Contract ecosystem, allowing potential defects to be detected before they are exploited.
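
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how Precision, Recall, F-score, MCC, and ROC AUC can be computed for a binary defect classifier. It is not the authors' evaluation code: the function names (threshold_metrics, roc_auc) and the example counts and scores are hypothetical, chosen only for illustration, and the F-score is assumed to be the balanced F1 variant. The Precision-Recall Curve is omitted for brevity.

```python
import math


def threshold_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and MCC from confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives. Undefined ratios fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random defective sample
    is scored above a random clean one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = defective contract, 0 = clean contract.
print(threshold_metrics(tp=40, fp=10, fn=20, tn=30))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC above is quadratic in the number of samples, which is acceptable for a small illustration; for large data sets a rank-based implementation from an established library would be the more practical choice.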
