Technical Debt in Pull Requests: Insights from Apache Projects
DOI:
https://doi.org/10.5753/jserd.2025.5722
Keywords:
Technical Debt, Pull Request, Code Review, Empirical Study, Software Evolution, Mining Software Repositories
Abstract
Technical Debt (TD) represents the effort required to address quality issues that affect a software system and progressively hinder code evolution over time. A pull request (PR) is a discrete unit of work that must meet specific quality standards to be integrated into the main codebase. PRs offer a valuable opportunity to assess how developers handle TD and how codebase quality evolves. In this work, we conducted two empirical analyses to understand how developers address TD within PRs and whether TD is effectively managed during PR reviews by both developers and reviewers. We examined 12 Java projects from Apache. The first study employed the SonarQube tool on 2,035 merged PRs to evaluate TD variation, identify the most frequently neglected and resolved types of TD issues, and analyze how TD evolves over time. The second study involved a qualitative analysis of review threads of 250 PRs, focusing on the types of PRs that frequently discuss TD, the characteristics of TD fix suggestions, and the reasons some suggestions are rejected. Our findings reveal that TD issues are prevalent in PRs, following a ratio of 1:2:1 (reduced: unchanged: increased). Among all TD issues, those related to code duplication and cognitive complexity are most frequently overlooked, while code duplication and obsolete code are the most commonly resolved. Regarding PR code review, we found that around 76% of review threads address TD, with code, design, and documentation being the most frequently discussed areas. Additionally, 96% of discussions include a fix suggestion, and over 80% of the discussed issues are resolved. These insights can help practitioners become more aware of TD management and may inspire the development of new tools to facilitate TD handling during PRs.
License
Copyright (c) 2025 Felipe E. de O. Calixto, Eliane C. Araújo, Everton L. G. Alves

This work is licensed under a Creative Commons Attribution 4.0 International License.

