Analysis of the Role of GitHub Discussions as a Tool for Onboarding Newcomers in Open Source Software

Authors

A. C. Maciel, M. Wessel, A. Serebrenik, I. Wiese, I. Steinmacher

DOI:

https://doi.org/10.5753/jserd.2025.6090

Keywords:

Open Source, Communication, Discussions, GitHub

Abstract

The introduction of the Discussions feature on GitHub provides a new, dedicated space for collaborative communication within open-source software (OSS) projects. This paper investigates the impact of discussions on community engagement, focusing on newcomer onboarding and on the activity surrounding issues and pull requests. Through an empirical analysis of 285 OSS projects, we observe a significant shift in participation patterns, with discussions emerging as a more attractive entry point for newcomers than traditional mechanisms such as issues and pull requests. Our findings suggest that while discussions lower the barrier to newcomer participation, they also lead to a decrease in traditional contributions such as new issues and pull requests. Additionally, we explore the engagement levels of different contributor roles, showing how discussions foster diverse interactions and community involvement. The results provide insight into the evolving nature of collaboration on GitHub and highlight the role of discussions in shaping the dynamics of OSS project ecosystems.

Published

2025-11-26

How to Cite

Maciel, A. C., Wessel, M., Serebrenik, A., Wiese, I., & Steinmacher, I. (2025). Analysis of the Role of GitHub Discussions as a Tool for Onboarding Newcomers in Open Source Software. Journal of Software Engineering Research and Development, 13(2), 13:239 – 13:254. https://doi.org/10.5753/jserd.2025.6090

Issue

Section

Research Article