ELLAS Architecture and Process: Collecting and Curating Data on Women’s Presence in STEM





Knowledge Graphs, Latin America, Female Leadership, STEM, Open Data


The underrepresentation of women in STEM fields needs to be made visible through data so that decision-makers and public policy creators can address the issue effectively. However, structured, organized, openly published data in this domain remains scarce. To address this problem, a Latin American research network called ELLAS was created. The project's goal is to develop a platform based on Semantic Web technologies to structure and consolidate data, initially from Brazil, Peru, and Bolivia. This paper presents the processes defined for collecting and curating both unstructured and structured data, sourced from scientific articles, social networks, and existing open data. We describe the architecture design in a way that clarifies the details of the processes and the actors involved for each data source. We present preliminary results from applying these processes and the strategies for future work, which include data extraction and curation and ontology and knowledge graph development. We also present some of the ongoing work, such as survey development and application, and identify what has not yet been done, such as platform development.








How to Cite

BERARDI, R. C. G.; AUCELI, P. H. S.; MACIEL, C.; FRITOLI, R.; DAVILA, G.; GUZMAN, I. R.; MENDES, L. ELLAS Architecture and Process: Collecting and Curating Data on Women’s Presence in STEM. Journal on Interactive Systems, Porto Alegre, RS, v. 15, n. 1, p. 530–540, 2024. DOI: 10.5753/jis.2024.3853. Available at: https://journals-sol.sbc.org.br/index.php/jis/article/view/3853. Accessed on: 24 Jun. 2024.



Regular Paper
