Ecore4PROV-DM: A Metamodel for Enhancing Data Provenance Adoption in Information Systems
DOI:
https://doi.org/10.5753/isys.2024.4691
Keywords:
Data Provenance, Model-Driven Engineering, W3C PROV, W3C PROV-DM, Metamodel Evaluation
Abstract
Effective management of data provenance is essential in Information Systems, particularly for data-intensive applications. Although the W3C PROV family of documents establishes a standard for representing provenance, integrating this information into software development processes remains a significant challenge. This paper addresses the problem by introducing the Ecore4PROV-DM metamodel, developed using Model-Driven Engineering techniques to align with the W3C PROV data model (PROV-DM). The metamodel's application is demonstrated through real-world scenarios, including the Urban Observatory project at Newcastle University. Evaluated against a subset of the Metamodel Quality Requirements and Evaluation (MQuaRE) framework covering three key quality requirements, Ecore4PROV-DM exhibits high accuracy and completeness, making it a robust tool for provenance modeling. By bridging the gap between the conceptual richness of W3C PROV-DM and practical implementation needs, Ecore4PROV-DM facilitates precise provenance representation and seamless integration into diverse Information Systems.
Copyright (c) 2024 iSys - Brazilian Journal of Information Systems
This work is licensed under a Creative Commons Attribution 4.0 International License.