A Data-centric Model Transformation Approach using Model2GraphFrame Transformations
DOI:
https://doi.org/10.5753/jserd.2021.477
Keywords:
Model Extractor, Data-centric approach, Spark GraphFrames, Model Transformations
Abstract
Data-centric (Dc) approaches are used for data processing in several application domains, such as distributed systems and natural language processing. Several data processing frameworks ease the task of parallel and distributed data processing. However, few research approaches study how to execute model manipulation operations, such as model transformations, on such frameworks. In addition, it is often necessary to extract XMI-based formats into possibly distributed models. In this paper, we present a Model2GraphFrame operation that extracts a model from a modeling technical space into the Apache Spark framework and its supported GraphFrame format. It generates a GraphFrame from the input models, which can then be used for partitioning and for processing model operations. We used two model partitioning strategies: one based on subgraphs and one based on clustering. The approach allows model analysis to be performed by applying operations on the generated graphs, as well as Model Transformations (MT). The proof-of-concept results for Model2GraphFrame, GraphFrame partitioning, GraphFrame connectivity, and GraphFrame model transformations indicate that our Model Extraction can be used in various application domains, since it enables the specification of analytical expressions on graphs. Furthermore, its model graph elements are used in model transformations on a scalable platform.
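To make the extraction step concrete, the following is a minimal sketch of how an XMI document could be turned into the vertex and edge rows that a GraphFrame expects (an `id` column for vertices, `src` and `dst` columns for edges). It is not the authors' implementation. The sample model, the function names `model_to_rows` and `connected_components`, the `tag` and `relationship` columns, and the heuristic that treats any attribute value matching a known `xmi:id` as a cross-reference are all assumptions made for illustration. The sketch uses only the Python standard library; the pure-Python connected-components pass stands in for the subgraph-based partitioning mentioned above.

```python
import xml.etree.ElementTree as ET

XMI_ID = "{http://www.omg.org/XMI}id"  # ElementTree's expanded name for xmi:id

# Hypothetical sample model, used only for illustration.
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmi:version="2.0">
  <Family xmi:id="f1" name="March">
    <members xmi:id="m1" name="Jim" friend="m2"/>
    <members xmi:id="m2" name="Cindy"/>
  </Family>
  <Family xmi:id="f2" name="Sailor">
    <members xmi:id="m3" name="Peter"/>
  </Family>
</xmi:XMI>"""


def model_to_rows(xmi_text):
    """Return (vertices, edges) as lists of dicts with GraphFrame-style columns."""
    vertices, edges = [], []

    def visit(elem, parent_id):
        elem_id = elem.get(XMI_ID)
        if elem_id is not None:
            # Keep only non-namespaced attributes as model properties.
            attrs = {k: v for k, v in elem.attrib.items() if not k.startswith("{")}
            vertices.append({"id": elem_id, "tag": elem.tag, "attrs": attrs})
            if parent_id is not None:
                # Containment edge, labelled with the containing element name.
                edges.append({"src": parent_id, "dst": elem_id,
                              "relationship": elem.tag})
            parent_id = elem_id
        for child in elem:
            visit(child, parent_id)

    visit(ET.fromstring(xmi_text), None)

    # Assumed heuristic: an attribute token equal to a known id is a reference.
    known_ids = {v["id"] for v in vertices}
    for v in vertices:
        for name, value in v["attrs"].items():
            for token in value.split():
                if token in known_ids:
                    edges.append({"src": v["id"], "dst": token,
                                  "relationship": name})
    return vertices, edges


def connected_components(vertices, edges):
    """Map each vertex id to the smallest id in its (undirected) component."""
    parent = {v["id"]: v["id"] for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for e in edges:
        a, b = find(e["src"]), find(e["dst"])
        if a != b:
            parent[max(a, b)] = min(a, b)
    return {vid: find(vid) for vid in parent}


vertices, edges = model_to_rows(SAMPLE_XMI)
components = connected_components(vertices, edges)
```

On a Spark cluster with the GraphFrames package installed, the same rows could be loaded with `spark.createDataFrame(...)` and wrapped as `GraphFrame(vertices_df, edges_df)`, after which the library's own `connectedComponents()` would replace the pure-Python pass. The nested `attrs` dictionary would likely need flattening into columns first; that step is left out here.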
Copyright (c) 2021 Marcos Didonet Del Fabro, Luiz Carlos Camargo
This work is licensed under a Creative Commons Attribution 4.0 International License.