Hurricane: a Dataflow-oriented Data Service for Smart Cities Applications

Authors

DOI:

https://doi.org/10.5753/jidm.2023.3189

Keywords:

Smart Cities, Data Management, Data Integration

Abstract

The concept of Smart Cities has gained relevance, especially in the last decade, due to the availability of data associated with cities, e.g., car traffic, public transportation, and crime data. The purpose of using these data is to improve the services offered to citizens. Most of these applications manipulate spatiotemporal data, which are processed in a dataflow that starts with collection, integration, and aggregation and ends with visualization. Specialized data services for smart city applications are therefore most welcome. However, many of the existing data services in this context are either specific to a particular application or domain, or do not consider the entire data life cycle. In this article, we present Hurricane, a dataflow-oriented data service for smart city applications. Hurricane executes multiple dataflows to gather, pre-process, integrate, and publish data. Hurricane was evaluated with an application in the area of public security, and the results reinforced the importance of this type of data service.

Published

2023-10-31

How to Cite

Banni, M., Falci, M. L., Rosseti, I., & de Oliveira, D. (2023). Hurricane: a Dataflow-oriented Data Service for Smart Cities Applications. Journal of Information and Data Management, 14(1). https://doi.org/10.5753/jidm.2023.3189

Section

SBBD 2022 Full papers - Extended Papers