An Autonomous Hybrid Data Partitioning Approach for NewSQL Databases

DOI:

https://doi.org/10.5753/jbcs.2026.5684

Keywords:

NewSQL, Heat Graph, Data Partition, Hybrid Data Partitioning

Abstract

Several applications, such as online games and financial markets, require specific data management features, including support for large data volumes, data streaming, and the processing of thousands of OLTP transactions per second. In general, traditional relational databases are not suitable for these requirements. NewSQL is a new generation of databases that combines high scalability and availability with ACID support, making it a promising solution for these kinds of applications. Although data partitioning is an essential feature for tuning relational databases, it is still an open issue for NewSQL databases. This paper proposes an automated approach for hybrid data partitioning that minimizes the number of distributed transactions and keeps the system well-balanced. To demonstrate its efficacy, we compare our solution with an optimal partitioning solution generated by a solver and with a state-of-the-art baseline. The experiments show that the quality of the partitioning scheme is similar to that of the optimal solution and that it outperforms the state-of-the-art approach in the number of distributed transactions.
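The two goals named in the abstract, few distributed transactions and a well-balanced system, can be made concrete with a small sketch. The Python below is illustrative only and is not the paper's method: the function names, the toy workload, and the imbalance measure (busiest partition's access count divided by the mean) are all assumptions chosen for the example.

```python
from collections import Counter


def count_distributed(transactions, assignment):
    """Count transactions whose accessed items span more than one partition."""
    return sum(
        1
        for txn in transactions
        if len({assignment[item] for item in txn}) > 1
    )


def load_imbalance(transactions, assignment, num_partitions):
    """Busiest partition's access count divided by the mean access count.

    A value of 1.0 means perfectly even load (hypothetical metric).
    """
    load = Counter()
    for txn in transactions:
        for item in txn:
            load[assignment[item]] += 1
    total = sum(load.values())
    mean = total / num_partitions
    return max(load.values()) / mean if total else 1.0


# Toy workload (assumed): each transaction is the set of data items it touches.
workload = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"b", "c"}]
# Candidate partitioning scheme (assumed): data item -> partition id.
scheme = {"a": 0, "b": 0, "c": 1, "d": 1}

print(count_distributed(workload, scheme))  # 1: only {"b", "c"} spans partitions
print(load_imbalance(workload, scheme, 2))  # 1.25: partition 0 serves 5 of 8 accesses
```

A partitioning approach of the kind described would search for an assignment that keeps the first number low without letting the second drift far above 1.0.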

Published

2026-02-02

How to Cite

Schreiner, G. A., de Santiago, R., Duarte, D., & Mello, R. dos S. (2026). An Autonomous Hybrid Data Partitioning Approach for NewSQL Databases. Journal of the Brazilian Computer Society, 32(1), 55–72. https://doi.org/10.5753/jbcs.2026.5684

Section

Regular Issue