Performance Evaluation for Scientific Applications using the Roofline Model

Authors

  • Vitor de Sá Laboratório Nacional de Computação Científica (LNCC)
  • Vinícius P. Klôh Laboratório Nacional de Computação Científica (LNCC)
  • Bruno Schulze Laboratório Nacional de Computação Científica (LNCC)
  • Mariza Ferro Laboratório Nacional de Computação Científica (LNCC)

DOI:

https://doi.org/10.5753/reic.2023.2429

Keywords:

Performance Evaluation, HPC, Roofline, Machine Learning, Scientific Applications

Abstract

Understanding the factors that limit application performance, together with an application's computational requirements, can guide software optimization and the acquisition, development, and selection of hardware that best meets those requirements. This work proposes a methodology for evaluating and characterizing the performance of scientific applications using the Roofline model. A series of experiments with a set of applications was designed and analyzed following the proposed methodology. The results make it possible to identify the aspects that limit application performance and to suggest software optimizations and hardware that would improve it.
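
As a reading aid, the basic Roofline bound (Williams et al., 2009) can be sketched in a few lines of Python. This is a minimal illustration of the general model, not the methodology proposed in the paper: the function names and the machine figures (100 GFLOP/s peak, 20 GB/s memory bandwidth) are hypothetical values chosen only for the example.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: the lower of the compute roof and the memory roof.

    intensity is the arithmetic intensity in FLOP per byte moved.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the two roofs meet (FLOP/byte)."""
    return peak_gflops / bandwidth_gbs


def classify(peak_gflops, bandwidth_gbs, intensity):
    """Label a kernel by which roof limits it under the basic model."""
    if intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


# Hypothetical machine, for illustration only.
PEAK_GFLOPS = 100.0
BANDWIDTH_GBS = 20.0

if __name__ == "__main__":
    for ai in (0.5, 5.0, 50.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
        label = classify(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
        print(f"AI={ai} FLOP/byte -> {bound} GFLOP/s ({label})")
```

Under these assumed figures, a kernel whose arithmetic intensity falls below the ridge point is limited by memory bandwidth, which points toward data-movement optimizations or hardware with higher bandwidth; above the ridge point, the compute peak is the limit.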

Published

2023-05-29

How to Cite

de Sá, V., Klôh, V. P., Schulze, B., & Ferro, M. (2023). Performance Evaluation for Scientific Applications using the Roofline Model. Electronic Journal of Undergraduate Research on Computing, 21(1), 44–53. https://doi.org/10.5753/reic.2023.2429

Section

Full Papers