MLISP: Machine-Learning-based ISP Decision Scheme for VVC Encoders
DOI:
https://doi.org/10.5753/jbcs.2025.5464
Keywords:
VVC, Intra Prediction, ISP, Machine Learning
Abstract
The Versatile Video Coding (VVC) standard achieves high compression rates by introducing new encoding tools, such as Intra Subpartition Prediction (ISP). However, ISP increases the computational effort required for the mode decision in the intra-prediction step. In this paper, we propose MLISP, a machine-learning-based ISP decision scheme for VVC encoders that combines two solutions to accelerate the intra-mode decision process for the ISP tool. The first solution, named ISP Skip Decision, uses a Decision Tree trained with image features to predict whether the ISP tool needs to be evaluated, resulting in an average time saving of 8.53% with only 0.22% of coding efficiency loss. The second solution, named ISP Mode Decision, uses a Decision Tree trained with encoding features to predict which class of intra modes, Planar/DC or Angular, should be evaluated with the ISP tool, obtaining an average time saving of 7.01% with only 0.19% of coding efficiency loss. Combining both solutions, MLISP achieves an average time saving of 10.97% with only 0.32% of coding efficiency loss, reducing encoding time with minimal impact on compression performance. Compared with related work, MLISP achieves competitive results and introduces a novel approach for optimizing the ISP decision.
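The way the two decisions could chain inside an encoder can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature names (`variance`, `best_regular_mode`), the threshold value, and the hand-written rules standing in for the two trained Decision Trees are all hypothetical, since the abstract does not list the actual features or tree structure. The only codec fact relied on is the VVC intra mode numbering (0 = Planar, 1 = DC, 2 to 66 = Angular).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ModeClass(Enum):
    PLANAR_DC = "planar_dc"
    ANGULAR = "angular"


@dataclass
class Block:
    # Hypothetical features, chosen only for illustration.
    variance: float         # stands in for an "image feature"
    best_regular_mode: int  # stands in for an "encoding feature"


def isp_skip_decision(block: Block) -> bool:
    """Stand-in for the first Decision Tree: True means ISP is not evaluated."""
    return block.variance < 50.0  # hypothetical threshold, not from the paper


def isp_mode_decision(block: Block) -> ModeClass:
    """Stand-in for the second Decision Tree: picks the mode class to test with ISP."""
    if block.best_regular_mode <= 1:  # 0 = Planar, 1 = DC
        return ModeClass.PLANAR_DC
    return ModeClass.ANGULAR


def isp_candidates(block: Block) -> List[int]:
    """Return the intra modes that would still be evaluated with ISP."""
    if isp_skip_decision(block):
        return []                     # ISP evaluation skipped entirely
    if isp_mode_decision(block) is ModeClass.PLANAR_DC:
        return [0, 1]                 # Planar and DC only
    return list(range(2, 67))         # the 65 angular modes


if __name__ == "__main__":
    for blk in (Block(10.0, 0), Block(100.0, 1), Block(100.0, 34)):
        print(blk, "->", len(isp_candidates(blk)), "ISP candidate modes")
```

In this sketch the time saving comes from shrinking the candidate list before the costly rate-distortion evaluation; a real encoder would replace the two stand-in functions with the trained classifiers.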
License
Copyright (c) 2025 Larissa Araújo, Adson Duarte, Bruno Zatt, Guilherme Correa, Daniel Palomino

This work is licensed under a Creative Commons Attribution 4.0 International License.

