Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition

Gabriel Eduardo Lima; Valfride Nascimento; Eduardo Santos; Eduil Nascimento Jr.; Rayson Laroca; David Menotti

doi:10.5753/jbcs.2026.5899

Authors

Gabriel Eduardo Lima (Federal University of Paraná), https://orcid.org/0009-0009-7599-8550
Valfride Nascimento (Federal University of Paraná), https://orcid.org/0000-0002-7416-613X
Eduardo Santos (Paraná Military Police; Federal University of Paraná), https://orcid.org/0009-0000-9512-6498
Eduil Nascimento Jr. (Paraná Military Police), https://orcid.org/0000-0003-1632-4942
Rayson Laroca (Pontifical Catholic University of Paraná; Federal University of Paraná), https://orcid.org/0000-0003-1943-2711
David Menotti (Federal University of Paraná), https://orcid.org/0000-0003-2430-2030

Keywords:

Intelligent Transportation Systems, Fine-Grained Vehicle Classification, Automatic License Plate Recognition, Surveillance

Abstract

Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal investigations. While Automatic License Plate Recognition (ALPR) is widely used, Fine-Grained Vehicle Classification (FGVC) offers a complementary approach by identifying vehicles based on attributes such as color, make, model, and type. Despite advances in this field, existing studies often assume well-controlled conditions, explore a limited set of attributes, and overlook the integration of FGVC with ALPR. To address these gaps, we introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the surveillance system of the Military Police of Paraná (Brazil), the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, and license plate text and corner annotations are also provided. A qualitative and quantitative comparison with established datasets confirmed the challenging nature of our dataset. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, handling infrared images, and distinguishing between vehicle models that share a common platform. Additionally, we apply two optical character recognition models to license plate recognition and explore the joint use of FGVC and ALPR. The results highlight the potential of integrating these complementary tasks for real-world applications.
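
The abstract does not say how FGVC and ALPR are combined, so the following Python sketch is only one hypothetical illustration of why the two tasks complement each other: the plate text read by ALPR is used to look up a record, and the attributes predicted by FGVC are compared against it. The `VehicleAttributes` class, the `mismatched_attributes` function, the registry, and all values are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleAttributes:
    """The four attribute groups named in the abstract (hypothetical container)."""
    color: str
    make: str
    model: str
    vehicle_type: str


def mismatched_attributes(predicted: VehicleAttributes,
                          registered: VehicleAttributes) -> list[str]:
    """Return the attribute names where the FGVC prediction differs from the record."""
    fields = ("color", "make", "model", "vehicle_type")
    return [name for name in fields
            if getattr(predicted, name) != getattr(registered, name)]


# Hypothetical registry keyed by license plate text (placeholder values only).
registry = {
    "AAA0A00": VehicleAttributes("white", "MakeA", "ModelX", "car"),
}

plate_text = "AAA0A00"  # assumed output of an OCR model
predicted = VehicleAttributes("white", "MakeA", "ModelY", "car")  # assumed FGVC output

record = registry.get(plate_text)
if record is None:
    print("plate not found; only the FGVC attributes are available")
else:
    print(mismatched_attributes(predicted, record))
```

In this sketch, a non-empty result could indicate either a misread plate or a misclassified attribute, which is the kind of cross-check that neither task can provide on its own.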
