Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition
DOI:
https://doi.org/10.5753/jbcs.2026.5899
Keywords:
Intelligent Transportation Systems, Fine-Grained Vehicle Classification, Automatic License Plate Recognition, Surveillance
Abstract
Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal investigations. While Automatic License Plate Recognition (ALPR) is widely used, Fine-Grained Vehicle Classification (FGVC) offers a complementary approach by identifying vehicles based on attributes such as color, make, model, and type. Although there have been advances in this field, existing studies often assume well-controlled conditions, explore limited attributes, and overlook FGVC integration with ALPR. To address these gaps, we introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná (Brazil) surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A qualitative and quantitative comparison with established datasets confirmed the challenging nature of our dataset. A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform. Additionally, we apply two optical character recognition models to license plate recognition and explore the joint use of FGVC and ALPR. The results highlight the potential of integrating these complementary tasks for real-world applications.
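As a rough illustration of what joint use of FGVC and ALPR could look like, the sketch below reranks license plate candidates by how well the attributes registered for each plate agree with the attributes predicted by an FGVC model. This is not the method evaluated in the article. The `VehicleRecord` type, the `registry` lookup, the neutral weight of 0.5 for unknown plates, and the plate strings are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleRecord:
    """Attributes tied to a plate (hypothetical registry entry)."""
    make: str
    model: str
    color: str


def consistency_score(predicted: VehicleRecord, registered: VehicleRecord) -> float:
    """Return the fraction of FGVC attributes that match the registry entry."""
    fields = ("make", "model", "color")
    matches = sum(getattr(predicted, f) == getattr(registered, f) for f in fields)
    return matches / len(fields)


def rerank_plates(candidates, predicted, registry):
    """Rerank ALPR candidates by OCR confidence times attribute agreement.

    candidates: list of (plate_text, ocr_confidence) pairs.
    registry: dict mapping plate text to VehicleRecord (illustrative only).
    Plates missing from the registry get an assumed neutral weight of 0.5.
    """
    scored = []
    for plate, confidence in candidates:
        record = registry.get(plate)
        weight = 0.5 if record is None else consistency_score(predicted, record)
        scored.append((plate, confidence * weight))
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up example: the OCR model confuses "D" with "O" in the fifth character.
registry = {
    "ABC1D23": VehicleRecord("fiat", "uno", "white"),
    "ABC1O23": VehicleRecord("ford", "ka", "black"),
}
predicted = VehicleRecord("fiat", "uno", "white")
candidates = [("ABC1O23", 0.55), ("ABC1D23", 0.45)]
ranking = rerank_plates(candidates, predicted, registry)
```

In this made-up example the OCR model slightly prefers the wrong plate, but the predicted make, model, and color agree only with the entry for `ABC1D23`, so that candidate moves to the top of `ranking`. A real system would need calibrated confidences and a policy for attributes the classifier is unsure about.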
License
Copyright (c) 2026 Gabriel Eduardo Lima, Valfride Nascimento, Eduardo Santos, Eduil Nascimento Jr., Rayson Laroca, David Menotti

This work is licensed under a Creative Commons Attribution 4.0 International License.

