Improved generalization of cyclist detection on security cameras with the OpenImages Cyclists dataset
DOI:
https://doi.org/10.5753/jidm.2023.3179
Keywords:
Deep Learning, Object Detection, Online Object Detection, Real Time Monitoring, Cyclist Detection
Abstract
Most large public datasets that contain cyclists and are used to train Deep Learning detectors have annotations for bicycles and people, but not for cyclists. Even when cyclist annotations exist, the quality and quantity of the images are limited. To overcome these limitations, we propose the new OpenImages Cyclists dataset, built by pre-selecting images from the OpenImages set and applying a new algorithm for semiautomatic generation of cyclist annotations aided by people and bicycle detectors. A cyclist detector trained on this dataset achieved identification rates of up to 78% and 89% on two different sets of images obtained from security cameras at USP, Campus São Paulo - Capital.
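
The abstract does not spell out the annotation algorithm, so the following is only a minimal sketch of one plausible way to derive cyclist candidates from person and bicycle detections. It assumes boxes in (x_min, y_min, x_max, y_max) format, pairs each person with at most one bicycle by greedy intersection-over-union matching, and proposes the union of the two boxes as a cyclist box. The function names and the `min_iou` threshold are hypothetical and not taken from the paper; in a semiautomatic workflow the proposals would still be reviewed by a human annotator.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propose_cyclists(people: List[Box], bicycles: List[Box],
                     min_iou: float = 0.2) -> List[Box]:
    """Greedily pair people with bicycles and return union boxes.

    min_iou is a hypothetical threshold chosen for illustration only.
    """
    # Score every person/bicycle pair, best overlap first.
    pairs = sorted(
        ((iou(p, b), i, j)
         for i, p in enumerate(people)
         for j, b in enumerate(bicycles)),
        reverse=True,
    )
    used_people, used_bikes = set(), set()
    proposals: List[Box] = []
    for score, i, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little
        if i in used_people or j in used_bikes:
            continue  # each detection joins at most one cyclist
        used_people.add(i)
        used_bikes.add(j)
        p, b = people[i], bicycles[j]
        # The candidate cyclist box encloses both the rider and the bicycle.
        proposals.append((min(p[0], b[0]), min(p[1], b[1]),
                          max(p[2], b[2]), max(p[3], b[3])))
    return proposals
```

For example, calling `propose_cyclists` with one person standing over a bicycle and a second person far away yields a single candidate box enclosing the first person and the bicycle; the distant pedestrian is left unpaired.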