Graph Neural Networks for Semi-Supervised Image Classification with Multi-Feature Aggregation

Marina Chagas Bulach Gapski; Vinicius Atsushi Sato Kawai; Gustavo Rosseto Leticio; Lucas Pascotti Valem; Daniel Carlos Guimarães Pedronette; Mohand Said Allili

doi:10.5753/jbcs.2026.5880

Authors

Marina Chagas Bulach Gapski, São Paulo State University (UNESP), Rio Claro, Brazil, https://orcid.org/0009-0003-2732-3862
Vinicius Atsushi Sato Kawai, São Paulo State University (UNESP), Rio Claro, Brazil, https://orcid.org/0000-0003-0153-7910
Gustavo Rosseto Leticio, São Paulo State University (UNESP), Rio Claro, Brazil, https://orcid.org/0009-0008-3715-8991
Lucas Pascotti Valem, University of São Paulo (USP), São Carlos, Brazil, https://orcid.org/0000-0002-3833-9072
Daniel Carlos Guimarães Pedronette, São Paulo State University (UNESP), Rio Claro, Brazil, https://orcid.org/0000-0002-2867-4838
Mohand Said Allili, Université du Québec en Outaouais, https://orcid.org/0000-0001-8736-6600

Keywords:

Semi-supervised image classification, Graph Neural Networks, Feature Fusion, Rank Aggregation

Abstract

Feature extraction identifies salient characteristics or patterns in an image, including edges, textures, shapes, and color attributes. Contemporary feature extractors predominantly rely on deep learning architectures, such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), and the many extractors available in the literature provide a wide range of feature representations. The features extracted from an image depend on the specific application, the chosen extractor, and its configuration, so combining distinct extractors to integrate complementary information is a promising way to enhance performance. Graph Neural Networks (GNNs), particularly Graph Convolutional Networks (GCNs), have emerged as powerful and widely adopted approaches for semi-supervised image classification, as they leverage both labeled and unlabeled data while exploiting the graph structures that capture relationships among samples. This study proposes a novel approach for GNNs in scenarios where labeled data is scarce, which integrates diverse feature and graph representations derived from various extractors. Experiments covered combinations of distinct feature and graph extractors, as well as rank aggregation strategies. The results show that strategically combining feature and graph representations, coupled with manifold learning for graph processing, leads to significant improvements in classification accuracy in most experimental conditions. Furthermore, using rank aggregation techniques to integrate features from different extractors was shown to enhance classification accuracy.
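
As a rough illustration of the general idea of combining extractors, the following Python sketch fuses two sets of features in two ways: it concatenates the feature vectors, and it merges the per-sample neighbor rankings into a single kNN graph of the kind a GCN could take as input. This is not the authors' implementation. The function names, the toy data, the Borda-style sum of rank positions, and the use of plain concatenation are all assumptions made for the example; the abstract does not state which aggregation or fusion rules the study uses.

```python
import math

def ranked_lists(features):
    """For each sample, rank all samples by ascending Euclidean distance."""
    n = len(features)
    return [
        sorted(range(n), key=lambda j: (math.dist(features[i], features[j]), j))
        for i in range(n)
    ]

def borda_fuse(rank_sets, k):
    """Assumed aggregation rule: sum each candidate's rank position over all
    extractors (Borda-style) and keep the k best candidates per sample."""
    n = len(rank_sets[0])
    fused = []
    for i in range(n):
        score = [0] * n
        for ranks in rank_sets:
            for pos, j in enumerate(ranks[i]):
                score[j] += pos
        order = sorted((j for j in range(n) if j != i), key=lambda j: (score[j], j))
        fused.append(order[:k])
    return fused

def knn_edges(neighbors):
    """Undirected edge set of the kNN graph defined by the neighbor lists."""
    return {(min(i, j), max(i, j)) for i, nbrs in enumerate(neighbors) for j in nbrs}

def concat_features(feature_sets):
    """Assumed feature fusion: concatenate the vectors of each sample."""
    n = len(feature_sets[0])
    return [[x for fs in feature_sets for x in fs[i]] for i in range(n)]

# Toy features for four images from two hypothetical extractors.
feats_a = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
feats_b = [[1.0], [1.2], [9.0], [9.1]]

fused_neighbors = borda_fuse([ranked_lists(feats_a), ranked_lists(feats_b)], k=1)
edges = knn_edges(fused_neighbors)          # graph input for a GNN
node_features = concat_features([feats_a, feats_b])  # node features for a GNN
```

In this toy setting both extractors agree that images 0 and 1 are close and that images 2 and 3 are close, so the fused graph links exactly those pairs. With real extractors the rankings would disagree in places, and the aggregation step decides which neighbors survive.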
