Integrating Domain Knowledge in Multi-Source Classification Tasks

Alexandre Thurow Bender; Emillyn Mellyne Gobetti Souza; Ihan Belmonte Bender; Ulisses Brisolara Corrêa; Ricardo Matsumura Araujo

doi:10.5753/jis.2024.4096

Authors

Alexandre Thurow Bender Federal University of Pelotas https://orcid.org/0000-0001-8370-8028
Emillyn Mellyne Gobetti Souza Federal University of Pelotas https://orcid.org/0009-0007-3150-9175
Ihan Belmonte Bender Federal University of Pelotas https://orcid.org/0009-0004-8846-7373
Ulisses Brisolara Corrêa Federal University of Pelotas https://orcid.org/0000-0001-6695-3451
Ricardo Matsumura Araujo Federal University of Pelotas https://orcid.org/0000-0003-0514-8883

Keywords:

Multi-Domain Learning, Batch Regularization, Classification Task, Image, Audio

Abstract

This work presents an extended investigation of multi-domain learning techniques for image and audio classification, with a focus on audio. In machine learning, a collection of data obtained or generated under similar conditions is referred to as a domain or data source. The distinct acquisition or generation conditions of these sources are often overlooked, even though they can significantly affect model generalization. Multi-domain learning addresses this challenge by seeking effective ways to train models that perform adequately across all domains seen during training. Our study explores model-agnostic multi-domain learning techniques that use explicit domain information alongside class labels. Specifically, we examine three methods: Stew, a general approach that mixes all available data indiscriminately, and two batch domain-regularization methods, Balanced Domains and Loss Sum. We evaluate these methods in several experiments on datasets with multiple data sources for audio and image classification tasks. Our findings underscore the importance of considering domain-specific information during training. The Loss Sum method yields a notable improvement in model performance (0.79 F1-score) over the conventional approach of blending data from all available domains (0.62 F1-score). By examining the impact of different multi-domain learning techniques on classification tasks, this study contributes to a deeper understanding of effective strategies for leveraging domain knowledge when training machine learning models.
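The abstract names the three methods without giving their mechanics. The sketch below is one plausible reading, not the authors' implementation: it assumes Balanced Domains draws an equal number of samples from each domain into every batch, and that Loss Sum averages the loss within each domain and then adds those averages. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stew_batch(samples, batch_size, rng):
    """Stew: draw a batch from the pooled data, ignoring domain labels."""
    return rng.sample(samples, batch_size)


def balanced_domains_batch(samples, batch_size, rng):
    """Assumed reading of Balanced Domains: equal share per domain per batch."""
    by_domain = defaultdict(list)
    for sample in samples:
        by_domain[sample["domain"]].append(sample)
    per_domain = batch_size // len(by_domain)
    batch = []
    for domain in sorted(by_domain):
        # Sample with replacement so a small domain can still fill its share.
        batch.extend(rng.choices(by_domain[domain], k=per_domain))
    return batch


def mean_loss(losses):
    """Conventional reduction: average over every sample in the batch."""
    return sum(losses) / len(losses)


def loss_sum(losses, domains):
    """Assumed reading of Loss Sum: average within each domain, then add."""
    grouped = defaultdict(list)
    for loss, domain in zip(losses, domains):
        grouped[domain].append(loss)
    return sum(sum(values) / len(values) for values in grouped.values())


# Toy data: domain "a" is nine times larger than domain "b".
rng = random.Random(0)
data = [{"domain": "a", "x": i} for i in range(90)]
data += [{"domain": "b", "x": i} for i in range(10)]

balanced = balanced_domains_batch(data, batch_size=8, rng=rng)
balanced_counts = Counter(sample["domain"] for sample in balanced)

# Per-sample losses for a batch with three "a" samples and one "b" sample.
losses = [1.0, 1.0, 1.0, 4.0]
domains = ["a", "a", "a", "b"]
pooled = mean_loss(losses)              # the single "b" sample is diluted
per_domain = loss_sum(losses, domains)  # each domain contributes equally
```

In this toy batch the pooled mean is 1.75, while the per-domain sum is 5.0 (1.0 from domain "a" plus 4.0 from domain "b"), so under this reading the under-represented domain carries as much weight in the training signal as the dominant one.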

