Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy

Authors

Aleixo, E. L., Colonna, J. G., Cristo, M., & Fernandes, E.

DOI:

https://doi.org/10.5753/jbcs.2024.3966

Keywords:

Catastrophic Forgetting, Continuous Learning, Neural Networks, Artificial Intelligence

Abstract

Deep Learning models have achieved remarkable performance in tasks such as image classification and generation, often surpassing human accuracy. However, they struggle to learn new tasks and update their knowledge without access to previous data, suffering a severe loss of accuracy on previously learned tasks known as Catastrophic Forgetting (CF). This phenomenon was first observed by McCloskey and Cohen in 1989 and remains an active research topic. Incremental learning without forgetting is widely recognized as a crucial aspect of building better AI systems, since it allows models to adapt to new tasks without losing the ability to perform previously learned ones. This article surveys recent studies that tackle CF in modern Deep Learning models trained by gradient descent. Although several solutions have been proposed, no definitive remedy or consensus on how to assess CF has yet been established. The article provides a comprehensive review of recent solutions, proposes a taxonomy to organize them, and identifies research gaps in the area.
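
The phenomenon is easy to reproduce at toy scale. The sketch below is a minimal, hypothetical illustration (synthetic Gaussian data and a small two-layer numpy MLP chosen here for brevity, not the authors' experimental setup): a network trained by gradient descent on a first task, then on a second task without access to the first task's data, typically collapses to near-chance accuracy on the first task.

```python
# Minimal sketch of catastrophic forgetting under assumed synthetic data
# and a toy numpy MLP (not the survey's benchmarks): sequential training
# with plain gradient descent erases most first-task accuracy.
import numpy as np

rng = np.random.default_rng(0)

def make_task(center, n=500):
    # Binary task: two Gaussian blobs mirrored through the origin.
    X = np.vstack([rng.normal(center, 0.5, (n, 2)),
                   rng.normal(-center, 0.5, (n, 2))])
    y = np.hstack([np.zeros(n), np.ones(n)])
    return X, y

def forward(params, X):
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    return h, p

def train(params, X, y, epochs=500, lr=0.5):
    # Full-batch gradient descent on binary cross-entropy.
    W1, b1, W2, b2 = params
    for _ in range(epochs):
        h, p = forward((W1, b1, W2, b2), X)
        dz2 = (p - y[:, None]) / len(y)       # dLoss/d(output pre-activation)
        dh = (dz2 @ W2.T) * (1 - h ** 2)      # backprop through tanh
        W2 -= lr * (h.T @ dz2); b2 -= lr * dz2.sum(0)
        W1 -= lr * (X.T @ dh);  b1 -= lr * dh.sum(0)
    return W1, b1, W2, b2

def accuracy(params, X, y):
    return ((forward(params, X)[1][:, 0] > 0.5) == y).mean()

# Two tasks whose decision boundaries conflict.
XA, yA = make_task(np.array([2.0, 2.0]))
XB, yB = make_task(np.array([2.0, -2.0]))

hidden = 16
params = (rng.normal(0, 0.1, (2, hidden)), np.zeros(hidden),
          rng.normal(0, 0.1, (hidden, 1)), np.zeros(1))

params = train(params, XA, yA)
print(f"Task A accuracy after training on A: {accuracy(params, XA, yA):.2f}")

params = train(params, XB, yB)  # continue on B without access to task A data
print(f"Task B accuracy after training on B: {accuracy(params, XB, yB):.2f}")
print(f"Task A accuracy after training on B: {accuracy(params, XA, yA):.2f}")
```

Rehearsing even a small buffer of first-task examples during the second phase typically restores most of the lost accuracy, which is the intuition behind the rehearsal-based family of solutions organized in the taxonomy.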

References

Adel, T., Zhao, H., and Turner, R. E. (2020). Continual learning with adaptive weights (CLAW). In International Conference on Learning Representations. DOI: 10.48550/arXiv.1911.0951.

Ahn, H., Kwak, J., Lim, S., Bang, H., Kim, H., and Moon, T. (2021). SS-IL: Separated softmax for incremental learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 844-853. Available online [link].

Aljundi, R., Babiloni, F., Elhoseiny, M., Rohrbach, M., and Tuytelaars, T. (2018). Memory aware synapses: Learning what (not) to forget. In Ferrari, V., Hebert, M., Sminchisescu, C., and Weiss, Y., editors, Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part III, volume 11207 of Lecture Notes in Computer Science, pages 144-161. Springer. Available online [link].

Aljundi, R., Chakravarty, P., and Tuytelaars, T. (2017). Expert gate: Lifelong learning with a network of experts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3366-3375. Available online [link].

Ashfahani, A. and Pratama, M. (2019). Autonomous deep learning: Continual learning approach for dynamic environments. In Berger-Wolf, T. Y. and Chawla, N. V., editors, Proceedings of the 2019 SIAM International Conference on Data Mining, SDM 2019, Calgary, Alberta, Canada, May 2-4, 2019, pages 666-674. SIAM. DOI: 10.1137/1.9781611975673.75.

Ayad, O. (2014). Learning under concept drift with support vector machines. In Wermter, S., Weber, C., Duch, W., Honkela, T., Koprinkova-Hristova, P. D., Magg, S., Palm, G., and Villa, A. E. P., editors, Artificial Neural Networks and Machine Learning - ICANN 2014 - 24th International Conference on Artificial Neural Networks, Hamburg, Germany, September 15-19, 2014. Proceedings, volume 8681 of Lecture Notes in Computer Science, pages 587-594. Springer. DOI: 10.1007/978-3-319-11179-7_74.

Banayeeanzade, M., Mirzaiezadeh, R., Hasani, H., and Soleymani, M. (2021). Generative vs. discriminative: Rethinking the meta-continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Belouadah, E. and Popescu, A. (2019). IL2M: Class incremental learning with dual memory. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 583-592. Available online [link].

Belouadah, E. and Popescu, A. (2020). ScaIL: Classifier weights scaling for class incremental learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1266-1275. Available online [link].

Belouadah, E., Popescu, A., and Kanellos, I. (2020). Initial classifier weights replay for memoryless class incremental learning. In 31st British Machine Vision Conference 2020, BMVC 2020, Virtual Event, UK, September 7-10, 2020. BMVA Press. DOI: 10.48550/arXiv.2008.13710.

Belouadah, E., Popescu, A., and Kanellos, I. (2021). A comprehensive study of class incremental learning algorithms for visual tasks. Neural Networks, 135:38-54. DOI: 10.1016/j.neunet.2020.12.003.

Benna, M. K. and Fusi, S. (2016). Computational principles of synaptic memory consolidation. Nature Neuroscience, 19(12):1697-1706. DOI: 10.1038/nn.4401.

Biesialska, M., Biesialska, K., and Costa-jussà, M. R. (2020). Continual lifelong learning in natural language processing: A survey. In Proceedings of the 28th International Conference on Computational Linguistics, pages 6523-6541, Barcelona, Spain (Online). International Committee on Computational Linguistics. DOI: 10.18653/v1/2020.coling-main.574.

Blundell, C., Cornebise, J., Kavukcuoglu, K., and Wierstra, D. (2015). Weight uncertainty in neural network. In Bach, F. R. and Blei, D. M., editors, Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pages 1613-1622. JMLR.org. Available online [link].

Borsos, Z., Mutny, M., and Krause, A. (2020). Coresets via bilevel optimization for continual learning and streaming. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Available online [link].

Caccia, M., Rodríguez, P., Ostapenko, O., Normandin, F., Lin, M., Page-Caccia, L., Laradji, I. H., Rish, I., Lacoste, A., Vázquez, D., and Charlin, L. (2020). Online fast adaptation and knowledge accumulation (OSAKA): a new approach to continual learning. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Available online [link].

Cai, S., Xu, Z., Huang, Z., Chen, Y., and Kuo, C. J. (2018). Enhancing CNN incremental learning capability with an expanded network. In 2018 IEEE International Conference on Multimedia and Expo, ICME 2018, San Diego, CA, USA, July 23-27, 2018, pages 1-6. IEEE Computer Society. DOI: 10.1109/ICME.2018.8486457.

Castro, F. M., Marín-Jiménez, M. J., Guil, N., Schmid, C., and Alahari, K. (2018). End-to-end incremental learning. In Ferrari, V., Hebert, M., Sminchisescu, C., and Weiss, Y., editors, Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XII, volume 11216 of Lecture Notes in Computer Science, pages 241-257. Springer. Available online [link].

Cha, H., Lee, J., and Shin, J. (2021). Co2L: Contrastive continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9516-9525. Available online [link].

Chaudhry, A., Ranzato, M., Rohrbach, M., and Elhoseiny, M. (2019a). Efficient lifelong learning with a-GEM. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. DOI: 10.48550/arXiv.1812.00420.

Chaudhry, A., Rohrbach, M., Elhoseiny, M., Ajanthan, T., Dokania, P. K., Torr, P. H. S., and Ranzato, M. (2019b). Continual learning with tiny episodic memories. CoRR, abs/1902.10486. Available online [link].

Chen, X. and He, K. (2021). Exploring simple Siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15750-15758. Available online [link].

Cheraghian, A., Rahman, S., Fang, P., Roy, S. K., Petersson, L., and Harandi, M. (2021). Semantic-aware knowledge distillation for few-shot class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2534-2543. Available online [link].

Church, K. W. (2017). Word2vec. Natural Language Engineering, 23(1):155–162. DOI: 10.1017/S1351324916000334.

Clune, J., Mouret, J., and Lipson, H. (2013). Summary of "the evolutionary origins of modularity". In Blum, C. and Alba, E., editors, Genetic and Evolutionary Computation Conference, GECCO '13, Amsterdam, The Netherlands, July 6-10, 2013, Companion Material Proceedings, pages 23-24. ACM. DOI: 10.1145/2464576.2464596.

Coop, R., Mishtal, A., and Arel, I. (2013). Ensemble learning in fixed expansion layer networks for mitigating catastrophic forgetting. IEEE Trans. Neural Networks Learn. Syst., 24(10):1623-1634. DOI: 10.1109/TNNLS.2013.2264952.

Davari, M., Asadi, N., Mudur, S., Aljundi, R., and Belilovsky, E. (2022). Probing representation forgetting in supervised and unsupervised continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022). Available online [link].

De Lange, M., Aljundi, R., Masana, M., Parisot, S., Jia, X., Leonardis, A., Slabaugh, G., and Tuytelaars, T. (2021). A continual learning survey: Defying forgetting in classification tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence. DOI: 10.1109/TPAMI.2021.3057446.

De Lange, M. and Tuytelaars, T. (2021). Continual prototype evolution: Learning online from non-stationary data streams. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8250-8259. Available online [link].

Dhar, P., Singh, R. V., Peng, K.-C., Wu, Z., and Chellappa, R. (2019). Learning without memorizing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Available online [link].

Dong, N., Zhang, Y., Ding, M., and Lee, G. H. (2021). Bridging non co-occurrence with unlabeled in-the-wild data for incremental object detection. Advances in Neural Information Processing Systems, 34. Available online [link].

Douillard, A., Ramé, A., Couairon, G., and Cord, M. (2022). DyTox: Transformers for continual learning with dynamic token expansion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 9285-9295. Available online [link].

Egorov, E., Kuzina, A., and Burnaev, E. (2021). BooVAE: Boosting approach for continual learning of VAE. Advances in Neural Information Processing Systems, 34. Available online [link].

Ellefsen, K. O., Mouret, J., and Clune, J. (2015). Neural modularity helps organisms evolve to learn new skills without forgetting old skills. PLoS Comput. Biol., 11(4). DOI: 10.1371/journal.pcbi.1004128.

Fernando, C., Banarse, D., Blundell, C., Zwols, Y., Ha, D., Rusu, A. A., Pritzel, A., and Wierstra, D. (2017). PathNet: Evolution channels gradient descent in super neural networks. CoRR, abs/1701.08734. DOI: 10.48550/arXiv.1701.08734.

French, R. M. (1999). Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4):128-135. DOI: 10.1016/S1364-6613(99)01294-2.

Gao, Z., Xu, C., Li, F., Jia, Y., Harandi, M., and Wu, Y. (2023). Exploring data geometry for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24325-24334. Available online [link].

Gepperth, A. and Gondal, S. A. (2018). Incremental learning with deep neural networks using a test-time oracle. In 26th European Symposium on Artificial Neural Networks, ESANN 2018, Bruges, Belgium, April 25-27, 2018. Available online [link].

Gepperth, A., Lefort, M., and Hecht, T. (2015). Resource-efficient incremental learning in very high dimensions. In 23rd European Symposium on Artificial Neural Networks, ESANN 2015, Bruges, Belgium.

Goodfellow, I. J., Bengio, Y., and Courville, A. C. (2016). Deep Learning. Adaptive computation and machine learning. MIT Press. Available online [link].

Goodfellow, I. J., Mirza, M., Da, X., Courville, A. C., and Bengio, Y. (2014a). An empirical investigation of catastrophic forgetting in gradient-based neural networks. In Bengio, Y. and LeCun, Y., editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings. DOI: 10.48550/arXiv.1312.6211.

Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. C., and Bengio, Y. (2014b). Generative adversarial nets. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q., editors, Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 2672-2680. Available online [link].

Goodrich, B. and Arel, I. (2014). Neuron clustering for mitigating catastrophic forgetting in feedforward neural networks. In 2014 IEEE Symposium on Computational Intelligence in Dynamic and Uncertain Environments, CIDUE 2014, Orlando, FL, USA, December 9-12, 2014, pages 62-68. IEEE. DOI: 10.1109/CIDUE.2014.7007868.

Gregor, K., Danihelka, I., Graves, A., Rezende, D. J., and Wierstra, D. (2015). DRAW: A recurrent neural network for image generation. In Bach, F. R. and Blei, D. M., editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 1462-1471, Lille, France. PMLR. Available online [link].

Greve, R. B., Jacobsen, E. J., and Risi, S. (2016). Evolving neural turing machines for reward-based learning. In Friedrich, T., Neumann, F., and Sutton, A. M., editors, Proceedings of the Genetic and Evolutionary Computation Conference 2016, GECCO '16, page 117–124, New York, NY, USA. Association for Computing Machinery. DOI: 10.1145/2908812.2908930.

Guo, H., Yan, Z., Yang, J., and Li, S. (2018). An incremental scheme with weight pruning to train deep neural network. In Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., and Zhang, B., editors, Communications, Signal Processing, and Systems - Proceedings of the 2018 CSPS, Volume III: Systems, Dalian, China, 14-16 July 2018, volume 517 of Lecture Notes in Electrical Engineering, pages 295-302. Springer. DOI: 10.1007/978-981-13-6508-9_37.

Han, S., Pool, J., Narang, S., Mao, H., Gong, E., Tang, S., Elsen, E., Vajda, P., Paluri, M., Tran, J., Catanzaro, B., and Dally, W. J. (2017). DSD: dense-sparse-dense training for deep neural networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net. DOI: 10.48550/arXiv.1607.04381.

Han, S., Pool, J., Tran, J., and Dally, W. J. (2015). Learning both weights and connections for efficient neural network. In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 1135-1143. Available online [link].

Hattori, M. and Tsuboi, H. (2018). Reduction of catastrophic forgetting for multilayer neural networks trained by no-prop algorithm. In 2018 International Conference on Information and Communications Technology (ICOIACT), pages 214-219. IEEE. DOI: 10.1109/ICOIACT.2018.835066.

Hayes, T. L., Cahill, N. D., and Kanan, C. (2019). Memory efficient experience replay for streaming learning. In International Conference on Robotics and Automation, ICRA 2019, Montreal, QC, Canada, May 20-24, 2019, pages 9769-9776. IEEE. DOI: 10.1109/ICRA.2019.8793982.

Hayes, T. L., Kafle, K., Shrestha, R., Acharya, M., and Kanan, C. (2020). REMIND your neural network to prevent catastrophic forgetting. In Vedaldi, A., Bischof, H., Brox, T., and Frahm, J., editors, Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part VIII, volume 12353 of Lecture Notes in Computer Science, pages 466-483. Springer. DOI: 10.1007/978-3-030-58598-3_28.

He, C., Wang, R., Shan, S., and Chen, X. (2018). Exemplar-supported generative reproduction for class incremental learning. In British Machine Vision Conference 2018, BMVC 2018, Newcastle, UK, September 3-6, 2018, page 98. BMVA Press. Available online [link].

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages 770-778. IEEE Computer Society. DOI: 10.1109/CVPR.2016.90.

Henning, C., Cervera, M., D'Angelo, F., Von Oswald, J., Traber, R., Ehret, B., Kobayashi, S., Grewe, B. F., and Sacramento, J. (2021). Posterior meta-replay for continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580. DOI: 10.48550/arXiv.1207.0580.

Hinton, G. E., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. CoRR, abs/1503.02531. DOI: 10.48550/arXiv.1503.02531.

Hong, D., Li, Y., and Shin, B.-S. (2019). Predictive EWC: mitigating catastrophic forgetting of neural network through pre-prediction of learning data. Journal of Ambient Intelligence and Humanized Computing. DOI: 10.1007/s12652-019-01346-7.

Hou, S., Pan, X., Loy, C. C., Wang, Z., and Lin, D. (2018). Lifelong learning via progressive distillation and retrospection. In Proceedings of the European Conference on Computer Vision (ECCV), pages 437-452. Available online [link].

Hou, S., Pan, X., Loy, C. C., Wang, Z., and Lin, D. (2019). Learning a unified classifier incrementally via rebalancing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Available online [link].

Hu, W., Lin, Z., Liu, B., Tao, C., Tao, Z., Ma, J., Zhao, D., and Yan, R. (2019). Overcoming catastrophic forgetting for continual learning via model adaptation. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. Available online [link].

Hu, X., Tang, K., Miao, C., Hua, X.-S., and Zhang, H. (2021). Distilling causal effect of data in class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3957-3966. Available online [link].

Hu, Z., Li, Y., Lyu, J., Gao, D., and Vasconcelos, N. (2023). Dense network expansion for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11858-11867. Available online [link].

Huang, G., Zhu, Q., and Siew, C. K. (2006). Extreme learning machine: Theory and applications. Neurocomputing, 70(1-3):489-501. DOI: 10.1016/j.neucom.2005.12.126.

Hung, C.-Y., Tu, C.-H., Wu, C.-E., Chen, C.-H., Chan, Y.-M., and Chen, C.-S. (2019). Compacting, picking and growing for unforgetting continual learning. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc. Available online [link].

Imai, S. and Nobuhara, H. (2018). Stepwise PathNet: Transfer learning algorithm to improve network structure versatility. In IEEE International Conference on Systems, Man, and Cybernetics, SMC 2018, Miyazaki, Japan, October 7-10, 2018, pages 918-922. IEEE. DOI: 10.1109/SMC.2018.00163.

Javed, K. and White, M. (2019). Meta-learning representations for continual learning. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc. Available online [link].

Jin, X., Sadhu, A., Du, J., and Ren, X. (2021). Gradient-based editing of memory examples for online task-free continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Joseph, K. J. and Balasubramanian, V. N. (2020). Meta-consolidation for continual learning. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Available online [link].

Jung, S., Ahn, H., Cha, S., and Moon, T. (2020). Continual learning with node-importance based adaptive group sparse regularization. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Available online [link].

K"ading, C., Rodner, E., Freytag, A., and Denzler, J. (2016). Fine-tuning deep neural networks in continuous learning scenarios. In Chen, C., Lu, J., and Ma, K., editors, Computer Vision - ACCV 2016 Workshops - ACCV 2016 International Workshops, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part III, volume 10118 of Lecture Notes in Computer Science, pages 588-605. Springer. DOI: 10.1007/978-3-319-54526-4_43.

Karatzas, I. and Shreve, S. E. (1991). Brownian motion and stochastic calculus, volume 113 of Graduate Texts in Mathematics. Springer New York. DOI: 10.1007/978-1-4612-0949-2.

Ke, Z., Liu, B., and Huang, X. (2020). Continual learning of a mixed sequence of similar and dissimilar tasks. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Available online [link].

Ke, Z., Liu, B., Wang, H., and Shu, L. (2021). Continual learning with knowledge transfer for sentiment classification. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14-18, 2020, Proceedings, Part III, pages 683-698. Springer. DOI: 10.1007/978-3-030-67664-3_41.

Kemker, R., McClure, M., Abitino, A., Hayes, T. L., and Kanan, C. (2018). Measuring catastrophic forgetting in neural networks. In McIlraith, S. A. and Weinberger, K. Q., editors, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 3390-3398. AAAI Press. DOI: 10.1609/aaai.v32i1.11651.

Kim, G., Liu, B., and Ke, Z. (2022). A multi-head model for continual learning via out-of-distribution replay. In Chandar, S., Pascanu, R., and Precup, D., editors, Conference on Lifelong Learning Agents, CoLLAs 2022, 22-24 August 2022, McGill University, Montréal, Québec, Canada, volume 199 of Proceedings of Machine Learning Research, pages 548-563. PMLR. Available online [link].

Kirkpatrick, J., Pascanu, R., Rabinowitz, N. C., Veness, J., Desjardins, G., Rusu, A. A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., Hassabis, D., Clopath, C., Kumaran, D., and Hadsell, R. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13):3521-3526. DOI: 10.1073/pnas.1611835114.

Kitchenham, B. A. and Charters, S. (2007). Guidelines for performing systematic literature reviews in software engineering. Technical Report EBSE 2007-001, Keele University and Durham University Joint Report. Available online [link].

Knoblauch, J., Husain, H., and Diethe, T. (2020). Optimal continual learning has perfect memory and is NP-hard. In III, H. D. and Singh, A., editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 5327-5337. PMLR. Available online [link].

Kobayashi, T. (2018). Check regularization: Combining modularity and elasticity for memory consolidation. In Kurková, V., Manolopoulos, Y., Hammer, B., Iliadis, L. S., and Maglogiannis, I., editors, Artificial Neural Networks and Machine Learning - ICANN 2018 - 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4-7, 2018, Proceedings, Part II, volume 11140 of Lecture Notes in Computer Science, pages 315-325. Springer. DOI: 10.1007/978-3-030-01421-6_31.

Lancewicki, T., Goodrich, B., and Arel, I. (2015). Sequential covariance-matrix estimation with application to mitigating catastrophic forgetting. In Li, T., Kurgan, L. A., Palade, V., Goebel, R., Holzinger, A., Verspoor, K., and Wani, M. A., editors, 14th IEEE International Conference on Machine Learning and Applications, ICMLA 2015, Miami, FL, USA, December 9-11, 2015, pages 628-633. DOI: 10.1109/ICMLA.2015.109.

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324. DOI: 10.1109/5.726791.

Ledoit, O., Wolf, M., et al. (2012). Nonlinear shrinkage estimation of large-dimensional covariance matrices. The Annals of Statistics, 40(2):1024-1060. DOI: 10.1214/12-AOS989.

Lee, K., Lee, K., Shin, J., and Lee, H. (2019). Overcoming catastrophic forgetting with unlabeled data in the wild. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 312-321. Available online [link].

Lee, S., Ha, J., Zhang, D., and Kim, G. (2020). A neural dirichlet process mixture model for task-free continual learning. In International Conference on Learning Representations. DOI: 10.48550/arXiv.2001.00689.

Lee, S., Kim, J., Jun, J., Ha, J., and Zhang, B. (2017a). Overcoming catastrophic forgetting by incremental moment matching. In Guyon, I., von Luxburg, U., Bengio, S., Wallach, H. M., Fergus, R., Vishwanathan, S. V. N., and Garnett, R., editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pages 4652-4662. Available online [link].

Lee, S., Lee, C., Kwak, D., Ha, J., Kim, J., and Zhang, B. (2017b). Dual-memory neural networks for modeling cognitive activities of humans via wearable sensors. Neural Networks, 92:17-28. DOI: 10.1016/j.neunet.2017.02.008.

Leontev, M. I., Mikheev, A., Sviatov, K., and Sukhov, S. V. (2019). Overcoming catastrophic interference with bayesian learning and stochastic langevin dynamics. In Lu, H., Tang, H., and Wang, Z., editors, Advances in Neural Networks - ISNN 2019 - 16th International Symposium on Neural Networks, ISNN 2019, Moscow, Russia, July 10-12, 2019, Proceedings, Part I, volume 11554 of Lecture Notes in Computer Science, pages 370-378. Springer. DOI: 10.1007/978-3-030-22796-8_39.

Li, T., Ke, Q., Rahmani, H., Ho, R. E., Ding, H., and Liu, J. (2021). Else-Net: Elastic semantic network for continual action recognition from skeleton data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 13434-13443. Available online [link].

Li, X., Zhou, Y., Wu, T., Socher, R., and Xiong, C. (2019). Learn to grow: A continual structure learning framework for overcoming catastrophic forgetting. In Chaudhuri, K. and Salakhutdinov, R., editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3925-3934. PMLR. Available online [link].

Li, Z. and Hoiem, D. (2018). Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935-2947. DOI: 10.1109/TPAMI.2017.2773081.

Liang, D., Yang, F., Zhang, T., and Yang, P. (2018). Understanding mixup training methods. IEEE Access, 6:58774-58783. DOI: 10.1109/ACCESS.2018.2872698.

Lipson, H. (2007). Principles of modularity, regularity, and hierarchy for scalable systems. Journal of Biological Physics and Chemistry, 7(4):125. DOI: 10.4024/40701.jbpc.07.04.

Liu, H., Gu, L., Chi, Z., Wang, Y., Yu, Y., Chen, J., and Tang, J. (2022). Few-shot class-incremental learning via entropy-regularized data-free replay. In Avidan, S., Brostow, G., Cissé, M., Farinella, G. M., and Hassner, T., editors, Computer Vision - ECCV 2022, pages 146-162, Cham. Springer Nature Switzerland. DOI: 10.1007/978-3-031-20053-3_9.

Liu, Q., Majumder, O., Achille, A., Ravichandran, A., Bhotika, R., and Soatto, S. (2020a). Incremental few-shot meta-learning via indirect discriminant alignment. In Vedaldi, A., Bischof, H., Brox, T., and Frahm, J.-M., editors, Computer Vision - ECCV 2020, pages 685-701, Cham. Springer International Publishing. DOI: 10.1007/978-3-030-58571-6_40.

Liu, X., Masana, M., Herranz, L., Van de Weijer, J., López, A. M., and Bagdanov, A. D. (2018). Rotate your networks: Better weight consolidation and less catastrophic forgetting. In 2018 24th International Conference on Pattern Recognition (ICPR), pages 2262-2268. DOI: 10.1109/ICPR.2018.8545895.

Liu, Y., Parisot, S., Slabaugh, G., Jia, X., Leonardis, A., and Tuytelaars, T. (2020b). More classifiers, less forgetting: A generic multi-classifier paradigm for incremental learning. In Vedaldi, A., Bischof, H., Brox, T., and Frahm, J.-M., editors, Computer Vision - ECCV 2020, pages 699-716, Cham. Springer International Publishing. DOI: 10.1007/978-3-030-58574-7_42.

Liu, Y., Schiele, B., and Sun, Q. (2021). RMM: Reinforced memory management for class-incremental learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Liu, Y., Su, Y., Liu, A.-A., Schiele, B., and Sun, Q. (2020c). Mnemonics training: Multi-class incremental learning without forgetting. In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, pages 12245-12254. Available online [link].

Lomonaco, V., Pellegrini, L., Cossu, A., Carta, A., Graffieti, G., Hayes, T. L., Lange, M. D., Masana, M., Pomponi, J., van de Ven, G., Mundt, M., She, Q., Cooper, K., Forest, J., Belouadah, E., Calderara, S., Parisi, G. I., Cuzzolin, F., Tolias, A., Scardapane, S., Antiga, L., Ahmad, S., Popescu, A., Kanan, C., van de Weijer, J., Tuytelaars, T., Bacciu, D., and Maltoni, D. (2021). Avalanche: an end-to-end library for continual learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2nd Continual Learning in Computer Vision Workshop. Available online [link].

Lopez-Paz, D. and Ranzato, M. A. (2017). Gradient episodic memory for continual learning. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 30, pages 6467-6476. Curran Associates, Inc. Available online [link].

L"uders, B., Schl"ager, M., Korach, A., and Risi, S. (2017). Continual and one-shot learning through neural networks with dynamic external memory. In Squillero, G. and Sim, K., editors, Applications of Evolutionary Computation - 20th European Conference, EvoApplications 2017, Amsterdam, The Netherlands, April 19-21, 2017, Proceedings, Part I, volume 10199 of Lecture Notes in Computer Science, pages 886-901. DOI: 10.1007/978-3-319-55849-3_57.

Ma, C., Ji, Z., Huang, Z., Shen, Y., Gao, M., and Xu, J. (2023). Progressive voronoi diagram subdivision enables accurate data-free class-incremental learning. In The Eleventh International Conference on Learning Representations. Available online [link].

Madaan, D., Yoon, J., Li, Y., Liu, Y., and Hwang, S. J. (2022). Representational continuity for unsupervised continual learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. DOI: 10.48550/arXiv.2110.06976.

Mai, Z., Li, R., Jeong, J., Quispe, D., Kim, H., and Sanner, S. (2021a). Online continual learning in image classification: An empirical survey. arXiv preprint arXiv:2101.10423. DOI: 10.1016/j.neucom.2021.10.021.

Mai, Z., Li, R., Kim, H., and Sanner, S. (2021b). Supervised contrastive replay: Revisiting the nearest class mean classifier in online class-incremental continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3589-3599. Available online [link].

Mallya, A., Davis, D., and Lazebnik, S. (2018). Piggyback: Adapting a single network to multiple tasks by learning to mask weights. In Ferrari, V., Hebert, M., Sminchisescu, C., and Weiss, Y., editors, Computer Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part IV, volume 11208 of Lecture Notes in Computer Science, pages 72-88. Springer. DOI: 10.1007/978-3-030-01225-0_5.

Mallya, A. and Lazebnik, S. (2018). PackNet: Adding multiple tasks to a single network by iterative pruning. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, pages 7765-7773. IEEE Computer Society. DOI: 10.1109/CVPR.2018.00810.

Masana, M., Liu, X., Twardowski, B., Menta, M., Bagdanov, A. D., and van de Weijer, J. (2020). Class-incremental learning: survey and performance evaluation. CoRR, abs/2010.15277. DOI: 10.48550/arXiv.2010.15277.

Masse, N. Y., Grant, G. D., and Freedman, D. J. (2018). Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization. Proc. Natl. Acad. Sci. USA, 115(44):E10467-E10475. DOI: 10.1073/pnas.1803839115.

McCloskey, M. and Cohen, N. J. (1989). Catastrophic interference in connectionist networks: The sequential learning problem. In Bower, G. H., editor, Psychology of Learning and Motivation, volume 24, pages 109-165. Academic Press. DOI: 10.1016/S0079-7421(08)60536-8.

Mellado, D., Saavedra, C., Chabert, S., and Salas, R. (2017). Pseudorehearsal approach for incremental learning of deep convolutional neural networks. In Barone, D. A. C., Teles, E. O., and Brackmann, C. P., editors, Computational Neuroscience, pages 118-126, Cham. Springer International Publishing. DOI: 10.1007/978-3-319-71011-2_10.

Mi, F., Chen, L., Zhao, M., Huang, M., and Faltings, B. (2020). Continual learning for natural language generation in task-oriented dialog systems. In Cohn, T., He, Y., and Liu, Y., editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020, Online Event, 16-20 November 2020, pages 3461-3474. Association for Computational Linguistics. DOI: 10.18653/v1/2020.findings-emnlp.310.

Mirzadeh, S., Farajtabar, M., Pascanu, R., and Ghasemzadeh, H. (2020). Understanding the role of training regimes in continual learning. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F., and Lin, H., editors, Advances in Neural Information Processing Systems, volume 33, pages 7308-7320. Curran Associates, Inc. Available online [link].

Movellan, J. R. (1991). Contrastive hebbian learning in the continuous hopfield model. In Touretzky, D. S., Elman, J. L., Sejnowski, T. J., and Hinton, G. E., editors, Connectionist Models, pages 10-17. Morgan Kaufmann. DOI: 10.1016/B978-1-4832-1448-1.50007-X.

Nair, V. and Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, ICML'10, page 807-814, Madison, WI, USA. Omnipress. DOI: 10.5555/3104322.3104425.

Nguyen, C. V., Li, Y., Bui, T. D., and Turner, R. E. (2018). Variational continual learning. In International Conference on Learning Representations. Available online [link].

Oh, Y., Baek, D., and Ham, B. (2022). ALIFE: Adaptive logit regularizer and feature replay for incremental semantic segmentation. In Oh, A. H., Agarwal, A., Belgrave, D., and Cho, K., editors, Advances in Neural Information Processing Systems. Available online [link].

Ostapenko, O., Puscas, M., Klein, T., Jahnichen, P., and Nabi, M. (2019). Learning to remember: A synaptic plasticity driven framework for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11321-11329. Available online [link].

Ostapenko, O., Rodriguez, P., Caccia, M., and Charlin, L. (2021). Continual learning via local module composition. Advances in Neural Information Processing Systems, 34. Available online [link].

Pan, P., Swaroop, S., Immer, A., Eschenhagen, R., Turner, R. E., and Khan, M. E. (2020). Continual deep learning by functional regularisation of memorable past. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, volume 33, pages 4453-4464. Curran Associates, Inc. Available online [link].

Parisi, G. I., Kemker, R., Part, J. L., Kanan, C., and Wermter, S. (2019). Continual lifelong learning with neural networks: A review. Neural Networks, 113:54-71. DOI: 10.1016/j.neunet.2019.01.012.

Pennington, J., Socher, R., and Manning, C. D. (2014). GloVe: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532-1543. DOI: 10.3115/v1/D14-1162.

Perez-Rua, J.-M., Zhu, X., Hospedales, T. M., and Xiang, T. (2020). Incremental few-shot object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13846-13855. Available online [link].

Petit, G., Popescu, A., Schindler, H., Picard, D., and Delezoide, B. (2023). FeTrIL: Feature translation for exemplar-free class-incremental learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3911-3920. Available online [link].

Pf"ulb, B. and Gepperth, A. (2019). A comprehensive, application-oriented study of catastrophic forgetting in dnns. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. DOI: 10.48550/arXiv.1905.08101.

Pham, Q., Liu, C., and Hoi, S. (2022). Continual normalization: Rethinking batch normalization for online continual learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. DOI: 10.48550/arXiv.2203.16102.

PourKeshavarzi, M., Zhao, G., and Sabokrou, M. (2022). Looking back on learned experiences for class/task incremental learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. Available online [link].

Prabhu, A., Torr, P. H. S., and Dokania, P. K. (2020). GDumb: A simple approach that questions our progress in continual learning. In Vedaldi, A., Bischof, H., Brox, T., and Frahm, J.-M., editors, Computer Vision - ECCV 2020, pages 524-540, Cham. Springer International Publishing. DOI: 10.1007/978-3-030-58536-5_31.

Qin, Q., Hu, W., Peng, H., Zhao, D., and Liu, B. (2021). BNS: Building network structures dynamically for continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Rajasegaran, J., Khan, S., Hayat, M., Khan, F. S., and Shah, M. (2020). iTAML: An incremental task-agnostic meta-learning approach. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13588-13597. Available online [link].

Ramesh, R. and Chaudhari, P. (2022). Model zoo: A growing brain that learns continually. In International Conference on Learning Representations. Available online [link].

Rannen, A., Aljundi, R., Blaschko, M. B., and Tuytelaars, T. (2017). Encoder based lifelong learning. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 1320-1328. Available online [link].

Rebuffi, S., Kolesnikov, A., Sperl, G., and Lampert, C. H. (2017). iCaRL: Incremental classifier and representation learning. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017, pages 5533-5542. IEEE Computer Society. DOI: 10.1109/CVPR.2017.587.

Ritter, H., Botev, A., and Barber, D. (2018). Online structured Laplace approximations for overcoming catastrophic forgetting. In Bengio, S., Wallach, H. M., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R., editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pages 3742-3752. Available online [link].

Robins, A. V. (1993). Catastrophic forgetting in neural networks: the role of rehearsal mechanisms. In First New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems, ANNES '93, Dunedin, New Zealand, November 24-26, 1993, pages 65-68. IEEE. DOI: 10.1109/ANNES.1993.323080.

Robins, A. V. (1995). Catastrophic forgetting, rehearsal and pseudorehearsal. Connect. Sci., 7(2):123-146. DOI: 10.1080/09540099550039318.

Rostami, M., Kolouri, S., and Pilly, P. K. (2019). Complementary learning for overcoming catastrophic forgetting using experience replay. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pages 3339-3345. International Joint Conferences on Artificial Intelligence Organization. DOI: 10.48550/arXiv.1903.04566.

Roy, D., Panda, P., and Roy, K. (2020). Tree-CNN: A hierarchical deep convolutional neural network for incremental learning. Neural Networks, 121:148-160. DOI: 10.1016/j.neunet.2019.09.010.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M. S., Berg, A. C., and Li, F. (2015). Imagenet large scale visual recognition challenge. Int. J. Comput. Vis., 115(3):211-252. DOI: 10.1007/s11263-015-0816-y.

Rusu, A. A., Rabinowitz, N. C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R., and Hadsell, R. (2016). Progressive neural networks. CoRR, abs/1606.04671. DOI: 10.48550/arXiv.1606.04671.

Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., and Lillicrap, T. P. (2016). Meta-learning with memory-augmented neural networks. In Balcan, M. and Weinberger, K. Q., editors, Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML'16, page 1842-1850. JMLR.org. DOI: 10.5555/3045390.3045585.

Sarfraz, F., Arani, E., and Zonooz, B. (2023). Sparse coding in a dual memory system for lifelong learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8):9714-9722. DOI: 10.1609/aaai.v37i8.26161.

Schaefer, T. J. (1978). The complexity of satisfiability problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC '78, page 216–226, New York, NY, USA. Association for Computing Machinery. DOI: 10.1145/800133.804350.

Schroff, F., Kalenichenko, D., and Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015, pages 815-823. IEEE Computer Society. DOI: 10.1109/CVPR.2015.7298682.

Serrà, J., Surís, D., Miron, M., and Karatzoglou, A. (2018). Overcoming catastrophic forgetting with hard attention to the task. In Dy, J. G. and Krause, A., editors, Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pages 4555-4564. PMLR. Available online [link].

Serre, D. (2002). Elementary theory. In Matrices: Theory and Applications, pages 1-14. Springer. DOI: 10.1007/0-387-22758-X_1.

Shankar, S. and Sarawagi, S. (2018). Labeled memory networks for online model adaptation. In McIlraith, S. A. and Weinberger, K. Q., editors, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 4034-4041. AAAI Press. Available online [link].

Shi, Y., Zhou, K., Liang, J., Jiang, Z., Feng, J., Torr, P. H., Bai, S., and Tan, V. Y. (2022). Mimicking the oracle: an initial phase decorrelation approach for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16722-16731. Available online [link].

Shim, D., Mai, Z., Jeong, J., Sanner, S., Kim, H., and Jang, J. (2021). Online class-incremental continual learning with adversarial Shapley value. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, number 11, pages 9630-9638. DOI: 10.1609/aaai.v35i11.17159.

Shin, H., Lee, J. K., Kim, J., and Kim, J. (2017). Continual learning with deep generative replay. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc. Available online [link].

Simon, C., Koniusz, P., and Harandi, M. (2021). On learning the geodesic path for incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1591-1600. Available online [link].

Smith, J., Hsu, Y., Balloch, J., Shen, Y., Jin, H., and Kira, Z. (2021). Always be dreaming: A new approach for data-free class-incremental learning. CoRR, abs/2106.09701. Available online [link].

Smith, J. S., Karlinsky, L., Gutta, V., Cascante-Bonilla, P., Kim, D., Arbelle, A., Panda, R., Feris, R., and Kira, Z. (2023). CODA-Prompt: Continual decomposed attention-based prompting for rehearsal-free continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11909-11919. Available online [link].

Sprechmann, P., Jayakumar, S. M., Rae, J. W., Pritzel, A., Badia, A. P., Uria, B., Vinyals, O., Hassabis, D., Pascanu, R., and Blundell, C. (2018). Memory-based parameter adaptation. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. Available online [link].

Stanley, K. O. and Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evol. Comput., 10(2):99–127. DOI: 10.1162/106365602320169811.

Sun, S., Calandriello, D., Hu, H., Li, A., and Titsias, M. (2022). Information-theoretic online memory selection for continual learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. DOI: 10.48550/arXiv.2204.04763.

Tang, S., Chen, D., Zhu, J., Yu, S., and Ouyang, W. (2021). Layerwise optimization by gradient decomposition for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9634-9643. Available online [link].

Titsias, M. K., Schwarz, J., de G. Matthews, A. G., Pascanu, R., and Teh, Y. W. (2020). Functional regularisation for continual learning with gaussian processes. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. Available online [link].

van de Ven, G. M., Siegelmann, H. T., and Tolias, A. S. (2020). Brain-inspired replay for continual learning with artificial neural networks. Nature Communications, 11(1). Available online [link].

van de Ven, G. M. and Tolias, A. (2019). Three scenarios for continual learning. CoRR, abs/1904.07734. DOI: 10.48550/arXiv.1904.07734.

Velez, R. and Clune, J. (2017). Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks. PloS one, 12(11):e0187736. DOI: 10.1371/journal.pone.0187736.

Verma, V. K., Liang, K. J., Mehta, N., Rai, P., and Carin, L. (2021). Efficient feature transformations for discriminative and generative continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13865-13875. Available online [link].

Verwimp, E., De Lange, M., and Tuytelaars, T. (2021). Rehearsal revealed: The limits and merits of revisiting samples in continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9385-9394. Available online [link].

Villa, A., Alcázar, J. L., Alfarra, M., Alhamoud, K., Hurtado, J., Heilbron, F. C., Soto, A., and Ghanem, B. (2023). PIVOT: Prompting for video continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24214-24223. Available online [link].

von Oswald, J., Henning, C., Sacramento, J., and Grewe, B. F. (2020). Continual learning with hypernetworks. In International Conference on Learning Representations. DOI: 10.48550/arXiv.1906.00695.

Von Oswald, J., Zhao, D., Kobayashi, S., Schug, S., Caccia, M., Zucchet, N., and Sacramento, J. (2021). Learning where to learn: Gradient sparsity in meta and continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Wagner, G. P., Pavlicev, M., and Cheverud, J. M. (2007). The road to modularity. Nature Reviews Genetics, 8(12):921-931. DOI: 10.1038/nrg2267.

Wang, L., Zhang, M., Jia, Z., Li, Q., Bao, C., Ma, K., Zhu, J., and Zhong, Y. (2021). AFEC: Active forgetting of negative transfer in continual learning. Advances in Neural Information Processing Systems, 34. Available online [link].

Wang, L., Zhang, X., Yang, K., Yu, L., Li, C., Hong, L., Zhang, S., Li, Z., Zhong, Y., and Zhu, J. (2022a). Memory replay with data compression for continual learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. DOI: 10.48550/arXiv.2202.06592.

Wang, Y., Huang, Z., and Hong, X. (2022b). S-Prompts learning with pre-trained transformers: An Occam's razor for domain incremental learning. In Conference on Neural Information Processing Systems (NeurIPS). Available online [link].

Wang, Z., Liu, L., Duan, Y., Kong, Y., and Tao, D. (2022c). Continual learning with lifelong vision transformer. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 171-181. DOI: 10.1109/CVPR52688.2022.00027.

Wang, Z., Liu, L., Kong, Y., Guo, J., and Tao, D. (2022d). Online continual learning with contrastive vision transformer. In Avidan, S., Brostow, G., Cissé, M., Farinella, G. M., and Hassner, T., editors, Computer Vision - ECCV 2022, pages 631-650, Cham. Springer Nature Switzerland. DOI: 10.1007/978-3-031-20044-1_36.

Wang, Z., Zhan, Z., Gong, Y., Yuan, G., Niu, W., Jian, T., Ren, B., Ioannidis, S., Wang, Y., and Dy, J. (2022e). SparCL: Sparse continual learning on the edge. In Oh, A. H., Agarwal, A., Belgrave, D., and Cho, K., editors, Advances in Neural Information Processing Systems. Available online [link].

Widrow, B., Greenblatt, A., Kim, Y., and Park, D. (2013). The no-prop algorithm: A new learning algorithm for multilayer neural networks. Neural Networks, 37:182-188. DOI: 10.1016/j.neunet.2012.09.020.

Wu, C., Herranz, L., Liu, X., Wang, Y., van de Weijer, J., and Raducanu, B. (2018). Memory replay GANs: Learning to generate new categories without forgetting. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc. Available online [link].

Wu, G., Gong, S., and Li, P. (2021). Striking a balance between stability and plasticity for class-incremental learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1124-1133. Available online [link].

Wu, T., Caccia, M., Li, Z., Li, Y.-F., Qi, G., and Haffari, G. (2022). Pretrained language model in continual learning: A comparative study. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. Available online [link].

Wu, Y., Chen, Y., Wang, L., Ye, Y., Liu, Z., Guo, Y., and Fu, Y. (2019). Large scale incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Available online [link].

Xiang, Y., Fu, Y., Ji, P., and Huang, H. (2019). Incremental learning using conditional adversarial networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 6619-6628. Available online [link].

Xiong, F., Liu, Z., and Yang, X. (2018). Overcoming catastrophic forgetting with self-adaptive identifiers. In Cheng, L., Leung, A. C., and Ozawa, S., editors, Neural Information Processing - 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part III, volume 11303 of Lecture Notes in Computer Science, pages 497-505. Springer. DOI: 10.1007/978-3-030-04182-3_43.

Xu, J. and Zhu, Z. (2018). Reinforced continual learning. In Bengio, S., Wallach, H. M., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R., editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, pages 907-916. Available online [link].

Xue, M., Zhang, H., Song, J., and Song, M. (2022). Meta-attention for ViT-backed continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 150-159. Available online [link].

Yan, S., Xie, J., and He, X. (2021). DER: Dynamically expandable representation for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3014-3023. Available online [link].

Yang, Y., Yuan, H., Li, X., Lin, Z., Torr, P., and Tao, D. (2023). Neural collapse inspired feature-classifier alignment for few-shot class-incremental learning. In The Eleventh International Conference on Learning Representations. Available online [link].

Yang, Y., Zhou, D.-W., Zhan, D.-C., Xiong, H., and Jiang, Y. (2019). Adaptive deep models for incremental learning: Considering capacity scalability and sustainability. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '19, page 74–82, New York, NY, USA. Association for Computing Machinery. DOI: 10.1145/3292500.3330865.

Yin, H., Li, P., et al. (2021). Mitigating forgetting in online continual learning with neuron calibration. Advances in Neural Information Processing Systems, 34. Available online [link].

Yin, H., Molchanov, P., Alvarez, J. M., Li, Z., Mallya, A., Hoiem, D., Jha, N. K., and Kautz, J. (2020). Dreaming to distill: Data-free knowledge transfer via DeepInversion. In The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Available online [link].

Yoon, J., Madaan, D., Yang, E., and Hwang, S. J. (2022). Online coreset selection for rehearsal-based continual learning. In 10th International Conference on Learning Representations, ICLR 2022, Virtual, April 25 - April 29, 2022, Conference Track Proceedings. DOI: 10.48550/arXiv.2106.01085.

Yoon, J., Yang, E., Lee, J., and Hwang, S. J. (2018). Lifelong learning with dynamically expandable networks. In International Conference on Learning Representations. DOI: 10.48550/arXiv.1708.01547.

Yoon, S. W., Kim, D.-Y., Seo, J., and Moon, J. (2020). XtarNet: Learning to extract task-adaptive representation for incremental few-shot learning. In III, H. D. and Singh, A., editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 10852-10860. PMLR. Available online [link].

Yu, L., Liu, X., and van de Weijer, J. (2022). Self-training for class-incremental semantic segmentation. IEEE Transactions on Neural Networks and Learning Systems, 1:1-12. DOI: 10.1109/TNNLS.2022.3155746.

Yu, L., Twardowski, B., Liu, X., Herranz, L., Wang, K., Cheng, Y., Jui, S., and van de Weijer, J. (2020). Semantic drift compensation for class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6982-6991. Available online [link].

Zacarias, A. S. and Alexandre, L. A. (2018a). Improving SeNA-CNN by automating task recognition. In Yin, H., Camacho, D., Novais, P., and Tallón-Ballesteros, A. J., editors, Intelligent Data Engineering and Automated Learning - IDEAL 2018 - 19th International Conference, Madrid, Spain, November 21-23, 2018, Proceedings, Part I, volume 11314 of Lecture Notes in Computer Science, pages 711-721. Springer. DOI: 10.1007/978-3-030-03493-1_74.

Zacarias, A. S. and Alexandre, L. A. (2018b). SeNA-CNN: Overcoming catastrophic forgetting in convolutional neural networks by selective network augmentation. In Pancioni, L., Schwenker, F., and Trentin, E., editors, Artificial Neural Networks in Pattern Recognition - 8th IAPR TC3 Workshop, ANNPR 2018, Siena, Italy, September 19-21, 2018, Proceedings, volume 11081 of Lecture Notes in Computer Science, pages 102-112. Springer. DOI: 10.1007/978-3-319-99978-4_8.

Zbontar, J., Jing, L., Misra, I., LeCun, Y., and Deny, S. (2021). Barlow twins: Self-supervised learning via redundancy reduction. In International Conference on Machine Learning, pages 12310-12320. PMLR. Available online [link].

Zeng, G., Chen, Y., Cui, B., and Yu, S. (2019). Continual learning of context-dependent processing in neural networks. Nature Machine Intelligence, 1(8):364-372. DOI: 10.1038/s42256-019-0080-x.

Zenke, F., Poole, B., and Ganguli, S. (2017). Continual learning through synaptic intelligence. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML'17, page 3987–3995. JMLR.org. DOI: 10.5555/3305890.3306093.

Zhai, M., Chen, L., He, J., Nawhal, M., Tung, F., and Mori, G. (2020). Piggyback GAN: Efficient lifelong learning for image conditioned generation. In Vedaldi, A., Bischof, H., Brox, T., and Frahm, J.-M., editors, Computer Vision - ECCV 2020, pages 397-413, Cham. Springer International Publishing. DOI: 10.1007/978-3-030-58589-1_24.

Zhai, M., Chen, L., Tung, F., He, J., Nawhal, M., and Mori, G. (2019). Lifelong GAN: Continual learning for conditional image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Available online [link].

Zhang, Y., Pfahringer, B., Frank, E., Bifet, A., Lim, N. J. S., and Jia, A. (2022). A simple but strong baseline for online continual learning: Repeated augmented rehearsal. In Oh, A. H., Agarwal, A., Belgrave, D., and Cho, K., editors, Advances in Neural Information Processing Systems. Available online [link].

Zhou, D.-W., Wang, Q.-W., Ye, H.-J., and Zhan, D.-C. (2023). A model or 603 exemplars: Towards memory-efficient class-incremental learning. In The Eleventh International Conference on Learning Representations. Available online [link].

Zhu, F., Cheng, Z., Zhang, X.-y., and Liu, C.-l. (2021a). Class-incremental learning via dual augmentation. Advances in Neural Information Processing Systems, 34. Available online [link].

Zhu, F., Zhang, X.-Y., Wang, C., Yin, F., and Liu, C.-L. (2021b). Prototype augmentation and self-supervision for incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5871-5880. Available online [link].

Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., and He, Q. (2021). A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1):43-76. DOI: 10.1109/JPROC.2020.3004555.

Published

2024-08-06

How to Cite

Aleixo, E. L., Colonna, J. G., Cristo, M., & Fernandes, E. (2024). Catastrophic Forgetting in Deep Learning: A Comprehensive Taxonomy. Journal of the Brazilian Computer Society, 30(1), 175–211. https://doi.org/10.5753/jbcs.2024.3966
