Long-Text Abstractive Summarization using Transformer Models: A Systematic Review
DOI: https://doi.org/10.5753/jbcs.2025.5786

Keywords: Long text summarization, Transformer models, Text summarization, Long documents, Systematic literature review

Abstract
Transformer models have significantly advanced abstractive summarization, achieving near-human performance on short texts. Summarizing long texts, however, remains a challenge. This systematic review analyzes 56 studies on transformer-based long-text abstractive summarization published between 2017 and 2024, selected according to predefined inclusion criteria. Findings indicate that 69.64% of the studies adopt a hybrid approach, while 30.36% focus on improving transformer attention mechanisms. News articles and scientific papers are the most studied domains, and widely used datasets include CNN/Daily Mail, PubMed, arXiv, GovReport, QMSum, and XSum. ROUGE is the dominant evaluation metric (61%), followed by BERTScore (20%), with BARTScore, human evaluation, METEOR, and BLEU-4 also used. Despite this progress, challenges persist, including loss of contextual information, high computational cost, implementation complexity, the lack of standardized evaluation metrics, and limited model generalization. These findings highlight the need for more robust hybrid approaches, efficient attention mechanisms, and standardized evaluation frameworks. This review provides a comprehensive analysis of existing methods, datasets, and evaluation techniques, identifying research gaps and offering insights for future advances in transformer-based long-text abstractive summarization.
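Most of the hybrid approaches surveyed follow an extract-or-chunk-then-abstract pattern: the long input is first reduced to pieces that fit a transformer's context window, and an abstractive model then summarizes those pieces. The Python sketch below illustrates one minimal variant of this pattern using the Hugging Face transformers summarization pipeline; the model choice, chunk size, and length limits are illustrative assumptions, not settings taken from any surveyed study.

# A minimal chunk-then-summarize sketch (assumed parameters, not a surveyed method).
from transformers import pipeline

# BART accepts roughly 1,024 tokens, so ~500-word chunks fit its context window.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_long(text: str, chunk_words: int = 500) -> str:
    # Split the document into word-bounded chunks.
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    # Summarize each chunk independently with the abstractive model.
    partial = [summarizer(c, max_length=80, min_length=20, do_sample=False)[0]["summary_text"]
               for c in chunks]
    # A second abstractive pass condenses the concatenated partial summaries.
    combined = " ".join(partial)
    return summarizer(combined, max_length=120, min_length=40, do_sample=False)[0]["summary_text"]

For very long inputs, the concatenated partial summaries may themselves exceed the context window; multi-stage frameworks such as Summ^n (Zhang et al., 2022) address this by repeating the coarse summarization stage until the intermediate text fits.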
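Because ROUGE (Lin, 2004) is the dominant metric among the surveyed studies, the short sketch below shows how a candidate summary is typically scored against a reference using the open-source rouge-score package; the texts are illustrative placeholders.

# Scoring a candidate summary with ROUGE-1 and ROUGE-L (pip install rouge-score).
from rouge_score import rouge_scorer

reference = "Transformer models have advanced abstractive summarization."
candidate = "Transformers have significantly improved abstractive summarization."

# ROUGE-1 counts unigram overlap; ROUGE-L uses the longest common subsequence.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
for name, score in scorer.score(reference, candidate).items():
    print(f"{name}: P={score.precision:.3f} R={score.recall:.3f} F1={score.fmeasure:.3f}")

BERTScore, the second most common metric, replaces this n-gram overlap with token-level similarity of contextual embeddings, which rewards paraphrases that ROUGE misses.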
References
Aksenov, D., Moreno-Schneider, J., Bourgonje, P., Schwarzenberg, R., Hennig, L., and Rehm, G. (2020). Abstractive text summarization based on language model conditioning and locality modeling. arXiv preprint arXiv:2003.13027. DOI: 10.48550/arxiv.2003.13027.
Akter, M., Çano, E., Weber, E., Dobler, D., and Habernal, I. (2025). A comprehensive survey on legal summarization: Challenges and future directions. arXiv preprint arXiv:2501.17830. DOI: 10.48550/arxiv.2501.17830.
Altmami, N. I. and Menai, M. E. B. (2022). Automatic summarization of scientific articles: A survey. Journal of King Saud University-Computer and Information Sciences, 34(4):1011-1028. DOI: 10.1016/j.jksuci.2020.04.020.
Angioi, M. and Hiller, C. E. (2023). Systematic literature reviews. In Research Methods in the Dance Sciences, M. Angioi and C. E. Hiller, Eds. University Press of Florida, pages 265-280. DOI: 10.2307/j.ctv33jb41z.24.
Bajaj, A., Dangati, P., Krishna, K., Kumar, P. A., Uppaal, R., Windsor, B., Brenner, E., Dotterrer, D., Das, R., and McCallum, A. (2021). Long document summarization in a low resource setting using pretrained language models. arXiv preprint arXiv:2103.00751. DOI: 10.18653/v1/2021.acl-srw.7.
Benedetto, I., La Quatra, M., Cagliero, L., Vassio, L., and Trevisan, M. (2024). Tasp: Topic-based abstractive summarization of facebook text posts. Expert Systems with Applications, 255:124567. DOI: 10.1016/j.eswa.2024.124567.
Bettayeb, M., Halawani, Y., Khan, M. U., Saleh, H., and Mohammad, B. (2024). Efficient memristor accelerator for transformer self-attention functionality. Scientific Reports, 14(1):24173. DOI: 10.1038/s41598-024-75021-z.
Calizzano, R., Ostendorff, M., Ruan, Q., and Rehm, G. (2022). Generating extended and multilingual summaries with pre-trained transformers. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1640-1650. Available online [link].
Cao, S. and Wang, L. (2022). Hibrids: Attention with hierarchical biases for structure-aware long document summarization. arXiv preprint arXiv:2203.10741. DOI: 10.18653/v1/2022.acl-long.58.
Cao, S. and Wang, L. (2023). Awesome: Gpu memory-constrained long document summarization using memory mechanism and global salient content. arXiv preprint arXiv:2305.14806. DOI: 10.18653/v1/2024.naacl-long.330.
Chandrashekar, G. and Sahin, F. (2014). A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16-28. DOI: 10.1016/j.compeleceng.2013.11.024.
Chaves, A., Kesiku, C., and García-Zapirain, B. (2022). Automatic text summarization of biomedical text data: A systematic review.
Chen, W. and Iwaihara, M. (2023). Efficient summarization of long documents using hybrid extractive-abstractive method. Available online [link].
Chen, Y., Wan, Z., Li, Y., He, X., Wei, X., and Han, J. (2024). Graph curvature flow-based masked attention. Journal of Chemical Information and Modeling, 64(21):8153-8163. DOI: 10.1021/acs.jcim.4c01616.
Chu, C.-L., Chen, Y.-C., Cheng, W., Lin, C., and Chang, Y.-H. (2024). Attentionrc: A novel approach to improve locality sensitive hashing attention on dual-addressing memory. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 43(11):3925-3936. DOI: 10.1109/tcad.2024.3447217.
Dai, W. and He, Q. (2024). Automatic summarization model based on clustering algorithm. Scientific Reports, 14(1):15302. DOI: 10.21203/rs.3.rs-3992927/v1.
Dat, D. H., Anh, D. D., Luu, A. T., and Buntine, W. (2024). Discrete diffusion language model for long text summarization. arXiv preprint arXiv:2407.10998. DOI: 10.48550/arxiv.2407.10998.
Ezzat, Z., Khalfallah, A., and Khoriba, G. (2024). Fused transformers: Fused information of arabic long article for summarization. Procedia Computer Science, 244:96-104. DOI: 10.1016/j.procs.2024.10.182.
Fikri, F. B., Oflazer, K., and Yanıkoğlu, B. (2024). Abstractive summarization with deep reinforcement learning using semantic similarity rewards. Natural Language Engineering, 30(3):554-576. DOI: 10.1017/s1351324923000505.
Fu, X. (2024). Transformer models in text summarization. Applied and Computational Engineering, 101(1):35-41. DOI: 10.54254/2755-2721/101/20240946.
Gokhan, T., Price, M. J., and Lee, M. (2024). Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models. Artificial Intelligence Review, 57(7):189. DOI: 10.1007/s10462-024-10828-w.
Hardy, H., Ballesteros, M., Ladhak, F., Khalifa, M., Castelli, V., and McKeown, K. (2022). Novel chapter abstractive summarization using spinal tree aware sub-sentential content selection. arXiv preprint arXiv:2211.04903. DOI: 10.48550/arxiv.2211.04903.
He, Y. (2024). Research on the optimization model of semantic coherence and fluency in language translation. Applied Mathematics and Nonlinear Sciences, 9(1). DOI: 10.2478/amns-2024-2769.
Huang, L., Cao, S., Parulian, N., Ji, H., and Wang, L. (2021a). Efficient attentions for long document summarization. arXiv preprint arXiv:2104.02112. DOI: 10.18653/v1/2021.naacl-main.112.
Huang, Y., Li, Z., Chen, Z., Zhang, C., and Ma, H. (2024). Sentence salience contrastive learning for abstractive text summarization. Neurocomputing, 593:127808. DOI: 10.1016/j.neucom.2024.127808.
Huang, Y., Sun, L., Han, C., and Guo, J. (2023). A high-precision two-stage legal judgment summarization. Mathematics, 11(6):1320. DOI: 10.3390/math11061320.
Huang, Y., Yu, Z., Guo, J., Xiang, Y., and Xian, Y. (2021b). Element graph-augmented abstractive summarization for legal public opinion news with graph transformer. Neurocomputing, 460:166-180. DOI: 10.1016/j.neucom.2021.07.013.
Jain, D., Borah, M. D., and Biswas, A. (2021). Summarization of legal documents: Where are we now and the way forward. Computer Science Review, 40:100388. DOI: 10.1016/j.cosrev.2021.100388.
Jain, D., Borah, M. D., and Biswas, A. (2024). Summarization of lengthy legal documents via abstractive dataset building: An extract-then-assign approach. Expert Systems with Applications, 237:121571. DOI: 10.1016/j.eswa.2023.121571.
Jeeson-Daniel, A., Lin, C., and Lau, E. (2021). Abstractive summarization using longformer-pegasus. Available online [link].
Karlbom, H. and Clifton, A. (2020). Abstractive podcast summarization using bart with longformer attention. In The 29th Text Retrieval Conference (TREC) notebook. NIST. Available online [link].
Kashyap, P. (2022). Coling 2022 shared task: LED finetuning and recursive summary generation for automatic summarization of chapters from novels. In Proceedings of The Workshop on Automatic Summarization for Creative Writing, pages 19-23. Available online [link].
Kirstein, F., Wahle, J. P., Gipp, B., and Ruas, T. (2025). Cads: A systematic literature review on the challenges of abstractive dialogue summarization. Journal of Artificial Intelligence Research, 82:313-365. DOI: 10.1613/jair.1.16674.
Kiruluta, A., Lemos, A., and Lundy, E. (2021). New approaches to long document summarization: Fourier transform based attention in a transformer model. arXiv preprint arXiv:2111.15473. DOI: 10.48550/arXiv.2111.15473.
Koh, H. Y., Ju, J., Liu, M., and Pan, S. (2023). An empirical survey on long document summarization: Datasets, models, and metrics. ACM Computing Surveys, 55(8):1-35. DOI: 10.1145/3545176.
Kumar, S., Kohli, G. S., Shinde, K., and Ekbal, A. (2022). Team ainlpml@ mup in sdp 2021: scientific document summarization by end-to-end extractive and abstractive approach. In Proceedings of the Third Workshop on Scholarly Document Processing, pages 285-290. Available online [link].
Liao, P., Zhang, C., Chen, X., and Zhou, X. (2020). Improving abstractive text summarization with history aggregation. In 2020 International Joint Conference on Neural Networks (IJCNN), pages 1-9. IEEE. DOI: 10.1109/ijcnn48605.2020.9207502.
Lim, J. and Song, H.-J. (2023). Improving multi-stage long document summarization with enhanced coarse summarizer. In Proceedings of the 4th New Frontiers in Summarization Workshop, pages 135-144. DOI: 10.18653/v1/2023.newsum-1.13.
Lin, C.-Y. (2004). Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74-81. Available online [link].
Liu, D., Hong, X., Lin, P.-J., Chang, E., and Demberg, V. (2022a). Two-stage movie script summarization: An efficient method for low-resource long document summarization. In Proceedings of The Workshop on Automatic Summarization for Creative Writing, pages 57-66. Available online [link].
Liu, S., Cao, J., Li, Y., Yang, R., and Wen, Z. (2024). Low-resource court judgment summarization for common law systems. Information Processing & Management, 61(5):103796. DOI: 10.1016/j.ipm.2024.103796.
Liu, X. and Xu, Y. (2023). Learning to rank utterances for query-focused meeting summarization. arXiv preprint arXiv:2305.12753. DOI: 10.18653/v1/2023.findings-acl.538.
Liu, Y., Ni, A., Nan, L., Deb, B., Zhu, C., Awadallah, A. H., and Radev, D. (2022b). Leveraging locality in abstractive text summarization. arXiv preprint arXiv:2205.12476. DOI: 10.18653/v1/2022.emnlp-main.408.
Liu, Z. and Chen, N. F. (2022). Dynamic sliding window modeling for abstractive meeting summarization. In INTERSPEECH, pages 5150-5154. Available online [link].
Lu, G., Larcher, S. B., and Tran, T. (2023). Hybrid long document summarization using c2f-far and chatgpt: A practical study. arXiv preprint arXiv:2306.01169. DOI: 10.48550/arxiv.2306.01169.
Ma, Y. and Zong, L. (2020). Neural abstractive multi-document summarization: Hierarchical or flat structure? In Proceedings of the Second International Workshop of Discourse Processing, pages 29-37. DOI: 10.18653/v1/2020.iwdp-1.6.
Madan, S., Lentzen, M., Brandt, J., Rueckert, D., Hofmann-Apitius, M., and Fröhlich, H. (2024). Transformer models in biomedicine. BMC Medical Informatics and Decision Making, 24(1):214. DOI: 10.1186/s12911-024-02600-5.
Mei, A., Kabir, A., Bapat, R., Judge, J., Sun, T., and Wang, W. Y. (2022). Learning to prioritize: Precision-driven sentence filtering for long text summarization. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 313-318. Available online [link].
Moro, G. and Ragazzi, L. (2023). Align-then-abstract representation learning for low-resource summarization. Neurocomputing, 548:126356. DOI: 10.1016/j.neucom.2023.126356.
Nguyen, H. and Ding, J. (2023). Keyword-based augmentation method to enhance abstractive summarization for legal documents. In Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law, pages 437-441. DOI: 10.1145/3594536.3595120.
Obonyo, I., Casola, S., and Saggion, H. (2022). Exploring the limits of a base bart for multi-document summarization in the medical domain. In Proceedings of the Third Workshop on Scholarly Document Processing, pages 193-198. Available online [link].
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., and Moher, D. (2021). Updating guidance for reporting systematic reviews: development of the prisma 2020 statement. Journal of Clinical Epidemiology, 134:103-112. DOI: 10.31222/osf.io/jb4dx.
Pang, B., Nijkamp, E., Kryściński, W., Savarese, S., Zhou, Y., and Xiong, C. (2022). Long document summarization with top-down and bottom-up inference. arXiv preprint arXiv:2203.07586. DOI: 10.18653/v1/2023.findings-eacl.94.
Pant, M. and Chopra, A. (2022). Multilingual financial documentation summarization by team_tredence for fns2022. In Proceedings of the 4th Financial Narrative Processing Workshop@ LREC2022, pages 112-115. Available online [link].
Patel, D., Saxena, A. K., and Dubey, A. (2025). A particle swarm optimization based model for feature selection. In Innovative and Intelligent Digital Technologies; Towards an Increased Efficiency: Volume 2, pages 329-340. Springer. DOI: 10.1007/978-3-031-71649-2_28.
Petticrew, M. and Roberts, H. (2008). Systematic reviews in the social sciences: A practical guide. John Wiley & Sons. Book.
Pilault, J., Li, R., Subramanian, S., and Pal, C. (2020). On extractive and abstractive neural document summarization with transformer language models. In Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), pages 9308-9319. DOI: 10.18653/v1/2020.emnlp-main.748.
Pu, D., Wang, Y., and Demberg, V. (2024). Incorporating distributions of discourse structure for long document abstractive summarization. arXiv preprint arXiv:2305.16784. DOI: 10.18653/v1/2023.acl-long.306.
Rahman, S., Labib, M. A., Murad, H., and Das, U. (2024). Cuet_sstm at the gem’24 summarization task: Integration of extractive and abstractive method for long text summarization in swahili language. In Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges, pages 112-117. DOI: 10.18653/v1/2024.inlg-genchal.12.
Ranggianto, N. A., Purwitasari, D., Fatichah, C., and Sholikah, R. W. (2023). Abstractive and extractive approaches for summarizing multi-document travel reviews. Jurnal RESTI, 7(6):1464-1475. DOI: 10.29207/resti.v7i6.5170.
Rennard, V., Shang, G., Hunter, J., and Vazirgiannis, M. (2023). Abstractive meeting summarization: A survey. Transactions of the Association for Computational Linguistics, 11:861-884. DOI: 10.1162/tacl_a_00578.
Rethlefsen, M. L., Kirtley, S., Waffenschmidt, S., Ayala, A. P., Moher, D., Page, M. J., and Koffel, J. B. (2021). Prisma-s: an extension to the prisma statement for reporting literature searches in systematic reviews. Systematic Reviews, 10(1):39. DOI: 10.31219/osf.io/sfc38.
Saleh, M. E., Wazery, Y. M., and Ali, A. A. (2024). A systematic literature review of deep learning-based text summarization: Techniques, input representation, training strategies, mechanisms, datasets, evaluation, and challenges. Expert Systems with Applications, 252:124153. DOI: 10.1016/j.eswa.2024.124153.
Sanchan, N. (2024). Comparative study on automated reference summary generation using bert models and rouge score assessment. Journal of Current Science and Technology, 14(2):26-26. DOI: 10.59796/jcst.v14n2.2024.26.
Saxena, R. and Keller, F. (2024). Select and summarize: Scene saliency for movie script summarization. arXiv preprint arXiv:2404.03561. DOI: 10.18653/v1/2024.findings-naacl.218.
Steblianko, O., Shymkovych, V., Kravets, P., Novatskyi, A., and Shymkovych, L. (2024). Scientific article summarization model with unbounded input length. Information, Computing and Intelligent Systems, (5):150-158. DOI: 10.20535/2786-8729.5.2024.314724.
Sun, W., Fang, C., Chen, Y., Zhang, Q., Tao, G., You, Y., Han, T., Ge, Y., Hu, Y., Luo, B., et al. (2024). An extractive-and-abstractive framework for source code summarization. ACM Transactions on Software Engineering and Methodology, 33(3):1-39. DOI: 10.1145/3632742.
Tretyak, V. and Stepanov, D. (2020). Combination of abstractive and extractive approaches for summarization of long scientific texts. arXiv preprint arXiv:2006.05354. DOI: 10.48550/arxiv.2006.05354.
Ulker, M. and Ozer, A. B. (2024). Abstractive summarization model for summarizing scientific article. IEEE Access, 12:91252-91262. DOI: 10.1109/access.2024.3420163.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30. DOI: 10.48550/arxiv.1706.03762.
Wang, G., Garg, P., and Wu, W. (2024a). Segmented summarization and refinement: A pipeline for long-document analysis on social media. Journal of Social Computing, 5(2):132-144. DOI: 10.23919/jsc.2024.0010.
Wang, T., Yang, C., Zou, M., Liang, J., Xiang, D., Yang, W., Wang, H., and Li, J. (2024b). A study of extractive summarization of long documents incorporating local topic and hierarchical information. Scientific Reports, 14(1):10140. DOI: 10.1038/s41598-024-60779-z.
Wibawa, A. P., Kurniawan, F., et al. (2024). A survey of text summarization: Techniques, evaluation and challenges. Natural Language Processing Journal, 7:100070. DOI: 10.1016/j.nlp.2024.100070.
Wilman, P., Atara, T., and Suhartono, D. (2024). Abstractive english document summarization using bart model with chunk method. Procedia Computer Science, 245:1010-1019. DOI: 10.1016/j.procs.2024.10.329.
Wu, W., Li, W., Xiao, X., Liu, J., Cao, Z., Li, S., Wu, H., and Wang, H. (2021). Bass: Boosting abstractive summarization with unified semantic graph. arXiv preprint arXiv:2105.12041. DOI: 10.18653/v1/2021.acl-long.472.
Wu, Y., Li, H., Nenadic, G., and Zeng, X.-J. (2024). Extract-and-abstract: Unifying extractive and abstractive summarization within single encoder-decoder framework. arXiv preprint arXiv:2409.11827. DOI: 10.48550/arxiv.2409.11827.
Wu, Y.-H., Lin, Y.-J., and Kao, H.-Y. (2023). Ikm_lab at biolaysumm task 1: Longformer-based prompt tuning for biomedical lay summary generation. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 602-610. DOI: 10.18653/v1/2023.bionlp-1.64.
Xue, B., Zhang, M., Browne, W. N., and Yao, X. (2015). A survey on evolutionary computation approaches to feature selection. IEEE Transactions on Evolutionary Computation, 20(4):606-626. DOI: 10.26686/wgtn.14214497.v1.
Ying, S., Zhao, Z. Y., and Zou, W. (2021). Longsumm 2021: Session based automatic summarization model for scientific document. In Proceedings of the Second Workshop on Scholarly Document Processing, pages 97-102. DOI: 10.18653/v1/2021.sdp-1.12.
You, Z., Radhakrishna, S., Ming, S., and Kilicoglu, H. (2024). Uiuc_bionlp at biolaysumm: an extract-then-summarize approach augmented with wikipedia knowledge for biomedical lay summarization. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 132-143. DOI: 10.18653/v1/2024.bionlp-1.11.
Yu, T., Ji, Z., and Fung, P. (2023). Improving query-focused meeting summarization with query-relevant knowledge. arXiv preprint arXiv:2309.02105. DOI: 10.18653/v1/2023.findings-ijcnlp.5.
Yuan, R., Wang, Z., Cao, Z., and Li, W. (2023). Preserve context information for extract-generate long-input summarization framework. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 13932-13939. DOI: 10.1609/aaai.v37i11.26631.
Zhang, M., Lu, J., Yang, J., Zhou, J., Wan, M., and Zhang, X. (2024). From coarse to fine: Enhancing multi-document summarization with multi-granularity relationship-based extractor. Information Processing & Management, 61(3):103696. DOI: 10.1016/j.ipm.2024.103696.
Zhang, M., Zhou, G., Yu, W., and Liu, W. (2021). Ki-habs: Key information guided hierarchical abstractive summarization. KSII Transactions on Internet & Information Systems, 15(12). Available online [link].
Zhang, X., Meng, K., and Liu, G. (2019). Hie-transformer: a hierarchical hybrid transformer for abstractive article summarization. In International Conference on Neural Information Processing, pages 248-258. Springer. DOI: 10.1007/978-3-030-36718-3_21.
Zhang, Y., Ni, A., Mao, Z., Wu, C. H., Zhu, C., Deb, B., Awadallah, A. H., Radev, D., and Zhang, R. (2022). Summ^n: A multi-stage summarization framework for long input dialogues and documents. arXiv preprint arXiv:2110.10150. DOI: 10.48550/arXiv.2110.10150.
Zhao, Y., Saleh, M., and Liu, P. J. (2020). Seal: Segment-wise extractive-abstractive long-form text summarization. arXiv preprint arXiv:2006.10213. DOI: 10.48550/arxiv.2006.10213.
License
Copyright (c) 2025 Abubakar Salisu Bashir, Abdulkadir Abubakar Bichi, Usman Mahmud, Abdulrahman Mohammed Bello

This work is licensed under a Creative Commons Attribution 4.0 International License.

