A Review of Multi-Layer Semantic Chunking and Enhanced Retrieval-Augmented Generation Frameworks

Pratiksha

Authors

Pratiksha, Department of Computer Science, PCTE Institute of Engineering and Technology, Ludhiana, Punjab, India

Keywords:

Retrieval-Augmented Generation, Large Language Models, Semantic Chunking, Dense Retrieval, Embeddings, Knowledge-Intensive NLP

Abstract

Retrieval-Augmented Generation (RAG) has become a standard approach for improving the output quality, factual accuracy, transparency, and domain adaptability of Large Language Models (LLMs) by incorporating external knowledge retrieval during generation. Although traditional RAG architectures have shown significant improvements over purely parametric language models, they retain notable limitations: fixed-size text chunking, shallow semantic representations, and retrieval of irrelevant or noisy information. These limitations are most pronounced in long-context, narrative-based, and multi-hop reasoning tasks. This review article analyzes recent developments in RAG architectures, focusing on semantics-aware retrieval schemes such as multi-layer semantic chunking, improved embedding methods, and intelligent outlier handling. The MERCED RAG framework, as presented in recent literature, receives special attention as a representative of next-generation RAG systems that seek to address the shortcomings of traditional approaches through hierarchical representations and relevance-aware retrieval pipelines. The article summarizes the available methodologies, experimental results, evaluation metrics, limitations, and research gaps, and presents a structured outlook on future directions for scalable, robust, and reliable retrieval-augmented generation systems.

DOI: https://doi.org/10.24321/3051.4304.202509

