Comparison of Lexical and Semantic Search in a Retrieval-Augmented Generation (RAG) System

Date
2025
Abstract
Finding information in documents is increasingly difficult because of the sheer number of stored files. This thesis presents a company-specific Retrieval-Augmented Generation (RAG) system that replaces conventional full-text search. Users ask questions, and the system searches a large collection of files for the answers. The system draws on two databases as information sources: Elasticsearch for lexical search and Milvus for vector search. Both are filled with identical data for comparison and are analysed in the course of this thesis. Files to be made searchable are placed in a folder; a service automatically extracts, transforms and inserts their content into the databases. The folder is also monitored, and changes are propagated automatically. When a user asks a question in the front end, significant word groups are identified with the help of artificial intelligence. After further processing, the search is carried out and the system receives a section of a file from the information sources. This section is sent back to the artificial intelligence, which answers the user's question in natural language. A web interface serves as the GUI, and only authorised users have access. Each user has their own chats in which to ask questions, and can also expand the context of the search in the background with their own text. The individual system components are hosted in Docker containers. The backend provides central control: it exposes a REST API and processes all HTTP requests.
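To make the distinction between the two search modes concrete, the following is a minimal sketch in plain Python, not the thesis implementation. It does not call Elasticsearch or Milvus. The toy corpus, the function names, the fixed BM25-style parameters and the character-trigram stand-in for an embedding model are all assumptions chosen for illustration. A real vector search would use learned embeddings, and a real lexical index would typically apply an analyzer with stemming.

```python
import math
from collections import Counter

# Hypothetical toy corpus; in the thesis both databases hold identical data.
DOCS = {
    "a": "Invoices are archived in the finance folder every month.",
    "b": "The backend provides a REST API and processes HTTP requests.",
    "c": "Vacation requests must be approved by the team lead.",
}


def tokenize(text):
    """Lowercase whitespace tokenizer without stemming (an assumption)."""
    return [t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")]


def lexical_scores(query, docs):
    """Simplified BM25-style scoring over exact term matches."""
    k1, b = 1.5, 0.75  # commonly used defaults, fixed here for illustration
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for t in tokens.values() if term in t)
            if df == 0:
                continue  # term occurs in no document
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            f = tf[term]
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(toks) / avg_len))
        scores[doc_id] = score
    return scores


def embed(text):
    """Stand-in embedding: sparse character-trigram counts, not a learned model."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def vector_scores(query, docs):
    """Rank documents by similarity between query and document vectors."""
    q = embed(query)
    return {doc_id: cosine(q, embed(text)) for doc_id, text in docs.items()}


def top_hit(scores):
    """Return the id of the highest-scoring document."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for query in ("Which API handles HTTP requests?", "archive invoice"):
        print(query)
        print("  lexical:", lexical_scores(query, DOCS))
        print("  vector: ", vector_scores(query, DOCS))
```

In this sketch the query "archive invoice" shares no exact token with any document, so every lexical score is zero, while the vector-style comparison still ranks document "a" first because of overlapping trigrams. This only hints at the trade-off the thesis analyses; it says nothing about how the two real systems compare.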