
Pre-Retrieval Techniques in RAG: Query Expansion and Query Rewriting

Enhancing RAG System Efficiency with Query Transformation Techniques

Query transformation can play an important role in optimizing a RAG system. Query expansion broadens the search, while query rewriting sharpens its focus, so that fewer relevant documents are missed.

Every RAG system requires three fundamental components: the user query, the retriever, and the generator.

However, creating an efficient RAG system goes beyond these three components and involves optimizing and enhancing each of them. One particularly interesting optimization lies in pre-retrieval techniques, specifically query transformation methods.

Among these methods, two popular techniques stand out:

  • Query Rewriting

  • Query Expansion

Query Expansion

Query expansion involves enhancing the original user query by adding additional terms, phrases, or contextually relevant information. The goal is to increase recall by broadening the query’s scope and ensuring that semantically related documents are retrieved.

A general example:

User query: car repair → vehicle maintenance, auto services
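As a rough illustration of the idea, the sketch below expands a query from a small hand-written table of related terms. The `RELATED_TERMS` table and the `expand_query` function are hypothetical names made up for this example; a real system might draw related terms from a thesaurus, embeddings, or an LLM instead.

```python
# Hypothetical table of related terms, written by hand for illustration only.
RELATED_TERMS = {
    "car": ["vehicle", "auto"],
    "repair": ["maintenance", "service"],
}


def expand_query(query: str, related_terms: dict[str, list[str]]) -> str:
    """Append related terms for each query word, keeping the original words first."""
    words = query.lower().split()
    expanded = list(words)
    for word in words:
        for term in related_terms.get(word, []):
            if term not in expanded:  # avoid adding the same term twice
                expanded.append(term)
    return " ".join(expanded)


print(expand_query("car repair", RELATED_TERMS))
# prints: car repair vehicle auto maintenance service
```

Keeping the original words at the front of the expanded query is a design choice in this sketch, so the user's own wording is never lost.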

Benefits of Query Expansion

  • Improves Recall: Ensures more relevant documents are retrieved, especially those using alternate terminology.

  • Mitigates Vocabulary Mismatch: Bridges gaps between user queries and document terms.

  • Enhances Semantic Understanding: Adds context that improves downstream tasks like document ranking and generation.

Challenges of Query Expansion

  • Over-expansion can reduce precision by introducing irrelevant terms.

  • Increased computational overhead due to additional terms.
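The precision cost of over-expansion can be seen on a toy example. The sketch below uses a made-up four-document corpus, made-up relevance labels, and a deliberately naive keyword-overlap retriever; none of it reflects a particular retrieval library, and the numbers only illustrate the trade-off.

```python
# Hypothetical corpus and relevance labels, for illustration only.
DOCS = {
    "d1": "car repair shop opening hours",
    "d2": "vehicle maintenance checklist",
    "d3": "auto service pricing guide",
    "d4": "customer service phone etiquette",  # not about cars
}
RELEVANT = {"d1", "d2", "d3"}


def retrieve(query_terms, docs):
    """Return ids of documents that share at least one term with the query."""
    terms = set(query_terms)
    return {doc_id for doc_id, text in docs.items() if terms & set(text.split())}


def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a set of retrieved document ids."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


original = ["car", "repair"]
expanded = original + ["vehicle", "maintenance", "auto", "service"]

for name, terms in [("original", original), ("expanded", expanded)]:
    p, r = precision_recall(retrieve(terms, DOCS), RELEVANT)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
# prints:
# original: precision=1.00 recall=0.33
# expanded: precision=0.75 recall=1.00
```

The expanded query finds every relevant document, but the added term "service" also pulls in the unrelated customer service document, which is the kind of precision loss described above.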

Query Rewriting

Query rewriting involves transforming or rephrasing the original user query to improve clarity, specificity, or compatibility with the retrieval system. This process often focuses on making the query better aligned with the format or logic expected by the retrieval mechanism.

A general example:

User query: fix car → Where can I repair my car near me?
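A minimal rule-based sketch of this kind of rewrite is shown below. The single rule in `REWRITE_RULES` and the `rewrite_query` function are hypothetical and exist only to reproduce the example; in practice, rewriting is more often done with a trained model or an LLM prompt that infers intent.

```python
import re

# Hypothetical rewrite rules as (pattern, template) pairs, for illustration only.
REWRITE_RULES = [
    (
        re.compile(r"^(?:fix|repair)\s+(?:my\s+)?(\w+)$"),
        "Where can I repair my {0} near me?",
    ),
]


def rewrite_query(query: str) -> str:
    """Rewrite a terse query into a fuller question if a rule matches."""
    normalized = " ".join(query.lower().split())  # lowercase, collapse spaces
    for pattern, template in REWRITE_RULES:
        match = pattern.match(normalized)
        if match:
            return template.format(*match.groups())
    return query  # no rule matched: leave the query unchanged


print(rewrite_query("fix car"))
# prints: Where can I repair my car near me?
```

Queries that match no rule pass through untouched, so a gap in the rules does not distort the user's wording.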

Benefits of Query Rewriting

  • Improves Precision: Refines the query to better align with user intent.

  • Enhances Retrieval Compatibility: Adapts queries to specific system requirements.

  • Reduces Ambiguity: Helps ensure the system retrieves exactly what the user needs.

Challenges of Query Rewriting

  • Requires sophisticated models or rules to infer intent accurately.

  • May unintentionally narrow the scope of the query, missing relevant results.

Key Differences Between Query Expansion and Query Rewriting

| Aspect | Query Expansion | Query Rewriting |
| --- | --- | --- |
| Definition | Adding related terms or phrases to broaden the query’s scope. | Modifying the query structure or content to improve clarity or alignment. |
| Focus | Increasing recall by broadening the search. | Improving precision and query compatibility. |
| Example (General) | "car repair" → "vehicle maintenance, auto service" | "fix car" → "Where can I repair my car near me?" |

When to Use Each Technique

  • Use Query Expansion when:

    • The query is too narrow and risks missing relevant documents.

    • Semantic diversity is important for comprehensive retrieval.

  • Use Query Rewriting when:

    • The query is ambiguous, poorly structured, or lacks clarity.

    • Alignment with the retrieval system’s format or expectations is required.

Conclusion

Both query expansion and query rewriting are pre-retrieval techniques that can improve the performance of information retrieval systems, especially in RAG pipelines. Query expansion increases recall by broadening the scope of the query, while query rewriting improves precision by refining the query's wording and structure. Together, these techniques help bridge the gap between user intent and what the retriever can match, which supports higher-quality outputs in downstream tasks.
