Complete RAG Guide: 4 Methods to Connect Your Agents with Data
Learn when to use filters, SQL, full context, or vectors so your AI agents respond accurately.
When your AI agent doesn't respond correctly, the problem is almost always in how it accesses data. This guide teaches you the 4 main RAG (Retrieval-Augmented Generation) methods and when to use each one for accurate responses.
Important: Not everything needs a vector database. Choosing the right method can make your agent faster, cheaper, and more accurate.
Prerequisites
Before starting, you'll need:
- Basic knowledge of AI agents
- Familiarity with tools like n8n, LangChain, or similar
- Understanding the difference between structured and unstructured data
- Estimated reading time: 15 minutes
The Most Common Mistake
When developers discover their agent needs external data, they run straight to implementing a vector database. But this can be a costly mistake.
The Problem with Vectors
Vector-based retrieval works by splitting documents into "chunks" and searching semantically among them. This causes several problems:
- Loss of context: The agent doesn't understand the complete document
- No metadata by default: Unless you tag chunks, the agent doesn't know which document each chunk comes from
- Bad for tabular data: Can't calculate averages, totals, or trends
- Incomplete summaries: It only summarizes the chunks it found, not the entire document
Real Example of the Problem
Imagine you have sales data and ask: "What week did we have the most sales?"
With chunk-based retrieval:
- The agent searches for "most sales" semantically
- Finds a chunk with some weeks
- Responds "Week 6" (the best in that chunk)
- But weeks 4, 14, and 19 had more sales - they were in other chunks
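A toy illustration of this failure mode. The sales figures and the chunk boundaries are made up; they are chosen only so that weeks 4, 14, and 19 outsell week 6, as in the example above.

```python
# Hypothetical weekly sales; only weeks 4, 6, 14 and 19 stand out.
weekly_sales = {week: 100 for week in range(1, 21)}
weekly_sales.update({4: 180, 6: 150, 14: 190, 19: 210})


def best_week(sales):
    """Return the week number with the highest sales in the given mapping."""
    return max(sales, key=sales.get)


# Assume retrieval returned a single chunk that covers weeks 5-8 only.
retrieved_chunk = {week: weekly_sales[week] for week in range(5, 9)}

answer_from_chunk = best_week(retrieved_chunk)  # what the agent can see
answer_from_all = best_week(weekly_sales)       # what the full data says

print(answer_from_chunk, answer_from_all)
```

The agent's answer is correct for the chunk it received and wrong for the dataset, which is why the error is hard to spot from the response alone.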
Method 1: Filters
The simplest and most underrated method. Works like Excel spreadsheet filters.
When to Use It
- Structured data in rows and columns
- You know exactly which fields you want to filter
- The question is answered with a small subset of records
Practical Example
Question: "How many Bluetooth speakers did we sell on September 16th?"
Agent process:
- Filter product = "Bluetooth Speaker"
- Filter date = "2024-09-16"
- Sum the quantities
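The three steps above can be sketched in a few lines of Python. The rows and the `units_sold` helper are hypothetical; in a tool like n8n the same logic would typically live in a data-table or spreadsheet node rather than in code.

```python
# Hypothetical rows; in practice these come from a spreadsheet or data table.
sales = [
    {"product": "Bluetooth Speaker", "date": "2024-09-16", "quantity": 3},
    {"product": "Bluetooth Speaker", "date": "2024-09-16", "quantity": 2},
    {"product": "Bluetooth Speaker", "date": "2024-09-17", "quantity": 4},
    {"product": "Phone Case", "date": "2024-09-16", "quantity": 7},
]


def units_sold(rows, product, day):
    """Exact-match filter on product and date, then sum the quantities."""
    return sum(
        row["quantity"]
        for row in rows
        if row["product"] == product and row["date"] == day
    )


total = units_sold(sales, "Bluetooth Speaker", "2024-09-16")
print(total)
```

Only the two matching rows ever reach the sum, so the model never has to read the rest of the table.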
Advantages
| Aspect | Benefit |
|---|---|
| Speed | Very fast |
| Cost | Very cheap (few tokens) |
| Accuracy | High (exact search) |
| Scalability | Good for large datasets |
Important Configuration
In the system prompt, you need to specify valid options:
Valid products: ["Wireless Headphones", "Bluetooth Speaker", "Phone Case"]
Date format: YYYY-MM-DD
If the agent writes "bluetooth speaker" (lowercase), the filter won't work because it's not semantic search, it's exact matching.
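One optional safeguard, beyond listing the valid options in the prompt, is to normalize the agent's values before they reach the filter. This is a sketch under assumptions: `canonical_product` and `canonical_date` are hypothetical helpers, not part of any tool named in this guide.

```python
from datetime import datetime

VALID_PRODUCTS = ["Wireless Headphones", "Bluetooth Speaker", "Phone Case"]


def canonical_product(raw):
    """Map an agent-supplied value to its exact spelling, ignoring case
    and surrounding whitespace. Raise if the product is unknown."""
    lookup = {name.casefold(): name for name in VALID_PRODUCTS}
    key = raw.strip().casefold()
    if key not in lookup:
        raise ValueError(f"Unknown product {raw!r}; valid options: {VALID_PRODUCTS}")
    return lookup[key]


def canonical_date(raw):
    """Parse a YYYY-MM-DD date and return it zero-padded.
    Raises ValueError if the value does not follow the format."""
    return datetime.strptime(raw.strip(), "%Y-%m-%d").date().isoformat()


print(canonical_product("bluetooth speaker"), canonical_date("2024-09-16"))
```

Raising on unknown values gives the agent an explicit error to react to, instead of an empty result that looks like "no sales".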
Golden rule: If a human would use filters in Excel to answer, use filters in your agent.
Method 2: SQL Queries
When you need the database to do the heavy lifting: calculations, groupings, sorting.
When to Use It
- You need totals, averages, rankings, or trends
- The question involves many rows
- You need to combine or compare data from multiple tables
Practical Example
Question: "What are our 3 highest-revenue products?"
Query generated by the agent:
SELECT product, SUM(total_price) as total_revenue
FROM sales_data
GROUP BY product
ORDER BY total_revenue DESC
LIMIT 3;
SQL does all the work: sums, groups, sorts, and limits. The agent only interprets the result.
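To try the query end to end, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The rows are invented for illustration, and the table layout follows the column list used in this guide; a real agent would run the query against your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, customer_name TEXT, "
    "product TEXT, quantity INTEGER, unit_price REAL, total_price REAL, date TEXT)"
)
# Hypothetical sample rows.
rows = [
    (1, "Ana", "Wireless Headphones", 2, 50.0, 100.0, "2024-09-15"),
    (2, "Ben", "Bluetooth Speaker", 1, 80.0, 80.0, "2024-09-16"),
    (3, "Cy", "Phone Case", 3, 10.0, 30.0, "2024-09-16"),
    (4, "Dee", "Bluetooth Speaker", 2, 80.0, 160.0, "2024-09-17"),
    (5, "Eli", "USB Cable", 4, 5.0, 20.0, "2024-09-17"),
]
conn.executemany("INSERT INTO sales_data VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT product, SUM(total_price) AS total_revenue
FROM sales_data
GROUP BY product
ORDER BY total_revenue DESC
LIMIT 3;
"""
# The database aggregates and sorts; only three rows go back to the agent.
top_products = conn.execute(query).fetchall()
print(top_products)
```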
Advantages over Filters
- The database does the calculations (more reliable than AI)
- Can process millions of rows without bringing them all to the agent
- Cheaper because it sends less data to the model
System Prompt Configuration
Available tables: sales_data
Columns: order_id, customer_name, product, quantity, unit_price, total_price, date
Examples of valid queries:
- SELECT product, COUNT(*) FROM sales_data GROUP BY product
- SELECT AVG(total_price) FROM sales_data WHERE date > '2024-01-01'
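If the schema changes often, the table and column lines of the prompt can be generated from the database instead of typed by hand. The sketch below assumes SQLite and a hypothetical `schema_prompt` helper; other databases expose their schema differently.

```python
import sqlite3


def schema_prompt(conn, table):
    """Build the table/column part of a system prompt from the live schema.
    `table` must be a trusted, developer-supplied name (it is interpolated)."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return f"Available tables: {table}\nColumns: {', '.join(columns)}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (order_id INTEGER, customer_name TEXT, "
    "product TEXT, quantity INTEGER, unit_price REAL, total_price REAL, date TEXT)"
)
prompt = schema_prompt(conn, "sales_data")
print(prompt)
```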
Golden rule: If a human would use pivot tables or formulas, use SQL.
Method 3: Full Context
Sometimes, the best solution is to let the agent read the entire document.
When to Use It
- You need summaries, timelines, or step-by-step explanations
- The order of information matters
- The dataset is small enough to fit in the context window
3 Ways to Implement It
1. Tools to choose documents
The agent has tools to select which documents to read:
Available tools:
- read_transcript_video_a()
- read_transcript_video_b()
Advantage: Only reads what it needs.
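A rough sketch of how such tools might be wired up. The transcripts, the `TOOLS` registry, and `call_tool` are hypothetical; agent frameworks each have their own way of registering tools.

```python
# Hypothetical in-memory documents standing in for real transcripts.
TRANSCRIPTS = {
    "video_a": "Transcript of video A: introduction to filters.",
    "video_b": "Transcript of video B: introduction to SQL tools.",
}


def read_transcript_video_a():
    """Return the complete transcript of video A."""
    return TRANSCRIPTS["video_a"]


def read_transcript_video_b():
    """Return the complete transcript of video B."""
    return TRANSCRIPTS["video_b"]


# Registry the agent chooses from; names match the tool list above.
TOOLS = {
    "read_transcript_video_a": read_transcript_video_a,
    "read_transcript_video_b": read_transcript_video_b,
}


def call_tool(name):
    """Run the tool the agent asked for; reject names that are not registered."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name]()


document = call_tool("read_transcript_video_a")
print(document)
```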
2. Direct context in the prompt
System: You have access to these documents:
[DOCUMENT 1]
{complete content of document 1}
[DOCUMENT 2]
{complete content of document 2}
Advantage: Faster responses (doesn't call tools).
Disadvantage: Always processes all tokens.
3. Dynamically loaded documents
Each time the agent responds, updated documents are loaded as variables.
Advantage: Always updated content without editing the prompt.
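One way to picture dynamic loading: rebuild the prompt from disk on every request. This is a sketch under assumptions; the folder of `.txt` files, the file name, and `build_system_prompt` are hypothetical, and in n8n the same idea would usually be expressed with variables in the prompt field.

```python
import tempfile
from pathlib import Path


def build_system_prompt(doc_dir):
    """Re-read every .txt file in doc_dir and inline it into the prompt."""
    parts = ["You have access to these documents:"]
    for i, path in enumerate(sorted(Path(doc_dir).glob("*.txt")), start=1):
        parts.append(f"[DOCUMENT {i}]\n{path.read_text(encoding='utf-8')}")
    return "\n".join(parts)


# Demo with a temporary folder: the prompt follows the file's current content.
with tempfile.TemporaryDirectory() as tmp:
    pricing = Path(tmp, "pricing.txt")
    pricing.write_text("Plan A costs $10.", encoding="utf-8")
    first = build_system_prompt(tmp)
    pricing.write_text("Plan A costs $12.", encoding="utf-8")
    second = build_system_prompt(tmp)

print(first)
print(second)
```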
Cost Comparison
| Implementation | Average Tokens |
|---|---|
| Tools (1 doc) | ~4,000 |
| Everything in prompt | ~6,500+ |
| Vector chunks | ~2,600 |
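The gap widens with every document you add: putting everything in the prompt pays for all documents on every request, while tools and vector retrieval pay only for what they fetch.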
Golden rule: If a human would read the whole document before answering, the agent should too.
Method 4: Vector Database
The best-known method, but not always the best. Ideal for specific searches in large volumes of data.
How It Works
- Chunking: Documents are divided into fragments (e.g., 500 tokens each)
- Embedding: Each chunk is converted into a numerical vector
- Semantic search: The agent searches for chunks similar to the question
- Retrieval: The N most relevant chunks are returned
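The four steps can be sketched in plain Python. Note the big assumption: `embed` here is a toy word-count vector, not a real embedding model, so it only matches shared words rather than meaning. It is meant to show the shape of the pipeline, not its quality.

```python
import math
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of `size` words (real systems use e.g. 500 tokens)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, n=2):
    """Return the n chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:n]


# Hypothetical 24-word document, so it splits into three 8-word chunks.
document = (
    "reset the router by holding the power button "
    "refunds are issued within five business days total "
    "shipping to europe takes about one week usually"
)
chunks = chunk(document)
top = retrieve("how do I reset the router", chunks, n=1)
print(top)
```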
When to Use It
- Very large knowledge bases (thousands of documents)
- Specific questions answered with isolated fragments
- FAQs where one answer doesn't depend on another
- When cost and speed matter more than complete context
When NOT to Use It
- Tabular data (sales, metrics, inventory)
- When you need summaries of complete documents
- When order or sequence matters
- Comparisons between different parts of the same document
Improving Results
If you decide to use vectors, you can improve accuracy with:
Metadata tagging:
{
"chunk_id": "doc1_chunk_15",
"source": "user_manual.pdf",
"page": 12,
"section": "Initial Setup"
}
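A sketch of how such metadata might be used at query time: restrict retrieval to one source, and label each chunk so the agent can cite where it came from. The `text` field, the second chunk, and both helper functions are assumptions added for illustration.

```python
# Chunks with metadata; the "text" field and the second entry are hypothetical.
chunks = [
    {"chunk_id": "doc1_chunk_15", "source": "user_manual.pdf", "page": 12,
     "section": "Initial Setup", "text": "Hold the power button for ten seconds."},
    {"chunk_id": "doc2_chunk_03", "source": "faq.pdf", "page": 2,
     "section": "Billing", "text": "Refunds are issued within five business days."},
]


def only_from(chunks, source):
    """Keep only the chunks that come from the given source document."""
    return [c for c in chunks if c["source"] == source]


def with_citation(chunk):
    """Prefix the chunk text with its origin so the agent can cite it."""
    return f"[{chunk['source']}, page {chunk['page']}, {chunk['section']}]\n{chunk['text']}"


manual_chunks = only_from(chunks, "user_manual.pdf")
context = "\n\n".join(with_citation(c) for c in manual_chunks)
print(context)
```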
Increase chunk limit: Instead of bringing back 4 chunks, bring back 10-20 to give more context.
Hybrid search: Combine semantic search with keyword search.
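A minimal sketch of the hybrid idea, assuming the semantic scores have already been computed by your vector store (the numbers below are invented). The weighted sum and the `alpha` parameter are one simple way to combine the two signals; production systems often use other fusion schemes.

```python
def keyword_score(query, text):
    """Fraction of query words that appear verbatim in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_rank(query, candidates, alpha=0.5):
    """Rank (text, semantic_score) pairs by a weighted sum of both scores.
    alpha=1.0 is pure semantic search, alpha=0.0 is pure keyword search."""
    scored = [
        (alpha * semantic + (1 - alpha) * keyword_score(query, text), text)
        for text, semantic in candidates
    ]
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical candidates with made-up semantic similarity scores.
candidates = [
    ("the device shows a fault code on startup", 0.80),
    ("error e42 means the filter is clogged", 0.70),
]
ranking = hybrid_rank("error E42", candidates)
print(ranking[0])
```

In this example the exact code "E42" pulls the second chunk to the top even though its semantic score is lower, which is the typical case hybrid search is meant to cover.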
How to Choose the Right Method
Decision Tree
Is your data structured (tables/rows)?
├── YES → Do you need complex calculations?
│   ├── YES → Use SQL
│   └── NO → Use Filters
└── NO → Is the document short (<10 pages)?
    ├── YES → Does order/context matter?
    │   ├── YES → Use Full Context
    │   └── NO → Use Vectors
    └── NO → Are you looking for specific answers?
        ├── YES → Use Vectors
        └── NO → Consider splitting into smaller documents
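The same tree written as a function, which can be handy as a checklist when reviewing an existing agent. The function and parameter names are hypothetical; the branches mirror the tree above one to one.

```python
def choose_method(
    structured,
    needs_calculations=False,
    short_document=False,
    order_matters=False,
    specific_answers=False,
):
    """Walk the decision tree and return the suggested method."""
    if structured:
        return "SQL" if needs_calculations else "Filters"
    if short_document:
        return "Full Context" if order_matters else "Vectors"
    if specific_answers:
        return "Vectors"
    return "Consider splitting into smaller documents"


# A sales table that needs rankings, and a short transcript to summarize.
print(choose_method(structured=True, needs_calculations=True))
print(choose_method(structured=False, short_document=True, order_matters=True))
```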
Quick Summary
| Method | Best For | Avoid When |
|---|---|---|
| Filters | Simple tabular data, exact searches | You need calculations or free text |
| SQL | Calculations, rankings, trends, aggregations | Unstructured data |
| Full Context | Summaries, order matters, short docs | Very large datasets |
| Vectors | Searches in large volumes, FAQs | Tabular data, complete summaries |
Context Engineering: The 5 Pillars
Beyond the method you choose, these principles apply to any implementation:
1. Start with the End Goal
Before building, ask yourself:
- What type of questions will this agent receive?
- What data does it need to see to respond correctly?
- How would I measure if the response is good?
2. Design Your Data Pipeline
- Where does the data come from?
- How often is it updated?
- How do you ensure it's clean?
3. Ensure Accuracy
Garbage in, garbage out. If your data has errors, your agent will inherit them.
4. Optimize the Context Window
Fewer tokens = cheaper + fewer hallucinations + faster responses.
Always ask yourself: How can I give the agent only what it needs?
5. Specialize Your Agents
An agent that does everything does everything poorly. Consider having:
- Sales agent (SQL)
- Support agent (Vectors)
- Onboarding agent (Full context)
Next Steps
Now that you understand the 4 methods:
- Audit your current implementation: Are you using the right method?
- Experiment with alternatives: Try filters or SQL before going to vectors
- Measure results: Compare accuracy, cost, and speed between methods
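For the measuring step, a small harness is often enough to start. The sketch below assumes each method is wrapped in a function that takes a question and returns an answer string; the two stand-in methods and the test cases are hypothetical, and exact string matching is a deliberately crude accuracy check.

```python
import time


def evaluate(answer_fn, test_cases):
    """Run answer_fn on every (question, expected) pair and report
    exact-match accuracy plus total wall-clock time in seconds."""
    correct = 0
    start = time.perf_counter()
    for question, expected in test_cases:
        if answer_fn(question) == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    return {"accuracy": correct / len(test_cases), "seconds": elapsed}


# Hypothetical test set and stand-in methods with canned answers.
test_cases = [
    ("Bluetooth speakers sold on 2024-09-16?", "5"),
    ("Best sales week?", "Week 19"),
]


def sql_method(question):
    return {"Bluetooth speakers sold on 2024-09-16?": "5",
            "Best sales week?": "Week 19"}.get(question)


def vector_method(question):
    return {"Bluetooth speakers sold on 2024-09-16?": "5",
            "Best sales week?": "Week 6"}.get(question)


report_sql = evaluate(sql_method, test_cases)
report_vector = evaluate(vector_method, test_cases)
print(report_sql["accuracy"], report_vector["accuracy"])
```

Token cost is not measured here; most model APIs report token usage per call, which can be added to the same report.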
Additional Resources
Questions? Join our Discord community to discuss RAG implementations.
Related content
- 📘 n8n Complete Beginners Guide — Learn n8n to build no-code RAG pipelines
- 📘 Create an AI News Digest with n8n — Practical example of a workflow using AI data processing
- 📘 Prompt Engineering for Claude: Best Practices — Optimize the prompts your RAG pipelines use
- 📝 Build a Documentation Chatbot with Claude and RAG — Hands-on tutorial applying these RAG methods