# Simple RAG Tutorial
Welcome to your first RAG implementation! In this tutorial, you'll build a basic Retrieval-Augmented Generation system using LlamaIndex.
## What You'll Build
A simple RAG system that can:
- Load and index documents
- Answer questions based on document content
- Provide source attribution (demonstrated in Step 5)
## Prerequisites
- ✅ Environment Setup completed
- ✅ API keys configured
- ✅ Basic Python knowledge
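If you haven't set up your `.env` file yet, the code in this tutorial reads only one variable. A minimal `.env` looks like this (the value is a placeholder):

```
OPENAI_API_KEY=sk-your-key-here
```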
## Step 1: Import Dependencies

```python
import os

from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Load environment variables from .env
load_dotenv()
```
## Step 2: Configure Settings

```python
# Configure the LLM and embedding model globally
Settings.llm = OpenAI(
    model="gpt-3.5-turbo",
    api_key=os.getenv("OPENAI_API_KEY"),
)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-ada-002",
    api_key=os.getenv("OPENAI_API_KEY"),
)
```
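Optionally, you can tune a couple of knobs at this point. A minimal sketch, assuming you want more deterministic answers and smaller chunks (both values are illustrative, not recommendations):

```python
# Optional tuning (illustrative values):
# temperature=0 makes answers more deterministic;
# Settings.chunk_size controls how documents are split before embedding.
Settings.llm = OpenAI(
    model="gpt-3.5-turbo",
    api_key=os.getenv("OPENAI_API_KEY"),
    temperature=0,
)
Settings.chunk_size = 512
```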
Step 3: Load Documents
# Create a data directory and add some text files
# For this example, create a file called 'sample.txt' in ./data/
# Load documents from data directory
documents = SimpleDirectoryReader("./data").load_data()
print(f"Loaded {len(documents)} documents")
## Step 4: Create Vector Index

```python
# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)
print("Index created successfully!")
```
## Step 5: Create Query Engine

```python
# Create a query engine on top of the index
query_engine = index.as_query_engine()

# Test query
response = query_engine.query("What is this document about?")
print(f"Response: {response}")
```
## Complete Code
Here's the full implementation:
```python
import os

from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI


def main():
    # Load environment variables
    load_dotenv()

    # Configure the LLM and embedding model
    Settings.llm = OpenAI(
        model="gpt-3.5-turbo",
        api_key=os.getenv("OPENAI_API_KEY"),
    )
    Settings.embed_model = OpenAIEmbedding(
        model="text-embedding-ada-002",
        api_key=os.getenv("OPENAI_API_KEY"),
    )

    # Load documents
    documents = SimpleDirectoryReader("./data").load_data()
    print(f"Loaded {len(documents)} documents")

    # Create index
    index = VectorStoreIndex.from_documents(documents)
    print("Index created successfully!")

    # Create query engine
    query_engine = index.as_query_engine()

    # Interactive query loop
    while True:
        query = input("\nAsk a question (or 'quit' to exit): ")
        if query.lower() == "quit":
            break
        response = query_engine.query(query)
        print(f"\nAnswer: {response}")


if __name__ == "__main__":
    main()
```
## Running the Code

1. Create a data directory:

   ```bash
   mkdir data
   ```

2. Add a sample document: create `data/sample.txt` with some content.

3. Run the script:

   ```bash
   python simple_rag.py
   ```
## Expected Output

```
Loaded 1 documents
Index created successfully!

Ask a question (or 'quit' to exit): What is this document about?

Answer: [Your document summary based on content]
```
## What's Happening?

- **Document Loading**: `SimpleDirectoryReader` loads text files from `./data`
- **Embedding**: documents are converted to vector embeddings
- **Indexing**: vectors are stored in a searchable index
- **Retrieval**: the query finds the most relevant document chunks (tunable, see the sketch below)
- **Generation**: the LLM generates an answer using the retrieved context
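By default, retrieval returns only a couple of top-matching chunks. A minimal sketch of widening that, assuming you want the top 3 matches (`similarity_top_k` is the relevant `as_query_engine` argument):

```python
# Retrieve the 3 most similar chunks instead of the default
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What is this document about?")
```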
## Next Steps
🎉 Congratulations! You've built your first RAG system.
Continue learning:
- Semantic Chunking - Improve document splitting
- Chunk Size Optimization - Find optimal chunk sizes
## Troubleshooting

Common issues:

- **API key errors**: check your `.env` file
- **No documents found**: ensure files are in the `./data/` directory
- **Import errors**: run `pip install -r requirements.txt`
Need help? Join our community discussions!