A complete RAG ecosystem with interactive Chainlit UI, multi-provider LLM support, and specialized context engines.
This repository demonstrates a production-ready Retrieval-Augmented Generation (RAG) system featuring:
- Three specialized context engines for different domains
- Interactive Chainlit UI for a seamless user experience
- Multi-provider LLM support (OpenAI, Anthropic, DeepSeek, Ollama)
- Smart provider fallback and automatic configuration
- Prometheus metrics and comprehensive testing
- Microservices architecture with FastAPI
| Engine | Domain | Use Cases |
|---|---|---|
| Enterprise | WICS framework, company policies | HR questions, policy lookup, process guidance |
| Financial Compliance | Basel III, BACEN, CVM regulations | Regulatory compliance, risk assessment |
| DevOps | SRE practices, troubleshooting | Infrastructure issues, operational guidance |
- `context_engine_core/`: Shared RAG pipeline with LangChain + LangGraph
- `app.py`: Main Chainlit application with UI
- `config.json`: Multi-provider configuration
- `context_quality_monitor/`: Prometheus metrics service
- `context_engineering_testing_suite/`: Comprehensive test suite
- Python 3.8+
- API key for at least one LLM provider
```bash
# Clone and navigate to the project
cd rag-context-engineering-examples

# Install dependencies
pip install -r requirements.txt

# Set your API key (choose one)
export OPENAI_API_KEY=sk-your-openai-key
export ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
export DEEPSEEK_API_KEY=sk-your-deepseek-key
```

**Option A: Easy Launch (Recommended)**

```bash
python launch.py
```

**Option B: Direct Chainlit**

```bash
chainlit run app.py --host 0.0.0.0 --port 8000
```

**Option C: Python execution**

```bash
python app.py
```

Open your browser to: http://localhost:8000
The Chainlit interface will provide:
- Engine selection via settings panel
- Interactive chat with all context engines
- Mobile-friendly responsive design
- Rich markdown formatting
The system automatically selects the best available provider. Configure per-engine preferences in `config.json`:
```json
{
  "engines": {
    "enterprise_context_engine": {
      "llm": {
        "provider": "anthropic",
        "anthropic": {
          "model_name": "claude-3-sonnet-20240229",
          "temperature": 0.1
        }
      }
    },
    "financial_compliance_context_engine": {
      "llm": {
        "provider": "openai",
        "openai": {
          "model_name": "gpt-4",
          "temperature": 0
        }
      }
    }
  }
}
```

| Provider | Setup | Models | Best For |
|---|---|---|---|
| OpenAI | `export OPENAI_API_KEY=sk-...` | GPT-3.5, GPT-4 | Balanced performance |
| Anthropic | `export ANTHROPIC_API_KEY=sk-ant-...` | Claude-3 Sonnet/Opus | Complex reasoning |
| DeepSeek | `export DEEPSEEK_API_KEY=sk-...` | DeepSeek Chat/Coder | Cost-effective |
| Ollama | Local installation | Llama2, Mistral | Privacy/offline |
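The fallback behaviour can be sketched roughly as follows. This is an illustrative stand-in, not the project's actual code: the environment-variable names come from the table above, but `pick_provider` and the preference order are assumptions.

```python
import os

# Illustrative sketch of provider fallback: try the engine's preferred
# provider first, then any other provider whose API key is present,
# and finally fall back to a local Ollama instance (no key required).
# `pick_provider` is a hypothetical helper, not part of the project.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}

def pick_provider(preferred=None):
    candidates = [preferred] if preferred in PROVIDER_KEYS else []
    candidates += [p for p in PROVIDER_KEYS if p != preferred]
    for name in candidates:
        if os.environ.get(PROVIDER_KEYS[name]):
            return name
    return "ollama"  # local fallback needs no API key
```

With no keys set this returns `"ollama"`; setting exactly one key makes that provider win regardless of preference.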
```bash
# Full test suite
cd context_engineering_testing_suite
pytest tests/ -v

# Single test
pytest tests/test_rag_flow.py::test_query_returns_string -v

# Test with coverage
pytest tests/ --cov=context_engine_core --cov-report=html
```

```bash
# Run individual examples
cd examples/
python quick_demo.py              # Basic RAG demonstration
python multi_provider_testing.py  # Provider comparison
python provider_demo.py           # Provider switching demo
```

```bash
# Start individual engines as APIs
uvicorn enterprise_context_engine.src.enterprise_context_engine.api:app --reload --port 8001
uvicorn financial_compliance_context_engine.src.financial_compliance_context_engine.api:app --reload --port 8002
uvicorn devops_context_engine.src.devops_context_engine.api:app --reload --port 8003

# Start metrics service
uvicorn context_quality_monitor.src.context_quality_monitor.api:app --reload --port 8004
```

**Q:** "What is WICS?"
**A:** "WICS (Work Integration and Coordination System) is a framework for..."

**Q:** "What is the remote work policy?"
**A:** "The remote work policy allows for flexible arrangements..."

**Q:** "What are Basel III requirements?"
**A:** "Basel III introduces enhanced capital requirements including..."

**Q:** "Tell me about BACEN regulations"
**A:** "BACEN (Central Bank of Brazil) regulations cover..."

**Q:** "How do I troubleshoot high CPU usage?"
**A:** "For high CPU usage troubleshooting, follow these steps..."

**Q:** "What are SRE best practices?"
**A:** "Site Reliability Engineering best practices include..."
```mermaid
graph TD
    A[User Query] --> B[Context Engine]
    B --> C[Document Retrieval]
    C --> D[FAISS Vector Search]
    D --> E[Relevant Documents]
    E --> F[LLM Provider]
    F --> G[Generated Response]
    G --> H[Chainlit UI]
```
- Retrieve: Query vector store for relevant documents
- Generate: Use LLM to synthesize answer from context
- Return: Format response for UI display
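The three steps above can be sketched end to end. In this minimal, illustrative version, naive keyword overlap stands in for FAISS vector search and `generate` is a stub for the provider call; neither is the project's implementation.

```python
def retrieve(query, docs, k=2):
    """Retrieve: rank documents by shared words (stand-in for vector search)."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(query, context):
    """Generate: stub for the LLM call that synthesizes an answer from context."""
    return f"Answer to {query!r} based on: {context[0]}"

def answer(query, docs):
    context = retrieve(query, docs)      # Retrieve
    response = generate(query, context)  # Generate
    return response                      # Return (formatted for the UI)

docs = [
    "The remote work policy allows flexible arrangements.",
    "Basel III introduces enhanced capital requirements.",
]
print(answer("What is the remote work policy?", docs))
```

A real engine swaps `retrieve` for a FAISS similarity search and `generate` for a prompt to the configured provider; the control flow stays the same.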
- `BaseContextEngine`: Core RAG logic shared across all engines
- `LLMProviders`: Abstraction layer for multiple AI providers
- Chainlit App: User interface with engine switching
- FastAPI Services: Alternative REST API access
- Prometheus: Metrics collection and monitoring
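For intuition, a provider abstraction like `LLMProviders` can be thought of as a registry mapping provider names to completion callables. This toy `ProviderRegistry` is an assumption about the shape of that layer, not its actual code:

```python
from typing import Callable, Dict

class ProviderRegistry:
    """Toy provider abstraction: register callables, dispatch by name."""

    def __init__(self):
        self._providers: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, complete: Callable[[str], str]) -> None:
        self._providers[name] = complete

    def complete(self, name: str, prompt: str) -> str:
        if name not in self._providers:
            raise KeyError(f"unknown provider: {name}")
        return self._providers[name](prompt)

registry = ProviderRegistry()
registry.register("echo", lambda prompt: f"echo: {prompt}")
print(registry.complete("echo", "hello"))
```

Because callers only pass a provider name, switching between OpenAI, Anthropic, DeepSeek, or Ollama changes configuration rather than code.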
Access metrics at: http://localhost:8004/metrics
Available metrics:
- `rag_queries_total`: Total queries per engine
- `rag_query_duration_seconds`: Query processing time
- `rag_errors_total`: Error count by type
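As a rough sketch, these metrics could be declared with the `prometheus_client` library like so; the metric names match the list above, but the label sets are assumptions:

```python
from prometheus_client import Counter, Histogram, generate_latest

# Metric names mirror the list above; label names are illustrative guesses.
RAG_QUERIES = Counter("rag_queries_total", "Total queries per engine", ["engine"])
RAG_LATENCY = Histogram("rag_query_duration_seconds", "Query processing time", ["engine"])
RAG_ERRORS = Counter("rag_errors_total", "Error count by type", ["error_type"])

RAG_QUERIES.labels(engine="enterprise").inc()
with RAG_LATENCY.labels(engine="enterprise").time():
    pass  # the engine's query would run here

# generate_latest() renders the text exposition format served at /metrics
print(generate_latest().decode()[:60])
```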
- UI Health: http://localhost:8000
- API Health: http://localhost:8001/health (per service)
- Metrics Health: http://localhost:8004/health
1. **Create Engine Directory**

   ```bash
   mkdir -p new_context_engine/src/new_context_engine/
   ```

2. **Implement Engine Class**

   ```python
   from context_engine_core.base_engine import BaseContextEngine

   class NewContextEngine(BaseContextEngine):
       def _default_docs(self):
           return ["Your domain-specific documents here"]
   ```

3. **Add FastAPI Wrapper**

   ```python
   from fastapi import FastAPI

   app = FastAPI()

   @app.post("/query")
   async def query_endpoint(query: str):
       engine = NewContextEngine()
       return {"response": engine.query(query)}
   ```

4. **Update Configuration**

   ```json
   {
     "engines": {
       "new_context_engine": {
         "llm": {"provider": "openai"}
       }
     }
   }
   ```

5. **Integrate with UI**

   ```python
   # Add to the engines dictionary in app.py
   engines["new_engine"] = NewContextEngine()
   ```
```bash
# Linting
flake8 context_engine_core/ --max-line-length=100

# Type checking
mypy context_engine_core/

# Security scan
bandit -r context_engine_core/
```

**"No module named 'context_engine_core'"**

```bash
# Ensure you're in the project root
cd rag-context-engineering-examples
python app.py
```

**"API key not found"**

```bash
# Set your API key
export OPENAI_API_KEY=your-key-here
python launch.py  # Will check API keys
```

**"Chainlit not found"**

```bash
# Install Chainlit (quote so the shell doesn't treat >= as a redirect)
pip install "chainlit>=1.0.0"
```

**"Port already in use"**

```bash
# Use a different port
chainlit run app.py --port 8001
```

**Enable verbose logging**

```bash
export LANGCHAIN_VERBOSE=true
export LANGCHAIN_TRACING=true
python app.py
```

This system demonstrates key RAG concepts:
- Context Engineering: Beyond prompt engineering to systematic context management
- Multi-Domain RAG: Different engines for different knowledge domains
- Provider Abstraction: Flexible LLM provider switching
- Production Patterns: Monitoring, testing, configuration management
- User Experience: Interactive UI with real-time engine switching
- Start with Examples: Run `python examples/quick_demo.py`
- Explore UI: Use the Chainlit interface to understand the user experience
- Study Architecture: Review `context_engine_core/base_engine.py`
- Extend System: Add your own context engine
- Production Deploy: Scale with Docker/Kubernetes
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Run tests: `pytest tests/`
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: Check the inline code documentation
- Issues: Open GitHub issues for bugs
- Features: Request enhancements via GitHub
- Contact: For educational/commercial inquiries
Built with ❤️ for the RAG community. Designed for learning, built for production.