ADR-031: Vector Database Selection for RAG System
Status
Accepted
Context
We need a vector database to enable Retrieval-Augmented Generation (RAG) in Fawkes for AI-assisted development. The vector database will:
- Store embeddings of internal documentation, code, and platform knowledge
- Enable semantic search based on meaning rather than keywords
- Support AI assistants with contextual information retrieval
- Scale to handle the platform's growing documentation and code base
Requirements
Functional Requirements:
- Vector similarity search with high precision (>0.7 relevance score)
- Support for text embeddings (initially)
- GraphQL or REST API for integration
- Schema flexibility for different content types
- Hybrid search (vector + keyword) capabilities
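To illustrate what a relevance threshold means in practice, the sketch below ranks candidate embeddings by cosine similarity and keeps those scoring above 0.7. It is a brute-force toy in plain Python, not how Weaviate searches (Weaviate uses an approximate HNSW index). The function names and sample vectors are hypothetical, and treating the relevance score as cosine similarity is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def search(query, corpus, threshold=0.7):
    """Brute-force search: score every document, keep those above threshold."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical 3-dimensional embeddings; real sentence-transformers
# vectors have hundreds of dimensions.
corpus = {
    "adr-031": [0.9, 0.1, 0.0],
    "runbook-backup": [0.1, 0.9, 0.2],
    "helm-guide": [0.7, 0.6, 0.1],
}
results = search([1.0, 0.2, 0.0], corpus)
```

With these made-up vectors, only the two documents pointing in roughly the same direction as the query pass the threshold.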
Non-Functional Requirements:
- Kubernetes-native deployment
- Horizontal scalability
- Built-in monitoring (Prometheus metrics)
- Backup and restore capabilities
- Open-source with active community
- Production-ready and battle-tested
Integration Requirements:
- Compatible with transformer models (sentence-transformers)
- Easy integration with Python applications
- Support for batch operations
- Low-latency queries (<100ms)
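The batch-operations requirement can be pictured with a small helper that groups documents into fixed-size chunks before they are sent to the database. This is a minimal sketch in plain Python; the batch size of 100 is an arbitrary assumption, and the actual import call is deliberately left out.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical documents; each batch would be handed to the import client.
documents = [{"title": f"doc-{i}"} for i in range(250)]
batch_sizes = [len(batch) for batch in batched(documents, 100)]
```

Because the helper accepts any iterable, documents can be streamed from disk without loading the whole corpus into memory first.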
Decision
We will use Weaviate as our vector database for the following reasons:
Technical Rationale
- Native Vector Search
  - Built specifically for vector operations using the HNSW (Hierarchical Navigable Small World) algorithm
  - Provides fast approximate nearest neighbor search
  - Supports multiple distance metrics (cosine, L2, etc.)
- GraphQL API
  - Modern, flexible API that's easy to use
  - Strong typing and schema validation
  - Good documentation and tooling support
  - Native Python client library
- Built-in Vectorization
  - Supports text2vec-transformers module out of the box
  - Can use sentence-transformers models directly
  - Extensible to other vectorization methods (OpenAI, Cohere, etc.)
  - Handles vectorization automatically
- Hybrid Search Capability
  - Combines vector search with traditional keyword search (BM25)
  - Best of both worlds for different query types
  - Configurable weight between vector and keyword search
- Kubernetes Native
  - Official Helm charts maintained by Weaviate
  - Designed for cloud-native deployments
  - Supports StatefulSets for persistence
  - Good resource management
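The configurable weighting in hybrid search can be pictured as a weighted blend of two normalized score lists. The sketch below is a simplified illustration, assuming min-max normalization and a weight `alpha` where 1.0 means pure vector search and 0.0 means pure keyword search. It is not Weaviate's actual fusion code, and the document IDs and scores are made up.

```python
def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend vector and keyword scores; alpha weights the vector side."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    combined = {}
    for doc_id in set(vec) | set(kw):
        # A document missing from one result list contributes 0 from that side.
        combined[doc_id] = alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


vector_scores = {"a": 0.92, "b": 0.80, "c": 0.55}  # e.g. cosine similarities
keyword_scores = {"b": 7.1, "c": 9.4, "d": 3.0}    # e.g. BM25 scores
ranking = hybrid_scores(vector_scores, keyword_scores, alpha=0.75)
```

Normalizing first matters because BM25 scores and cosine similarities live on different scales; without it, the keyword side would dominate regardless of `alpha`.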
Operational Rationale
- Production Ready
  - Used by many organizations in production
  - Proven track record for reliability
  - Good performance characteristics
  - Mature codebase (4+ years old)
- Active Community
  - Large and growing community
  - Excellent documentation
  - Active development (frequent releases)
  - Good support channels (Discord, GitHub)
- Monitoring and Observability
  - Built-in Prometheus metrics
  - Grafana dashboards available
  - Detailed logging
  - Health check endpoints
- Backup and Recovery
  - Built-in backup functionality
  - Point-in-time recovery
  - Multiple backup backends supported
  - Well-documented disaster recovery procedures
Alternatives Considered
Pinecone
Pros:
- Fully managed service
- Very easy to use
- Good performance
- Excellent documentation
Cons:
- Cloud-only (SaaS)
- Vendor lock-in
- Not self-hosted
- Cost increases with scale
Decision: ❌ Rejected - We need a self-hosted solution to maintain control and reduce operational costs.
Milvus
Pros:
- High performance
- Builds on established index libraries such as FAISS
- Large feature set
- Good scalability
Cons:
- Complex setup and operation
- Heavy resource requirements
- Steeper learning curve
- More infrastructure to manage
Decision: ❌ Rejected - Too complex for our current needs; Weaviate provides sufficient performance with simpler operations.
PostgreSQL with pgvector Extension
Pros:
- Familiar database
- Simple extension
- Easy to get started
- No new infrastructure
Cons:
- Not purpose-built for vectors
- Limited scalability
- Slower for large datasets
- Less sophisticated search algorithms
Decision: ❌ Rejected - Not specialized enough; performance degrades at scale.
ChromaDB
Pros:
- Simple and lightweight
- Python-first design
- Easy to embed
- Good for prototyping
Cons:
- Relatively new/immature
- Limited production usage
- Fewer features
- Less proven at scale
Decision: ❌ Rejected - Too new and unproven for production use; prefer more mature solution.
Qdrant
Pros:
- Good performance
- Written in Rust
- Growing community
- Modern architecture
Cons:
- Smaller community than Weaviate
- Less mature ecosystem
- Fewer integrations
- Less documentation
Decision: ❌ Rejected - A strong candidate, but Weaviate has a more established ecosystem and more extensive documentation.
Consequences
Positive
- Fast Semantic Search
  - HNSW algorithm provides excellent performance
  - Sub-100ms queries for most use cases
  - Scales well with dataset size
- Flexible Schema
  - Can easily add new document types
  - Strong typing prevents errors
  - GraphQL makes schema discovery easy
- Easy Integration
  - Well-documented Python client
  - Simple API design
  - Good examples and tutorials
- Kubernetes Native
  - Fits well with existing platform
  - Uses standard Kubernetes patterns
  - Easy to operate with existing tools
- Active Development
  - Regular updates and improvements
  - Security patches
  - New features added frequently
- Good Monitoring
  - Integrates with existing Prometheus/Grafana stack
  - Pre-built dashboards available
  - Detailed metrics exposed
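To give a sense of the API surface that integration work targets, the following sketch builds the JSON body for a hybrid search sent to Weaviate's `/v1/graphql` endpoint. The collection name `Document` and the `title` property are hypothetical placeholders, and the helper only constructs the payload; it does not send anything.

```python
import json


def build_hybrid_query(collection, query_text, alpha=0.5, limit=5, properties=("title",)):
    """Return the JSON body for a POST to the /v1/graphql endpoint."""
    fields = " ".join(properties)
    # json.dumps escapes quotes and backslashes so user text cannot break the query.
    graphql = (
        "{ Get { %s(hybrid: {query: %s, alpha: %s}, limit: %d) "
        "{ %s _additional { score } } } }"
        % (collection, json.dumps(query_text), alpha, limit, fields)
    )
    return json.dumps({"query": graphql})


# Hypothetical collection and question, for illustration only.
body = build_hybrid_query("Document", 'how do I "restore" a backup?', alpha=0.75, limit=3)
payload = json.loads(body)
```

In application code the official Python client would normally hide this layer; the raw payload is shown only to make the request shape visible.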
Negative
- Learning Curve
  - Team needs to learn vector database concepts
  - GraphQL may be new to some developers
  - HNSW tuning requires understanding
- Additional Infrastructure
  - New component to maintain
  - Requires persistent storage
  - Adds to infrastructure complexity
- Resource Requirements
  - Memory-intensive for large datasets
  - CPU for vectorization
  - Storage for vectors and data
- Operational Overhead
  - Need to manage backups
  - Need to monitor performance
  - Need to plan capacity
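Capacity planning for memory can start from a back-of-the-envelope estimate. The sketch below assumes float32 vectors and an overhead factor of 2 for the index and bookkeeping, which is a rough rule of thumb rather than a measured figure for this deployment. The 384-dimension value is also an assumption (it matches small sentence-transformers models such as all-MiniLM-L6-v2).

```python
def estimate_memory_bytes(num_vectors, dimensions, bytes_per_value=4, overhead_factor=2.0):
    """Rough RAM estimate: raw float32 vectors times an assumed index overhead."""
    raw = num_vectors * dimensions * bytes_per_value
    return int(raw * overhead_factor)


def fits_in(num_vectors, dimensions, limit_gib):
    """Check whether the estimate fits within a memory limit given in GiB."""
    return estimate_memory_bytes(num_vectors, dimensions) <= limit_gib * 1024**3


# Assumed corpus sizes, for illustration only.
small = estimate_memory_bytes(100_000, 384)
large = estimate_memory_bytes(1_000_000, 384)
small_fits = fits_in(100_000, 384, limit_gib=2)
large_fits = fits_in(1_000_000, 384, limit_gib=2)
```

Under these assumptions, about a hundred thousand chunks fit comfortably in a 2 GiB allocation, while a million would not, which gives a concrete trigger for scaling up.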
Mitigation Strategies
- Training and Documentation
  - Create comprehensive documentation (done: docs/ai/vector-database.md)
  - Provide examples and tutorials
  - Conduct knowledge sharing sessions
- Start Small
  - Begin with 1 replica
  - Use modest resources (2Gi RAM, 1 CPU)
  - Scale up based on actual usage
- Monitoring from Day 1
  - Enable Prometheus metrics
  - Create Grafana dashboards
  - Set up alerts for issues
- Backup Strategy
  - Implement automated daily backups
  - Test restore procedures
  - Document disaster recovery process
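Automated daily backups could be driven by a small script that calls Weaviate's backup REST endpoint (`POST /v1/backups/{backend}`). The sketch below only builds the request so it can be inspected or tested without a running cluster. The service URL, the `s3` backend choice, and the `daily-<date>` naming scheme are all assumptions, and the chosen backup backend must be enabled in the deployment.

```python
import json
from datetime import date
from urllib.request import Request


def build_backup_request(base_url, backend, day):
    """Build (but do not send) a request that starts a backup named after the date."""
    backup_id = f"daily-{day.isoformat()}"  # e.g. daily-2025-12-21
    body = json.dumps({"id": backup_id}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/v1/backups/{backend}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical in-cluster service URL and backend; substitute the real values.
request = build_backup_request("http://weaviate.example.svc:8080", "s3", date(2025, 12, 21))
```

A scheduled job could pass the result to `urllib.request.urlopen` and then poll the backup status until it completes; date-based IDs make it easy to see at a glance whether a day was missed.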
Implementation Plan
- Phase 1: Deployment (Done)
  - Deploy Weaviate via ArgoCD ✅
  - Configure persistent storage (10GB) ✅
  - Enable text2vec-transformers module ✅
  - Set up Prometheus monitoring ✅
- Phase 2: Testing (In Progress)
  - Create test indexing script ✅
  - Index sample documents ✅
  - Validate search functionality ⏳
  - Verify relevance scores >0.7 ⏳
- Phase 3: Production Indexing (Future)
  - Index all platform documentation
  - Index ADRs and runbooks
  - Index code examples
  - Set up incremental indexing
- Phase 4: Integration (Future)
  - Integrate with AI assistant
  - Build RAG pipeline
  - Create query interface
  - Add to Backstage portal
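One possible approach to the incremental indexing step in Phase 3 is to fingerprint each document and re-index only what changed. The sketch below assumes content hashes from the previous run are stored somewhere (for example alongside the indexed objects); the function names and file paths are hypothetical, and the actual upsert and delete calls are left out.

```python
import hashlib


def content_hash(text):
    """Return a stable SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental_index(documents, indexed_hashes):
    """Compare current documents with stored hashes and decide what to do.

    documents: {path: text} for the current state of the repository.
    indexed_hashes: {path: hash} recorded at the previous indexing run.
    """
    to_upsert = sorted(
        path for path, text in documents.items()
        if indexed_hashes.get(path) != content_hash(text)
    )
    to_delete = sorted(path for path in indexed_hashes if path not in documents)
    return to_upsert, to_delete


# Made-up repository state: one unchanged, one edited, one removed, one new file.
previous = {
    "docs/a.md": content_hash("unchanged"),
    "docs/b.md": content_hash("old text"),
    "docs/c.md": content_hash("removed later"),
}
current = {"docs/a.md": "unchanged", "docs/b.md": "new text", "docs/d.md": "brand new"}
to_upsert, to_delete = plan_incremental_index(current, previous)
```

Skipping unchanged documents avoids re-running vectorization, which is the CPU-heavy part of indexing.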
Validation
The decision will be validated by:
- Performance Metrics
  - Query latency <100ms for 95th percentile
  - Relevance scores >0.7 for semantic queries
  - Indexing throughput >100 documents/second
- Operational Metrics
  - Uptime >99.9%
  - Successful backups daily
  - Recovery time <30 minutes
- User Feedback
  - AI assistant provides relevant context
  - Documentation search returns useful results
  - Development productivity improvements
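The performance targets above can be checked mechanically once measurements exist. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions and an assumption here; the sample latencies and indexing figures are invented for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def validate(latencies_ms, docs_indexed, indexing_seconds):
    """Compare measurements against the ADR's performance targets."""
    p95 = percentile(latencies_ms, 95)
    throughput = docs_indexed / indexing_seconds
    return {
        "p95_latency_ms": p95,
        "latency_ok": p95 < 100,
        "throughput_docs_per_s": throughput,
        "throughput_ok": throughput > 100,
    }


# Made-up measurements for illustration only.
latencies = [20] * 90 + [80] * 8 + [250] * 2
report = validate(latencies, docs_indexed=6000, indexing_seconds=40)
```

Note that the two slow outliers do not fail the latency check: a 95th-percentile target tolerates a small tail, which is why it is usually paired with alerting on maximum latency as well.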
Related Decisions
- ADR-001: Kubernetes Orchestration (infrastructure platform)
- ADR-003: ArgoCD for GitOps (deployment method)
- ADR-006: PostgreSQL (relational data storage)
Revision History
- 2025-12-21: Initial version - Vector database selection for RAG system