AI Documentation - Agent Guide¶
Purpose¶
This directory contains architecture and design documentation for the AI model zoo and inference pipeline. It provides comprehensive documentation for understanding, configuring, and extending the AI subsystem.
Directory Contents¶
docs/ai/
├── AGENTS.md # This file - navigation guide
└── model-zoo.md # AI Model Zoo Architecture documentation
Key Documents¶
Model Zoo Architecture (model-zoo.md)¶
Comprehensive documentation covering:
- Architecture Overview - Detection pipeline and service topology
- Always-Loaded Models - YOLO26, Florence-2, CLIP, Nemotron
- On-Demand Models - Threat detection, pose estimation, demographics, clothing, vehicle, pet, re-ID, depth, action recognition
- VRAM Management - On-demand loading with LRU eviction and priority-based ordering
- API Reference - Unified enrichment endpoint and model management APIs
- Environment Variables - Configuration for all AI services
- Adding New Models - Step-by-step guide for extending the model zoo
Quick Links¶
| Topic | Location |
|---|---|
| Model zoo architecture | model-zoo.md |
| AI service implementation | ai/AGENTS.md |
| YOLO26 detection | ai/yolo26/AGENTS.md |
| Florence-2 vision-language | ai/florence/AGENTS.md |
| CLIP embeddings | ai/clip/AGENTS.md |
| Enrichment service | ai/enrichment/AGENTS.md |
| Nemotron LLM | ai/nemotron/AGENTS.md |
Common Tasks¶
Understanding Model Capabilities¶
- Read model-zoo.md for the complete model inventory
- Each model section includes:
- Model source (HuggingFace link)
- VRAM requirements
- Input/output formats
- Trigger conditions
Configuring VRAM Budget¶
See the "VRAM Management" section in model-zoo.md:
- Default budget: 6.8GB for on-demand models
- Configure via
VRAM_BUDGET_GBenvironment variable - Priority system controls eviction order
Adding New Models¶
Follow the "Adding New Models" guide in model-zoo.md:
- Create model wrapper in
ai/enrichment/models/ - Register in
model_registry.py - Add trigger conditions
- (Optional) Add API endpoint
- Update docker-compose volumes
- Update documentation
Debugging Model Loading¶
Check model status via API:
Review logs for loading/eviction events:
Related Documentation¶
- Architecture Overview:
docs/architecture/overview.md - AI Pipeline API:
docs/developer/api/ai-pipeline.md - Deployment Guide:
docs/operator/deployment/README.md - Troubleshooting:
docs/reference/troubleshooting/ai-issues.md
Entry Points¶
- Model zoo documentation: model-zoo.md - Start here for AI architecture
- Implementation code: ai/AGENTS.md - For service implementation details
- Backend integration: backend/services/ - Client code for AI services