# Architecture Overview

AI-generated visualization of the full system architecture showing Frontend, Backend, AI Services, and GPU tiers.
**Last Updated:** 2026-01-04
**Target Audience:** Future maintainers, technical contributors
## System Purpose
Home Security Intelligence transforms commodity IP cameras into an intelligent threat detection system. Rather than simply alerting when motion is detected, the system uses AI to understand what is happening and why it might be concerning.
Problem solved: Traditional security cameras generate endless motion alerts with no context. A person walking their dog and a stranger approaching at 2 AM both trigger the same notification. This system provides contextual risk assessment using AI, turning raw camera feeds into actionable security intelligence.
Key value proposition:
- Contextual alerts: Not just "person detected" but "unfamiliar person approaching back entrance at 2 AM, risk: high"
- Batch reasoning: Groups multiple detections into coherent events for better context
- Fast path: High-confidence critical detections bypass batching for immediate alerts
- Local processing: All AI inference runs locally on your hardware, no cloud dependencies
## High-Level Architecture

The system is organized into four layers:
- Camera Layer: Foscam IP cameras upload images via FTP
- Application Layer: React frontend + FastAPI backend
- GPU Services: 5 AI inference containers (YOLO26, Nemotron, Florence-2, CLIP, Enrichment)
- Data Layer: PostgreSQL, Redis, and filesystem storage
## Detailed System Architecture Diagram
<!-- Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Cameras["Camera Layer"]
CAM1[Foscam Camera 1]
CAM2[Foscam Camera 2]
CAM3[Foscam Camera N]
end
subgraph FTP["FTP Upload"]
FTPS["/export/foscam/{camera}/"]
end
subgraph Docker["Docker Services"]
direction TB
FE["Frontend<br/>React + Vite<br/>:5173"]
BE["Backend<br/>FastAPI<br/>:8000"]
RD["Redis<br/>:6379"]
end
subgraph GPU["Containerized GPU Services"]
DET["YOLO26<br/>Object Detection<br/>:8095"]
FLO["Florence-2<br/>Vision extraction (optional)<br/>:8092"]
CLIP["CLIP<br/>Re-identification (optional)<br/>:8093"]
ENR["Enrichment API<br/>Model-zoo enrichment (optional)<br/>:8094"]
LLM["Nemotron LLM<br/>Risk Analysis<br/>:8091"]
end
subgraph Storage["Persistent Storage"]
DB[(PostgreSQL<br/>security)]
FS[("Filesystem<br/>thumbnails/")]
end
CAM1 & CAM2 & CAM3 -->|FTP Upload| FTPS
FTPS -->|FileWatcher| BE
BE <-->|Queues & Pub/Sub| RD
BE -->|HTTP /detect| DET
BE -->|HTTP (optional)| FLO
BE -->|HTTP (optional)| CLIP
BE -->|HTTP (optional)| ENR
BE -->|HTTP /completion| LLM
BE <-->|SQLAlchemy| DB
BE -->|PIL/Pillow| FS
FE <-->|REST API| BE
FE <-->|WebSocket| BE
```
-->
---
## Technology Stack
| Layer | Technology | Version | Why This Choice |
| -------------------- | ---------------- | ------- | ------------------------------------------------------------------------------ |
| **Frontend** | React | 18.2 | Industry standard, excellent ecosystem, component model fits dashboard UI |
| | TypeScript | 5.3 | Type safety catches bugs early, better IDE support, self-documenting code |
| | Tailwind CSS | 3.4 | Utility-first approach, dark theme customization, responsive design |
| | Tremor | 3.17 | Pre-built data visualization components (charts, gauges) for dashboards |
| | Vite | 5.0 | Fast dev server with HMR, modern bundling, excellent DX |
| **Backend** | Python | 3.14+ | AI/ML ecosystem (PyTorch, transformers), async support, rapid development |
| | FastAPI | 0.104+ | Modern async framework, automatic OpenAPI docs, type hints, WebSocket support |
| | SQLAlchemy | 2.0 | Async ORM, excellent PostgreSQL support, type-safe queries with `Mapped` hints |
| | Pydantic | 2.0 | Request validation, settings management, schema generation |
| **Database** | PostgreSQL | 15+ | Concurrent writes, JSONB, full-text search, proper transaction isolation |
| | Redis | 7.x | Fast pub/sub for WebSocket, reliable queues for pipeline, ephemeral cache |
| **AI - Detection** | YOLO26 | - | Real-time transformer detector, 30-50ms inference, COCO-trained |
| | PyTorch | 2.x | GPU acceleration, HuggingFace Transformers integration |
| **AI - Reasoning** | Nemotron-3-Nano-30B | Q4_K_M | NVIDIA v3 Nano 30B LLM, 128K context, ~14.7GB VRAM (dev: Mini 4B ~3GB) |
| | llama.cpp | - | Efficient inference, GGUF format, GPU offloading, HTTP API |
| **Containerization** | Docker Compose | 2.x | Multi-service orchestration, health checks, networking |
| **Monitoring** | Prometheus | 2.48 | Time-series metrics, optional monitoring stack |
| | Grafana | 10.2 | Dashboards for system monitoring |
---
## Component Responsibilities
### Backend Components
| Component | Location | Responsibility |
| ----------------------- | ------------------------- | ------------------------------------------------------------------- |
| **FastAPI App** | `backend/main.py` | HTTP/WebSocket server, middleware, lifespan management |
| **API Routes** | `backend/api/routes/` | REST endpoints for cameras, events, detections, system, media, logs |
| **Pydantic Schemas** | `backend/api/schemas/` | Request/response validation, OpenAPI documentation |
| **Middleware** | `backend/api/middleware/` | Authentication (optional), request ID propagation |
| **ORM Models** | `backend/models/` | SQLAlchemy models: Camera, Detection, Event, GPUStats, Log, APIKey |
| **Core Infrastructure** | `backend/core/` | Config, database, Redis, logging, metrics |
### Service Layer (AI Pipeline)
| Service | Location | Responsibility |
| ------------------------- | -------------------------------------------- | --------------------------------------------------------- |
| **FileWatcher** | `backend/services/file_watcher.py` | Monitor camera directories, debounce, queue new images |
| **DedupeService** | `backend/services/dedupe.py` | Prevent duplicate processing via content hashes |
| **DetectorClient** | `backend/services/detector_client.py` | HTTP client for YOLO26, store detections |
| **BatchAggregator** | `backend/services/batch_aggregator.py` | Group detections into time-windowed batches |
| **NemotronAnalyzer** | `backend/services/nemotron_analyzer.py` | LLM risk analysis, event creation |
| **ThumbnailGenerator** | `backend/services/thumbnail_generator.py` | Bounding box overlays, preview images |
| **EventBroadcaster** | `backend/services/event_broadcaster.py` | WebSocket event distribution via Redis pub/sub |
| **SystemBroadcaster** | `backend/services/system_broadcaster.py` | Periodic system status broadcasts |
| **GPUMonitor** | `backend/services/gpu_monitor.py` | NVIDIA GPU metrics via pynvml |
| **CleanupService** | `backend/services/cleanup_service.py` | Data retention enforcement |
| **HealthMonitor** | `backend/services/health_monitor.py` | Service health checks, auto-recovery |
| **RetryHandler** | `backend/services/retry_handler.py` | Exponential backoff, dead-letter queues |
| **PipelineWorkerManager** | `backend/services/pipeline_workers.py` | Background worker lifecycle management |
| **AlertEngine** | `backend/services/alert_engine.py` | Alert rule evaluation and notification triggering |
| **ZoneService** | `backend/services/zone_service.py` | Geographic zone management for detections |
| **BaselineService** | `backend/services/baseline.py` | Anomaly detection via activity baselines |
| **EnrichmentPipeline** | `backend/services/enrichment_pipeline.py` | Orchestrate multi-model detection enrichment |
| **EnrichmentClient** | `backend/services/enrichment_client.py` | HTTP client for enrichment API service |
| **FlorenceClient** | `backend/services/florence_client.py` | HTTP client for Florence-2 vision extraction |
| **CLIPClient** | `backend/services/clip_client.py` | HTTP client for CLIP re-identification |
| **PerformanceCollector** | `backend/services/performance_collector.py` | AI pipeline performance metrics collection |
| **PromptService** | `backend/services/prompt_service.py` | Dynamic prompt template management |
| **PromptVersionService** | `backend/services/prompt_version_service.py` | Prompt versioning and A/B testing support |
| **AuditService** | `backend/services/audit_logger.py` | Security audit logging and compliance tracking |
| **NotificationService** | `backend/services/notification.py` | Alert delivery via multiple channels |
| **SceneChangeDetector** | `backend/services/scene_change_detector.py` | Detect significant scene changes between frames |
| **SceneBaseline** | `backend/services/scene_baseline.py` | Maintain per-camera scene baselines for anomaly detection |
| **VideoProcessor** | `backend/services/video_processor.py` | Process and analyze video clips |
| **ReidService** | `backend/services/reid_service.py` | Person/entity re-identification across detections |
| **ContextEnricher** | `backend/services/context_enricher.py` | Add contextual metadata to detections |
| **CircuitBreaker** | `backend/services/circuit_breaker.py` | Protect services from cascading failures |
| **DegradationManager** | `backend/services/degradation_manager.py` | Graceful degradation during service failures |
| **CacheService** | `backend/services/cache_service.py` | Redis-based caching for frequently accessed data |
| **AlertDedupService** | `backend/services/alert_dedup.py` | Deduplicate repeated alerts |
| **SearchService** | `backend/services/search.py` | Full-text search across events and detections |
| **SeverityService** | `backend/services/severity.py` | Calculate and normalize severity scores |
| **ModelZoo** | `backend/services/model_zoo.py` | Manage optional ML model loading and inference |
| **VisionExtractor** | `backend/services/vision_extractor.py` | Extract visual features from detection images |
| **BBoxValidation** | `backend/services/bbox_validation.py` | Validate and normalize bounding box coordinates |
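As one illustration of the table, the content-hash deduplication that DedupeService performs can be sketched with an atomic Redis `SET NX` (a simplified sketch with hypothetical function names; per the table, the real service also consults the database):

```python
import hashlib


def content_hash(image_bytes: bytes) -> str:
    """Hash the raw image bytes so re-uploaded duplicates are detectable."""
    return hashlib.sha256(image_bytes).hexdigest()


def is_duplicate(redis_client, image_bytes: bytes, ttl_seconds: int = 3600) -> bool:
    """Return True if this exact image was already processed recently.

    SET with nx=True stores the key only if it is absent, so the
    check-and-mark is a single atomic operation; nx returns None when
    the key already existed (i.e. the image is a duplicate).
    """
    key = f"dedupe:{content_hash(image_bytes)}"
    return redis_client.set(key, 1, nx=True, ex=ttl_seconds) is None
```

The TTL bounds memory growth: an image only counts as a duplicate within the expiry window.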
### Frontend Components
| Component | Location | Responsibility |
| -------------------- | ------------------------------------ | ----------------------------------------------------- |
| **DashboardPage** | `frontend/src/components/dashboard/` | Main view with risk gauge, camera grid, activity feed |
| **EventTimeline** | `frontend/src/components/events/` | Chronological event list with filtering |
| **EventDetailModal** | `frontend/src/components/events/` | Full event details, detections, reasoning |
| **SettingsPage** | `frontend/src/components/settings/` | Camera management, AI status, processing config |
| **LogsDashboard** | `frontend/src/components/logs/` | System logs with filtering and statistics |
| **Layout** | `frontend/src/components/layout/` | Header, sidebar, navigation |
| **API Client** | `frontend/src/services/api.ts` | Type-safe REST API wrapper |
### Frontend Hooks
| Hook | Location | Responsibility |
| -------------------------- | ---------------------------------------------- | ---------------------------------------------------- |
| **useWebSocket** | `frontend/src/hooks/useWebSocket.ts` | Core WebSocket connection management |
| **WebSocketManager** | `frontend/src/hooks/webSocketManager.ts` | Singleton WebSocket instance with reconnection logic |
| **useEventStream** | `frontend/src/hooks/useEventStream.ts` | Subscribe to real-time security events |
| **useSystemStatus** | `frontend/src/hooks/useSystemStatus.ts` | Subscribe to system health broadcasts |
| **useWebSocketStatus** | `frontend/src/hooks/useWebSocketStatus.ts` | Track WebSocket connection state |
| **useConnectionStatus** | `frontend/src/hooks/useConnectionStatus.ts` | Combined API and WebSocket connection status |
| **useHealthStatus** | `frontend/src/hooks/useHealthStatus.ts` | Monitor backend service health |
| **useServiceStatus** | `frontend/src/hooks/useServiceStatus.ts` | Track individual AI service availability |
| **useAIMetrics** | `frontend/src/hooks/useAIMetrics.ts` | AI pipeline performance metrics (latency, accuracy) |
| **usePerformanceMetrics** | `frontend/src/hooks/usePerformanceMetrics.ts` | System performance metrics (CPU, memory, GPU) |
| **useGpuHistory** | `frontend/src/hooks/useGpuHistory.ts` | Historical GPU utilization data |
| **useStorageStats** | `frontend/src/hooks/useStorageStats.ts` | Storage usage and retention statistics |
| **useModelZooStatus** | `frontend/src/hooks/useModelZooStatus.ts` | Optional model zoo loading status |
| **useDetectionEnrichment** | `frontend/src/hooks/useDetectionEnrichment.ts` | Fetch enriched detection metadata |
| **useSavedSearches** | `frontend/src/hooks/useSavedSearches.ts` | Manage user-saved search filters |
| **useSidebarContext** | `frontend/src/hooks/useSidebarContext.ts` | Sidebar state management context |
### AI Services
| Service | Location | Responsibility |
| -------------------- | -------------------- | ---------------------------------------------------- |
| **YOLO26 Server** | `ai/yolo26/model.py` | Object detection inference, security-class filtering |
| **Nemotron LLM** | `ai/nemotron/` | Risk reasoning via llama.cpp server |
| **Florence-2** | `ai/florence/` | Optional vision extraction used by enrichment |
| **CLIP** | `ai/clip/` | Optional entity re-identification used by enrichment |
| **Enrichment API** | `ai/enrichment/` | Optional higher-level enrichment endpoint |
---
## Communication Patterns
### REST API
Used for: CRUD operations, data queries, configuration
| Endpoint Pattern | Methods | Purpose |
| ------------------------------ | ------------------ | -------------------------------- |
| `/api/cameras` | GET, POST | List/create cameras |
| `/api/cameras/{id}` | GET, PATCH, DELETE | Single camera operations |
| `/api/events` | GET | List events with filtering |
| `/api/events/{id}` | GET, PATCH | Get/update event (mark reviewed) |
| `/api/detections` | GET | List detections with filtering |
| `/api/system/health` | GET | Comprehensive health check |
| `/api/system/gpu` | GET | GPU statistics |
| `/api/media/thumbnails/{file}` | GET | Serve detection thumbnails |
| `/api/logs` | GET | List system logs |
| `/api/dlq/*` | GET, POST, DELETE | Dead-letter queue management |
| `/api/metrics` | GET | Prometheus metrics |
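A stdlib-only sketch of consuming the events endpoint (the `limit` query parameter and both helper names are assumptions, not part of the codebase; the backend's auto-generated OpenAPI docs are authoritative for the query schema):

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # backend port from the port summary


def fetch_events(limit: int = 50) -> list[dict]:
    """GET /api/events and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE}/api/events?limit={limit}") as resp:
        return json.loads(resp.read())


def filter_high_risk(events: list[dict], min_risk: int = 70) -> list[dict]:
    """Client-side filter on the risk_score field (0-100, from the LLM)."""
    return [e for e in events if e.get("risk_score", 0) >= min_risk]
```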
### WebSocket
Used for: Real-time updates without polling
| Channel | Endpoint | Purpose | Message Frequency |
| ------------ | ------------------------ | ---------------------------- | ----------------- |
| **Events** | `/ws/events` | Security event notifications | On event creation |
| **System** | `/ws/system` | System health and GPU stats | Every 5 seconds |
| **Job Logs** | `/ws/jobs/{job_id}/logs` | Real-time job log streaming | On log emission |
**Message Format:**
```json
{
  "type": "event",
  "data": {
    "id": 123,
    "camera_id": "front_door",
    "risk_score": 75,
    "risk_level": "high",
    "summary": "Person detected..."
  }
}
```
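A minimal client for the events channel might look like this (a sketch: the third-party `websockets` package and the `listen`/`parse_message` names are assumptions, not part of the codebase):

```python
import asyncio
import json

WS_URL = "ws://localhost:8000/ws/events"  # endpoint from the table above


def parse_message(raw: str) -> tuple[str, dict]:
    """Split an incoming frame into (type, data) per the message format above."""
    msg = json.loads(raw)
    return msg["type"], msg.get("data", {})


async def listen() -> None:
    """Print a one-line summary for each incoming security event."""
    import websockets  # imported lazily so the parser is usable without it

    async with websockets.connect(WS_URL) as ws:
        async for raw in ws:
            kind, data = parse_message(raw)
            if kind == "event":
                print(f"[{data['risk_level']}] {data['summary']}")
```

`asyncio.run(listen())` would run it against a live backend.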
### Redis Pub/Sub
Used for: Multi-instance WebSocket broadcasting
| Channel | Publisher | Subscribers | Purpose |
| ----------------- | ---------------- | ------------------- | ------------------------------------------ |
| `security_events` | NemotronAnalyzer | EventBroadcaster(s) | Distribute events to all backend instances |
### Redis Queues (Lists)
Used for: Reliable async job processing
| Queue | Producer | Consumer | Data |
| --------------------- | --------------- | -------------------- | -------------------------------------- |
| `detection_queue` | FileWatcher | DetectionQueueWorker | `{camera_id, file_path, timestamp}` |
| `analysis_queue` | BatchAggregator | AnalysisQueueWorker | `{batch_id, camera_id, detection_ids}` |
| `dlq:detection_queue` | RetryHandler | Manual/API | Failed detection jobs |
| `dlq:analysis_queue` | RetryHandler | Manual/API | Failed analysis jobs |
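The queue contract above can be sketched as follows (function names are illustrative; the real producer and consumer live in FileWatcher and the pipeline workers):

```python
import json

DETECTION_QUEUE = "detection_queue"  # queue name from the table above


def enqueue_image(redis_client, camera_id: str, file_path: str, ts: str) -> None:
    """What FileWatcher conceptually does for each validated new image."""
    job = {"camera_id": camera_id, "file_path": file_path, "timestamp": ts}
    redis_client.rpush(DETECTION_QUEUE, json.dumps(job))


def next_job(redis_client, timeout: int = 5):
    """Blocking pop, as used by DetectionQueueWorker; None on timeout."""
    popped = redis_client.blpop(DETECTION_QUEUE, timeout=timeout)
    if popped is None:
        return None
    _queue, raw = popped
    return json.loads(raw)
```

RPUSH/BLPOP gives at-most-once delivery per item with multiple competing consumers, which is why failures are handled separately by the retry/DLQ machinery.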
### HTTP (Internal Services)
Used for: AI inference requests
| Service | Endpoint | Method | Request | Response |
| -------- | ------------- | ------ | ----------------------------------- | ------------------------------------------- |
| YOLO26 | `/detect` | POST | Multipart image | `{detections: [{class, confidence, bbox}]}` |
| YOLO26 | `/health` | GET | - | `{status, model_loaded, cuda_available}` |
| Nemotron | `/completion` | POST | `{prompt, temperature, max_tokens}` | `{content: "..."}` |
| Nemotron | `/health` | GET | - | `{status: "ok"}` |
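The request/response shapes in the table can be exercised as below (stdlib only; `analyze_batch` and the confidence filter are illustrative helpers, and the temperature/max_tokens values are assumptions):

```python
import json
import urllib.request


def analyze_batch(prompt: str, url: str = "http://localhost:8091/completion") -> str:
    """POST a prompt to the Nemotron llama.cpp server, per the table above."""
    body = json.dumps({"prompt": prompt, "temperature": 0.2, "max_tokens": 512})
    req = urllib.request.Request(
        url, data=body.encode(), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]


def boxes_above_threshold(detect_response: dict, threshold: float = 0.5) -> list[dict]:
    """Filter a YOLO26 /detect response by confidence, mirroring
    DETECTION_CONFIDENCE_THRESHOLD from the configuration summary."""
    return [d for d in detect_response["detections"] if d["confidence"] >= threshold]
```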
---
## Deployment Topology

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Host["Host Machine (with GPU)"]
subgraph Docker["Docker Compose Network"]
FE["Frontend Container<br/>Node 22 Alpine<br/>Port 5173 (dev) / 80 (prod)"]
BE["Backend Container<br/>Python 3.14<br/>Port 8000"]
RD["Redis Container<br/>Redis 7 Alpine<br/>Port 6379"]
end
subgraph GPUContainers["Containerized GPU Services (CDI)"]
DET["YOLO26 Container<br/>PyTorch + Transformers<br/>Port 8095<br/>~3–4GB VRAM"]
FLO["Florence-2 Container<br/>Vision extraction (optional)<br/>Port 8092<br/>VRAM varies"]
CLIP["CLIP Container<br/>Re-ID (optional)<br/>Port 8093<br/>VRAM varies"]
ENR["Enrichment Container<br/>Model-zoo API (optional)<br/>Port 8094<br/>VRAM varies"]
LLM["Nemotron Container<br/>llama.cpp<br/>Port 8091<br/>~14.7GB VRAM*"]
end
subgraph Storage["Persistent Storage"]
VOL1["PostgreSQL<br/>database volume"]
VOL2["/export/foscam/<br/>Camera uploads (RO)"]
VOL3["redis_data<br/>Redis persistence"]
end
GPU["NVIDIA GPU<br/>RTX A5500 (24GB)<br/>CUDA 12.0+"]
end
FE <--> BE
BE <--> RD
BE -->|localhost:8095| DET
BE -->|localhost:8092 (optional)| FLO
BE -->|localhost:8093 (optional)| CLIP
BE -->|localhost:8094 (optional)| ENR
BE -->|localhost:8091| LLM
DET --> GPU
LLM --> GPU
BE --> VOL1
BE --> VOL2
RD --> VOL3
```
-->
### What Runs Where
| Component | Deployment | Why |
| -------------- | ------------------------------- | ---------------------------------------- |
| **Frontend** | Podman (dev: Vite, prod: Nginx) | No GPU needed, isolated environment |
| **Backend** | Podman | No GPU needed, isolated environment |
| **Redis** | Podman | No GPU needed, ephemeral data acceptable |
| **PostgreSQL** | Podman | Database isolation, volume persistence |
| **YOLO26** | Podman (GPU via CDI) | GPU access via NVIDIA Container Toolkit |
| **Nemotron** | Podman (GPU via CDI) | GPU access via NVIDIA Container Toolkit |
| **Florence-2** | Podman (GPU via CDI, optional) | Optional enrichment capability |
| **CLIP** | Podman (GPU via CDI, optional) | Optional enrichment capability |
| **Enrichment** | Podman (GPU via CDI, optional) | Optional enrichment capability |
### Port Summary
| Port | Service | Protocol | Exposed To |
| ---- | --------------- | -------- | ---------------------------- |
| 5173 | Frontend (dev) | HTTP | Browser |
| 80 | Frontend (prod) | HTTP | Browser |
| 8000 | Backend API | HTTP/WS | Browser, Frontend container |
| 6379 | Redis | TCP | Backend container only |
| 8095 | YOLO26 | HTTP | Backend container, localhost |
| 8092 | Florence-2 | HTTP | Backend container, localhost |
| 8093 | CLIP | HTTP | Backend container, localhost |
| 8094 | Enrichment | HTTP | Backend container, localhost |
| 8091 | Nemotron | HTTP | Backend container, localhost |
---
## Data Flow
### Complete Pipeline: Camera to Dashboard

<!--
Original Mermaid diagram preserved for reference:
```mermaid
sequenceDiagram
participant CAM as Foscam Camera
participant FTP as /export/foscam/
participant FW as FileWatcher
participant DQ as detection_queue
participant DW as DetectionWorker
participant DET as YOLO26
participant DB as PostgreSQL
participant EN as EnrichmentPipeline
participant BA as BatchAggregator
participant AQ as analysis_queue
participant AW as AnalysisWorker
participant LLM as Nemotron
participant EB as EventBroadcaster
participant WS as WebSocket
participant UI as Dashboard
CAM->>FTP: FTP upload image
FTP->>FW: watchdog event (file created)
FW->>FW: debounce (0.5s)
FW->>FW: validate image (PIL)
FW->>DQ: queue {camera_id, file_path}
DQ->>DW: BLPOP (blocking pop)
DW->>DW: dedupe check (Redis/DB)
DW->>DET: POST /detect (image)
DET->>DET: inference (30-50ms)
DET-->>DW: {detections: [...]}
DW->>DB: INSERT detections
opt Enrichment enabled
DW->>EN: enrich detections (context + optional services)
EN-->>DW: enriched attributes/entities
DW->>DB: UPDATE detection metadata
end
alt Fast Path (person > 90% confidence)
DW->>LLM: immediate analysis
LLM-->>DW: risk assessment
DW->>DB: INSERT event (is_fast_path=true)
DW->>EB: broadcast_event()
else Normal Path
DW->>BA: add_detection(camera_id, detection_id)
BA->>BA: check batch timeout (90s window / 30s idle)
BA->>AQ: queue {batch_id, detection_ids}
end
AQ->>AW: BLPOP
AW->>DB: SELECT detections WHERE id IN (...)
AW->>LLM: POST /completion (prompt)
LLM->>LLM: inference (2-5s)
LLM-->>AW: {risk_score, risk_level, summary, reasoning}
AW->>DB: INSERT event
AW->>EB: broadcast_event()
EB->>WS: send to connected clients
WS->>UI: {"type": "event", "data": {...}}
UI->>UI: update activity feed
```
-->
### Batching Logic
Why batch detections instead of analyzing each frame?
A single "person walks to door" scenario might generate 15 images over 30 seconds. Batching provides:
1. **Better context:** LLM sees the full sequence, not isolated frames
2. **Reduced API calls:** One LLM call per event, not per frame
3. **Coherent events:** User sees "Person approached door" not 15 separate alerts
**Batch timing:** a batch closes when its 90-second window expires or after 30 seconds with no new detections, whichever comes first (configurable via `BATCH_WINDOW_SECONDS` and `BATCH_IDLE_TIMEOUT_SECONDS`).
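The window/idle close conditions from the pipeline diagram (90 s window, 30 s idle) can be sketched as follows; this is a simplified single-batch sketch, whereas the real BatchAggregator tracks one batch per camera and enqueues batch IDs:

```python
BATCH_WINDOW_SECONDS = 90        # hard cap on batch age
BATCH_IDLE_TIMEOUT_SECONDS = 30  # close early when detections stop arriving


class Batch:
    """Accumulates detection IDs until either timeout condition fires."""

    def __init__(self, now: float):
        self.opened_at = now
        self.last_detection_at = now
        self.detection_ids: list[int] = []

    def add(self, detection_id: int, now: float) -> None:
        self.detection_ids.append(detection_id)
        self.last_detection_at = now

    def should_close(self, now: float) -> bool:
        window_expired = now - self.opened_at >= BATCH_WINDOW_SECONDS
        idle = now - self.last_detection_at >= BATCH_IDLE_TIMEOUT_SECONDS
        return window_expired or idle
```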
### Fast Path Flow
Critical detections can bypass batching for immediate alerts:

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
D[Detection]
D --> C{Confidence > 90%<br/>AND<br/>type = person?}
C -->|Yes| FP[Fast Path<br/>Immediate LLM analysis]
C -->|No| NP[Normal Path<br/>Add to batch]
FP --> E1[Event created<br/>is_fast_path=true]
NP --> B[Batch accumulates]
B --> T{Timeout?}
T -->|Yes| E2[Event created<br/>is_fast_path=false]
```
-->
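The decision diamond above reduces to a small predicate, using the thresholds from the configuration summary (the function name is illustrative):

```python
FAST_PATH_CONFIDENCE_THRESHOLD = 0.90
FAST_PATH_OBJECT_TYPES = {"person"}


def is_fast_path(object_type: str, confidence: float) -> bool:
    """True when a detection should bypass batching for immediate analysis.

    Strictly greater-than, matching the "> 90%" condition in the diagram.
    """
    return (
        object_type in FAST_PATH_OBJECT_TYPES
        and confidence > FAST_PATH_CONFIDENCE_THRESHOLD
    )
```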
---
## Database Schema
```mermaid
erDiagram
cameras ||--o{ detections : "has"
cameras ||--o{ events : "has"
cameras {
string id PK "UUID"
string name "Human-readable name"
string folder_path "FTP upload path"
string status "online/offline/error"
datetime created_at
datetime last_seen_at
}
detections {
int id PK "Auto-increment"
string camera_id FK
string file_path "Original image"
datetime detected_at
string object_type "person/car/dog/etc"
float confidence "0.0-1.0"
int bbox_x "Bounding box"
int bbox_y
int bbox_width
int bbox_height
string thumbnail_path "With boxes drawn"
}
events {
int id PK "Auto-increment"
string batch_id "Groups detections"
string camera_id FK
datetime started_at
datetime ended_at
int risk_score "0-100 from LLM"
string risk_level "low/medium/high/critical"
text summary "LLM summary"
text reasoning "LLM explanation"
text detection_ids "JSON array"
bool reviewed "User marked"
text notes "User notes"
bool is_fast_path "Bypassed batching"
}
gpu_stats {
int id PK
datetime recorded_at
float gpu_utilization "0-100%"
int memory_used "MB"
int memory_total "MB"
float temperature "Celsius"
float inference_fps
}
logs {
int id PK
datetime timestamp
string level "DEBUG/INFO/WARNING/ERROR/CRITICAL"
string component "Logger name"
text message
string camera_id "Nullable"
int event_id "Nullable"
string request_id "Correlation ID"
int detection_id "Nullable"
int duration_ms "Nullable"
json extra "Additional context"
string source "backend/frontend"
}
api_keys {
int id PK
string key_hash "SHA-256"
string name
datetime created_at
bool is_active
}
```
### Key Indexes
| Table | Index | Purpose |
| ---------- | --------------------------- | ---------------------------- |
| detections | (camera_id, detected_at) | Camera-specific time queries |
| events | started_at | Timeline queries |
| events | risk_score | High-risk filtering |
| events | reviewed | Workflow queries |
| gpu_stats | recorded_at | Time-series queries |
| logs | timestamp, level, component | Dashboard filters |
---
## Component Interaction Diagram

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Frontend["Frontend (React)"]
DASH[DashboardPage]
TL[EventTimeline]
SET[SettingsPage]
AIP[AIPerformancePage]
subgraph Hooks["Custom Hooks"]
HWS[useWebSocket]
HES[useEventStream]
HSS[useSystemStatus]
HAI[useAIMetrics]
HPM[usePerformanceMetrics]
HHS[useHealthStatus]
HSV[useServiceStatus]
HCS[useConnectionStatus]
HMZ[useModelZooStatus]
end
subgraph Services["Services"]
API[api.ts]
LOG[logger.ts]
end
end
subgraph Backend["Backend (FastAPI)"]
subgraph Routes["API Routes"]
RC[/cameras]
RE[/events]
RD[/detections]
RS[/system]
RM[/media]
RW[/ws/*]
end
subgraph Pipeline["AI Pipeline"]
FW[FileWatcher]
DC[DetectorClient]
BA[BatchAggregator]
NA[NemotronAnalyzer]
TG[ThumbnailGenerator]
EP[EnrichmentPipeline]
FC[FlorenceClient]
CC[CLIPClient]
end
subgraph Workers["Background Workers"]
DW[DetectionWorker]
AW[AnalysisWorker]
BW[BatchTimeoutWorker]
QW[QueueMetricsWorker]
end
subgraph Background["Background Services"]
GPU[GPUMonitor]
CL[CleanupService]
HM[HealthMonitor]
EB[EventBroadcaster]
SB[SystemBroadcaster]
PC[PerformanceCollector]
CB[CircuitBreaker]
DM[DegradationManager]
end
end
subgraph External["External Services"]
REDIS[(Redis)]
POSTGRES[(PostgreSQL)]
YOLO26[YOLO26]
NEMOTRON[Nemotron LLM]
FLORENCE[Florence-2]
CLIP[CLIP]
end
%% Frontend connections
DASH --> HES
DASH --> HSS
AIP --> HAI
AIP --> HPM
TL --> API
SET --> API
SET --> HHS
SET --> HSV
HES --> HWS
HSS --> HWS
HAI --> HWS
HPM --> API
HWS --> RW
API --> RC & RE & RD & RS & RM
%% Backend internal
FW --> REDIS
DW --> DC
DC --> YOLO26
DC --> POSTGRES
DW --> EP
EP --> FC
EP --> CC
FC --> FLORENCE
CC --> CLIP
DW --> BA
BA --> REDIS
AW --> NA
NA --> NEMOTRON
NA --> POSTGRES
NA --> EB
TG --> POSTGRES
GPU --> POSTGRES
GPU --> SB
PC --> POSTGRES
CL --> POSTGRES
EB --> REDIS
SB --> RW
EB --> RW
CB --> DM
%% Routes to DB
RC & RE & RD & RS --> POSTGRES
RM --> POSTGRES
```
-->
---
## Error Handling and Resilience
### Graceful Degradation
| Component | Failure Mode | Fallback Behavior |
| ---------- | ------------- | ----------------------------------------------------------------- |
| YOLO26 | Unreachable | DetectorClient returns empty list, skips detection |
| Nemotron | Unreachable | NemotronAnalyzer returns default risk (50, medium) |
| Florence-2 | Unreachable | EnrichmentPipeline skips vision extraction |
| CLIP | Unreachable | EnrichmentPipeline skips re-identification |
| Redis | Unreachable | Deduplication fails open (allows processing), pub/sub unavailable |
| GPU | Not available | GPUMonitor returns mock data |
| Enrichment | Unreachable | EnrichmentClient returns empty enrichment, continues processing |
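The fail-open behaviour in the first row might look like this (a sketch; `client.detect` is a hypothetical coroutine standing in for DetectorClient's HTTP call):

```python
import logging

logger = logging.getLogger("detector_client")


async def detect_safe(client, image_bytes: bytes) -> list[dict]:
    """Return detections, or an empty list if YOLO26 is unreachable.

    Swallowing the error keeps the pipeline moving: the image is simply
    treated as containing no detections rather than crashing the worker.
    """
    try:
        return await client.detect(image_bytes)
    except Exception as exc:  # connection refused, timeout, bad response, ...
        logger.warning("YOLO26 unreachable, skipping detection: %s", exc)
        return []
```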
### Circuit Breaker Pattern
The `CircuitBreaker` service (`backend/services/circuit_breaker.py`) protects against cascading failures:
| State | Behavior |
| ---------- | -------------------------------------------------------------------- |
| **Closed** | Normal operation, requests pass through |
| **Open** | All requests immediately fail, prevents overwhelming failing service |
| **Half-Open** | Limited requests allowed to test whether the service has recovered |
### Degradation Manager
The `DegradationManager` service (`backend/services/degradation_manager.py`) coordinates graceful degradation:
- Monitors service health across all AI components
- Automatically disables non-critical enrichment features when resources are constrained
- Prioritizes core detection and risk analysis over optional enhancements
- Broadcasts degradation status changes via WebSocket
### Retry and Dead-Letter Queues
```mermaid
flowchart TB
JOB[Job]
JOB --> W[Worker]
W --> P{Processing}
P -->|Success| DONE[Complete]
P -->|Fail| R{Retries < 3?}
R -->|Yes| BACK[Exponential Backoff]
BACK --> W
R -->|No| DLQ[Dead Letter Queue]
DLQ --> API["/api/dlq/*"]
API --> REQUEUE[Manual Requeue]
REQUEUE --> W
```
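The retry/DLQ flow above can be sketched as follows (names and DLQ payload shape are illustrative; the real RetryHandler is async and records richer failure metadata):

```python
import json
import time


def process_with_retry(job: dict, handler, redis_client,
                       dlq_key: str = "dlq:detection_queue",
                       max_retries: int = 3, base_delay: float = 1.0) -> bool:
    """Try a job up to max_retries times with exponential backoff, then
    park it on a dead-letter queue for manual requeue via /api/dlq/*."""
    delay = base_delay
    last_error = ""
    for _ in range(max_retries):
        try:
            handler(job)
            return True
        except Exception as exc:
            last_error = str(exc)
            time.sleep(delay)  # back off before the next attempt
            delay *= 2         # 1s, 2s, 4s, ...
    redis_client.rpush(dlq_key, json.dumps({"job": job, "error": last_error}))
    return False
```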
### Health Monitoring
The `HealthMonitor` service:
1. Periodically checks service health (YOLO26, Nemotron, Redis)
2. On failure, attempts restart with exponential backoff
3. Broadcasts status changes via WebSocket
4. Gives up after max retries (prevents infinite restart loops)
---
## Security Model
### Production Hardening (Recommended)
- Enable `API_KEY_ENABLED=true` with strong keys
- Use HTTPS for AI service endpoints
- Restrict CORS origins
- Add rate limiting
- Run behind reverse proxy with TLS
- Review `docs/ROADMAP.md` security hardening section
---
## Performance Characteristics
| Operation | Typical Latency | Notes |
| ------------------------- | --------------- | ----------------------------------- |
| YOLO26 inference | 30-50ms | Per image, on RTX A5500 |
| Nemotron analysis | 2-5s | Per batch, depends on prompt length |
| WebSocket broadcast | <10ms | Redis pub/sub to clients |
| Database query | <5ms | PostgreSQL with proper indexes |
| Full pipeline (fast path) | ~3-6s | Camera to dashboard notification |
| Full pipeline (batched) | 30-120s | Depends on batch timeout settings |
### Resource Usage
**Host resource usage** (GPU VRAM per service is listed in the deployment topology above):
| Resource | Typical Usage |
| ------------ | ------------- |
| Backend RAM | ~500MB |
| Frontend RAM | ~100MB |
| Redis RAM | ~50MB |
---
## Configuration Summary
See `docs/reference/config/env-reference.md` for complete reference.
**Key environment variables:**
```bash
# Database and Redis (PostgreSQL required)
DATABASE_URL=postgresql+asyncpg://security:password@localhost:5432/security # pragma: allowlist secret
REDIS_URL=redis://localhost:6379/0
# AI Services
YOLO26_URL=http://localhost:8095
NEMOTRON_URL=http://localhost:8091
FLORENCE_URL=http://localhost:8092
CLIP_URL=http://localhost:8093
ENRICHMENT_URL=http://localhost:8094
# Optional enrichment feature toggles (see docs/reference/config/env-reference.md for authoritative list)
VISION_EXTRACTION_ENABLED=true
REID_ENABLED=true
SCENE_CHANGE_ENABLED=true
# Detection
DETECTION_CONFIDENCE_THRESHOLD=0.5
FAST_PATH_CONFIDENCE_THRESHOLD=0.90
FAST_PATH_OBJECT_TYPES=["person"]
# Batching
BATCH_WINDOW_SECONDS=90
BATCH_IDLE_TIMEOUT_SECONDS=30
# Retention
RETENTION_DAYS=30
```

## Related Documentation
| Document | Purpose |
|---|---|
| `docs/reference/config/env-reference.md` | Complete environment variable reference |
| `docs/operator/deployment/` | Docker deployment guide |
| `docs/operator/ai-installation.md` | AI services setup and troubleshooting |
| `docs/ROADMAP.md` | Post-MVP enhancement ideas |
| `backend/AGENTS.md` | Backend architecture details |
| `frontend/AGENTS.md` | Frontend architecture details |
| `ai/AGENTS.md` | AI pipeline details |
This document provides a comprehensive overview of the Home Security Intelligence system architecture. For implementation details, refer to the source code and component-specific AGENTS.md files.