# Architecture Overview

AI-generated visualization of the full system architecture showing Frontend, Backend, AI Services, and GPU tiers.
**Last Updated:** 2026-01-04
**Target Audience:** Future maintainers, technical contributors
## System Purpose
Home Security Intelligence transforms commodity IP cameras into an intelligent threat detection system. Rather than simply alerting when motion is detected, the system uses AI to understand what is happening and why it might be concerning.
Problem solved: Traditional security cameras generate endless motion alerts with no context. A person walking their dog and a stranger approaching at 2 AM both trigger the same notification. This system provides contextual risk assessment using AI, turning raw camera feeds into actionable security intelligence.
Key value proposition:
- Contextual alerts: Not just "person detected" but "unfamiliar person approaching back entrance at 2 AM, risk: high"
- Batch reasoning: Groups multiple detections into coherent events for better context
- Fast path: High-confidence critical detections bypass batching for immediate alerts
- Local processing: All AI inference runs locally on your hardware, no cloud dependencies
## High-Level Architecture

The system is organized into four layers:
- Camera Layer: Foscam IP cameras upload images via FTP
- Application Layer: React frontend + FastAPI backend
- GPU Services: 5 AI inference containers (YOLO26, Nemotron, Florence-2, CLIP, Enrichment)
- Data Layer: PostgreSQL, Redis, and filesystem storage
## Detailed System Architecture Diagram
<!-- Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Cameras["Camera Layer"]
CAM1[Foscam Camera 1]
CAM2[Foscam Camera 2]
CAM3[Foscam Camera N]
end
subgraph FTP["FTP Upload"]
FTPS["/export/foscam/{camera}/"]
end
subgraph Docker["Docker Services"]
direction TB
FE["Frontend<br/>React + Vite<br/>:5173"]
BE["Backend<br/>FastAPI<br/>:8000"]
RD["Redis<br/>:6379"]
end
subgraph GPU["Containerized GPU Services"]
DET["YOLO26<br/>Object Detection<br/>:8095"]
FLO["Florence-2<br/>Vision extraction (optional)<br/>:8092"]
CLIP["CLIP<br/>Re-identification (optional)<br/>:8093"]
ENR["Enrichment API<br/>Model-zoo enrichment (optional)<br/>:8094"]
LLM["Nemotron LLM<br/>Risk Analysis<br/>:8091"]
end
subgraph Storage["Persistent Storage"]
DB[(PostgreSQL<br/>security)]
FS[("Filesystem<br/>thumbnails/")]
end
CAM1 & CAM2 & CAM3 -->|FTP Upload| FTPS
FTPS -->|FileWatcher| BE
BE <-->|Queues & Pub/Sub| RD
BE -->|HTTP /detect| DET
BE -->|HTTP (optional)| FLO
BE -->|HTTP (optional)| CLIP
BE -->|HTTP (optional)| ENR
BE -->|HTTP /completion| LLM
BE <-->|SQLAlchemy| DB
BE -->|PIL/Pillow| FS
FE <-->|REST API| BE
FE <-->|WebSocket| BE
```
-->
---
## Technology Stack
| Layer | Technology | Version | Why This Choice |
| -------------------- | ---------------- | ------- | ------------------------------------------------------------------------------ |
| **Frontend** | React | 18.2 | Industry standard, excellent ecosystem, component model fits dashboard UI |
| | TypeScript | 5.3 | Type safety catches bugs early, better IDE support, self-documenting code |
| | Tailwind CSS | 3.4 | Utility-first approach, dark theme customization, responsive design |
| | Tremor | 3.17 | Pre-built data visualization components (charts, gauges) for dashboards |
| | Vite | 5.0 | Fast dev server with HMR, modern bundling, excellent DX |
| **Backend** | Python | 3.14+ | AI/ML ecosystem (PyTorch, transformers), async support, rapid development |
| | FastAPI | 0.104+ | Modern async framework, automatic OpenAPI docs, type hints, WebSocket support |
| | SQLAlchemy | 2.0 | Async ORM, excellent PostgreSQL support, type-safe queries with `Mapped` hints |
| | Pydantic | 2.0 | Request validation, settings management, schema generation |
| **Database** | PostgreSQL | 15+ | Concurrent writes, JSONB, full-text search, proper transaction isolation |
| | Redis | 7.x | Fast pub/sub for WebSocket, reliable queues for pipeline, ephemeral cache |
| **AI - Detection** | YOLO26 | - | Real-time transformer detector, 30-50ms inference, COCO-trained |
| | PyTorch | 2.x | GPU acceleration, HuggingFace Transformers integration |
| **AI - Reasoning** | Nemotron-3-Nano-30B | Q4_K_M | NVIDIA v3 Nano 30B LLM, 128K context, ~14.7GB VRAM (dev: Mini 4B ~3GB) |
| | llama.cpp | - | Efficient inference, GGUF format, GPU offloading, HTTP API |
| **Containerization** | Docker Compose | 2.x | Multi-service orchestration, health checks, networking |
| **Monitoring** | Prometheus | 2.48 | Time-series metrics, optional monitoring stack |
| | Grafana | 10.2 | Dashboards for system monitoring |
---
## Component Responsibilities
### Backend Components
| Component | Location | Responsibility |
| ----------------------- | ------------------------- | ------------------------------------------------------------------- |
| **FastAPI App** | `backend/main.py` | HTTP/WebSocket server, middleware, lifespan management |
| **API Routes** | `backend/api/routes/` | REST endpoints for cameras, events, detections, system, media, logs |
| **Pydantic Schemas** | `backend/api/schemas/` | Request/response validation, OpenAPI documentation |
| **Middleware** | `backend/api/middleware/` | Authentication (optional), request ID propagation |
| **ORM Models** | `backend/models/` | SQLAlchemy models: Camera, Detection, Event, GPUStats, Log, APIKey |
| **Core Infrastructure** | `backend/core/` | Config, database, Redis, logging, metrics |
### Service Layer (AI Pipeline)
| Service | Location | Responsibility |
| ------------------------- | -------------------------------------------- | --------------------------------------------------------- |
| **FileWatcher** | `backend/services/file_watcher.py` | Monitor camera directories, debounce, queue new images |
| **DedupeService** | `backend/services/dedupe.py` | Prevent duplicate processing via content hashes |
| **DetectorClient** | `backend/services/detector_client.py` | HTTP client for YOLO26, store detections |
| **BatchAggregator** | `backend/services/batch_aggregator.py` | Group detections into time-windowed batches |
| **NemotronAnalyzer** | `backend/services/nemotron_analyzer.py` | LLM risk analysis, event creation |
| **ThumbnailGenerator** | `backend/services/thumbnail_generator.py` | Bounding box overlays, preview images |
| **EventBroadcaster** | `backend/services/event_broadcaster.py` | WebSocket event distribution via Redis pub/sub |
| **SystemBroadcaster** | `backend/services/system_broadcaster.py` | Periodic system status broadcasts |
| **GPUMonitor** | `backend/services/gpu_monitor.py` | NVIDIA GPU metrics via pynvml |
| **CleanupService** | `backend/services/cleanup_service.py` | Data retention enforcement |
| **HealthMonitor** | `backend/services/health_monitor.py` | Service health checks, auto-recovery |
| **RetryHandler** | `backend/services/retry_handler.py` | Exponential backoff, dead-letter queues |
| **PipelineWorkerManager** | `backend/services/pipeline_workers.py` | Background worker lifecycle management |
| **AlertEngine** | `backend/services/alert_engine.py` | Alert rule evaluation and notification triggering |
| **ZoneService** | `backend/services/zone_service.py` | Geographic zone management for detections |
| **BaselineService** | `backend/services/baseline.py` | Anomaly detection via activity baselines |
| **EnrichmentPipeline** | `backend/services/enrichment_pipeline.py` | Orchestrate multi-model detection enrichment |
| **EnrichmentClient** | `backend/services/enrichment_client.py` | HTTP client for enrichment API service |
| **FlorenceClient** | `backend/services/florence_client.py` | HTTP client for Florence-2 vision extraction |
| **CLIPClient** | `backend/services/clip_client.py` | HTTP client for CLIP re-identification |
| **PerformanceCollector** | `backend/services/performance_collector.py` | AI pipeline performance metrics collection |
| **PromptService** | `backend/services/prompt_service.py` | Dynamic prompt template management |
| **PromptVersionService** | `backend/services/prompt_version_service.py` | Prompt versioning and A/B testing support |
| **AuditService** | `backend/services/audit_logger.py` | Security audit logging and compliance tracking |
| **NotificationService** | `backend/services/notification.py` | Alert delivery via multiple channels |
| **SceneChangeDetector** | `backend/services/scene_change_detector.py` | Detect significant scene changes between frames |
| **SceneBaseline** | `backend/services/scene_baseline.py` | Maintain per-camera scene baselines for anomaly detection |
| **VideoProcessor** | `backend/services/video_processor.py` | Process and analyze video clips |
| **ReidService** | `backend/services/reid_service.py` | Person/entity re-identification across detections |
| **ContextEnricher** | `backend/services/context_enricher.py` | Add contextual metadata to detections |
| **CircuitBreaker** | `backend/services/circuit_breaker.py` | Protect services from cascading failures |
| **DegradationManager** | `backend/services/degradation_manager.py` | Graceful degradation during service failures |
| **CacheService** | `backend/services/cache_service.py` | Redis-based caching for frequently accessed data |
| **AlertDedupService** | `backend/services/alert_dedup.py` | Deduplicate repeated alerts |
| **SearchService** | `backend/services/search.py` | Full-text search across events and detections |
| **SeverityService** | `backend/services/severity.py` | Calculate and normalize severity scores |
| **ModelZoo** | `backend/services/model_zoo.py` | Manage optional ML model loading and inference |
| **VisionExtractor** | `backend/services/vision_extractor.py` | Extract visual features from detection images |
| **BBoxValidation** | `backend/services/bbox_validation.py` | Validate and normalize bounding box coordinates |
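As one illustration of the table, the content-hash deduplication that DedupeService performs can be sketched with an atomic Redis `SET NX` (a simplified sketch with hypothetical function names; per the table, the real service also consults the database):

```python
import hashlib


def content_hash(image_bytes: bytes) -> str:
    """Hash the raw image bytes so re-uploaded duplicates are detectable."""
    return hashlib.sha256(image_bytes).hexdigest()


def is_duplicate(redis_client, image_bytes: bytes, ttl_seconds: int = 3600) -> bool:
    """Return True if this exact image was already processed recently.

    SET with nx=True stores the key only if it is absent, so the
    check-and-mark is a single atomic operation; nx returns None when
    the key already existed (i.e. the image is a duplicate).
    """
    key = f"dedupe:{content_hash(image_bytes)}"
    return redis_client.set(key, 1, nx=True, ex=ttl_seconds) is None
```

The TTL bounds memory growth: an image only counts as a duplicate within the expiry window.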
### Frontend Components
| Component | Location | Responsibility |
| -------------------- | ------------------------------------ | ----------------------------------------------------- |
| **DashboardPage** | `frontend/src/components/dashboard/` | Main view with risk gauge, camera grid, activity feed |
| **EventTimeline** | `frontend/src/components/events/` | Chronological event list with filtering |
| **EventDetailModal** | `frontend/src/components/events/` | Full event details, detections, reasoning |
| **SettingsPage** | `frontend/src/components/settings/` | Camera management, AI status, processing config |
| **LogsDashboard** | `frontend/src/components/logs/` | System logs with filtering and statistics |
| **Layout** | `frontend/src/components/layout/` | Header, sidebar, navigation |
| **API Client** | `frontend/src/services/api.ts` | Type-safe REST API wrapper |
### Frontend Hooks
| Hook | Location | Responsibility |
| -------------------------- | ---------------------------------------------- | ---------------------------------------------------- |
| **useWebSocket** | `frontend/src/hooks/useWebSocket.ts` | Core WebSocket connection management |
| **WebSocketManager** | `frontend/src/hooks/webSocketManager.ts` | Singleton WebSocket instance with reconnection logic |
| **useEventStream** | `frontend/src/hooks/useEventStream.ts` | Subscribe to real-time security events |
| **useSystemStatus** | `frontend/src/hooks/useSystemStatus.ts` | Subscribe to system health broadcasts |
| **useWebSocketStatus** | `frontend/src/hooks/useWebSocketStatus.ts` | Track WebSocket connection state |
| **useConnectionStatus** | `frontend/src/hooks/useConnectionStatus.ts` | Combined API and WebSocket connection status |
| **useHealthStatus** | `frontend/src/hooks/useHealthStatus.ts` | Monitor backend service health |
| **useServiceStatus** | `frontend/src/hooks/useServiceStatus.ts` | Track individual AI service availability |
| **useAIMetrics** | `frontend/src/hooks/useAIMetrics.ts` | AI pipeline performance metrics (latency, accuracy) |
| **usePerformanceMetrics** | `frontend/src/hooks/usePerformanceMetrics.ts` | System performance metrics (CPU, memory, GPU) |
| **useGpuHistory** | `frontend/src/hooks/useGpuHistory.ts` | Historical GPU utilization data |
| **useStorageStats** | `frontend/src/hooks/useStorageStats.ts` | Storage usage and retention statistics |
| **useModelZooStatus** | `frontend/src/hooks/useModelZooStatus.ts` | Optional model zoo loading status |
| **useDetectionEnrichment** | `frontend/src/hooks/useDetectionEnrichment.ts` | Fetch enriched detection metadata |
| **useSavedSearches** | `frontend/src/hooks/useSavedSearches.ts` | Manage user-saved search filters |
| **useSidebarContext** | `frontend/src/hooks/useSidebarContext.ts` | Sidebar state management context |
### AI Services
| Service | Location | Responsibility |
| -------------------- | -------------------- | ---------------------------------------------------- |
| **YOLO26 Server** | `ai/yolo26/model.py` | Object detection inference, security-class filtering |
| **Nemotron LLM** | `ai/nemotron/` | Risk reasoning via llama.cpp server |
| **Florence-2** | `ai/florence/` | Optional vision extraction used by enrichment |
| **CLIP** | `ai/clip/` | Optional entity re-identification used by enrichment |
| **Enrichment API** | `ai/enrichment/` | Optional higher-level enrichment endpoint |
---
## Communication Patterns
### REST API
Used for: CRUD operations, data queries, configuration
| Endpoint Pattern | Methods | Purpose |
| ------------------------------ | ------------------ | -------------------------------- |
| `/api/cameras` | GET, POST | List/create cameras |
| `/api/cameras/{id}` | GET, PATCH, DELETE | Single camera operations |
| `/api/events` | GET | List events with filtering |
| `/api/events/{id}` | GET, PATCH | Get/update event (mark reviewed) |
| `/api/detections` | GET | List detections with filtering |
| `/api/system/health` | GET | Comprehensive health check |
| `/api/system/gpu` | GET | GPU statistics |
| `/api/media/thumbnails/{file}` | GET | Serve detection thumbnails |
| `/api/logs` | GET | List system logs |
| `/api/dlq/*` | GET, POST, DELETE | Dead-letter queue management |
| `/api/metrics` | GET | Prometheus metrics |
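A stdlib-only sketch of consuming the events endpoint (the `limit` query parameter and both helper names are assumptions, not part of the codebase; the backend's auto-generated OpenAPI docs are authoritative for the query schema):

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # backend port from the port summary


def fetch_events(limit: int = 50) -> list[dict]:
    """GET /api/events and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE}/api/events?limit={limit}") as resp:
        return json.loads(resp.read())


def filter_high_risk(events: list[dict], min_risk: int = 70) -> list[dict]:
    """Client-side filter on the risk_score field (0-100, from the LLM)."""
    return [e for e in events if e.get("risk_score", 0) >= min_risk]
```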
### WebSocket
Used for: Real-time updates without polling
| Channel | Endpoint | Purpose | Message Frequency |
| ------------ | ------------------------ | ---------------------------- | ----------------- |
| **Events** | `/ws/events` | Security event notifications | On event creation |
| **System** | `/ws/system` | System health and GPU stats | Every 5 seconds |
| **Job Logs** | `/ws/jobs/{job_id}/logs` | Real-time job log streaming | On log emission |
**Message Format:**
```json
{
  "type": "event",
  "data": {
    "id": 123,
    "camera_id": "front_door",
    "risk_score": 75,
    "risk_level": "high",
    "summary": "Person detected..."
  }
}
```
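A minimal client for the events channel might look like this (a sketch: the third-party `websockets` package and the `listen`/`parse_message` names are assumptions, not part of the codebase):

```python
import asyncio
import json

WS_URL = "ws://localhost:8000/ws/events"  # endpoint from the table above


def parse_message(raw: str) -> tuple[str, dict]:
    """Split an incoming frame into (type, data) per the message format above."""
    msg = json.loads(raw)
    return msg["type"], msg.get("data", {})


async def listen() -> None:
    """Print a one-line summary for each incoming security event."""
    import websockets  # imported lazily so the parser is usable without it

    async with websockets.connect(WS_URL) as ws:
        async for raw in ws:
            kind, data = parse_message(raw)
            if kind == "event":
                print(f"[{data['risk_level']}] {data['summary']}")
```

`asyncio.run(listen())` would run it against a live backend.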
### Redis Pub/Sub
Used for: Multi-instance WebSocket broadcasting
| Channel | Publisher | Subscribers | Purpose |
| ----------------- | ---------------- | ------------------- | ------------------------------------------ |
| `security_events` | NemotronAnalyzer | EventBroadcaster(s) | Distribute events to all backend instances |
### Redis Queues (Lists)
Used for: Reliable async job processing
| Queue | Producer | Consumer | Data |
| --------------------- | --------------- | -------------------- | -------------------------------------- |
| `detection_queue` | FileWatcher | DetectionQueueWorker | `{camera_id, file_path, timestamp}` |
| `analysis_queue` | BatchAggregator | AnalysisQueueWorker | `{batch_id, camera_id, detection_ids}` |
| `dlq:detection_queue` | RetryHandler | Manual/API | Failed detection jobs |
| `dlq:analysis_queue` | RetryHandler | Manual/API | Failed analysis jobs |
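The queue contract above can be sketched as follows (function names are illustrative; the real producer and consumer live in FileWatcher and the pipeline workers):

```python
import json

DETECTION_QUEUE = "detection_queue"  # queue name from the table above


def enqueue_image(redis_client, camera_id: str, file_path: str, ts: str) -> None:
    """What FileWatcher conceptually does for each validated new image."""
    job = {"camera_id": camera_id, "file_path": file_path, "timestamp": ts}
    redis_client.rpush(DETECTION_QUEUE, json.dumps(job))


def next_job(redis_client, timeout: int = 5):
    """Blocking pop, as used by DetectionQueueWorker; None on timeout."""
    popped = redis_client.blpop(DETECTION_QUEUE, timeout=timeout)
    if popped is None:
        return None
    _queue, raw = popped
    return json.loads(raw)
```

RPUSH/BLPOP gives at-most-once delivery per item with multiple competing consumers, which is why failures are handled separately by the retry/DLQ machinery.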
### HTTP (Internal Services)
Used for: AI inference requests
| Service | Endpoint | Method | Request | Response |
| -------- | ------------- | ------ | ----------------------------------- | ------------------------------------------- |
| YOLO26 | `/detect` | POST | Multipart image | `{detections: [{class, confidence, bbox}]}` |
| YOLO26 | `/health` | GET | - | `{status, model_loaded, cuda_available}` |
| Nemotron | `/completion` | POST | `{prompt, temperature, max_tokens}` | `{content: "..."}` |
| Nemotron | `/health` | GET | - | `{status: "ok"}` |
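The request/response shapes in the table can be exercised as below (stdlib only; `analyze_batch` and the confidence filter are illustrative helpers, and the temperature/max_tokens values are assumptions):

```python
import json
import urllib.request


def analyze_batch(prompt: str, url: str = "http://localhost:8091/completion") -> str:
    """POST a prompt to the Nemotron llama.cpp server, per the table above."""
    body = json.dumps({"prompt": prompt, "temperature": 0.2, "max_tokens": 512})
    req = urllib.request.Request(
        url, data=body.encode(), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]


def boxes_above_threshold(detect_response: dict, threshold: float = 0.5) -> list[dict]:
    """Filter a YOLO26 /detect response by confidence, mirroring
    DETECTION_CONFIDENCE_THRESHOLD from the configuration summary."""
    return [d for d in detect_response["detections"] if d["confidence"] >= threshold]
```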
---
## Deployment Topology

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Host["Host Machine (with GPU)"]
subgraph Docker["Docker Compose Network"]
FE["Frontend Container<br/>Node 22 Alpine<br/>Port 5173 (dev) / 80 (prod)"]
BE["Backend Container<br/>Python 3.14<br/>Port 8000"]
RD["Redis Container<br/>Redis 7 Alpine<br/>Port 6379"]
end
subgraph GPUContainers["Containerized GPU Services (CDI)"]
DET["YOLO26 Container<br/>PyTorch + Transformers<br/>Port 8095<br/>~3–4GB VRAM"]
FLO["Florence-2 Container<br/>Vision extraction (optional)<br/>Port 8092<br/>VRAM varies"]
CLIP["CLIP Container<br/>Re-ID (optional)<br/>Port 8093<br/>VRAM varies"]
ENR["Enrichment Container<br/>Model-zoo API (optional)<br/>Port 8094<br/>VRAM varies"]
LLM["Nemotron Container<br/>llama.cpp<br/>Port 8091<br/>~14.7GB VRAM*"]
end
subgraph Storage["Persistent Storage"]
VOL1["PostgreSQL<br/>database volume"]
VOL2["/export/foscam/<br/>Camera uploads (RO)"]
VOL3["redis_data<br/>Redis persistence"]
end
GPU["NVIDIA GPU<br/>RTX A5500 (24GB)<br/>CUDA 12.0+"]
end
FE <--> BE
BE <--> RD
BE -->|localhost:8095| DET
BE -->|localhost:8092 (optional)| FLO
BE -->|localhost:8093 (optional)| CLIP
BE -->|localhost:8094 (optional)| ENR
BE -->|localhost:8091| LLM
DET --> GPU
LLM --> GPU
BE --> VOL1
BE --> VOL2
RD --> VOL3
```
-->
### What Runs Where
| Component | Deployment | Why |
| -------------- | ------------------------------- | ---------------------------------------- |
| **Frontend** | Podman (dev: Vite, prod: Nginx) | No GPU needed, isolated environment |
| **Backend** | Podman | No GPU needed, isolated environment |
| **Redis** | Podman | No GPU needed, ephemeral data acceptable |
| **PostgreSQL** | Podman | Database isolation, volume persistence |
| **YOLO26** | Podman (GPU via CDI) | GPU access via NVIDIA Container Toolkit |
| **Nemotron** | Podman (GPU via CDI) | GPU access via NVIDIA Container Toolkit |
| **Florence-2** | Podman (GPU via CDI, optional) | Optional enrichment capability |
| **CLIP** | Podman (GPU via CDI, optional) | Optional enrichment capability |
| **Enrichment** | Podman (GPU via CDI, optional) | Optional enrichment capability |
### Port Summary
| Port | Service | Protocol | Exposed To |
| ---- | --------------- | -------- | ---------------------------- |
| 5173 | Frontend (dev) | HTTP | Browser |
| 80 | Frontend (prod) | HTTP | Browser |
| 8000 | Backend API | HTTP/WS | Browser, Frontend container |
| 6379 | Redis | TCP | Backend container only |
| 8095 | YOLO26 | HTTP | Backend container, localhost |
| 8092 | Florence-2 | HTTP | Backend container, localhost |
| 8093 | CLIP | HTTP | Backend container, localhost |
| 8094 | Enrichment | HTTP | Backend container, localhost |
| 8091 | Nemotron | HTTP | Backend container, localhost |
---
## Data Flow
### Complete Pipeline: Camera to Dashboard

<!--
Original Mermaid diagram preserved for reference:
```mermaid
sequenceDiagram
participant CAM as Foscam Camera
participant FTP as /export/foscam/
participant FW as FileWatcher
participant DQ as detection_queue
participant DW as DetectionWorker
participant DET as YOLO26
participant DB as PostgreSQL
participant EN as EnrichmentPipeline
participant BA as BatchAggregator
participant AQ as analysis_queue
participant AW as AnalysisWorker
participant LLM as Nemotron
participant EB as EventBroadcaster
participant WS as WebSocket
participant UI as Dashboard
CAM->>FTP: FTP upload image
FTP->>FW: watchdog event (file created)
FW->>FW: debounce (0.5s)
FW->>FW: validate image (PIL)
FW->>DQ: queue {camera_id, file_path}
DQ->>DW: BLPOP (blocking pop)
DW->>DW: dedupe check (Redis/DB)
DW->>DET: POST /detect (image)
DET->>DET: inference (30-50ms)
DET-->>DW: {detections: [...]}
DW->>DB: INSERT detections
opt Enrichment enabled
DW->>EN: enrich detections (context + optional services)
EN-->>DW: enriched attributes/entities
DW->>DB: UPDATE detection metadata
end
alt Fast Path (person > 90% confidence)
DW->>LLM: immediate analysis
LLM-->>DW: risk assessment
DW->>DB: INSERT event (is_fast_path=true)
DW->>EB: broadcast_event()
else Normal Path
DW->>BA: add_detection(camera_id, detection_id)
BA->>BA: check batch timeout (90s window / 30s idle)
BA->>AQ: queue {batch_id, detection_ids}
end
AQ->>AW: BLPOP
AW->>DB: SELECT detections WHERE id IN (...)
AW->>LLM: POST /completion (prompt)
LLM->>LLM: inference (2-5s)
LLM-->>AW: {risk_score, risk_level, summary, reasoning}
AW->>DB: INSERT event
AW->>EB: broadcast_event()
EB->>WS: send to connected clients
WS->>UI: {"type": "event", "data": {...}}
UI->>UI: update activity feed
```
-->
### Batching Logic
Why batch detections instead of analyzing each frame?
A single "person walks to door" scenario might generate 15 images over 30 seconds. Batching provides:
1. **Better context:** LLM sees the full sequence, not isolated frames
2. **Reduced API calls:** One LLM call per event, not per frame
3. **Coherent events:** User sees "Person approached door" not 15 separate alerts
**Batch timing:** a batch closes when its 90-second window expires or after 30 seconds with no new detections, whichever comes first (configurable via `BATCH_WINDOW_SECONDS` and `BATCH_IDLE_TIMEOUT_SECONDS`).
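The window/idle close conditions from the pipeline diagram (90 s window, 30 s idle) can be sketched as follows; this is a simplified single-batch sketch, whereas the real BatchAggregator tracks one batch per camera and enqueues batch IDs:

```python
BATCH_WINDOW_SECONDS = 90        # hard cap on batch age
BATCH_IDLE_TIMEOUT_SECONDS = 30  # close early when detections stop arriving


class Batch:
    """Accumulates detection IDs until either timeout condition fires."""

    def __init__(self, now: float):
        self.opened_at = now
        self.last_detection_at = now
        self.detection_ids: list[int] = []

    def add(self, detection_id: int, now: float) -> None:
        self.detection_ids.append(detection_id)
        self.last_detection_at = now

    def should_close(self, now: float) -> bool:
        window_expired = now - self.opened_at >= BATCH_WINDOW_SECONDS
        idle = now - self.last_detection_at >= BATCH_IDLE_TIMEOUT_SECONDS
        return window_expired or idle
```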
### Fast Path Flow
Critical detections can bypass batching for immediate alerts:

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
D[Detection]
D --> C{Confidence > 90%<br/>AND<br/>type = person?}
C -->|Yes| FP[Fast Path<br/>Immediate LLM analysis]
C -->|No| NP[Normal Path<br/>Add to batch]
FP --> E1[Event created<br/>is_fast_path=true]
NP --> B[Batch accumulates]
B --> T{Timeout?}
T -->|Yes| E2[Event created<br/>is_fast_path=false]
```
-->
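The decision diamond above reduces to a small predicate, using the thresholds from the configuration summary (the function name is illustrative):

```python
FAST_PATH_CONFIDENCE_THRESHOLD = 0.90
FAST_PATH_OBJECT_TYPES = {"person"}


def is_fast_path(object_type: str, confidence: float) -> bool:
    """True when a detection should bypass batching for immediate analysis.

    Strictly greater-than, matching the "> 90%" condition in the diagram.
    """
    return (
        object_type in FAST_PATH_OBJECT_TYPES
        and confidence > FAST_PATH_CONFIDENCE_THRESHOLD
    )
```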
---
## Database Schema
```mermaid
erDiagram
cameras ||--o{ detections : "has"
cameras ||--o{ events : "has"
cameras {
string id PK "UUID"
string name "Human-readable name"
string folder_path "FTP upload path"
string status "online/offline/error"
datetime created_at
datetime last_seen_at
}
detections {
int id PK "Auto-increment"
string camera_id FK
string file_path "Original image"
datetime detected_at
string object_type "person/car/dog/etc"
float confidence "0.0-1.0"
int bbox_x "Bounding box"
int bbox_y
int bbox_width
int bbox_height
string thumbnail_path "With boxes drawn"
}
events {
int id PK "Auto-increment"
string batch_id "Groups detections"
string camera_id FK
datetime started_at
datetime ended_at
int risk_score "0-100 from LLM"
string risk_level "low/medium/high/critical"
text summary "LLM summary"
text reasoning "LLM explanation"
text detection_ids "JSON array"
bool reviewed "User marked"
text notes "User notes"
bool is_fast_path "Bypassed batching"
}
gpu_stats {
int id PK
datetime recorded_at
float gpu_utilization "0-100%"
int memory_used "MB"
int memory_total "MB"
float temperature "Celsius"
float inference_fps
}
logs {
int id PK
datetime timestamp
string level "DEBUG/INFO/WARNING/ERROR/CRITICAL"
string component "Logger name"
text message
string camera_id "Nullable"
int event_id "Nullable"
string request_id "Correlation ID"
int detection_id "Nullable"
int duration_ms "Nullable"
json extra "Additional context"
string source "backend/frontend"
}
api_keys {
int id PK
string key_hash "SHA-256"
string name
datetime created_at
bool is_active
}
```
### Key Indexes
| Table | Index | Purpose |
| ---------- | --------------------------- | ---------------------------- |
| detections | (camera_id, detected_at) | Camera-specific time queries |
| events | started_at | Timeline queries |
| events | risk_score | High-risk filtering |
| events | reviewed | Workflow queries |
| gpu_stats | recorded_at | Time-series queries |
| logs | timestamp, level, component | Dashboard filters |
---
## Component Interaction Diagram

<!--
Original Mermaid diagram preserved for reference:
```mermaid
flowchart TB
subgraph Frontend["Frontend (React)"]
DASH[DashboardPage]
TL[EventTimeline]
SET[SettingsPage]
AIP[AIPerformancePage]
subgraph Hooks["Custom Hooks"]
HWS[useWebSocket]
HES[useEventStream]
HSS[useSystemStatus]
HAI[useAIMetrics]
HPM[usePerformanceMetrics]
HHS[useHealthStatus]
HSV[useServiceStatus]
HCS[useConnectionStatus]
HMZ[useModelZooStatus]
end
subgraph Services["Services"]
API[api.ts]
LOG[logger.ts]
end
end
subgraph Backend["Backend (FastAPI)"]
subgraph Routes["API Routes"]
RC[/cameras]
RE[/events]
RD[/detections]
RS[/system]
RM[/media]
RW[/ws/*]
end
subgraph Pipeline["AI Pipeline"]
FW[FileWatcher]
DC[DetectorClient]
BA[BatchAggregator]
NA[NemotronAnalyzer]
TG[ThumbnailGenerator]
EP[EnrichmentPipeline]
FC[FlorenceClient]
CC[CLIPClient]
end
subgraph Workers["Background Workers"]
DW[DetectionWorker]
AW[AnalysisWorker]
BW[BatchTimeoutWorker]
QW[QueueMetricsWorker]
end
subgraph Background["Background Services"]
GPU[GPUMonitor]
CL[CleanupService]
HM[HealthMonitor]
EB[EventBroadcaster]
SB[SystemBroadcaster]
PC[PerformanceCollector]
CB[CircuitBreaker]
DM[DegradationManager]
end
end
subgraph External["External Services"]
REDIS[(Redis)]
POSTGRES[(PostgreSQL)]
YOLO26[YOLO26]
NEMOTRON[Nemotron LLM]
FLORENCE[Florence-2]
CLIP[CLIP]
end
%% Frontend connections
DASH --> HES
DASH --> HSS
AIP --> HAI
AIP --> HPM
TL --> API
SET --> API
SET --> HHS
SET --> HSV
HES --> HWS
HSS --> HWS
HAI --> HWS
HPM --> API
HWS --> RW
API --> RC & RE & RD & RS & RM
%% Backend internal
FW --> REDIS
DW --> DC
DC --> YOLO26
DC --> POSTGRES
DW --> EP
EP --> FC
EP --> CC
FC --> FLORENCE
CC --> CLIP
DW --> BA
BA --> REDIS
AW --> NA
NA --> NEMOTRON
NA --> POSTGRES
NA --> EB
TG --> POSTGRES
GPU --> POSTGRES
GPU --> SB
PC --> POSTGRES
CL --> POSTGRES
EB --> REDIS
SB --> RW
EB --> RW
CB --> DM
%% Routes to DB
RC & RE & RD & RS --> POSTGRES
RM --> POSTGRES
```
-->
---
## Error Handling and Resilience
### Graceful Degradation
| Component | Failure Mode | Fallback Behavior |
| ---------- | ------------- | ----------------------------------------------------------------- |
| YOLO26 | Unreachable | DetectorClient returns empty list, skips detection |
| Nemotron | Unreachable | NemotronAnalyzer returns default risk (50, medium) |
| Florence-2 | Unreachable | EnrichmentPipeline skips vision extraction |
| CLIP | Unreachable | EnrichmentPipeline skips re-identification |
| Redis | Unreachable | Deduplication fails open (allows processing), pub/sub unavailable |
| GPU | Not available | GPUMonitor returns mock data |
| Enrichment | Unreachable | EnrichmentClient returns empty enrichment, continues processing |
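The fail-open behaviour in the first row might look like this (a sketch; `client.detect` is a hypothetical coroutine standing in for DetectorClient's HTTP call):

```python
import logging

logger = logging.getLogger("detector_client")


async def detect_safe(client, image_bytes: bytes) -> list[dict]:
    """Return detections, or an empty list if YOLO26 is unreachable.

    Swallowing the error keeps the pipeline moving: the image is simply
    treated as containing no detections rather than crashing the worker.
    """
    try:
        return await client.detect(image_bytes)
    except Exception as exc:  # connection refused, timeout, bad response, ...
        logger.warning("YOLO26 unreachable, skipping detection: %s", exc)
        return []
```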
### Circuit Breaker Pattern
The `CircuitBreaker` service (`backend/services/circuit_breaker.py`) protects against cascading failures:
| State | Behavior |
| ---------- | -------------------------------------------------------------------- |
| **Closed** | Normal operation, requests pass through |
| **Open** | All requests immediately fail, prevents overwhelming failing service |
| **Half-Open** | Limited requests allowed to test whether the service has recovered |
### Degradation Manager
The `DegradationManager` service (`backend/services/degradation_manager.py`) coordinates graceful degradation:
- Monitors service health across all AI components
- Automatically disables non-critical enrichment features when resources are constrained
- Prioritizes core detection and risk analysis over optional enhancements
- Broadcasts degradation status changes via WebSocket
### Retry and Dead-Letter Queues
```mermaid
flowchart TB
JOB[Job]
JOB --> W[Worker]
W --> P{Processing}
P -->|Success| DONE[Complete]
P -->|Fail| R{Retries < 3?}
R -->|Yes| BACK[Exponential Backoff]
BACK --> W
R -->|No| DLQ[Dead Letter Queue]
DLQ --> API["/api/dlq/*"]
API --> REQUEUE[Manual Requeue]
REQUEUE --> W
```
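The retry/DLQ flow above can be sketched as follows (names and DLQ payload shape are illustrative; the real RetryHandler is async and records richer failure metadata):

```python
import json
import time


def process_with_retry(job: dict, handler, redis_client,
                       dlq_key: str = "dlq:detection_queue",
                       max_retries: int = 3, base_delay: float = 1.0) -> bool:
    """Try a job up to max_retries times with exponential backoff, then
    park it on a dead-letter queue for manual requeue via /api/dlq/*."""
    delay = base_delay
    last_error = ""
    for _ in range(max_retries):
        try:
            handler(job)
            return True
        except Exception as exc:
            last_error = str(exc)
            time.sleep(delay)  # back off before the next attempt
            delay *= 2         # 1s, 2s, 4s, ...
    redis_client.rpush(dlq_key, json.dumps({"job": job, "error": last_error}))
    return False
```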
### Health Monitoring
The `HealthMonitor` service:
1. Periodically checks service health (YOLO26, Nemotron, Redis)
2. On failure, attempts restart with exponential backoff
3. Broadcasts status changes via WebSocket
4. Gives up after max retries (prevents infinite restart loops)
---
## Security Model
### Production Hardening (Recommended)
- Enable `API_KEY_ENABLED=true` with strong keys
- Use HTTPS for AI service endpoints
- Restrict CORS origins
- Add rate limiting
- Run behind reverse proxy with TLS
- Review `docs/ROADMAP.md` security hardening section
---
## Performance Characteristics
| Operation | Typical Latency | Notes |
| ------------------------- | --------------- | ----------------------------------- |
| YOLO26 inference | 30-50ms | Per image, on RTX A5500 |
| Nemotron analysis | 2-5s | Per batch, depends on prompt length |
| WebSocket broadcast | <10ms | Redis pub/sub to clients |
| Database query | <5ms | PostgreSQL with proper indexes |
| Full pipeline (fast path) | ~3-6s | Camera to dashboard notification |
| Full pipeline (batched) | 30-120s | Depends on batch timeout settings |
### Resource Usage
**Host resource usage** (GPU VRAM per service is listed in the deployment topology above):
| Resource | Typical Usage |
| ------------ | ------------- |
| Backend RAM | ~500MB |
| Frontend RAM | ~100MB |
| Redis RAM | ~50MB |
---
## Configuration Summary
See `docs/reference/config/env-reference.md` for complete reference.
**Key environment variables:**
```bash
# Database and Redis (PostgreSQL required)
DATABASE_URL=postgresql+asyncpg://security:password@localhost:5432/security # pragma: allowlist secret
REDIS_URL=redis://localhost:6379/0
# AI Services
YOLO26_URL=http://localhost:8095
NEMOTRON_URL=http://localhost:8091
FLORENCE_URL=http://localhost:8092
CLIP_URL=http://localhost:8093
ENRICHMENT_URL=http://localhost:8094
# Optional enrichment feature toggles (see docs/reference/config/env-reference.md for authoritative list)
VISION_EXTRACTION_ENABLED=true
REID_ENABLED=true
SCENE_CHANGE_ENABLED=true
# Detection
DETECTION_CONFIDENCE_THRESHOLD=0.5
FAST_PATH_CONFIDENCE_THRESHOLD=0.90
FAST_PATH_OBJECT_TYPES=["person"]
# Batching
BATCH_WINDOW_SECONDS=90
BATCH_IDLE_TIMEOUT_SECONDS=30
# Retention
RETENTION_DAYS=30
```

## Related Documentation
| Document | Purpose |
|---|---|
| `docs/reference/config/env-reference.md` | Complete environment variable reference |
| `docs/operator/deployment/` | Docker deployment guide |
| `docs/operator/ai-installation.md` | AI services setup and troubleshooting |
| `docs/ROADMAP.md` | Post-MVP enhancement ideas |
| `backend/AGENTS.md` | Backend architecture details |
| `frontend/AGENTS.md` | Frontend architecture details |
| `ai/AGENTS.md` | AI pipeline details |
This document provides a comprehensive overview of the Home Security Intelligence system architecture. For implementation details, refer to the source code and component-specific AGENTS.md files.