AI Pipeline Overview¶

High-level understanding of the detection and analysis flow.

Time to read: ~8 min Prerequisites: Codebase Tour

What the Pipeline Does¶

The AI pipeline transforms raw camera images into risk-scored security events through a multi-stage process:

Camera FTP -> FileWatcher -> detection_queue -> YOLO26 -> Detections
  -> Enrichment (context + model zoo + optional Florence/CLIP) -> BatchAggregator -> analysis_queue
  -> Nemotron LLM -> Event -> WebSocket

Pipeline Stages¶

1. Image Capture¶

Foscam cameras upload images via FTP when motion is detected.

Directory structure:

/export/foscam/
  front_door/
    MDAlarm_20251228_120000.jpg
    MDAlarm_20251228_120030.jpg
  backyard/
    snap_20251228_120015.jpg

2. File Watching¶

The FileWatcher service monitors directories using OS-native notifications (inotify on Linux, FSEvents on macOS).

Key behaviors:

Debounce delay: 0.5 seconds (handles FTP chunked uploads)
Image validation via PIL
Deduplication via SHA256 content hash

Source: backend/services/file_watcher.py

3. Object Detection¶

YOLO26 receives images and returns detected objects with bounding boxes.

Security-relevant classes:

person, car, truck, bus
dog, cat, bird
bicycle, motorcycle

Source: backend/services/detector_client.py, ai/yolo26/model.py

4. Batch Aggregation¶

Detections are grouped into time-windowed batches before LLM analysis.

Why batch?

Better LLM context (see full activity, not isolated frames)
Reduced noise from frame-to-frame variations
More efficient (one LLM call per event, not per frame)

Timing:

Window: 90 seconds maximum
Idle timeout: 30 seconds no activity

Source: backend/services/batch_aggregator.py

5. Fast Path¶

High-confidence critical detections bypass batching:

Confidence >= 0.90
Object type is person
Creates event with is_fast_path=True

Latency: ~3-6 seconds (vs. 30-90 seconds for normal path)

6. Risk Analysis¶

Nemotron LLM analyzes batched detections and generates:

Risk score (0-100)
Risk level (low/medium/high/critical)
Human-readable summary
Reasoning explanation

Source: backend/services/nemotron_analyzer.py

6.5 Enrichment (Context + Model Zoo)¶

Before calling the LLM, the backend can enrich a batch with additional context:

ContextEnricher: zones, baseline deviation, cross-camera activity
EnrichmentPipeline: model-zoo enrichment (plates/OCR, faces, clothing/vehicle/pet signals, image quality/tamper, etc.)
Optional AI services: Florence-2 and CLIP (when enabled/configured)

This enrichment is designed to be best-effort: if an enrichment service is unavailable, the system should continue with a less enriched prompt rather than failing the entire pipeline.

Source: backend/services/context_enricher.py, backend/services/enrichment_pipeline.py

7. Event Creation¶

Analysis results are stored as Event records in PostgreSQL with links to source detections.

8. WebSocket Broadcast¶

New events are published via Redis pub/sub to all connected WebSocket clients.

Source: backend/services/event_broadcaster.py

Data Flow Diagram¶

%%{init: {'theme': 'dark'}}%%
flowchart LR
    subgraph Input["Image Capture"]
        Camera[Camera]
        FW[FileWatcher]
    end

    subgraph Detection["Object Detection"]
        DQ[detection_queue]
        YOLO["YOLO26<br/>(Port 8095)"]
    end

    subgraph Batching["Batch Aggregation"]
        BA[BatchAggregator]
        AQ[analysis_queue]
    end

    subgraph Analysis["Risk Analysis"]
        NEM["Nemotron LLM<br/>(Port 8091)"]
    end

    subgraph Storage["Event Storage"]
        DB[(PostgreSQL<br/>events table)]
        Redis[(Redis<br/>pub/sub)]
        WS[WebSocket clients]
    end

    Camera --> FW
    FW --> DQ
    DQ --> YOLO
    YOLO --> BA
    BA --> AQ
    BA -.->|Fast Path<br/>>90% person| NEM
    AQ --> NEM
    NEM --> DB
    DB --> Redis
    Redis --> WS

Timing Characteristics¶

Stage	Duration	Notes
File upload detection	~10ms	OS filesystem notifications
Debounce delay	500ms	Configurable
Image validation	~5-10ms	PIL verify()
YOLO26 inference	30-50ms	GPU accelerated
Database write	~5-10ms	PostgreSQL async
Batch window	30-90s	Collects related detections
Nemotron inference	2-5s	GPU accelerated
Event creation	~10ms	Database + WebSocket

Total latency:

Fast path: ~3-6 seconds
Normal path: 30-95 seconds (dominated by batch window)

Key Configuration¶

Variable	Default	Description
`BATCH_WINDOW_SECONDS`	90	Maximum batch duration
`BATCH_IDLE_TIMEOUT_SECONDS`	30	Close batch if no activity
`FAST_PATH_CONFIDENCE_THRESHOLD`	0.90	Bypass batching threshold
`FAST_PATH_OBJECT_TYPES`	["person"]	Types eligible for fast path
`DETECTION_CONFIDENCE_THRESHOLD`	0.5	Minimum confidence to store

Resource Usage¶

VRAM depends on the deployed model sizes and which enrichment services are enabled. For authoritative ports/env, see Environment Variable Reference.

Next Steps¶

Detection Service - YOLO26 integration details
Batching Logic - How detections are grouped
Risk Analysis - LLM processing and scoring