Deployment Guide

Complete guide for deploying Home Security Intelligence with Docker/Podman and GPU-accelerated AI services.


Deployment Architecture

The following diagram shows the complete production deployment topology with all containers, their connections, ports, and GPU assignments.

%%{init: {
  'theme': 'dark',
  'themeVariables': {
    'primaryColor': '#3B82F6',
    'primaryTextColor': '#FFFFFF',
    'primaryBorderColor': '#60A5FA',
    'secondaryColor': '#A855F7',
    'tertiaryColor': '#009688',
    'background': '#121212',
    'mainBkg': '#1a1a2e',
    'lineColor': '#666666'
  }
}}%%
flowchart TB
    subgraph External["External Access Points"]
        direction LR
        USER["User Browser"]
        CAM["Foscam Cameras<br/>(FTP Upload)"]
        ADMN["Admin/Ops"]
    end

    subgraph Network["security-net (Bridge Network)"]
        subgraph FrontendLayer["Frontend Layer"]
            FE["<b>frontend</b><br/>nginx-unprivileged<br/>Internal: 8080<br/>External: 5173 (HTTP), 8443 (HTTPS)<br/>Memory: 512M"]
        end

        subgraph BackendLayer["Backend Layer"]
            BE["<b>backend</b><br/>FastAPI + Uvicorn<br/>Port: 8000<br/>Memory: 6G<br/>CPU: 2 cores"]
        end

        subgraph AILayer["AI Services Layer (GPU Required)"]
            direction LR
            subgraph GPU0["GPU 0 (Primary - High VRAM)"]
                LLM["<b>ai-llm</b><br/>Nemotron 30B<br/>Port: 8091<br/>~14.7GB VRAM"]
                FLOR["<b>ai-florence</b><br/>Florence-2<br/>Port: 8092<br/>~530MB VRAM"]
            end
            subgraph GPU1["GPU 1 (Secondary - 4GB+)"]
                YOLO["<b>ai-yolo26</b><br/>YOLO26 TensorRT<br/>Port: 8095<br/>~2GB VRAM"]
                CLIP["<b>ai-clip</b><br/>CLIP ViT-L<br/>Port: 8093<br/>~722MB VRAM"]
                ENRL["<b>ai-enrichment-light</b><br/>Pose, Threat, ReID, Pet, Depth<br/>Port: 8096<br/>~1.2GB VRAM"]
            end
            ENR["<b>ai-enrichment</b><br/>Vehicle, Fashion, Age, Gender, Action<br/>Port: 8094<br/>GPU: Configurable<br/>~4.3GB VRAM"]
        end

        subgraph DataLayer["Data Layer"]
            PG[("<b>postgres</b><br/>PostgreSQL 16-alpine<br/>Port: 5432<br/>Memory: 1G")]
            RD[("<b>redis</b><br/>Redis 7.4-alpine<br/>Port: 6379<br/>Memory: 512M")]
            ES[("<b>elasticsearch</b><br/>ES 8.12<br/>Port: 9200<br/>Memory: 4G")]
        end

        subgraph MonitoringLayer["Monitoring & Observability"]
            direction TB
            subgraph MetricsTracing["Metrics & Tracing"]
                PROM["<b>prometheus</b><br/>Port: 9090<br/>Memory: 512M"]
                JAEG["<b>jaeger</b><br/>Port: 16686<br/>Memory: 512M"]
                GRAF["<b>grafana</b><br/>Port: 3002<br/>Memory: 256M"]
            end
            subgraph LogsProfiling["Logs & Profiling"]
                LOKI["<b>loki</b><br/>Port: 3100<br/>Memory: 512M"]
                PYRO["<b>pyroscope</b><br/>Port: 4040<br/>Memory: 512M"]
                ALLOY["<b>alloy</b><br/>Port: 12345<br/>Memory: 768M"]
            end
            subgraph Exporters["Exporters & Alerting"]
                AM["<b>alertmanager</b><br/>Port: 9093<br/>Memory: 128M"]
                BB["<b>blackbox-exporter</b><br/>Port: 9115"]
                RE["<b>redis-exporter</b><br/>Port: 9121"]
                JE["<b>json-exporter</b><br/>Port: 7979"]
            end
        end
    end

    %% External connections
    USER -->|"HTTP :5173<br/>HTTPS :8443"| FE
    CAM -->|"FTP to<br/>/cameras mount"| BE
    ADMN -->|"Grafana :3002<br/>Prometheus :9090<br/>Jaeger :16686"| MonitoringLayer

    %% Frontend to Backend
    FE -->|"Proxy /api, /ws"| BE

    %% Backend to Data
    BE -->|"asyncpg"| PG
    BE -->|"aioredis"| RD

    %% Backend to AI (HTTP inference calls)
    BE -->|"POST /detect"| YOLO
    BE -->|"POST /v1/completions"| LLM
    BE -->|"POST /caption"| FLOR
    BE -->|"POST /embed"| CLIP
    BE -->|"POST /analyze"| ENR
    BE -->|"POST /analyze"| ENRL

    %% Monitoring data flows
    PROM -.->|"scrape /metrics"| BE
    PROM -.->|"scrape"| YOLO
    PROM -.->|"scrape"| LLM
    PROM -.->|"scrape"| RE
    PROM -.->|"scrape"| BB
    PROM -.->|"scrape"| JE
    PROM -->|"alert rules"| AM
    AM -->|"webhooks"| BE

    GRAF -->|"query"| PROM
    GRAF -->|"query"| LOKI
    GRAF -->|"query"| JAEG
    GRAF -->|"query"| PYRO

    JAEG -->|"store spans"| ES

    ALLOY -->|"push logs"| LOKI
    ALLOY -->|"push profiles"| PYRO
    BE -->|"OTLP traces"| ALLOY

    %% Health check dependencies (startup order)
    BE -.->|"depends_on<br/>healthy"| PG
    BE -.->|"depends_on<br/>healthy"| RD
    BE -.->|"depends_on<br/>healthy"| YOLO
    BE -.->|"depends_on<br/>healthy"| LLM
    FE -.->|"depends_on<br/>healthy"| BE
    PROM -.->|"depends_on<br/>healthy"| AM
    JAEG -.->|"depends_on<br/>healthy"| ES

Architecture Summary

| Layer | Services | Resource Profile |
|---|---|---|
| Frontend | nginx reverse proxy | 512M RAM, 1 CPU |
| Backend | FastAPI application server | 6G RAM, 2 CPUs, GPU access |
| AI Services | YOLO26, Nemotron, Florence-2, CLIP, Enrichment (light + heavy) | GPU required (~19GB total) |
| Data | PostgreSQL, Redis, Elasticsearch | 5.5G RAM total |
| Monitoring | Prometheus, Grafana, Jaeger, Loki, Pyroscope, Alloy | ~3G RAM total |

GPU Assignment Strategy

The default GPU assignment distributes models across two GPUs:

| GPU | Services | Total VRAM | Typical GPU |
|---|---|---|---|
| GPU 0 | Nemotron LLM, Florence-2 | ~15.2GB | RTX A5500/RTX 4090 |
| GPU 1 | YOLO26, CLIP, Enrichment-Light | ~2.9GB | RTX A400/RTX 3060 |

Enrichment (heavy models) defaults to GPU 1 but can be reassigned via the GPU_ENRICHMENT environment variable.
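As a sketch, the override can be persisted in `.env` (the `.env` handling below is illustrative; the guide only establishes that `GPU_ENRICHMENT` selects the device):

```shell
# Illustrative: pin the heavy enrichment service to GPU 0 instead of the
# default GPU 1. GPU_ENRICHMENT is assumed to be a CUDA device index.
export GPU_ENRICHMENT=0

# Persist the override in .env (created by setup.py); update in place if
# present, otherwise append.
grep -q '^GPU_ENRICHMENT=' .env 2>/dev/null \
  && sed -i.bak 's/^GPU_ENRICHMENT=.*/GPU_ENRICHMENT=0/' .env \
  || echo "GPU_ENRICHMENT=0" >> .env

tail -n 1 .env
```

Restart the enrichment container afterwards so the new device assignment takes effect.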


Quick Start

# 1. Clone repository
git clone https://github.com/your-org/home-security-intelligence.git
cd home-security-intelligence

# 2. Run setup (generates .env with secure passwords)
python setup.py              # Quick mode
python setup.py --guided     # Guided mode with explanations

# 3. Download AI models (~2.7GB)
./ai/download_models.sh

# 4. Start services
docker compose -f docker-compose.prod.yml up -d

# 5. Verify deployment
curl http://localhost:8000/api/system/health/ready

Prerequisites

Hardware Requirements

| Resource | Minimum | Recommended | Purpose |
|---|---|---|---|
| CPU | 4 cores | 8 cores | Backend workers, AI inference |
| RAM | 16 GB | 32 GB | Services + AI model loading |
| GPU VRAM | 8 GB | 24 GB | YOLO26 + Nemotron + optional models |
| Disk Space | 100 GB | 500 GB | Database, logs, media files |
| Camera Storage | 50 GB | 200 GB | FTP upload directory |
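A quick way to compare a Linux host against these minimums (GNU coreutils assumed for `df --output`):

```shell
# Report total RAM and free disk space against the minimums above.
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df -BG --output=avail . | tail -n 1 | tr -dc '0-9')

echo "RAM: ${mem_gb} GB (minimum 16, recommended 32)"
echo "Free disk: ${disk_gb} GB (minimum 100, recommended 500)"
```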

Software Requirements

| Software | Version | Purpose | Installation |
|---|---|---|---|
| Docker or Podman | 20.10+ | Container runtime | See Container Runtime Setup |
| NVIDIA Driver | 535+ | GPU support | apt install nvidia-driver-535 |
| nvidia-container-toolkit | 1.13+ | GPU passthrough | See GPU Passthrough |
| PostgreSQL Client | 15+ | Database administration | apt install postgresql-client |

Network Requirements

| Port | Service | Protocol | Access |
|---|---|---|---|
| 80/5173 | Frontend | HTTP | Browser |
| 8000 | Backend API | HTTP/WS | Frontend |
| 8095 | YOLO26 | HTTP | Backend |
| 8091 | Nemotron | HTTP | Backend |
| 8092 | Florence-2 | HTTP | Backend (optional) |
| 8093 | CLIP | HTTP | Backend (optional) |
| 8094 | Enrichment | HTTP | Backend (optional) |
| 5432 | PostgreSQL | TCP | Backend |
| 6379 | Redis | TCP | Backend |
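Before deploying, it is worth confirming none of these ports are already taken. A pure-bash probe (using bash's `/dev/tcp` redirection, so no extra tools are needed):

```shell
# Pure-bash port probe: a successful /dev/tcp connect means something is
# already listening on that port.
in_use=0; free=0
for port in 5173 8000 8091 8092 8093 8094 8095 5432 6379; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
    echo "port $port: in use"; in_use=$((in_use + 1))
  else
    echo "port $port: free"; free=$((free + 1))
  fi
done
echo "checked $((in_use + free)) ports (${in_use} in use)"
```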

Container Runtime Setup

This project supports Docker Engine, Docker Desktop, and Podman.

| Runtime | Platform | License | Installation |
|---|---|---|---|
| Docker Engine | Linux | Free | apt install docker.io |
| Docker Desktop | macOS, Windows, Linux | Commercial | docker.com |
| Podman | Linux, macOS | Free (Apache 2.0) | brew install podman or dnf install podman |

Docker Setup

# Install Docker Engine (Linux)
sudo apt install docker.io docker-compose-plugin

# Verify installation
docker --version
docker compose version

Podman Setup

# macOS
brew install podman podman-compose
podman machine init
podman machine start

# Linux (Fedora/RHEL)
sudo dnf install podman podman-compose

# Verify installation
podman info

Command Equivalents

| Docker | Podman |
|---|---|
| docker compose up -d | podman-compose up -d |
| docker compose down | podman-compose down |
| docker compose logs | podman-compose logs |
| docker ps | podman ps |
| docker build | podman build |
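The equivalence above means scripts can stay runtime-agnostic by detecting whichever engine is installed. A minimal sketch (the `ENGINE`/`COMPOSE` variable names are illustrative):

```shell
# Pick whichever container engine is available so later commands can be
# written once against $ENGINE / $COMPOSE.
if command -v docker >/dev/null 2>&1; then
  ENGINE=docker COMPOSE="docker compose"
elif command -v podman >/dev/null 2>&1; then
  ENGINE=podman COMPOSE="podman-compose"
else
  ENGINE=none COMPOSE=""
fi
echo "using engine: ${ENGINE}"
```

Usage then looks like `$COMPOSE -f docker-compose.prod.yml up -d` regardless of runtime.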

GPU Passthrough

AI services require NVIDIA GPU access via Container Device Interface (CDI).

Prerequisites

  1. NVIDIA driver 535+
  2. NVIDIA Container Toolkit

Verify GPU Access

# Verify NVIDIA driver
nvidia-smi

# Test Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

# Test Podman GPU access (CDI)
podman run --rm --device nvidia.com/gpu=all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

Install NVIDIA Container Toolkit

# Ubuntu/Debian
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit

# Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
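For Podman, the CDI device spec must exist before `--device nvidia.com/gpu=all` resolves; a sketch of generating it with the toolkit's `nvidia-ctk cdi generate` command (output path is the standard CDI directory):

```shell
# Podman reaches GPUs through CDI; generate the device spec once per host.
# Requires nvidia-container-toolkit to be installed.
if command -v nvidia-ctk >/dev/null 2>&1; then
  sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
  cdi_status="generated"
else
  echo "nvidia-ctk not found; install nvidia-container-toolkit first"
  cdi_status="missing-toolkit"
fi
echo "CDI spec: $cdi_status"
```

Re-run the generation step after driver upgrades, since the spec records driver file paths.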

Compose Files

| File | Purpose | AI Services | Use Case |
|---|---|---|---|
| docker-compose.yml | Development | Host (native) | Local development, hot reload |
| docker-compose.prod.yml | Production | Containerized | Full deployment with GPU |
| docker-compose.ghcr.yml | Pre-built | External | Fast deploy from GHCR images |

Deployment Mode Selection Guide

Choose your deployment mode based on your needs:

| Question | Recommended Mode |
|---|---|
| First time deploying / Want simplest setup? | Production (docker-compose.prod.yml) |
| Developing locally with code hot-reload? | Development (docker-compose.yml + host AI) |
| Need GPU debugging / AI runs better on host? | Hybrid (container backend + host AI) |
| Have a dedicated GPU server? | Remote AI host mode |
| Want fastest deployment from pre-built images? | GHCR (docker-compose.ghcr.yml) |

Decision flowchart:

  1. Production deployment? Use docker-compose.prod.yml - everything containerized, no networking complexity
  2. Active development? Use docker-compose.yml with host AI for hot-reload and easier debugging
  3. GPU issues in containers? Run AI services on host, backend in container (see Deployment Modes)

Tip: If AI services are unreachable, it's usually a networking mode mismatch. See Deployment Modes & AI Networking for URL configuration by mode.

Production Deployment

# Start all services
docker compose -f docker-compose.prod.yml up -d

# View logs
docker compose -f docker-compose.prod.yml logs -f

# Stop services
docker compose -f docker-compose.prod.yml down

Development with Host AI

# Terminal 1: Start YOLO26
./ai/start_detector.sh

# Terminal 2: Start Nemotron
./ai/start_llm.sh

# Terminal 3: Start application stack
docker compose up -d

Deploy from GHCR

# Set image location
export GHCR_OWNER=your-org
export GHCR_REPO=home-security-intelligence
export IMAGE_TAG=latest

# Authenticate (requires GitHub token with read:packages)
echo $GITHUB_TOKEN | docker login ghcr.io -u YOUR_USERNAME --password-stdin

# Deploy
docker compose -f docker-compose.ghcr.yml up -d

Deployment Options

Cross-Platform Host Resolution

When backend is containerized but AI runs on the host:

| Platform | Container Runtime | Host Resolution |
|---|---|---|
| macOS | Docker Desktop | host.docker.internal (default) |
| macOS | Podman | host.containers.internal |
| Linux | Docker Engine | Host IP address |
| Linux | Podman | Host IP address |

Container Networking Resolution Flowchart

flowchart TD
    Start([Start: Resolve AI Host]) --> Platform{What platform?}

    Platform -->|macOS| MacRuntime{Container Runtime?}
    Platform -->|Linux| LinuxRuntime{Container Runtime?}

    MacRuntime -->|Docker Desktop| MacDocker[Use: host.docker.internal]
    MacRuntime -->|Podman| MacPodman[Use: host.containers.internal]

    LinuxRuntime -->|Docker Engine| LinuxDocker[Use: Host IP Address]
    LinuxRuntime -->|Podman| LinuxPodman[Use: Host IP Address]

    MacDocker --> SetEnv1["export AI_HOST=host.docker.internal"]
    MacPodman --> SetEnv2["export AI_HOST=host.containers.internal"]
    LinuxDocker --> GetIP["AI_HOST=$(hostname -I | awk '{print $1}')"]
    LinuxPodman --> GetIP

    SetEnv1 --> Verify{Test Connection}
    SetEnv2 --> Verify
    GetIP --> SetEnv3["export AI_HOST=$AI_HOST"] --> Verify

    Verify -->|Success| Done([AI Services Reachable])
    Verify -->|Fail| Debug[Check firewall and service status]
    Debug --> Verify

    style Start fill:#e1f5fe
    style Done fill:#c8e6c9
    style Debug fill:#ffecb3

# macOS with Docker Desktop (default, no action needed)
docker compose up -d

# macOS with Podman
export AI_HOST=host.containers.internal
podman-compose up -d

# Linux (Docker or Podman)
export AI_HOST=$(hostname -I | awk '{print $1}')
docker compose up -d

AI Service URLs by Deployment Mode

Production (docker-compose.prod.yml):

# AI services on compose network (internal DNS)
YOLO26_URL=http://ai-yolo26:8095
NEMOTRON_URL=http://ai-llm:8091
FLORENCE_URL=http://ai-florence:8092
CLIP_URL=http://ai-clip:8093
ENRICHMENT_URL=http://ai-enrichment:8094

Development with host AI:

YOLO26_URL=http://localhost:8095
NEMOTRON_URL=http://localhost:8091

Docker Desktop (macOS/Windows):

YOLO26_URL=http://host.docker.internal:8095
NEMOTRON_URL=http://host.docker.internal:8091
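The three modes above can be collapsed into one resolution rule: use compose-network DNS names unless `AI_HOST` is set. A sketch (the backend's actual env handling may differ):

```shell
# Resolve AI base URLs: compose-network DNS names by default, AI_HOST when
# the backend must reach services running on the host.
if [ -n "${AI_HOST:-}" ]; then
  YOLO26_URL="http://${AI_HOST}:8095"
  NEMOTRON_URL="http://${AI_HOST}:8091"
else
  YOLO26_URL="http://ai-yolo26:8095"
  NEMOTRON_URL="http://ai-llm:8091"
fi
echo "YOLO26_URL=${YOLO26_URL}"
echo "NEMOTRON_URL=${NEMOTRON_URL}"
```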

AI Services Setup

AI Architecture

The system supports a multi-service AI stack:

| Service | Port | VRAM | Purpose |
|---|---|---|---|
| YOLO26 | 8095 | ~2GB | Object detection |
| Nemotron | 8091 | ~3GB (4B) / ~14.7GB (30B) | Risk reasoning |
| Florence-2 | 8092 | ~2GB | Vision extraction (optional) |
| CLIP | 8093 | ~2GB | Re-identification (optional) |
| Enrichment | 8094 | ~4GB | Vehicle/pet/clothing (optional) |

Model Downloads

# Automated download
./ai/download_models.sh

# What it downloads:
# - Nemotron Mini 4B (~2.5GB) for development
# - YOLO26 auto-downloads on first use via HuggingFace
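After downloading, a quick file check confirms the dev model landed on disk. The `ai/models/` path below is an assumption; adjust it to wherever your start scripts look:

```shell
# Check that the dev model file exists (path is an assumption; the filename
# comes from the model specification table).
MODEL=ai/models/nemotron-mini-4b-instruct-q4_k_m.gguf
if [ -f "$MODEL" ]; then
  model_status="present ($(du -h "$MODEL" | cut -f1))"
else
  model_status="missing (run ./ai/download_models.sh)"
fi
echo "Nemotron Mini 4B: $model_status"
```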

Production Model Specifications

| Model | File | Size | VRAM | Context |
|---|---|---|---|---|
| NVIDIA Nemotron-3-Nano-30B-A3B | Nemotron-3-Nano-30B-A3B-Q4_K_M.gguf | ~18 GB | ~14.7 GB | 131,072 |
| Nemotron Mini 4B (dev) | nemotron-mini-4b-instruct-q4_k_m.gguf | ~2.5 GB | ~3 GB | 4,096 |

Verify AI Services

# Health checks
curl http://localhost:8095/health   # YOLO26
curl http://localhost:8091/health   # Nemotron
curl http://localhost:8092/health   # Florence-2 (optional)
curl http://localhost:8093/health   # CLIP (optional)
curl http://localhost:8094/health   # Enrichment (optional)
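The individual checks above can be rolled into one loop that reports up/down per service (assumes `curl`; bash is required for the associative array):

```shell
# Probe every AI health endpoint with a short timeout and report status.
declare -A ai_ports=(
  [YOLO26]=8095 [Nemotron]=8091 [Florence-2]=8092 [CLIP]=8093 [Enrichment]=8094
)
checked=0
for name in "${!ai_ports[@]}"; do
  if curl -fsS -m 2 "http://localhost:${ai_ports[$name]}/health" >/dev/null 2>&1; then
    echo "$name: up"
  else
    echo "$name: down"
  fi
  checked=$((checked + 1))
done
echo "probed $checked services"
```

Remember that the optional services (Florence-2, CLIP, Enrichment) reporting down may simply mean they are not enabled.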

Service Dependencies

Startup Order

Services start in dependency order via Docker Compose health checks.

Service Startup Sequence Diagram

sequenceDiagram
    autonumber
    participant DC as Docker Compose
    participant PG as PostgreSQL
    participant RD as Redis
    participant RT as YOLO26
    participant NM as Nemotron
    participant BE as Backend
    participant FE as Frontend

    Note over DC,FE: Phase 1: Data Infrastructure (0-15s)
    DC->>PG: Start PostgreSQL
    DC->>RD: Start Redis
    PG-->>DC: Healthy (10-15s)
    RD-->>DC: Healthy (5-10s)

    Note over DC,FE: Phase 2: AI Services (60-180s)
    DC->>RT: Start YOLO26
    DC->>NM: Start Nemotron
    RT-->>DC: Healthy (60-90s, model loading)
    NM-->>DC: Healthy (90-120s, VRAM allocation)

    Note over DC,FE: Phase 3: Application (30-60s)
    DC->>BE: Start Backend (depends: PG, RD)
    BE->>PG: Connect
    BE->>RD: Connect
    BE-->>DC: Healthy (30-60s)

    Note over DC,FE: Phase 4: Frontend (10-20s)
    DC->>FE: Start Frontend (depends: BE)
    FE->>BE: Health check
    FE-->>DC: Healthy (10-20s)

Phase 1: Data Infrastructure (0-15s)

  • PostgreSQL (~10-15s)
  • Redis (~5-10s)

Phase 2: AI Services (60-180s)

  • YOLO26 (~60-90s, model loading)
  • Nemotron (~90-120s, VRAM allocation)
  • Florence-2, CLIP, Enrichment (optional)

Phase 3: Application (30-60s)

  • Backend (~30-60s, waits for DB + Redis)

Phase 4: Frontend (10-20s)

  • Frontend (~10-20s, waits for Backend)

Health Check Configuration

# docker-compose.prod.yml example
backend:
  healthcheck:
    test:
      [
        'CMD',
        'python',
        '-c',
        "import httpx; r = httpx.get('http://localhost:8000/api/system/health/ready'); exit(0 if r.status_code == 200 else 1)",
      ]
    interval: 10s
    timeout: 5s
    retries: 3
    start_period: 30s
  depends_on:
    postgres:
      condition: service_healthy
    redis:
      condition: service_healthy

Dependency Matrix

| Service | Hard Dependencies | Soft Dependencies | Auto-Recovers |
|---|---|---|---|
| PostgreSQL | None | None | N/A |
| Redis | None | None | N/A |
| YOLO26 | GPU | None | No |
| Nemotron | GPU | None | No |
| Backend | PostgreSQL, Redis | AI Services | AI via monitor |
| Frontend | Backend | None | No |

Deployment Checklist

Pre-Deployment

  • [ ] Docker/Podman installed and running
  • [ ] NVIDIA driver and container toolkit installed (nvidia-smi works)
  • [ ] Camera FTP directory exists and is accessible
  • [ ] AI models downloaded (./ai/download_models.sh)
  • [ ] Network ports are not in use by other services
  • [ ] Firewall rules allow required traffic
  • [ ] .env file created via python setup.py
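Several of these items can be verified automatically. A preflight sketch (covers only the machine-checkable items; the camera directory and firewall checks are site-specific):

```shell
# Preflight: automate the machine-checkable parts of the checklist above.
ok=1
command -v docker >/dev/null 2>&1 || command -v podman >/dev/null 2>&1 \
  || { echo "FAIL: no container runtime found"; ok=0; }
command -v nvidia-smi >/dev/null 2>&1 \
  || { echo "FAIL: nvidia-smi not found (driver missing?)"; ok=0; }
[ -f .env ] \
  || { echo "FAIL: .env missing (run: python setup.py)"; ok=0; }
[ "$ok" -eq 1 ] && echo "preflight passed" || echo "preflight failed"
```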

Deployment Steps

  1. Start services:
docker compose -f docker-compose.prod.yml up -d
  2. Monitor startup:
docker compose -f docker-compose.prod.yml logs -f
  3. Verify health:
# Wait for services (Redis: ~5s, Postgres: ~15s, AI: ~120s, Backend: ~60s)
curl http://localhost:8000/api/system/health/ready
  4. Test AI pipeline:
# Copy test image
cp backend/data/test_images/sample.jpg /export/foscam/test_camera/test_$(date +%s).jpg

# Monitor processing
docker compose -f docker-compose.prod.yml logs -f backend | grep -E "detect|batch|analyze"
  5. Access dashboard:
  • Open browser to http://localhost:5173 (dev) or http://localhost (prod)
  • Verify WebSocket connection status
  • Check camera grid and activity feed

Post-Deployment

  • [ ] Dashboard accessible
  • [ ] Health endpoint returns healthy
  • [ ] WebSocket connection working
  • [ ] Test image processed successfully
  • [ ] GPU metrics displaying

Upgrade Procedures

Pre-Upgrade

  • [ ] Read release notes for breaking changes
  • [ ] Backup database
  • [ ] Check disk space (at least 10 GB free)
  • [ ] Review new environment variables in .env.example

Upgrade Steps

# 1. Backup
docker compose -f docker-compose.prod.yml exec -T postgres pg_dump -U security -d security -F c > backup-pre-upgrade-$(date +%Y%m%d).dump
cp .env .env.backup-$(date +%Y%m%d)

# 2. Pull updates
git fetch origin
git pull origin main

# 3. Review config changes
diff .env.example .env

# 4. Stop services
docker compose -f docker-compose.prod.yml down

# 5. Apply database migrations
docker compose -f docker-compose.prod.yml up -d postgres
until docker compose -f docker-compose.prod.yml exec postgres pg_isready -U security; do sleep 1; done
docker compose -f docker-compose.prod.yml run --rm backend alembic upgrade head

# 6. Rebuild and start
docker compose -f docker-compose.prod.yml build --no-cache
docker compose -f docker-compose.prod.yml up -d

# 7. Verify
curl http://localhost:8000/api/system/health/ready
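Before proceeding past step 4, it is worth confirming the step-1 backup actually exists and is non-trivial in size (filename pattern taken from the backup command above):

```shell
# Sanity-check the pre-upgrade dump before stopping services.
dump=$(ls -t backup-pre-upgrade-*.dump 2>/dev/null | head -n 1)
if [ -n "$dump" ]; then
  echo "latest backup: $dump ($(du -h "$dump" | cut -f1))"
  backup_check=ok
else
  echo "no backup found; run the pg_dump step first"
  backup_check=missing
fi
```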

Rollback Procedures

Application Rollback (Database Intact)

# 1. Stop current version
docker compose -f docker-compose.prod.yml down

# 2. Checkout previous version
git checkout <previous-commit-sha>

# 3. Restore config if needed
cp .env.backup-<date> .env

# 4. Restart
docker compose -f docker-compose.prod.yml up -d

# 5. Verify
curl http://localhost:8000/api/system/health/ready

Database Rollback

# 1. Stop all services
docker compose -f docker-compose.prod.yml down

# 2. Start only PostgreSQL
docker compose -f docker-compose.prod.yml up -d postgres
until docker compose -f docker-compose.prod.yml exec postgres pg_isready -U security; do sleep 1; done

# 3. Drop and recreate database
docker compose -f docker-compose.prod.yml exec postgres psql -U security -d postgres -c "DROP DATABASE security;"
docker compose -f docker-compose.prod.yml exec postgres psql -U security -d postgres -c "CREATE DATABASE security;"

# 4. Restore from backup
docker compose -f docker-compose.prod.yml exec -T postgres pg_restore -U security -d security < backup-pre-upgrade-<date>.dump

# 5. Checkout previous code and restart
git checkout <previous-commit>
docker compose -f docker-compose.prod.yml up -d

Rollback Decision Matrix

| Symptom | Action | Downtime |
|---|---|---|
| Frontend UI broken | Rollback frontend only | 1-2 min |
| API errors, DB intact | Rollback backend only | 2-5 min |
| Database corruption | Restore DB backup + rollback code | 10-30 min |
| AI service crash loop | Check GPU, restart AI services | 5-10 min |
| Complete system failure | Full rollback (all services + DB) | 15-45 min |

Troubleshooting

Service Won't Start

# Check container status
docker compose -f docker-compose.prod.yml ps

# Check logs for specific service
docker compose -f docker-compose.prod.yml logs backend
docker compose -f docker-compose.prod.yml logs ai-yolo26

# Check health endpoint
curl -v http://localhost:8000/health

AI Services Unreachable

  1. Check AI container status:
docker ps --filter name=ai-
  2. Test health endpoints directly:
curl http://localhost:8095/health
curl http://localhost:8091/health
  3. Check GPU access:
nvidia-smi
docker compose -f docker-compose.prod.yml exec ai-yolo26 nvidia-smi
  4. Verify URL configuration: see Deployment Modes for the correct URLs by mode

GPU Out of Memory

# Check GPU usage
nvidia-smi

# Kill stray GPU processes (destructive; may require sudo)
nvidia-smi --query-compute-apps=pid --format=csv,noheader | xargs -r kill

# Restart AI services
docker compose -f docker-compose.prod.yml restart ai-yolo26 ai-llm

Database Connection Failed

# Check PostgreSQL status
docker compose -f docker-compose.prod.yml exec postgres pg_isready -U security

# Check logs
docker compose -f docker-compose.prod.yml logs postgres

# Verify DATABASE_URL in .env
grep DATABASE_URL .env

Health Check Timeout

# Increase start_period for slow model loading
# Edit docker-compose.prod.yml:
# healthcheck:
#   start_period: 120s  # Increase from 60s

See Also