Connection Troubleshooting¶
Solving network, container, and connectivity problems.
Time to read: ~6 min
Prerequisites: None
Service Not Reachable¶
Symptoms¶
- Error: Connection refused
- Health check fails
- curl returns "connection refused"
Diagnosis¶
# Check if service is listening
ss -tlnp | grep 8000 # Backend
ss -tlnp | grep 8095 # YOLO26
ss -tlnp | grep 8091 # Nemotron
ss -tlnp | grep 8092 # Florence-2 (optional)
ss -tlnp | grep 8093 # CLIP (optional)
ss -tlnp | grep 8094 # Enrichment (optional)
# Check container status
docker compose -f docker-compose.prod.yml ps
# Check container logs
docker compose -f docker-compose.prod.yml logs backend
Solutions¶
1. Start the service:
# All services
docker compose -f docker-compose.prod.yml up -d
# AI services only
./scripts/start-ai.sh start
2. Check port conflicts:
3. Verify network configuration:
For Docker:
- Services on the same network can use service names (postgres, redis)
- External access uses localhost:PORT
For production compose, AI services are reachable from the backend by compose DNS:
- ai-yolo26:8095
- ai-llm:8091
- ai-florence:8092 (optional)
- ai-clip:8093 (optional)
- ai-enrichment:8094 (optional)
For native development:
- All services use localhost
- Verify each service is bound to the correct address
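For the port-conflict check in step 2, one possible approach is to filter the `ss -tlnp` output for the port in question and print the owning process. The sketch below is illustrative only: `port_owner` is a hypothetical helper name, and it assumes the usual `ss -tlnp` column layout (state, queues, local address, peer address, process).

```shell
# Print the process entry for whatever is listening on a given TCP port.
# Reads `ss -tlnp` output on stdin, so it can be exercised with saved output.
# Exit status: 0 if a listener was found, 1 otherwise.
port_owner() {
  port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" {
      # Column 4 is the local address, e.g. 0.0.0.0:8000 or [::]:8000;
      # the port is whatever follows the last colon.
      n = split($4, parts, ":")
      if (parts[n] == port) {
        found = 1
        if (NF >= 6) print $6
        else print "(process hidden; rerun ss with sudo)"
      }
    }
    END { exit found ? 0 : 1 }
  '
}
```

Usage: `ss -tlnp | port_owner 8000`. If the printed process is not the expected service, something else has claimed the port.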
Redis Not Available¶
Symptoms¶
- Error: Connection refused to Redis
- Health check shows Redis unhealthy
- Queue operations fail
Diagnosis¶
# Check Redis is running
docker compose -f docker-compose.prod.yml ps redis
# Test Redis connection
redis-cli ping
# Check Redis logs
docker compose -f docker-compose.prod.yml logs redis
Solutions¶
1. Start Redis:
2. Check REDIS_URL:
# Local development
REDIS_URL=redis://localhost:6379/0
# Docker network
REDIS_URL=redis://redis:6379/0
3. Check Redis memory:
If Redis is out of memory, increase maxmemory or clear old data.
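To judge whether Redis is actually near its memory limit, the `used_memory` and `maxmemory` fields of `redis-cli info memory` can be compared. The following is a minimal sketch; `redis_memory_summary` is a hypothetical helper, not part of this project.

```shell
# Summarise Redis memory pressure from `INFO memory` output read on stdin.
# Redis terminates INFO lines with CRLF, so carriage returns are stripped first.
redis_memory_summary() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory" { used = $2 }
    $1 == "maxmemory"   { max = $2 }
    END {
      # maxmemory of 0 means no limit has been configured.
      if (max + 0 == 0) { print "maxmemory unset (no limit configured)"; exit }
      printf "%d%% of maxmemory used\n", used * 100 / max
    }
  '
}
```

Usage: `redis-cli info memory | redis_memory_summary`. If usage is close to 100%, the limit can be raised at runtime with `redis-cli config set maxmemory <value>`; the right value depends on the host and is not specified here.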
Container Crashes¶
Symptoms¶
- Container exits immediately after starting
- Container in "restarting" state
- Exit code 1 or 137
Diagnosis¶
# Check container status and exit codes
docker compose -f docker-compose.prod.yml ps -a
# Check container logs
docker compose -f docker-compose.prod.yml logs backend
# Check recent events
docker events --since="10m" --filter="type=container"
Solutions¶
Exit code 1 (application error):
Check logs for:
- Missing environment variables
- Database connection failures
- Configuration errors
Exit code 137 (OOM killed):
# Check if OOM killed
dmesg | grep -i "killed process"
# Increase container memory limit
# In docker-compose.yml:
services:
  backend:
    deploy:
      resources:
        limits:
          memory: 2G
Exit code 139 (segfault):
Usually indicates:
- Incompatible library versions
- GPU driver issues
- Corrupted container image
Try: docker compose -f docker-compose.prod.yml pull && docker compose -f docker-compose.prod.yml up -d
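The exit codes above follow the shell convention that codes over 128 mean "killed by signal (code - 128)": 137 is 128 + 9 (SIGKILL) and 139 is 128 + 11 (SIGSEGV). A small lookup like the one below can make that mapping explicit; `explain_exit_code` is a hypothetical helper written for illustration.

```shell
# Translate a container exit code into a likely cause.
explain_exit_code() {
  code="$1"
  case "$code" in
    0)   echo "clean exit" ;;
    1)   echo "application error: check logs" ;;
    137) echo "SIGKILL (128+9): often the OOM killer" ;;
    139) echo "SIGSEGV (128+11): segmentation fault" ;;
    *)
      # Any other code above 128 is 128 plus the signal number.
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-specific exit code"
      fi ;;
  esac
}
```

Usage: `explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' <container>)"`. For a suspected OOM kill, `docker inspect --format '{{.State.OOMKilled}}' <container>` reports whether Docker recorded one.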
File Watcher Issues¶
Symptoms¶
- Images uploaded but not processed
- No detections created from new images
- Pipeline status shows file watcher not running
Diagnosis¶
# Check pipeline status
curl http://localhost:8000/api/system/pipeline | jq .file_watcher
# Check camera folder permissions
ls -la /export/foscam/
# Check for inotify limits (Linux)
cat /proc/sys/fs/inotify/max_user_watches
Solutions¶
1. Verify folder path:
Ensure FOSCAM_BASE_PATH matches where cameras upload images.
2. Enable polling for Docker:
If using Docker Desktop on macOS/Windows:
3. Increase inotify watches (Linux):
# Temporary
echo 524288 | sudo tee /proc/sys/fs/inotify/max_user_watches
# Permanent
echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
4. Check file permissions:
Backend needs read access to camera folders.
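The inotify check in step 3 can be scripted so it reports whether the current limit is below the value suggested above. This is a sketch only: `check_inotify_limit` is a hypothetical helper, and the 524288 default simply mirrors the figure used in this section.

```shell
# Compare the current inotify watch limit with a desired minimum.
# The limit file is a parameter so the check can be pointed at a test file.
# Exit status: 0 if the limit is high enough, 1 if low, 2 if unreadable.
check_inotify_limit() {
  wanted="${1:-524288}"
  limit_file="${2:-/proc/sys/fs/inotify/max_user_watches}"
  if [ ! -r "$limit_file" ]; then
    echo "cannot read $limit_file (not Linux?)"
    return 2
  fi
  current="$(cat "$limit_file")"
  if [ "$current" -ge "$wanted" ]; then
    echo "ok: $current >= $wanted"
  else
    echo "low: $current < $wanted"
    return 1
  fi
}
```

Usage: run `check_inotify_limit` on the host running the backend. A "low" result suggests applying the temporary or permanent change from step 3.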
WebSocket Issues¶
Symptoms¶
- Dashboard shows stale data
- "WebSocket connection failed"
- Frequent reconnections
Diagnosis¶
# Test WebSocket connection
websocat ws://localhost:8000/ws/events
# Check for proxy issues
curl -v http://localhost:8000/ws/events
# Check WebSocket rate limiting
curl http://localhost:8000/api/system/telemetry
Solutions¶
1. Check idle timeout:
Default is 300 seconds. Client should send periodic pings:
2. Check proxy configuration:
If behind nginx/reverse proxy:
location /ws/ {
proxy_pass http://backend:8000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 3600s;
}
3. Check rate limits:
WebSocket connections are limited to 10 per minute by default. Adjust:
CORS Errors¶
Symptoms¶
- Browser console: "CORS policy blocked"
- API works in curl but not browser
- OPTIONS requests fail
Diagnosis¶
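# Check CORS headers (Access-Control-Request-Method makes this a real preflight)
curl -v -X OPTIONS -H "Origin: http://localhost:3000" \
-H "Access-Control-Request-Method: GET" \
http://localhost:8000/api/events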
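Reading the verbose curl output by eye is error-prone, so the check can be reduced to a pass/fail on the `Access-Control-Allow-Origin` response header. The helper below is a hypothetical sketch (`cors_allows` is not part of the project); it reads response headers on stdin and succeeds if the given origin, or `*`, is allowed.

```shell
# Succeed if the HTTP response headers on stdin allow the given origin.
# Header names are matched case-insensitively; CRLF line endings are stripped.
cors_allows() {
  origin="$1"
  tr -d '\r' | awk -v origin="$origin" '
    tolower($1) == "access-control-allow-origin:" {
      if ($2 == origin || $2 == "*") ok = 1
    }
    END { exit ok ? 0 : 1 }
  '
}
```

Usage, assuming the backend URL used elsewhere in this section:

```shell
curl -s -o /dev/null -D - -X OPTIONS \
  -H "Origin: http://localhost:3000" \
  -H "Access-Control-Request-Method: GET" \
  http://localhost:8000/api/events | cors_allows http://localhost:3000 \
  && echo "origin allowed" || echo "origin NOT allowed"
```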
Solutions¶
1. Add origin to allowed list:
2. For development, allow all (not for production):
3. Check frontend URLs:
Ensure VITE_API_BASE_URL matches actual backend URL:
# Frontend expects backend at:
VITE_API_BASE_URL=http://localhost:8000
# Backend must allow this origin in CORS_ORIGINS
Timeouts¶
Symptoms¶
- Health checks time out
- Requests hang then fail
- "Request timeout" errors
Diagnosis¶
# Check service response time
time curl http://localhost:8000/health
# Check AI service response time
time curl http://localhost:8095/health
time curl http://localhost:8091/health
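The measured times are easier to act on when compared against a timeout budget. The sketch below classifies a response time against a limit; `check_latency` is a hypothetical helper, and the limit passed in is whatever timeout applies to that service in your configuration.

```shell
# Classify a measured response time (in seconds) against a timeout budget.
# Exit status: 0 if under the limit, 1 otherwise.
check_latency() {
  name="$1"; elapsed="$2"; limit="$3"
  awk -v n="$name" -v e="$elapsed" -v l="$limit" 'BEGIN {
    status = (e + 0 < l + 0) ? "ok" : "slow"
    printf "%s: %s (%.2fs, limit %.1fs)\n", n, status, e, l
    exit (status == "ok" ? 0 : 1)
  }'
}
```

Usage: curl can report the total request time directly, for example `check_latency backend "$(curl -s -o /dev/null --max-time 10 -w '%{time_total}' http://localhost:8000/health)" 10.0`.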
Solutions¶
1. Increase timeout settings:
AI_CONNECT_TIMEOUT=30.0
AI_HEALTH_TIMEOUT=10.0
YOLO26_READ_TIMEOUT=120.0
NEMOTRON_READ_TIMEOUT=300.0
2. Check service load:
AI services may be overloaded. Check queue depths:
3. Check network latency:
If AI services are remote:
Next Steps¶
- AI Issues - AI service-specific problems
- Database Issues - PostgreSQL problems
- Troubleshooting Index - Back to symptom index
See Also¶
- Environment Variable Reference - Network configuration options
- Local Setup - Development environment setup
- AI Services - Starting and stopping services