Decision: Entity-Detection Referential Integrity¶
Date: 2026-01-28 Status: Decided Related Issues: NEM-1880, NEM-2210, NEM-2431, NEM-2670
Context¶
The Home Security Intelligence system tracks unique individuals and objects (entities) across multiple cameras using re-identification techniques. Each entity can be associated with a "primary detection" - the best or most representative detection image for that entity.
The Problem¶
The entities table needs to reference records in the detections table via a primary_detection_id column. However, the detections table uses PostgreSQL table partitioning by detected_at timestamp for performance optimization, which creates a composite primary key (id, detected_at).
PostgreSQL Limitation: Foreign key constraints cannot reference only part of a composite primary key on partitioned tables. This means we cannot create a standard FK constraint from entities.primary_detection_id to detections.id.
Current Schema¶
-- Detections table (partitioned)
CREATE TABLE detections (
id SERIAL,
detected_at TIMESTAMPTZ NOT NULL,
camera_id VARCHAR NOT NULL,
-- ... other columns
PRIMARY KEY (id, detected_at) -- Composite PK required for partitioning
) PARTITION BY RANGE (detected_at);
-- Entities table
CREATE TABLE entities (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entity_type VARCHAR(20) NOT NULL,
primary_detection_id INTEGER, -- Cannot have FK to detections.id alone
-- ... other columns
);
Decision Summary¶
Enforce referential integrity at the application level rather than the database level.
The primary_detection_id column will NOT have a foreign key constraint. Instead:
- An index is created on
primary_detection_idfor query performance - Application-level validation methods enforce integrity before writes
- The relationship is defined as
viewonly=Truein SQLAlchemy
Options Evaluated¶
Option 1: Database-Level FK (Not Possible)¶
| Pros | Cons |
|---|---|
| Automatic integrity enforcement | Not supported - PG partitioned tables require full composite key |
| - | Would require restructuring entire detection storage |
Option 2: Application-Level Validation (Selected)¶
| Pros | Cons |
|---|---|
| Works with partitioned tables | Requires discipline to use validation methods |
| No schema changes needed | Orphaned references possible if validation skipped |
| Explicit validation provides better error messages | Slightly more code |
| Allows soft-delete patterns | - |
Option 3: Denormalize into Entities Table¶
| Pros | Cons |
|---|---|
| No cross-table references | Data duplication |
| Simple queries | Storage overhead |
| - | Sync complexity |
Option 4: Junction Table with Full Composite Key¶
| Pros | Cons |
|---|---|
| Could enable FK with full key | Requires storing detected_at in entities |
| - | More complex queries |
| - | Breaks existing API contracts |
Rationale¶
We selected Option 2 (Application-Level Validation) for the following reasons:
1. Partitioning Benefits Outweigh FK Enforcement¶
The detections table is partitioned by detected_at for:
- Efficient pruning: Queries on recent data only scan relevant partitions
- Maintenance: Old partitions can be dropped without vacuuming
- Performance: 10-100x faster for time-range queries
Removing partitioning to enable FK would significantly degrade performance for the core detection workflow.
2. Primary Detection is Optional and Display-Only¶
The primary_detection_id is used for:
- Displaying a representative thumbnail for an entity
- Quick access to the "best" detection
It is NOT used for:
- Critical business logic
- Financial transactions
- Data that requires ACID guarantees
If a referenced detection is deleted (e.g., retention policy), the entity remains valid with a NULL reference.
3. Validation Methods Provide Better UX¶
Application-level validation returns meaningful errors:
is_valid, error = await entity.validate_primary_detection_async(session)
if not is_valid:
raise ValueError(error)
# Error: "Detection with id=123 does not exist. Cannot set as primary_detection_id for Entity."
Database FK violations would only return generic constraint violation errors.
4. Follows Existing Patterns¶
The codebase already uses application-level validation for other constraints (e.g., enrichment_data schema validation in the Detection model).
Implementation¶
Database Schema¶
# backend/models/entity.py
class Entity(Base):
__tablename__ = "entities"
# No ForeignKey constraint - intentional for partitioned table compatibility
primary_detection_id: Mapped[int | None] = mapped_column(
Integer,
nullable=True,
index=True, # Index for efficient joins
)
# Relationship without FK - uses explicit primaryjoin
primary_detection: Mapped[Detection | None] = relationship(
"Detection",
primaryjoin="Entity.primary_detection_id == Detection.id",
foreign_keys=[primary_detection_id],
lazy="selectin",
viewonly=True, # No cascade since there's no FK
)
Validation Methods¶
# Async validation for use with database sessions
async def validate_primary_detection_async(
self, session: AsyncSession
) -> tuple[bool, str | None]:
"""Validate that primary_detection_id references an existing detection."""
if self.primary_detection_id is None:
return True, None
result = await session.execute(
select(Detection.id)
.where(Detection.id == self.primary_detection_id)
.limit(1)
)
detection_exists = result.scalar_one_or_none() is not None
if not detection_exists:
return False, (
f"Detection with id={self.primary_detection_id} does not exist. "
f"Cannot set as primary_detection_id for Entity."
)
return True, None
# Convenience method that validates before setting
async def set_primary_detection_validated(
self,
session: AsyncSession,
detection_id: int | None,
) -> tuple[bool, str | None]:
"""Set primary_detection_id with validation."""
# ... implementation
Usage Pattern¶
# Creating an entity with a primary detection
entity = Entity.from_detection(
entity_type=EntityType.PERSON,
detection_id=detection.id,
embedding=embedding_vector,
)
# Validate before persisting
is_valid, error = await entity.validate_primary_detection_async(session)
if not is_valid:
raise ValueError(error)
session.add(entity)
await session.commit()
# Or use the validated setter
success, error = await entity.set_primary_detection_validated(session, detection.id)
if not success:
raise ValueError(error)
Consequences¶
Positive¶
- Partitioning preserved: Detection queries remain fast with time-based partitioning
- Flexible schema: Can add soft-delete or archive patterns without FK complications
- Clear errors: Validation provides descriptive error messages
- No migration needed: Works with existing partitioned table structure
Negative¶
- Requires discipline: Developers must call validation methods
- Orphaned references possible: If validation is skipped, invalid IDs can be stored
- No cascade delete: Deleting a detection does not auto-null the entity reference
Mitigations¶
- Code review checks: Enforce validation in PR reviews
- Cleanup job: Periodic job nullifies orphaned
primary_detection_idvalues - Monitoring: Alert on orphaned reference counts
Monitoring and Maintenance¶
Orphan Detection Query¶
-- Find entities with orphaned primary_detection_id
SELECT e.id, e.primary_detection_id
FROM entities e
LEFT JOIN detections d ON e.primary_detection_id = d.id
WHERE e.primary_detection_id IS NOT NULL
AND d.id IS NULL;
Cleanup Job¶
async def cleanup_orphaned_entity_references(session: AsyncSession) -> int:
"""Null out primary_detection_id for entities pointing to deleted detections."""
result = await session.execute(
update(Entity)
.where(
Entity.primary_detection_id.isnot(None),
~exists(
select(Detection.id)
.where(Detection.id == Entity.primary_detection_id)
)
)
.values(primary_detection_id=None)
)
await session.commit()
return result.rowcount
Metrics¶
| Metric | Alert Threshold |
|---|---|
| Orphaned references count | > 100 |
| Validation failures/hour | > 10 |
Related Documentation¶
- Entity Model - Implementation
- Detection Model - Partitioned table
- Re-identification Feature - Entity tracking overview
- PostgreSQL Partitioning - Official docs
Decision Approval¶
| Role | Name | Date | Approval |
|---|---|---|---|
| Tech Lead | - | 2026-01-28 | Approved |
| Backend Lead | - | 2026-01-28 | Approved |
| DBA | - | 2026-01-28 | Approved |