ADR-004: Multi-Tenant Data Isolation Strategy
Status: Proposed, pending architect approval
Impacts: domain-driven-design.md §2.1, schema-backoffice.md, all schema files
Context
Busflow is a multi-tenant SaaS where all tenants share a single PostgreSQL database. Every domain table carries a tenant_id referencing backoffice.operators. domain-driven-design.md §2 describes schema-level isolation but provides no enforcement mechanism for row-level tenant isolation. A coding agent needs concrete rules for how to wire isolation into Hasura permissions, Postgres policies, and NestJS middleware.
Decision
A two-layer isolation model with fail-closed enforcement:
Layer 1: Hasura Permission Rules (Primary)
Every table permission (select, insert, update, delete) for tenant-scoped Hasura roles (manager, dispatcher, driver) includes:
filter:
  tenant_id: { _eq: "x-hasura-tenant-id" }
The x-hasura-tenant-id JWT claim is set during authentication and reflects the user's currently active tenant (see multi-tenant-jwt-session ADR).
- `insert` permissions include a column preset: `tenant_id = x-hasura-tenant-id` (prevents client-side forgery).
- `update` and `delete` permissions include the same `tenant_id` filter (prevents cross-tenant mutation).
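The rules above can be sketched as a small builder that produces the four permission entries for one tenant-scoped role. This is a minimal illustration, not the project's actual metadata tooling: `tenantScopedPermissions` and the `RolePermissions` shape are hypothetical names. It assumes Hasura's metadata conventions, where insert permissions express their row condition as `check` and their column presets as `set`.

```typescript
// Session variable that carries the active tenant (from the ADR).
const TENANT_CLAIM = "x-hasura-tenant-id";

type TenantFilter = { tenant_id: { _eq: string } };

// Hypothetical, simplified shape of the permissions for one role on one table.
interface RolePermissions {
  select: { columns: string[]; filter: TenantFilter };
  insert: { columns: string[]; check: TenantFilter; set: { tenant_id: string } };
  update: { columns: string[]; filter: TenantFilter; check: TenantFilter };
  delete: { filter: TenantFilter };
}

function tenantScopedPermissions(columns: string[]): RolePermissions {
  const filter: TenantFilter = { tenant_id: { _eq: TENANT_CLAIM } };
  // tenant_id is supplied by the preset, so it is removed from writable columns.
  const writable = columns.filter((c) => c !== "tenant_id");
  return {
    select: { columns, filter },
    insert: { columns: writable, check: filter, set: { tenant_id: TENANT_CLAIM } },
    update: { columns: writable, filter, check: filter },
    delete: { filter },
  };
}
```

Generating the entries from one function keeps the filter and the preset identical across select, insert, update, and delete, which is the property the CI checks later verify.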
Layer 2: Postgres Row Level Security (Defense-in-Depth)
Each tenant-scoped table has an RLS policy as a safeguard:
ALTER TABLE backoffice.tour_templates ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON backoffice.tour_templates
  USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
NestJS sets app.current_tenant_id via SET LOCAL at the start of each transaction on the acquired connection (see §TenantInterceptor below). SET LOCAL is transaction-scoped, so it must run inside the same transaction as the queries it protects.
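A minimal sketch of how the tenant setting could be scoped to one transaction. `TxClient` and `withTenant` are hypothetical names. The client is assumed to expose a promise-based `query(sql, params)` method, as a node-postgres client does. Because `SET LOCAL` does not accept bind parameters, the sketch uses the PostgreSQL function `set_config(name, value, is_local)` with `is_local = true`, which has the same transaction scope.

```typescript
// Minimal structural type for a checked-out connection (hypothetical).
interface TxClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Runs `work` inside a transaction whose app.current_tenant_id is set.
async function withTenant<T>(
  client: TxClient,
  tenantId: string,
  work: (client: TxClient) => Promise<T>
): Promise<T> {
  await client.query("BEGIN");
  try {
    // Third argument true = is_local: the value is discarded at COMMIT/ROLLBACK.
    await client.query("SELECT set_config('app.current_tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    // Any failure, including a failed set_config, rolls back and propagates.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Passing the tenant id as a bind parameter avoids interpolating a header value into SQL text.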
Exempt Entities
| Category | Examples | Rationale |
|---|---|---|
| Global reference tables | countries, currencies, vehicle_types, legal_forms | Shared, immutable reference data. No tenant_id. Schema: public. |
| Platform-scoped tables | auth.users (Nhost) | Cross-tenant identity. Filtered by Hasura role, not tenant. |
Busflow Staff Bypass
The Hasura role busflow_staff has unrestricted select permissions (no tenant_id filter). Staff have no direct mutation permissions on tenant-scoped tables; Hasura Actions route all writes through mandatory change_events audit logging (see §Staff Audit Enforcement below).
NestJS TenantInterceptor (Fail-Closed)
Every NestJS request that acquires a database connection must set the tenant context before any query executes. A global NestJS interceptor enforces this.
| Step | Action | Failure Mode |
|---|---|---|
| 1 | Extract x-hasura-tenant-id from the incoming request header (forwarded by Hasura via Action/Event Trigger). | If header is missing → reject with 403 (TENANT_CONTEXT_MISSING). Never fall through to an unscoped query. |
| 2 | Execute SET LOCAL app.current_tenant_id = '<tenant_id>' on the acquired connection. | If SET LOCAL fails → reject with 500, release connection. |
| 3 | Proceed with handler logic. | Normal execution. |
| 4 | Connection returned to pool. SET LOCAL resets on transaction end. | No cleanup required. |
Busflow Staff bypass: If x-hasura-role = busflow_staff, skip steps 1 and 2. app.current_tenant_id is NOT set; RLS is bypassed via the Postgres BYPASSRLS role (see §Dev-Environment RLS Bypass).
Key principle: The system fails closed. A missing tenant context produces an explicit error, never a silently empty result set.
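The decision made in step 1, together with the staff bypass, can be sketched as a pure function that an interceptor would call before acquiring a connection. `resolveTenantContext` and `TenantDecision` are hypothetical names. Header names are assumed to arrive lowercased, as Node's HTTP layer delivers them.

```typescript
// Outcome of inspecting the forwarded Hasura headers.
type TenantDecision =
  | { kind: "tenant"; tenantId: string }
  | { kind: "staff" }
  | { kind: "reject"; status: 403; code: "TENANT_CONTEXT_MISSING" };

function resolveTenantContext(
  headers: { [name: string]: string | undefined }
): TenantDecision {
  // Staff requests skip tenant scoping; RLS bypass happens at the Postgres role level.
  if (headers["x-hasura-role"] === "busflow_staff") {
    return { kind: "staff" };
  }
  const tenantId = headers["x-hasura-tenant-id"];
  // Fail closed: a missing or blank header is rejected, never treated as "no filter".
  if (tenantId === undefined || tenantId.trim() === "") {
    return { kind: "reject", status: 403, code: "TENANT_CONTEXT_MISSING" };
  }
  return { kind: "tenant", tenantId };
}
```

Returning a tagged result instead of an optional tenant id forces every caller to handle the reject case explicitly, which matches the fail-closed principle.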
New-Table Migration Checklist (CI-Enforced)
Every new domain table must pass the following checklist. A CI lint script scans migration files and Hasura metadata to verify compliance.
[ ] 1. Table has `tenant_id UUID NOT NULL REFERENCES backoffice.operators(id)`
[ ] 2. Index: CREATE INDEX idx_<table>_tenant ON <schema>.<table>(tenant_id)
[ ] 3. RLS enabled: ALTER TABLE <schema>.<table> ENABLE ROW LEVEL SECURITY
[ ] 4. RLS policy:
CREATE POLICY tenant_isolation ON <schema>.<table>
USING (tenant_id = current_setting('app.current_tenant_id')::uuid)
[ ] 5. Hasura CRUD permissions include: filter: { tenant_id: { _eq: "x-hasura-tenant-id" } }
[ ] 6. Hasura INSERT includes column preset: tenant_id: x-hasura-tenant-id
[ ] 7. Table added to exempt list (if global reference); requires ADR justification
CI Lint Guard (scripts/lint-tenant-isolation.sh)
| Check | Logic | Failure |
|---|---|---|
| RLS enabled | For every non-exempt table, pg_class.relrowsecurity = true | ❌ Block merge |
| RLS policy exists | pg_policies has a row with qual containing app.current_tenant_id | ❌ Block merge |
| Hasura permission filter | Hasura metadata YAML for every role includes tenant_id filter | ❌ Block merge |
| INSERT preset | Hasura INSERT permission includes tenant_id column preset | ❌ Block merge |
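The two catalog checks in the table can be sketched as a pure function over rows already read from pg_class and pg_policies. The guard itself is a shell script, so this only illustrates the logic. `TableCatalogRow` and `lintRls` are hypothetical names, and the exempt list is copied from the registry in this ADR.

```typescript
// One row per table, pre-joined from pg_class and pg_policies (hypothetical shape).
interface TableCatalogRow {
  schema: string;
  table: string;
  relrowsecurity: boolean; // pg_class.relrowsecurity
  policyQuals: string[];   // pg_policies.qual for each policy on the table
}

const EXEMPT_TABLES = [
  "public.countries",
  "public.currencies",
  "public.vehicle_types",
  "public.legal_forms",
  "auth.users",
];

// Returns one message per violation; an empty array means the merge may proceed.
function lintRls(rows: TableCatalogRow[], exempt: string[] = EXEMPT_TABLES): string[] {
  const failures: string[] = [];
  for (const row of rows) {
    const name = `${row.schema}.${row.table}`;
    if (exempt.indexOf(name) !== -1) continue;
    if (!row.relrowsecurity) {
      failures.push(`${name}: RLS not enabled`);
    }
    const hasTenantPolicy = row.policyQuals.some(
      (qual) => qual.indexOf("app.current_tenant_id") !== -1
    );
    if (!hasTenantPolicy) {
      failures.push(`${name}: no policy referencing app.current_tenant_id`);
    }
  }
  return failures;
}
```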
Exempt Table Registry
Maintained in domain-driven-design.md §2.1:
| Table | Schema | Justification |
|---|---|---|
| countries | public | Global reference, immutable |
| currencies | public | Global reference, immutable |
| vehicle_types | public | Global reference, immutable |
| legal_forms | public | Global reference, immutable |
| auth.users | auth | Nhost-managed, platform-scoped |
Hasura Metadata Validation (CI)
| Check | Tool | Failure |
|---|---|---|
| Metadata consistency | hasura metadata ic (inconsistency check) | ❌ Block merge |
| Permission completeness | Custom script: for every non-exempt table, verify CRUD permissions exist for manager and dispatcher roles | ⚠️ Warning |
| Column preset presence | Custom script: for every insert_permissions entry, verify tenant_id preset | ❌ Block merge |
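The two custom-script checks might look like the sketch below. It assumes the metadata has already been parsed into objects and that the caller passes only non-exempt tables. `TableMetadata` is a simplified, hypothetical subset of Hasura's per-table metadata, and `validateMetadata` and `Finding` are illustrative names.

```typescript
interface PermissionEntry {
  role: string;
  permission: { set?: { [column: string]: string } };
}

// Simplified subset of one table's Hasura metadata (hypothetical shape).
interface TableMetadata {
  table: { schema: string; name: string };
  select_permissions?: PermissionEntry[];
  insert_permissions?: PermissionEntry[];
  update_permissions?: PermissionEntry[];
  delete_permissions?: PermissionEntry[];
}

interface Finding {
  table: string;
  severity: "warn" | "block";
  message: string;
}

const REQUIRED_ROLES = ["manager", "dispatcher"];

function validateMetadata(tables: TableMetadata[]): Finding[] {
  const findings: Finding[] = [];
  for (const t of tables) {
    const name = `${t.table.schema}.${t.table.name}`;
    const groups = [t.select_permissions, t.insert_permissions, t.update_permissions, t.delete_permissions];
    // Permission completeness: warning only, per the table above.
    for (const role of REQUIRED_ROLES) {
      const complete = groups.every((g) => (g ?? []).some((p) => p.role === role));
      if (!complete) {
        findings.push({ table: name, severity: "warn", message: `incomplete CRUD permissions for ${role}` });
      }
    }
    // Column preset presence: blocks the merge.
    for (const p of t.insert_permissions ?? []) {
      const preset = p.permission.set?.tenant_id;
      if (preset === undefined || preset.toLowerCase() !== "x-hasura-tenant-id") {
        findings.push({ table: name, severity: "block", message: `insert permission for ${p.role} lacks tenant_id preset` });
      }
    }
  }
  return findings;
}
```

Keeping severity on each finding lets the CI job fail only on "block" entries while still printing warnings.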
Staff Audit Enforcement
| Constraint | Enforcement |
|---|---|
| All staff mutations go through Hasura Actions | busflow_staff has no direct mutation permissions on tenant-scoped tables |
| Action handlers create change_events | Every handler under busflow_staff context inserts a change_events row. actor_id = staff user's auth.users.id |
| target_tenant_id on staff events | Staff change_events include target_tenant_id for per-tenant audit filtering |
| CI enforcement | Integration tests assert every busflow_staff Action handler calls AuditService.logChange() |
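One way to make the audit call hard to forget is to route every staff handler through a single wrapper. The sketch below is an assumption about how that could look: the ADR names `AuditService.logChange()` but not its signature, so `AuditLogger`, `ChangeEvent`, and `runStaffMutation` are hypothetical. The mutation and the audit insert are assumed to share one database transaction, so a failure in either rolls back both.

```typescript
// Fields taken from the table above, plus one hypothetical free-text field.
interface ChangeEvent {
  actor_id: string;         // staff user's auth.users.id
  target_tenant_id: string; // tenant whose data is being changed
  description: string;      // hypothetical: what was changed
}

// Stand-in for AuditService; the real signature is not given in this ADR.
interface AuditLogger {
  logChange(event: ChangeEvent): Promise<void>;
}

// Runs a staff mutation and records the change event before returning.
async function runStaffMutation<T>(
  audit: AuditLogger,
  event: ChangeEvent,
  mutate: () => Promise<T>
): Promise<T> {
  const result = await mutate();
  // If logging fails, the error propagates and the caller never sees success.
  await audit.logChange(event);
  return result;
}
```

An integration test can then inject a recording `AuditLogger` and assert that each Action handler produced exactly one event.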
Dev-Environment RLS Bypass
| Environment | RLS Status | Hasura Console | Direct SQL |
|---|---|---|---|
| Production | Enabled, enforced | Console disabled (HASURA_GRAPHQL_ENABLE_CONSOLE=false) | All connections use tenant-scoped role |
| Staging | Enabled, enforced | Console enabled, hasura_admin role with BYPASSRLS | hasura_admin role |
| Development | Enabled, enforced | Console enabled, hasura_admin role with BYPASSRLS | hasura_admin role |
-- Admin connection: BYPASSRLS, never used for application queries.
CREATE ROLE hasura_admin WITH LOGIN BYPASSRLS;
-- Application connection: no BYPASSRLS.
CREATE ROLE busflow_app WITH LOGIN;
Key principle: RLS is always enabled in every environment. Bypass is at the role level, not by disabling policies.
Multi-Tab Tenant Context
When a user switches tenants in one tab, other tabs hold a stale JWT:
| Mechanism | Behavior |
|---|---|
| BroadcastChannel API | Active tab broadcasts { type: 'TENANT_SWITCHED', tenant_id } to all same-origin tabs |
| Receiving tab | Compares broadcast tenant_id against its JWT's x-hasura-tenant-id. If different → modal: "Workspace switched. Reload to continue." |
| No auto-reload | Tabs do NOT auto-reload. Unsaved form data preserved. Stale tab queries still work until JWT expiry. |
| Token refresh fallback | On JWT refresh, Nhost issues a new JWT with the current active tenant (from latest switch). Naturally corrects stale tabs. |
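The receiving-tab behavior can be sketched as a pure decision function, kept separate from the browser wiring so it can be unit-tested. `isTenantSwitched`, `onBroadcast`, and the `TabAction` values are hypothetical names. The message shape is the one given in the table.

```typescript
interface TenantSwitchedMessage {
  type: "TENANT_SWITCHED";
  tenant_id: string;
}

type TabAction = "show-reload-modal" | "ignore";

// Type guard: broadcast payloads are untyped, so validate before use.
function isTenantSwitched(data: unknown): data is TenantSwitchedMessage {
  if (typeof data !== "object" || data === null) return false;
  const d = data as { type?: unknown; tenant_id?: unknown };
  return d.type === "TENANT_SWITCHED" && typeof d.tenant_id === "string";
}

// Decides what a receiving tab does; it never reloads automatically.
function onBroadcast(data: unknown, jwtTenantId: string): TabAction {
  if (!isTenantSwitched(data)) return "ignore";
  return data.tenant_id !== jwtTenantId ? "show-reload-modal" : "ignore";
}

// Browser wiring (not executed here; the channel name is an assumption):
//   const channel = new BroadcastChannel("busflow-tenant");
//   channel.onmessage = (ev) => {
//     if (onBroadcast(ev.data, currentJwtTenantId) === "show-reload-modal") showModal();
//   };
//   channel.postMessage({ type: "TENANT_SWITCHED", tenant_id: newTenantId });
```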
Consequences
Positive:
- Defense-in-depth: even if someone misconfigures Hasura, RLS prevents data leaks
- Hasura presets prevent client-side `tenant_id` injection
- CI gates prevent shipping unprotected tables
- Fail-closed interceptor prevents silent data hiding
- Staff mutations are always audited
Negative:
- Every new table requires both Hasura permissions and an RLS policy, which means additional migration work
- RLS adds minor query planning overhead (negligible for this scale)
- CI lint guard is a custom script requiring maintenance
- NestJS must consistently set `app.current_tenant_id`; the interceptor enforces this but adds a layer