ADR-030: Subscription Tier Gating - JWT Injection vs. Per-Query Resolution
Status: 🟢 Approved (mechanism + PAST_DUE policy); tier catalog still pending product
Impacts: adr-005-multi-tenant-jwt-session.md (amendment), schema-backoffice.md (sync mechanism), rbac-matrix.md (implementation notes)
Related ADRs: ADR-005, ADR-003
WARNING
Tier catalog still pending. Tier names, count, and feature-to-tier mappings require product validation. Placeholder values (CORE, PRO, ENTERPRISE) appear where examples are needed. The gating mechanism and PAST_DUE policy are approved and can proceed.
Context
Busflow monetizes via subscription tiers that gate access to premium features. Each tier unlocks additional capabilities above the base offering. The tenant_subscriptions.plan_id column holds the authoritative tier.
The challenge: how does Hasura (the API layer) know which tier to enforce per request?
Two options evaluated:
1. JWT Injection: Bake the tier into the JWT at issuance. Hasura permission rules check the claim. Staleness window of up to 15 min (JWT TTL per ADR-005). Fast, stateless.
2. Per-Query Resolution: Hasura's custom auth webhook resolves the tier per incoming GraphQL query by hitting the database. Always current. At scale (4k+ tenants × 100 QPS), this implies 400k+ DB queries/sec for plan resolution alone.
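The load gap between the two options can be made concrete with a small back-of-envelope model. The tenant count, per-tenant QPS, and 15-min TTL come from this ADR; the figure of 50 concurrent sessions per tenant is a hypothetical value chosen only for illustration.

```typescript
// Figures taken from this ADR.
const TENANTS = 4000;
const QPS_PER_TENANT = 100;
const JWT_TTL_SECONDS = 15 * 60;

// Hypothetical: average concurrent sessions per tenant (not stated in the ADR).
const ASSUMED_SESSIONS_PER_TENANT = 50;

// Per-query resolution: every GraphQL query triggers one plan lookup.
function perQueryLookupsPerSecond(tenants: number, qpsPerTenant: number): number {
  return tenants * qpsPerTenant;
}

// JWT injection: one plan lookup per session per token issuance or refresh.
function jwtLookupsPerSecond(
  tenants: number,
  sessionsPerTenant: number,
  ttlSeconds: number,
): number {
  return (tenants * sessionsPerTenant) / ttlSeconds;
}

const perQuery = perQueryLookupsPerSecond(TENANTS, QPS_PER_TENANT); // 400000
const viaJwt = jwtLookupsPerSecond(TENANTS, ASSUMED_SESSIONS_PER_TENANT, JWT_TTL_SECONDS); // about 222
```

Under these assumptions the JWT path issues on the order of hundreds of lookups per second, against 400k for per-query resolution. The exact ratio depends on the real session count.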
Decision
Option 1: JWT Injection.
The Nhost custom claims webhook injects X-Hasura-Plan (derived from denormalized operators.subscription_tier) into all JWTs at issuance and every 15-min refresh per ADR-005 §TTL. Hasura permission rules on premium tables check this claim.
The 15-min staleness window is acceptable because plan changes are rare events, the safe default is the lowest tier (pessimistic), and frontend assumes the base tier by default.
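As a rough model of what a permission rule on a premium table decides, the sketch below compares the plan claim against a required tier. It is plain TypeScript, not Hasura permission syntax. The tier names are the placeholders from this ADR, and the ascending order CORE < PRO < ENTERPRISE and the function names are assumptions.

```typescript
// Placeholder tiers in assumed ascending order; the real catalog is pending product.
const TIER_ORDER = ["CORE", "PRO", "ENTERPRISE"] as const;
type Tier = (typeof TIER_ORDER)[number];

// Type guard: is this string one of the known placeholder tiers?
function isTier(value: string): value is Tier {
  return (TIER_ORDER as readonly string[]).indexOf(value) !== -1;
}

// Returns true when the plan claim in the JWT meets the tier a table requires.
function claimSatisfies(claims: Record<string, string>, required: Tier): boolean {
  const claimed = claims["x-hasura-plan"];
  // A missing or unrecognized claim is treated as the lowest tier (pessimistic).
  const effective: Tier =
    claimed !== undefined && isTier(claimed) ? claimed : TIER_ORDER[0];
  return TIER_ORDER.indexOf(effective) >= TIER_ORDER.indexOf(required);
}
```

For example, `claimSatisfies({ "x-hasura-plan": "PRO" }, "ENTERPRISE")` is false, and a token with no plan claim only passes checks that require the lowest tier.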
Engineering Sub-Decisions
1. Sync Mechanism: Postgres Trigger vs. NestJS Fallback
Preferred: Postgres AFTER UPDATE trigger on tenant_subscriptions.plan_id fires a stored function syncing operators.subscription_tier atomically in the same transaction.
Fallback: If Ubicloud denies CREATE FUNCTION, a NestJS application-layer transaction wraps both updates.
Pre-flight validation: Test CREATE FUNCTION on target Ubicloud instance before implementation. Both paths must be documented in schema-backoffice.md §tenant_subscriptions.
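The all-or-nothing behavior required of the fallback path can be sketched without a database. The in-memory `FakeDb`, the `Tx` interface, and `changePlan` below are hypothetical stand-ins; a real NestJS service would use its ORM's transaction API. For brevity the sketch treats plan_id as if it were the tier value itself, which is an assumption.

```typescript
type Tier = "CORE" | "PRO" | "ENTERPRISE"; // placeholder tiers from this ADR

// Hypothetical minimal transaction handle covering the two writes.
interface Tx {
  setPlanId(tenantId: string, planId: Tier): void;       // tenant_subscriptions.plan_id
  setOperatorTier(tenantId: string, tier: Tier): void;   // operators.subscription_tier
}

// In-memory stand-in for the two tables, only to make the sketch runnable.
class FakeDb {
  planIds: Record<string, Tier>;
  operatorTiers: Record<string, Tier>;

  constructor(seed: Record<string, Tier>) {
    this.planIds = { ...seed };
    this.operatorTiers = { ...seed };
  }

  // Runs fn against copies; commits both tables only if fn does not throw.
  transaction(fn: (tx: Tx) => void): void {
    const plans = { ...this.planIds };
    const tiers = { ...this.operatorTiers };
    fn({
      setPlanId: (tenantId, planId) => {
        plans[tenantId] = planId;
      },
      setOperatorTier: (tenantId, tier) => {
        if (!(tenantId in tiers)) throw new Error("unknown operator " + tenantId);
        tiers[tenantId] = tier;
      },
    });
    this.planIds = plans;
    this.operatorTiers = tiers;
  }
}

// Fallback path: both updates are applied together or not at all.
function changePlan(db: FakeDb, tenantId: string, plan: Tier): void {
  db.transaction((tx) => {
    tx.setPlanId(tenantId, plan);
    tx.setOperatorTier(tenantId, plan);
  });
}
```

If the second write fails, the first is discarded with it, which is the property the Postgres trigger provides by running in the same transaction.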
2. NULL Fallback: Lowest Tier (Pessimistic Default)
If operators.subscription_tier IS NULL (sync failure or unprovisioned):
- Claims hook injects X-Hasura-Plan with the lowest tier value.
- User loses premium access (safe). Session logs warning.
- Prevention: ProvisionTenant initializes subscription_tier with the lowest tier in the same transaction. Schema enforces a NOT NULL DEFAULT constraint.
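A minimal sketch of the fallback inside the claims hook, assuming CORE is the lowest placeholder tier. The `resolveTier` name and the `warn` callback are hypothetical; the actual logging mechanism is not specified in this ADR.

```typescript
type Tier = "CORE" | "PRO" | "ENTERPRISE"; // placeholder tiers from this ADR
const LOWEST_TIER: Tier = "CORE";          // assumed lowest placeholder tier

// Picks the tier to inject as the plan claim. A NULL column value falls back
// to the lowest tier and emits a warning through the supplied callback.
function resolveTier(
  subscriptionTier: Tier | null,
  warn: (message: string) => void,
): Tier {
  if (subscriptionTier === null) {
    warn("operators.subscription_tier is NULL; injecting lowest tier");
    return LOWEST_TIER;
  }
  return subscriptionTier;
}
```

With the NOT NULL DEFAULT constraint in place this branch should never run; it exists as a second line of defense.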
3. busflow_staff Unconditional Bypass
When auth.users.default_role = 'busflow_staff':
- X-Hasura-Plan always set to the highest tier (no plan checking).
- x-hasura-tenant-id omitted (no tenant scoping).
- All mutations audited via change_events(scope='CONFIG').
Rationale: Staff needs cross-tenant support/onboarding access without plan restrictions.
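The branch in the claims hook might look like the following sketch. The `ClaimInput` shape and `buildClaims` name are hypothetical, and ENTERPRISE stands in for the highest tier only because it is the top placeholder in this ADR.

```typescript
type Tier = "CORE" | "PRO" | "ENTERPRISE"; // placeholder tiers from this ADR
const HIGHEST_TIER: Tier = "ENTERPRISE";   // assumed highest placeholder tier

// Hypothetical input the hook would assemble before building claims.
interface ClaimInput {
  defaultRole: string; // auth.users.default_role
  tenantId: string;    // currently selected tenant
  tenantTier: Tier;    // operators.subscription_tier for that tenant
}

function buildClaims(input: ClaimInput): Record<string, string> {
  if (input.defaultRole === "busflow_staff") {
    // Staff: highest tier, and no tenant id so nothing is scoped to a tenant.
    return { "x-hasura-plan": HIGHEST_TIER };
  }
  return {
    "x-hasura-plan": input.tenantTier,
    "x-hasura-tenant-id": input.tenantId,
  };
}
```

Note that the staff branch never reads `tenantTier`, so a desynced or downgraded tenant cannot limit staff access.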
4. Claim Casing: Follows Hasura Convention
Hasura normalizes all x-hasura-* headers/claims to lowercase on intake. The JWT claim is x-hasura-plan (lowercase). Rules reference x-hasura-plan.
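Code outside Hasura that inspects claims (tests, logging, the frontend) can apply the same convention so that lookups do not depend on how a key was spelled. The helper below is a hypothetical sketch, not part of any Hasura or Nhost API.

```typescript
// Returns a copy of the claims with every key lowercased.
function normalizeClaimKeys(claims: Record<string, string>): Record<string, string> {
  const normalized: Record<string, string> = {};
  Object.keys(claims).forEach((key) => {
    const value = claims[key];
    if (value !== undefined) {
      normalized[key.toLowerCase()] = value;
    }
  });
  return normalized;
}
```

After normalization, `X-Hasura-Plan` and `x-hasura-plan` resolve to the same entry.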
Product Sub-Decisions
A. PAST_DUE Subscription Status (Locked: Option A)
When tenant_subscriptions.status = PAST_DUE but plan_id still reflects a paid tier:
- Decision: Option A. Keep plan in claim; enforce status in Hasura rules.
- Claims hook ignores status; injects current plan_id.
- Hasura permission rules on premium tables add a status filter: tenant_subscriptions.status = 'ACTIVE'. busflow_staff bypasses the status check unconditionally.
- Result: PAST_DUE tenants lose premium access via the status check, not the plan check. Separates plan entitlement (what they paid for) from plan billing state (payment status). Enables grace periods and easy restoration on payment recovery without JWT re-issue.
Rejected alternative: Option B (downgrade claim to lowest tier on PAST_DUE). Simpler logic, but loses plan context and requires JWT re-issue on payment recovery.
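The separation between entitlement and billing state under Option A can be modeled as two independent checks. This is a plain TypeScript model of the rule semantics, not Hasura permission syntax; the `RequestContext` shape, the tier ranking, and the function name are assumptions.

```typescript
type Tier = "CORE" | "PRO" | "ENTERPRISE";        // placeholder tiers from this ADR
type SubscriptionStatus = "ACTIVE" | "PAST_DUE";  // other statuses may exist

// Hypothetical view of what a rule on a premium table can see.
interface RequestContext {
  role: string;                            // active Hasura role
  plan: Tier;                              // plan claim from the JWT
  subscriptionStatus: SubscriptionStatus;  // tenant_subscriptions.status, read at query time
}

// Assumed ascending tier order.
const RANK: Record<Tier, number> = { CORE: 0, PRO: 1, ENTERPRISE: 2 };

function canAccessPremium(ctx: RequestContext, required: Tier): boolean {
  // Staff bypass both the plan check and the status check.
  if (ctx.role === "busflow_staff") return true;
  const entitled = RANK[ctx.plan] >= RANK[required];   // what they paid for
  const paidUp = ctx.subscriptionStatus === "ACTIVE";  // current billing state
  return entitled && paidUp;
}
```

Because the status is read at query time while the plan comes from the token, flipping the status back to ACTIVE restores access with the same JWT.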
B. Tier Catalog (Pending Product)
IMPORTANT
Tier names, count, and feature-to-tier mappings require a PRODUCT_pricing-tiers.md document. Current placeholders (CORE, PRO, ENTERPRISE) are used throughout the documentation for illustration only.
Consequences
Positive
- No per-query overhead. Single JWT claim evaluation per Hasura session (sub-millisecond).
- Stateless tenant switching. /auth/select-tenant returns new JWT with target tenant's tier; no server-side session.
- Clear audit trail. Plan changes logged via change_events(scope='CONFIG').
Negative
- 15-min staleness. Plan upgrade at t=0; active JWT holders see old tier until next 15-min refresh. Workaround: users log out and back in to force a refresh.
- Denormalization sync complexity. Sync mechanism must be atomic (trigger or NestJS transaction). Desync recovery requires ops runbook.
- Staff role shortcut. busflow_staff bypasses all plan checks (justified for support; requires audit trail discipline).
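The staleness window for a single session follows directly from when its current token was issued. The sketch below assumes the token is refreshed exactly one TTL after issuance; the function name is hypothetical.

```typescript
const JWT_TTL_MS = 15 * 60 * 1000; // 15-min TTL per ADR-005

// How long (ms) a session keeps the old tier after a plan change, assuming
// its token is refreshed exactly ttlMs after it was issued.
function staleWindowMs(
  issuedAtMs: number,
  planChangedAtMs: number,
  ttlMs: number = JWT_TTL_MS,
): number {
  // A change before issuance is already reflected in the token. A change at
  // the exact issuance instant is counted as missed (worst case).
  if (planChangedAtMs < issuedAtMs) return 0;
  const refreshAtMs = issuedAtMs + ttlMs;
  return Math.max(0, refreshAtMs - planChangedAtMs);
}
```

A plan change 5 minutes into a token's life leaves that session stale for the remaining 10 minutes; the 15-minute figure is the worst case, not the typical one.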
Trade-Offs Accepted
| Trade-Off | Acceptance Rationale |
|---|---|
| 15-min propagation delay on upgrades | Rare event; lowest tier is safe default; users can force refresh via logout |
| Denormalization sync | Postgres trigger is atomic; NestJS fallback documented; desync detection runbook sufficient for MVP |
| busflow_staff unconditional bypass | Necessary for support workflows; all mutations audited; risk accepted |
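The desync detection mentioned in the table could start as a simple comparison of the two sources of truth. The row shapes and the `findDesynced` name below are hypothetical, and plan_id is again treated as the tier value for brevity; the actual runbook is not defined in this ADR.

```typescript
type Tier = "CORE" | "PRO" | "ENTERPRISE"; // placeholder tiers from this ADR

interface SubscriptionRow {
  tenant_id: string;
  plan_id: Tier; // authoritative tier
}

interface OperatorRow {
  id: string;
  subscription_tier: Tier | null; // denormalized copy
}

// Returns the tenant ids whose denormalized tier differs from the
// authoritative plan, including operators that are missing or NULL.
function findDesynced(subs: SubscriptionRow[], operators: OperatorRow[]): string[] {
  const tierByOperator: Record<string, Tier | null> = {};
  operators.forEach((o) => {
    tierByOperator[o.id] = o.subscription_tier;
  });
  return subs
    .filter((s) => tierByOperator[s.tenant_id] !== s.plan_id)
    .map((s) => s.tenant_id);
}
```

An empty result means the denormalized column matches the authoritative plan for every subscription checked.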
Alternatives Considered
Alternative A: Per-Query Resolution (Rejected)
Approach: Hasura custom auth webhook queries the database for the tier on every incoming GraphQL query.
Pros:
- Always current (no 15-min window).
- No denormalization sync needed.
Cons:
- At scale (4k tenants × 100 QPS), implies 400k+ DB queries/sec just for plan resolution.
- Hasura auth hook adds latency to every request.
- Cascade failures: if auth DB is slow, all queries fail.
Decision: Rejected. 15-min staleness acceptable; per-query unscalable.
Alternative B: Forced JWT Refresh on Plan Change (Deferred)
Approach: On plan update, invalidate customer's cached JWT in Redis, forcing refresh on next request.
Pros:
- Reduces staleness window (< 1 min instead of 15 min).
Cons:
- Requires Redis cluster (additional stateful service).
- Nhost capability to invalidate JWTs must be verified.
- Operational complexity.
Decision: Deferred. MVP relies on 15-min TTL. Stretch goal: implement if Nhost provides cache invalidation hooks.
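One possible shape for the deferred mechanism, offered only as a sketch: instead of deleting cached tokens, record when each tenant's plan last changed and treat any token issued before that moment as stale. This timestamp comparison is a swapped-in technique, not the design described above. The `PlanChangeRegistry` class is hypothetical, and the in-memory record stands in for the Redis store.

```typescript
// In-memory stand-in for a shared store such as Redis.
class PlanChangeRegistry {
  private changedAtMs: Record<string, number> = {};

  // Called whenever a tenant's plan is updated.
  markPlanChanged(tenantId: string, nowMs: number): void {
    this.changedAtMs[tenantId] = nowMs;
  }

  // A token must be refreshed if it predates the tenant's latest plan change.
  mustRefresh(tenantId: string, tokenIssuedAtMs: number): boolean {
    const changedAt = this.changedAtMs[tenantId];
    return changedAt !== undefined && tokenIssuedAtMs < changedAt;
  }
}
```

Whether the auth layer can act on such a signal depends on the Nhost capability flagged above, which remains unverified.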
References
- ADR-005: Defines JWT structure, tenant-switch flow, 15-min TTL. This ADR extends ADR-005 with X-Hasura-Plan claim injection.
- ADR-003: Tenant provisioning must initialize tenant_subscriptions(plan_id) and operators(subscription_tier) atomically.
- Schema Matrix Enforcement: Hasura permission rules on premium tables and Actions use X-Hasura-Plan claim per the gating mechanism defined in this ADR. See schema-backoffice.md §tour_templates and rbac-matrix.md for enforcement points.
Approval Checklist
Before implementation begins, architect must confirm:
- [x] JWT injection strategy (15-min staleness) is acceptable vs. per-query (unscalable).
- [ ] Sync mechanism pre-flight test scheduled; NestJS fallback accepted if Postgres trigger fails.
- [x] busflow_staff unconditional bypass is acceptable for support workflows.
- [x] NULL → lowest-tier fallback is the right pessimistic default.
- [x] PAST_DUE policy: Option A (keep plan, check status in rules) is confirmed.
- [ ] Product has defined: tier names and feature-to-tier mappings.
Architect signature: ________________ Date: ________________
Revision History
| Date | Author | Status | Change |
|---|---|---|---|
| 2026-04-21 | Agent | Proposed | Restored with product content stripped. Engineering decisions preserved: JWT injection vs. per-query, sync mechanism, NULL fallback, staff bypass, casing convention. Product decisions (tier catalog, PAST_DUE policy) flagged as pending. |
| 2026-04-21 | Architect | Approved | JWT injection confirmed. PAST_DUE: Option A locked (keep plan, check status). Tier catalog remains pending product. |