ADR-028: GDPR TTL retention per entity + legal-hold override + UTC-schedule policy
Status: 🟡 Proposed, pending Legal/DPO sign-off on the per-entity table + the `communications.messages` separation-of-concerns position

Impacts: `gdpr-strategy.md` §4, `docs/schemas/schema-commerce.md`, `docs/schemas/schema-backoffice.md`, `apps/api/migrations/2026xxxx_gdpr_ttl.sql`, `apps/api/src/workers/pii-backfill.worker.ts`, `apps/api/src/workers/cron-health.worker.ts`, new runbook `docs/protocols/legal-hold-runbook.md`
Context
The Level-2 spec described GDPR data scrubbing as a uniform 3-year pg_cron sweep over passengers and cascaded children. Three problems with that framing surfaced in the architect loop:
- Tax law conflicts with GDPR uniformity. German GoBD (§ 147 AO, § 14b UStG) mandates 10-year retention on tax-relevant artefacts (invoices, receipts, payment ledgers). A blanket 3-year wipe on `invoices.recipient_snapshot` breaches tax law.
- Idempotency cannot rely on text sentinels. A guard like `first_name <> '[REDACTED]'` wrongly skips a real traveller whose legal first name is `[REDACTED]`. The case is low-probability but non-zero, and when it happens the unscrubbed row is indistinguishable from a scrubbed one.
- Text-level references to PII in `communications.messages.rendered_content` are a Legal/DPO policy question, not a purely technical one. We need a written position.
Plus: no DDL existed for the backoffice.legal_holds table the scrub functions needed to read.
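
The sentinel problem from the second point above can be shown in a few lines. This is a minimal TypeScript sketch rather than the scrub function itself: `PassengerLike`, `needsScrubBySentinel`, and `needsScrubByTimestamp` are hypothetical names, and the row shape only mirrors the `first_name` and `pii_redacted_at` columns named in this ADR.

```typescript
// Hypothetical in-memory stand-in for a passengers row; only the two
// columns relevant to the idempotency guard are modelled.
interface PassengerLike {
  firstName: string;
  piiRedactedAt: Date | null; // null = not yet redacted
}

const SENTINEL = "[REDACTED]";

// Text-sentinel guard, mirroring: first_name <> '[REDACTED]'
function needsScrubBySentinel(p: PassengerLike): boolean {
  return p.firstName !== SENTINEL;
}

// Column guard, mirroring: pii_redacted_at IS NULL
function needsScrubByTimestamp(p: PassengerLike): boolean {
  return p.piiRedactedAt === null;
}

// A never-scrubbed traveller whose legal first name collides with the sentinel.
const collidingRow: PassengerLike = { firstName: "[REDACTED]", piiRedactedAt: null };

console.log(needsScrubBySentinel(collidingRow));  // false: the row is wrongly skipped
console.log(needsScrubByTimestamp(collidingRow)); // true: the row is still picked up
```

The timestamp guard does not inspect the PII columns at all, so no value a traveller can legally hold is able to mask an unscrubbed row.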
Decision
- Per-entity retention windows. Legal-blessed:

  | Entity | Window | Trigger column | Legal basis |
  | --- | --- | --- | --- |
  | `commerce.passengers` | 3 years | `last_booking_at` | BDSG § 35 |
  | `backoffice.resellers` | 2 years | `last_active_at` | BDSG § 35 |
  | `commerce.invoices` (PII in `recipient_snapshot`) | 10 years | `issued_at` | GoBD § 147 AO, § 14b UStG |
  | Chat transcripts (Loki log streams) | 14 days | log rotation | 30-day GDPR window |

- Idempotency via a dedicated column. `passengers.pii_redacted_at TIMESTAMPTZ NULL`. Scrub functions guard on `pii_redacted_at IS NULL`. `NULL` means "not yet redacted"; a set timestamp is the proof-of-redaction.
- Tombstone rows, do not delete. `payments.refund_passenger_id` retains the UUID after scrub; only PII columns are wiped. Hasura is granted read-only permissions on tombstoned rows for financial reconciliation.
- Online back-fill for `last_booking_at`. A NestJS worker (`pii-backfill.worker.ts`) processes rows in chunks of 10 000 with a 1 s delay between chunks. The column stays NULLABLE; `NULL` means "unknown, skip in scrub".
- Legal-hold override. `backoffice.legal_holds (id UUID PRIMARY KEY, tenant_id UUID, passenger_id UUID NULL, reason TEXT, until TIMESTAMPTZ NULL, created_by UUID, created_at TIMESTAMPTZ DEFAULT NOW())`. Each per-entity redaction function reads this table first. A subject on an active hold is skipped and logged as `SKIPPED_LEGAL_HOLD` in `tenant_scrub_logs`. Operations runbook: `docs/protocols/legal-hold-runbook.md`.
- UTC schedules + local reporting. All `pg_cron` schedules are UTC. `tenant_scrub_logs.scrubbed_at` is UTC. Operator-local reports derive display time from `tenants.tz` at render. Auditors don't need to reconcile two timezones.
- Append-only audit table. `tenant_scrub_logs` is append-only; `REVOKE UPDATE, DELETE` from every role except `postgres`. It records UUID + entity type + timestamp + skip reason, never the redacted data.
- Cron-health probe. A NestJS worker (`cron-health.worker.ts`) runs at 03:30 UTC (30 min after the scrub), queries `cron.job_run_details` joined to `cron.job` (which carries the `jobname` column, filtered with `jobname LIKE 'gdpr_%'`) via an admin role, and posts failures to Slack `#ops`, because Ubicloud may not expose `cron.job_run_details` to external Prometheus scrapers.
- `FILLFACTOR = 80` on `passengers` and `resellers`, applied directly in `CREATE TABLE` DDL (greenfield). Nightly `VACUUM ANALYZE` at 02:00 UTC; `pg_stat_user_tables.n_dead_tup` monitored in Mimir.
- `communications.messages` position (pending DPO sign-off): the table stays in Postgres as the canonical conversation record. Rendered names ("Dear Jane, your booking…") are treated as derived data. Once the source `commerce.passengers` row is redacted, `rendered_content` contains an orphaned name reference that can no longer re-identify anyone without the source row. The Loki 14-day TTL covers any PII that leaks into log streams. If DPO rejects this position, a gated Stage 9 cascading text-replace (passenger → booking → conversation linkage) is added as a follow-up decision.
- Ubicloud `pg_cron` dependency: Ubicloud confirms `pg_cron` on `standard-2`; `shared_preload_libraries` ships with `pg_cron`. No support ticket required.
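
The ordering of the guards described in this section (legal hold first, then the `pii_redacted_at` check, then the NULL trigger column, then the retention window) can be sketched as a pure function. This is an illustrative TypeScript sketch under stated assumptions, not the SQL redaction function. Only `SKIPPED_LEGAL_HOLD` is a value named in this ADR; the other outcome labels, the type names, and the helper names are hypothetical. It also assumes that a NULL `until` means an open-ended hold and that a NULL `passenger_id` means a tenant-wide hold, neither of which the ADR states.

```typescript
// Outcome labels: only SKIPPED_LEGAL_HOLD comes from the ADR, the rest are hypothetical.
type ScrubDecision =
  | "SCRUB"
  | "SKIPPED_LEGAL_HOLD"
  | "SKIPPED_ALREADY_REDACTED"
  | "SKIPPED_UNKNOWN_TRIGGER"
  | "SKIPPED_WITHIN_WINDOW";

interface PassengerRow {
  id: string;
  lastBookingAt: Date | null; // null = unknown, skip in scrub
  piiRedactedAt: Date | null; // null = not yet redacted
}

interface LegalHold {
  passengerId: string | null; // ASSUMPTION: null = hold covers the whole tenant
  until: Date | null;         // ASSUMPTION: null = open-ended hold
}

const PASSENGER_RETENTION_YEARS = 3; // commerce.passengers row of the retention table

// Adds whole calendar years in UTC without mutating the input.
function addYearsUtc(d: Date, years: number): Date {
  const copy = new Date(d.getTime());
  copy.setUTCFullYear(copy.getUTCFullYear() + years);
  return copy;
}

function isHoldActive(hold: LegalHold, now: Date): boolean {
  return hold.until === null || hold.until.getTime() > now.getTime();
}

// `holds` is assumed to be pre-filtered to the passenger's tenant.
function decidePassengerScrub(p: PassengerRow, holds: LegalHold[], now: Date): ScrubDecision {
  // Legal holds are consulted before anything else.
  const onHold = holds.some(
    (h) => isHoldActive(h, now) && (h.passengerId === null || h.passengerId === p.id)
  );
  if (onHold) return "SKIPPED_LEGAL_HOLD";
  if (p.piiRedactedAt !== null) return "SKIPPED_ALREADY_REDACTED";
  if (p.lastBookingAt === null) return "SKIPPED_UNKNOWN_TRIGGER";
  const expiresAt = addYearsUtc(p.lastBookingAt, PASSENGER_RETENTION_YEARS);
  return expiresAt.getTime() <= now.getTime() ? "SCRUB" : "SKIPPED_WITHIN_WINDOW";
}

// Usage: a passenger whose last booking is more than 3 years before `now`.
const now = new Date("2026-06-01T03:00:00Z");
const stale: PassengerRow = {
  id: "p-1",
  lastBookingAt: new Date("2022-01-01T00:00:00Z"),
  piiRedactedAt: null,
};
console.log(decidePassengerScrub(stale, [], now)); // "SCRUB"
console.log(decidePassengerScrub(stale, [{ passengerId: "p-1", until: null }], now)); // "SKIPPED_LEGAL_HOLD"
```

Keeping the decision separate from the write makes each skip reason a value that can be recorded in the audit log without touching the PII columns.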
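
The "store UTC, localise at render" rule for `tenant_scrub_logs.scrubbed_at` might look like the following in a reporting layer. This is a hedged sketch: `renderScrubbedAt` is a hypothetical helper, the `en-GB` display locale is an arbitrary choice, and `tenants.tz` is assumed to hold an IANA zone name such as `Europe/Berlin`, which the ADR does not specify.

```typescript
// Hypothetical helper: formats a stored UTC instant for display in a tenant's zone.
// `tz` stands in for the tenants.tz value (ASSUMPTION: an IANA time zone name).
function renderScrubbedAt(scrubbedAtUtc: Date, tz: string): string {
  return new Intl.DateTimeFormat("en-GB", {
    timeZone: tz,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).format(scrubbedAtUtc);
}

// The stored value never changes; only the rendering differs per tenant.
const scrubbedAt = new Date("2026-01-15T03:00:00Z");
console.log(renderScrubbedAt(scrubbedAt, "Europe/Berlin")); // shows 04:00 (CET is UTC+1 in January)
console.log(renderScrubbedAt(scrubbedAt, "UTC"));           // shows 03:00
```

Because the conversion happens only at display time, the audit table keeps a single time base regardless of how many tenant zones exist.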
Consequences
Positive:
- GoBD compliance is explicit, not an assumption in a comment.
- Scrub idempotency is proof-based (`pii_redacted_at`), not text-based.
- Legal-hold policy is enforceable in SQL, not in an eng process.
Negative:
- Legal/DPO sign-off is a real-world gate before Stage 5 (cron schedules go live).
- A rejected `communications.messages` position forces a Stage 9 cascading text-replace; the architecture is drafted but not built.
Neutral:
- Retention windows can be made data-controller-configurable per tenant in a follow-up ADR if operators demand per-tenant variation.