Busflow Docs

Internal documentation portal

GDPR & Data Protection Strategy

GDPR compliance is built into the Busflow platform architecture, following Privacy by Design principles.

1. Technical and Organizational Measures (TOMs)

  • Encryption: AES-256 for data at rest (Hetzner Volumes) and TLS 1.3 for all data in transit (Traefik edge).
  • EU data residency: all primary application data resides on Hetzner infrastructure in Falkenstein (fsn1). Production Postgres runs on Ubicloud Managed Postgres in eu-central-h1; the Ubicloud SLA § (EU residency) clause guarantees bare-metal residency within the EU and is cited here as the DPIA audit-trail reference. The cutover record lives in ADR-022.
  • Access Control: RBAC via Hasura limits API mutations strictly to authenticated tenants. Staff access requires SSO+2FA.
  • Audit Logging: Postgres logical replication streams critical entity mutations into a write-once audit ledger in Hasura.
  • Data Minimization: Passenger schemas store only the fields explicitly needed for ticket generation and transport liability.

2. Operator Onboarding (DPA)

  • During the B2B tenant onboarding flow, a standardized Data Processing Agreement (DPA) based on the BITKOM template is electronically signed via DocuSign integration.
  • Busflow acts purely as the Data Processor (Auftragsverarbeiter); the Operator remains the Data Controller (Verantwortlicher).

3. Data Flow & Boundary Isolation

  • The Commerce Bounded Context strictly constrains PII (Passenger Details, Reseller Contacts). Other contexts refer to these entities via generic UUIDs, avoiding downstream PII leaks.
  • Client-Side Filtering: The observability layer (Grafana Faro SDK) sanitizes query parameters and DOM elements to catch PII leaks prior to log transmission.
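
The query-parameter sanitization described above can be sketched as a small pure function. This is a minimal illustration, not the Grafana Faro SDK's own API and not Busflow's actual filter: the PII_PARAMS list, the REDACTED marker, and the sanitizeUrl name are assumptions made for the example, and how such a function would be wired into the observability layer is left out.

```typescript
// Hypothetical list of query parameter names treated as PII (not from the Busflow codebase).
const PII_PARAMS: ReadonlySet<string> = new Set(["email", "phone", "first_name", "last_name"]);

// Loose pattern that catches email addresses hidden in otherwise harmless parameters.
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/;

const REDACTED = "[REDACTED]";

// Replaces the value of PII-looking query parameters before a URL is logged.
// The path and the fragment are passed through unchanged in this sketch.
function sanitizeUrl(rawUrl: string): string {
  const hashIndex = rawUrl.indexOf("#");
  const fragment = hashIndex >= 0 ? rawUrl.slice(hashIndex) : "";
  const withoutFragment = hashIndex >= 0 ? rawUrl.slice(0, hashIndex) : rawUrl;

  const queryIndex = withoutFragment.indexOf("?");
  if (queryIndex < 0) {
    return rawUrl; // no query string, nothing to sanitize
  }

  const base = withoutFragment.slice(0, queryIndex);
  const query = withoutFragment.slice(queryIndex + 1);

  const cleaned = query.split("&").map((pair) => {
    const eq = pair.indexOf("=");
    if (eq < 0) {
      return pair; // bare flag without a value
    }
    const key = pair.slice(0, eq);
    const value = pair.slice(eq + 1);

    let decoded: string;
    try {
      decoded = decodeURIComponent(value);
    } catch {
      decoded = value; // malformed escape sequence: inspect the raw value instead
    }

    const isPii = PII_PARAMS.has(key.toLowerCase()) || EMAIL_PATTERN.test(decoded);
    return isPii ? `${key}=${REDACTED}` : pair;
  });

  return `${base}?${cleaned.join("&")}${fragment}`;
}
```

Matching on both the parameter name and the decoded value is a deliberate choice here: a name-only filter would miss an email address typed into a generic search parameter.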

4. Data Subject Rights Workflows

  • Right to Access/Portability: Tenants can invoke a "Generate GDPR Export" REST endpoint via NestJS, which aggregates passenger payload data from the Commerce schema into a standard machine-readable JSON structure.

  • Right to Erasure (Deletion): The system automates strict physical data scrubbing via a pg_cron pipeline with per-entity retention windows rather than a uniform 3-year wipe. Each window requires Legal/DPO sign-off and is codified in ADR-028:

    Entity                                        | Window                                   | Trigger column  | Legal basis
    ----------------------------------------------|------------------------------------------|-----------------|---------------------------
    commerce.passengers                           | 3 years                                  | last_booking_at | BDSG § 35
    backoffice.resellers                          | 2 years                                  | last_active_at  | BDSG § 35
    commerce.invoices (PII in recipient_snapshot) | 10 years                                 | issued_at       | GoBD § 147 AO, § 14b UStG
    Chat transcripts (Loki log streams)           | 14 days                                  | log rotation    | 30-day GDPR window
    communications.messages (Postgres, see below) | retained (see "Separation of concerns")  | n/a             | operational data
  • Sub-Entity Cascading (tombstone, not delete): The scrubbing pipeline cascades pseudonymization across sub-entities: Commerce passengers (redact first_name, last_name, email, phone) and associated tickets (void). payments.refund_passenger_id keeps the UUID so referential integrity is preserved; the row is "tombstoned" (PII columns wiped), not deleted. Hasura gets read-only permissions on tombstoned rows for financial reconciliation.

  • Idempotency via dedicated column: scrubs guard on passengers.pii_redacted_at IS NULL (a dedicated TIMESTAMPTZ), not on a text sentinel like first_name <> '[REDACTED]', which would fail against a real traveller whose legal first name happens to be [REDACTED].

  • UTC schedules, local reporting: all pg_cron schedules and all tenant_scrub_logs.scrubbed_at values are UTC. Operator-local compliance reports derive display time from tenants.tz at render time. A DPIA auditor who reads "scrub ran at 03:00" will never need to reconcile two timezones.

  • Online back-fill: last_booking_at is populated by a NestJS worker (pii-backfill.worker.ts) that processes rows in chunks of 10 000 with a 1 s delay between chunks. The column stays NULLABLE forever: a NULL means "unknown, skip in scrub", not "ready to redact".

  • Legal-hold override: the backoffice.legal_holds table (created in the same migration as the scrub functions) records active investigations. Each per-entity redaction function reads this table first; a passenger/reseller/invoice/conversation on an active hold is skipped and logged as SKIPPED_LEGAL_HOLD in tenant_scrub_logs. Runbook: legal-hold-runbook.md.

  • Scrub Audit Log: A dedicated tenant_scrub_logs table (append-only; REVOKE UPDATE, DELETE) records the UUIDs of scrubbed records, entity type, execution timestamp, and skip reason. The log omits actual redacted data; it proves the scrub occurred without reintroducing PII.

  • Separation of concerns (communications.messages): The communications.messages table is operational Postgres data, not a log stream. rendered_content holds rendered names ("Dear Jane, your booking…"). Our position is that once the source commerce.passengers row is redacted, the rendered reference is an orphaned name fragment that can no longer re-identify anyone without the source row. The Loki 14-day TTL handles leaked PII in log streams. If Legal/DPO rejects this position, we activate a gated Stage 9 cascading text-replace (passenger → booking → conversation linkage). It is not yet implemented; see the architect-loop doc for the contract.

  • Immutability Edge Case (logs, not messages): For raw log trails (Loki/Tempo), retroactive scrubbing of append-only chunks is computationally hazardous. Instead, the system relies on a strict 14-day retention TTL to naturally age out any inadvertently logged PII, well within the legally compliant 30-day window. This is managed by the Loki compactor, not by MinIO bucket-lifecycle rules, which cause "chunk not found" errors at query time.
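
The scrub rules in this section (per-entity windows, the pii_redacted_at guard, NULL trigger columns, legal holds, tombstoning) can be sketched as a pure decision function. This is an in-memory TypeScript illustration under stated assumptions, not the pg_cron implementation: the ScrubCandidate and PassengerRow shapes, every outcome label other than SKIPPED_LEGAL_HOLD, the order of checks after the legal-hold lookup, the inclusive cutoff comparison, and wiping columns to null are all made up for the example.

```typescript
// Illustrative in-memory model; the production pipeline runs in Postgres via pg_cron.
type EntityKind = "passenger" | "reseller" | "invoice";

// Retention windows in years, taken from the table in this section.
const RETENTION_YEARS: Record<EntityKind, number> = { passenger: 3, reseller: 2, invoice: 10 };

// Hypothetical row shape: triggerAt stands for last_booking_at, last_active_at or issued_at.
interface ScrubCandidate {
  id: string;
  kind: EntityKind;
  triggerAt: Date | null; // null means "unknown, skip in scrub"
  piiRedactedAt: Date | null; // dedicated idempotency column
}

// Only SKIPPED_LEGAL_HOLD is named in this document; the other labels are assumptions.
type ScrubOutcome =
  | "SCRUBBED"
  | "SKIPPED_LEGAL_HOLD"
  | "SKIPPED_ALREADY_REDACTED"
  | "SKIPPED_UNKNOWN_TRIGGER"
  | "SKIPPED_WITHIN_WINDOW";

// Cutoff instant computed in UTC; rows whose trigger is at or before it are due.
function retentionCutoff(kind: EntityKind, now: Date): Date {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - RETENTION_YEARS[kind]);
  return cutoff;
}

function decideScrub(row: ScrubCandidate, activeHolds: ReadonlySet<string>, now: Date): ScrubOutcome {
  if (activeHolds.has(row.id)) return "SKIPPED_LEGAL_HOLD"; // legal holds are read first
  if (row.piiRedactedAt !== null) return "SKIPPED_ALREADY_REDACTED"; // idempotency guard
  if (row.triggerAt === null) return "SKIPPED_UNKNOWN_TRIGGER"; // NULL is never "ready to redact"
  if (row.triggerAt.getTime() > retentionCutoff(row.kind, now).getTime()) return "SKIPPED_WITHIN_WINDOW";
  return "SCRUBBED";
}

// Hypothetical passenger shape limited to the columns named in this section.
interface PassengerRow {
  id: string;
  first_name: string | null;
  last_name: string | null;
  email: string | null;
  phone: string | null;
  pii_redacted_at: Date | null;
}

// Tombstone: wipe the PII columns, keep the UUID so foreign keys stay valid.
function tombstonePassenger(row: PassengerRow, now: Date): PassengerRow {
  return { ...row, first_name: null, last_name: null, email: null, phone: null, pii_redacted_at: now };
}
```

Because the guard reads pii_redacted_at rather than a text sentinel, running the decision a second time on a tombstoned row yields SKIPPED_ALREADY_REDACTED regardless of what the name columns contain.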

5. Breach Protocol & Secrets

  • Infrastructure Secrets: All core infrastructure credentials use Docker Swarm secrets (/run/secrets/), with values automatically injected via GitHub Actions CI/CD to eliminate manual rotation.
  • Tenant Credentials: Sensitive tenant API configurations are encrypted at the database level with pgsodium and masked before API exposure.
  • Continuous SAST checks limit inadvertent data leakage at build-time.
  • Production exposure events immediately trigger the #security-alerts escalation protocol to meet the 72-hour notification mandate.
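
The 72-hour mandate is simple clock arithmetic, sketched below so the deadline can be shown next to an alert. The function names are hypothetical, and the sketch assumes the clock starts at the instant the exposure is detected; when the clock legally starts is a question for Legal/DPO, not for this code.

```typescript
// 72 hours expressed in milliseconds.
const NOTIFICATION_WINDOW_MS = 72 * 60 * 60 * 1000;

// Latest instant (UTC) by which the notification must be sent.
// detectedAt is assumed to be the moment the exposure became known.
function notificationDeadline(detectedAt: Date): Date {
  return new Date(detectedAt.getTime() + NOTIFICATION_WINDOW_MS);
}

// Hours left until the deadline; negative once the deadline has passed.
function hoursRemaining(detectedAt: Date, now: Date): number {
  return (notificationDeadline(detectedAt).getTime() - now.getTime()) / (60 * 60 * 1000);
}
```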
