Incident Broadcast Protocol
Domain: Operations → Communications (cross-domain)
Trigger: IncidentCreated domain event (severity=CRITICAL, type=DELAY|BREAKDOWN|PASSENGER_ISSUE)
Output: WhatsApp broadcast to downstream passengers with dispatcher approval
Sources: event-contracts-operations.md, Journey 2: Alpine Stau
§1 Overview
Flow: Driver 1-Tap → incidents INSERT → Hasura Event Trigger → NestJS handler (severity filter) → passenger targeting query → Communications workflow → dispatcher approval gate → BullMQ dispatch → Meta Cloud API → WhatsApp delivery.
IMPORTANT
Production planning now defaults to the Busflow-owned Communications service + BullMQ dispatch pipeline. Later n8n references in this protocol describe the earlier prototype adapter shape and should not be treated as the target production dependency.
IncidentCreated is the sole trigger for all passenger broadcasts — DELAY, BREAKDOWN, and PASSENGER_ISSUE. For telemetry-detected delays (no driver-reported Incident), the ServiceLegDelayed handler auto-creates a system DELAY Incident, which fires IncidentCreated and enters this chain. See event-contracts-operations.md §ServiceLegDelayed.
WARNING
Phase 1 limitation: DELAY broadcasts use the "Without ETA" template variant until the team specifies the ETA recalculation service. Passengers receive "aktuelle Situation wird geprüft" instead of a concrete \{\{new_eta\}\} / \{\{delay_minutes\}\}. The recalculated_eta field on IncidentCreatedPayload is nullable by design to accommodate this.
§2 Passenger Targeting Query
The query resolves across Commerce and Backoffice schemas using soft references. This is strictly a read operation — the modular monolith permits cross-schema reads (see schema-communications.md §conversations note).
Input from enriched IncidentCreated payload:
- tour_offering_id: resolved from service_legs.tour_offering_id
- boarding_point_id → boarding_point_library lookup: the incident's current position ([v0.2]: requires boarding_order)
SELECT
p.id AS passenger_id,
p.first_name,
p.last_name,
p.phone,
p.email,
p.person_profile_id,
bpl.name AS boarding_point_name
FROM commerce.passengers p
JOIN commerce.bookings b ON p.booking_id = b.id
JOIN backoffice.boarding_point_library bpl ON p.boarding_point_id = bpl.id
WHERE b.tour_offering_id = :tour_offering_id
AND b.status IN ('DEPOSIT_PAID', 'FULLY_PAID')
AND p.status = 'ACTIVE'
-- AND bpl.boarding_order > :incident_boarding_order [v0.2: requires boarding_order column]
AND p.phone IS NOT NULL;

Output: TargetedPassenger[]
| Field | Type |
|---|---|
| passenger_id | UUID |
| first_name | VARCHAR |
| last_name | VARCHAR |
| phone | VARCHAR |
| email | VARCHAR (nullable) |
| person_profile_id | UUID |
| boarding_point_name | VARCHAR |
Resolving incident_boarding_order ([v0.2]): The handler reads service_legs.boarding_point_id (set for PICKUP legs during TourDeparturePublished; see schema-operations.md §service_legs), then queries boarding_point_library for that ID. At V0.1, boarding_order does not exist on the library, so the query falls back to all passengers on the tour_offering_id (conservative: notify everyone). The dispatcher approval gate (§5) handles inappropriate broadcasts. When boarding order lands at V0.2, the boarding_order > :incident_boarding_order filter activates.
WARNING
Schema change: backoffice.boarding_points has been replaced by boarding_point_library (single operator-level library with optional door pickup per stop). The boarding_order and scheduled_departure_time columns do not exist at V0.1 — both are deferred to [v0.2] as dispatch-side concerns. The targeting query currently falls back to all passengers. When boarding order lands, reactivate the boarding_order > :incident_boarding_order filter. See boarding-points.md.
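The V0.1 fallback and the planned V0.2 filter can be sketched as one pure function. This is only an illustration under stated assumptions: the `boarding_order` field on the row and the `selectDownstream` helper are hypothetical (the column does not exist at V0.1), and treating a passenger with an unknown order as "notify" is an assumed conservative choice, not something the protocol specifies.

```typescript
// Row shape from the TargetedPassenger table in §2.
interface TargetedPassenger {
  passenger_id: string;
  first_name: string;
  last_name: string;
  phone: string;
  email: string | null;
  person_profile_id: string;
  boarding_point_name: string;
  // Hypothetical: only populated once boarding_order lands at V0.2.
  boarding_order?: number;
}

// Returns the passengers who should receive the broadcast.
// incidentBoardingOrder === null models V0.1, where no order is known.
function selectDownstream(
  passengers: TargetedPassenger[],
  incidentBoardingOrder: number | null,
): TargetedPassenger[] {
  if (incidentBoardingOrder === null) {
    return passengers; // V0.1 fallback: notify everyone on the tour offering
  }
  return passengers.filter(
    (p) =>
      // Assumption: an unknown order is treated conservatively (notify).
      p.boarding_order === undefined ||
      p.boarding_order > incidentBoardingOrder,
  );
}
```

With `null` the function returns the full list, which matches the current behaviour; passing the incident stop's order keeps only stops the bus has not yet reached.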
§3 Communications Consumer Handler
Trigger: Hasura Event Trigger on operations.incidents INSERT → NestJS webhook handler.
Routing rules (per workflow-orchestration.md §Boundary Rules):
| Condition | Action |
|---|---|
| severity ≠ CRITICAL | No-op. LOW/MEDIUM incidents are dispatch board only (Hasura subscription). |
| severity = CRITICAL | Proceed: run passenger targeting query (§2), forward { incident, passengers } to the Communications broadcast workflow. All types (DELAY, BREAKDOWN, PASSENGER_ISSUE) route uniformly. |
Input: IncidentCreatedPayload (see event-contracts-operations.md §IncidentCreated).
Output: POST /webhook/incident-broadcast with { incident: IncidentCreatedPayload, passengers: TargetedPassenger[] }.
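The routing rules above reduce to a single severity check followed by the targeting lookup. The sketch below is framework-free and illustrative only: `IncidentSummary` is a hypothetical reduced subset of IncidentCreatedPayload, `PassengerRef` stands in for TargetedPassenger, and the synchronous `loadPassengers` callback is a simplification of the real database query.

```typescript
// Only the severity values named in this document; the real enum may differ.
type Severity = "LOW" | "MEDIUM" | "CRITICAL";
type IncidentType = "DELAY" | "BREAKDOWN" | "PASSENGER_ISSUE";

// Hypothetical reduced view of IncidentCreatedPayload.
interface IncidentSummary {
  incident_id: string;
  severity: Severity;
  type: IncidentType;
  tour_offering_id: string;
}

// Stand-in for TargetedPassenger (§2).
interface PassengerRef {
  passenger_id: string;
  phone: string;
}

type RoutingResult =
  | { action: "NOOP" }
  | {
      action: "FORWARD";
      path: string;
      body: { incident: IncidentSummary; passengers: PassengerRef[] };
    };

function routeIncidentCreated(
  incident: IncidentSummary,
  loadPassengers: (tourOfferingId: string) => PassengerRef[],
): RoutingResult {
  // Non-critical incidents stay on the dispatch board only.
  if (incident.severity !== "CRITICAL") {
    return { action: "NOOP" };
  }
  // All incident types route uniformly once severity is CRITICAL.
  const passengers = loadPassengers(incident.tour_offering_id);
  return {
    action: "FORWARD",
    path: "/webhook/incident-broadcast",
    body: { incident, passengers },
  };
}
```

Keeping the decision pure means the targeting query only runs for CRITICAL incidents, and the same function can be unit-tested without Hasura or NestJS in the loop.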
§4 n8n Workflow Contract
Per communications.md §Message Delivery Pipeline and workflow-orchestration.md §Boundary Rules: external communications route through the Busflow-owned Communications/BullMQ pipeline by default.
Input
POST /webhook/incident-broadcast
| Field | Type | Description |
|---|---|---|
| incident | IncidentCreatedPayload | Full enriched payload (see event-contracts-operations.md §IncidentCreated) |
| passengers | TargetedPassenger[] | Output of targeting query (§2) |
| locale | VARCHAR | Template locale. Resolved as: passenger locale (if available on person_profiles) → operator default locale (operators.locale or de-DE) → fallback de-DE. Phase 1: hardcoded de-DE (DACH-only). |
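The locale resolution order in the table can be read as a simple fallback chain. The following is a minimal sketch, assuming the passenger and operator locales have already been loaded; the `resolveLocale` name and the `phase1` flag are hypothetical and only illustrate the hardcoded Phase 1 behaviour.

```typescript
const FALLBACK_LOCALE = "de-DE";

// passengerLocale: from person_profiles, if available.
// operatorLocale: from operators.locale, if set.
// phase1: hypothetical switch modelling the hardcoded DACH-only phase.
function resolveLocale(
  passengerLocale: string | null,
  operatorLocale: string | null,
  phase1: boolean = true,
): string {
  if (phase1) {
    return FALLBACK_LOCALE; // Phase 1: always de-DE
  }
  // Later phases: passenger → operator → fallback.
  return passengerLocale ?? operatorLocale ?? FALLBACK_LOCALE;
}
```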
Side Effects
| Effect | Target | Description |
|---|---|---|
| Contact upsert | communications.contacts | For each passenger, resolve or create a Contact via person_profile_id → contacts.person_profile_id. |
| Conversation create | communications.conversations | One per contact, with service_leg_id set to the incident's originating leg (see schema-communications.md §conversations note). |
| Message create | communications.messages | One per passenger, status = QUEUED, direction = OUTBOUND, content_type = TEMPLATE, template_id referencing the resolved INCIDENT_BROADCAST template. |
| DispatchMessageJob enqueue | BullMQ queue | One job per message. Workers dispatch via Meta Cloud API using the tenant's channel_accounts.provider_config. |
Precondition
Dispatcher has approved the broadcast (see §5). The workflow pauses at the approval gate until the dispatcher action releases it.
Error Handling
Per workflow-orchestration.md §BullMQ + NestJS — Durable Job Queues: idempotent workflow → retry max 3x with exponential backoff. Persistent failures enter the Communications dead-letter/recovery path. If an optional n8n prototype adapter is in the path, it must carry its own circuit breaker and replay plan.
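To make "retry max 3x with exponential backoff" concrete, the sketch below computes the wait before each retry. The base delay of 1000 ms and the doubling factor are assumptions for illustration; this protocol does not specify them. In BullMQ the same policy would typically be expressed through the `attempts` and `backoff` job options rather than hand-rolled code.

```typescript
// Returns the delay (ms) before retry 1, 2, ..., maxRetries.
// baseDelayMs is a hypothetical value; the protocol only fixes maxRetries = 3.
function backoffDelaysMs(maxRetries: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxRetries; retry++) {
    // Double the wait on every subsequent retry.
    delays.push(baseDelayMs * Math.pow(2, retry));
  }
  return delays;
}

// Example: three retries starting at one second.
const exampleDelays = backoffDelaysMs(3, 1000); // [1000, 2000, 4000]
```

After the last retry fails, the job would move to the dead-letter/recovery path described above.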
§5 Dispatcher Approval Gate
Journey 2 mandates: "The Dispatcher reviews, edits if needed, and clicks 'Approve'."
Implementation: BullMQ human-in-the-loop (per workflow-orchestration.md §BullMQ — "Heavy / stateful / needs human approval → enqueue a BullMQ job").
| Step | Actor | Action |
|---|---|---|
| 1 | System | Creates broadcast workflow instance with status = PENDING_REVIEW |
| 2 | Dispatch Board | Shows review card: passenger list, AI-drafted message, "Approve" / "Edit" / "Dismiss" buttons. Annotation: If type = PASSENGER_ISSUE AND boarding_point_id IS NULL, the card shows "⚠️ Individual incident at transit stop — all passengers targeted. Consider dismissing if only one passenger suffers the impact." |
| 3a | Dispatcher | Clicks "Approve" → POST /api/workflows/:id/review → BullMQ job released → Communications pipeline sends broadcast |
| 3b | Dispatcher | Clicks "Edit" → inline editor → modified message body → "Approve" releases with edited content |
| 3c | Dispatcher | Clicks "Dismiss" → workflow status = DISMISSED, no broadcast sent |
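The three dispatcher actions can be modelled as a small state transition on the workflow record. This is a sketch under assumptions: the `APPROVED` status name, the `ReviewAction` shape, and `applyReview` are hypothetical (the table above only names PENDING_REVIEW and DISMISSED), and the real handler behind POST /api/workflows/:id/review would also release the BullMQ job.

```typescript
// APPROVED is a hypothetical name for the post-approval state.
type WorkflowStatus = "PENDING_REVIEW" | "APPROVED" | "DISMISSED";

type ReviewAction =
  | { kind: "APPROVE" }
  | { kind: "EDIT"; body: string } // edit, then approve with the new body
  | { kind: "DISMISS" };

interface BroadcastWorkflow {
  id: string;
  status: WorkflowStatus;
  messageBody: string;
}

function applyReview(
  wf: BroadcastWorkflow,
  action: ReviewAction,
): BroadcastWorkflow {
  // Only workflows awaiting review can be acted on (guards double clicks).
  if (wf.status !== "PENDING_REVIEW") {
    throw new Error("workflow " + wf.id + " is not awaiting review");
  }
  if (action.kind === "DISMISS") {
    return { id: wf.id, status: "DISMISSED", messageBody: wf.messageBody };
  }
  // APPROVE keeps the drafted body; EDIT releases with the edited body.
  const body = action.kind === "EDIT" ? action.body : wf.messageBody;
  return { id: wf.id, status: "APPROVED", messageBody: body };
}
```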
SLA escalation: Configurable broadcast_review_timeout (default: 5 min).
| Step | Trigger | Action |
|---|---|---|
| Timeout | broadcast_review_timeout expires | BullMQ BroadcastEscalationJob (delayed job, scheduled at workflow creation) fires. NestJS WebSocket Gateway emits escalation event to the Dispatch Board — all connected dispatchers see a high-priority alert card. The system writes a change_event with scope = GENERAL, entity_type = incident, entity_id = incident_id, action = UPDATE, new_values = { reason: "broadcast_review_timeout" } for SLA reporting. |
| Second timeout | 2× broadcast_review_timeout (10 min default) | Log as change_event with entity_type = incident, new_values.reason = "escalation_timeout". Still no auto-send — the operational risk of an incorrect broadcast outweighs the delay risk. |
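The two escalation points follow directly from broadcast_review_timeout. A minimal sketch of the schedule, assuming timestamps in epoch milliseconds; the `escalationSchedule` helper and its field names are hypothetical:

```typescript
interface EscalationSchedule {
  firstEscalationAtMs: number;  // alert card to all connected dispatchers
  secondEscalationAtMs: number; // logged as escalation_timeout
  autoSend: false;              // the broadcast is never sent automatically
}

// timeoutMinutes mirrors broadcast_review_timeout (default: 5 min).
function escalationSchedule(
  createdAtMs: number,
  timeoutMinutes: number = 5,
): EscalationSchedule {
  const timeoutMs = timeoutMinutes * 60 * 1000;
  return {
    firstEscalationAtMs: createdAtMs + timeoutMs,
    secondEscalationAtMs: createdAtMs + 2 * timeoutMs,
    autoSend: false,
  };
}
```

With the default, a workflow created at time 0 escalates at 300000 ms and again at 600000 ms, matching the 5 and 10 minute marks in the table.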
§6 All-Clear Handler
When the team resolves a CRITICAL incident, an all-clear message goes to passengers who received the initial broadcast.
Trigger: Hasura Event Trigger on operations.incidents UPDATE where status → RESOLVED → NestJS webhook handler.
Routing rules:
| Condition | Action |
|---|---|
| severity ≠ CRITICAL | No-op. |
| No broadcast was sent (dispatcher dismissed, or resolved before approval) | No-op. |
| severity = CRITICAL AND broadcast was sent | Proceed: re-query passengers (§2), forward to n8n POST /webhook/incident-allclear. No dispatcher approval gate; the all-clear auto-sends. All types (DELAY, BREAKDOWN, PASSENGER_ISSUE) route uniformly. |
No approval gate rationale: The dispatcher has already resolved the incident (ACKNOWLEDGED → IN_PROGRESS → RESOLVED transitions), implicitly approving the all-clear.
Broadcast-sent guard: Query communications.messages WHERE template_id matches the INCIDENT_BROADCAST template AND conversation.service_leg_id = incident.service_leg_id AND status IN (SENT, DELIVERED, READ). If no matching messages exist (dispatcher dismissed the broadcast, or incident resolved before approval), no all-clear is sent.
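The guard can be sketched as a predicate over already-loaded message rows. The `MessageRow` shape below is a hypothetical flattening of communications.messages joined to its conversation, and `broadcastWasSent` is an illustrative name; in production this would more likely be a single SQL EXISTS query.

```typescript
// Hypothetical flattened row: message joined with conversation.service_leg_id.
interface MessageRow {
  template_id: string;
  service_leg_id: string;
  status: string;
}

// Statuses that count as "the passenger was (or is being) reached".
const SENT_STATUSES = ["SENT", "DELIVERED", "READ"];

function broadcastWasSent(
  messages: MessageRow[],
  broadcastTemplateId: string,
  serviceLegId: string,
): boolean {
  return messages.some(
    (m) =>
      m.template_id === broadcastTemplateId &&
      m.service_leg_id === serviceLegId &&
      SENT_STATUSES.indexOf(m.status) !== -1,
  );
}
```

A message that is still QUEUED does not satisfy the guard, so an incident resolved before any dispatch produces no all-clear.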
§7 Edge States
| # | Edge State | Resolution |
|---|---|---|
| E-1 | Resolved before broadcast approval | If IncidentResolved arrives while the broadcast workflow has status = PENDING_REVIEW, auto-dismiss with reason RESOLVED_BEFORE_BROADCAST. No messages sent. |
| E-2 | Broadcast sent → resolved before delivery | [Future] Passenger may receive delay notification and all-clear nearly simultaneously. No action now. If UX problem: add min_allclear_delay (5 min after broadcast send time) before dispatching the all-clear. |
| E-3 | Offline delay (driver has no signal) | Incident created offline with status=OPEN, syncs on reconnect. Broadcast delay is inherent. The sync handler should check if occurred_at is stale (> 30 min) and annotate the broadcast review card with "⚠️ Incident reported X min ago (offline delay)" so the dispatcher can assess relevance. |
| E-4 | No WhatsApp (fallback) | Attempt WhatsApp first. If Meta API returns permanent error (e.g., recipient_not_on_whatsapp): fall back to SMS. If no phone: fall back to email. If neither: log as undeliverable. Add fallback_chain: ['WHATSAPP', 'SMS', 'EMAIL'] on DispatchMessageJob. |
| E-5 | Multiple CRITICAL incidents on same leg | Dedup check runs in NestJS handler (§3) before forwarding to n8n. Query: broadcast_workflows WHERE service_leg_id = :leg AND status IN ('PENDING_REVIEW', 'SENT') AND created_at > now() - :dedup_window. If PENDING_REVIEW: append new incident_id to existing workflow metadata; update review card with new incident details (type, description, geo). Passenger list remains the same — same leg, same downstream passengers. Dispatcher's pending review stays intact. If SENT: broadcast already dispatched — create a new workflow for the second incident. Each incident's all-clear resolves independently (§6). Configurable: incident_broadcast_dedup_window_minutes (default: 30). |
| E-6 | Wallet pass update failure | Deferred to Phase 2. WhatsApp + tracking URL covers core value. If Phase 2: extend Ticket with wallet_pass_token, wallet_push_token, wallet_provider. UpdateWalletPassJob via BullMQ. Non-blocking — failure logged, retried 3x, then marked failed. |
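Two of the edge states above lend themselves to small pure functions: the E-4 fallback chain and the E-5 dedup decision. The sketch below is illustrative only. `Recipient`, `WorkflowRow`, `nextChannel`, and `dedupDecision` are hypothetical names, timestamps are assumed to be epoch milliseconds, and the rule that SMS needs a phone while email needs an address is an assumption consistent with, but not spelled out in, the table.

```typescript
type Channel = "WHATSAPP" | "SMS" | "EMAIL";
const FALLBACK_CHAIN: Channel[] = ["WHATSAPP", "SMS", "EMAIL"];

interface Recipient {
  phone: string | null;
  email: string | null;
}

// E-4: next usable channel after a permanent failure on `failed`,
// or null when the message is undeliverable.
function nextChannel(failed: Channel, recipient: Recipient): Channel | null {
  const start = FALLBACK_CHAIN.indexOf(failed) + 1;
  for (const channel of FALLBACK_CHAIN.slice(start)) {
    const usable =
      channel === "EMAIL" ? recipient.email !== null : recipient.phone !== null;
    if (usable) return channel;
  }
  return null;
}

interface WorkflowRow {
  id: string;
  service_leg_id: string;
  status: "PENDING_REVIEW" | "SENT" | "DISMISSED";
  created_at_ms: number;
}

type DedupDecision =
  | { action: "APPEND"; workflowId: string } // merge into pending review
  | { action: "CREATE_NEW" };                // nothing pending: new workflow

// E-5: windowMinutes mirrors incident_broadcast_dedup_window_minutes.
function dedupDecision(
  existing: WorkflowRow[],
  serviceLegId: string,
  nowMs: number,
  windowMinutes: number = 30,
): DedupDecision {
  const cutoffMs = nowMs - windowMinutes * 60 * 1000;
  for (const wf of existing) {
    if (
      wf.service_leg_id === serviceLegId &&
      wf.status === "PENDING_REVIEW" &&
      wf.created_at_ms > cutoffMs
    ) {
      return { action: "APPEND", workflowId: wf.id };
    }
  }
  // Already SENT, outside the window, or no prior workflow at all.
  return { action: "CREATE_NEW" };
}
```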
§8 Template Variables
INCIDENT_BROADCAST (trigger_event on notification_templates)
Core variables (always available):
| Variable | Source | Description |
|---|---|---|
| \{\{passenger_name\}\} | passengers.first_name | Recipient's first name |
| \{\{boarding_point_name\}\} | boarding_point_library.name | The passenger's pickup stop |
| \{\{original_departure_time\}\} | [v0.2] | Originally scheduled pickup time (requires scheduled_departure_time column) |
| \{\{incident_type\}\} | IncidentCreatedPayload.type → localized label | DELAY → "Verspätung", BREAKDOWN → "Panne", PASSENGER_ISSUE → "Störung" |
| \{\{incident_description\}\} | IncidentCreatedPayload.description | Driver's free-text description |
| \{\{tracking_url\}\} | GET /api/track/:tracking_token | Live tracking link (see event-contracts-operations.md §Consumer ETA Tracking) |
| \{\{operator_name\}\} | backoffice.operators.company_name | Operator's company name |
| \{\{operator_phone\}\} | backoffice.operators.phone | Operator's contact phone |
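As an illustration of how the core variables might be assembled for one passenger, the sketch below maps the sources in the table to a flat key/value object. The `CoreVariableInput` shape, the `buildCoreVariables` helper, and the `trackingBaseUrl` parameter are hypothetical; `original_departure_time` is left out because it depends on the [v0.2] column.

```typescript
type IncidentType = "DELAY" | "BREAKDOWN" | "PASSENGER_ISSUE";

// de-DE labels as listed in the table above.
const INCIDENT_TYPE_LABELS_DE: Record<IncidentType, string> = {
  DELAY: "Verspätung",
  BREAKDOWN: "Panne",
  PASSENGER_ISSUE: "Störung",
};

// Hypothetical input: values already loaded from the listed sources.
interface CoreVariableInput {
  firstName: string;
  boardingPointName: string;
  incidentType: IncidentType;
  incidentDescription: string;
  trackingToken: string;
  operatorName: string;
  operatorPhone: string;
}

function buildCoreVariables(
  input: CoreVariableInput,
  trackingBaseUrl: string, // assumption: public origin of the tracking API
): Record<string, string> {
  return {
    passenger_name: input.firstName,
    boarding_point_name: input.boardingPointName,
    incident_type: INCIDENT_TYPE_LABELS_DE[input.incidentType],
    incident_description: input.incidentDescription,
    tracking_url:
      trackingBaseUrl + "/api/track/" + encodeURIComponent(input.trackingToken),
    operator_name: input.operatorName,
    operator_phone: input.operatorPhone,
  };
}
```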
Conditional variables (available only when IncidentCreatedPayload.recalculated_eta IS NOT NULL):
| Variable | Source | Description |
|---|---|---|
| \{\{new_eta\}\} | IncidentCreatedPayload.recalculated_eta | Recalculated arrival time at the passenger's stop |
| \{\{delay_minutes\}\} | recalculated_eta - scheduled_departure_time | Delay in minutes |
Null handling: For BREAKDOWN incidents, recalculated_eta is typically null (the bus has stopped, not just delayed). The template rendering engine must handle this: if recalculated_eta IS NULL, omit \{\{new_eta\}\} and \{\{delay_minutes\}\} from the rendered message. WhatsApp message templates support parameter substitution only, not if/else sections in the body, so the workflow should select between two template variants:
- With ETA: "…neue voraussichtliche Ankunft: …"
- Without ETA: "…aktuelle Situation wird geprüft. Wir informieren Sie, sobald es Neuigkeiten gibt…"
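Selecting the variant comes down to a null check on recalculated_eta, with \{\{delay_minutes\}\} computed only on the "With ETA" path. A minimal sketch, assuming both timestamps arrive as ISO 8601 strings with an explicit offset; the `TemplateVariant` names and both helpers are hypothetical:

```typescript
type TemplateVariant = "WITH_ETA" | "WITHOUT_ETA";

// recalculatedEta mirrors IncidentCreatedPayload.recalculated_eta (nullable).
function selectTemplateVariant(recalculatedEta: string | null): TemplateVariant {
  return recalculatedEta === null ? "WITHOUT_ETA" : "WITH_ETA";
}

// delay_minutes = recalculated_eta - scheduled_departure_time, in minutes.
// Assumes ISO 8601 inputs with an explicit offset (e.g. a trailing "Z").
function delayMinutes(
  recalculatedEta: string,
  scheduledDepartureTime: string,
): number {
  const diffMs = Date.parse(recalculatedEta) - Date.parse(scheduledDepartureTime);
  return Math.round(diffMs / 60000);
}
```

In Phase 1 every DELAY broadcast takes the `WITHOUT_ETA` path because recalculated_eta stays null until the ETA recalculation service exists.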
Locale resolution: Phase 1 uses de-DE (DACH-only). The n8n workflow queries notification_templates WHERE trigger_event = 'INCIDENT_BROADCAST' AND tenant_id = :tenant_id AND channel = 'WHATSAPP' AND locale = :locale.
INCIDENT_ALLCLEAR (trigger_event on notification_templates)
| Variable | Source | Description |
|---|---|---|
| \{\{passenger_name\}\} | passengers.first_name | Recipient's first name |
| \{\{boarding_point_name\}\} | boarding_point_library.name | The passenger's pickup stop |
| \{\{updated_eta\}\} | Latest route_waypoints.eta for the passenger's boarding stop | Current ETA (may differ from original if the system rerouted the bus). Nullable: if the ETA service hasn't recalculated post-resolution, omit from message. |
| \{\{incident_type\}\} | IncidentResolvedPayload.type → localized label | Matches the original broadcast type for message continuity |
| \{\{operator_name\}\} | backoffice.operators.company_name | Operator's company name |