Busflow Docs

Internal documentation portal

Skip to content

Incident Broadcast Protocol

Domain: Operations → Communications (cross-domain) Trigger: IncidentCreated domain event (severity=CRITICAL, type=DELAY|BREAKDOWN|PASSENGER_ISSUE) Output: WhatsApp broadcast to downstream passengers with dispatcher approval Sources: event-contracts-operations.md, Journey 2 ("Alpine Stau")


§1 Overview

Flow: Driver 1-Tap → incidents INSERT → Hasura Event Trigger → NestJS handler (severity filter) → passenger targeting query → n8n webhook → dispatcher approval gate → BullMQ dispatch → Meta Cloud API → WhatsApp delivery.

IncidentCreated is the sole trigger for all passenger broadcasts — DELAY, BREAKDOWN, and PASSENGER_ISSUE. For telemetry-detected delays (no driver-reported Incident), the ServiceLegDelayed handler auto-creates a system DELAY Incident, which fires IncidentCreated and enters this chain. See event-contracts-operations.md §ServiceLegDelayed.

WARNING

Phase 1 limitation: DELAY broadcasts use the "Without ETA" template variant until the team specifies the ETA recalculation service. Passengers receive "aktuelle Situation wird geprüft" instead of a concrete \{\{new_eta\}\} / \{\{delay_minutes\}\}. The recalculated_eta field on IncidentCreatedPayload is nullable by design to accommodate this.

mermaid
sequenceDiagram
    participant D as Driver Hub
    participant H as Hasura
    participant N as NestJS Handler
    participant Q as Passenger Query
    participant W as n8n Workflow
    participant DB as Dispatch Board
    participant B as BullMQ
    participant M as Meta Cloud API

    D->>H: INSERT incidents (CRITICAL, BREAKDOWN)
    H->>N: Event Trigger → IncidentCreated
    N->>N: Filter: severity=CRITICAL
    N->>Q: findDownstreamPassengers(payload)
    Q-->>N: TargetedPassenger[]
    N->>W: POST /webhook/incident-broadcast
    W->>W: Resolve INCIDENT_BROADCAST template
    W->>W: Create workflow (PENDING_REVIEW)
    W->>DB: Show review card
    DB-->>W: Dispatcher clicks "Approve"
    W->>B: Enqueue DispatchMessageJob per message
    B->>M: Send WhatsApp HSM template
    M-->>B: Delivery status webhook

§2 Passenger Targeting Query

The query resolves across Commerce and Backoffice schemas using soft references. This is strictly a read operation — the modular monolith permits cross-schema reads (see schema-communications.md §conversations note).

Input from enriched IncidentCreated payload:

  • tour_offering_id — resolved from service_legs.tour_offering_id
  • boarding_point_idboarding_point_library lookup — the incident's current position ([v0.2]: requires boarding_order)
sql
SELECT
  p.id AS passenger_id,
  p.first_name,
  p.last_name,
  p.phone,
  p.email,
  p.passenger_profile_id,
  bpl.name AS boarding_point_name
FROM commerce.passengers p
  JOIN commerce.bookings b ON p.booking_id = b.id
  JOIN backoffice.boarding_point_library bpl ON p.boarding_point_id = bpl.id
WHERE b.tour_offering_id = :tour_offering_id
  AND b.status IN ('DEPOSIT_PAID', 'FULLY_PAID')
  AND p.status = 'ACTIVE'
  -- AND bpl.boarding_order > :incident_boarding_order  [v0.2: requires boarding_order column]
  AND p.phone IS NOT NULL;

Output: TargetedPassenger[]

FieldType
passenger_idUUID
first_nameVARCHAR
last_nameVARCHAR
phoneVARCHAR
emailVARCHAR (nullable)
passenger_profile_idUUID
boarding_point_nameVARCHAR

Resolving incident_boarding_order ([v0.2]): The handler reads service_legs.boarding_point_id (set for PICKUP legs during TripPublished — see schema-operations.md §service_legs), then queries boarding_point_library for that ID. At V0.1, boarding_order does not exist on the library — the query falls back to all passengers on the tour_offering_id (conservative: notify everyone). The dispatcher approval gate (§5) handles inappropriate broadcasts. When boarding order lands at V0.2, the boarding_order > :incident_boarding_order filter activates.

WARNING

Schema change: backoffice.boarding_points has been replaced by boarding_point_library (single operator-level library with optional door pickup per stop). The boarding_order and scheduled_departure_time columns do not exist at V0.1 — both are deferred to [v0.2] as dispatch-side concerns. The targeting query currently falls back to all passengers. When boarding order lands, reactivate the boarding_order > :incident_boarding_order filter. See boarding-points.md.


§3 Communications Consumer Handler

Trigger: Hasura Event Trigger on operations.incidents INSERT → NestJS webhook handler.

Routing rules (per workflow-orchestration.md §Boundary Rules):

ConditionAction
severity ≠ CRITICALNo-op. LOW/MEDIUM incidents are dispatch board only (Hasura subscription).
severity = CRITICALProceed: run passenger targeting query (§2), forward { incident, passengers } to n8n webhook POST /webhook/incident-broadcast. All types (DELAY, BREAKDOWN, PASSENGER_ISSUE) route uniformly.

Input: IncidentCreatedPayload (see event-contracts-operations.md §IncidentCreated).

Output: POST /webhook/incident-broadcast with { incident: IncidentCreatedPayload, passengers: TargetedPassenger[] }.


§4 n8n Workflow Contract

Per communications.md §Message Delivery Pipeline and workflow-orchestration.md §Boundary Rules: "External comms chain → forward to n8n webhook."

Input

POST /webhook/incident-broadcast

FieldTypeDescription
incidentIncidentCreatedPayloadFull enriched payload (see event-contracts-operations.md §IncidentCreated)
passengersTargetedPassenger[]Output of targeting query (§2)
localeVARCHARTemplate locale. Resolved as: passenger locale (if available on passenger_profiles) → operator default locale (operators.locale or de-DE) → fallback de-DE. Phase 1: hardcoded de-DE (DACH-only).

Side Effects

EffectTargetDescription
Contact upsertcommunications.contactsFor each passenger, resolve or create a Contact via passenger_profile_idcontacts.passenger_profile_id.
Conversation createcommunications.conversationsOne per contact, with trip_id set to the incident's service_leg_id (see schema-communications.md §conversations note).
Message createcommunications.messagesOne per passenger, status = QUEUED, direction = OUTBOUND, content_type = TEMPLATE, template_id referencing the resolved INCIDENT_BROADCAST template.
DispatchMessageJob enqueueBullMQ queueOne job per message. Workers dispatch via Meta Cloud API using the tenant's channel_accounts.provider_config.

Precondition

Dispatcher has approved the broadcast (see §5). The workflow pauses at the approval gate until the dispatcher action releases it.

Error Handling

Per workflow-orchestration.md §n8n Error Handling: idempotent workflow → retry max 3x with exponential backoff. Persistent failures → n8n_dead_letters table. opossum circuit breaker on the NestJS → n8n webhook call; if n8n is down, payloads queue in n8n_recovery BullMQ queue.


§5 Dispatcher Approval Gate

Journey 2 mandates: "The Dispatcher reviews, edits if needed, and clicks 'Approve'."

Implementation: BullMQ human-in-the-loop (per workflow-orchestration.md §BullMQ — "Heavy / stateful / needs human approval → enqueue a BullMQ job").

StepActorAction
1SystemCreates broadcast workflow instance with status = PENDING_REVIEW
2Dispatch BoardShows review card: passenger list, AI-drafted message, "Approve" / "Edit" / "Dismiss" buttons. Annotation: If type = PASSENGER_ISSUE AND boarding_point_id IS NULL, the card shows "⚠️ Individual incident at transit stop — all passengers targeted. Consider dismissing if only one passenger suffers the impact."
3aDispatcherClicks "Approve" → POST /api/workflows/:id/review → BullMQ job released → n8n sends broadcast
3bDispatcherClicks "Edit" → inline editor → modified message body → "Approve" releases with edited content
3cDispatcherClicks "Dismiss" → workflow status = DISMISSED, no broadcast sent

SLA escalation: Configurable broadcast_review_timeout (default: 5 min).

StepTriggerAction
Timeoutbroadcast_review_timeout expiresBullMQ BroadcastEscalationJob (delayed job, scheduled at workflow creation) fires. NestJS WebSocket Gateway emits escalation event to the Dispatch Board — all connected dispatchers see a high-priority alert card. The system writes a change_event with scope = GENERAL, entity_type = incident, entity_id = incident_id, action = UPDATE, new_values = { reason: "broadcast_review_timeout" } for SLA reporting.
Second timeoutbroadcast_review_timeout (10 min default)Log as change_event with entity_type = incident, new_values.reason = "escalation_timeout". Still no auto-send — the operational risk of an incorrect broadcast outweighs the delay risk.

§6 All-Clear Handler

When the team resolves a CRITICAL incident, an all-clear message goes to passengers who received the initial broadcast.

Trigger: Hasura Event Trigger on operations.incidents UPDATE where status → RESOLVED → NestJS webhook handler.

Routing rules:

ConditionAction
severity ≠ CRITICALNo-op.
No broadcast was sent (dispatcher dismissed, or resolved before approval)No-op.
severity = CRITICAL AND broadcast was sentProceed: re-query passengers (§2), forward to n8n POST /webhook/incident-allclear. No dispatcher approval gate — auto-send. All types (DELAY, BREAKDOWN, PASSENGER_ISSUE) route uniformly.

No approval gate rationale: The dispatcher has already resolved the incident (ACKNOWLEDGED → IN_PROGRESS → RESOLVED transitions), implicitly approving the all-clear.

Broadcast-sent guard: Query communications.messages WHERE template_id matches the INCIDENT_BROADCAST template AND conversation.trip_id = incident.service_leg_id AND status IN (SENT, DELIVERED, READ) (see schema-communications.md §conversations note on trip_id usage). If no matching messages exist (dispatcher dismissed the broadcast, or incident resolved before approval), no all-clear becomes necessary.


§7 Edge States

#Edge StateResolution
E-1Resolved before broadcast approvalIf IncidentResolved arrives while the broadcast workflow has status = PENDING_REVIEW, auto-dismiss with reason RESOLVED_BEFORE_BROADCAST. No messages sent.
E-2Broadcast sent → resolved before delivery[Future] Passenger may receive delay notification and all-clear nearly simultaneously. No action now. If UX problem: add min_allclear_delay (5 min after broadcast send time) before dispatching the all-clear.
E-3Offline delay (driver has no signal)Incident created offline with status=OPEN, syncs on reconnect. Broadcast delay is inherent. The sync handler should check if occurred_at is stale (> 30 min) and annotate the broadcast review card with "⚠️ Incident reported X min ago (offline delay)" so the dispatcher can assess relevance.
E-4No WhatsApp (fallback)Attempt WhatsApp first. If Meta API returns permanent error (e.g., recipient_not_on_whatsapp): fall back to SMS. If no phone: fall back to email. If neither: log as undeliverable. Add fallback_chain: ['WHATSAPP', 'SMS', 'EMAIL'] on DispatchMessageJob.
E-5Multiple CRITICAL incidents on same legDedup check runs in NestJS handler (§3) before forwarding to n8n. Query: broadcast_workflows WHERE service_leg_id = :leg AND status IN ('PENDING_REVIEW', 'SENT') AND created_at > now() - :dedup_window. If PENDING_REVIEW: append new incident_id to existing workflow metadata; update review card with new incident details (type, description, geo). Passenger list remains the same — same leg, same downstream passengers. Dispatcher's pending review stays intact. If SENT: broadcast already dispatched — create a new workflow for the second incident. Each incident's all-clear resolves independently (§6). Configurable: incident_broadcast_dedup_window_minutes (default: 30).
E-6Wallet pass update failureDeferred to Phase 2. WhatsApp + tracking URL covers core value. If Phase 2: extend Ticket with wallet_pass_token, wallet_push_token, wallet_provider. UpdateWalletPassJob via BullMQ. Non-blocking — failure logged, retried 3x, then marked failed.

§8 Template Variables

INCIDENT_BROADCAST (trigger_event on notification_templates)

Core variables (always available):

VariableSourceDescription
\{\{passenger_name\}\}passengers.first_nameRecipient's first name
\{\{boarding_point_name\}\}boarding_point_library.nameThe passenger's pickup stop
\{\{original_departure_time\}\}[v0.2]Originally scheduled pickup time (requires scheduled_departure_time column)
\{\{incident_type\}\}IncidentCreatedPayload.type → localized labelDELAY → "Verspätung", BREAKDOWN → "Panne", PASSENGER_ISSUE → "Störung"
\{\{incident_description\}\}IncidentCreatedPayload.descriptionDriver's free-text description
\{\{tracking_url\}\}GET /api/track/:tracking_tokenLive tracking link (see event-contracts-operations.md §Consumer ETA Tracking)
\{\{operator_name\}\}backoffice.operators.company_nameOperator's company name
\{\{operator_phone\}\}backoffice.operators.phoneOperator's contact phone

Conditional variables (available only when IncidentCreatedPayload.recalculated_eta IS NOT NULL):

VariableSourceDescription
\{\{new_eta\}\}IncidentCreatedPayload.recalculated_etaRecalculated arrival time at the passenger's stop
\{\{delay_minutes\}\}recalculated_eta - scheduled_departure_timeDelay in minutes

Null handling: For BREAKDOWN incidents, recalculated_eta is typically null (the bus has stopped, not just delayed). The template rendering engine must handle this: if recalculated_eta IS NULL, omit \{\{new_eta\}\} and \{\{delay_minutes\}\} from the rendered message. The WhatsApp template body should use conditional sections (Meta template components support if/else in the body) or the n8n workflow should select between two template variants:

  • With ETA: "…neue voraussichtliche Ankunft: …"
  • Without ETA: "…aktuelle Situation wird geprüft. Wir informieren Sie, sobald es Neuigkeiten gibt…"

Locale resolution: Phase 1 uses de-DE (DACH-only). The n8n workflow queries notification_templates WHERE trigger_event = 'INCIDENT_BROADCAST' AND tenant_id = :tenant_id AND channel = 'WHATSAPP' AND locale = :locale.

INCIDENT_ALLCLEAR (trigger_event on notification_templates)

VariableSourceDescription
\{\{passenger_name\}\}passengers.first_nameRecipient's first name
\{\{boarding_point_name\}\}boarding_point_library.nameThe passenger's pickup stop
\{\{updated_eta\}\}Latest route_waypoints.eta for the passenger's boarding stopCurrent ETA (may differ from original if the system rerouted the bus). Nullable — if ETA service hasn't recalculated post-resolution, omit from message.
\{\{incident_type\}\}IncidentResolvedPayload.type → localized labelMatches the original broadcast type for message continuity
\{\{operator_name\}\}backoffice.operators.company_nameOperator's company name

Internal documentation — Busflow