Busflow Docs

Internal documentation portal

Skip to content

Communications Architecture ​

Overview ​

The Communications module is a Shared Core Domain operating as a standalone microservice, decoupled from the Backoffice, Commerce, and Operations contexts. It provides two capabilities:

  1. Omnichannel Inbox β€” a unified agent workspace for managing customer, driver, and vendor conversations across WhatsApp, Email, Webchat, and SMS.
  2. Trigger-Based Automated Messaging β€” reactive and scheduled notifications orchestrated via Hasura Event/Cron Triggers, without the domain pillars having any awareness of messaging protocols.

For domain entities (ChannelAccount, Contact, Conversation, Message) and full architectural decisions (API Aggregator model, contextual linking, nullable FK enforcement), see PRODUCT_domain-model.md.

Integration with Workflow Orchestration ​

Communications plugs into the existing four-layer workflow stack (see workflow-orchestration.md):

LayerCommunications Role
Hasura Event TriggersDomain mutations (e.g., trip status β†’ DELAYED) fire webhooks to the Communications service
Hasura Cron TriggersScheduled workflows (e.g., T-24h pre-trip reminders) invoke the Communications service to query upcoming events and dispatch notifications
n8nMulti-channel message dispatch chains (WhatsApp + Email fallback), template rendering, delivery tracking
BullMQBatch message fan-out, retry queues for failed deliveries, rate-limited sending

CPaaS Provider Strategy ​

The Communications architecture utilizes an API Aggregator Model to keep domain logic provider-agnostic. While we choose specific providers for initial execution, the backend abstracts all external dispatch operations.

Selected Default Providers:

  • WhatsApp Business: Meta Cloud API directly (avoids third-party middleman markup from Twilio/360dialog).
  • Email: Amazon SES.
  • SMS: Amazon SNS.

The aggregator ensures all domain events map to a generic SendMessageCommand, which only translates to Meta/AWS specific JSON payloads at the very edge of the network.

Channel Provisioning & Multi-Tenancy ​

Every tenant (Operator) provisions their own channels to maintain brand identity (white-labeled messaging).

  1. Onboarding: Operator registers their Meta/Facebook Business account and verifies their domain for Amazon SES.
  2. Configuration Storage: The system saves the generated API keys, Webhook Verification Tokens, and Sender IDs in the ChannelAccount.provider_config JSONB column.
  3. Webhook Routing: Each tenant receives a per-WABA webhook endpoint (e.g., POST /api/webhooks/meta/{tenant_id}) registered via Meta's WABA-level override API (POST /<WABA_ID>/subscribed_apps). The handler resolves tenant_id from the URL path. SES/SNS webhooks are account-level β€” the handler resolves tenant_id from the Message row via external_message_id. For details, see notification-pipeline-protocol.md Β§8 and channel-provisioning-protocol.md Β§7.

For registration flows (Meta Embedded Signup, SES domain verification, platform-managed SMS), ChannelAccount lifecycle state machine, Hasura Action contracts, and the operator_integrations sync contract, see channel-provisioning-protocol.md.

Message Delivery Pipeline ​

The delivery pipeline operates asynchronously to ensure external network latency never blocks core domain transactions.

  1. Trigger: A domain event occurs (e.g., Incident created) and triggers a Hasura Event Trigger webhook pointing to NestJS, which enriches the payload and forwards to n8n (per workflow-orchestration.md Β§Routing Rule).
  2. Template Resolution: The n8n workflow identifies the target passenger_profile_id, resolves the relevant NotificationTemplate, and interpolates variables.
  3. Queueing: n8n creates a Message record with status: QUEUED and pushes a generic DispatchMessageJob onto the BullMQ execution queue.
  4. Dispatch: The DispatchWorker (a NestJS BullMQ processor) consumes the job:
    • Decrypts the tenant's provider_config secrets via EncryptionService.
    • Selects the appropriate provider adapter (WhatsApp, Email, or SMS) via CpaasAdapterFactory.
    • Acquires a rate limit token from the per-tenant RateLimiterService (tier-aware for WhatsApp).
    • Invokes the adapter's dispatch() method, which constructs the provider-specific API request (Meta Cloud API, SES, or SNS), executes it, and extracts the external_message_id from the response.
    • Updates Message.external_message_id and transitions status to SENT.
    • On permanent failure, triggers the fallback chain (if configured) by creating a new Message for the next channel and re-enqueing.
  5. Delivery Tracking: Meta/AWS fire delivery callbacks (Delivered, Read, Failed) to the app-level webhook endpoints. NestJS CPaaS webhook handlers validate signatures, extract the external_message_id, query the Message row, and update status, delivered_at/read_at/failed_reason. Hasura subscriptions push changes to the inbox instantly.

For the full NotificationRequest interface, DispatchMessageJob/BatchDispatchJob BullMQ specs, template resolution algorithm, multi-channel decision matrix, full trigger_event registry (22 external entries), adapter class contracts, error classification per provider, rate limiter configuration, and CPaaS webhook specification, see notification-pipeline-protocol.md.

Inbox Architecture ​

The Omnichannel Inbox operates in real-time using Hasura GraphQL Subscriptions β€” no custom WebSocket layer required. Hasura multiplexes all subscriptions over a single WebSocket per client, handles reconnection, and integrates with the existing Nhost infrastructure.

Core subscriptions: Three Hasura subscriptions power the inbox. (1) Conversation list (sidebar, sorted by last_message_at DESC). (2) Message thread (live messages for the open conversation). (3) Global unread badge (aggregate unread-conversation count).

Agent routing: Phase 1 uses manual claim-based routing. Unassigned conversations (assigned_to = NULL) appear in a shared queue. Any dispatcher can claim a conversation via the claimConversation Hasura Action, which uses SELECT ... FOR UPDATE for concurrent-claim safety (matching the takeOverIncident pattern from Operations). Admins can reassign conversations via reassignConversation.

Unread tracking: A conversation_read_cursors table stores per-agent, per-conversation timestamp cursors. A Hasura computed field (conversation_unread_count) counts INBOUND messages with sent_at > last_read_at for the session user. Only inbound messages count as unread β€” outbound agent/system messages do not.

Conversation lifecycle: OPEN β†’ RESOLVED / SNOOZED. Dispatchers trigger transitions via resolveConversation and snoozeConversation Hasura Actions. A new inbound message from the same Contact automatically reopens RESOLVED or SNOOZED conversations. A Hasura Cron Trigger (conversation_snooze_wakeup) reopens snoozed conversations past their expiry. See schema-communications.md Β§Conversation Lifecycle State Machine.

Inbound message path: CPaaS webhook (Meta Cloud API / SES / SNS) β†’ NestJS InboundMessageHandler β†’ Contact resolution (contacts.identifiers JSONB lookup) β†’ Conversation resolution (match by Contact + open status) β†’ messages INSERT (direction = INBOUND, status = DELIVERED) β†’ Hasura subscription push. Inbound messages skip the BullMQ dispatch pipeline entirely β€” that pipeline handles outbound only.

Agent replies: Dispatchers send replies via the sendAgentMessage Hasura Action, which enters the same outbound pipeline as automated messages: NestJS handler creates a Message (status = QUEUED) β†’ enqueues DispatchMessageJob on BullMQ β†’ CPaaS delivery.

Cross-context sidebar: The inbox conversation detail panel displays booking, trip, or invoice context data via Hasura manual object relationships on the conversations table's soft FK columns (booking_id, trip_id, invoice_id). These resolve as standard SQL JOINs within the same Postgres instance (read-side coupling, permitted per domain-driven-design.md Β§7.2).

For full subscription queries, Hasura Action contracts, resolution algorithms, and computed field SQL, see inbox-protocol.md.

Observability ​

To prevent silent communication failures:

  1. Hasura Event Logs: Track failed webhook deliveries from the database to n8n.
  2. BullMQ Dashboards: Expose active, delayed, and stalled queues. All failed dispatch attempts utilize an exponential backoff retry strategy.
  3. Status Rollups: A background cron job scans the Message table for items stuck in QUEUED for >5 minutes, triggering high-severity alerts in the observability stack (Grafana).

Internal documentation β€” Busflow