Busflow Docs

Internal documentation portal

Real-Time & Reactivity Strategy

Decision

Every user-facing surface must feel instant. Changes propagate in real time and efficiently, not through polling or manual refresh. This document defines the architectural patterns, boundary rules, and surface classification that govern how reactivity works across the stack.


Core Principle: Subscribe by Default, Query by Exception

New UI surfaces default to Hasura GraphQL Subscriptions for any data that can change during a user session. Standard one-shot queries apply only to archival, paginated, or non-live views (e.g., resolved conversations, past invoices, historical reports).

IMPORTANT

If a surface displays data that another user or system process can mutate while the current user looks at it, that surface must use a subscription, or its owner must document why a one-shot query is acceptable.


Reactivity Tiers

Every UI surface falls into one of two tiers. The tier determines the transport mechanism, acceptable latency, and UX treatment.

| Tier | Latency Target | Mechanism | When to Use |
|---|---|---|---|
| Live | < 1s | Hasura Subscription (WebSocket) | Any data that can change while the user looks at it: dispatch board, inbox, seat maps, dashboards, assignment lists, booking status, tracking |
| Eventual | 10s–5min | Event-driven cache invalidation + query on next interaction | Pricing caches, tour catalog, report generation: background-synced data where changes are infrequent batch operations |

NOTE

Why no polling tier? At busflow's scale (≤50 concurrent users in year 1, hard ceiling ~10k total users), Hasura subscriptions cost less than polling. Hasura multiplexes identical subscription queries: 10 dispatchers watching the same board result in 1 SQL query per refetch, not 10. Polling generates more HTTP requests, higher server load, and worse latency. Use subscriptions for every Live surface; polling is not used anywhere.
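The multiplexing argument can be illustrated with a rough counting model. This is a simplified sketch, not Hasura's actual batching logic: it assumes that subscriptions with the same query text and variables share one SQL execution per refetch tick, while every polling client issues its own HTTP request. The `Subscriber` type, the function names, and the `status` field are hypothetical.

```typescript
// Hypothetical model of one open client subscription.
interface Subscriber {
  clientId: string;
  query: string;                      // GraphQL subscription text
  variables: Record<string, unknown>; // bound variables
}

// Key that identifies "identical" subscriptions in this simplified model.
// JSON.stringify is key-order sensitive, which is acceptable for a sketch.
function cohortKey(s: Subscriber): string {
  return `${s.query}::${JSON.stringify(s.variables)}`;
}

// SQL executions per refetch tick when identical subscriptions are shared.
function multiplexedQueriesPerTick(subs: Subscriber[]): number {
  return new Set(subs.map(cohortKey)).size;
}

// HTTP requests per tick when every client polls on its own.
function pollingRequestsPerTick(subs: Subscriber[]): number {
  return subs.length;
}

// Ten dispatchers watching the same board.
const board: Subscriber[] = Array.from({ length: 10 }, (_, i) => ({
  clientId: `dispatcher-${i}`,
  query: "subscription { service_legs { id status } }",
  variables: {},
}));
```

Under these assumptions, `multiplexedQueriesPerTick(board)` is 1 and `pollingRequestsPerTick(board)` is 10, which matches the 10-dispatcher example in the note.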

Surface Classification

| Surface | Tier | Rationale |
|---|---|---|
| Omnichannel Inbox | Live | Dispatchers monitor conversations all day. Incoming messages must appear instantly. |
| Dispatch Board (Gantt/Calendar) | Live | Leg status changes, incident alerts, and concurrent assignment conflicts require instant visibility. |
| Booking Widget (Seat Map) | Live | Concurrent seat holds must reflect in real time to prevent double-booking frustration. |
| Workspace Dashboards (Revenue, Booking Pace) | Live | Subscriptions are cheap at our scale. Dispatchers benefit from seeing numbers update without refresh. |
| Passenger Tracking Page | Live | Subscription on route_waypoints. The GPS push interval (~10s) bounds update frequency, but the subscription itself adds no overhead. |
| Driver Hub (Assignment List) | Live | New assignments and manifest changes appear immediately. The driver's app subscribes while foregrounded. |
| Passenger Portal (Booking Status) | Live | Status transitions are infrequent but must reflect without manual refresh. Subscription cost is negligible per session. |
| Tour Catalog / Pricing | Eventual | Cached via PriceMatrixPublished / SeasonPricingFinalized events. Price changes are infrequent batch operations; when they happen, the event-driven invalidation ensures the next page load shows the correct price. |

Frontend Patterns

1. Optimistic UI

Mutations that affect the current user's view must provide immediate visual feedback before the server confirms.

  • Render the expected result instantly (e.g., message appears with "sending" indicator, seat shows as "held").
  • The subscription (or mutation response) reconciles the optimistic state with the server state.
  • On conflict or failure: revert the optimistic state and show an inline error — never a full-page error.

Applies to: agent replies (inbox), seat selection (booking widget), status transitions (dispatch board assignments), form submissions.
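The apply, reconcile, and revert steps above can be sketched as a small client-side store. This is a minimal illustration under stated assumptions, not the planned `useOptimisticMutation()` composable: the `OptimisticStore` class and its method names are hypothetical and not tied to any GraphQL client.

```typescript
// Hypothetical client-side store; names are illustrative only.
interface Entry<T> {
  value: T;
  pending: boolean;        // true until the server confirms
  previous: T | undefined; // last confirmed value, restored on revert
}

class OptimisticStore<T> {
  private entries = new Map<string, Entry<T>>();
  inlineError: string | null = null;

  // Step 1: render the expected result immediately.
  applyOptimistic(id: string, value: T): void {
    const current = this.entries.get(id);
    let previous: T | undefined;
    if (current) previous = current.pending ? current.previous : current.value;
    this.entries.set(id, { value, pending: true, previous });
    this.inlineError = null;
  }

  // Step 2: the subscription payload (or mutation response) is authoritative.
  reconcile(id: string, serverValue: T): void {
    this.entries.set(id, { value: serverValue, pending: false, previous: undefined });
  }

  // Step 3: on conflict or failure, restore the confirmed value and
  // expose an inline error for the affected element.
  revert(id: string, reason: string): void {
    const entry = this.entries.get(id);
    if (!entry || !entry.pending) return;
    if (entry.previous !== undefined) {
      this.entries.set(id, { value: entry.previous, pending: false, previous: undefined });
    } else {
      this.entries.delete(id); // the item only ever existed optimistically
    }
    this.inlineError = reason;
  }

  get(id: string): Entry<T> | undefined {
    return this.entries.get(id);
  }
}
```

For a seat map, a confirmed seat `"free"` that is optimistically set to `"held"` and then reverted returns to `"free"` with `inlineError` populated, so the component can show the error next to the seat rather than on a full page.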

2. Subscription Lifecycle

  • Subscriptions activate on route enter (or component mount for persistent widgets like the inbox sidebar).
  • Subscriptions deactivate on route leave (or component unmount).
  • Global subscriptions (e.g., unread badge, assignment alerts) live at the App Shell level and persist across route changes.
  • After WebSocket reconnection, the subscription auto-recovers. If stale data is possible (offline > 30s), fetch the full current state before resuming the subscription stream.
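The lifecycle rules above can be sketched as two small pieces: a registry that distinguishes route-scoped from global subscriptions, and a function that decides what to do after a reconnect. Everything here is a hypothetical illustration; the `SubscriptionRegistry` class, `resumePlan` function, and scope names are assumptions, and only the 30s threshold comes from the text.

```typescript
const STALE_AFTER_MS = 30_000; // from the rule above: offline > 30s

type ResumeStep = "refetch-full-state" | "resume-stream";

// Ordered steps to run once the WebSocket is back.
function resumePlan(disconnectedAtMs: number, reconnectedAtMs: number): ResumeStep[] {
  const offlineMs = reconnectedAtMs - disconnectedAtMs;
  return offlineMs > STALE_AFTER_MS
    ? ["refetch-full-state", "resume-stream"]
    : ["resume-stream"];
}

type Scope = "route" | "global";

// Hypothetical registry: route-scoped subscriptions stop on route leave,
// global ones (unread badge, assignment alerts) persist.
class SubscriptionRegistry {
  private active = new Map<string, { scope: Scope; stop: () => void }>();

  // Called on route enter or component mount.
  start(name: string, scope: Scope, stop: () => void): void {
    if (this.active.has(name)) return; // already running
    this.active.set(name, { scope, stop });
  }

  // Called on route leave: tear down route-scoped subscriptions only.
  leaveRoute(): void {
    for (const [name, sub] of Array.from(this.active)) {
      if (sub.scope === "route") {
        sub.stop();
        this.active.delete(name);
      }
    }
  }

  names(): string[] {
    return Array.from(this.active.keys());
  }
}
```

With this shape, a 10s drop resumes the stream directly, while a 45s drop fetches the full current state first.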

3. Loading & Transition States

  • First load: Skeleton screens (shimmer), never blank pages or raw spinners.
  • Subscription data updates: No loading indicator — data appears inline (append, update in place, remove with animation).
  • Action buttons: The button that triggered the action shows a loading state (spinner/disabled) until the mutation resolves or the optimistic render settles.
  • Mutations in flight: Inline indicators (progress bar, "saving..." text) on the affected element, not global spinners.
  • Route transitions: Prefetch data on hover/focus of navigation links where feasible (Nuxt prefetch, Vue Router lazy loading).

4. Error & Reconnection UX

  • WebSocket disconnect: After 3s, show a subtle banner: "Reconnecting..." with auto-retry (exponential backoff). After 30s, escalate to a prominent banner: "Connection lost. Changes may arrive late."
  • Subscription error: Log to Faro for observability. Attempt reconnection with exponential backoff. If unrecoverable, show a banner prompting page refresh.
  • Mutation failure after optimistic render: Revert with a toast notification explaining the failure. Never silently drop data.
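The disconnect thresholds and retry behaviour above can be expressed as two pure functions. This is a sketch: the 3s and 30s thresholds come from the text, while the banner identifiers, the 1s backoff base, and the 30s backoff cap are assumptions chosen for illustration.

```typescript
type Banner = "none" | "reconnecting" | "connection-lost";

// Which banner to show, given how long the WebSocket has been down.
function bannerFor(disconnectedForMs: number): Banner {
  if (disconnectedForMs >= 30_000) return "connection-lost"; // prominent banner
  if (disconnectedForMs >= 3_000) return "reconnecting";     // subtle banner
  return "none";                                             // brief blips stay invisible
}

// Exponential backoff: the delay doubles per attempt, up to a cap.
// baseMs and capMs are assumed defaults, not values from this document.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Keeping both functions pure makes the thresholds easy to unit-test without a real socket.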

Backend Patterns

1. Hasura Subscriptions as the Only Real-Time Channel

All Live surfaces use Hasura's native subscription infrastructure. No custom WebSocket server for domain data. No polling.

Exception: BullMQ workflow escalations (e.g., broadcast approval timeout) use a dedicated NestJS WebSocket Gateway — this is the only non-Hasura real-time channel. It carries system alerts, not domain data.

Capacity at busflow Scale

Hasura multiplexes subscriptions: identical queries across N clients execute as 1 SQL query. At busflow's operating profile (≤50 concurrent users in year 1; at full scale ~10k total users, of which ~300 are concurrent), the subscription load is trivial:

  • Year 1: ~50 users × ~4 subscriptions = ~200 subscription slots → effectively ~30–50 unique SQL queries/second (multiplexed).
  • Full scale: ~10k total users, ~300 concurrent × ~4 subscriptions = ~1,200 slots → ~200–400 unique SQL queries/second. A single Postgres instance handles this without stress.
  • Connection pool: Hasura manages its own pool (default ~50 connections). At full scale, tune to ~100. No additional infrastructure needed.
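The slot figures above are a simple product of concurrent users and open subscriptions per user. A minimal sketch of that arithmetic, with a hypothetical helper name, makes it easy to rerun the estimate when the assumptions change:

```typescript
// Slots = concurrent users × subscriptions held open per user.
function subscriptionSlots(concurrentUsers: number, subscriptionsPerUser: number): number {
  return concurrentUsers * subscriptionsPerUser;
}

const yearOneSlots = subscriptionSlots(50, 4);    // 200
const fullScaleSlots = subscriptionSlots(300, 4); // 1200
```

The unique-queries-per-second figures depend on how many of those slots share an identical query, so they are not derived here.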

2. Denormalize for Subscription Performance

Hasura subscriptions poll the database at a configurable interval (default: 1s). Complex joins and computed fields in subscription queries degrade performance at scale.

Rule: If a subscription query requires a correlated subquery or expensive JOIN, denormalize the result into a column or pre-computed field (maintained via Postgres trigger). Examples:

  • conversations.last_message_at — updated by trigger on messages INSERT.
  • conversations.unread_count — materialize via trigger if the computed field becomes a bottleneck.
  • service_legs.boarding_count — maintained by trigger on boarding_events INSERT.
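To make the rule concrete, the two subscription documents below contrast an aggregate-heavy inbox query with its denormalized equivalent. This is a sketch under assumptions: the `messages` array relationship and the `read_at` and `created_at` columns are hypothetical, while `last_message_at` and `unread_count` are the trigger-maintained columns listed above.

```typescript
// Before: unread count and latest activity are recomputed per conversation
// on every refetch. Assumes a hypothetical `messages` relationship.
const INBOX_WITH_AGGREGATES = `
  subscription InboxWithAggregates {
    conversations {
      id
      unread: messages_aggregate(where: { read_at: { _is_null: true } }) {
        aggregate { count }
      }
      latest: messages_aggregate {
        aggregate { max { created_at } }
      }
    }
  }
`;

// After: the same information is read from plain columns that a
// Postgres trigger keeps up to date on messages INSERT.
const INBOX_DENORMALIZED = `
  subscription InboxDenormalized {
    conversations(order_by: { last_message_at: desc }) {
      id
      last_message_at
      unread_count
    }
  }
`;
```

The second document reads only plain columns on `conversations`, so each refetch avoids the per-row aggregate work.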

3. Mutation → Subscription Flow

For any write operation that must reflect in real-time:

User action → Hasura Mutation (or Action) → DB write → Hasura subscription poll detects change → push to all subscribers

The subscription poll interval (default: 1s) dominates the worst-case propagation delay; query execution and network time add to it. For Live surfaces, this is acceptable. Lower HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL (set in milliseconds) if you need sub-second latency.
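The poll-detects-change step in the flow above can be modelled in a few lines. This is a simplified in-memory sketch, not Hasura's implementation: it re-runs a query on every tick and pushes to subscribers only when the serialized result differs from the previous one. The `PolledLiveQuery` class and the sample `status` values are hypothetical.

```typescript
// Simplified model of a live query: refetch on every tick, push only on change.
class PolledLiveQuery<T> {
  private runQuery: () => T;
  private lastSerialized: string | undefined;
  private subscribers: Array<(result: T) => void> = [];

  constructor(runQuery: () => T) {
    this.runQuery = runQuery;
  }

  subscribe(listener: (result: T) => void): void {
    this.subscribers.push(listener);
  }

  // One refetch tick; the engine would run this once per refetch interval.
  // Returns true when a change was detected and pushed.
  tick(): boolean {
    const result = this.runQuery();
    const serialized = JSON.stringify(result);
    if (serialized === this.lastSerialized) return false; // nothing changed
    this.lastSerialized = serialized;
    this.subscribers.forEach((push) => push(result));
    return true;
  }
}

// Stand-in for the database row a mutation writes to.
const leg = { id: "leg-1", status: "scheduled" };
const liveLeg = new PolledLiveQuery(() => ({ ...leg }));
```

A write that lands just after a tick waits almost a full interval before the next tick picks it up, which is why the interval bounds the propagation delay.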

4. Event-Driven Cache Invalidation (Eventual Tier)

For Eventual tier data, domain events drive cache invalidation:

Domain Event (e.g., PriceMatrixPublished) → Hasura Event Trigger → NestJS handler → invalidate CDN/edge cache or update read model

The frontend uses stale-while-revalidate: serve the cached version while fetching the updated version in the background. The user never sees a loading state for eventually-consistent data.
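A minimal sketch of the stale-while-revalidate behaviour described above. It collapses the CDN/edge layer and the frontend into one in-memory cache for illustration: `invalidate` stands in for whatever the event handler does after an event such as PriceMatrixPublished, and `get` returns the cached value immediately while refreshing in the background. The `SwrCache` class and its method names are hypothetical.

```typescript
class SwrCache<T> {
  private entries = new Map<string, { value: T; stale: boolean }>();
  private inFlight = new Set<string>();
  private fetcher: (key: string) => Promise<T>;

  constructor(fetcher: (key: string) => Promise<T>) {
    this.fetcher = fetcher;
  }

  // Store a fresh value (initial load or completed revalidation).
  set(key: string, value: T): void {
    this.entries.set(key, { value, stale: false });
  }

  // Called when a domain event says the cached value is outdated.
  invalidate(key: string): void {
    const entry = this.entries.get(key);
    if (entry) entry.stale = true;
  }

  // Return what is cached right away; refresh missing or stale entries
  // in the background, with at most one request in flight per key.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if ((!entry || entry.stale) && !this.inFlight.has(key)) {
      this.inFlight.add(key);
      this.fetcher(key).then(
        (fresh) => {
          this.set(key, fresh);
          this.inFlight.delete(key);
        },
        () => {
          this.inFlight.delete(key); // keep serving the stale value on failure
        },
      );
    }
    return entry ? entry.value : undefined;
  }
}
```

After `invalidate("tour-1")`, the next `get("tour-1")` still returns the old price without a loading state, and a later read returns the refreshed value once the background fetch resolves.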


Boundary Rules

| Signal | Reactivity Pattern |
|---|---|
| Data changes while user watches it | Live: Hasura Subscription |
| Data changes on a schedule or via batch events | Eventual: Event-driven invalidation + stale-while-revalidate |
| User performs a write and expects instant feedback | Optimistic UI: render immediately, reconcile via subscription |
| Server needs to push an alert to a specific user | NestJS WebSocket Gateway (escalations, assignment alerts only) |
| External system sends data (CPaaS webhook, payment callback) | Webhook → DB write → subscription propagation (no custom push needed) |

Open Items (Future Protocol Work)

The following require dedicated protocol documents when implementation begins:

  • [ ] Dispatch Board Protocol — formal subscription contracts for leg status, incident overlays, crew/vehicle availability
  • [ ] Booking Widget Real-Time Protocol — seat hold visibility, checkout status, price update propagation
  • [ ] Passenger Tracking Protocol — polling/SSE interval, vehicle position payload, ETA display
  • [ ] Driver Hub Push Strategy — PWA push notifications for assignment changes, manifest updates
  • [ ] Frontend Subscription Composable — useSubscription(), useOptimisticMutation() reference implementations
  • [ ] Wallet Pass Push Protocol — Apple/Google pass update triggers and delivery mechanism
