Real-Time & Reactivity Strategy
Decision
Every user-facing surface must feel instant. Changes propagate in real-time, efficiently, not through polling or manual refresh. This document defines the architectural patterns, boundary rules, and surface classification that govern how reactivity works across the stack.
Core Principle: Subscribe by Default, Query by Exception
New UI surfaces default to Hasura GraphQL Subscriptions for any data that can change during a user session. Standard one-shot queries apply only to archival, paginated, or non-live views (e.g., resolved conversations, past invoices, historical reports).
IMPORTANT
If a surface displays data that another user or system process can mutate while the current user looks at it, that surface must use a subscription or demonstrate why polling is acceptable.
Reactivity Tiers
Every UI surface falls into one of two tiers. The tier determines the transport mechanism, acceptable latency, and UX treatment.
| Tier | Latency Target | Mechanism | When to Use |
|---|---|---|---|
| Live | < 1s | Hasura Subscription (WebSocket) | Any data that can change while the user looks at it: dispatch board, inbox, seat maps, dashboards, assignment lists, booking status, tracking |
| Eventual | 10s–5min | Event-driven cache invalidation + query on next interaction | Pricing caches, tour catalog, report generation: background-synced data where changes are infrequent batch operations |
NOTE
Why no polling tier? At busflow's scale (≤50 concurrent users in year 1, hard ceiling ~10k), Hasura subscriptions are strictly cheaper than client-side polling. Hasura multiplexes identical subscription queries: 10 dispatchers watching the same board = 1 SQL query, not 10. Client-side polling generates more HTTP requests, higher server load, and worse latency. Use subscriptions everywhere; no tier is reserved for polling.
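The multiplexing argument can be illustrated with a toy model. This is a sketch of the idea only, not Hasura's implementation: subscribers are grouped by identical query text so that one fetch per poll tick serves the whole group. `MultiplexedPoller`, `subscribe`, `tick`, and `fetchCount` are hypothetical names invented for the example.

```typescript
// Toy model of subscription multiplexing. Hypothetical names, not Hasura's API.
type Listener<T> = (data: T) => void;

class MultiplexedPoller<T> {
  // One entry per unique query text; every client of that query shares it.
  private groups = new Map<string, Set<Listener<T>>>();
  private runQuery: (query: string) => T;
  fetchCount = 0; // how many database fetches the poller has issued

  constructor(runQuery: (query: string) => T) {
    this.runQuery = runQuery;
  }

  // Registers a client and returns a function that unsubscribes it.
  subscribe(query: string, listener: Listener<T>): () => void {
    let group = this.groups.get(query);
    if (group === undefined) {
      group = new Set<Listener<T>>();
      this.groups.set(query, group);
    }
    const joined = group;
    joined.add(listener);
    return () => {
      joined.delete(listener);
      // Drop the group once its last client leaves.
      if (joined.size === 0 && this.groups.get(query) === joined) {
        this.groups.delete(query);
      }
    };
  }

  // One poll tick: run each unique query once, then fan the result out.
  tick(): void {
    this.groups.forEach((listeners, query) => {
      const result = this.runQuery(query);
      this.fetchCount += 1;
      listeners.forEach((listener) => listener(result));
    });
  }
}

// Ten dispatchers watch the same board; one more client watches the inbox.
const poller = new MultiplexedPoller<string>((query) => `rows for ${query}`);
let deliveries = 0;
for (let i = 0; i < 10; i++) {
  poller.subscribe("dispatch_board", () => { deliveries += 1; });
}
poller.subscribe("inbox", () => { deliveries += 1; });
poller.tick();
console.log(poller.fetchCount, deliveries); // 2 fetches, 11 deliveries
```

With client-side polling, the same eleven clients would each issue their own request per interval; here the fetch count tracks the number of unique queries rather than the number of clients.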
Surface Classification
| Surface | Tier | Rationale |
|---|---|---|
| Omnichannel Inbox | Live | Dispatchers monitor conversations all day. Incoming messages must appear instantly. |
| Dispatch Board (Gantt/Calendar) | Live | Leg status changes, incident alerts, and concurrent assignment conflicts require instant visibility. |
| Booking Widget (Seat Map) | Live | Concurrent seat holds must reflect in real-time to prevent double-booking frustration. |
| Workspace Dashboards (Revenue, Booking Pace) | Live | Subscriptions are cheap at our scale. Dispatchers benefit from seeing numbers update without refresh. |
| Passenger Tracking Page | Live | Subscription on route_waypoints. The GPS push interval (~10s) bounds update frequency, but the subscription itself adds no overhead. |
| Driver Hub (Assignment List) | Live | New assignments and manifest changes appear immediately. The driver's app subscribes while foregrounded. |
| Passenger Portal (Booking Status) | Live | Status transitions are infrequent but must reflect without manual refresh. Subscription cost is negligible per session. |
| Tour Catalog / Pricing | Eventual | Cached via PriceMatrixPublished / SeasonPricingFinalized events. Price changes are infrequent batch operations; when they happen, the event-driven invalidation ensures the next page load shows the correct price. |
Frontend Patterns
1. Optimistic UI
Mutations that affect the current user's view must provide immediate visual feedback before the server confirms.
- Render the expected result instantly (e.g., message appears with "sending" indicator, seat shows as "held").
- The subscription (or mutation response) reconciles the optimistic state with the server state.
- On conflict or failure: revert the optimistic state and show an inline error, never a full-page error.
Applies to: agent replies (inbox), seat selection (booking widget), status transitions (dispatch board assignments), form submissions.
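The render, reconcile, and revert steps can be sketched as a small state container. This is a minimal illustration under stated assumptions, not a reference implementation: the `Message` shape, the temporary-id scheme, and the `OptimisticMessages` class are all hypothetical, and the server state is assumed to arrive as a full list.

```typescript
// Hypothetical message shape; field names are assumptions for illustration.
interface Message {
  id: string;
  text: string;
  status: "sending" | "sent";
}

class OptimisticMessages {
  private confirmed: Message[] = [];
  private pending = new Map<string, Message>();

  // Render immediately: the message is visible with a "sending" indicator.
  addOptimistic(tempId: string, text: string): void {
    this.pending.set(tempId, { id: tempId, text, status: "sending" });
  }

  // The subscription (or mutation response) delivers authoritative state;
  // optimistic entries that it now covers are dropped.
  reconcile(serverMessages: Message[], settledTempIds: string[]): void {
    this.confirmed = serverMessages;
    for (const tempId of settledTempIds) {
      this.pending.delete(tempId);
    }
  }

  // On conflict or failure: remove the optimistic entry and hand back
  // a message for the inline error next to the affected element.
  revert(tempId: string, reason: string): string {
    this.pending.delete(tempId);
    return `Message not sent: ${reason}`;
  }

  // What the UI renders: confirmed rows followed by still-pending ones.
  view(): Message[] {
    return [...this.confirmed, ...Array.from(this.pending.values())];
  }
}

const inbox = new OptimisticMessages();
inbox.addOptimistic("tmp-1", "Bus 12 is on its way");
const whileSending = inbox.view(); // one row, status "sending"
inbox.reconcile([{ id: "m-1", text: "Bus 12 is on its way", status: "sent" }], ["tmp-1"]);
const afterConfirm = inbox.view(); // one row, id "m-1", status "sent"
```

Keeping pending entries separate from confirmed ones makes the revert path a single delete, which is what keeps failures local to the affected element.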
2. Subscription Lifecycle
- Subscriptions activate on route enter (or component mount for persistent widgets like the inbox sidebar).
- Subscriptions deactivate on route leave (or component unmount).
- Global subscriptions (e.g., unread badge, assignment alerts) live at the App Shell level and persist across route changes.
- After WebSocket reconnection, the subscription auto-recovers. If stale data is possible (offline > 30s), fetch the full current state before resuming the subscription stream.
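The lifecycle rules above can be sketched as a registry plus a resume decision. This is an illustrative sketch only: `SubscriptionRegistry`, `planResume`, and the scope labels are hypothetical names, and the actual wiring into Vue Router or Nuxt hooks is left out.

```typescript
type Scope = "route" | "app-shell";

interface ActiveSubscription {
  scope: Scope;
  stop: () => void; // closes the underlying subscription
}

class SubscriptionRegistry {
  private active = new Map<string, ActiveSubscription>();

  // Called on route enter / component mount, or once by the App Shell.
  start(name: string, scope: Scope, stop: () => void): void {
    this.active.set(name, { scope, stop });
  }

  // Called on route leave: route-scoped subscriptions stop,
  // App Shell subscriptions persist across the navigation.
  leaveRoute(): void {
    this.active.forEach((sub, name) => {
      if (sub.scope === "route") {
        sub.stop();
        this.active.delete(name);
      }
    });
  }

  activeNames(): string[] {
    return Array.from(this.active.keys());
  }
}

// Threshold taken from the rule above: offline for more than 30s means stale.
const STALE_AFTER_MS = 30000;
type ResumePlan = "resume-stream" | "refetch-then-resume";

function planResume(offlineMs: number): ResumePlan {
  return offlineMs > STALE_AFTER_MS ? "refetch-then-resume" : "resume-stream";
}

const registry = new SubscriptionRegistry();
const stopped: string[] = [];
registry.start("dispatch-board", "route", () => { stopped.push("dispatch-board"); });
registry.start("unread-badge", "app-shell", () => { stopped.push("unread-badge"); });
registry.leaveRoute();
console.log(registry.activeNames()); // only "unread-badge" remains
```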
3. Loading & Transition States
- First load: Skeleton screens (shimmer), never blank pages or raw spinners.
- Subscription data updates: No loading indicator; data appears inline (append, update in place, remove with animation).
- Action buttons: The button that triggered the action shows a loading state (spinner/disabled) until the mutation resolves or the optimistic render settles.
- Mutations in flight: Inline indicators (progress bar, "saving..." text) on the affected element, not global spinners.
- Route transitions: Prefetch data on hover/focus of navigation links where feasible (Nuxt prefetch, Vue Router lazy loading).
4. Error & Reconnection UX
- WebSocket disconnect: After 3s, show a subtle banner: "Reconnecting..." with auto-retry (exponential backoff). After 30s, escalate to a prominent banner: "Connection lost. Changes may arrive late."
- Subscription error: Log to Faro for observability. Attempt reconnection with exponential backoff. If unrecoverable, show a banner prompting page refresh.
- Mutation failure after optimistic render: Revert with a toast notification explaining the failure. Never silently drop data.
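The banner thresholds and retry timing above reduce to two pure functions. The sketch below uses the 3s and 30s thresholds from the text; the backoff base (1s) and cap (30s) are assumptions chosen for illustration, as are the function and type names.

```typescript
type Banner = "none" | "reconnecting" | "connection-lost";

// Maps how long the WebSocket has been down to the banner to display.
function bannerFor(disconnectedForMs: number): Banner {
  if (disconnectedForMs >= 30000) return "connection-lost"; // prominent banner
  if (disconnectedForMs >= 3000) return "reconnecting";     // subtle banner
  return "none"; // brief blips stay invisible to the user
}

// Exponential backoff: base * 2^attempt, capped.
// ASSUMPTION: baseMs and capMs defaults are illustrative, not from the spec.
function backoffDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const firstRetries = [0, 1, 2, 3].map((attempt) => backoffDelayMs(attempt));
console.log(firstRetries); // [1000, 2000, 4000, 8000]
```

In practice a random jitter is often added to each delay so that many clients do not reconnect in the same instant; it is omitted here to keep the output deterministic.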
Backend Patterns
1. Hasura Subscriptions as the Only Real-Time Channel
All Live surfaces use Hasura's native subscription infrastructure. No custom WebSocket server for domain data. No polling.
Exception: BullMQ workflow escalations (e.g., broadcast approval timeout) use a dedicated NestJS WebSocket Gateway. This is the only non-Hasura real-time channel. It carries system alerts, not domain data.
Capacity at busflow Scale
Hasura multiplexes subscriptions: identical queries across N clients execute as 1 SQL query. At busflow's operating profile (≤50 concurrent users in year 1, hard ceiling ~10k tenants with ~3 concurrent dispatchers each), the subscription load is trivial:
- Year 1: ~50 users × ~4 subscriptions = ~200 subscription slots → effectively ~30–50 unique SQL queries/second (multiplexed).
- Full scale: ~10k total users, ~300 concurrent × ~4 subscriptions = ~1,200 slots → ~200–400 unique SQL queries/second. A single Postgres instance handles this without stress.
- Connection pool: Hasura manages its own pool (default ~50 connections). At full scale, tune to ~100. No additional infrastructure needed.
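The capacity figures above follow from simple arithmetic, sketched below. The user and subscription counts come from the text; the sharing factor (`clientsPerQuery`, how many clients watch the same query on average) is an assumption introduced here to show how slot counts could map onto the quoted query rates, and both function names are hypothetical.

```typescript
// Slots = concurrent users x subscriptions each user holds open.
function subscriptionSlots(concurrentUsers: number, subscriptionsPerUser: number): number {
  return concurrentUsers * subscriptionsPerUser;
}

// ASSUMPTION: on average `clientsPerQuery` clients share each identical query,
// and every unique query is re-run once per refetch interval.
function uniqueQueriesPerSecond(
  slots: number,
  clientsPerQuery: number,
  refetchIntervalMs: number
): number {
  const uniqueQueries = slots / clientsPerQuery;
  return uniqueQueries * (1000 / refetchIntervalMs);
}

const yearOneSlots = subscriptionSlots(50, 4);    // 200
const fullScaleSlots = subscriptionSlots(300, 4); // 1200

// Illustrative sharing factors of 5 and 4 land inside the quoted ranges.
console.log(uniqueQueriesPerSecond(yearOneSlots, 5, 1000));   // 40
console.log(uniqueQueriesPerSecond(fullScaleSlots, 4, 1000)); // 300
```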
2. Denormalize for Subscription Performance
Hasura subscriptions poll the database at a configurable interval (default: 1s). Complex joins and computed fields in subscription queries degrade performance at scale.
Rule: If a subscription query requires a correlated subquery or expensive JOIN, denormalize the result into a column or pre-computed field (maintained via Postgres trigger). Examples:
- conversations.last_message_at: updated by trigger on messages INSERT.
- conversations.unread_count: materialize via trigger if the computed field becomes a bottleneck.
- service_legs.boarding_count: maintained by trigger on boarding_events INSERT.
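As a sketch of the first example, the trigger could be shipped as SQL held in a TypeScript migration constant. The table names and the last_message_at column come from the text; the columns messages.conversation_id, messages.created_at, and conversations.id are assumptions, and the function, trigger, and constant names are hypothetical. How the string is executed depends on the migration tooling in use.

```typescript
// Migration sketch: keep conversations.last_message_at current via trigger.
// ASSUMED columns: messages.conversation_id, messages.created_at, conversations.id.
const lastMessageAtTriggerSql: string = `
CREATE OR REPLACE FUNCTION touch_conversation_last_message_at()
RETURNS trigger AS $$
BEGIN
  -- Copy the new message's timestamp onto its parent conversation.
  UPDATE conversations
     SET last_message_at = NEW.created_at
   WHERE id = NEW.conversation_id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER messages_touch_conversation
AFTER INSERT ON messages
FOR EACH ROW
EXECUTE FUNCTION touch_conversation_last_message_at();
`;

console.log(lastMessageAtTriggerSql.trim().split("\n").length, "lines of SQL");
```

The subscription query can then read conversations.last_message_at directly instead of computing a MAX over messages on every poll.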
3. Mutation → Subscription Flow
For any write operation that must reflect in real-time:
User action → Hasura Mutation (or Action) → DB write → Hasura subscription poll detects change → push to all subscribers
The subscription poll interval (default: 1s) determines the worst-case propagation delay. For Live surfaces, this is acceptable. Tune HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL if you need sub-second latency.
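Why the poll interval bounds the propagation delay can be shown with a small calculation. The sketch assumes polls fire at exact multiples of the interval, which is a simplification of real scheduling, and it ignores query execution and network time; `msUntilNextPoll` is a hypothetical helper.

```typescript
// Time a committed write waits before the next poll can observe it.
// ASSUMPTION: polls fire at t = 0, interval, 2 * interval, ...
function msUntilNextPoll(writeAtMs: number, intervalMs: number): number {
  const sinceLastPoll = writeAtMs % intervalMs;
  return sinceLastPoll === 0 ? 0 : intervalMs - sinceLastPoll;
}

// A write just after a poll waits almost the full interval (the worst case);
// a write just before the next poll is picked up almost immediately.
const justAfterPoll = msUntilNextPoll(1, 1000);    // 999
const justBeforePoll = msUntilNextPoll(999, 1000); // 1
// Shortening the interval to 250ms shrinks the worst case accordingly.
const shorterInterval = msUntilNextPoll(1, 250);   // 249
console.log(justAfterPoll, justBeforePoll, shorterInterval);
```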
4. Event-Driven Cache Invalidation (Eventual Tier)
For Eventual tier data, domain events drive cache invalidation:
Domain Event (e.g., PriceMatrixPublished) → Hasura Event Trigger → NestJS handler → invalidate CDN/edge cache or update read model
The frontend uses stale-while-revalidate: serve the cached version while fetching the updated version in the background. The user never sees a loading state for eventually-consistent data.
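The interplay of event-driven invalidation and stale-while-revalidate can be sketched with an in-memory cache. This is a simplified model under stated assumptions: the loader is synchronous and the background work is handed to an injected `defer` function so the behaviour stays deterministic, whereas a real fetch would be asynchronous. `SwrCache` and its method names are hypothetical.

```typescript
interface Entry<T> {
  value: T;
  stale: boolean;
}

class SwrCache<T> {
  private entries = new Map<string, Entry<T>>();
  private load: (key: string) => T;
  private defer: (task: () => void) => void;

  constructor(load: (key: string) => T, defer: (task: () => void) => void) {
    this.load = load;
    this.defer = defer;
  }

  // Called by the event handler, e.g. when PriceMatrixPublished arrives.
  invalidate(key: string): void {
    const entry = this.entries.get(key);
    if (entry !== undefined) entry.stale = true;
  }

  read(key: string): T {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      // Cold cache: nothing to serve yet, so load before returning.
      const value = this.load(key);
      this.entries.set(key, { value, stale: false });
      return value;
    }
    if (entry.stale) {
      // Serve the stale value now and refresh in the background.
      entry.stale = false; // avoids scheduling duplicate refreshes
      this.defer(() => {
        this.entries.set(key, { value: this.load(key), stale: false });
      });
    }
    return entry.value;
  }
}

// Usage: a price changes, the event invalidates it, the next read stays instant.
let currentPrice = 100;
const backgroundTasks: Array<() => void> = [];
const prices = new SwrCache<number>(() => currentPrice, (task) => { backgroundTasks.push(task); });
const firstRead = prices.read("tour-1");      // 100, loaded on cold cache
currentPrice = 120;
prices.invalidate("tour-1");
const staleRead = prices.read("tour-1");      // still 100, refresh is queued
backgroundTasks.forEach((task) => task());    // background revalidation runs
const freshRead = prices.read("tour-1");      // 120
```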
Boundary Rules
| Signal | Reactivity Pattern |
|---|---|
| Data changes while user watches it | Live: Hasura Subscription |
| Data changes on a schedule or via batch events | Eventual: Event-driven invalidation + stale-while-revalidate |
| User performs a write and expects instant feedback | Optimistic UI: render immediately, reconcile via subscription |
| Server needs to push an alert to a specific user | NestJS WebSocket Gateway (escalations, assignment alerts only) |
| External system sends data (CPaaS webhook, payment callback) | Webhook → DB write → subscription propagation (no custom push needed) |
Open Items (Future Protocol Work)
The following require dedicated protocol documents when implementation begins:
- [ ] Dispatch Board Protocol: formal subscription contracts for leg status, incident overlays, crew/vehicle availability
- [ ] Booking Widget Real-Time Protocol: seat hold visibility, checkout status, price update propagation
- [ ] Passenger Tracking Protocol: polling/SSE interval, vehicle position payload, ETA display
- [ ] Driver Hub Push Strategy: PWA push notifications for assignment changes, manifest updates
- [ ] Frontend Subscription Composable: useSubscription(), useOptimisticMutation() reference implementations
- [ ] Wallet Pass Push Protocol: Apple/Google pass update triggers and delivery mechanism