Agentic-Led Company — Governance & Control Spec Sheet

Working title: Agentic-Led Company
Author: Julian Brüning
Date: 2026-04-17
Status: ACCEPTED (implementation via Paperclip + MCP, see §10)

1. Purpose

Define the governance model for a solo-founder SaaS company (Busflow) where AI agents perform work across all departments. The model must guarantee:

AI self-governance — agents catch their own mistakes before output reaches the founder
Founder sovereignty — every decision and output is visible, scannable, and overridable
Context efficiency — agents operate with minimal, scoped context to reduce cost and hallucination risk
Proactive task surfacing — agents don't just respond; they identify and propose work

2. Core Principles

#	Principle	Description
P1	No blind delegation	Every agent output must pass through at least one review layer before it becomes a decision
P2	Scannable by default	All outputs follow a standardized hierarchy: Summary → Decisions → Details
P3	Traceability	Every output traces back to the input that triggered it and the reasoning chain used
P4	Hallucination visibility	Supervisor corrections are always surfaced, never silently merged
P5	Founder is final authority	No high-impact action without explicit founder approval (see §7 Blast Radius)
P6	Least-privilege context	Each agent receives only the context its bounded context permits
P7	Token economy	Every review layer must justify its cost; prefer lightweight checks over full re-analysis

3. Event-Driven Agent Architecture

3.1 Design Overview

The architecture organizes agents as bounded contexts (mirroring DDD) that communicate exclusively through a central Event Bus. No agent directly calls another agent. The Orchestrator routes events and the Supervisor reviews outputs; these may be the same or separate roles (see §3.5).

Agent teams map directly to Busflow's four DDD bounded contexts (pillars), ensuring each team operates with deep, scoped knowledge of its domain rather than shallow, broad knowledge of the entire codebase.

                    ┌────────────────────────┐
                    │     FOUNDER (You)      │
                    │  Approve · Override    │
                    │  Task Board (§4)       │
                    └──────────┬─────────────┘
                               │ Management Reports
                               │ Task proposals
                    ┌──────────▼─────────────┐
                    │   CEO / CTO AGENTS     │
                    │   Routes events         │
                    │   Reviews outputs       │
                    │   Enforces standards    │
                    └──────────┬─────────────┘
                               │ Domain Events
         ┌─────────────┬───────┴───────┬─────────────┐
         │             │               │             │
   ┌─────▼──────┐ ┌────▼───────┐ ┌─────▼──────┐ ┌───▼──────────┐
   │ COMMERCE   │ │ BACKOFFICE │ │ OPERATIONS │ │ COMMS        │
   │ TEAM       │ │ TEAM       │ │ TEAM       │ │ TEAM         │
   │            │ │            │ │            │ │              │
   │ PM · Eng   │ │ PM · Eng   │ │ PM · Eng   │ │ PM · Eng     │
   │ · QA       │ │ · QA       │ │ · QA       │ │ · QA         │
   │            │ │            │ │            │ │              │
   │ booking-   │ │ workspace  │ │ driver app │ │ Real-time    │
   │ widget,    │ │ app,       │ │ packages/  │ │ Inbox,       │
   │ passenger  │ │ packages/  │ │ operations │ │ packages/    │
   │ app,       │ │ backoffice │ │            │ │ comms        │
   │ packages/  │ │            │ │            │ │              │
   │ commerce   │ │            │ │            │ │              │
   └────────────┘ └────────────┘ └────────────┘ └──────────────┘

   ┌── Cross-Cutting Roles ──────────────────────────────────┐
   │ Product Manager · Domain Expert · Knowledge Synthesis   │
   │ Co-Founder (Strategy)                                   │
   └─────────────────────────────────────────────────────────┘

NOTE

Top-Down Context Inheritance. Paperclip natively enforces hierarchical context: Company Mission → Project Goal → Agent Task. When a Commerce Engineer picks up a ticket, it automatically receives this ancestry — it sees not just the code task but that it belongs to the "Platform Payments" Project Goal aligned with the "Monetize the MVP" Company Mission. This reduces prompt engineering overhead and ensures strategic alignment without manual context injection.

3.2 Bounded Contexts — Access Control Matrix

Each agent domain has a strict read scope. Anything outside its scope is invisible. Domain teams map to Busflow's four DDD bounded contexts; cross-cutting roles span all domains but receive only summaries, not raw access.

Domain Teams (scoped to monorepo directories)

Domain Team	Monorepo Scope	Can Read	Cannot Read
Commerce	`apps/booking-widget`, `apps/passenger`, `packages/commerce/*`	Domain source code, `schema-commerce.md`, domain tests, CI/CD logs	Other domains' code, marketing campaigns, customer PII, financials
Backoffice	`apps/workspace`, `packages/backoffice/*`	Domain source code, `schema-backoffice.md`, domain tests, CI/CD logs	Other domains' code, marketing campaigns, customer PII, financials
Operations	`apps/driver`, `packages/operations/*`	Domain source code, `schema-operations.md`, domain tests, CI/CD logs	Other domains' code, marketing campaigns, customer PII, financials
Communications	`packages/comms/*`, Real-time Inbox modules	Domain source code, `schema-communications.md`, messaging schemas, domain tests	Other domains' code, marketing campaigns, financials
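
The directory scoping in the table above could be enforced with a path-prefix check. The following is a minimal sketch under stated assumptions, not the mechanism Paperclip or the MCP bridge actually uses: `DOMAIN_SCOPES` and `canRead` are hypothetical names, the schema path follows the `docs/architecture/schema-<pillar>.md` pattern from §3.2a, and the Real-time Inbox modules are left out because the spec gives no path for them.

```typescript
// Hypothetical sketch: path-prefix read scoping per domain team.
// Directory prefixes come from the table above; all names are illustrative.
type Domain = "commerce" | "backoffice" | "operations" | "communications";

const DOMAIN_SCOPES: Record<Domain, readonly string[]> = {
  commerce: ["apps/booking-widget/", "apps/passenger/", "packages/commerce/"],
  backoffice: ["apps/workspace/", "packages/backoffice/"],
  operations: ["apps/driver/", "packages/operations/"],
  // Real-time Inbox modules are not listed: the spec names no path for them.
  communications: ["packages/comms/"],
};

// Returns true only if the path is the domain's schema doc or lies inside
// one of the domain's directories.
function canRead(domain: Domain, path: string): boolean {
  const normalized = path.replace(/^\.\//, "");
  // Reject traversal segments so "packages/commerce/../operations" cannot escape.
  if (normalized.split("/").includes("..")) return false;
  if (normalized === `docs/architecture/schema-${domain}.md`) return true;
  return DOMAIN_SCOPES[domain].some((prefix) => normalized.startsWith(prefix));
}
```

A default-deny check like this keeps "anything outside its scope is invisible" true even when new directories are added to the monorepo.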

Cross-Cutting Roles

Role	Can Read	Cannot Read
Product Manager	Usage analytics, feature flags, churn metrics, all domain summaries (not raw data)	Source code details, marketing campaign internals
Domain Expert	Regulations, industry publications, domain knowledge base	Source code, marketing, customer PII
Knowledge Synthesis	All public sources, industry data, competitor intel	Internal code, customer data, financials
Co-Founder (Strategy)	Domain summaries from all agents, decision log, KPIs	Raw code, raw customer data — only aggregated views

IMPORTANT

Access control serves dual purposes: (1) security/privacy and (2) token economy — agents with smaller context windows are cheaper and less prone to hallucination.

3.2a Domain Team Composition

Each of the four domain teams follows a standardized three-role structure:

Role	Responsibilities	Scoping
Domain PM/Lead	Breaks epics from the CEO agent into domain-scoped tickets. Prioritizes backlog. Ensures alignment with project goals.	Reads domain summaries + cross-domain event contracts. No code access.
Domain Engineer	Coding agent (e.g., Claude, Codex). Implements features, fixes bugs, writes tests.	Strictly restricted to changes within its domain's directories. Cannot modify files outside its bounded context.
Domain QA/Reviewer	Runs domain-specific Vitest/Playwright suites. Verifies adherence to domain schema (`docs/architecture/schema-<pillar>.md`). Enforces architectural constraints.	Reads domain code + test results. Writes learnings to `.knowledge.md` (see §3.4).

Why three roles instead of one Engineer?

Blast radius containment. A Commerce Engineer cannot accidentally break Operations code.
Distributed review. The three-stage review pipeline (§5) runs within each domain, not through a single global Supervisor bottleneck.
Deep context over broad context. Each Engineer loads only its domain's schemas, README, and .knowledge.md — smaller context windows, lower cost, fewer hallucinations (Principles P6, P7).

3.3 Event-Driven Communication

Agents communicate only through typed domain events on the Event Bus. Examples:

Event	Producer	Consumers
`feature.shipped`	Commerce / Backoffice / Operations / Comms team	PM, Co-Founder
`churn.risk.detected`	PM	Commerce team, Co-Founder
`competitor.change.detected`	Knowledge Synthesis	PM, Co-Founder
`compliance.rule.changed`	Domain Expert	Relevant domain team(s), Co-Founder
`booking.schema.changed`	Commerce team	Operations team (soft FK), Comms team
`task.proposed`	Any agent	Orchestrator → Founder Task Board

Rules:

Events are the only cross-domain data exchange mechanism
Events carry minimal payload — consumers request details through the Orchestrator if needed
All events are logged immutably for audit
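
The three rules above can be sketched as a small typed bus. This is an illustration of the target architecture only, assuming an in-memory log; the `EventBus` class, the `DomainEvent` envelope, and its `ref` field are hypothetical and do not describe Paperclip's ticket system.

```typescript
// Hypothetical sketch: typed domain events on an append-only, in-memory bus.
type EventName =
  | "feature.shipped"
  | "churn.risk.detected"
  | "competitor.change.detected"
  | "compliance.rule.changed"
  | "booking.schema.changed"
  | "task.proposed";

interface DomainEvent {
  readonly name: EventName;
  readonly producer: string;   // illustrative identifier, e.g. "commerce-team"
  readonly occurredAt: string; // ISO 8601 timestamp
  readonly ref: string;        // minimal payload: a reference, not the full data
}

type Handler = (event: DomainEvent) => void;

class EventBus {
  private readonly log: DomainEvent[] = [];
  private readonly handlers = new Map<EventName, Handler[]>();

  subscribe(name: EventName, handler: Handler): void {
    const existing = this.handlers.get(name) ?? [];
    this.handlers.set(name, [...existing, handler]);
  }

  publish(event: DomainEvent): void {
    // Freeze a copy before logging so later mutation cannot alter the audit trail.
    const frozen = Object.freeze({ ...event });
    this.log.push(frozen);
    for (const handler of this.handlers.get(event.name) ?? []) handler(frozen);
  }

  // Callers receive a copy; the internal log is only ever appended to.
  auditTrail(): readonly DomainEvent[] {
    return [...this.log];
  }
}
```

A production audit trail would need durable, tamper-evident storage; the in-memory array only shows the append-only shape.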

NOTE

Implementation note (§10): Paperclip uses a ticket-based model instead of a typed event bus. Agents communicate through Paperclip's ticket system — a different mechanism achieving the same goal of structured, auditable cross-domain communication. The event bus model described here serves as the target architecture should the system evolve to Strategy D or C.

3.4 Cross-Session Domain Knowledge

AI models carry no state from one session to the next. To simulate continuous domain mastery, each domain team maintains a persistent knowledge file:

File: packages/<domain>/.knowledge.md

Write discipline (QA agent):

When the Domain QA agent finds a bug, architectural violation, or non-obvious pattern, it instructs the Domain Engineer to fix the issue and appends the learning to .knowledge.md.
Entries follow a structured format: date, ticket reference, what went wrong, what the fix was, and the extracted rule.
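
One way to keep entries uniform is to render them from a typed record. The sketch below assumes a Markdown layout that the spec does not prescribe; only the five fields come from the text above, and `KnowledgeEntry` and `formatKnowledgeEntry` are hypothetical names.

```typescript
// Hypothetical sketch: one .knowledge.md entry with the five fields named above.
interface KnowledgeEntry {
  date: string;          // ISO date, e.g. "2026-04-17"
  ticket: string;        // ticket reference as issued by the ticket system
  whatWentWrong: string;
  fix: string;
  rule: string;          // the extracted, reusable rule
}

// Renders an entry as a Markdown section ready to append to the file.
function formatKnowledgeEntry(entry: KnowledgeEntry): string {
  return [
    `## ${entry.date} · ${entry.ticket}`,
    `- What went wrong: ${entry.whatWentWrong}`,
    `- Fix: ${entry.fix}`,
    `- Rule: ${entry.rule}`,
    "",
  ].join("\n");
}
```

Leading each entry with date and ticket lets the Engineer skim the file and trace any rule back to the ticket that produced it (Principle P3).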

Read discipline (Engineer agent):

The Engineer's SKILL.md instructs it to always read .knowledge.md first upon waking for a new ticket.
This creates a locally-scoped, domain-specific knowledge base that grows as the team "works."

Precedent: This pattern parallels the existing .agents/skills/frontend/SKILL.md learnings section, which accumulates design system and accessibility learnings across sessions.

Session continuity:

Within a ticket: Paperclip maintains session state for ongoing tasks. An agent working on a multi-day ticket retains full context of previous tool calls and discussions for that specific ticket.
Across tickets: .knowledge.md carries forward accumulated domain wisdom. It is the only persistence mechanism that spans tickets.

IMPORTANT

.knowledge.md files are domain-scoped, not global. The Commerce team's knowledge base contains Commerce-specific learnings only. This preserves the bounded context boundary and keeps context windows small (Principle P6).

3.5 Orchestrator vs. Supervisor — Merged or Separate?

Aspect	Merged (recommended for start)	Separate
Token cost	Lower — one pass	Higher — two passes
Risk	Orchestrator can rubber-stamp its own routing	Better separation of concerns
Complexity	Simpler to implement	More robust at scale
Recommendation	✅ Start here	Evolve to this when agent count > 5

Start merged: The Orchestrator routes events AND reviews outputs. Split into two roles when the system grows complex enough that review quality degrades.

4. Proactive Task Creation

Agents don't just respond to requests — they propose work by emitting task.proposed events.

4.1 Task Lifecycle

Agent proposes → Orchestrator validates → Founder Task Board → Founder decides
     │                    │                       │
  task.proposed    Enriches with context     Approve / Reject /
                   Deduplicates              Defer / Delegate
                   Assigns priority

4.2 Task Schema

Every proposed task follows this structure:

```yaml
id: auto-generated
source_agent: marketing
type: opportunity | risk | maintenance | improvement
priority_suggestion: low | medium | high | critical
title: "Create comparison page: Busflow vs. Busvermietung24"
rationale: "Competitor launched new pricing page. SEO opportunity."
effort_estimate: small | medium | large
blocked_by: []  # dependencies on other tasks
decision_needed: true | false
expires: 2026-05-01  # optional, for time-sensitive tasks
```
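
The schema above could be mirrored as a TypeScript type with a runtime guard for the Orchestrator's validation step. This is a sketch only: `validateTask` and `isExpired` are hypothetical helpers, and treating `expires` as an ISO date string is an assumption.

```typescript
// Hypothetical sketch: the task schema as a type plus a runtime guard.
const TASK_TYPES = ["opportunity", "risk", "maintenance", "improvement"] as const;
const PRIORITIES = ["low", "medium", "high", "critical"] as const;
const EFFORTS = ["small", "medium", "large"] as const;

interface ProposedTask {
  id: string;
  source_agent: string;
  type: (typeof TASK_TYPES)[number];
  priority_suggestion: (typeof PRIORITIES)[number];
  title: string;
  rationale: string;
  effort_estimate: (typeof EFFORTS)[number];
  blocked_by: string[];
  decision_needed: boolean;
  expires?: string; // optional ISO date (assumed format: YYYY-MM-DD)
}

// Returns a list of problems; an empty list means the task is well-formed.
function validateTask(task: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const oneOf = (field: string, allowed: readonly string[]): void => {
    if (!allowed.includes(String(task[field]))) {
      problems.push(`${field} must be one of: ${allowed.join(", ")}`);
    }
  };
  for (const field of ["id", "source_agent", "title", "rationale"]) {
    if (typeof task[field] !== "string" || task[field] === "") {
      problems.push(`${field} must be a non-empty string`);
    }
  }
  oneOf("type", TASK_TYPES);
  oneOf("priority_suggestion", PRIORITIES);
  oneOf("effort_estimate", EFFORTS);
  if (!Array.isArray(task["blocked_by"])) problems.push("blocked_by must be a list");
  if (typeof task["decision_needed"] !== "boolean") {
    problems.push("decision_needed must be true or false");
  }
  return problems;
}

// ISO dates in YYYY-MM-DD form compare correctly as plain strings.
function isExpired(task: ProposedTask, today: string): boolean {
  return task.expires !== undefined && task.expires < today;
}
```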

4.3 Founder Task Board Requirements

Single view of all proposed tasks across all agent domains
Filterable by: source agent, type, priority, decision needed
Sortable by: priority, date proposed, effort
Batch actions: approve/reject multiple tasks at once
Snooze: defer a task to a specific date
Link to context: every task links to the report/event that spawned it
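
The filter, sort, and snooze requirements above reduce to a small pure function. The sketch below is one possible shape, not a description of Paperclip's board; `BoardTask`, `boardView`, and the `snoozed_until` field are hypothetical.

```typescript
// Hypothetical sketch: filter and sort for the Founder Task Board.
type Priority = "low" | "medium" | "high" | "critical";

interface BoardTask {
  title: string;
  source_agent: string;
  priority: Priority;
  decision_needed: boolean;
  snoozed_until?: string; // ISO date; the task stays hidden until this date
}

const PRIORITY_RANK: Record<Priority, number> = { critical: 0, high: 1, medium: 2, low: 3 };

type BoardFilter = Partial<Pick<BoardTask, "source_agent" | "priority" | "decision_needed">>;

// Returns visible tasks, highest priority first, optionally narrowed by a filter.
function boardView(tasks: readonly BoardTask[], today: string, filter: BoardFilter = {}): BoardTask[] {
  return tasks
    .filter((t) => t.snoozed_until === undefined || t.snoozed_until <= today)
    .filter((t) => filter.source_agent === undefined || t.source_agent === filter.source_agent)
    .filter((t) => filter.priority === undefined || t.priority === filter.priority)
    .filter((t) => filter.decision_needed === undefined || t.decision_needed === filter.decision_needed)
    .sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority]);
}
```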

5. Review Layers — Critical Self-Analysis

5.1 Three-Stage Review Pipeline

Stage	Who	Purpose	Token Cost
Self-Review	Work Agent	Critical self-analysis: "What could be wrong? What did I assume?"	Included in generation
Supervisor Review	Orchestrator/Supervisor	Cross-check against knowledge base, flag hallucinations, verify scope	~20-30% of generation cost
Founder Review	You	Final authority, strategic judgment, override	Your time

TIP

Token economy: The self-review needs no separate model call; its cost is part of generation. The Supervisor review should use a checklist (cheaper) instead of full re-generation, escalating to deep analysis only when the checklist flags issues.

5.2 Work Agent Self-Review Requirements

Each agent output must include a Critical Self-Analysis section:

## Self-Analysis
- Confidence: medium
- Key assumptions: [list]
- What could be wrong: [list]
- Sources used: [list] / "no source — inference"
- Scope compliance: ✅ stayed within bounded context
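
A cheap structural check can reject outputs that omit the section before any Supervisor tokens are spent. The following sketch assumes the exact field labels from the template above; `missingSelfAnalysisFields` is a hypothetical helper.

```typescript
// Hypothetical sketch: verify an agent output carries the required self-analysis fields.
const REQUIRED_FIELDS = [
  "Confidence",
  "Key assumptions",
  "What could be wrong",
  "Sources used",
  "Scope compliance",
] as const;

// Returns the names of required fields missing from the "## Self-Analysis" section.
function missingSelfAnalysisFields(output: string): string[] {
  const heading = "## Self-Analysis";
  const start = output.indexOf(heading);
  if (start === -1) return [...REQUIRED_FIELDS];
  // The section runs until the next second-level heading or the end of the output.
  const rest = output.slice(start + heading.length);
  const end = rest.indexOf("\n## ");
  const section = end === -1 ? rest : rest.slice(0, end);
  return REQUIRED_FIELDS.filter((field) => !section.includes(`- ${field}:`));
}
```

This only checks that the fields exist; whether the self-analysis is honest remains a Supervisor judgment (§5.3).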

5.3 Supervisor Review Checklist

Lightweight pass (not a full re-analysis):

[ ] Claims traceable to sources?
[ ] Agent stayed within its bounded context?
[ ] Output consistent with existing knowledge base?
[ ] No obvious hallucinations or fabricated data?
[ ] Self-analysis seems honest (not rubber-stamped)?

Correction log format:

🔧 CORRECTED: [original → fixed] — reason
⚠ UNVERIFIED: [claim] — no source found, kept with flag
✅ VALIDATED: [n] items passed checklist
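
The "checklist first, deep analysis only on a flag" approach can be sketched as follows. The five booleans map one-to-one to the checklist items above; the `Checklist` type and both function names are hypothetical.

```typescript
// Hypothetical sketch: lightweight Supervisor pass that gates deep analysis.
interface Checklist {
  claimsTraceable: boolean;
  stayedInScope: boolean;
  consistentWithKnowledgeBase: boolean;
  noObviousHallucinations: boolean;
  selfAnalysisHonest: boolean;
}

// Lists the checklist items that did not pass.
function failedChecks(checklist: Checklist): (keyof Checklist)[] {
  return (Object.keys(checklist) as (keyof Checklist)[]).filter((key) => !checklist[key]);
}

// Deep (expensive) analysis is triggered only when at least one item fails.
function needsDeepAnalysis(checklist: Checklist): boolean {
  return failedChecks(checklist).length > 0;
}
```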

6. Output Format — Management Report Standard

Every report surfaced to the founder:

┌─ MANAGEMENT SUMMARY ──────────────────────────┐
│  1-3 sentences. Traffic light: 🟢 🟡 🔴       │
└────────────────────┬───────────────────────────┘
                     ▼
┌─ DECISION POINTS ─────────────────────────────┐
│  • Context · Options · AI recommendation      │
└────────────────────┬───────────────────────────┘
                     ▼
┌─ SUPERVISOR FINDINGS ─────────────────────────┐
│  🔧 Corrections · ⚠ Unverified · ✅ Validated │
└────────────────────┬───────────────────────────┘
                     ▼
┌─ PROPOSED TASKS ──────────────────────────────┐
│  New tasks this report generated (if any)      │
└────────────────────┬───────────────────────────┘
                     ▼
┌─ DETAILED OUTPUT (drill-down) ────────────────┐
│  Full work product · Agent self-analysis       │
│  Source citations · Collapsible sections       │
└────────────────────────────────────────────────┘

Scannability rules:

10-second rule: Status clear within 10 seconds
Traffic lights: 🟢 FYI only · 🟡 decisions needed · 🔴 blocker
Progressive disclosure: Each layer is optional to read
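
The traffic-light rule can be derived mechanically from the report's contents. In this sketch, letting a blocker take precedence over open decisions is an assumption, and `ReportStatus` and `trafficLight` are hypothetical names.

```typescript
// Hypothetical sketch: derive the summary traffic light from report state.
type TrafficLight = "🟢" | "🟡" | "🔴";

interface ReportStatus {
  hasBlocker: boolean;
  openDecisions: number; // decision points awaiting the founder
}

function trafficLight(status: ReportStatus): TrafficLight {
  if (status.hasBlocker) return "🔴";       // blocker outranks everything (assumed)
  if (status.openDecisions > 0) return "🟡"; // decisions needed
  return "🟢";                               // FYI only
}
```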

7. Blast Radius Classification

NOTE

This classification replaces the binary "irreversible" language. With git, code changes are always technically reversible, but their consequences may not be.

Class	Examples	Required Approval
Sandbox	Draft content, analysis, code in branch	Agent + Supervisor
Soft-reversible	Merge to main, publish blog draft, update docs	Founder approval
Hard-reversible	Deploy to production, change pricing page	Founder approval + cooldown (1h)
Irreversible consequences	Send customer email, financial transaction, legal filing	Founder approval + explicit confirmation
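
The table above can be read as a policy lookup followed by a gate. The sketch below makes two assumptions the spec does not state: the 1h cooldown starts at the moment of founder approval, and Supervisor approval is required for every class (per Principle P1). All type and function names are hypothetical.

```typescript
// Hypothetical sketch: blast radius class -> approval policy -> execution gate.
type BlastClass = "sandbox" | "soft-reversible" | "hard-reversible" | "irreversible";

interface ApprovalPolicy {
  founderApproval: boolean;
  cooldownMinutes: number;
  explicitConfirmation: boolean;
}

const POLICIES: Record<BlastClass, ApprovalPolicy> = {
  sandbox: { founderApproval: false, cooldownMinutes: 0, explicitConfirmation: false },
  "soft-reversible": { founderApproval: true, cooldownMinutes: 0, explicitConfirmation: false },
  "hard-reversible": { founderApproval: true, cooldownMinutes: 60, explicitConfirmation: false },
  irreversible: { founderApproval: true, cooldownMinutes: 0, explicitConfirmation: true },
};

interface ActionState {
  supervisorApproved: boolean;
  founderApprovedAt?: number; // epoch milliseconds; undefined if not yet approved
  explicitlyConfirmed: boolean;
}

// Returns true only when every requirement of the action's class is met.
function mayExecute(cls: BlastClass, state: ActionState, now: number): boolean {
  const policy = POLICIES[cls];
  if (!state.supervisorApproved) return false; // assumed baseline for all classes
  if (policy.founderApproval) {
    if (state.founderApprovedAt === undefined) return false;
    // Assumption: the cooldown is measured from the founder's approval.
    const cooldownMs = policy.cooldownMinutes * 60 * 1000;
    if (now - state.founderApprovedAt < cooldownMs) return false;
  }
  if (policy.explicitConfirmation && !state.explicitlyConfirmed) return false;
  return true;
}
```

Keeping the policy in a lookup table means adding a tier later changes data, not control flow.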

8. Escalation Tiers

Tier	Trigger	Handler	Founder sees
T0	Agent self-corrects	Work Agent	Logged in self-analysis
T1	Supervisor catches error	Supervisor	In Supervisor Findings
T2	Ambiguous / high-risk	Founder	As Decision Point
T3	Founder disagrees	Founder	Override in decision log

9. Acceptance Criteria

[ ] All agents operate within defined bounded contexts (§3.2)
[ ] Cross-domain communication happens only through typed events (§3.3)
[ ] Every agent output includes Critical Self-Analysis (§5.2)
[ ] Supervisor review follows the checklist approach (§5.3)
[ ] Supervisor corrections are always visible, never silently merged
[ ] Reports follow the Management Report Standard (§6)
[ ] Founder can drill from any summary to full detail
[ ] Proactive tasks appear on a single, filterable Task Board (§4.3)
[ ] Blast Radius classification governs approval requirements (§7)
[ ] The system logs all events and decisions in an immutable audit trail

10. Implementation Decision

Decision: Strategy A (Pure Paperclip + MCP bridge)
ADR: ADR-020
Date: 2026-04-17

Chosen Platform

Paperclip — an open-source, MIT-licensed, TypeScript-based agent orchestration platform. Deployed as a standalone service (own Postgres, own Docker container) communicating with Busflow via MCP (Model Context Protocol).

Spec-to-Implementation Mapping

Spec Section	Implementation
§2 Core Principles	Embedded in all agent system prompts
§3.2 Bounded Contexts	MCP tool assignment per agent (see MCP Agent Bridge Protocol)
§3.3 Event-Driven Comms	Paperclip ticket system (different model, same intent)
§3.5 Orchestrator/Supervisor	Paperclip merged model (heartbeats + approval gates)
§4 Proactive Tasks	Agents propose via Paperclip's ticket system
§5.2 Self-Review	Enforced via system prompts (confidence, assumptions, sources)
§5.3 Supervisor Review	Paperclip approval gates; evolve to LLM checklist if needed
§6 Management Reports	Agent output format enforced via prompts; evolve based on real usage
§7 Blast Radius	Binary approval gates (sufficient for solo founder); add tiers if needed
§8 Escalation	Paperclip's audit trail + approval workflow
§9 Acceptance Criteria	Quality bar for evaluating when Strategy A is "enough" vs. when to evolve

Progressive Evolution Path

Strategy A (now)  ──►  Strategy D (if needed)  ──►  Strategy C (if needed)
Pure Paperclip         Fork + select additions       Full custom build
+ MCP bridge           (supervisor, reports,          (native Hasura/NestJS
                        blast radius plugins)          integration)

NOTE

The spec remains the canonical governance reference regardless of implementation strategy. Paperclip is the runtime; this document defines the principles.
