Coordination Intelligence

coordination patterns

98 claims from 32 organizations

How agents share information, synchronize work, and avoid conflicts. These patterns describe the communication architecture: shared state files, message buses, escalation flows, and data handoff protocols.

Acme Digital Agency Founding gold
C008 MEDIUM INFERENCE 2x efficiency

Agents coordinate through shared state files, not direct messaging.

Why: Direct messaging creates hidden dependencies.

Failure mode: Unlogged message causes invisible failure.
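
A minimal sketch of what file-based coordination can look like, assuming JSON state files in a shared directory; the directory, field names, and agent names are illustrative, not any organization's actual schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_DIR = Path("shared_state")  # hypothetical shared directory


def write_state(agent: str, payload: dict) -> Path:
    """Publish an agent's output as a state file instead of messaging another agent."""
    STATE_DIR.mkdir(exist_ok=True)
    record = {
        "agent": agent,
        "updated_at": datetime.now(timezone.utc).isoformat(),
        "data": payload,
    }
    path = STATE_DIR / f"{agent}.json"
    path.write_text(json.dumps(record, indent=2))
    return path


def read_state(agent: str) -> dict | None:
    """Any other agent reads the same file; the handoff is visible on disk, not hidden in a DM."""
    path = STATE_DIR / f"{agent}.json"
    return json.loads(path.read_text()) if path.exists() else None


# The spend-pacing agent publishes; the briefing agent reads it later.
write_state("spend_pacing", {"client": "client-001", "pace": 1.12, "alert": False})
print(read_state("spend_pacing"))
```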

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Cross-agent workflows use pub-sub through shared state.

Why: Pub-sub decouples agents.

Failure mode: Tightly coupled agents cascade failures.

C006 HIGH OBSERVED REPEATEDLY 7x efficiency

The briefing agent compiles its output from shared state files in a fixed order: spend pacing first, then alerts, then pipeline, then creative status. The order is the same every day.

Why: Random ordering caused the founder to miss urgent items buried in the middle of the briefing. When spend pacing alerts always come first, the founder scans the top section in 30 seconds and knows if any account needs immediate action. Creative status at the bottom is read during a different workflow (creative review meeting). Fixed order matches the founder's decision-making cadence.

Failure mode: Variable briefing structure forces the reader to hunt for urgent items. Information architecture should match the reader's priority hierarchy.

C007 HIGH OBSERVED ONCE 5x efficiency

When a Claude analysis agent and a GPT creative agent work on the same client simultaneously, the analysis must complete and write to shared state before the creative agent reads it. No parallel execution on the same client.

Why: The ad copy generator read shared-state data for a healthcare client while the analysis agent was still writing it. The analysis was half-complete -- it had processed 3 of 7 campaigns. The copy generator used the partial data to produce "new angle" copy that leaned into the wrong service line. The 3 campaigns processed first were all for Botox. The remaining 4 were weight management, which was the client's priority. The generated copy was 100% Botox-focused.

Failure mode: Parallel execution across model boundaries creates race conditions on shared state. Partial data produces skewed output.
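
One way to enforce the write-before-read rule is to mark the shared-state record complete only once the full analysis is written, and have the reading agent refuse partial records. A minimal sketch under that assumption; the file path, field names, and polling interval are illustrative.

```python
import json
import time
from pathlib import Path

STATE = Path("shared_state/client_analysis.json")  # hypothetical path


def write_analysis(campaign_results: list[dict], total_campaigns: int) -> None:
    """Mark the record complete only once every campaign has been processed."""
    STATE.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "campaigns": campaign_results,
        "expected": total_campaigns,
        "complete": len(campaign_results) == total_campaigns,
    }
    tmp = STATE.with_suffix(".tmp")
    tmp.write_text(json.dumps(record))
    tmp.replace(STATE)  # atomic rename: readers never see a half-written file


def read_analysis(timeout_s: float = 600, poll_s: float = 5) -> dict:
    """The creative agent waits for a complete record instead of racing the analysis agent."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if STATE.exists():
            record = json.loads(STATE.read_text())
            if record.get("complete"):
                return record
        time.sleep(poll_s)
    raise TimeoutError("analysis incomplete; refusing to generate copy from partial data")
```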

C007 HIGH OBSERVED REPEATEDLY 7x efficiency

The ads agent and lead distribution agent share a real-time feed. When the ads agent launches or modifies a campaign for a location, the lead distribution agent pre-allocates capacity and alerts the location's front desk.

Why: A campaign launch without staffing preparation wastes the initial surge of leads. The first 48 hours of a campaign produce 60% of its total leads.

Failure mode: A Meta campaign went live for the Atlanta location on a Friday afternoon. Front desk staff had already gone home. 22 leads came in over the weekend with zero contact. Monday follow-up converted only 3 (14%). Similar campaigns with pre-staged staff convert at 35-40%.

C008 HIGH OBSERVED ONCE 5x efficiency

The class occupancy agent and trainer scheduling agent must sync daily. Class additions or cancellations require both agents to agree: occupancy justifies the class AND a qualified trainer is available.

Why: Adding a class based on demand data alone, without confirming trainer availability, creates a promise the location cannot keep.

Failure mode: Occupancy agent recommended adding a 5:30 PM HIIT class at the Dallas location based on waitlist data. The recommendation was approved. No trainer was available for that slot. The class was posted, 8 members signed up, and the class was cancelled 2 hours before start time. Three members posted negative reviews.

C009 HIGH OBSERVED REPEATEDLY 7x efficiency

Corporate communications agent must check all active location-specific promotions before sending any network-wide announcement. Conflicting promotions are flagged and held until resolved.

Why: Network-wide messages that contradict local promos confuse members and create refund requests.

Failure mode: See C001. The "First Month Free" vs "50% Off" conflict originated because the corporate comms agent and ads agent did not share a promotion calendar. Now they do.

C007 HIGH OBSERVED ONCE 5x efficiency

Intake agent writes a structured brief to Notion. Shot list agent reads from that brief. No direct agent-to-agent communication outside the shared Notion workspace.

Why: When agents communicated directly via Slack threads, the creative team couldn't see what was happening. They felt surveilled rather than supported.

Failure mode: Two agents had a 14-message Slack thread about a project scope that the lead designer wasn't tagged on. He found it later and said: "So the robots are planning my project without me now?" Trust reset took 2 weeks.

C008 HIGH OBSERVED REPEATEDLY 7x efficiency

All agent outputs are posted to the project's Notion page, never to general Slack channels. Slack is for human conversation only.

Why: The team uses Slack for creative banter, mood boards, and spontaneous ideas. Agent messages in those channels killed the vibe.

Failure mode: Timeline agent posted a deadline reminder in #general during a creative brainstorm thread. Three people said "this is annoying" within minutes. Marcus moved all agent output to Notion that day.

C018 HIGH OBSERVED REPEATEDLY 7x efficiency

End-of-week agent summary compiles all active project statuses into a single Notion dashboard view. Marcus reviews every Friday before sending weekly client updates.

Why: Clients expect consistent weekly updates. Without a compiled view, Marcus was spending 90 minutes every Friday manually assembling status across projects.

Failure mode: Before the compiled view, Marcus missed sending a weekly update to a $22K project client for 2 consecutive weeks. Client emailed: "Are we still a priority?" Took a 30-minute call to reassure them. The compiled view reduced Friday prep from 90 minutes to 15.

Atticus Legal bronze
C008 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Priya reviews a compiled daily summary at 8 AM: new intakes processed, documents assembled and awaiting review, follow-ups scheduled, and upcoming signing appointments. Beth reviews the same summary for her task list.

Why: As a solo attorney, Priya's time allocation is the bottleneck. Without the summary, she was checking three separate agent outputs, her email, and Clio before starting work. The compiled summary saves 20 minutes each morning and ensures nothing falls through the cracks.

Failure mode: Without the summary, Priya starts her day with the most urgent-seeming task rather than the most important one. Preparation for a signing appointment gets deprioritized because a new intake feels more urgent. The client arrives for the signing and the documents are not ready. Rescheduling costs the client a half-day of work.

C009 HIGH OBSERVED ONCE 5x efficiency

When the intake agent classifies a case as STANDARD, it writes the structured client data to a shared file. The assembly agent reads this file to begin document preparation. The handoff is file-based, not memory-based.

Why: In week 2, the assembly agent "remembered" a client's information from a prior session instead of reading the current file. It assembled documents using the prior client's address (same street name, different city). The error was caught during Priya's review, but it demonstrated why file-based handoffs are non-negotiable.

Failure mode: Assembly agent uses stale client data from a prior session. Trust lists the wrong city. Client signs without catching it. When the trust is administered, a title company flags the address discrepancy. Trust amendment required. If the client has passed, the amendment process becomes significantly more complicated.

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Progress tracking feeds into parent communication as raw data only. The communication agent never embellishes, summarizes, or reframes progress data.

Why: Any transformation of data between agents creates opportunities for error. The communication agent formats but does not interpret.

Failure mode: Progress agent recorded "completed 3 of 5 reading passages." Communication agent drafted: "Your child completed most of the assigned reading this week." Parent interpreted "most" as 4 of 5 and praised the child for the wrong thing. Minor, but Keisha's credibility depends on precision.

C010 HIGH OBSERVED REPEATEDLY 7x efficiency

Scheduling changes trigger an automatic draft notification to affected parents, but the draft sits in Keisha's queue until she sends it.

Why: Parents need to know about schedule changes immediately, but the notification must be accurate and human-reviewed first.

Failure mode: Before this system, a schedule change on Monday wasn't communicated until Wednesday because Keisha forgot. The family missed Tuesday's session and was billed for it. Keisha refunded the session ($65) and added a handwritten apology note. The draft queue ensures changes surface immediately for review.

Candor Labs bronze
C005 HIGH MEASURED RESULT 10x efficiency

All three agents write to separate state files. The founder's morning check reads all three in a fixed 90-second ritual: support first (are users blocked?), code review second (are PRs waiting?), release notes third (is a changelog draft ready?).

Why: When agents posted to Slack individually, the founder was checking three channels plus his own notifications. Context-switching between agent outputs throughout the day fragmented his coding blocks. After consolidating into a single morning review, deep work blocks went from an average of 47 minutes to 2 hours 15 minutes (measured over 3 weeks via Toggl).

Failure mode: Distributed agent notifications fragment the solo founder's focus. Each interruption costs 15-25 minutes of re-immersion. Coding quality degrades.

C006 MEDIUM INFERENCE 2x efficiency

Agents do not communicate with each other. There is no inter-agent message bus. The founder is the sole coordinator.

Why: With 3 agents and 1 human, adding agent-to-agent communication creates complexity that exceeds the coordination benefit. The founder considered having the support triage agent send high-priority bugs to the code review agent for immediate analysis. But the founder IS the coordination layer. Adding a machine coordination layer when the human layer is a single person with full context creates a shadow decision-making process the founder can't audit in real time.

Failure mode: Agent-to-agent coordination at solo scale creates invisible workflows. The founder discovers decisions were made (or context was shared) without their knowledge. Control degrades.

C008 HIGH OBSERVED ONCE 5x efficiency

When a new engagement kicks off, Brief must generate a "context packet" that is distributed to all agents that will touch that engagement. The packet includes: client name, industry, engagement scope, confidentiality tier, key contacts, and any conflict-of-interest flags.

Why: Without a shared context initialization, agents operate with partial information. Lens might research the wrong subsidiary. Archer might reference a case study from a competitor.

Failure mode: New engagement for Barrett Holdings started without a context packet. Lens researched Barrett Industries (different company, similar name). Two days of research wasted. The error wasn't caught until the consultant reviewed the first deliverable draft.
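
A sketch of the context packet as a typed record, using the fields listed in the claim; the class name, value types, and example values are assumptions, not Candor Labs' actual format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ContextPacket:
    """Fields taken from the claim above; the class itself is illustrative."""
    client_name: str
    industry: str
    engagement_scope: str
    confidentiality_tier: str
    key_contacts: list[str] = field(default_factory=list)
    conflict_flags: list[str] = field(default_factory=list)


def distribute(packet: ContextPacket, agents: list[str]) -> dict[str, str]:
    """Every agent touching the engagement receives the same serialized packet."""
    payload = json.dumps(asdict(packet), indent=2)
    return {agent: payload for agent in agents}


packet = ContextPacket(
    client_name="Barrett Holdings",       # not Barrett Industries
    industry="(example industry)",
    engagement_scope="(example scope)",
    confidentiality_tier="standard",
    key_contacts=["(example contact)"],
)
distribute(packet, ["lens", "archer", "recon"])
```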

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Tock feeds utilization data to Brief every Monday morning. Brief incorporates utilization into meeting prep: if a consultant is above 85% utilization, flag it in their meeting prep as a capacity risk.

Why: Overloaded consultants cut corners. If Brief knows a consultant is stretched, it can flag the risk before the consultant walks into a client meeting and overpromises on timelines.

Failure mode: Senior consultant at 94% utilization committed to a 2-week deliverable turnaround in a client meeting. The deliverable took 5 weeks. Client escalated to the managing partner.

C010 MEDIUM OBSERVED ONCE 3x efficiency

Recon's competitive intelligence outputs must be tagged with source provenance: PUBLIC (press releases, filings, published reports), INFERRED (analysis derived from public data), or PROPRIETARY (from paid databases). Proprietary-sourced intel must not appear in client deliverables without license verification.

Why: Sharing proprietary database content in client deliverables can violate data licensing agreements and expose the firm to legal liability.

Failure mode: Recon included IBISWorld industry data verbatim in a client-facing market sizing report. The client published excerpts in their board materials. IBISWorld flagged the unauthorized redistribution. We settled for $8,500.
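
A minimal sketch of provenance gating, assuming each intel item carries a provenance tag and a license-verification flag; both field names are illustrative.

```python
from enum import Enum


class Provenance(Enum):
    PUBLIC = "public"             # press releases, filings, published reports
    INFERRED = "inferred"         # analysis derived from public data
    PROPRIETARY = "proprietary"   # paid databases; redistribution needs a verified license


def deliverable_safe(item: dict) -> bool:
    """True if an intel item may appear in a client deliverable.
    `item` is an illustrative record with 'provenance' and 'license_verified' keys."""
    if Provenance(item["provenance"]) is Provenance.PROPRIETARY:
        return bool(item.get("license_verified", False))
    return True


assert deliverable_safe({"provenance": "public"})
assert not deliverable_safe({"provenance": "proprietary"})
assert deliverable_safe({"provenance": "proprietary", "license_verified": True})
```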

C005 HIGH OBSERVED REPEATEDLY 7x efficiency

Retention agent must cross-reference billing events (upgrades, plan changes, payment failures) from the last 7 days before classifying any member as at-risk.

Why: A member who just upgraded is the opposite of at-risk. A member whose payment failed may appear inactive but has a billing issue, not a retention issue.

Failure mode: See C002. The upgrade/at-risk collision was the single most embarrassing agent failure in CoreFit's history.

C006 HIGH OBSERVED ONCE 5x efficiency

Social media agent reads the class schedule and retention alerts before generating content. Posts must reflect current reality -- no promoting a class that was just cancelled, no "Join us this Saturday!" when the location is closed for maintenance.

Why: Social posts are public and permanent. A post promoting a cancelled class generates confused DMs and makes the brand look disorganized.

Failure mode: Social agent posted a "HIIT Marathon Saturday!" graphic for the downtown location. That location had cancelled Saturday HIIT two weeks prior due to low attendance. 11 people showed up to nothing.

C016 MEDIUM OBSERVED ONCE 3x efficiency

Lead nurture agent and social media agent share a content calendar. No nurture email contradicts or duplicates a social post within the same 48-hour window.

Why: Prospects who follow on social AND receive emails notice when messaging conflicts. It signals that nobody is paying attention.

Failure mode: Social posted "20% off first month!" while nurture sent "Free first week, no commitment!" to the same audience segment on the same day. Prospect screenshot both and asked which one was real.

DevForge silver
C008 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Issue triage feeds into PR review prep. When a PR references an issue, the review checklist includes the original issue requirements so Kai can verify the PR actually solves the reported problem.

Why: Contributors sometimes fix a symptom without addressing the root cause. Cross-referencing the original issue catches this.

Failure mode: A PR claimed to fix issue #312 (race condition in concurrent writes). The code change fixed one code path but not the underlying race. Without the issue cross-reference, Kai would have merged it. The triage-to-review pipeline caught that the original reporter described 3 scenarios but the PR only addressed 1.

C009 HIGH OBSERVED REPEATEDLY 7x efficiency

Merged PRs automatically trigger docs generation agent to check if documentation needs updating. The agent drafts doc changes and links them to the original PR in Linear.

Why: Documentation debt accumulates invisibly. By the time anyone notices, 20 features are undocumented and the docs site is stale.

Failure mode: Before automatic triggering, docs lagged features by an average of 3 weeks. An enterprise customer emailed: "Your changelog says feature X shipped in v2.4 but I can't find it in the docs." It had shipped 4 releases ago. The customer's team spent 2 hours figuring out the feature from source code instead of docs.

C010 HIGH OBSERVED REPEATEDLY 7x efficiency

Discord monitoring surfaces recurring questions and routes them to the docs agent as documentation gaps. If the same question is asked 3+ times in 30 days, it becomes a docs priority.

Why: Repeated questions in Discord are a documentation failure, not a community support success.

Failure mode: "How do I configure custom middleware?" was asked 11 times in a single month on Discord. The answer existed in a blog post from 8 months ago but not in the official docs. Each Discord answer took Kai 5-10 minutes. Total: ~2 hours spent answering the same question that should have been documented. After the 3-question rule, the docs gap was filled and Discord questions on that topic dropped to zero.

C007 HIGH OBSERVED REPEATEDLY 7x efficiency

When the API monitoring agent detects a Plaid outage, it simultaneously notifies the support triage agent. The triage agent auto-tags all incoming tickets mentioning "bank connection," "sync," or "balance" as "Known Issue - Plaid Outage" and responds with a templated status message.

Why: During Plaid outages, support ticket volume spikes 8-12x. Without auto-tagging, the support team spends hours triaging tickets that all have the same root cause.

Failure mode: During the 3-hour undetected Plaid outage (see C006), 23 tickets came in. The support team spent the next morning individually diagnosing each one before realizing they were all the same issue. Total wasted time: 4.5 hours across 2 support agents.

C008 MEDIUM OBSERVED ONCE 3x efficiency

The weekly metrics agent and churn prediction agent share a data pipeline. Churn predictions feed into the weekly metrics report as a "Retention Risk" section. The metrics agent contextualizes churn predictions against actual retention numbers, preventing alarm fatigue.

Why: Churn predictions in isolation create panic. Churn predictions alongside actual retention data create informed decisions.

Failure mode: Before integration, the churn agent reported "287 users at high risk" in the same week the metrics agent reported "98.2% 30-day retention." Raj panicked about the 287 number. In context, 287 out of 8,000 users at "high risk" with 98.2% actual retention meant the model was overpredicting. The false alarm cost Raj a weekend of unnecessary strategy sessions.

C009 MEDIUM OBSERVED ONCE 3x efficiency

The content agent reads the compliance agent's current regulatory watchlist before drafting any content. Topics on the watchlist (currently: crypto, investment advice, credit scoring, BNPL) require Maya's pre-approval of the topic itself before any drafting begins.

Why: Some topics are regulatory minefields. Drafting a full blog post only to have compliance reject the topic wastes everyone's time.

Failure mode: The content agent drafted a 1,500-word guide titled "Using Greenline to Track Your Crypto Portfolio." Maya rejected it outright -- Greenline is not licensed to provide crypto-related financial services in 3 of its operating states. 6 hours of content work discarded.

C008 HIGH OBSERVED REPEATEDLY 7x efficiency

The deadline agent generates a daily report at 5 AM listing all deadlines within 90 days, sorted by urgency. Items within 14 days are marked CRITICAL. Items within 30 days are marked WATCH. The report goes to Marcus and the paralegal team.

Why: Before the agent, deadline tracking was a shared spreadsheet updated by paralegals. Two near-misses in one quarter prompted the AI implementation. The 90-day window catches items that are far enough out to plan but close enough to matter.

Failure mode: Without the daily report, deadlines are tracked ad hoc. A paralegal goes on vacation. Her cases are redistributed but two deadlines are not transferred to the coverage paralegal's tracking sheet. Both deadlines pass. One is a discovery cutoff. Evidence is excluded. Case value drops by an estimated $75K.
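
The tiering reduces to a couple of date comparisons; a sketch assuming the 14/30/90-day thresholds from the claim (the OVERDUE and UPCOMING labels are added for illustration).

```python
from datetime import date, timedelta


def deadline_tier(deadline: date, today: date) -> str | None:
    """14/30/90-day thresholds from the claim; None means outside the reporting window."""
    days_out = (deadline - today).days
    if days_out < 0:
        return "OVERDUE"      # illustrative extra tier for anything already missed
    if days_out <= 14:
        return "CRITICAL"
    if days_out <= 30:
        return "WATCH"
    if days_out <= 90:
        return "UPCOMING"     # listed in the report, below WATCH
    return None


today = date.today()
for offset in (7, 21, 60, 120):
    print(offset, deadline_tier(today + timedelta(days=offset), today))
```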

C009 MEDIUM MEASURED RESULT 6x efficiency

When the intake agent classifies a new case as STRONG, it triggers the comms agent to schedule an initial consultation within 48 hours. The trigger is a state file update, not a direct agent-to-agent call.

Why: Speed matters in PI intake. A potential client who waits 5 days for a consultation is likely shopping competitors. Before the trigger, average time-to-consultation was 4.2 days. After implementation, it dropped to 1.8 days. The firm attributes 3 new signed cases in the first month to faster response.

Failure mode: STRONG case classified on Friday afternoon. Without the automated trigger, the scheduling request sits in an email until Monday. Client calls two other firms over the weekend. By Monday, client has retained a competitor.

C010 MEDIUM OBSERVED REPEATEDLY 4x efficiency

All four agents write status updates to individual state files. Marcus reviews a compiled summary at 7 AM before the daily team huddle at 7:30 AM. Stale state files (not updated in 24 hours) are flagged at the top.

Why: Marcus needs 15 minutes of context before the huddle. Reading four separate agent outputs takes 25 minutes. The compiled summary takes 8 minutes and highlights only items requiring action.

Failure mode: Without compilation, Marcus skims agent outputs unevenly. Misses a WATCH deadline that should have been discussed at the huddle. Paralegal does not learn about the deadline until the next day. One day of planning time lost.

C008 MEDIUM OBSERVED REPEATEDLY 4x efficiency

All five agents write daily status updates to individual state files by 6 AM. Rachel reviews a compiled summary at 7 AM. Items requiring her attention are separated into COMPLIANCE (descriptions needing review), OPPORTUNITY (hot leads), and OPERATIONS (scheduling, reporting).

Why: Rachel manages 8 agents and 45 listings. Without categorized compilation, she spent 40 minutes each morning reading five agent outputs. The categorized summary takes 12 minutes and ensures compliance items are seen first.

Failure mode: Without compilation, compliance items (listing descriptions needing review) get buried behind operational noise. Description sits unreviewed for 48 hours. Listing agent posts it manually without approval to meet an MLS deadline. Unapproved description goes live.

C009 MEDIUM MEASURED RESULT 6x efficiency

When the qualifier identifies a STRONG lead (pre-approved, timeline under 60 days, price range matches active listings), it writes to the lead state file. The scheduler reads the file and suggests first-showing routes within 4 hours. The listing agents are notified via Slack.

Why: Speed to first showing correlates strongly with conversion. Before the automated pipeline, average time from qualification to first showing was 3.2 days. After implementation: 18 hours. Two agents reported that clients specifically mentioned the fast response as a factor in choosing Keystone over competitors.

Failure mode: STRONG lead qualified on Friday afternoon. Without automated handoff, the lead file sits until Monday. Lead attends open houses with two other brokerages over the weekend. By Monday, the lead has an agent. Lost GCI: estimated $7,275 (buyer's agent commission on a $485K home).

C010 HIGH MEASURED RESULT 10x efficiency

The comp agent refreshes market data every Monday. Comps older than 14 days are flagged as STALE in the system. Seller reports reference only fresh comps. If no fresh comps are available, the report says "insufficient recent comparable sales" rather than using stale data.

Why: The Denver market moved 2.4% in a single month during spring 2026. A 30-day-old comp set was materially misleading. A seller report using stale comps showed her home as overpriced by 4% when it was actually priced within 1.5% of current market. She panicked and called Rachel demanding a price reduction.

Failure mode: Stale comps show the market has moved when it has not (or vice versa). Seller makes pricing decisions based on outdated data. In a fast market, 30-day-old comps can be $15K-$25K off. Seller either overprices and sits, or underprices and leaves money on the table.
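
The freshness rule is easy to state mechanically; a sketch assuming each comp record carries a sold date and the 14-day window from the claim.

```python
from datetime import date

MAX_COMP_AGE_DAYS = 14  # comps older than this are STALE


def fresh_comps(comps: list[dict], today: date) -> list[dict]:
    """Keep only comps sold within the freshness window; `sold_date` is an illustrative field."""
    return [c for c in comps if (today - c["sold_date"]).days <= MAX_COMP_AGE_DAYS]


def comp_section(comps: list[dict], today: date) -> str:
    usable = fresh_comps(comps, today)
    if not usable:
        # Prefer an explicit gap over quietly reusing stale data.
        return "Insufficient recent comparable sales."
    return f"{len(usable)} comparable sales within the last {MAX_COMP_AGE_DAYS} days."


print(comp_section([{"sold_date": date.today()}], date.today()))
```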

KGORG Founding silver
C006 HIGH HUMAN DEFINED RULE 5x efficiency

Delegation is normal for orchestrator seats when specialization, tool access, or parallelism improves the outcome, but delegation should never be lazy or create unnecessary fragmentation.

Why: The command layer is intended to route work intelligently, not hoard tasks or create delegation for its own sake.

Failure mode: Poor delegation either overloads top-level agents or creates excessive handoffs that slow work and obscure ownership.

C007 HIGH HUMAN DEFINED RULE 5x efficiency

Every delegation should pass objective, relevant context, constraints or preferences, expected output format, and definition of done.

Why: KGORG relies on handoff quality to keep multi-agent workflows coherent and reduce rework.

Failure mode: Specialists receive vague requests, return mismatched outputs, or require the human principal to restate context that should have traveled with the task.

C012 MEDIUM INFERENCE 2x efficiency

The current live operating core is intentionally small: one strategic orchestrator, one operational orchestrator, one communications and scheduling specialist, and one knowledge maintenance clockwork.

Why: KGORG appears to be growing through a staged seat architecture rather than activating a large workforce all at once.

Failure mode: Prematurely populating many seats without clear demand would increase governance burden and reduce clarity without improving outcomes.

C013 HIGH HUMAN DEFINED RULE 5x efficiency

Currently, the setup has 4 active agents, with 3 serving as the core day-to-day operating team: a Chief of Staff, an Executive Assistant, and an Email/Calendar specialist, plus a background knowledge-maintenance automation agent.

Why: This reflects the present balance between direct operational support and infrastructure support inside KGORG.

Failure mode: If the operating core is described inaccurately, outsiders and future builders may misunderstand what is actually active versus what is only planned.

C014 HIGH HUMAN DEFINED RULE 5x efficiency

The broader system is intentionally designed with a bench of planned specialist seats for planning, research, travel, creative work, and personal growth, but these are not treated as active simply because the seats exist in draft.

Why: KGORG distinguishes between active operators and designed future capacity, which keeps the organization legible and honest about what is truly running.

Failure mode: Conflating planned seats with active agents leads to inflated claims, weaker governance, and confusion about actual execution coverage.

C009 MEDIUM OBSERVED ONCE 3x efficiency

Flow's utilization data must feed into Grid's scheduling proposals. If Flow identifies that Tuesday afternoons at Buckhead are consistently at 40% utilization (versus 85% target), Grid must factor this into staff scheduling: either reduce staffing or propose marketing initiatives to fill the gap (flagged to Beacon).

Why: Scheduling optimization and staff scheduling are two sides of the same problem. Optimizing appointments without adjusting staff levels (or vice versa) produces either overstaffed slow periods or understaffed peaks.

Failure mode: Flow identified that Roswell's Friday afternoons averaged 35% utilization for 6 consecutive weeks. Grid continued scheduling full staff (4 PTs) for Friday afternoons because it didn't receive Flow's utilization data. At an average PT hourly cost of $48, the overstaffing cost approximately $1,150 over those 6 weeks. After connecting Flow to Grid, Friday afternoon staffing was reduced to 2 PTs with the other 2 shifted to Monday mornings (92% utilization, consistently overbooked).

C010 HIGH OBSERVED ONCE 5x efficiency

Shield must share payer denial trend data with the clinic director monthly. If a specific payer's denial rate increases by more than 10 percentage points in a 30-day period, Shield must flag it immediately in #billing-alerts Slack channel with the payer name, denial rate change, and top denial reason codes.

Why: Insurance payer behavior changes affect cash flow directly. A payer tightening authorization requirements or changing coverage policies can shift denial rates within weeks. Early detection allows the practice to adjust verification procedures before a backlog of denied claims accumulates.

Failure mode: A regional Blue Cross plan changed its PT visit authorization policy from 30-visit blocks to 12-visit blocks with mandatory re-authorization. Shield wasn't monitoring denial rate trends. Denials for that payer jumped from 8% to 31% over 3 weeks. The practice didn't catch it until the monthly billing review. By then, 23 claims totaling $4,600 had been denied. Most were recoverable with re-authorization, but the cash flow impact was felt for 45 days.

C011 MEDIUM OBSERVED ONCE 3x efficiency

Beacon must coordinate with Flow before publishing any marketing content that promotes specific appointment availability. If Beacon advertises "same-day appointments available," Flow must confirm that same-day availability actually exists at the promoted location. Advertising availability that doesn't exist drives frustrated phone calls and negative first impressions.

Why: Marketing promises create patient expectations. A new patient who sees "walk-ins welcome" on social media and arrives to a 2-hour wait will not return. The disconnect between marketing promises and operational reality is more damaging than no marketing at all.

Failure mode: Beacon published a Google Ads campaign promoting "same-week new patient appointments at all 3 locations." Buckhead's next available new patient slot was 11 days out. Three prospective patients called Buckhead referencing the ad and were told about the wait. Two chose competitors. The ad was paused after 4 days, but $340 in ad spend had already been consumed driving leads to a clinic that couldn't serve them.

C008 MEDIUM OBSERVED REPEATEDLY 4x efficiency

All four agents write status updates to their state files by 6 AM. Corinne reviews a compiled summary at 7 AM before her 8 AM start. The summary prioritizes: EMERGENCY items first, then lease expirations within 60 days, then outstanding vendor work orders, then routine maintenance backlog.

Why: Corinne's time is the scarcest resource. She manages 120 units, 3 vendors, and 40+ tenant interactions per week. Without prioritized compilation, she was spending 30 minutes reading four agent outputs and still missing time-sensitive items. The compiled summary takes 10 minutes and ensures she acts on the most important items first.

Failure mode: Without prioritized compilation, Corinne starts her day responding to the most recent tenant message instead of the most critical one. A lease expiring in 7 days gets less attention than a routine maintenance request that came in at 6:45 AM. Lease auto-converts to month-to-month. Tenant leaves 30 days later. Vacancy cost: $1,350 plus turnover.

C009 HIGH OBSERVED ONCE 5x efficiency

When the triage agent classifies a request as EMERGENCY, it simultaneously notifies the vendor agent (for dispatch) and the comms agent (for tenant acknowledgment). Both agents receive the same classification and the same ticket number. Corinne is notified via Slack with a one-line summary.

Why: Before parallel notification, the comms agent would acknowledge the tenant before the vendor agent dispatched. Tenant received "we're aware of the issue" but no one had actually been dispatched yet. During an actual burst pipe incident, the tenant waited 40 minutes after acknowledgment before a plumber was contacted. Water damage worsened from $800 to $2,100 during the delay.

Failure mode: Sequential notification: comms acknowledges, then vendor dispatches. 40-minute gap between "we're on it" and actually being on it. Water damage compounds at approximately $50/minute in an active flood situation. Every minute of delay increases repair cost and tenant displacement risk.

C010 MEDIUM MEASURED RESULT 6x efficiency

The lease renewal agent begins surfacing renewal data 90 days before lease expiration. At 60 days, it escalates to DECISION NEEDED. At 45 days, if no decision has been made, it alerts Mark directly via Slack. Virginia law requires 30-day notice for non-renewal, so the 45-day alert is the last safe decision point.

Why: Two leases expired without renewal offers in Q1 because Corinne was handling a burst pipe crisis across two buildings. Both tenants converted to month-to-month. One left 45 days later. The 90/60/45 ladder ensures lease renewals get attention even when Corinne is buried in crisis management.

Failure mode: Lease expires without a renewal conversation. Tenant converts to month-to-month with no obligation to stay. Tenant leaves with 30-day notice during peak vacancy season (December-January in Richmond). Unit sits vacant for 35 days. Lost rent: $1,575. Turn costs: $2,100. Total: $3,675 that a timely renewal conversation would have prevented.
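
A sketch of the 90/60/45 ladder as a pure function of days remaining; the returned action labels are illustrative, not the agent's real routing names.

```python
from datetime import date


def renewal_action(lease_end: date, today: date, decision_made: bool) -> str | None:
    """90/60/45-day thresholds from the claim; the action labels are illustrative."""
    days_left = (lease_end - today).days
    if days_left > 90:
        return None
    if days_left <= 45 and not decision_made:
        return "ALERT_OWNER_DIRECTLY"   # last safe point: 30-day notice is still possible
    if days_left <= 60 and not decision_made:
        return "DECISION_NEEDED"
    return "SURFACE_RENEWAL_DATA"


assert renewal_action(date(2026, 6, 1), date(2026, 1, 1), False) is None
assert renewal_action(date(2026, 3, 1), date(2026, 1, 20), False) == "ALERT_OWNER_DIRECTLY"
```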

Learnwell silver
C009 MEDIUM OBSERVED ONCE 3x efficiency

Content QA findings feed into engagement metrics. If a study guide is pulled for errors, engagement data for the period it was live is flagged as potentially contaminated.

Why: Students who studied incorrect material and performed poorly on exams may show "low engagement" afterward -- not because the platform is failing but because they lost trust in that specific content area.

Failure mode: After the AP History incident, engagement in history study guides dropped 35%. The engagement agent flagged it as "declining interest in history content" and recommended creating more history content. The actual cause was trust erosion from the factual error. Priya wasted a week commissioning new history content that nobody wanted.

C010 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Support triage data feeds into onboarding optimization. If new users report the same confusion in their first week, onboarding agent adjusts the tutorial flow.

Why: New users who contact support in week 1 have a 3x higher churn rate than those who don't. Proactive tutorial adjustments prevent the support ticket from ever being filed.

Failure mode: 40 new students in a single cohort all filed tickets asking "how do I share a study guide with my study group?" The feature existed but was buried in settings. Support triage answered each ticket individually (4 hours of response time) instead of feeding the pattern to onboarding. The onboarding agent would have added a tooltip in the first-login flow, preventing all 40 tickets.

C011 HIGH OBSERVED ONCE 5x efficiency

Teacher outreach cadence is informed by engagement metrics. Teachers whose students show declining engagement get a proactive check-in from Priya (via agent draft) before the teacher notices and churns.

Why: Teachers who assigned Learnwell and see declining student usage feel like the platform failed them. Proactive outreach reframes the narrative: "We noticed and we're working on it."

Failure mode: A teacher with 85 students saw usage drop from 70% to 30% over 3 weeks. No outreach happened. The teacher switched to a competitor and posted in a teacher forum: "Learnwell's engagement tanked and nobody from their team reached out." 3 other teachers in the thread said they were also considering switching. Priya lost 4 teacher accounts (estimated 200 students) in a single week.

McFadyen Digital Founding silver
C010 HIGH OBSERVED REPEATEDLY 7x efficiency

Timezone handoffs between delivery centers (Virginia, Brazil, India) must include an AI-generated handoff summary posted to the project's Slack channel at the end of each team's working day. The summary includes: work completed, blockers encountered, decisions needed, and next priorities.

Why: We lost 2-3 days per sprint in "context reconstruction" where the next timezone team had to read through Jira comments and Slack threads to figure out where things stood. The handoff summaries cut this to under 30 minutes.

Failure mode: Teams started relying on the AI summary and stopped updating Jira tickets directly. When the summarizer hallucinated a "completed" status for a task that was actually blocked, the downstream team built on top of broken code. We now require Jira status to be the source of truth -- the AI summarizes Jira, it does not replace it.

C011 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Weekly AI-generated "Suite Spot" competitive intelligence briefs are produced for the leadership team, tracking competitor platform releases, partnership announcements, and pricing changes across Mirakl, VTEX, commercetools, Shopify, and Salesforce Commerce Cloud ecosystems.

Why: As the publisher of the Marketplace Suite Spot Report, we must maintain the most current competitive intelligence in the industry. Falling behind on a platform capability change directly impacts our advisory credibility.

Failure mode: The brief once missed a commercetools pricing model change because the source was announced via a partner webinar, not a press release. Our monitoring was over-indexed on written publications. We added webinar transcript scanning.

C012 MEDIUM INFERENCE 2x efficiency

The Proposal Engine and Sales Intelligence Agent share a common client/prospect database. When the Sales Agent qualifies a lead, it pre-loads the Proposal Engine with firmographic data, industry vertical, and likely platform fit so the first draft is contextualized before a human touches it.

Why: Eliminates the "cold start" problem where proposal writers spend the first day just researching the prospect. The agent-to-agent handoff means the proposal draft is already 40% contextualized when the Solutions Architect opens it.

Failure mode: The Sales Agent once passed incorrect revenue data (confused parent company with subsidiary), which caused the Proposal Engine to scope the engagement for a $2B enterprise when the actual buyer was a $90M division. The SA caught it, but it burned half a day re-scoping.

C005 HIGH OBSERVED REPEATEDLY 7x efficiency

The briefing agent reads shared state files. It never calls APIs, searches inboxes, or queries databases directly.

Why: Briefing compile time ballooned from 90 seconds to 11 minutes while the briefing agent was making live API calls to 4 different services. One Monday morning, the Meta Ads API was slow and the briefing timed out entirely. The founder started the week with no briefing and spent 40 minutes manually checking dashboards.

Failure mode: Briefing reliability degrades as data source count grows. One slow API blocks the entire morning.

C006 HIGH OBSERVED ONCE 5x efficiency

When two agents need to reference the same entity (client, deal, campaign), they use a canonical ID from the CRM, not the display name.

Why: "Precision Plumbing" in Google Ads, "Precision Plumbing LLC" in the CRM, and "Precision" in Slack. The reporting agent matched on display name and merged two different clients named "Precision" -- Precision Plumbing and Precision Auto. The weekly report showed Precision Plumbing's spend as $14,200 (actual: $6,800 plumbing + $7,400 auto).

Failure mode: Name-based matching produces false merges. Reports show combined data for unrelated clients.
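
A small illustration of why the canonical-ID rule prevents the Precision merge; the CRM IDs below are hypothetical, while the spend figures are the ones from the failure mode.

```python
# Spend rows keyed by canonical CRM ID (IDs here are hypothetical).
spend_rows = [
    {"crm_id": "CRM-101", "display_name": "Precision Plumbing", "spend": 6800},
    {"crm_id": "CRM-202", "display_name": "Precision", "spend": 7400},
]


def spend_by_client(rows: list[dict]) -> dict[str, int]:
    totals: dict[str, int] = {}
    for row in rows:
        # Aggregate on the canonical ID, never on the display name.
        totals[row["crm_id"]] = totals.get(row["crm_id"], 0) + row["spend"]
    return totals


# The two similarly named clients stay separate because the join key is the CRM ID.
assert spend_by_client(spend_rows) == {"CRM-101": 6800, "CRM-202": 7400}
```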

C014 MEDIUM OBSERVED ONCE 3x efficiency

Agent-to-agent communication uses structured message types (REQUEST, INFORM, ALERT). Free-text messages between agents are prohibited.

Why: The reporting agent sent the briefing agent a free-text note: "Check Precision -- numbers look off." The briefing agent interpreted "off" as "offline" and reported the Precision Plumbing account as disconnected. The actual meaning was "the numbers look unusual." The account manager called Google support to investigate a non-existent connection issue.

Failure mode: Ambiguous natural language between agents causes misinterpretation. Downstream actions are based on the wrong reading.
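
A minimal sketch of the structured message contract, assuming the three types named above and a dict payload in place of free text; the class and field names are illustrative.

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    REQUEST = "request"
    INFORM = "inform"
    ALERT = "alert"


@dataclass
class AgentMessage:
    sender: str
    recipient: str
    type: MessageType
    subject: str   # e.g., a canonical client ID
    body: dict     # structured payload -- no free-text prose between agents


def validate(msg: AgentMessage) -> AgentMessage:
    if not isinstance(msg.type, MessageType):
        raise ValueError("unknown or free-text message type")
    if not isinstance(msg.body, dict):
        raise ValueError("message body must be structured, not prose")
    return msg


validate(AgentMessage("reporting", "briefing", MessageType.ALERT,
                      "CRM-101", {"metric": "spend", "status": "anomalous"}))
```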

C011 HIGH OBSERVED REPEATEDLY 7x efficiency

Steps 5 and 6 form a tight bidirectional loop. UI prototyping and data modeling inform each other iteratively. Changes in one trigger re-evaluation of the other.

Why: UI design reveals data needs that the model did not anticipate. Data model constraints reveal UI designs that are infeasible. The loop catches mismatches early when the cost of change is near zero.

Failure mode: Team completes UI prototyping (Step 5) in isolation, then starts data modeling (Step 6). The data model cannot support several UI patterns. Rework is required on both sides. Without the bidirectional loop, this discovery happens late and costs more.

C012 MEDIUM INFERENCE 2x efficiency

Micro-macro seesawing is the primary discovery mechanism. Deep focus on a single stakeholder (micro) reveals system-wide patterns (macro). Macro insights feed back into subsequent micro iterations.

Why: System-wide architecture cannot be designed top-down with AI. It emerges from the patterns revealed by stakeholder-specific deep dives. AI accelerates each deep dive. The human recognizes the cross-cutting patterns.

Failure mode: Team stays at the macro level, designing the whole system architecture before doing any stakeholder-specific work. AI generates a plausible architecture that misses critical integration points. The architecture is revised repeatedly as stakeholder work reveals reality.

C013 MEDIUM INFERENCE 2x efficiency

Every discovery at any step that affects the PRD is written back to the PRD immediately. The human practitioner decides whether the discovery warrants a PRD update. AI proposes the update text.

Why: Discoveries that are not captured are lost. The PRD is the elephant. If the elephant changes shape and nobody updates the record, subsequent bites are cut from the wrong animal.

Failure mode: Step 7 reveals that a stakeholder's workflow is fundamentally different from what was assumed in Step 2. The discovery is noted verbally but not written into the PRD. Step 10 implementation builds on the original (wrong) assumption.

C014 MEDIUM INFERENCE 2x efficiency

AI functions as a context window manager. Encapsulation, function contracting, and test-driven development patterns from software engineering apply to how AI manages project context across the ten steps.

Why: As projects grow, the full context exceeds what any single human can hold. AI maintains the complete context (PRD, stakeholder map, data model, prototype state, test criteria) and presents relevant slices to the human at each decision point.

Failure mode: Without explicit context management, AI loses track of prior decisions. It generates artifacts that contradict earlier approved work. The human reviewer catches some contradictions but misses others. Inconsistencies compound across steps.

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Prep feeds its meeting context summary to Forge after every client meeting. Forge uses this to update the working deliverable with any new information, decisions, or pivots from the conversation. This handoff must happen within 2 hours of the meeting ending.

Why: Deliverables that don't reflect the latest conversation feel stale. If the client pivoted direction in a Thursday meeting and the next deliverable draft (Monday) still reflects the old direction, the founder looks like they weren't listening.

Failure mode: Client changed the scope of a project during a Wednesday call -- shifted from cost reduction to revenue growth. Prep captured the change but the handoff to Forge didn't happen until Friday. The founder sent a draft Monday morning that was still focused on cost reduction. Client replied: "Did we not discuss this on Wednesday?" The founder had to scramble to revise.

C010 MEDIUM OBSERVED ONCE 3x efficiency

Scout's research outputs must be tagged with a freshness date. Any research older than 30 days must be re-verified before inclusion in a deliverable. Scout must proactively flag when it's pulling from stale research.

Why: Markets move. A competitive landscape analysis from 6 weeks ago may already be outdated. The founder cannot manually track the age of every research data point.

Failure mode: Forge pulled a market sizing figure from a Scout brief that was 3 months old. In the interim, a major player had exited the market, changing the competitive dynamics significantly. The client's team caught the stale data during their internal review of the deliverable. Credibility hit.

C009 MEDIUM INFERENCE 2x efficiency

Agents coordinate via INFORM and CHALLENGE messages. No ad-hoc coordination.

Why: Structured messaging creates auditable coordination.

Failure mode: Undocumented side channels. Coordination failures are invisible.

C010 MEDIUM INFERENCE 2x efficiency

Spec changes trigger INFORM to Market Intelligence for positioning update.

Why: Spec changes affect market positioning.

Failure mode: Marketing claims diverge from product reality.

C011 MEDIUM INFERENCE 2x efficiency

Competitive threats trigger INFORM to Protocol Steward for format evaluation.

Why: Competitive moves may require protocol evolution.

Failure mode: Protocol falls behind market needs.

C012 MEDIUM HUMAN DEFINED RULE 3x efficiency

Unresolved CHALLENGE messages escalate to founder within 24 hours.

Why: Stalled disagreements block progress.

Failure mode: Two agents disagree. Neither yields. Question hangs for a week.

C007 MEDIUM OBSERVED REPEATEDLY 4x efficiency

The brief generation agent and feedback synthesis agent share a client context file. Every client has a living document that tracks: brand guidelines, stated preferences, past feedback patterns, and known sensitivities. Both agents read this file before producing output.

Why: Clients develop patterns. A client who always rejects serif fonts shouldn't see serif fonts in concepts. A client who loves minimalism shouldn't receive a maximalist brief. Without shared context, agents repeat mistakes that the team already learned from.

Failure mode: The brief agent generated a brief suggesting "bold, maximalist packaging" for a client who had explicitly rejected maximalism in 3 previous projects. The feedback synthesis agent had the pattern documented, but the brief agent didn't read it. Kai caught it before presenting concepts, but wasted 2 hours exploring a direction that was dead on arrival.

C008 HIGH OBSERVED REPEATEDLY 7x efficiency

The timeline management agent reads the feedback synthesis agent's output after every client review round. If client feedback indicates scope expansion ("Can you also do the business cards?" or "What about social templates?"), the timeline agent flags potential timeline impact before Diego or Mara respond to the client.

Why: Scope creep is the primary profitability killer at a small agency. Every "Can you also..." that gets a "yes" without timeline adjustment erodes margin.

Failure mode: A client casually requested social media templates during a logo review call. Diego said "Sure, we can add that." The timeline agent wasn't in the loop. The social templates added 12 hours of work to a fixed-fee project. Margin on the project dropped from 45% to 18%.

C009 HIGH OBSERVED ONCE 5x efficiency

The competitive visual analysis agent delivers research to the creative team, not to clients. Designers use competitive analysis as input for their own creative process. The analysis is never shown to clients to justify creative decisions.

Why: Showing clients competitor analysis anchors them. They stop evaluating the creative work on its own merits and start comparing it to competitors. "Make it more like Brand X" is the death of original creative work.

Failure mode: Diego included a competitive analysis slide in a client presentation to show "where the market is." The client fixated on a competitor's design and spent the entire meeting saying "Can we do something like that?" Three revision rounds were wasted trying to replicate a competitor before Mara steered the client back to an original direction. Cost: 16 hours of design time.

R3V Founding gold
C005 HIGH OBSERVED REPEATEDLY 7x efficiency

Production workflows should be implemented as explicit flows with gates rather than hidden prompt-only coordination.

Why: The org uses multiple named flows, including Inbound Conversation variants, Archivist Nightly Flow, and Seeder Flow. Gates are used to halt execution on abort conditions, review conditions, or stage-specific checks.

Failure mode: If coordination lives only inside prompts, the platform loses visibility into where control decisions happen, making retries, audits, and quality analysis substantially weaker.

C009 HIGH HUMAN DEFINED RULE 5x efficiency

Specialist agents generate domain-specific outputs, but the orchestrator remains the final authority on routing and downstream action.

Why: Sage delegates to specialists for response generation but does not itself become the specialist, and specialists do not appear to own broader routing authority. This keeps the decision chain explicit.

Failure mode: If specialists self-route or self-execute beyond their lane, duplicate action, missed edge cases, and accountability gaps become more common.

C016 MEDIUM INFERENCE 2x efficiency

The system maintains both synchronous customer-response flows and asynchronous maintenance loops.

Why: Webhook-triggered inbound flows coexist with scheduled or recurring activities like knowledge graph maintenance, batch review, seeding, and nightly consolidation.

Failure mode: If the org only optimizes for real-time response, background quality tasks such as memory hygiene, graph linking, and batch review fall behind and degrade future decisions.

C005 HIGH OBSERVED REPEATEDLY 7x efficiency

The weekly report agent pulls from the ad monitor's shared state file, never from the Google Ads API directly.

Why: Same-day API calls to Google Ads return different numbers depending on conversion lag, attribution windows, and the time of the query. When the weekly report agent made its own API call, the founder's Friday report showed different numbers than the Monday briefing for the same date range. Small discrepancies -- $3-8 per lead -- but clients who track closely notice.

Failure mode: Two agents querying the same API independently produce slightly different snapshots. Client-facing reports show inconsistent numbers across touchpoints.

C006 HIGH OBSERVED REPEATEDLY 7x efficiency

Slack alerts are batched into a single daily digest at 8:15 AM. Individual alerts fire only for spend anomalies exceeding 2x the daily threshold.

Why: With 12 clients and 5 agents, unbatched alerts produced 15-25 Slack messages per day. The founder's Slack channel became an unreadable wall of notifications. He stopped checking the channel. He missed a legitimate CPC spike on a roofing client that cost $340 in wasted spend over 2 days before the media buyer caught it during her own check.

Failure mode: High alert volume causes channel abandonment. Legitimate alerts are buried in noise. The alerting system trains the human to ignore it.

Sneeze It Founding gold
C008 HIGH MEASURED RESULT 10x efficiency

Morning briefing runs at 6:30 AM. All scanner agents must complete by 6:00 AM. Any agent not finished by 6:00 AM is marked stale in the briefing. The briefing never waits for a slow agent.

Why: One slow API call used to delay the entire briefing by 20 minutes. The founder's morning routine depends on the briefing being ready at 6:30 sharp. Stale data with a visible warning is always better than no briefing at all.

Failure mode: Google Ads API times out at 5:50 AM. Without the hard deadline, briefing delayed until 6:47 AM. Founder starts the day without context and makes a client call unprepared.
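
A sketch of the hard-cutoff compile step, assuming each scanner records a completion timestamp; the times, scanner names, and stale-marking format are illustrative.

```python
from datetime import datetime, timedelta

CUTOFF = datetime(2026, 3, 2, 6, 0)            # scanners must finish by 6:00 AM
BRIEFING_AT = CUTOFF + timedelta(minutes=30)   # briefing compiles at 6:30 regardless


def briefing_sections(scanner_runs: dict[str, datetime | None]) -> list[str]:
    """Compile on time; mark any scanner that missed the cutoff (or never ran) as stale."""
    lines = []
    for name, finished_at in scanner_runs.items():
        if finished_at is None or finished_at > CUTOFF:
            lines.append(f"[STALE] {name}: showing last successful run")
        else:
            lines.append(f"{name}: fresh as of {finished_at:%H:%M}")
    return lines


print(briefing_sections({
    "google_ads": None,                          # timed out, never wrote results
    "pipeline": datetime(2026, 3, 2, 5, 42),
}))
```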

C009 HIGH OBSERVED REPEATEDLY 7x efficiency

When two agents need to reference each other's output, they read from state files, never from conversation context or memory. State files are the single source of truth for all cross-agent data.

Why: Conversation context drifts between sessions. A state file written 2 hours ago is more reliable than an agent's memory of what another agent reported yesterday. We caught 3 errors in one week from memory-based cross-referencing.

Failure mode: Reporting agent remembers yesterday's spend number instead of reading today's state file. Weekly report goes out with yesterday's numbers. Client catches the error before the account manager does.

C010 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Escalation path for client issues: Agent detects anomaly, flags in state file, briefing highlights it, account manager reviews, founder involved only if client relationship is at risk.

Why: Early on, every anomaly went directly to the founder. 15 alerts per day within the first two weeks. Alert fatigue set in by week 3. Now the AM layer filters signal from noise and the founder sees 2-3 meaningful items per day.

Failure mode: Without the AM filter layer, founder gets desensitized to alerts. Treats everything as noise. Misses a real problem that costs a client. Client churns.

C011 HIGH MEASURED RESULT 10x efficiency

Data-intensive scans run overnight via OS-level scheduling (17 autonomous agents). The morning briefing reads cached results. The founder wakes to a complete picture, not a wait.

Why: Morning scans take 30+ minutes if run sequentially. Pre-computing overnight eliminates serial latency during the founder's most valuable working hours.

Failure mode: Founder starts the day waiting for scans to complete. First 30 minutes wasted. Or: budget cap hit during overnight run, morning briefing incomplete.

C012 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Agents coordinate via structured message bus with defined message types: INFORM (state change notification), REQUEST (action needed), PROPOSAL (joint action), RESPONSE (reply), CHALLENGE (formal disagreement). 3-exchange maximum, then auto-escalate.

Why: Ad-hoc coordination creates hidden dependencies. Structured messaging makes coordination visible, auditable, and bounded.

Failure mode: Without structure, agents coordinate through undocumented side channels. When coordination fails, no one can trace what happened. Without exchange limits, agents negotiate indefinitely.
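
A toy sketch of the exchange cap, assuming the five message types listed above and a counter per agent pair; the class and escalation handling are illustrative, not Sneeze It's actual bus.

```python
from collections import defaultdict

MESSAGE_TYPES = {"INFORM", "REQUEST", "PROPOSAL", "RESPONSE", "CHALLENGE"}
MAX_EXCHANGES = 3  # per the claim: three exchanges, then auto-escalate


class MessageBus:
    """Toy bus illustrating bounded agent-to-agent negotiation."""

    def __init__(self) -> None:
        self.threads: dict[tuple[str, str], int] = defaultdict(int)
        self.escalations: list[tuple[str, str]] = []

    def send(self, sender: str, recipient: str, msg_type: str, payload: dict) -> str:
        if msg_type not in MESSAGE_TYPES:
            raise ValueError(f"unsupported message type: {msg_type}")
        key = tuple(sorted((sender, recipient)))
        self.threads[key] += 1
        if self.threads[key] > MAX_EXCHANGES:
            self.escalations.append(key)   # hand the stalled disagreement to a human
            return "ESCALATED"
        return "DELIVERED"


bus = MessageBus()
for _ in range(4):
    status = bus.send("data_infra", "strategic", "CHALLENGE", {"topic": "budget cap"})
print(status, bus.escalations)   # the fourth exchange escalates
```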

C013 MEDIUM MEASURED RESULT 6x efficiency

When the Data Infrastructure agent detects a critical ad spend overage, it escalates through a defined ladder: alert to the founder, then auto-DM to the COO after 48 hours unanswered, then escalate to the Strategic agent.

Why: Critical alerts that go unanswered create financial risk. Automated escalation ensures someone responds even if the primary recipient is unavailable.

Failure mode: Agent detects +139% overspend on a client account. Alert sits unanswered for 16 days. Client overspends by $1,348 before anyone acts.

C014 HIGH MEASURED RESULT 10x efficiency

The morning briefing compiles output from 10 parallel scanners (Slack, calendar, email, ads, pipeline, projects, call center, meetings, tasks, proposals) into one unified document. Each scanner writes to its own state file. The compiler reads all files and produces a single briefing.

Why: 10 data sources cannot be queried sequentially in under 5 minutes. Parallel pre-computation with file-based handoff makes the briefing fast and fault-tolerant.

Failure mode: If one scanner fails, the briefing still compiles with a "stale data" warning for that source. Without this architecture, one API failure blocks the entire briefing.

C025 MEDIUM OBSERVED REPEATEDLY 4x efficiency

When the Executive Assistant processes inbound email auto-replies (bounces, out-of-office, contact changes, acquisitions), it automatically routes actionable intelligence to the Sales agent's inbox without the founder in the middle. The Sales agent uses this to update prospect records and adjust outreach sequences.

Why: Cold outreach generates auto-reply intelligence (changed emails, company acquisitions, role changes) that the Sales agent needs immediately. Routing through the founder creates a 6-24 hour delay and wastes founder attention on mechanical handoffs.

Failure mode: Without auto-routing, the Sales agent sends outreach to bounced addresses for days. Or misses that a prospect's company was acquired, sending irrelevant messaging. The founder becomes a bottleneck for information that should flow directly between agents.

Stackwise silver
C008 MEDIUM OBSERVED ONCE 3x efficiency

Weekly report compiles data from all 5 other agents' state files every Sunday 8 PM. Founder reviews Monday morning before distribution.

Why: The first version auto-distributed to the team. The report included a support satisfaction score that was temporarily low because of one angry customer. The team panicked. Now the founder adds context first.

Failure mode: Raw metrics without context create panic. One bad data point triggers unnecessary emergency meetings.

C009 HIGH OBSERVED REPEATEDLY 7x efficiency

When support detects 3+ similar tickets in 48 hours, it writes a consolidated bug report to the engineering alerts state file. Engineering picks it up next scan.

Why: Support was filing individual GitHub issues for each ticket. Engineering saw 7 separate issues that were the same bug. Pattern detection saves engineering triage time.

Failure mode: 7 tickets about the same API timeout filed as 7 issues. Engineering triages each individually. Wastes 2 hours before someone connects them.
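
A sketch of the 3-in-48-hours aggregation, assuming each ticket carries a normalized signature (how "similar" is detected is not specified in the claim) and that the engineering alerts file is append-only.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def find_ticket_clusters(tickets, now, min_count=3, window=timedelta(hours=48)):
    """Group recent tickets by a normalized signature (e.g. error code plus
    endpoint) and return signatures that crossed the 3-in-48h threshold."""
    recent = [t for t in tickets if now - t["opened_at"] <= window]
    counts = Counter(t["signature"] for t in recent)
    return {sig: n for sig, n in counts.items() if n >= min_count}

def write_engineering_alert(clusters, alerts_file):
    """Append one consolidated report per cluster to the shared state file
    that the engineering agent reads on its next scan."""
    with open(alerts_file, "a") as f:
        for sig, n in clusters.items():
            stamp = datetime.now(timezone.utc).isoformat()
            f.write(f"{stamp} CONSOLIDATED {n} tickets: {sig}\n")
```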

C010 HIGH OBSERVED ONCE 5x efficiency

Onboarding agent checks Stripe subscription status before every step. If customer cancelled or downgraded, sequence pauses and alerts founder.

Why: Onboarding sent "Welcome to Pro!" to a customer who downgraded to Free 6 hours earlier. The customer was confused and annoyed.

Failure mode: Onboarding continues on autopilot after plan change. Messages reference wrong plan. Customer loses confidence.
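
A sketch of the per-step subscription gate; fetch_subscription is a hypothetical stand-in for whatever billing lookup (e.g. the Stripe API) is actually used.

```python
def fetch_subscription(customer_id: str) -> dict:
    """Hypothetical lookup: in practice this would call the billing provider
    and map the result down to a current status and plan."""
    return {"status": "active", "plan": "pro"}  # placeholder data

def run_onboarding_step(customer_id: str, expected_plan: str, step, alert_founder) -> str:
    """Gate every onboarding step on the live subscription state instead of
    sending plan-specific messaging blindly."""
    sub = fetch_subscription(customer_id)
    if sub["status"] != "active" or sub["plan"] != expected_plan:
        alert_founder(
            f"Onboarding paused for {customer_id}: "
            f"status={sub['status']}, plan={sub['plan']}, expected={expected_plan}"
        )
        return "paused"
    step(customer_id)
    return "sent"
```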

C008 MEDIUM OBSERVED REPEATEDLY 4x efficiency

All three agents write to a shared daily summary file by 6 AM. Tanya reviews the summary before the clinic opens at 7:30 AM. Items requiring physician attention are flagged in a separate section at the top.

Why: Physicians have approximately 15 minutes of non-clinical time before the first patient. They cannot review three separate agent outputs. A single summary with physician items at the top respects their time constraints.

Failure mode: Without the compiled summary, physicians check agent output sporadically between patients. Important items get buried. The 47-item backlog formed partly because physicians did not have a clear view of what was waiting.

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

The no-show prediction model updates its risk scores every Monday using the prior 90 days of appointment data. Scores are not recalculated mid-week to avoid confusing the front desk with shifting numbers.

Why: Early implementation recalculated scores daily. Front desk staff saw a patient's score change from 45% to 72% between Monday and Wednesday with no obvious cause. They lost confidence in the system and stopped consulting it entirely for two weeks.

Failure mode: Scores shift daily based on minor data changes. Staff perceives the system as unreliable. Adoption drops to near zero. The 23% no-show rate does not improve despite the investment.

C010 HIGH OBSERVED REPEATEDLY 7x efficiency

Eval pipeline monitoring feeds into incident response. When monitoring detects anomalies exceeding thresholds (>5% error rate, >2x latency, queue depth >10K), the incident response agent automatically drafts a status update and customer communication. No external sending without dual approval.

Why: Speed of communication during incidents is critical. Customers who learn about outages from their own monitoring before Synthwave communicates lose trust instantly.

Failure mode: A 45-minute outage in the eval pipeline was detected by 8 customers before Synthwave posted a status update. 3 customers tweeted about it. The status page update went live at minute 38, after customers had already started a thread in the #synthwave-users Slack community. One customer posted: "We're seeing errors. Synthwave's status page still says all green. What's going on?"
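
A sketch of the threshold check and draft-only handoff; the three threshold values come from the claim, while the metrics shape and the approval hooks are assumptions.

```python
# Threshold values come from the claim; the metrics dict shape and the
# draft/approval hooks are illustrative.
THRESHOLDS = {
    "error_rate": 0.05,      # >5% error rate
    "latency_ratio": 2.0,    # >2x baseline latency
    "queue_depth": 10_000,   # >10K queued items
}

def check_anomalies(metrics: dict) -> list[str]:
    """Return the names of any thresholds the current metrics exceed."""
    return [name for name, limit in THRESHOLDS.items() if metrics.get(name, 0) > limit]

def on_monitoring_tick(metrics, draft_status_update, draft_customer_email, request_dual_approval):
    """Turn threshold breaches into drafted communications, never sent ones."""
    breaches = check_anomalies(metrics)
    if not breaches:
        return
    status = draft_status_update(breaches)
    email = draft_customer_email(breaches)
    # Drafts only -- nothing goes out without two approvals.
    request_dual_approval([status, email])
```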

C011 HIGH OBSERVED REPEATEDLY 7x efficiency

Usage analytics feeds into customer onboarding. When a new customer's first 72 hours show low API call volume (<10% of allocated rate limit), the onboarding agent flags it as a potential integration issue and drafts a proactive check-in email.

Why: Customers who don't integrate successfully in the first week have a 70% churn rate at month 3.

Failure mode: 4 new customers in a single quarter failed to integrate within the first week. None were contacted until their monthly check-in. By then, 3 of 4 had decided the product was "too complex" and were evaluating alternatives. 2 churned. $8,400/month in lost revenue that proactive outreach could have prevented.
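
A sketch of the first-72-hours check, interpreting "<10% of allocated rate limit" as calls made in the window versus 10% of what the allocation would permit over that window; record shapes and the lookup callable are illustrative.

```python
from datetime import timedelta

FIRST_WINDOW = timedelta(hours=72)

def flag_slow_integrations(customers, calls_between, now) -> list[str]:
    """Flag customers past their first 72 hours whose API volume in that
    window was under 10% of their allocated call budget for the window.
    calls_between(customer_id, start, end) is an assumed usage lookup."""
    flagged = []
    for c in customers:
        window_end = c["signed_up_at"] + FIRST_WINDOW
        if now < window_end:
            continue  # window not complete yet; check again later
        calls = calls_between(c["id"], c["signed_up_at"], window_end)
        if calls < 0.10 * c["allocated_calls_per_72h"]:
            flagged.append(c["id"])  # onboarding agent drafts a proactive check-in
    return flagged
```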

C012 HIGH OBSERVED REPEATEDLY 7x efficiency

Docs maintenance agent monitors the changelog (from release notes) and customer support tickets (from incident response). Any feature shipped without docs is flagged as a blocker in Linear. Any support ticket caused by missing docs is tagged as a docs failure.

Why: In developer tooling, undocumented features don't exist. Customers who can't find documentation assume the feature doesn't work.

Failure mode: A new eval metric type was shipped in v3.2 without documentation. 6 customers tried to use it, got confused by the API response format, and filed support tickets. The engineering team spent 8 hours answering the same question 6 times. The docs would have taken 30 minutes to write.
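
A sketch of the docs-coverage check; the create_blocker hook stands in for whatever issue-tracker integration (e.g. Linear) is actually used, and the feature lists are assumed inputs.

```python
def find_undocumented_features(shipped_features, documented_features, create_blocker) -> list[str]:
    """Compare features named in release notes against the docs index and
    open a blocker for anything missing."""
    missing = sorted(set(shipped_features) - set(documented_features))
    for feature in missing:
        create_blocker(f"Docs missing for shipped feature: {feature}")
    return missing
```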

C009 HIGH OBSERVED ONCE 5x efficiency

Forecast must feed weekly demand signals to Rhythm. If a SKU is trending toward stockout within 14 days, Rhythm must suppress that SKU from upcoming email campaigns and replace it with an in-stock alternative. No selling what you can't ship.

Why: Selling out of a popular item isn't a problem. Selling a popular item via email, taking the order, and then canceling it because it's out of stock is a customer experience disaster. The customer expected the item, the business ate the ad spend to acquire that email click, and the cancellation generates a refund and a negative impression.

Failure mode: Forecast flagged the Riverwalk Henley as likely stockout in 8 days. The signal didn't reach Rhythm. Rhythm featured the Henley as the hero product in Thursday's email blast. 47 orders came in. 19 couldn't be fulfilled. 19 cancellation emails. 4 one-star reviews on Trustpilot referencing the stockout. Customer acquisition cost on those 19 lost orders: ~$380 in email + ad spend.
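
A sketch of the SKU suppression step, assuming the demand signal arrives as estimated days of cover per SKU and that an alternatives mapping exists; the 14-day horizon is from the claim.

```python
STOCKOUT_HORIZON_DAYS = 14  # from the claim

def adjust_campaign_skus(campaign_skus, days_of_cover, alternatives) -> list[str]:
    """Replace any SKU trending toward stockout within 14 days with an
    in-stock alternative before the email is assembled.
    days_of_cover maps SKU -> estimated days until stockout; alternatives
    maps SKU -> a comparable in-stock SKU (both shapes are illustrative)."""
    adjusted = []
    for sku in campaign_skus:
        cover = days_of_cover.get(sku)
        if cover is not None and cover <= STOCKOUT_HORIZON_DAYS:
            replacement = alternatives.get(sku)
            if replacement:
                adjusted.append(replacement)
            # no alternative available: the SKU is simply dropped from the send
        else:
            adjusted.append(sku)
    return adjusted
```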

C010 MEDIUM OBSERVED ONCE 3x efficiency

Haven must log every customer complaint category in a structured format to #cs-patterns in Slack. Vigil reads this channel daily. If complaint volume about a specific product or shipping issue spikes above the 30-day average by 2x, Vigil must flag it as a potential systemic issue.

Why: Individual complaints are noise. Complaint patterns are signal. A spike in "sizing runs small" complaints for a new product means the size chart is wrong, not that individual customers are confused. Cross-agent pattern detection catches problems faster than any single agent.

Failure mode: Haven handled 23 complaints about the new Tech Jogger running large over 10 days. Each was handled individually with exchanges. Nobody aggregated the pattern. It wasn't until the founder reviewed returns data manually that the sizing issue was identified. By then, 23 exchanges had been processed ($460 in shipping) and the product listing still had the wrong size chart. Shade also missed it because the pattern was in CS, not pricing.
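
A sketch of the spike check; the 2x-over-30-day-average rule is from the claim, and the complaint record shape is assumed.

```python
from collections import Counter
from datetime import timedelta

def detect_complaint_spikes(complaints, now, window_days=30, spike_factor=2.0) -> list[str]:
    """Compare today's complaint volume per category against the trailing
    30-day daily average; flag categories running at 2x or more.
    Each complaint is assumed to carry a category and a logged_at datetime."""
    window_start = now - timedelta(days=window_days)
    in_window = [c for c in complaints if c["logged_at"] >= window_start]
    today = [c for c in in_window if c["logged_at"] >= now - timedelta(days=1)]

    baseline = Counter(c["category"] for c in in_window)
    todays = Counter(c["category"] for c in today)

    spikes = []
    for category, count_today in todays.items():
        daily_avg = baseline[category] / window_days
        if daily_avg > 0 and count_today >= spike_factor * daily_avg:
            spikes.append(category)  # flagged as a potential systemic issue
    return spikes
```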

C011 MEDIUM INFERENCE 2x efficiency

When Shade detects a competitor launching a new product in a category where Threadline competes, Shade must notify both Rhythm (for potential response campaigns) and the founder (for strategic assessment). The notification must include: product name, price point, positioning, and estimated overlap with Threadline's catalog.

Why: Competitor product launches in overlapping categories require coordinated response. Marketing may need to adjust messaging, and the founder may need to evaluate pricing or positioning changes. Without cross-notification, Rhythm might inadvertently run a campaign that positions Threadline against a new competitor product the founder hasn't evaluated yet.

Failure mode: A competitor launched a heavyweight tee at $36 (Threadline's is $44). Shade logged the price change in the weekly report. Rhythm, unaware, sent an email featuring Threadline's heavyweight tee as "the best value in premium basics." Several customers replied with links to the competitor's cheaper option. The email drove traffic to the competitor.
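
A sketch of the notification payload; the four required fields are from the claim, while the overlap score definition and the fan-out hooks are assumptions.

```python
from dataclasses import dataclass

@dataclass
class CompetitorLaunchAlert:
    # The four fields named in the claim; the overlap score definition is assumed.
    product_name: str
    price_point: float
    positioning: str
    catalog_overlap: float  # 0.0-1.0 estimated overlap with Threadline's catalog

def notify_launch(alert: CompetitorLaunchAlert, notify_rhythm, notify_founder) -> None:
    """Fan the same structured alert out to both recipients so marketing and
    strategy work from identical facts."""
    payload = (
        f"Competitor launch: {alert.product_name} at ${alert.price_point:.2f} "
        f"({alert.positioning}); est. catalog overlap {alert.catalog_overlap:.0%}"
    )
    notify_rhythm(payload)
    notify_founder(payload)
```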

C007 HIGH OBSERVED ONCE 5x efficiency

The LP meeting prep agent reads the latest portfolio report, deal memos, compliance document status, and investor comms history before generating meeting materials. Prep materials must be consistent with the last investor-approved communication.

Why: If meeting prep materials show different numbers or language than the last quarterly report, the LP will notice. Inconsistency between communications is the fastest way to erode investor trust.

Failure mode: The LP meeting prep agent used preliminary Q4 numbers (13.1% IRR) in a deck while the most recent distributed quarterly report showed audited Q3 numbers (10.8% IRR). An LP asked why the deck showed different numbers than the report. Sarah had to explain the preliminary vs. audited distinction in the meeting, which consumed 15 minutes and shifted the conversation from growth to data reliability.

C008 MEDIUM OBSERVED ONCE 3x efficiency

When the deal memo agent finalizes a memo for a new offering, it notifies the investor comms agent and the compliance document agent. The investor comms agent prepares an introduction email. The compliance agent begins generating the PPM and subscription documents. All three outputs are reviewed together before any investor receives any material.

Why: An investor should never receive a deal introduction email without the compliance documents being ready. "We'll send the PPM next week" signals disorganization. Everything ships together or nothing ships.

Failure mode: The investor comms agent sent an introduction email for a new deal before the PPM was finalized. An eager LP replied within 2 hours asking for the subscription agreement. Upside didn't have it ready for 5 days. The LP invested anyway, but mentioned the delay in their year-end review call as a negative.
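
A minimal sketch of the ship-together gate; the artifact names and the "reviewed" status field are illustrative.

```python
REQUIRED_ARTIFACTS = ("deal_memo", "intro_email", "ppm_and_subscription_docs")

def ready_to_send(artifacts: dict) -> bool:
    """Everything ships together or nothing ships: all three outputs must
    exist and have passed joint review before any investor sees anything."""
    return all(artifacts.get(name, {}).get("status") == "reviewed"
               for name in REQUIRED_ARTIFACTS)
```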

C009 MEDIUM OBSERVED ONCE 3x efficiency

The investor comms agent tracks every communication sent to each LP and maintains a per-investor communication log. The LP meeting prep agent reads this log to avoid repeating information or contradicting previous statements.

Why: LPs at this level track what they are told. Repeating information signals automation. Contradicting previous statements signals unreliability.

Failure mode: An LP was told in a Q2 email that a specific property "closed at a 6.2% cap rate." The Q3 meeting prep deck listed the same property at "6.0% cap rate" due to a rounding difference in the data source. The LP caught the discrepancy and asked which number was correct. Tomasz had to track down the source and confirm 6.18%, which rounds to 6.2%. The meeting lost 10 minutes to a data reconciliation discussion.

Vetted Goods silver
C009 HIGH OBSERVED ONCE 5x efficiency

Atlas must feed inventory signals to both Pulse and Signal. If a SKU is within 14 days of stockout, advertising for that SKU must be paused or reduced. The pause request goes to #ad-ops in Slack with the SKU, brand, estimated days to stockout, and recommended action.

Why: Advertising a product you can't ship burns ad spend and creates customer disappointment. In a multi-brand operation, the advertising team may not have visibility into which brand's inventory is running low because they manage all three brands' ads simultaneously.

Failure mode: Ridgeline's best-selling trail jacket hit stockout on a Wednesday. Pulse continued running the hero Meta campaign (the jacket was the lead creative) through the weekend. $2,300 in ad spend drove 67 clicks to an out-of-stock product page. 12 customers added to cart and received a "sold out" notification at checkout. 3 left negative reviews mentioning the "misleading advertisement."
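
A sketch of the #ad-ops pause request; the four message fields and the 14-day horizon are from the claim, while the pause-versus-reduce split and the inventory row shape are illustrative assumptions.

```python
def build_ad_pause_request(sku, brand, days_to_stockout, recommended_action) -> str:
    """Format the #ad-ops request with the four fields the claim requires;
    the Slack posting mechanism itself is out of scope here."""
    return (
        "Inventory signal from Atlas\n"
        f"SKU: {sku}\n"
        f"Brand: {brand}\n"
        f"Estimated days to stockout: {days_to_stockout}\n"
        f"Recommended action: {recommended_action}"
    )

def inventory_tick(inventory, post_to_ad_ops, horizon_days=14) -> None:
    """Scan all SKUs and post a request for anything within the 14-day
    stockout horizon. The 7-day pause/reduce cutoff is illustrative."""
    for row in inventory:
        if row["days_to_stockout"] <= horizon_days:
            action = "pause ads" if row["days_to_stockout"] <= 7 else "reduce spend"
            post_to_ad_ops(build_ad_pause_request(
                row["sku"], row["brand"], row["days_to_stockout"], action))
```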

C010 MEDIUM OBSERVED ONCE 3x efficiency

Harbor must escalate any CS interaction that references multiple brands to a human agent immediately. If a customer says "I bought from Ridgeline AND Copper & Thread," Harbor must not respond as either brand -- it must route to the CS rep who can handle the multi-brand context appropriately.

Why: Multi-brand customers represent both the highest value and the highest risk. They know the brands are connected (which some customers don't). A scripted single-brand response feels dishonest. A human can acknowledge the cross-brand relationship authentically.

Failure mode: A customer emailed Copper & Thread saying "I love your bags, but my Ridgeline jacket had a zipper issue -- can you help?" Harbor, scoped to Copper & Thread, responded: "Thank you for reaching out! Unfortunately, I can only help with Copper & Thread orders." The customer replied: "You're literally the same company." The ops manager had to step in with a unified response. The customer was right, and the robotic brand-wall response made the company look silly.

C011 MEDIUM OBSERVED ONCE 3x efficiency

Cadence must coordinate promotional calendars across all three brands. No two brands may run conflicting promotions simultaneously (e.g., Brand A 40% off while Brand C runs full price on comparable items). Cadence must present the combined promotional calendar weekly in #marketing-ops for human approval.

Why: Customers who follow multiple brands will notice conflicting promotions. A 40% off sale at Ridgeline while Copper & Thread is full price creates a perception that Copper & Thread is overpriced, not that Ridgeline is on sale.

Failure mode: Cadence scheduled a Ridgeline end-of-season sale (30% off) the same week as a Copper & Thread new collection launch (premium pricing). Customers who received both emails perceived a disconnect. One customer emailed: "Why is everything on sale at Ridgeline but Copper & Thread wants $220 for a bag?" The promotional calendars weren't aligned because each brand's Cadence instance operated independently.
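
A sketch of the calendar conflict check; the claim does not define "conflicting," so the same-category, 30-point discount gap rule here is an illustrative stand-in, as is the calendar entry shape.

```python
from itertools import combinations

def find_promo_conflicts(calendar, max_discount_gap=0.30):
    """Flag weeks where two brands run materially conflicting promotions.
    Each entry is assumed to look like
    {"brand": "Ridgeline", "week": "2024-W18", "discount": 0.30, "category": "outerwear"}."""
    by_week = {}
    for promo in calendar:
        by_week.setdefault(promo["week"], []).append(promo)

    conflicts = []
    for week, promos in by_week.items():
        for a, b in combinations(promos, 2):
            same_category = a["category"] == b["category"]
            gap = abs(a["discount"] - b["discount"])
            if a["brand"] != b["brand"] and same_category and gap >= max_discount_gap:
                conflicts.append((week, a["brand"], b["brand"], gap))
    return conflicts  # surfaced in #marketing-ops for human approval
```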

C012 HIGH OBSERVED ONCE 5x efficiency

Ledger must produce per-brand P&L reports weekly, never a consolidated view unless explicitly requested. The default is always brand-level detail because Brand C's thin margins can be hidden by Brand A's strong margins in a consolidated report.

Why: Multi-brand companies fail when a losing brand hides behind a winning brand's numbers. Brand C lost money for 2 months before anyone noticed because the consolidated report showed healthy margins.

Failure mode: Ledger's default was consolidated reporting. Brand C's COGS increased 12% due to a supplier price hike. Brand C ran at a -4% margin for 8 weeks. The consolidated company margin stayed at 22% because Brand A's 31% margin absorbed the loss. The founder didn't see the Brand C problem until quarterly per-brand reports were run. $14,400 in margin loss that could have been addressed in week 2 with a pricing adjustment.
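
A worked example of how consolidation hides a losing brand; only the -4%, 31%, and roughly 22% figures come from the claim, while the revenue split and Brand B margin are assumed so the arithmetic lands near those numbers.

```python
# Per-brand view exposes the losing brand; the blended view looks healthy.
brands = {
    "Brand A": {"revenue": 150_000, "margin": 0.31},
    "Brand B": {"revenue":  40_000, "margin": 0.15},
    "Brand C": {"revenue":  40_000, "margin": -0.04},
}

total_revenue = sum(b["revenue"] for b in brands.values())
total_profit = sum(b["revenue"] * b["margin"] for b in brands.values())
consolidated_margin = total_profit / total_revenue

for name, b in brands.items():
    print(f"{name}: {b['margin']:+.0%}")            # brand-level report shows the -4%
print(f"Consolidated: {consolidated_margin:+.1%}")  # roughly +22%, loss absorbed
```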