Coordination Intelligence

Agent Roles and Authority

17 claims from 5 organizations

Definitions of what each agent owns and does not own. Clear role boundaries prevent overlap, blame diffusion, and tuning conflicts. The most common coordination failure is two agents trying to do the same job.

Acme Digital Agency Founding gold
C006 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Each agent has a written role statement and a list of authorized actions.

Why: Without boundaries, agents overlap.

Failure mode: Two agents update the same project status.

C007 HIGH MEASURED RESULT 10x efficiency

No agent modifies another agent's shared state file.

Why: Single-writer prevents data races.

Failure mode: Two agents write to the same file. One overwrites the other.
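
A minimal sketch of single-writer enforcement in Python; the file names and agent IDs are hypothetical:

    # Each shared state file has exactly one declared owner (names assumed).
    OWNERS = {
        "project_status.json": "status_agent",
        "task_queue.json": "dispatch_agent",
    }

    def write_state(agent_id: str, path: str, payload: str) -> None:
        # Hard gate: only the declared owner may write a shared state file.
        if OWNERS.get(path) != agent_id:
            raise PermissionError(f"{agent_id} does not own {path}")
        with open(path, "w") as f:
            f.write(payload)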

McFadyen Digital Founding silver
C004 HIGH MEASURED RESULT 10x efficiency

The Proposal Engine (AI agent) drafts RFP responses and SOWs by pulling from our 250+ engagement library, matching past project patterns to incoming requirements. It generates a scored first draft with confidence ratings per section. A Solutions Architect must review and approve before it moves to the client.

Why: RFP response time is a competitive advantage. Our average was 12 days. The Proposal Engine cut it to 4 days with higher win rates because it surfaces relevant case studies automatically.

Failure mode: The engine once pulled a case study from a client under NDA as a reference in a proposal for their direct competitor. The Solutions Architect caught it. We now run a conflict-of-interest check as a hard gate before any case study inclusion.
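
A minimal sketch of such a hard gate, assuming lookup tables for NDA clients and their known direct competitors (both hypothetical):

    NDA_CLIENTS = {"client_a"}                  # engagements under NDA
    COMPETITORS = {"client_a": {"prospect_x"}}  # known direct competitors

    def case_study_allowed(case_study_client: str, prospect: str) -> bool:
        # Hard stop: never cite an NDA client's work to their direct competitor.
        if case_study_client in NDA_CLIENTS:
            if prospect in COMPETITORS.get(case_study_client, set()):
                return False
        return True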

C005 HIGH OBSERVED REPEATEDLY 7x efficiency

The Knowledge Navigator (AI agent) indexes all internal Confluence documentation, Slack conversations, and GitHub repositories. Employees query it in natural language. It returns answers with source citations. It never creates or modifies documentation -- read-only.

Why: With 240 people across 5 offices, institutional knowledge was trapped in individual heads and buried Confluence pages. New hires took 90 days to become productive. The Knowledge Navigator cut onboarding ramp to ~55 days.

Failure mode: The navigator surfaced an outdated Confluence page about our VTEX integration patterns that had not been updated after a major API change. A junior developer followed it, burned 3 days, and introduced a regression. We now tag documentation with a staleness score and the navigator warns when citing pages older than 6 months.
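
A sketch of the staleness warning, assuming a ~180-day cutoff for the 6-month rule; the helper name is illustrative:

    from datetime import datetime, timedelta

    STALE_AFTER = timedelta(days=180)  # roughly the 6-month rule above

    def format_citation(title: str, last_updated: datetime) -> str:
        # Append a staleness warning when the cited page is older than the cutoff.
        stale = datetime.now() - last_updated > STALE_AFTER
        note = " [WARNING: page may be stale]" if stale else ""
        return f"{title} (updated {last_updated:%Y-%m-%d}){note}"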

C006 HIGH OBSERVED REPEATEDLY 7x efficiency

The Delivery Monitor (AI agent) tracks all active Jira projects across delivery teams and flags velocity drops >20%, missed sprint commitments, and scope-creep patterns. It reports to the SVP of Global Delivery daily. It does not reassign tasks, modify sprints, or communicate with clients.

Why: With 40+ concurrent engagements across timezones, delivery risk was invisible until it was too late. The SVP cannot review every standup note from every team.

Failure mode: During the early tuning period it flagged too aggressively, and PMs started ignoring alerts. We had to calibrate thresholds per project type -- a 20% velocity drop on a 6-month marketplace build means something different than on a 3-week integration sprint.
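
A sketch of per-project-type calibration; the threshold values are illustrative, not McFadyen's actual numbers:

    # Velocity-drop thresholds calibrated per project type (values assumed).
    THRESHOLDS = {
        "marketplace_build": 0.20,   # long builds: a 20% drop is a real signal
        "integration_sprint": 0.35,  # short sprints swing more week to week
    }

    def should_flag(project_type: str, baseline_velocity: float,
                    current_velocity: float) -> bool:
        drop = (baseline_velocity - current_velocity) / baseline_velocity
        return drop > THRESHOLDS.get(project_type, 0.20)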

C007 HIGH OBSERVED ONCE 5x efficiency

The Code Review Assistant (AI agent) performs first-pass code reviews on all PRs, checking for security vulnerabilities, platform-specific anti-patterns (Adobe Commerce, commercetools, VTEX), and adherence to our internal coding standards. It leaves inline comments. A senior developer must still approve the PR.

Why: Code review was the bottleneck in our delivery pipeline. Senior developers were spending 30% of their time reviewing junior code. The assistant handles the mechanical checks so senior devs can focus on architecture and logic.

Failure mode: The assistant approved a PR that passed all mechanical checks but introduced a business logic error in marketplace commission calculations. It calculated seller payouts at the wrong tier. A senior developer would have caught the domain error. We now require business logic sign-off as a separate gate from code quality.
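
A sketch of the resulting merge rule, with business-logic sign-off as a separate gate; the field names are assumptions:

    from dataclasses import dataclass

    @dataclass
    class ReviewState:
        mechanical_checks: bool        # the assistant's automated first pass
        senior_approval: bool          # senior dev review of architecture and logic
        business_logic_signoff: bool   # separate domain gate added after the payout bug

    def pr_may_merge(r: ReviewState) -> bool:
        # No single gate is sufficient on its own.
        return r.mechanical_checks and r.senior_approval and r.business_logic_signoff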

C008 MEDIUM MEASURED RESULT 6x efficiency

The Sales Intelligence Agent monitors HubSpot pipeline, enriches incoming leads with firmographic data, scores them against our ICP (B2B distributors/manufacturers with $50M+ revenue, existing marketplace aspirations), and routes qualified leads to the CRO's team with a priority score.

Why: Our CRO Ed Coke has sold over $1B in commerce services. His time should be spent on $500K+ opportunities, not qualifying $30K requests. The agent handles triage.

Failure mode: The scoring model initially weighted company size too heavily and deprioritized a mid-market chemical distributor that turned into our ChemDirect engagement -- one of our highest-profile marketplace launches. We added "marketplace intent signals" as a scoring factor.
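
An illustrative scorer showing the shape of the fix; the weights and signal names are assumptions, not the actual model:

    def score_lead(revenue_musd: float, is_b2b_distributor: bool,
                   marketplace_intent_signals: int) -> float:
        score = 0.0
        score += 30 if revenue_musd >= 50 else 10          # size no longer dominates
        score += 25 if is_b2b_distributor else 0           # ICP fit
        score += 15 * min(marketplace_intent_signals, 3)   # intent can outweigh size
        return score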

C009 MEDIUM OBSERVED REPEATEDLY 4x efficiency

The Marketplace Analyst (AI agent) monitors live marketplace deployments for our managed services clients -- tracking seller onboarding velocity, GMV trends, catalog health, and commission anomalies. It generates weekly health reports for account managers. It does not modify marketplace configurations or contact sellers directly.

Why: Clients on our managed marketplace services expect proactive issue detection. A marketplace with degrading seller health metrics needs intervention before sellers churn, not after.

Failure mode: The analyst flagged a "GMV decline" that was actually a seasonal pattern (post-holiday normalization). The account manager escalated unnecessarily, alarming the client. We now require 4-week rolling comparisons against same-period prior year before flagging GMV declines.
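
A sketch of the seasonality guard; the 10% year-over-year threshold is an assumption:

    def gmv_decline_confirmed(current_4wk_gmv: float,
                              prior_year_same_4wk_gmv: float,
                              threshold: float = 0.10) -> bool:
        # Compare the rolling 4-week window to the same period last year
        # so seasonal patterns (e.g. post-holiday dips) are not flagged.
        yoy_change = (current_4wk_gmv - prior_year_same_4wk_gmv) / prior_year_same_4wk_gmv
        return yoy_change < -threshold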

C006 MEDIUM INFERENCE 2x efficiency

Protocol Steward owns format spec, merge protocol, and architecture.

Why: Protocol needs a dedicated guardian for consistency.

Failure mode: Format quality drifts. Schema bloats.

C007 MEDIUM INFERENCE 2x efficiency

Market Intelligence owns competitive scanning and content drafting. It cannot send external communications without approval.

Why: Market awareness must be continuous. External comms must be approved.

Failure mode: Competitive threats go undetected, or wrong messages reach prospects.

C008 MEDIUM HUMAN DEFINED RULE 3x efficiency

Revenue Analyst activates only when revenue exists (Phase 3).

Why: Nothing to track until revenue exists.

Failure mode: Premature activation produces meaningless reports.
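
A minimal sketch of phase-gated activation; the lookup table is hypothetical, following the Phase 3 rule above:

    # Agents stay dormant until the business reaches their minimum phase.
    AGENT_MIN_PHASE = {"revenue_analyst": 3}

    def is_active(agent: str, current_phase: int) -> bool:
        return current_phase >= AGENT_MIN_PHASE.get(agent, 1)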

Sneeze It Founding gold
C008 HIGH OBSERVED REPEATEDLY 7x efficiency

Each agent has a written one-line role, a list of what it owns, and an explicit list of what it does NOT own. Authority boundaries are documented, not implied.

Why: Without explicit boundaries, agents drift into overlapping responsibilities. Implicit ownership creates scope conflicts.

Failure mode: Two agents both track project status. Conflicting updates confuse the team. Neither agent knows the other is updating.
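
A sketch of a role spec with explicit non-ownership, plus an overlap check; the structure is an assumption, not Sneeze It's actual format:

    from dataclasses import dataclass, field

    @dataclass
    class RoleSpec:
        name: str
        role: str                                   # one-line role statement
        owns: set = field(default_factory=set)
        does_not_own: set = field(default_factory=set)

    def ownership_conflicts(specs: list) -> set:
        # Any responsibility claimed by two agents is a scope conflict.
        seen, conflicts = set(), set()
        for spec in specs:
            conflicts |= spec.owns & seen
            seen |= spec.owns
        return conflicts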

C009 HIGH MEASURED RESULT 10x efficiency

The Call Center Manager agent manages 3 human employees through daily Slack messages. It reads performance data, drafts coaching messages in the founder's voice, and sends via the founder's Slack account after approval. The humans do not know it is AI.

Why: Data-driven daily coaching at the individual level was not possible with a human manager at this team size. The AI manager processes call stats, identifies patterns, and delivers specific, numbered feedback daily. After 6 days, the former human manager was moved to a caller role -- AI coaching proved more consistent.

Failure mode: If coaching messages sound generic or robotic, human employees disengage. Messages must be varied, specific, human-sounding, and data-backed. Formulaic messages degrade performance within 3 days.

C010 MEDIUM OBSERVED REPEATEDLY 4x efficiency

The Evaluator agent scores system maturity against a published 8-level framework. It identifies the single highest-impact bottleneck and hands it to the Learning agent. The Learning agent implements. The Evaluator re-scores. This loop runs without the founder in the middle.

Why: Self-improvement requires both diagnosis and action. Separating evaluation from implementation prevents self-grading bias. The loop is itself a demonstration of the maturity it measures.

Failure mode: Evaluator diagnoses correctly but the implementer fails to execute. Score stagnates. Or: implementer makes changes the evaluator hasn't requested, creating drift.
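
A sketch of the loop's control flow; the evaluate and implement callables stand in for the Evaluator and Learning agents described above:

    def improvement_loop(evaluate, implement, max_rounds: int = 5):
        # evaluate() -> (score, bottleneck); implement(bottleneck) applies one change.
        for _ in range(max_rounds):
            score, bottleneck = evaluate()   # diagnose; pick the single biggest bottleneck
            implement(bottleneck)            # act only on what the evaluator requested
            new_score, _ = evaluate()        # re-score after the change
            if new_score <= score:
                break                        # stagnation: surface to the founder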

Sneeze It Digital Agency Founding platinum
C008 HIGH OBSERVED REPEATEDLY 7x efficiency

Each agent has documented role, ownership, and explicit non-ownership boundaries.

Why: Without boundaries, agents drift into overlapping responsibilities.

Failure mode: Two agents both track project status with conflicting updates.

C009 HIGH MEASURED RESULT 10x efficiency

AI Call Center Manager manages 3 humans via daily data-driven Slack coaching.

Why: Data-driven daily coaching was not possible with a human manager at this scale.

Failure mode: If messages sound generic, human employees disengage within 3 days.

C010 MEDIUM OBSERVED REPEATEDLY 4x efficiency

Evaluator agent scores maturity. Learning agent implements. Evaluator re-scores.

Why: Self-improvement requires separated diagnosis and action.

Failure mode: Evaluator diagnoses correctly but implementer fails. Score stagnates.