Meridian Growth Partners
core operating rules
Every agent writes to exactly one shared state file. No agent reads a data source that another agent owns.
Why: When our Google Ads monitor and our weekly report agent both pulled from the Google Ads API independently, they reported different numbers for the same client on the same day. The founder sent the weekly report to the client. The client compared it to the alert they received 3 hours earlier. The numbers didn't match. It took 45 minutes to explain that the discrepancy was a timezone difference between the two API calls.
Failure mode: Two agents query the same API at different times, get different snapshots. Client-facing numbers contradict each other. Trust erodes.
Scope: All agents.
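The single-writer rule can be enforced mechanically rather than by convention. A minimal sketch in Python, assuming a hypothetical ownership map; the agent and file names are illustrative, not Meridian's actual ones:

```python
import json
from pathlib import Path

# Hypothetical ownership map: exactly one writing agent per shared state file.
STATE_OWNERS = {
    "google_ads_state.json": "google_ads_monitor",
    "meta_ads_state.json": "meta_ads_monitor",
}

def write_state(agent_name, state_file, payload):
    """Refuse the write unless this agent owns the state file."""
    owner = STATE_OWNERS.get(state_file)
    if owner != agent_name:
        raise PermissionError(
            f"{agent_name} does not own {state_file} (owner: {owner})"
        )
    Path(state_file).write_text(json.dumps(payload))
```

Any agent that tries to write outside its lane fails loudly at write time instead of silently producing a second, conflicting snapshot.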
Alert thresholds must use dollar amounts for accounts under $5K/mo and percentage changes for accounts over $5K/mo.
Why: A 20% spend spike on a $2,500/mo account is $500 -- annoying but manageable. A 20% spike on a $28K/mo account is $5,600 -- that's a budget-breaking emergency. Meanwhile, a $300 daily overspend on the $28K account is noise (1% variance), but the same $300 on the $2,500 account is 12% of their monthly budget gone in one day. We spent two weeks calibrating after the initial rollout generated 47 alerts in a single day, only 3 of which required action.
Failure mode: Flat percentage thresholds flood the team with low-dollar alerts on small accounts while missing meaningful dollar movements on large accounts. Alert fatigue sets in within 72 hours.
Scope: Ad monitoring agents.
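A single threshold function keeps the dollar/percentage split in one place instead of scattered across agents. A sketch; the $250 dollar threshold and 10%-of-monthly-budget threshold are illustrative placeholders, not Meridian's calibrated values:

```python
def spend_alert(monthly_budget, daily_overspend):
    """Dollar threshold under $5K/mo, percentage threshold above it.

    The $250 and 10% cutoffs are illustrative, not calibrated.
    """
    if monthly_budget < 5_000:
        return daily_overspend >= 250                  # absolute dollars
    return daily_overspend / monthly_budget >= 0.10    # share of monthly budget
```

This reproduces the examples above: a $300 overspend fires on the $2,500 account but is ignored as ~1% noise on the $28K account, while a $5,600 spike on the $28K account fires.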
agent roles and authority
The ad monitoring agent reports anomalies. It does not diagnose causes or recommend actions.
Why: When the monitor started including "likely cause: audience fatigue" in its alerts, the account manager skipped her own analysis and forwarded the agent's guess to the client. The actual cause was a Meta algorithm change affecting all lead gen campaigns that week. The agent's guess was plausible but wrong. The client made creative changes they didn't need.
Failure mode: Agent speculation is treated as diagnosis. Humans skip their own investigation. Clients act on wrong information.
Scope: Ad monitoring agents, reporting agents.
Each agent has a named human owner who reviews its output weekly and owns its calibration.
Why: During our rapid scaling from 2 to 8 agents, agents 5 through 8 had no clear human reviewer. The pipeline tracking agent miscategorized "proposal viewed" as "deal progressing" for 3 weeks. Nobody caught it because nobody was looking. $42K in pipeline was listed as "warm" when it was actually stale proposals auto-opened by email preview panes.
Failure mode: Unreviewed agents drift. Miscalibrations compound silently. Bad data enters decision-making.
Scope: All agents.
coordination patterns
The briefing agent reads shared state files. It never calls APIs, searches inboxes, or queries databases directly.
Why: Briefing compile time went from 90 seconds to 11 minutes when the briefing agent was making live API calls to 4 different services. One Monday morning, the Meta Ads API was slow and the briefing timed out entirely. The founder started the week with no briefing and spent 40 minutes manually checking dashboards.
Failure mode: Briefing reliability degrades as data source count grows. One slow API blocks the entire morning.
Scope: Briefing/orchestration agents.
When two agents need to reference the same entity (client, deal, campaign), they use a canonical ID from the CRM, not the display name.
Why: "Precision Plumbing" in Google Ads, "Precision Plumbing LLC" in the CRM, and "Precision" in Slack. The reporting agent matched on display name and merged two different clients named "Precision" -- Precision Plumbing and Precision Auto. The weekly report showed Precision Plumbing's spend as $14,200 (actual: $6,800 plumbing + $7,400 auto).
Failure mode: Name-based matching produces false merges. Reports show combined data for unrelated clients.
Scope: All agents that reference client data.
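One way to enforce canonical-ID matching is to route every display-name lookup through an explicit mapping that refuses ambiguous names rather than guessing. The mapping entries and CRM IDs below are hypothetical:

```python
# Hypothetical mapping from (source, display name) to the canonical CRM ID.
PLATFORM_TO_CRM = {
    ("google_ads", "Precision Plumbing"): "crm_1042",
    ("crm", "Precision Plumbing LLC"): "crm_1042",
    ("crm", "Precision Auto"): "crm_1077",
    ("slack", "Precision"): None,  # ambiguous -- must never auto-match
}

def canonical_id(source, display_name):
    """Resolve a display name to its CRM ID; refuse ambiguous names."""
    cid = PLATFORM_TO_CRM.get((source, display_name))
    if cid is None:
        raise LookupError(
            f"No unambiguous CRM ID for {display_name!r} from {source}"
        )
    return cid
```

"Precision" from Slack raises instead of silently merging two clients.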
operational heuristics
New agents run in shadow mode for 14 days before their output reaches anyone outside the team.
Why: Our auto-send experiment ended on day 3 of deploying the client update agent. It sent a "your CPL increased 34% this week" email to 12 clients at 6:47 AM on a Saturday. Three clients called within the hour. The CPL increase was real, but it was within normal weekly variance, and the email lacked context about seasonality. We turned off auto-send and haven't re-enabled it.
Failure mode: Agents send raw metrics without context to clients. Clients panic. Founder spends Saturday morning on damage control calls.
Scope: Any agent with external communication capability.
Stale data flags must include hours-since-update, not just "stale."
Why: "Stale" means nothing. Is it 2 hours old or 2 days old? The briefing said "Dash data is stale" for our ad monitoring output. The founder assumed it was a few hours old and made decisions accordingly. It was actually 3 days old because the Meta API token had expired Friday evening and nobody noticed until Tuesday morning.
Failure mode: Ambiguous staleness labels lead to decisions based on data that's far older than assumed.
Scope: All agents that read shared state files.
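The flag itself is a one-liner once the rule is stated. A sketch, assuming each shared state file records a last-updated timestamp:

```python
from datetime import datetime, timezone

def staleness_label(last_updated, now=None):
    """Emit hours-since-update instead of a bare 'stale' flag."""
    now = now or datetime.now(timezone.utc)
    hours = (now - last_updated).total_seconds() / 3600
    return f"stale ({hours:.1f}h since last update)"
```

"stale (72.0h since last update)" would have told the founder immediately that this was a Friday-to-Tuesday gap, not a few hours.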
failure patterns
API token expiration must be monitored with a dedicated health check, not discovered when an agent fails.
Why: We lost 3 days of Meta Ads data because the token expired over a weekend. No agent checks for "am I authenticated?" before attempting work -- they just fail silently and write nothing to their shared state file. The briefing agent saw an empty file and reported "no alerts" instead of "data unavailable."
Failure mode: Silent authentication failure looks like "everything is fine" instead of "system is blind."
Scope: All agents using authenticated APIs.
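A dedicated health check can be as simple as a scheduled probe that always writes an explicit success-or-failure record to shared state, so downstream agents see "auth failed" rather than an empty file. A sketch with the probe injected as a callable; the state keys are illustrative:

```python
from datetime import datetime, timezone

def check_auth(probe, state):
    """Run an auth probe and write an explicit result to shared state.

    `probe` is any zero-argument callable returning True when the API
    accepts the token (e.g. a cheap authenticated account lookup).
    Writing the failure explicitly is the point: downstream agents see
    "failed", never an empty file that looks like "no alerts".
    """
    now = datetime.now(timezone.utc).isoformat()
    if probe():
        state.update({"auth": "ok", "last_success": now})
    else:
        state.update({"auth": "failed", "failed_at": now})
    return state
```

Run it on a schedule independent of the agents' own jobs, so an expired weekend token is caught Saturday morning, not Tuesday.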
When an agent produces zero output for a data source that always has data, treat it as a system failure, not a clean bill of health.
Why: See C009. The briefing interpreted "no Meta alerts" as "Meta is healthy" when in reality the monitoring agent couldn't authenticate. For 3 days, the team believed Meta campaigns were running perfectly while CPL on two accounts had doubled.
Failure mode: Zero-output is misread as zero-problems. The absence of data is treated as the absence of issues.
Scope: All orchestration agents.
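On the briefing side, the rule amounts to refusing to read an empty alert list as "all clear" unless the monitor affirmatively reported a successful run. A sketch, assuming a hypothetical `last_run_status` field in the shared state:

```python
def interpret(state):
    """Briefing-side reading of a monitor's shared state file."""
    if state.get("last_run_status") != "ok":
        return "DATA UNAVAILABLE -- monitoring agent did not complete"
    if not state.get("alerts"):
        return "no alerts (source confirmed healthy)"
    return f"{len(state['alerts'])} alert(s) pending review"
```

An empty or missing file now reads as "DATA UNAVAILABLE", not "no alerts".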
human ai boundary conditions
No agent may send any communication to a client -- email, Slack, SMS -- without explicit human approval for the first 30 days.
Why: The Saturday morning auto-send incident (see C007). Twelve clients received decontextualized performance alerts. One client escalated to their CMO, who called our founder to "discuss the relationship." The client didn't churn, but the founder spent 4 hours that weekend in damage control. The agent was technically correct. The communication was still harmful.
Failure mode: Technically accurate but contextually tone-deaf messages damage client relationships.
Scope: All client-facing agents.
The founder's override is final. If the founder says "ignore this alert," the agent marks it suppressed with a reason and timestamp, and does not resurface it for 7 days.
Why: The ad monitor flagged a $200/day overspend on a client running a 2-week promotional push. The founder dismissed it. The agent re-flagged it the next day. And the next. For 8 consecutive days. The founder started ignoring all alerts from that agent. He missed a real $1,800/day overspend on a different client 3 days later because he'd trained himself to skip the alerts.
Failure mode: Persistent re-alerting on dismissed items causes the human to tune out the entire alert channel. Real emergencies get buried.
Scope: All alerting agents.
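The suppression logic is small once the 7-day window is fixed. A sketch, with `when`/`now` parameters exposed so the window is testable; key names are illustrative:

```python
from datetime import datetime, timedelta, timezone

class AlertSuppressor:
    """Honor a human dismissal for 7 days before resurfacing an alert."""

    WINDOW = timedelta(days=7)

    def __init__(self):
        self.suppressed = {}  # alert_key -> (reason, dismissed_at)

    def dismiss(self, alert_key, reason, when=None):
        self.suppressed[alert_key] = (reason, when or datetime.now(timezone.utc))

    def should_fire(self, alert_key, now=None):
        entry = self.suppressed.get(alert_key)
        if entry is None:
            return True
        now = now or datetime.now(timezone.utc)
        return now - entry[1] >= self.WINDOW
```

The reason string matters: when the alert does resurface after 7 days, it can carry "dismissed: client is mid promo push" so the human remembers why.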
operational heuristics
Weekly agent review must include a false positive rate for each alerting agent. Target: below 15%.
Why: After the 47-alerts-in-one-day incident (see C002), we started tracking false positive rates. Our Google Ads monitor was at 62% false positives -- nearly two-thirds of its alerts required no action. We recalibrated thresholds and got it to 11% over the next two weeks. The Meta monitor was already at 8%. Without tracking the rate, we wouldn't have known which agent needed calibration.
Failure mode: Without measurement, alert quality degrades silently. Teams compensate by ignoring alerts rather than fixing thresholds.
Scope: All alerting agents.
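Tracking the rate requires only that each alert gets an actionable/not-actionable verdict during the weekly review. A sketch against the 15% target; the field name is illustrative:

```python
def false_positive_rate(alerts):
    """Share of alerts marked not actionable during weekly review."""
    if not alerts:
        return 0.0
    return sum(1 for a in alerts if not a["actionable"]) / len(alerts)

def needs_recalibration(alerts, target=0.15):
    return false_positive_rate(alerts) > target
```

The 47-alert day (3 actionable) scores roughly 94% and fails the target by a wide margin; an agent at 8-11% passes.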
coordination patterns
Agent-to-agent communication uses structured message types (REQUEST, INFORM, ALERT). Free-text messages between agents are prohibited.
Why: The reporting agent sent the briefing agent a free-text note: "Check Precision -- numbers look off." The briefing agent interpreted "off" as "offline" and reported the Precision Plumbing account as disconnected. The actual meaning was "the numbers look unusual." The account manager called Google support to investigate a non-existent connection issue.
Failure mode: Ambiguous natural language between agents causes misinterpretation. Downstream actions are based on the wrong reading.
Scope: All agents with inter-agent communication.
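The three message types can be enforced at parse time so free text fails loudly instead of being misread. A sketch using an enum; the field names are illustrative:

```python
from dataclasses import dataclass
from enum import Enum

class MsgType(Enum):
    REQUEST = "REQUEST"
    INFORM = "INFORM"
    ALERT = "ALERT"

@dataclass
class AgentMessage:
    msg_type: MsgType
    sender: str
    subject_id: str   # canonical CRM ID, never a display name
    payload: dict

def parse_message(raw):
    """Reject anything that is not one of the three structured types."""
    return AgentMessage(
        msg_type=MsgType(raw["msg_type"]),  # raises ValueError on free text
        sender=raw["sender"],
        subject_id=raw["subject_id"],
        payload=raw.get("payload", {}),
    )
```

A note like "Check Precision -- numbers look off" never reaches the briefing agent; the sender is forced to restate it as, say, an ALERT with a structured payload.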
core operating rules
Every client-facing number must include the data timestamp and the date range it covers.
Why: The weekly report showed "CPL: $47" for a fitness client. The client's internal dashboard showed $52. The discrepancy: our report used Monday-Sunday, their dashboard used Sunday-Saturday. Neither was wrong, but the client spent an hour trying to reconcile. Now every metric includes "Mon Mar 2 - Sun Mar 8, data pulled at 6:12 AM EST."
Failure mode: Undated metrics create reconciliation conflicts with client-side reporting tools. Trust erodes even when both numbers are correct.
Scope: All reporting agents.
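A single formatting helper makes it hard to emit an undated metric. A sketch using standard strftime codes (note %d zero-pads, so "Mar 02" rather than "Mar 2"):

```python
from datetime import date, datetime

def format_metric(name, value, start, end, pulled_at, tz):
    """Attach the covered date range and pull time to every metric."""
    day = "%a %b %d"
    return (
        f"{name}: {value} ({start.strftime(day)} - {end.strftime(day)}, "
        f"data pulled at {pulled_at.strftime('%I:%M %p')} {tz})"
    )
```

The client-side reconciliation above disappears when every number states its own Monday-Sunday window.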
agent roles and authority
The pipeline agent tracks deal stage transitions. It does not predict close probability or recommend next actions.
Why: When we added "recommended next step" to the pipeline agent's output, it suggested "send a follow-up proposal" for a deal where the prospect had explicitly asked for two weeks to think. The account manager sent the follow-up. The prospect replied: "I said two weeks." The deal still closed, but the relationship started on the wrong foot.
Failure mode: Agent recommendations override human judgment about relationship timing. Prospects feel pressured by automation-driven cadence.
Scope: Pipeline and CRM agents.
failure patterns
When scaling agent count, add one agent at a time with a 2-week stabilization window between additions.
Why: We added agents 5, 6, 7, and 8 in the same week. Within 3 days, agents 6 and 7 had overlapping responsibilities that nobody caught during design. Both were monitoring Slack for client mentions -- one for the briefing, one for escalation alerts. Client mentions were being processed twice, once surfacing as "mention in briefing" and once as "possible escalation." The founder saw the same client name in two different sections and assumed two separate issues existed.
Failure mode: Rapid parallel deployment masks responsibility overlap. Debugging which agent owns what becomes exponentially harder with each simultaneous addition.
Scope: Agent deployment and scaling.
human ai boundary conditions
Budget change recommendations require human approval regardless of agent confidence. No agent may increase or decrease ad spend autonomously.
Why: Non-negotiable from day one. The founder watched another agency's automation tool increase a client's daily budget from $150 to $450 during a conversion spike that turned out to be bot traffic. The client was billed $2,700 in wasted spend before a human caught it. Meridian's agents flag pacing anomalies. Humans decide the response.
Failure mode: Autonomous budget changes amplify bad signals. Bot traffic, attribution glitches, and seasonal spikes all look like "good performance" to an algorithm.
Scope: All ad management agents.