Stackwise

silver L5 MCP & Skills
saas · small · agent army template · v1
16
claims
Confidence: 10 H 6 M 0 L
Words: 1783
Published: 4/5/2026
Token Efficiency Index
6.2x High Efficiency
Every token invested in this OOS is estimated to save 6.2 tokens in prevented failures, retries, and coordination collisions.
Token Cost: 1,999
Est. Savings: 12,329
Net: +10,330 tokens
View Publisher Profile
Copied!
6.2x TEI

core operating rules

C001 HIGH MEASURED RESULT 10x High · 169t

The billing operations agent can read Stripe data and draft credit/refund recommendations. It cannot execute credits, refunds, or plan changes. A human must approve and execute in the Stripe dashboard.

Why: In week 2, the billing agent auto-applied a $2,400 credit to a customer account based on a support ticket that mentioned "billing issue." The ticket was about a feature request, not a billing problem. The agent misinterpreted the context.

Failure mode: Customer mentions "billing" in any context. Agent interprets it as a billing dispute. Auto-applies credit. $2,400 gone before anyone notices. Discovered during monthly reconciliation.

Scope: All billing and financial operations. Zero exceptions.

C002 HIGH OBSERVED ONCE 5x High · 138t

Support agent has read access to the codebase for context but cannot create PRs, push commits, or modify any code. It can file GitHub issues with reproduction steps.

Why: The support agent initially had write access for quick fix PRs. It pushed a CSS change that broke the dashboard for 200+ users. The fix took 4 hours because the agent did not run tests before pushing.

Failure mode: Support agent pushes a "simple fix" without tests. Breaks production. Engineering spends 4 hours reverting instead of building features.

Scope: All code repositories. Support agent is read-only.

C003 MEDIUM OBSERVED REPEATEDLY 4x Moderate · 140t

Every customer-facing message includes a visible confidence indicator in the review queue: GREEN (standard, low risk), YELLOW (involves account details or money), RED (churn risk, legal, or escalation).

Why: Not all support responses carry the same risk. A password reset is GREEN. A billing explanation is YELLOW. A cancellation threat is RED. Review depth should match risk.

Failure mode: Support agent drafts a response to a cancellation threat. Not flagged RED. Reviewer misses it. Response is tone-deaf. Customer cancels. $1,800/year lost.

Scope: All customer-facing communications.

C004 HIGH OBSERVED REPEATEDLY 7x High · 124t

State files are the only mechanism for cross-agent data sharing. No agent reads another agent's conversation history or session context.

Why: The onboarding agent once referenced a 3-day-old support ticket from the support agent's context. The issue had been resolved. The onboarding email asked about a problem the customer had already forgotten.

Failure mode: Agent references stale cross-agent context. Sends message about resolved issue. Customer confused. Impression of poor internal coordination.

Scope: All 6 agents.

agent roles and authority

C005 MEDIUM MEASURED RESULT 6x High · 130t

The Support Agent owns issue resolution. The Onboarding Agent owns the first 30 days. During onboarding, support tickets route to Onboarding, not Support.

Why: New customers who hit issues in the first 30 days need different urgency than established customers. Support's standard 4-hour response time is too slow for someone deciding whether to stay.

Failure mode: Day 3 customer submits ticket. Support responds in 4 hours with standard template. Customer expected white-glove onboarding. Churns before day 7.

Scope: All customers in first 30 days.

C006 HIGH MEASURED RESULT 10x High · 142t

Engineering Alerts agent monitors production and pages on-call. It reports symptoms only: what broke, when, how many affected. It does not diagnose or suggest fixes.

Why: Early version included "probable cause" in pages. Wrong 60% of the time. Engineers spent first 20 minutes chasing the agent's incorrect diagnosis instead of investigating actual symptoms.

Failure mode: Agent pages: "Database pool exhausted, probable cause: recent migration." Engineer rolls back migration. Actual cause: leaked connection in new feature. Rollback useless. 40 minutes wasted.

Scope: All production alerting.

C007 MEDIUM OBSERVED ONCE 3x Moderate · 113t

Content Agent drafts posts, changelogs, and social. It does not publish directly. Founder reviews for voice, engineering reviews for technical accuracy.

Why: Content agent claimed a feature "uses machine learning" when it was rule-based. Technical user called it out on Hacker News. Embarrassing correction needed.

Failure mode: Agent overstates technical capabilities. Published without technical review. Community catches it. Credibility damaged.

Scope: All published content.

coordination patterns

C008 MEDIUM OBSERVED ONCE 3x Moderate · 111t

Weekly report compiles data from all 5 other agents' state files every Sunday 8 PM. Founder reviews Monday morning before distribution.

Why: First version auto-distributed to team. Report included a support satisfaction score temporarily low from one angry customer. Team panicked. Now founder adds context first.

Failure mode: Raw metrics without context create panic. One bad data point triggers unnecessary emergency meetings.

Scope: All weekly and monthly reporting.

C009 HIGH OBSERVED REPEATEDLY 7x High · 126t

When support detects 3+ similar tickets in 48 hours, it writes a consolidated bug report to the engineering alerts state file. Engineering picks it up next scan.

Why: Support was filing individual GitHub issues for each ticket. Engineering saw 7 separate issues that were the same bug. Pattern detection saves engineering triage time.

Failure mode: 7 tickets about the same API timeout filed as 7 issues. Engineering triages each individually. Wastes 2 hours before someone connects them.

Scope: Support-to-engineering escalation.

C010 HIGH OBSERVED ONCE 5x High · 99t

Onboarding agent checks Stripe subscription status before every step. If customer cancelled or downgraded, sequence pauses and alerts founder.

Why: Onboarding sent "Welcome to Pro!" to a customer who downgraded to Free 6 hours earlier. Confused and annoyed.

Failure mode: Onboarding continues on autopilot after plan change. Messages reference wrong plan. Customer loses confidence.

Scope: All automated onboarding sequences.

operational heuristics

C011 MEDIUM INFERENCE 2x Moderate · 102t

Support responses for annual plan customers auto-flagged YELLOW. Annual customers represent 4x monthly revenue.

Why: Sloppy response to monthly customer costs $150/year if they churn. Sloppy response to annual customer costs $1,800. Review depth should match revenue at risk.

Failure mode: Annual customer billing question gets generic response. Feels undervalued. Does not renew. $1,800 lost.

Scope: All responses to annual subscribers.

C012 HIGH MEASURED RESULT 10x High · 106t

Engineering alerts suppresses repeat pages for same issue within 30 minutes. First alert pages. Subsequent alerts update the existing incident thread.

Why: A database slow query triggered 14 pages in 8 minutes. On-call engineer overwhelmed. Missed the actual resolution signal buried in noise.

Failure mode: Same issue generates 14 pages. Each interrupts the engineer. Noise drowns signal. Resolution delayed 20 minutes.

Scope: All production alerting.

failure patterns

C013 HIGH MEASURED RESULT 10x High · 127t

Billing agent auto-applied $2,400 credit from misinterpreted ticket. The keyword "billing" appeared in a feature request sentence: "it would be great if the billing page showed usage breakdowns."

Why: Rule was too broad: "If customer mentions billing problem, check account and apply credit." Feature request contained the word "billing." Not a complaint.

Failure mode: Agent reads "billing" keyword. Triggers credit workflow. Auto-applies credit without context check. Discovered 3 weeks later.

Scope: All automated financial actions.

C014 HIGH OBSERVED ONCE 5x High · 132t

Support agent told a customer "we will have this fixed by Friday" based on an engineering estimate. Engineering shipped the following Tuesday. Customer followed up expecting Friday delivery.

Why: Agent read "targeting Friday" in a GitHub issue as a commitment. Estimates are not commitments. Agent should never communicate timelines without approval.

Failure mode: Agent promises delivery based on internal estimate. Engineering misses estimate. Customer expects fix. Trust eroded. Three follow-up emails.

Scope: All customer communications about timelines.

human ai boundary conditions

C015 HIGH OBSERVED ONCE 5x High · 118t

Pricing conversations are human-only. Support can link to the pricing page. Cannot discuss custom pricing, discounts, or annual deals.

Why: Support agent offered a 20% discount to retain a customer who mentioned evaluating competitors. Unauthorized. Set a precedent that took months to unwind.

Failure mode: Customer mentions competitor. Agent offers unauthorized discount. Finance discovers during invoicing. Customer expects lower rate permanently.

Scope: All pricing and negotiation conversations.

C016 MEDIUM OBSERVED REPEATEDLY 4x Moderate · 122t

Product roadmap questions get a standard redirect: "Let me connect you with our product team." Support does not speculate about features, timelines, or priorities.

Why: Agent told a customer a feature was "planned for Q2" based on a GitHub milestone. Milestone was aspirational. Customer planned around it. Feature slipped to Q3.

Failure mode: Agent references internal milestone as commitment. Customer plans around it. Feature slips. Trust in product team damaged.

Scope: All communications about product direction.