Join OTP the operating platform for people and AI agents
Back to Blog
Founder Notes 2026-05-22 · David Steel

The Scorecard without the lag, AI agents that push KPIs live

The EOS® Scorecard is the company's weekly heartbeat. Five to fifteen numbers. One owner per row. Read top to bottom in five minutes at every L10® meeting. Off-track numbers go to the Issues List for IDS.

The Scorecard is also the part of EOS® that most often degrades quietly.

It degrades not because the team stops believing in it. It degrades because someone forgets to update last week's number. Someone updates it wrong. Someone uses a slightly different definition than the row owner intended. The Scorecard becomes a hand-entered guess for the third row in a meeting where the first two were precise. Trust slips. The Scorecard becomes a chore.

This is the exact problem agentic AI was built to remove.

The shift, from "owner enters the number" to "agent pushes the number"

Classic EOS® says each Scorecard line has a human owner who is responsible for the number. That stays true. What changes is the mechanic.

Old mechanic: the owner reads the number out of a system (CRM, finance tool, ad platform, call tracker), opens the Scorecard, types the number in. Weekly. By Monday morning at the latest. Without fail, in theory.

New mechanic: the owner defines the number. Specifically. "Qualified Sales Calls" is defined as a discovery call held with a prospect whose CRM record shows a budget over X and an industry on the ICP list. Once that definition is locked, an agent reads the upstream system, applies the definition, and pushes the number into the Scorecard each Monday morning.

The owner of the line is still the owner. The owner reviews the agent's number, confirms or contests, and answers for it in the L10®. The owner's job changed from data entry to data ownership.

This is the more important word change inside EOS® than most teams realize. Owners do not type. Owners own.

The definition is now the artifact

The hardest part of this transition is not the technology. The technology is straightforward. The hardest part is writing down what each Scorecard line actually means.

"Recurring monthly revenue closed in the last 30 days." Does that include upsells or only new logos? Net of cancellations or gross? Counted on contract signature or on first payment? Currency conversions handled how?

Without an agent, these questions can be fuzzy because the human owner makes a judgment call each week. With an agent, the agent will apply the same definition every week. If the definition is wrong, the number is wrong consistently. If the definition is right, the number is right consistently.

A clear definition is more valuable than a fancy model. Spend the time writing the definition. Most EOS® companies have never done this exercise rigorously, and the exercise itself improves the Scorecard whether or not the agent ever runs.

What this looks like in practice

A small services company runs eight Scorecard rows.

The Visionary owns Qualified Sales Calls. An agent reads the CRM each Monday, applies the qualified-call definition, returns a number with a list of the calls counted. The Visionary spot-checks five and signs off in two minutes. Pushed.

The Integrator owns RMR closed last 30 days. An agent reads the billing system, applies the definition, returns the number with a list of the deals included. The Integrator confirms. Pushed.

The Marketing Manager owns Lead-to-Customer Conversion %. The number takes two upstream sources, leads from the CRM and customers from the billing system. The agent joins them, applies the definition, and returns a percentage with the underlying counts. The manager confirms. Pushed.

By 10 a.m. Monday the Scorecard is fully populated. The L10® on Tuesday opens with the Scorecard already live and reviewed. The first five minutes of the meeting are spent on a clean read, not on chasing missing numbers.

When the agent should escalate, not push

Two scenarios make the agent stop and ask a human first.

One, a definition gap. If the agent encounters data it cannot classify (a deal that does not fit a clear category, a lead that does not match the expected source taxonomy), it should flag the row to the owner and not push a guessed number. Better to leave a row blank with a note than to push a guess that gets used in IDS.

Two, a sudden anomaly. If the number is more than two standard deviations from the trailing eight-week average, the agent should push the number but also surface the anomaly to the owner before the L10®. The owner walks in already knowing the swing, with the underlying data ready.

Both rules are simple to write into the agent's SOP. Both rules add trust to the Scorecard.

Which AI to use

For Scorecard work, the model itself matters less than the platform's ability to read upstream systems and run on a schedule.

Anthropic's Claude Code is a fit because it runs in the environment where your operations data already lives. ChatGPT with Custom GPTs and tools is also a fit. OpenAI Assistants API likewise. The criteria are read access to the source systems, scheduled execution, and reliable output formatting.

We use Claude Code at Sneeze It for our internal Scorecard agents. The choice was practical, the same agents can read any local file, hit any API, and run on cron without us building a hosting stack around them.

What this does not solve

A bad Scorecard is still a bad Scorecard, no matter who pushes the numbers.

If your Scorecard has rows that are not actually predictive of the business outcomes you care about, automation will not help. Bring in an EOS® Implementer® or do a Scorecard audit yourself first. The agentic upgrade is for a Scorecard that is right. Do not automate a Scorecard that is wrong.

FAQ

Will agents replace Scorecard owners? No. Owners still own. Owners spend less time typing and more time interpreting.

Can agents add new rows to the Scorecard? No. Adding a row is a leadership decision. The Visionary, Integrator, and team agree on the row at a quarterly or in IDS.

What about Scorecard rows that are inherently judgment calls? Some rows are human-only (employee sentiment, qualitative client health). Keep those as human-entered. Use agents for everything that has a defensible data source.

Does this make the L10® shorter? Yes, by about five minutes on Scorecard. That time goes to IDS.

EOS®, Entrepreneurial Operating System®, Scorecard, Level 10 Meeting®, L10®, IDS, Issues List, V/TO™, Rocks™, and EOS® Implementer® are concepts and trademarks of EOS Worldwide, LLC. This article is an independent practitioner perspective and is not affiliated with or endorsed by EOS Worldwide.

DS
David Steel

Founder of OTP. Runs an AI agent army at a digital agency. Building OTP because nobody else seems to be building it. Notes from inside the build, not from the conference circuit.

More about David →

More posts on the blog index.

All posts