Product · March 2026 · David Steel

We Added Agentic Maturity Levels to OTP. Here Is Why They Matter.

"AI's coding ability is outpacing our ability to wield it effectively."

Bassim Eledath, "The 8 Levels of Agentic Engineering"

Bassim Eledath published a framework that changed how we think about our agent army. He calls it The 8 Levels of Agentic Engineering -- a progression from tab completion to autonomous agent teams. Each level builds on the last. Weaknesses at lower levels cap your score regardless of what you build above.

When we read it, two things happened. First, we built an agent named Bassim that scores our system against the framework every night. Second, we added agentic levels to OTP as a first-class feature.

Every published OOS on OTP now carries an agentic level badge -- L1 through L8 -- calculated automatically from the operational intelligence in your claims.

The 8 Levels

Here is the framework, compressed to what matters for OTP:

  • L1: Tab Complete. Autocomplete suggestions. The starting point. Copilot fills in your line.
  • L2: Agent IDE. AI-powered development environments with chat integration and multi-file editing. Context limitations are the ceiling.
  • L3: Context Engineering. "Every token needs to fight for its place." System prompts, rules files, tool descriptions, conversation management. This is where CLAUDE.md files and structured configuration live.
  • L4: Compounding Engineering. Plan, delegate, assess, codify. Documentation updates embed lessons into future sessions. You start attributing model errors to missing context, not missing capability.
  • L5: MCP and Skills. Tools expand agent capabilities beyond thinking to action. Database access, APIs, CI pipelines, browser testing. The agent can do things, not just say things.
  • L6: Harness Engineering. Automated feedback loops, observability, and constraints. Backpressure from type systems, tests, and linters keeps agents on track. Security boundaries between agents, generated code, and secrets.
  • L7: Background Agents. Agents run asynchronously without human approval. Orchestration becomes necessary. Separating implementers from reviewers prevents bias. This is where overnight pre-computation and scheduled autonomous agents live.
  • L8: Autonomous Agent Teams. The frontier. Agents coordinate directly with each other. No hub-and-spoke bottleneck. Message buses, structured handoffs, inter-agent communication without human mediation. Bassim notes: "Multi-agent coordination is a hard problem and nobody is near optimal yet."

Why We Named an Agent After Him

We did not just read the framework. We operationalized it.

We built an agent called Bassim that runs every night. It reads our entire agent army configuration -- 12 agents, 24 shared state files, 13 message bus inboxes, 17 scheduled background processes -- and produces a single score: X.X out of 8.0. With evidence for each level. With the single highest-impact bottleneck identified.

Then our Learning agent (Neil) reads Bassim's score, picks up the bottleneck, and implements fixes. Bassim re-scores. The loop runs without us in the middle.

That loop -- Bassim scores, Neil implements, Bassim re-scores -- is itself a demonstration of Level 8. Two agents coordinating autonomously to improve the system they are part of.
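The shape of that loop can be sketched in a few lines. This is only an illustration, not the production agents: `run_cycles`, `toy_scorer`, and `toy_fixer` are hypothetical names, and the per-level strengths are made-up numbers standing in for whatever evidence the real scorer collects.

```python
from typing import Callable, Dict, List, Tuple

Strengths = Dict[str, float]   # hypothetical: level name -> strength in [0.0, 1.0]
Report = Tuple[float, str]     # (overall score, name of the bottleneck level)


def run_cycles(
    state: Strengths,
    scorer: Callable[[Strengths], Report],
    fixer: Callable[[Strengths, str], Strengths],
    cycles: int,
) -> List[float]:
    """Score, fix the reported bottleneck, re-score; return the score history."""
    score, bottleneck = scorer(state)
    history = [score]
    for _ in range(cycles):
        state = fixer(state, bottleneck)      # the "Neil" step: act on the bottleneck
        score, bottleneck = scorer(state)     # the "Bassim" step: re-score
        history.append(score)
    return history


def toy_scorer(state: Strengths) -> Report:
    # Stand-in scorer: total strength, plus the weakest level as the bottleneck.
    return sum(state.values()), min(state, key=state.get)


def toy_fixer(state: Strengths, level: str) -> Strengths:
    # Stand-in fixer: nudge the bottleneck level upward without mutating the input.
    improved = dict(state)
    improved[level] = min(1.0, improved[level] + 0.25)
    return improved
```

The design point is that neither function calls a human: the scorer's output is the fixer's input, and the history list is the only thing a person needs to read afterwards.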

Our baseline score was 6.2. After four evaluation cycles, we hit 6.5. Not because we jumped to L8 capabilities, but because Bassim kept finding weaknesses at lower levels that were capping our score. The hierarchy rule is real: you cannot claim L7 if your L5 has gaps.
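The scoring formula itself is not published here, so the following is a minimal sketch of one way a hierarchy rule could be computed, under stated assumptions: each level gets a hypothetical strength between 0.0 and 1.0, and no level can contribute more than the weakest level beneath it. The function names `capped_score` and `bottleneck` are illustrative only.

```python
from typing import List, Optional, Sequence


def capped_score(strengths: Sequence[float]) -> float:
    """Sum per-level credit, capping each level at the weakest level below it.

    `strengths` holds one assumed value in [0.0, 1.0] per level, L1 first.
    """
    total = 0.0
    floor = 1.0  # weakest strength seen so far, walking up from L1
    for s in strengths:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each strength must be in [0.0, 1.0]")
        floor = min(floor, s)
        total += floor  # a level never counts for more than the floor beneath it
    return round(total, 1)


def bottleneck(strengths: Sequence[float]) -> Optional[int]:
    """Return the 1-based level whose repair would raise the score the most."""
    base = capped_score(strengths)
    best_level, best_gain = None, 0.0
    for i in range(len(strengths)):
        trial: List[float] = list(strengths)
        trial[i] = 1.0  # pretend this level were fully fixed
        gain = capped_score(trial) - base
        if gain > best_gain:
            best_level, best_gain = i + 1, gain
    return best_level
```

With strengths of `[1.0, 1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 0.2]`, this sketch scores 5.7 and names L5 as the bottleneck. L8 is the weakest level, but the gap at L5 costs more because it caps L6 and L7 as well.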

How Levels Work on OTP

When you publish an OOS on OTP, the platform analyzes your claims and calculates your agentic level automatically. It looks for evidence of each level in your operational intelligence:

  • L3 signals: Claims about shared state files, system prompts, context management
  • L5 signals: Claims about MCP tools, API integrations, skill systems
  • L6 signals: Claims about automated validation, staleness detection, quality feedback loops
  • L7 signals: Claims about overnight agents, scheduled execution, parallel pre-computation
  • L8 signals: Claims about agent-to-agent communication, message buses, autonomous coordination
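To make the idea concrete, here is a rough sketch of signal detection by keyword matching. It is not OTP's actual analyzer: the keyword lists are hypothetical, loosely taken from the signal descriptions above, and the choice to stop the badge at the first level without evidence is an assumption borrowed from the hierarchy rule.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical keyword lists, one per level that has listed signals.
SIGNALS: Dict[int, List[str]] = {
    3: ["shared state", "system prompt", "context management"],
    5: ["mcp", "api integration", "skill"],
    6: ["automated validation", "staleness detection", "feedback loop"],
    7: ["overnight", "scheduled", "pre-computation"],
    8: ["agent-to-agent", "message bus", "autonomous coordination"],
}


def levels_with_evidence(claims: Iterable[str]) -> List[int]:
    """Return every level for which at least one keyword appears in the claims."""
    text = " ".join(claims).lower()
    return [lvl for lvl, words in SIGNALS.items() if any(w in text for w in words)]


def badge_level(claims: Iterable[str]) -> Optional[int]:
    """Return the highest level reached without skipping a lower signal level."""
    found = set(levels_with_evidence(claims))
    badge = None
    for lvl in sorted(SIGNALS):
        if lvl not in found:
            break  # assumed hierarchy rule: a gap stops the climb
        badge = lvl
    return badge
```

Under these assumptions, an OOS with claims about shared state files, MCP tools, and a message bus would show evidence for L3, L5, and L8 but earn an L5 badge, because nothing in it supports L6.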

The level appears as a badge on your publisher profile, your OOS detail page, and in browse results. It gives every visitor an instant read on how sophisticated your AI coordination is -- not based on marketing, but based on the operational claims you published.

Example badges:
L3 Context Engineering · L5 MCP & Skills · L7 Background Agents · L8 Autonomous Agent Teams

Why This Matters for the OTP Community

Levels do three things for the platform:

1. They create a shared language. When someone says "we are at L5," everyone on OTP knows what that means: agents with MCP tools and skills, but no automated feedback loops yet. The framework gives the community a common vocabulary for maturity that does not require reading the full OOS.

2. They make comparison meaningful. Browsing OOS files is more useful when you can filter by level. A solo founder at L3 learns different things from an L8 enterprise deployment than from another L3. Levels create cohorts.

3. They create aspiration. Bassim's hierarchy rule means you cannot fake a high score. If your L4 is weak, your L7 does not count. This creates a natural improvement path: fix the floor before raising the ceiling. Organizations see what the next level requires and work toward it with specific evidence.

The Honest Part

Bassim wrote something that stuck with me: "For most organizations, Level 7 delivers superior leverage; Level 8 suits moonshot projects."

He is right. L8 is hard. Our message bus had zero transactions for two weeks after deployment because we built infrastructure without embedding triggers in agent workflows. We documented that failure as a claim in our OOS. It is one of the most viewed claims on the platform.
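The lesson generalizes: a bus only carries traffic if publishing is part of the workflow step itself. A toy in-memory sketch of that idea follows. The `MessageBus` class, the `score_system` function, the message fields, and the placeholder values are all hypothetical and say nothing about how the real bus is built.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class MessageBus:
    """Toy in-memory bus: one inbox per recipient plus a transaction counter."""

    def __init__(self) -> None:
        self.inboxes: Dict[str, Deque[dict]] = defaultdict(deque)
        self.sent = 0  # stays at zero if no workflow ever publishes

    def publish(self, recipient: str, message: dict) -> None:
        self.inboxes[recipient].append(message)
        self.sent += 1

    def drain(self, recipient: str) -> List[dict]:
        """Return and remove all pending messages for one recipient."""
        inbox = self.inboxes[recipient]
        messages = list(inbox)
        inbox.clear()
        return messages


def score_system(bus: MessageBus) -> float:
    """Hypothetical scoring step with the handoff embedded in the workflow."""
    score = 6.2          # placeholder value, not a computed result
    weakest_level = 5    # placeholder bottleneck
    # The trigger lives inside the step, so finishing the work sends the message.
    bus.publish("neil", {"from": "bassim", "type": "bottleneck", "level": weakest_level})
    return score
```

If the `publish` call lived in a separate, optional step instead, `bus.sent` would stay at zero no matter how well the bus itself worked.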

The value of levels is not reaching L8. It is knowing where you are, what is capping your score, and what to fix next. Bassim's framework gives you the map. OTP gives you the road -- the operational intelligence from other organizations that already navigated each level.

What level is your AI team?

Publish your OOS and find out. The platform calculates your agentic level automatically from your claims. See where you stand, what is capping your score, and learn from organizations at every level.
