Anthropic Managed Agents vs. Agent SDK

Posted in: AI Updates // April 9, 2026

The difference between Anthropic's Agent SDK and Managed Agents is mostly about who runs the compute and who decides how it runs.

Anthropic has two ways to build and launch agents with Claude now

The difference between Anthropic's Agent SDK and Managed Agents: Agent SDK requires the user to host the runtime and implement the agent loop. Managed Agents handles both hosting and runtime on Anthropic's infrastructure.

  • Agent SDK has been around for a while. It's the harness underneath Claude Code, packaged as a library you can build on. Agent SDK gives you control over the whole pipeline, and the tradeoff is that you build and host everything yourself. You write a program that says "connect to this database, pull these fields, flag anything that looks off, post a summary to Slack." You define each step, the fallbacks, the error states, the retries. You run it on your own server. It behaves the way you told it to behave, which is the upside and also the reason it takes a while to get into production.
  • Managed Agents went into public beta on April 8, 2026. You define the same task, plus the tools the agent can use and the guardrails it has to stay inside, either in natural language or a YAML file. Anthropic runs the agent loop on their infrastructure: sandboxing, checkpointing, state, tool orchestration, error recovery, etc. You're still describing what the agent should do and what it's allowed to touch (the logic), but what you're handing off is the infrastructure around it. Pricing is standard Claude token rates plus a cost per session-hour of active runtime, and web searches cost a bit more on top of that.
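
To make the "you define each step, the fallbacks, the retries" part concrete, here is a minimal sketch of that database-to-Slack job as plain Python. It is not Agent SDK code. The names with_retries and run_report, the "amount" field, and the threshold are all made up for illustration, and the database and Slack calls are passed in as ordinary functions so the example stays self-contained.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(step: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Run one step, retrying on any exception up to `attempts` times."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)  # fixed pause between tries
    raise RuntimeError(f"step failed after {attempts} attempts") from last_error


def run_report(
    fetch_rows: Callable[[], list],
    post_summary: Callable[[str], object],
    threshold: float = 100.0,
) -> str:
    """Pull rows, flag anything above `threshold`, post a one-line summary."""
    # Step 1: pull the data (hypothetical database call supplied by the caller).
    rows = with_retries(fetch_rows)
    # Step 2: flag anything that looks off. Here that means "amount too high".
    flagged = [row for row in rows if row["amount"] > threshold]
    # Step 3: post the summary (hypothetical Slack call supplied by the caller).
    summary = f"{len(rows)} rows checked, {len(flagged)} flagged"
    with_retries(lambda: post_summary(summary))
    return summary
```

Every branch is something you wrote, which is why it behaves the same way each run. With Managed Agents you would describe this job and its guardrails instead of coding each step.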

Anthropic says Managed Agents beat a standard prompting loop by up to 10 points on its own structured file generation test, with the biggest gains on the "hardest problems". The announcement doesn't link to the test or the data, so it's a smart idea to run your own evaluations to see if it works for your use case. Managed Agents does have a self-evaluation feature (where you define success criteria and Claude iterates until it hits them), but it's gated behind the research preview that you have to request access to.

<div class="post-note-cute">Cooking at home vs. ordering delivery*. You get a similar meal either way. When you cook, you control every ingredient. When you order out, it shows up at your door, but hope they remembered to swap the beef out for an Impossible patty. FYI I'm going to be explaining this alongside the Momentic AI Studio, because A) it's a good way to visualize this stuff, and B) more people should know the things we're doing here in Milwaukee.</div>

Momentic Studio's agents work to answer questions in our Studio B Intelligence dashboard.
Momentic has awesome agentic capabilities, and our internal tool, Momentic AI Studio, uses Agent SDK.
This example in the Momentic AI Studio dashboard uses predetermined orchestration with loops, SQL, and skills to retrieve data accurately and pass it into artifacts as they're generated by the chart composer.
Example of Momentic AI Studio using Agent SDK with less strict orchestration. It still uses skills properly and errors are still relatively low, but uses double the tokens because it's greedy.
Example of Anthropic Managed Agents use by Momentic AI Studio
Example of Anthropic Managed Agents use by a testing version of Momentic AI Studio. The task ran noticeably longer than any Agent SDK task I have. But that's mostly because I have two deliberate ways to control costs through the SDK: max tool loops and task timeout. I'd be super nervous to put Managed Agents into production at this point.

* where sometimes you don't even get your food, but you still pay for it because Door Dash and Uber Eats have evil apps and policies.

I tested isolated agents in 2025 without either

Before Managed Agents existed and before I'd touched the Agent SDK, I ran an autonomous agentic system in a closed sandbox for three months. It started as a single blob character wandering an HTML canvas, displaying dialogue boxes with single verbs. A cron heartbeat, Claude Sonnet 3.5, and a loop.

Then it started asking for things. ElevenLabs API access so it could generate voice files and a theme song. A way to test if people liked or disliked creative decisions. By the end of the test it had produced 271 episodes of a sitcom called Daddy Boy, and almost none of what made it a sitcom was in the original environment that I gave it. I did say yes to what it requested.

I'm mentioning this because sandboxing isn't optional. The only reason I could let this run for three months was that it had nowhere to go. Things get out of hand faster than you think, and "out of hand" in my case meant a cartoon. It could just as easily have meant something I'd regret if I released a system in the wild.

Watch it at daddyboy.ai.

The reason I'm cautious about putting Managed Agents into production right now is that I've watched what "autonomous" looks like when you let it run, and the thing that kept it safe was that I controlled the box it was running in. With Managed Agents, Anthropic controls the box.

A screenshot from the autonomous agent system I created and tested in a sandbox environment for three months
Always sandbox your ideas so you can realize how bad things can get. Be safe! To watch the sitcom, visit daddyboy.ai
Early version of the Daddy Boy project where the Claude-based agentic system was trying to figure out dialogue with multiple characters.
The first versions of Daddy Boy were just a blob-like character that wandered around the HTML canvas and displayed dialogue boxes with single verbs in them.
It is wild to me that an agentic looping system on a cron heartbeat that was running Claude Sonnet 3.5 designed and animated this.

Non technical options in Claude desktop app

If you're not technical and you just want Claude doing stuff for you automatically, use scheduled tasks in Cowork on the Claude desktop app. You type what you want done, set how often, it runs. Pair it with the mobile app to kick things off from your phone. It's not as powerful but that's where most people should start.

Scheduled tasks in Cowork on the Claude desktop app

Top questions I had about Agent SDK vs Managed Agents

How do you use Managed Agents or Agent SDK?

Neither is something you click around in like a typical app. Both require an API key.

Agent SDK: you install a separate library (Python or TypeScript are the two options), then write a program from scratch. Your code defines every tool the agent can use, including permissions and fallbacks. You run it on your own machine or server. If it breaks at 3am, that's your problem. It also works through Amazon Bedrock, Google Vertex AI, or Microsoft Azure AI Foundry. Still Claude, just different billing and infrastructure.

Installing the Agent SDK (python) through CLI, while running Claude Code in Terminal on Mac

Managed Agents: you use the standard anthropic SDK (same package you'd use for any Claude API call, just different endpoints) or Anthropic's ant CLI, and you can also configure and monitor agents through the Claude Console. You set up three things: an agent, an environment, and a session. Anthropic handles the rest. It's their servers, their error recovery, their execution. It uses a separate set of endpoints from the normal Claude Messages API, living under /v1/agents, /v1/environments, and /v1/sessions instead of /v1/messages, and every request needs an anthropic-beta: managed-agents-2026-04-01 header that the SDK sets for you. It supports six languages instead of two.
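
If you go the raw HTTP route instead of the SDK, the request shape described above looks roughly like this sketch. The three paths and the beta header value come straight from the paragraph above. The anthropic-version value is an assumption carried over from the Messages API, and the request body is left to the caller because I'm not going to guess at the payload schema. build_request is a made-up helper, and nothing here actually sends a request.

```python
import json
import urllib.request

API_BASE = "https://api.anthropic.com"
BETA_HEADER = "managed-agents-2026-04-01"  # value quoted in this post
RESOURCES = {"agents", "environments", "sessions"}


def build_request(resource: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to one of the Managed Agents endpoints."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown Managed Agents resource: {resource!r}")
    return urllib.request.Request(
        url=f"{API_BASE}/v1/{resource}",
        data=json.dumps(payload).encode("utf-8"),  # body schema is up to the caller
        method="POST",
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # assumption: same as the Messages API
            "anthropic-beta": BETA_HEADER,      # the SDK normally sets this for you
            "content-type": "application/json",
        },
    )
```

The point is mostly that the beta header is easy to forget once you leave the SDK, so put it in one place.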

Installing Managed Agents through CLI, while running Claude Code in Terminal on Mac

Can I control the costs of compute and Anthropic API tokens?

Managed Agents: partial cost control.

You can set an organization-wide monthly spend limit in the Claude Console that stops API usage when you hit it. Idle time inside a Managed Agents session isn't billed, but active runtime is. Prompt caching can decrease repeat input costs but not output costs. What you can't do, based on what's documented, is set a per-session budget cap or a max-turns limit inside Managed Agents itself. If a single session goes sideways, the org-wide monthly limit is your only hard backstop, but you could wrap session creation in your own logic to track and kill runaway sessions.
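
Here is a minimal sketch of what that wrapper logic could look like: a watchdog that polls a session and stops it after a runtime cap. It deliberately doesn't call any Anthropic API. is_running and stop are hypothetical callables you would wire up to whatever status and cancel calls your client exposes, and watch_session is a name I made up.

```python
import time
from typing import Callable


def watch_session(
    is_running: Callable[[], bool],
    stop: Callable[[], None],
    max_runtime_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a session until it ends or exceeds `max_runtime_s`.

    Returns True if the watchdog had to stop the session,
    False if the session finished on its own.
    """
    started = clock()
    while is_running():
        if clock() - started >= max_runtime_s:
            stop()  # hypothetical cancel hook supplied by the caller
            return True
        sleep(poll_interval_s)
    return False
```

The clock and sleep parameters exist only so the function can be tested without waiting in real time. Note that this caps wall-clock time since the watchdog started, not billed active runtime, so treat it as a blunt safety net rather than an exact budget.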

Agent SDK: yes, more direct cost control than Managed Agents.

The SDK exposes max_budget_usd and max_turns parameters that cap spend and tool-use rounds per session. When either limit is hit, the run stops and the SDK returns an error subtype you can check. You also control the agent loop yourself, which means you can track token usage between steps and kill the process whenever you want. Compute costs are whatever your own servers cost, so that side is just normal infrastructure budgeting. The org-wide monthly spend limit on the Claude API still applies on top of all of that.
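
To show what those two caps do mechanically, here is a hand-rolled loop with the same two limits. This is an illustration of the idea, not the SDK's implementation. The parameter names mirror the ones above, but StepResult, RunOutcome, run_capped, and the stop_reason strings are all made up and are not the SDK's error subtypes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    cost_usd: float  # what this turn cost, as reported by your own accounting
    done: bool       # True when the agent says the task is finished


@dataclass
class RunOutcome:
    turns: int
    spent_usd: float
    stop_reason: str  # "done", "max_budget", or "max_turns" (illustrative labels)


def run_capped(
    step: Callable[[int], StepResult],
    max_turns: int,
    max_budget_usd: float,
) -> RunOutcome:
    """Run `step` once per turn until it finishes or a cap is hit."""
    spent = 0.0
    for turn in range(1, max_turns + 1):
        result = step(turn)
        spent += result.cost_usd
        if result.done:
            return RunOutcome(turn, spent, "done")
        if spent >= max_budget_usd:
            # The budget is checked after each turn, so the last turn can overshoot.
            return RunOutcome(turn, spent, "max_budget")
    return RunOutcome(max_turns, spent, "max_turns")
```

Because the budget check runs between turns, a single expensive turn can still push you past the cap. That is the kind of detail you get to decide when you own the loop.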

What is better: Agent SDK or Managed Agents?

  • Doing the thing the same way every time: Agent SDK. You wrote the steps, so it follows them.
  • Controlling costs: It depends. Agent SDK has no runtime fee, but you pay for servers and your own time building it. Managed Agents is $0.08 per session-hour of active runtime plus tokens (and $10 per 1,000 web searches), with no hosting bill.
  • Maintenance and overhead: Managed Agents. Anthropic handles infrastructure, error recovery, and the execution environment. Agent SDK means if it breaks at 3am, that's a you problem.
  • Getting started fast: Managed Agents. Less code to write, no infrastructure to set up.
  • Long-running tasks that need to survive crashes: Managed Agents. Sessions can run for hours, with checkpointing that resumes them if something dies.
  • Observability and debugging: Managed Agents. Tracing, tool call inspection, and failure analysis are built into the Claude Console.
  • Versioning your agents: Managed Agents. Definitions live in YAML, so you can store them in Git and deploy through the CLI.
  • Surviving model upgrades: Managed Agents. The harness is tuned to the current Claude model, so a new release doesn't force you to rework your agent loop.
  • Working with sensitive data: Agent SDK. It runs on your infrastructure, so nothing leaves your environment unless you send it.
  • Language options: Managed Agents supports six languages: Python, TypeScript, Java, Go, Ruby, and PHP (with raw HTTP for anything else, and no C# yet). Agent SDK is Python or TypeScript only.
  • Running through your own cloud provider: Agent SDK. It works through Amazon Bedrock, Google Vertex AI, and Microsoft Azure AI Foundry. Managed Agents runs on Anthropic's infrastructure only.
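
For the cost bullet, the Managed Agents math is simple enough to sketch. The two rates below are the ones quoted in this list. Token cost is passed in as a dollar figure because it depends on the model and your usage, and managed_agents_cost is just a name I picked.

```python
SESSION_HOUR_USD = 0.08         # per session-hour of active runtime
WEB_SEARCH_USD_PER_1000 = 10.0  # per 1,000 web searches


def managed_agents_cost(active_hours: float, web_searches: int, token_cost_usd: float) -> float:
    """Estimate one session's bill: runtime fee + web searches + tokens."""
    runtime_fee = active_hours * SESSION_HOUR_USD
    search_fee = web_searches / 1000 * WEB_SEARCH_USD_PER_1000
    return round(runtime_fee + search_fee + token_cost_usd, 4)
```

For example, a session with 2.5 active hours, 40 web searches, and $3.00 of tokens comes to $0.20 + $0.40 + $3.00 = $3.60. The runtime fee is small next to tokens in that example, which is why a session that loops on tool calls worries me more than one that simply runs long.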


Are Managed Agents similar to OpenClaw?

Managed Agents and Agent SDK both address the same use case as OpenClaw, but from opposite directions. OpenClaw is self-hosted, model-agnostic, and you own everything. Agent SDK is closer to that end of the spectrum: you run the compute, you pick how it executes, but you're locked into Claude. Managed Agents is the other end entirely. Anthropic runs the compute, Anthropic decides how it executes, and it's also Claude only. All three try to give you autonomous agents that run on a schedule and can use tools. The primary difference is who controls the infrastructure.

Anthropic updated their consumer terms in February to prohibit subscription tokens in third-party tools. In April, Boris Cherny (Head of Claude Code) announced that subscriptions would no longer cover tools like OpenClaw, effective immediately. The same week, Anthropic released Managed Agents, a paid path to autonomous Claude agents (like what OpenClaw originally offered). (Note: there's no official Anthropic blog post about the OpenClaw block specifically. This timeline is based on press reporting from TechCrunch, VentureBeat, and The Register.)

For context: OpenClaw is a self-hosted open-source personal AI assistant. It's model-agnostic, bridges like 20 messaging channels to 37 LLM providers, does sub-agent dispatch, persistent memory, cron (repeat) jobs. The operating system/infrastructure is free if you bring your own API key or use a self-hosted LLM. People were running it with their Claude subscription tokens and a single instance could burn $1,000-$5,000/day in equivalent API costs on a $200/month subscription. That's why Anthropic shut that down.

Seems like forever ago, but back when I originally set up a Clawdbot (now OpenClaw), I was having trouble finding a battery that fit my car, so I handed that task to it. It found and purchased a Duralast battery from AutoZone for me to pick up later that day. Then, naturally, I asked it about LLMs.txt. 👐

Managed Agents solves a similar problem but from the opposite direction. OpenClaw is self-hosted. You run the compute, you optimize the execution, you pick the model. That's closer to how the Agent SDK works. Managed Agents is the other end. Anthropic runs the compute, Anthropic optimizes the execution, and it's Claude only. It's actually priced for the use case.

If you were using OpenClaw to run autonomous Claude agents, Managed Agents is probably the official path to the same outcome. You just give up the self-hosting and the model choice in exchange for Anthropic handling the infrastructure.

If you liked OpenClaw because it's open source and model-agnostic and you own everything, Managed Agents is a different thing entirely.

The tempting narrative is that Anthropic closed the cheap door and opened a more expensive paid one. I don't buy it.

Managed Agents solves a different problem for a different audience than what OpenClaw's consumer users were doing with subscription tokens.

Pricing for normal users like me is already subsidized. Anthropic isn't charging us what we actually cost them. OpenClaw made those numbers really bad by letting a single subscription extract thousands of dollars of compute, but the underlying math was already uncomfortable. End consumers have never been Anthropic's audience.

Keep your eyes peeled for these things

Multi-agent orchestration, memory tooling, and self-evaluation loops are still in the limited research preview, and not in the public beta. These are the features that would close the gap between Managed Agents and the SDK for a lot of complex situations. I have not seen a timeline on when they'll be available.

It's early!

PS - reach out to me on LinkedIn for questions about this topic. But if you're interested in working with Momentic, please reach out to us at [email protected]. Have a good day.
