THE STACK · ENGINEERING · BILLING · May 16, 2026

Anthropic just metered the Agent SDK. Credit exhaustion is now a real failure mode.

Effective June 15, programmatic Claude usage runs on a separate monthly credit pool — billed at full API rates, no rollover, no dipping into your chat subscription. Your agent loop and your billing model are now coupled in a way they were not before. The security story is what happens when that pool drains.

$20 Pro Agent SDK credit

$200 Max 20x credit

Jun 15 effective date

TL;DR 30-second version · free

01 On June 15, Anthropic splits Claude subscriptions into two pools: chat subscriptions keep their existing limits; programmatic use (Agent SDK, claude -p, GitHub Actions, third-party agents) moves to a dedicated monthly credit at full API rates.
02 Credits do not roll over. If exhausted, you cannot dip into the chat-subscription pool — you have to buy extra. Email notice arrives June 8.
03 Security implication: agent loops can now exhaust the credit pool. If an attacker controls the input that drives a loop — prompt injection, malicious user content, recursive tool calls — they can drain your monthly budget. Cost is now an attack surface.

DEEP ANALYSIS · free while in beta

READING AS

FOR YOU

Pre-June 15: inventory every spot you run claude -p, Agent SDK, or third-party tools that consume Claude. Estimate monthly token spend. Match it to a tier. If your real usage exceeds $200/month, you are an API customer now, not a subscription customer.

FOR YOU

Add credit-budget tracking to your agent loops. Hard ceilings per task. Hard ceilings per tenant. Default to failing closed when budgets exceed. Distinguish credit-exhaustion errors from rate-limit errors in your retry logic.

FOR YOU

Anthropic is positioning toward enterprise unit economics — subscription pricing was unsustainable for agentic workloads. Watch competitors: if OpenAI introduces metered Agent SDK credits within 90 days, this is the new normal. Anthropic ARR commentary in upcoming earnings cycles will signal how this lands.

FOR YOU

Re-do your burn-rate math against the June 15 pricing. If your product uses Agent SDK in production, you almost certainly need to migrate to direct API billing — subscription credits will not cover real production loads. Get the conversation with Anthropic procurement started now.

FOR YOU

Adversarial credit exhaustion is a new attack class worth characterizing empirically. Studies on prompt-injection-driven token consumption — how much budget can an attacker drain per successful injection — would inform both defense design and threat modeling. The data does not exist yet.

What shipped that matters.

Jun 15

Two billing pools

Chat subscription limits stay. Programmatic Claude usage moves to a separate credit pool.

split

Jun 15

$20 / $100 / $200 credit

Monthly Agent SDK credit by tier — Pro $20, Max 5x $100, Max 20x $200.

pricing

Jun 15

Full API rates

Credits are billed at full API rates, not the discounted subscription rate.

pricing

Jun 15

No rollover

Unused Agent SDK credits expire at the end of the month.

policy

Jun 15

No dip into chat pool

When Agent SDK credit is exhausted, you cannot use chat subscription quota to cover it.

policy

Jun 8

Email notification

Anthropic emails subscribers June 8 with credit-claim instructions.

rollout

The billing split changes what fails when, and how. The under-covered consequence is what it does to autonomous agent design.

BEFORE

Pre-June-15 billing

One Claude subscription covers chat + programmatic use
Rate limits are the binding constraint
Exceeding limits returns rate-limit errors
Fallback paths watch rate-limit headers

→

AFTER

Post-June-15 billing

Chat subscription stays as-is
Programmatic use draws from a separate Agent SDK credit pool
Credit exhaustion is the new binding constraint for agents
Exhausted credits cannot fall back to chat subscription quota
Fallback paths now need credit-budget awareness, not just rate-limit awareness
Cost predictability for autonomous agents requires explicit instrumentation

Rate-limit errors are recoverable — wait, retry. Credit exhaustion is harder: the only fix is buying more credit or waiting until next month. Adversarial input that drains the pool is now a real attack class.

Six failure modes that emerge once credit becomes the binding constraint instead of rate limits. Severity reflects how reachable each becomes in typical agent deployments.

01 HIGH

Credit exhaustion as a denial-of-service surface

If an attacker controls input that drives an agent loop — adversarial user content, prompt-injection payloads, recursive tool calls — they can spike token consumption and drain the Agent SDK credit pool. Once drained, your agents are offline until next month or until you buy emergency credit. This is a new DoS surface that did not exist when chat + programmatic usage shared the same elastic pool.

DO Add per-loop, per-user, and per-tenant credit-consumption budgets. Alert on anomalous consumption rates before the pool drains. Treat credit-burn rate the same way you treat any other resource-exhaustion metric.
02 HIGH

Cost runaway in autonomous agents

Long-horizon agent loops — multi-step research, recursive tool calls, multi-agent collaboration — produce highly variable token consumption. Without explicit budget instrumentation, a single bug or prompt-injection event can consume an entire month of credits in hours. Production cost predictability now requires what production resource management requires.

DO Instrument expected vs actual token consumption per agent task. Set hard ceilings per loop and per session. Default to failing closed (stopping the loop) rather than failing open (continuing to consume) when the budget is exceeded.
03 MEDIUM

Fallback path planning gets more complex

Existing fallback paths watch for HTTP 429 (rate limited) and back off. Credit-exhaustion errors are a different signal — they will not resolve with backoff alone. Your fallback code needs to distinguish 'rate-limited, retry in 60 seconds' from 'credit exhausted, escalate' and route appropriately.

DO Update agent SDK error-handling code to recognize credit-exhaustion responses distinctly. Route them to a human escalation path or a non-Claude fallback model, not to a retry loop.
04 MEDIUM

Team plan billing may not match security boundary

Team Standard provides $20/seat and Team Premium $100/seat. If your security boundary differs from your team-plan boundary (e.g., one team plan covers multiple security tenants), the credit pool is shared in ways that may not be visible to consumers. A compromised tenant can affect availability for the whole team.

DO Audit whether your team-plan boundary matches your tenancy boundary. If not, consider separate team plans per security tenant, or migrate to direct API access where credit boundaries can be explicitly drawn.
05 MEDIUM

Third-party agent frameworks consume from the same pool

Cursor, Continue, Zed, Aider, and other third-party tools that use the Agent SDK or claude -p draw from your subscription credit pool. If you run multiple tools concurrently, their consumption competes for the same monthly budget. A developer running Cursor in autopilot can drain credits that the production agent stack depends on.

DO Inventory every place programmatic Claude consumption happens in your organization. Assign a budget per consumer. Consider separating production agent loops onto direct API keys with dedicated budgets, away from the shared subscription pool.
06 MEDIUM

No-rollover policy concentrates failure modes

Unused credits expire monthly. Failed automations (errors that caused you to under-consume one month) do not amortize against the next month. The accounting is symmetric in both directions — you cannot smooth consumption across the cycle.

DO Plan for tight credit-cycle alignment. If your usage is bursty (e.g., end-of-month batch jobs), the no-rollover policy makes that worse. Consider pre-buying extra credit to absorb spikes rather than relying on month-over-month carryover.

Three concrete actions this week.

1

Inventory every place programmatic Claude runs

claude -p in scripts, GitHub Actions workflows, Cursor/Zed/Continue/Aider, third-party agents, internal Agent SDK code. Every one of them draws from the same credit pool starting June 15.
2

Add credit-budget instrumentation before June 15

You need to know your current burn rate so you can size the right plan. Without instrumentation you will not know whether $20/$100/$200 covers your real usage until the credit pool drains in production.
3

Update error handling to distinguish credit exhaustion from rate limits

These will arrive as different error signals. Treat them differently. Credit exhaustion is an availability event, not a retry-after-backoff event.

Signals in the next 60 days that matter.

Cursor, Zed, Continue, and Aider's UX response

Third-party agent clients will need to surface credit consumption to users. The one that does this well becomes the practitioner default. Watch their changelogs.

Anthropic clarifications on GitHub Actions consumption

GitHub Actions runs are tricky — they are CI/CD events that may run dozens of times per day. Watch for Anthropic guidance on how those events consume credits.

Competitor responses (OpenAI, Google, Mistral, DeepSeek)

Anthropic just established the "metered programmatic use" pattern. Whether competitors follow tells you whether this is the new industry default or an Anthropic-specific move.

Two billing pools

$20 / $100 / $200 credit

Full API rates

No rollover

No dip into chat pool

Email notification

Credit exhaustion as a denial-of-service surface

Cost runaway in autonomous agents

Fallback path planning gets more complex

Team plan billing may not match security boundary

Third-party agent frameworks consume from the same pool

No-rollover policy concentrates failure modes

Inventory every place programmatic Claude runs

Add credit-budget instrumentation before June 15

Update error handling to distinguish credit exhaustion from rate limits

Cursor, Zed, Continue, and Aider's UX response

Anthropic clarifications on GitHub Actions consumption

Competitor responses (OpenAI, Google, Mistral, DeepSeek)