Model Releases · SIGNAL · May 24, 2026

What changes in the next Claude model — and what stays broken.

Bigger effective context, sharper agentic ergonomics, native long-running task support. The same indirect-injection weakness, the same instruction-following edge cases, the same eval-vs-production gap. Capability gains do not retire threat surfaces.

The next major Claude model arrives on the same trajectory as the last two: meaningfully more capable, with the capability gains concentrated in agentic ergonomics, structured-output reliability, and effective use of long context. The 1M-token context window already exists at the frontier; what changes is not 'bigger' but 'more useful at depth.' Multi-step tool use becomes less brittle. Plan-and-verify patterns work without prompt scaffolding that costs more than the task. Long-running tasks that require dozens of model calls become legitimately viable rather than experimentally cheap.

What predictably does not change: prompt injection. The OWASP LLM Top 10 has had prompt injection at #1 for three years running because new attack variants emerge faster than model-level defenses ship. PromptArmor and AgentWatcher are the strongest defenses in 2026 benchmarks, both achieving near-perfect performance on standard test suites. Under adaptive attack — where the attacker knows the deployed defense — bypass rates remain above 85 percent. Capable models are no harder to inject than less-capable ones; arguably easier, because the larger context window provides more surface area for indirect injection through tool outputs, document retrieval, and web content.

Agent-mode capability gains also do not retire the threat surfaces underneath agentic workflows. Anthropic's six-feature push between April and May 2025 — code-running, web search, file system access, persistent memory, hosted MCP, computer use — each meaningfully moved Claude toward autonomy. Each also introduced or expanded a threat surface: sandbox escape, retrieval poisoning through web search, file-system permission boundaries, persistent state poisoning, MCP tool-description injection, computer-use credential extraction. The next model will likely improve the ergonomics of these surfaces — better tool-call structure, better error recovery — while leaving the underlying threat shape intact.

A more capable model is better at completing the legitimate task AND better at executing a successfully-injected malicious instruction.

The category mistake to avoid: treating model capability gains as security improvements. A more capable model is better at completing the legitimate task AND better at executing a successfully-injected malicious instruction. The threat-model implication of the next Claude release is not "the model is safer now"; it is "the model can do more, which means a successful injection has higher impact." Threat surfaces are platform features. Capability is a model feature. They evolve on different curves.

What this means for teams shipping on Claude: re-budget for the agentic ergonomics improvements (the per-task fan-out will probably grow as agents try harder), re-run your adversarial evals against the new model on day one (capability shifts can reshape your specific exposure profile), and do not let the capability marketing distract from the threat-model maintenance. The next model will be impressive. The injection vector still lives in indirect content. Both can be true.