Skip to main content
    LLM News 8 min

    Claude Opus 4.8 Lands: Honesty, Dynamic Workflows, and What Changes vs Opus 4.7

    Anthropic shipped Claude Opus 4.8 on May 28, 2026. Same price as 4.7, sharper judgement, around 4x fewer unflagged code flaws, a cheaper fast mode, and dynamic workflows in Claude Code. Here is the honest read for CX leaders.

    CX Intelligence Editorial Team

    Editorial · May 27, 2026

    Editorial card showing Claude Opus 4.8 release date, honesty gains, fast mode, and API pricing

    TL;DR

    Anthropic released Claude Opus 4.8 on May 28, 2026 as a direct upgrade to Opus 4.7. Same price ($5 per million input tokens, $25 per million output tokens), better benchmarks across coding, agentic skills, reasoning, and knowledge work, and a new fast mode that runs at 2.5x the speed at three times less the cost of previous fast modes. The headline behavioural change is honesty: Anthropic reports Opus 4.8 is around four times less likely than Opus 4.7 to let flaws in its own code pass unremarked.

    For CX leaders, this is not a forklift upgrade. It is a quality-of-life release for the slice of traffic already routed to Opus, plus a few new primitives (dynamic workflows, effort control, mid-task system messages) that change how agentic stacks can be wired.

    What Changed vs Opus 4.7

    Five things are materially different.

    Honesty as a benchmark. Anthropic specifically trained and measured for honesty in 4.8. The model is more likely to flag uncertainty, push back on weak plans, and surface issues in inputs or outputs instead of papering over them. The headline number: around 4x fewer unflagged flaws in code it writes versus Opus 4.7. For CX, the analogue is a model that escalates ambiguity instead of hallucinating confident answers.

    Same price, better model. API pricing is unchanged: $5 per million input tokens and $25 per million output tokens. Every CX team currently routing to Opus 4.7 gets a free quality bump by swapping the model ID to claude-opus-4-8.

    Fast mode is cheaper and faster. Opus 4.8 ships with a fast mode at 2.5x speed, priced at $10 input and $50 output per million tokens. Anthropic describes this as three times cheaper than the fast mode on previous Opus models. That changes the economics of synchronous chat with an Opus-class model.

    Effort control is now a user-facing dial. Where Opus 4.7 introduced the xhigh tier in the API, Opus 4.8 exposes effort directly to end users on claude.ai and Cowork, and inside Claude Code via xhigh and max. High is the new default. The model uses roughly the same number of tokens as Opus 4.7 at default effort, but produces better answers.

    Dynamic workflows in Claude Code. Available in research preview on Enterprise, Team, and Max plans. Claude can now plan a task and run hundreds of parallel subagents in one session, verify outputs, and report back. Anthropic gives codebase-scale migrations across hundreds of thousands of lines of code as the worked example. For CX, the relevant analogue is long-horizon agentic operations: end-of-day reconciliations, bulk refund sweeps, KB rewrites.

    Bonus: mid-task system messages. The Messages API now accepts system entries inside the messages array. Agents can update permissions, token budgets, or environment context mid-run without breaking the prompt cache or routing through a user turn. This is a small surface change with real consequences for production agent harnesses.

    Industry Reception

    Anthropic published quotes from eleven early testers alongside the launch. The pattern is consistent.

    Cursor reports Opus 4.8 exceeds prior Opus models across every effort level on CursorBench, with meaningfully more efficient tool calling (fewer steps for the same intelligence). Genspark says it is the only model to complete every case end-to-end on their Super-Agent benchmark, beating GPT-5.5 at parity on cost. Browserbase reports 84 percent on Online-Mind2Web, calling it the strongest computer-use and browser-agent model they have tested. Devin highlights that Opus 4.8 fixes the comment-verbosity and tool-calling regressions they saw with Opus 4.7. Databricks reports 61 percent cheaper token cost than Opus 4.7 for multimodal reasoning over PDFs and diagrams in their Genie agent.

    Two themes run across the testimonials. First, judgement: testers describe a model that asks better questions, catches its own mistakes, and pushes back when a plan is unsound. Second, reliability on long-horizon agentic work, which is exactly where prior Opus generations had visible failure modes.

    What This Means for CX Teams

    The practical implications are narrower than the launch noise suggests.

    Swap the model ID. If you are already routing escalated, long-context, or supervisor-layer traffic to Opus 4.7, change the model ID to claude-opus-4-8. Same price, better outputs. No architectural work.

    Re-evaluate fast mode for synchronous chat. The 2.5x speed at three times lower cost than the prior fast tier makes Opus 4.8 fast mode plausible for live chat use cases where Opus 4.7 was previously too slow or too expensive. Run it against your current mid-tier model on the same case mix and measure.

    Lean on the honesty gains for high-stakes flows. Refunds, account changes, billing disputes, and anything regulated benefit from a model that flags uncertainty rather than confidently guessing. Move those flows to Opus 4.8 at high or xhigh effort and watch your false-confident-answer rate.

    Wait on dynamic workflows. They are powerful, but they are research preview and currently inside Claude Code. The CX analogue (hundreds of parallel subagents working through a long-horizon operation) is real, but worth piloting on internal back-office work before customer-facing flows.

    Do not push Opus 4.8 down market. The 80 percent of tier-1 traffic that lives on flash-class models should stay there. Opus 4.8 is still 10 to 30x the cost per token of those models. The right architecture is unchanged: route each contact to the cheapest model capable of resolving it at the required quality bar.

    AI Readiness Score

    How ready is your team for AI?

    6 quick questions. Get a personalised score and action plan.

    Try the AI Readiness Score

    1000+ agents deployed worldwide · 4.8 on G2

    How Opus 4.8 vs 4.7 Stacks Up at a Glance

    Price per million tokens (input / output): identical at $5 / $25. Fast mode: new on 4.8 at $10 / $50, around three times cheaper than the prior generation. Honesty on code review (unflagged flaws): around 4x lower on 4.8. Default effort: high on 4.8, with extra and max for harder tasks. Computer use (Online-Mind2Web, Browserbase): 84 percent on 4.8, a meaningful jump over 4.7. Tool calling efficiency (CursorBench): better intelligence per step on 4.8. Multimodal cost on dense documents (Databricks Genie): 61 percent cheaper on 4.8.

    The Honest Caveats

    Two things to watch.

    Anthropic itself frames 4.8 as a modest but tangible improvement, and signals that the next class of model (Mythos) is what to expect for a real capability jump. Do not over-rotate the roadmap on this release.

    Higher effort tiers spend more tokens. The default effort on Opus 4.8 lands at roughly the same token spend as 4.7 default, but extra and max will move your bill. The right answer is the same routing discipline as always: pay the premium only when the case actually needs it.

    What to Do This Week

    Three concrete moves.

    Flip your Opus traffic to claude-opus-4-8. Same price, better model. Measure autonomy, CSAT, and false-confident-answer rate over two weeks against the prior baseline.

    Pilot fast mode on one synchronous use case. Pick a chat flow currently on a mid-tier model where reasoning quality matters. Run Opus 4.8 fast mode in parallel and compare cost, latency, and resolution rate.

    Audit your high-stakes flows for honesty wins. Refunds, billing, account changes, regulated answers. Route them through Opus 4.8 at high or xhigh effort and track the rate of escalated-with-uncertainty versus confidently-wrong answers. The honesty gains should show up here first.

    Anthropic has shipped a focused, well-priced upgrade. The discipline that makes it valuable in a CX operation is the same as it has always been: route the work to the cheapest model that can do it well, and pay the premium tier only when the case actually needs it.

    Try it directly at claude.ai or via the Anthropic API. To see how Certainly routes contacts across Opus, GPT, and Gemini in production, book a demo.

    Frequently Asked Questions

    When was Claude Opus 4.8 released?

    Anthropic released Claude Opus 4.8 on May 28, 2026, roughly six weeks after Opus 4.7 launched on April 16, 2026. It is available everywhere today via claude.ai, the Claude API (model ID claude-opus-4-8), and major cloud platforms.

    How much does Claude Opus 4.8 cost compared to Opus 4.7?

    Pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens for regular usage. Fast mode on Opus 4.8 is $10 per million input tokens and $50 per million output tokens, which Anthropic describes as around three times cheaper than the fast mode on previous Opus generations.

    What is the biggest difference between Claude Opus 4.8 and Opus 4.7?

    Honesty. Anthropic specifically trained and measured for honesty in Opus 4.8, and reports the model is around four times less likely than Opus 4.7 to let flaws in its own code pass unremarked. It is more willing to flag uncertainty, push back on weak plans, and surface issues. Coding, agentic, reasoning, and multimodal benchmarks also improve.

    What are dynamic workflows in Claude Code?

    A new research preview feature, available on Claude Code Enterprise, Team, and Max plans. Claude plans a task, runs hundreds of parallel subagents in a single session, verifies its outputs, and reports back. Anthropic uses codebase-scale migrations across hundreds of thousands of lines of code as the example. For CX, the analogue is long-horizon agentic operations like bulk refund sweeps or KB rewrites.

    Should we upgrade from Opus 4.7 to Opus 4.8 in production?

    Yes for the traffic already routed to Opus. The price is identical, the model is better across every dimension Anthropic measured, and the API surface is unchanged. Swap the model ID to claude-opus-4-8 and measure autonomy, CSAT, and false-confident-answer rate against the prior baseline for two weeks before declaring victory.

    See how this works in practice.

    Book a demo
    claude opus 4.8claude opus 4.7anthropicllm newsagentic aicx automationmodel selection

    See Certainly in action.

    Book a demo and experience what agentic AI can do for your customer experience.