Anthropic has released Claude Opus 4.6, an upgraded version of its most advanced artificial intelligence model, significantly intensifying competition with OpenAI’s ChatGPT and Google’s Gemini. The launch comes as rivalry among frontier AI models accelerates, with companies racing to deliver stronger reasoning, coding, and long-context capabilities.
The new model brings major improvements over its predecessor, particularly in coding, reasoning, and autonomous task execution, positioning Claude Opus 4.6 as a powerful option for developers and enterprises working with large, complex systems.
For the first time in the Opus lineup, Anthropic has introduced a 1 million token context window in beta. This allows Claude Opus 4.6 to process and retain vastly larger amounts of information within a single session, making it well-suited for:
Large codebase analysis
Long-running autonomous agents
Deep research and financial analysis
Complex document and knowledge workflows
Anthropic says the expanded context window directly addresses challenges like “context rot”, where models lose accuracy or coherence as context grows.
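To give a concrete sense of how a long-context request is typically exercised, here is a minimal sketch using Anthropic's Python SDK. The model identifier claude-opus-4-6 and the context-1m beta flag are assumptions carried over from earlier long-context betas; check Anthropic's documentation for the exact values that apply to Opus 4.6.

```python
import anthropic
from pathlib import Path

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative input: a pre-built dump of a large codebase (hypothetical file).
codebase_dump = Path("codebase_dump.txt").read_text()

response = client.beta.messages.create(
    model="claude-opus-4-6",          # assumed model identifier
    max_tokens=4096,
    betas=["context-1m-2025-08-07"],  # 1M-context beta flag used for earlier models; may differ here
    messages=[
        {
            "role": "user",
            "content": "Summarize the key modules and their dependencies:\n\n" + codebase_dump,
        }
    ],
)
print(response.content[0].text)
```

The same pattern applies to the other long-context use cases listed above: the whole corpus of documents, research material, or agent history is passed in a single request rather than being chunked and retrieved piecemeal.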
According to Anthropic, Claude Opus 4.6:
Plans more carefully
Sustains agentic workflows for longer durations
Reviews and debugs code more effectively
Can catch and correct its own errors
Beyond software development, the model is designed for complex knowledge work, including financial modelling, legal analysis, research tasks, and the creation of documents, spreadsheets, and presentations.
Within Cowork, Anthropic’s autonomous multitasking environment, Opus 4.6 can combine these capabilities to execute multi-step tasks with minimal human oversight, further reinforcing its positioning as an agent-first model.
On benchmark evaluations, Anthropic reports:
Terminal-Bench 2.0: Opus 4.6 leads this agentic coding benchmark
Humanity’s Last Exam: Tops the multidisciplinary reasoning test
GDPval-AA: Outperforms OpenAI’s GPT-5.2 by roughly 144 Elo points and its own predecessor by 190 points
BrowseComp: Achieves the best results in locating difficult-to-find information online
Anthropic argues that these results demonstrate a broader leap in long-context reasoning and retrieval, an area where many AI models still struggle.
On MRCR v2, a needle-in-a-haystack benchmark using a 1 million token context:
Claude Opus 4.6 scored 76%
Claude Sonnet 4.5 scored 18.5%
Anthropic describes this gap as a qualitative shift in how effectively large contexts can be used without performance degradation.
Anthropic says performance gains did not come at the expense of safety. According to the system card:
Low rates of misaligned behaviour
Reduced instances of deception or over-compliance
Fewer unnecessary refusals compared to earlier Claude models
The company has also introduced new cybersecurity probes in response to the model’s stronger defensive and offensive security capabilities.
Alongside Claude Opus 4.6, Anthropic rolled out several updates:
Adaptive thinking with adjustable reasoning effort (see the sketch after this list)
Context compaction for longer-running agents
Maximum output length increased to 128,000 tokens
US-only inference option available at a premium
Agent teams in Claude Code, allowing multiple AI agents to work in parallel on large codebases
Upgrades to Claude for Excel
A research preview of Claude in PowerPoint for Max, Team, and Enterprise users
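As a rough illustration of how adaptive thinking and the longer output limit surface in the API, the sketch below uses the extended-thinking parameter and streaming from Anthropic's Python SDK. The model id, the reasoning-token budget, and whether Opus 4.6 exposes its reasoning-effort setting through this particular parameter are assumptions; the 128,000-token ceiling matches the announcement above.

```python
import anthropic

client = anthropic.Anthropic()

# Streaming is used because very long outputs (up to 128K tokens) can take a while.
with client.messages.stream(
    model="claude-opus-4-6",                                # assumed model identifier
    max_tokens=128_000,                                     # new maximum output length
    thinking={"type": "enabled", "budget_tokens": 32_000},  # cap on internal reasoning tokens (assumed knob)
    messages=[
        {"role": "user", "content": "Review this module for concurrency bugs and propose fixes: ..."}
    ],
) as stream:
    # text_stream yields only the final answer; thinking deltas are not included.
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

Context compaction and agent teams are configured separately from an individual request, so they are not shown here.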
Claude Opus 4.6 is available:
On claude.ai
Via the Claude API
Across major cloud platforms
Pricing remains unchanged at $5 per million input tokens and $25 per million output tokens.
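For budgeting purposes, those per-token rates translate into per-request costs as in the short sketch below. It ignores any premium tiers (for example, the US-only inference option noted above) and any long-context surcharges that may apply; the example token counts are made up.

```python
# Listed rates: $5 per million input tokens, $25 per million output tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in US dollars at the listed rates."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    )

# Example: a 400,000-token codebase prompt with a 20,000-token response.
print(f"${estimate_cost(400_000, 20_000):.2f}")  # -> $2.50
```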
With a massive context window, benchmark-leading performance, and stronger agentic capabilities, Claude Opus 4.6 signals a major step forward for Anthropic as competition with OpenAI and Google intensifies. The release highlights how frontier AI models are shifting from simple chatbots toward long-running, autonomous systems capable of handling real-world complexity at scale.