Claude Sonnet 4.5 is Anthropic’s safest AI model yet


In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. Anthropic’s basis for that claim is a selection of benchmarks where the new AI outperforms not only its predecessor but also the more expensive Opus 4.1 and competing systems, including Google’s Gemini 2.5 Pro and GPT-5 from OpenAI. For instance, in OSWorld, a suite that tests AI models on real-world computer tasks, Sonnet 4.5 set a record score of 61.4 percent, putting it 17 percentage points above Opus 4.1. 

At the same time, the new model can work autonomously on multi-step projects for more than 30 hours, a significant improvement over the roughly seven hours Opus 4 could sustain at launch. That’s an important milestone for the kind of agentic systems Anthropic wants to build.

Sonnet 4.5 outperforms Anthropic’s older models in coding and agentic tasks.

(Anthropic)

Perhaps more importantly, the company claims Sonnet 4.5 is its safest AI system to date, with the model having undergone “extensive” safety training. That training translates to a chatbot Anthropic says is “substantially” less prone to “sycophancy, deception, power-seeking and the tendency to encourage delusional thinking” — all potential model traits that have landed OpenAI in hot water in recent months. Anthropic has also strengthened Sonnet 4.5’s protections against prompt injection attacks. Given the new model’s capabilities, Anthropic is releasing Sonnet 4.5 under its AI Safety Level 3 framework, meaning it ships with filters designed to block potentially dangerous outputs in response to prompts about chemical, biological and nuclear weapons.

A chart showing how Sonnet 4.5 compares against other frontier models in safety testing.

(Anthropic)

With today’s announcement, Anthropic is also rolling out quality-of-life improvements across the Claude product stack. To start, Claude Code, the company’s popular coding agent, has a refreshed terminal interface that includes a new feature called checkpoints. As you can probably guess from the name, checkpoints let you save your progress and roll back to a previous state if Claude writes some funky code that isn’t working quite the way you imagined. File creation, which Anthropic began rolling out at the start of the month, is now available to all Pro users, and if you joined the waitlist for Claude for Chrome, you can start using the extension today.

API pricing for Sonnet 4.5 remains at $3 per million input tokens and $15 per million output tokens. The release of Sonnet 4.5 caps off a strong September for Anthropic. Just one day after Microsoft added Claude models to Microsoft 365 Copilot last week, OpenAI admitted its rival offers the best AI for work-related tasks.
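For developers, the practical upshot is that the new model slots into the existing API at Sonnet 4’s price point. As a rough illustration, here is a minimal sketch of calling it through Anthropic’s official Python SDK; the model identifier used below is an assumption based on Anthropic’s usual naming and should be checked against the current model list before use.

```python
# Minimal sketch: calling Sonnet 4.5 via Anthropic's Python SDK.
# The model identifier below is an assumption based on Anthropic's naming
# convention; confirm it against the official model list before relying on it.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-5",  # assumed alias for Sonnet 4.5
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that checks whether a string is a palindrome.",
        }
    ],
)

# Print the text of the first content block in the reply.
print(response.content[0].text)
```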


