Anthropic launches a new AI model that ‘thinks’ as long as you want – TechCrunch

Latest

Amazon

Apps

Biotech & Health

Climate

Cloud Computing

Commerce

Crypto

Enterprise

EVs

Fintech

Fundraising

Gadgets

Gaming

Google

Government & Policy

Hardware

Instagram

Layoffs

Media & Entertainment

Meta

Microsoft

Privacy

Robotics

Security

Social

Space

Startups

TikTok

Transportation

Venture

Events

Startup Battlefield

StrictlyVC

Newsletters

Podcasts

Videos

Partner Content

TechCrunch Brand Studio

Crunchboard

Contact Us
Anthropic is releasing a new frontier AI model called Claude 3.7 Sonnet, which the company designed to “think” about questions for as long as users want it to.Anthropic calls Claude 3.7 Sonnet the industry’s first “hybrid AI reasoning model,” because it’s a single model that can give both real-time answers and more considered, “thought-out” answers to questions. Users can choose whether to activate the AI model’s “reasoning” abilities, which prompt Claude 3.7 Sonnet to “think” for a short or long period of time.The model represents Anthropic’s broader effort to simplify the user experience around its AI products. Most AI chatbots today have a daunting model picker that forces users to choose from several different options that vary in cost and capability. Labs like Anthropic would rather you not have to think about it — ideally, one model does all the work.Claude 3.7 Sonnet is rolling out to all users and developers on Monday, Anthropic said, but only people who pay for Anthropic’s premium Claude chatbot plans will get access to the model’s reasoning features. Free Claude users will get the standard, non-reasoning version of Claude 3.7 Sonnet, which Anthropic claims outperforms its previous frontier AI model, Claude 3.5 Sonnet. (Yes, the company skipped a number.)Claude 3.7 Sonnet costs $3 per million input tokens (meaning you could enter roughly 750,000 words, more words than the entire “Lord of the Rings” series, into Claude for $3) and $15 per million output tokens. That makes it more expensive than OpenAI’s o3-mini ($1.10 per 1 million input tokens/$4.40 per 1 million output tokens) and DeepSeek’s R1 (55 cents per 1 million input tokens/$2.19 per 1 million output tokens), but keep in mind that o3-mini and R1 are strictly reasoning models — not hybrids like Claude 3.7 Sonnet.Claude 3.7 Sonnet is Anthropic’s first AI model that can “reason,” a technique many AI labs have turned to as traditional methods of improving AI performance taper off.Reasoning models like o3-mini, R1, Google’s Gemini 2.0 Flash Thinking, and xAI’s Grok 3 (Think) use more time and computing power before answering questions. The models break problems down into smaller steps, which tends to improve the accuracy of the final answer. Reasoning models aren’t thinking or reasoning like a human would, necessarily, but their process is modeled after deduction.Eventually, Anthropic would like Claude to figure out how long it should “think” about questions on its own, without needing users to select controls in advance, Anthropic’s product and research lead, Dianne Penn, told TechCrunch in an interview.“Similar to how humans don’t have two separate brains for questions that can be answered immediately versus those that require thought,” Anthropic wrote in a blog post shared with TechCrunch, “we regard reasoning as simply one of the capabilities a frontier model should have, to be smoothly integrated with other capabilities, rather than something to be provided in a separate model.”Anthropic says it’s allowing Claude 3.7 Sonnet to show its internal planning phase through a “visible scratch pad.” Penn told TechCrunch users will see Claude’s full thinking process for most prompts, but that some portions may be redacted for trust and safety purposes.Anthropic says it optimized Claude’s thinking modes for real-world tasks, such as difficult coding problems or agentic tasks. Developers tapping Anthropic’s API can control the “budget” for thinking, trading speed, and cost for quality of answer.On one test to measure real-word coding tasks, SWE-Bench, Claude 3.7 Sonnet was 62.3% accurate, compared to OpenAI’s o3-mini model which scored 49.3%. On another test to measure an AI model’s ability to interact with simulated users and external APIs in a retail setting, TAU-Bench, Claude 3.7 Sonnet scored 81.2%, compared to OpenAI’s o1 model which scored 73.5%.Anthropic also says Claude 3.7 Sonnet will refuse to answer questions less often than its previous models, claiming the model is capable of making more nuanced distinctions between harmful and benign prompts. Anthropic says it reduced unnecessary refusals by 45% compared to Claude 3.5 Sonnet. This comes at a time when some other AI labs are rethinking their approach to restricting their AI chatbot’s answers.In addition to Claude 3.7 Sonnet, Anthropic is also releasing an agentic coding tool called Claude Code. Launching as a research preview, the tool lets developers run specific tasks through Claude directly from their terminal.In a demo, Anthropic employees showed how Claude Code can analyze a coding project with a simple command such as, “Explain this project structure.” Using plain English in the command line, a developer can modify a codebase. Claude Code will describe its edits as it makes changes, and even test a project for errors or push it to a GitHub repository.Claude Code will initially be available to a limited number of users on a “first come, first serve” basis, an Anthropic spokesperson told TechCrunch.Anthropic is releasing Claude 3.7 Sonnet at a time when AI labs are shipping new AI models at a breakneck pace. Anthropic has historically taken a more methodical, safety-focused approach. But this time, the company’s looking to lead the pack.For how long, though, is the question. OpenAI may be close to releasing a hybrid AI model of its own; the company’s CEO, Sam Altman, has said it’ll arrive in “months.”Topics
Senior Reporter, Consumer
Google launches a free AI coding assistant with very high usage caps
Even Elon Musk forgets that X isn’t Twitter sometimes
DOGE’s HR email is getting the ‘Bee Movie’ spam treatment
Anthropic used Pokémon to benchmark its newest AI model
SpaceX says Starship self-destructed after propellant leaks caused fires and comms blackout
Anthropic launches a new AI model that ‘thinks’ as long as you want
Perplexity teases a web browser called Comet
Subscribe for the industry’s biggest tech newsEvery weekday and Sunday, you can get the best of TechCrunch’s coverage.TechCrunch’s AI experts cover the latest news in the fast-moving field.Every Monday, gets you up to speed on the latest advances in aerospace.Startups are the core of TechCrunch, so get our best coverage delivered weekly.By submitting your email, you agree to our Terms and Privacy Notice.© 2024 Yahoo.

Source: https://techcrunch.com/2025/02/24/anthropic-launches-a-new-ai-model-that-thinks-as-long-as-you-want/

Anthropic launches a new AI model that ‘thinks’ as long as you want – TechCrunch

More Stories

Behold, the essential oil diffusing ‘fragrance mouse’ – Rock Paper Shotgun

iPhone 17 leaker just revealed renders of all four models — here’s the new designs – Tom’s Guide

Microsoft quietly launches free, ad-supported version of Office apps for Windows with limited functionality – Windows Central

Leave a Reply Cancel reply

Behold, the essential oil diffusing ‘fragrance mouse’ – Rock Paper Shotgun

iPhone 17 leaker just revealed renders of all four models — here’s the new designs – Tom’s Guide

Microsoft quietly launches free, ad-supported version of Office apps for Windows with limited functionality – Windows Central

A Mercedes Unimog of All Things Is Coming to Gran Turismo 7 This Week – The Drive

More Stories

Behold, the essential oil diffusing ‘fragrance mouse’ – Rock Paper Shotgun

iPhone 17 leaker just revealed renders of all four models — here’s the new designs – Tom’s Guide

Microsoft quietly launches free, ad-supported version of Office apps for Windows with limited functionality – Windows Central

Leave a Reply Cancel reply

You may have missed

Behold, the essential oil diffusing ‘fragrance mouse’ – Rock Paper Shotgun

iPhone 17 leaker just revealed renders of all four models — here’s the new designs – Tom’s Guide

Microsoft quietly launches free, ad-supported version of Office apps for Windows with limited functionality – Windows Central

A Mercedes Unimog of All Things Is Coming to Gran Turismo 7 This Week – The Drive