Claude Sonnet 4 and Opus 4 spotted in early testing round

Anthropic appears to be approaching the public launch phase for its next-generation Claude 4 model family. Internal web configurations now reference “Claude 4,” with early access given to a group of internal testers dubbed “Ants.” The references specifically name Claude Sonnet 4 and Claude Opus 4, which are together described as Anthropic’s smartest models yet. They are accompanied by a note that this version is not intended for production use and is subject to strict rate limits, suggesting it is still under active evaluation and reserved for trusted users.

"Claude 4 is here" - "Try Claude Sonnet 4 and Claude Opus 4 today"

"Try Claude Sonnet 4 or Claude Opus 4 for Anthropic’s smartest models yet."

"Not intended for production use. Subject to strict rate limits"

"show_raw_thinking" / "show_raw_thinking_mechanism"

(not available… pic.twitter.com/615BWDrWJk
— Tibor Blaho (@btibor91) May 21, 2025

The references were likely discovered via frontend configuration flags and are tied to Anthropic’s broader testing initiative known as Claude Neptune. Neptune is thought to be a system aimed at preventing prompt exploits and reinforcing model integrity. Notably, Claude 4 is reportedly categorized under ASL-3, the third AI Safety Level in Anthropic’s Responsible Scaling Policy, which applies to models with higher capability and therefore higher misuse potential. This places it above the current ASL-2 models in terms of both capability and risk.

If the past release cadence holds, Claude 4 could officially arrive around June, consistent with Anthropic's previous mid-year launch patterns. The mention of limited “friends and family” testing suggests the current stage is just short of a broader developer or public preview. For users, especially those already relying on Claude 3 models or the Opus tier, this could offer an early look at more autonomous reasoning features and possibly deeper context retention.

Anthropic, known for its conservative approach to model deployment and safety, continues to iterate cautiously. If Claude 4’s capabilities meet internal expectations, it could mark a sharper pivot toward more agentic use cases, especially if the model’s “thinking” mode becomes a central differentiator. For now, it’s a waiting game, but signs point to Claude 4 entering the final rounds of internal vetting.