The AI community has been buzzing with reports of GPT-5 surfacing in an unexpected place: a recent pull request to the Biosec Benchmark repository. The commit in question referenced a model named “gpt-5-reasoning-alpha-2025-07-13,” whose date suffix points to July 13 and suggests that OpenAI’s next major language model is already being used internally for high-stakes biosecurity research. The Biosec Benchmark is designed to evaluate how well AI models can identify and mitigate threats from lab-created viruses, which underscores the seriousness of the early testing phase for this new model. The accidental public mention likely points to internal experiments leaking into open-source workflows, a pattern seen before with previous major releases.
BREAKING 🚨: GPT-5 has been spotted as “gpt-5-reasoning-alpha-2025-07-13” in the biosec benchmark commit.
— TestingCatalog News 🗞 (@testingcatalog) July 19, 2025
h/t @swishfever https://t.co/oDkoupZIlq pic.twitter.com/6C4YCxNECZ
On the technical side, another new model appeared on Web Arena around the same time: “o3-alpha-responses-2025-07-17,” whose identifier carries a July 17 date and which was reported publicly on July 18. This model stands out for its coding ability, reportedly outperforming competitors by generating complex web applications, such as Minecraft clones, in a single shot, and demonstrating advanced skill at generating SVG graphics. There is currently no confirmation on whether o3 Alpha Responses is related to GPT-5 or is an independent project, but its early results on Web Arena have drawn attention.
BREAKING 🚨: A new "o3-alpha-responses-2025-07-17" model on Web Arena is overperforming!
— TestingCatalog News 🗞 (@testingcatalog) July 18, 2025
Highly customisable SVG app in one shot 🤖 https://t.co/LVFNr2AeqJ pic.twitter.com/RhnUZbrJFn
The timing of these discoveries lines up with public hints from OpenAI staff, who recently said that GPT-5 will be released soon. They were also clear that the model that achieved gold-medal performance on the International Math Olympiad (IMO) is an experimental research model, and that nothing with that level of math capability is planned for release for several months. These developments suggest OpenAI is deep in the testing and refinement stage, prioritizing both research utility (as seen in the biosecurity work) and practical coding capabilities. For now, both the research and developer communities are left watching for further signals, anticipating a release that could reshape expectations for large language models once again.
8/N Btw, we are releasing GPT-5 soon, and we’re excited for you to try it. But just to be clear: the IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months.
— Alexander Wei (@alexwei_) July 19, 2025
OpenAI’s ongoing strategy appears focused on responsible, gradual rollouts for its most powerful models and ChatGPT, balancing competitive pressure with growing public scrutiny and regulatory conversations about the risks of advanced AI. If the early signs from the Biosec Benchmark and Web Arena are any indication, GPT-5 will target advanced reasoning and applied skills, raising both hopes and questions about the next phase of generative AI deployment.