Open-Source DeepSeek-R1 challenges OpenAI’s o1 in advanced benchmarks

DeepSeek · 2 min read

DeepSeek has unveiled DeepSeek-R1, a groundbreaking open-source AI model that rivals OpenAI's o1 in performance. This release marks a significant milestone in the democratization of advanced AI technology.

DeepSeek-R1 shows strong capabilities in mathematical reasoning, coding, and complex problem-solving. The model relies on large-scale reinforcement learning in its post-training phase, which allows it to reach this level of performance with minimal labeled data. Its mixture-of-experts architecture holds 671 billion parameters in total, of which only 37 billion are activated for any given token, keeping inference cost far below what the raw parameter count suggests.
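That activation figure reflects the mixture-of-experts design: a learned router picks a small subset of expert sub-networks per token, so most of the weights are never touched on a single forward pass. The toy sketch below illustrates the general top-k routing idea only; it is not DeepSeek's code, and the sizes and names in it are invented for illustration.

```python
import numpy as np

def moe_layer(x, expert_weights, gate_weights, k=2):
    """Route one token vector x to its top-k experts and mix their outputs.

    x              : (d,) token representation
    expert_weights : (n_experts, d, d) one weight matrix per expert
    gate_weights   : (d, n_experts) router that scores experts for this token
    """
    scores = x @ gate_weights                 # router logits, one per expert
    top_k = np.argsort(scores)[-k:]           # indices of the k best-scoring experts
    probs = np.exp(scores[top_k])
    probs /= probs.sum()                      # softmax over the selected experts only
    # Only the chosen experts' parameters are used for this token.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top_k))

# Toy numbers: 8 experts exist, but each token only exercises 2 of them.
rng = np.random.default_rng(0)
d, n_experts = 16, 8
out = moe_layer(rng.normal(size=d),
                rng.normal(size=(n_experts, d, d)) * 0.1,
                rng.normal(size=(d, n_experts)))
print(out.shape)  # (16,)
```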

One of the most notable aspects of DeepSeek-R1 is its open-source nature. The model is released under the MIT license, allowing for free commercial use, distillation, and modification. This permissive licensing approach aims to foster innovation and collaboration within the AI community.

In addition to the main model, DeepSeek has released six smaller distilled versions, with the 32B and 70B variants performing on par with OpenAI's o1-mini. These models range from 1.5B to 70B parameters, catering to various computational requirements and use cases.
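For readers who want to try a distilled variant locally, here is a minimal sketch using Hugging Face transformers. The model id is an assumption (it presumes the checkpoints are published under the deepseek-ai organization, e.g. DeepSeek-R1-Distill-Qwen-1.5B); swap in whichever distilled size fits your hardware.

```python
# Minimal local-inference sketch for a distilled DeepSeek-R1 checkpoint.
# The model id below is an assumption; verify the exact name on Hugging Face.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # smallest distilled variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```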

DeepSeek-R1 is now accessible through multiple channels:

  1. A web interface at chat.deepseek.com, featuring a "Deep Thinking" mode.
  2. An API for developers, with competitive pricing (see the usage sketch after this list):
    • $0.14 per million input tokens (cache hit)
    • $0.55 per million input tokens (cache miss)
    • $2.19 per million output tokens
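
For the API route, the sketch below shows how a developer might call the service, assuming it exposes an OpenAI-compatible chat endpoint at https://api.deepseek.com and a reasoning model named "deepseek-reasoner"; both details are assumptions to check against DeepSeek's API documentation. The final lines turn the listed per-token prices into a rough cost estimate for a single request.

```python
# Hedged sketch: assumes an OpenAI-compatible endpoint at api.deepseek.com
# and a model name of "deepseek-reasoner" -- confirm both in the API docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",   # issued from the DeepSeek platform
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "How many primes are there below 100?"}],
)
print(response.choices[0].message.content)

# Back-of-the-envelope cost at the listed prices: a full million input tokens
# (cache miss) plus a million output tokens would run $0.55 + $2.19 = $2.74.
usage = response.usage
cost = usage.prompt_tokens / 1e6 * 0.55 + usage.completion_tokens / 1e6 * 2.19
print(f"approx. cost at cache-miss pricing: ${cost:.6f}")
```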

The model's performance has been validated across several benchmarks, including AIME, MATH-500, and SWE-bench Verified, where it matches or exceeds o1's capabilities. DeepSeek-R1 excels in self-verification, reflection, and generating long chains of thought, making it particularly suited for tasks requiring complex reasoning.

As DeepSeek-R1 moves from benchmark results into everyday use, it will face real-world workloads and human evaluation. The AI community is encouraged to test its capabilities and provide feedback, contributing to its ongoing development and refinement.

This release represents a significant step forward in open-source AI, potentially reshaping the landscape of accessible, high-performance language models for researchers, developers, and businesses worldwide.