
DeepSeek-V2.5

DeepSeek

DeepSeek-V2.5 is a major upgrade that merges the strengths of DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct into a single model with stronger general and coding capabilities. It is better aligned with human preferences and has been optimized in several areas, including writing quality and instruction following.

Model Specifications

Technical details and capabilities of DeepSeek-V2.5

Core Specifications

236B Parameters

Total model size; as a Mixture-of-Experts model, roughly 21B parameters are activated per token (see the linked paper)

8.2K / 8.2K

Input / Output tokens

May 7, 2024

Release date

Capabilities & License

Multimodal Support
Not Supported
License
DeepSeek Model License

Resources

Research Paper
https://arxiv.org/abs/2405.04434
API Reference
https://www.deepseek.com/
Playground
https://huggingface.co/deepseek-ai/DeepSeek-V2.5
Code Repository
https://huggingface.co/deepseek-ai/DeepSeek-V2.5
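For readers who want to try the model programmatically: DeepSeek's hosted API is OpenAI-compatible, so the standard `openai` Python client can be pointed at it. This is a minimal sketch; the endpoint URL, the `deepseek-chat` model id, and the environment-variable name are assumptions based on DeepSeek's public documentation, so verify them against the API reference above before relying on this.

```python
def build_chat_request(prompt: str, model: str = "deepseek-chat") -> dict:
    """Build a /chat/completions payload for an OpenAI-compatible endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }


if __name__ == "__main__":
    import os

    try:
        from openai import OpenAI  # pip install openai
    except ImportError:
        OpenAI = None

    # Only attempt a live call if the client and an API key are available.
    if OpenAI is not None and os.environ.get("DEEPSEEK_API_KEY"):
        client = OpenAI(
            base_url="https://api.deepseek.com",  # assumed endpoint; check docs
            api_key=os.environ["DEEPSEEK_API_KEY"],
        )
        resp = client.chat.completions.create(
            **build_chat_request("Write a haiku about code.")
        )
        print(resp.choices[0].message.content)
```

The payload builder is separated from the network call so the request shape can be inspected or tested without credentials.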

Performance Insights

See how DeepSeek-V2.5 handles a range of AI tasks, as measured by published benchmark results.

Benchmark         Score
GSM8K             95.1
MT-Bench          90.2
HumanEval         89.0
BBH               84.3
AlignBench        80.4
MMLU              80.4
DS-FIM-Eval       78.3
ArenaHard         76.2
MATH              74.7
HumanEval Multi   73.8
Aider             72.2
DS-Arena-Code     63.1
AlpacaEval 2.0    50.5
LiveCodeBench     41.8
SWE-verified      16.8

Model Comparison

See how DeepSeek-V2.5 stacks up against other leading models across key performance metrics.

Benchmark   DeepSeek-V2.5   Nova Pro   Qwen2.5 32B Instruct   Qwen2.5 14B Instruct   Nova Micro   Gemma 3 27B
HumanEval   89.0            89.0       88.4                   83.5                   81.1         87.8
MMLU        80.4            85.9       83.3                   79.7                   77.6         76.9
GSM8K       95.1            94.8       95.9                   94.8                   92.3         95.9
MATH        74.7            76.6       83.1                   80.0                   69.3         89.0
BBH         84.3            86.9       84.5                   78.2                   79.5         87.6
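For readers who want to slice the comparison programmatically, here is a minimal sketch. The scores are transcribed from the comparison table above (these six models only), and the unweighted mean is just one illustrative way to summarize them, not an official ranking metric.

```python
# Benchmark scores transcribed from the comparison table above.
SCORES = {
    "HumanEval": {"DeepSeek-V2.5": 89.0, "Nova Pro": 89.0, "Qwen2.5 32B Instruct": 88.4,
                  "Qwen2.5 14B Instruct": 83.5, "Nova Micro": 81.1, "Gemma 3 27B": 87.8},
    "MMLU":      {"DeepSeek-V2.5": 80.4, "Nova Pro": 85.9, "Qwen2.5 32B Instruct": 83.3,
                  "Qwen2.5 14B Instruct": 79.7, "Nova Micro": 77.6, "Gemma 3 27B": 76.9},
    "GSM8K":     {"DeepSeek-V2.5": 95.1, "Nova Pro": 94.8, "Qwen2.5 32B Instruct": 95.9,
                  "Qwen2.5 14B Instruct": 94.8, "Nova Micro": 92.3, "Gemma 3 27B": 95.9},
    "MATH":      {"DeepSeek-V2.5": 74.7, "Nova Pro": 76.6, "Qwen2.5 32B Instruct": 83.1,
                  "Qwen2.5 14B Instruct": 80.0, "Nova Micro": 69.3, "Gemma 3 27B": 89.0},
    "BBH":       {"DeepSeek-V2.5": 84.3, "Nova Pro": 86.9, "Qwen2.5 32B Instruct": 84.5,
                  "Qwen2.5 14B Instruct": 78.2, "Nova Micro": 79.5, "Gemma 3 27B": 87.6},
}


def mean_score(model: str) -> float:
    """Unweighted mean of one model's scores across the five benchmarks."""
    vals = [per_model[model] for per_model in SCORES.values()]
    return sum(vals) / len(vals)


# Rank the six models by unweighted mean across the five benchmarks.
ranking = sorted(SCORES["MMLU"], key=mean_score, reverse=True)
```

Note that an unweighted mean treats coding, math, and knowledge benchmarks as equally important; reweighting by task relevance would change the ordering.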

Detailed Benchmarks

Dive deeper into DeepSeek-V2.5's performance across specific task categories; each entry shows the model's score alongside the average of the compared models.

Math

GSM8K: 95.1 (average across compared models: 92.9%)

Coding

LiveCodeBench: 41.8 (average: 48.9%)

Knowledge

MATH: 74.7 (average: 73.0%)

Uncategorized

ArenaHard: 76.2 (average: 81.3%)
AlignBench: 80.4 (average: 76.9%)
Aider: 72.2 (average: 63.9%)
BBH: 84.3 (average: 81.8%)

Providers Pricing Coming Soon

We're working on gathering comprehensive pricing data from all major providers for DeepSeek-V2.5. Compare costs across platforms to find the best pricing for your use case.

