
DeepSeek-V3

DeepSeek

DeepSeek-V3 is a Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated for each token. It adopts Multi-head Latent Attention (MLA) for efficient inference and the DeepSeekMoE architecture for cost-effective training, and it pairs an auxiliary-loss-free load-balancing strategy with a multi-token prediction training objective. Pre-trained on 14.8 trillion tokens, the model excels at complex reasoning, mathematical problem-solving, and code generation. The auxiliary-loss-free strategy keeps expert load balanced without the extra balancing loss that can interfere with training and degrade accuracy, while multi-token prediction densifies the training signal and can accelerate inference through speculative decoding. DeepSeek-V3's performance has been validated on a wide range of benchmarks, where it surpasses other open-source models with scores of 88.5 on MMLU, 75.9 on MMLU-Pro, and 90.2 on the MATH-500 mathematical reasoning task. It reached this level of capability at a comparatively low training cost of 2.788 million H800 GPU hours, roughly $5.576 million at an assumed rate of $2 per GPU hour.
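To make the auxiliary-loss-free load balancing concrete, below is a small, hypothetical Python/NumPy sketch of bias-adjusted top-k expert routing. It is not DeepSeek's implementation: the function names (`route_tokens`, `update_bias`), the `bias_update_rate` value, and the random affinity scores are made up for illustration, and the real model applies this per MoE layer on learned token-to-expert affinities with additional constraints such as node-limited routing.

```python
import numpy as np

# Illustrative sketch of auxiliary-loss-free load balancing:
# each expert carries a bias that is added to its routing score
# only when selecting the top-k experts; the gating weights that
# scale expert outputs stay bias-free, and the bias is nudged
# after each batch so overloaded experts become less likely picks.

def route_tokens(scores, bias, k):
    """Pick top-k experts per token using biased scores.

    scores: (num_tokens, num_experts) affinity scores
    bias:   (num_experts,) balancing bias, used only for selection
    k:      experts activated per token
    """
    biased = scores + bias                             # selection uses the bias
    topk = np.argsort(-biased, axis=1)[:, :k]          # chosen expert indices
    gates = np.take_along_axis(scores, topk, axis=1)   # weights use raw scores
    gates = gates / gates.sum(axis=1, keepdims=True)   # normalize per token
    return topk, gates

def update_bias(bias, topk, num_experts, bias_update_rate=0.001):
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    load = np.bincount(topk.ravel(), minlength=num_experts)
    return bias - bias_update_rate * np.sign(load - load.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_tokens, num_experts, k = 1024, 16, 2
    bias = np.zeros(num_experts)
    for _ in range(100):
        scores = rng.random((num_tokens, num_experts))
        topk, gates = route_tokens(scores, bias, k)
        bias = update_bias(bias, topk, num_experts)
    print("per-expert load after balancing:",
          np.bincount(topk.ravel(), minlength=num_experts))
```

The design choice this sketch highlights is that the bias only influences which experts are selected; the gating weights come from the unbiased affinities, so balancing adds no auxiliary loss term whose gradients could interfere with the main objective.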

Model Specifications

Technical details and capabilities of DeepSeek-V3

Core Specifications

Parameters: 671B (37B activated per token)

Training tokens: 14.8T

Context window: 131.1K input / 131.1K output tokens

Release date: December 24, 2024

Capabilities & License

Multimodal support: Not supported
Web hydrated: No
License: MIT + Model License (commercial use allowed)

Resources

Research Paper: https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
API Reference: https://platform.deepseek.com
Playground: https://chat.deepseek.com
Code Repository: https://github.com/deepseek-ai/DeepSeek-V3
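As a quick orientation alongside the API Reference above, here is a minimal example of querying DeepSeek-V3 through the platform's OpenAI-compatible chat completions endpoint in Python. The base URL, the `deepseek-chat` model name, and the `DEEPSEEK_API_KEY` environment variable reflect DeepSeek's public documentation at the time of writing; consult the API Reference for current values.

```python
import os
from openai import OpenAI  # pip install openai

# DeepSeek's platform exposes an OpenAI-compatible API, so the standard
# OpenAI client can be pointed at it with a custom base URL.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],   # key issued at platform.deepseek.com
    base_url="https://api.deepseek.com",      # DeepSeek endpoint, not OpenAI's
)

response = client.chat.completions.create(
    model="deepseek-chat",                    # serves DeepSeek-V3
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Mixture-of-Experts in one paragraph."},
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
```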

Performance Insights

Check out how DeepSeek-V3 handles various AI tasks through comprehensive benchmark results.

DeepSeek-V3 benchmark scores (higher is better):

DROP: 91.6
MBPPPlus: 91
CLUEWSC: 90.9
MATH-500: 90.2
MMLU-Redux: 89.1
MMLU: 88.5
C-Eval: 86.5
IFEval: 86.1
RepoQA 32k: 86
HumanEval-Mul: 82.6
Aider-Edit: 79.7
MMLU-Pro: 75.9
FRAMES: 73.3
C-SimpleQA: 64.8
MATH: 61.6
BFCL: 60
GPQA: 59.1
SQL: 58.0
Taubench Retail: 55.0
Codeforces: 51.6
Aider-Polyglot: 49.6
LongBench v2: 48.7
CNMO-2024: 43.2
SWE-bench Verified: 42
LiveCodeBench: 40.5
AIME-2024: 39.2
LiveCodeBench: 37.6
Taubench Airline: 30
SimpleQA: 24.9

Model Comparison

See how DeepSeek-V3 stacks up against other leading models across key performance metrics.

Benchmark   DeepSeek-V3   Claude 3.5 Sonnet   Claude 3 Opus   Llama 3.1 405B Instruct   GPT-4o   Phi-4
MMLU        88.5          90.4                86.8            87.3                      88.7     84.8
MMLU-Pro    75.9          76.1                68.5            73.3                      72.6     70.4
DROP        91.6          87.1                83.1            84.8                      83.4     75.5
GPQA        59.1          59.4                50.4            50.7                      53.6     56.1
MATH        61.6          71.1                60.1            73.8                      76.6     80.4

Detailed Benchmarks

Dive deeper into DeepSeek-V3's performance across specific task categories.

Math: MATH-500 (Avg 93.4%)

Coding: LiveCodeBench (Avg 46.0%), Codeforces (Avg 57.0%), SWE-bench Verified (Avg 48.4%), Aider-Polyglot (Avg 51.4%)

Reasoning: DROP (Avg 87.2%)

Knowledge: MMLU (Avg 86.7%), GPQA (Avg 59.0%)

Non categorized: MMLU-Pro (Avg 72.9%), IFEval (Avg 84.7%), SimpleQA (Avg 30.4%), FRAMES (Avg 77.9%), LongBench v2 (Avg 42.9%), CLUEWSC (Avg 91.7%), C-Eval (Avg 84.8%), C-SimpleQA (Avg 64.3%), Taubench Retail (Avg 58.3%), Taubench Airline (Avg 38.0%), BFCL (Avg 66.6%), MBPPPlus (Avg 89.3%), SQL (Avg 66.3%), RepoQA 32k (Avg 89.0%)

