
Qwen2.5 7B Instruct

Qwen

Qwen2.5-7B-Instruct is a 7-billion-parameter language model fine-tuned for instruction following. It is particularly strong at generating long-form content (over 8,000 tokens), understanding structured inputs such as tables, and producing structured outputs such as JSON. Compared with its predecessors, it also improves on mathematical reasoning, coding, and multilingual use, supporting over 29 languages including Chinese, English, French, and Spanish.
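The code repository linked below documents inference via Hugging Face transformers. Here is a minimal sketch of prompting the model for JSON output, assuming the Qwen/Qwen2.5-7B-Instruct checkpoint on the Hugging Face Hub (the prompt and decoding settings are illustrative, not prescribed by the model card):

```python
# Minimal sketch: structured JSON output from Qwen2.5-7B-Instruct via
# Hugging Face transformers. Adjust dtype/device for your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant. Reply only with valid JSON."},
    {"role": "user", "content": "Extract name and year: 'Qwen2.5 was released in 2024.'"},
]

# Build the prompt with the model's built-in chat template, then generate.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(inputs, max_new_tokens=256)

# Strip the prompt tokens and decode only the newly generated completion.
response = tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)  # e.g. {"name": "Qwen2.5", "year": 2024}
```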

Model Specifications

Technical details and capabilities of Qwen2.5 7B Instruct

Core Specifications

7.6B Parameters
Model size and complexity

18T (18,000B) Training Tokens
Amount of data used in training

131.1K Input / 8.2K Output Tokens
Maximum context window / maximum generation length

September 18, 2024
Release date
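In practice, long prompts should be checked against these limits before a call. Whether input and output share one token budget can vary by serving stack, so the sketch below conservatively budgets both (the tokenizer ID is assumed from the Qwen2.5 code repository):

```python
# Sketch: verify a long prompt fits the advertised 131.1K-token input limit
# while leaving room for the 8.2K-token output cap. Tokenizer ID assumed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

CONTEXT_WINDOW = 131_072  # ~131.1K input tokens
MAX_GENERATION = 8_192    # ~8.2K output tokens

prompt = "Summarize the following report:\n" + "lorem ipsum " * 50_000  # stand-in for a long document
n_prompt_tokens = len(tokenizer.encode(prompt))

# Conservatively require prompt + completion to fit within the window.
if n_prompt_tokens + MAX_GENERATION > CONTEXT_WINDOW:
    print(f"Prompt too long ({n_prompt_tokens} tokens); truncate before sending.")
else:
    print(f"OK: {n_prompt_tokens} prompt tokens, "
          f"{CONTEXT_WINDOW - n_prompt_tokens} tokens of headroom.")
```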

Capabilities & License

Multimodal Support: Not supported
Web Hydrated: No
License: apache-2.0

Resources

Research Paper: https://arxiv.org/abs/2407.10671
API Reference: https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api
Code Repository: https://github.com/QwenLM/Qwen2.5
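The API reference above covers hosted access through Alibaba Cloud Model Studio, which also exposes an OpenAI-compatible mode. The following is a hedged sketch: the base URL, model identifier, and environment variable are assumptions to verify against the linked documentation:

```python
# Hedged sketch: calling Qwen2.5 7B Instruct through an OpenAI-compatible
# endpoint. Base URL, model name, and env var are assumptions; confirm them
# in the Alibaba Cloud Model Studio API reference linked above.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed name for your Model Studio key
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

completion = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # assumed model identifier
    messages=[
        {"role": "user", "content": "Explain Qwen2.5's context window in one sentence."},
    ],
    max_tokens=512,  # well under the 8.2K output cap listed above
)
print(completion.choices[0].message.content)
```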

Performance Insights

See how Qwen2.5 7B Instruct handles a variety of AI tasks, based on comprehensive benchmark results.

Benchmark scores (higher is better):

GSM8K: 91.6
MT-bench: 87.5
HumanEval: 84.8
MBPP: 79.2
MATH: 75.5
MMLU-Redux: 75.4
AlignBench: 73.3
IFEval: 71.2
MultiPL-E: 70.4
MMLU-Pro: 56.3
Arena-Hard: 52.0
GPQA: 36.4
LiveBench: 35.9
LiveCodeBench: 28.7

Model Comparison

See how Qwen2.5 7B Instruct stacks up against other leading models across key performance metrics.

Model                   MMLU-Pro  MATH  GSM8K  HumanEval  GPQA
Qwen2.5 7B Instruct     56.3      75.5  91.6   84.8       36.4
Qwen2.5 14B Instruct    63.7      80.0  94.8   83.5       45.5
Qwen2 72B Instruct      64.4      59.7  91.1   86.0       42.4
Qwen2.5 32B Instruct    69.0      83.1  95.9   88.4       49.5
Qwen2.5 72B Instruct    71.1      83.1  95.8   86.6       49.0
Phi-3.5-MoE-instruct    54.3      59.5  88.7   70.7       36.8

Detailed Benchmarks

Dive deeper into Qwen2.5 7B Instruct's performance across specific task categories. Each benchmark below lists the model's score alongside the average across the compared models.

Knowledge

MATH: 75.5 (average across compared models: 73.4%)
GPQA: 36.4 (average: 40.6%)

Non-categorized

MultiPL-E: 70.4 (average: 70.3%)
IFEval: 71.2 (average: 76.6%)
Arena-Hard: 52.0 (average: 66.6%)
AlignBench: 73.3 (average: 76.9%)
MT-bench: 87.5 (average: 90.5%)
LiveBench: 35.9 (average: 54.7%)

Providers Pricing (Coming Soon)

We're working on gathering comprehensive pricing data from all major providers for Qwen2.5 7B Instruct. Compare costs across platforms to find the best pricing for your use case.

