Phi-3.5-MoE-instruct

Microsoft

Phi-3.5-MoE-instruct is a highly efficient mixture-of-experts model with roughly 42 billion total parameters, of which about 6.6 billion are active per token. With a 128K-token context window, it delivers strong reasoning, mathematics, coding, and multilingual performance, surpassing larger dense models on numerous benchmarks. It was refined through safety-focused post-training, including supervised fine-tuning (SFT) and direct preference optimization (DPO), and is released under the MIT license. It is well suited to tasks that demand both high performance and efficiency, particularly multilingual or reasoning-heavy workloads.
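The efficiency claim comes from sparse expert routing: only a fraction of the weights participate in any single forward pass. As a rough sketch of the arithmetic (the ~42B total / ~6.6B active figures are from this page; the 16-expert, top-2 routing configuration is an assumption based on commonly reported specs, not stated here):

```python
# Back-of-the-envelope MoE parameter arithmetic for Phi-3.5-MoE-instruct.
# Figures from the model description: ~42B total parameters, ~6.6B active.
# The 16-expert / top-2 routing setup is an assumption, not from this page.

TOTAL_PARAMS_B = 42.0   # all expert + shared (attention, embedding) weights
ACTIVE_PARAMS_B = 6.6   # parameters exercised for any single token

# Fraction of the full model that runs per token:
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active fraction per token: {active_fraction:.1%}")  # ~15.7%

# With top-2 routing over 16 experts, 2/16 of the expert weights run per token.
experts_total, experts_active = 16, 2
print(f"Expert utilisation: {experts_active}/{experts_total} "
      f"= {experts_active / experts_total:.0%} of expert weights per token")
```

This is why the model can compete with larger dense models at a fraction of the per-token compute: inference cost tracks the active parameters, while capacity tracks the total.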

Model Specifications

Technical details and capabilities of Phi-3.5-MoE-instruct

Core Specifications

42B Total Parameters (6.6B active)

Model size and complexity

4.9T Training Tokens

Amount of data used in training

128K / 128K

Input / Output tokens

September 30, 2023

Knowledge cutoff date

August 22, 2024

Release date

Capabilities & License

Multimodal Support
Not Supported
Web Hydrated
No
License
MIT

Resources

Research Paper
https://arxiv.org/abs/2404.14219
API Reference
https://huggingface.co/microsoft/Phi-3.5-MoE-instruct

Performance Insights

The benchmark results below show how Phi-3.5-MoE-instruct performs across a range of AI tasks (scores in percent).

ARC Challenge: 91.0
OpenBookQA: 89.6
GSM8K: 88.7
PIQA: 88.6
RULER: 87.1
RepoQA: 85.0
BoolQ: 84.6
HellaSwag: 83.8
MEGA XStoryCloze: 82.8
WinoGrande: 81.3
MBPP: 80.8
BigBench Hard CoT: 79.1
MMLU: 78.9
Social IQA: 78.0
TruthfulQA: 77.5
MEGA XCOPA: 76.6
HumanEval: 70.7
Multilingual MMLU: 69.9
MEGA TyDi QA: 67.1
MEGA MLQA: 65.3
MEGA UDPOS: 60.4
MATH: 59.5
MGSM: 58.7
MMLU-Pro: 54.3
Multilingual MMLU-Pro: 45.3
Qasper: 40.0
Arena Hard: 37.9
GPQA: 36.8
GovReport: 26.4
SQuALITY: 24.1
QMSum: 19.9
SummScreenFD: 16.9
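The scores above are easier to work with as a single structure. A small sketch (values transcribed from this page) that ranks the benchmarks and computes an overall mean:

```python
# Benchmark scores for Phi-3.5-MoE-instruct as listed above (percent).
scores = {
    "ARC Challenge": 91.0, "OpenBookQA": 89.6, "GSM8K": 88.7, "PIQA": 88.6,
    "RULER": 87.1, "RepoQA": 85.0, "BoolQ": 84.6, "HellaSwag": 83.8,
    "MEGA XStoryCloze": 82.8, "WinoGrande": 81.3, "MBPP": 80.8,
    "BigBench Hard CoT": 79.1, "MMLU": 78.9, "Social IQA": 78.0,
    "TruthfulQA": 77.5, "MEGA XCOPA": 76.6, "HumanEval": 70.7,
    "Multilingual MMLU": 69.9, "MEGA TyDi QA": 67.1, "MEGA MLQA": 65.3,
    "MEGA UDPOS": 60.4, "MATH": 59.5, "MGSM": 58.7, "MMLU-Pro": 54.3,
    "Multilingual MMLU-Pro": 45.3, "Qasper": 40.0, "Arena Hard": 37.9,
    "GPQA": 36.8, "GovReport": 26.4, "SQuALITY": 24.1, "QMSum": 19.9,
    "SummScreenFD": 16.9,
}

# Rank from strongest to weakest result.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print("Strongest:", ranked[:3])
print("Weakest:", ranked[-3:])
print(f"Mean across {len(scores)} benchmarks: "
      f"{sum(scores.values()) / len(scores):.1f}%")
```

The spread is informative on its own: multiple-choice reasoning suites (ARC Challenge, OpenBookQA) sit above 89%, while long-form summarization (QMSum, SummScreenFD) falls below 20%.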

Detailed Benchmarks

The comparison below shows how Phi-3.5-MoE-instruct's scores relate to the average achieved by other models on the same benchmarks.

Average across compared models (%):

BigBench Hard CoT: 74.0
BoolQ: 82.9
OpenBookQA: 76.5
PIQA: 83.6
Social IQA: 76.4
WinoGrande: 77.4
Multilingual MMLU: 62.7
MGSM: 68.0
Qasper: 40.9
SQuALITY: 21.2
RULER: 85.6
RepoQA: 81.0
Multilingual MMLU-Pro: 38.1
MEGA MLQA: 63.5
MEGA TyDi QA: 64.7
MEGA UDPOS: 53.4
MEGA XCOPA: 69.8
MEGA XStoryCloze: 78.1
GovReport: 26.2
QMSum: 20.6
SummScreenFD: 16.4
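Where both the model's own score and the cross-model average are reported, the gap can be computed directly. A sketch with the figures transcribed from this page:

```python
# (Phi-3.5-MoE-instruct score, average across compared models) per benchmark,
# both in percent, transcribed from this page.
pairs = {
    "BigBench Hard CoT": (79.1, 74.0),
    "BoolQ": (84.6, 82.9),
    "OpenBookQA": (89.6, 76.5),
    "PIQA": (88.6, 83.6),
    "Social IQA": (78.0, 76.4),
    "WinoGrande": (81.3, 77.4),
    "Multilingual MMLU": (69.9, 62.7),
    "MGSM": (58.7, 68.0),
    "Qasper": (40.0, 40.9),
    "SQuALITY": (24.1, 21.2),
    "RULER": (87.1, 85.6),
    "RepoQA": (85.0, 81.0),
    "Multilingual MMLU-Pro": (45.3, 38.1),
    "MEGA MLQA": (65.3, 63.5),
    "MEGA TyDi QA": (67.1, 64.7),
    "MEGA UDPOS": (60.4, 53.4),
    "MEGA XCOPA": (76.6, 69.8),
    "MEGA XStoryCloze": (82.8, 78.1),
    "GovReport": (26.4, 26.2),
    "QMSum": (19.9, 20.6),
    "SummScreenFD": (16.9, 16.4),
}

# Positive delta = model beats the comparison average.
deltas = {name: round(model - avg, 1) for name, (model, avg) in pairs.items()}
above = [name for name, d in deltas.items() if d > 0]
print(f"Above the comparison average on {len(above)}/{len(pairs)} benchmarks")
print("Largest lead:", max(deltas.items(), key=lambda kv: kv[1]))
print("Largest deficit:", min(deltas.items(), key=lambda kv: kv[1]))
```

By this tally the model beats the comparison average on 18 of 21 benchmarks, with its biggest lead on OpenBookQA and its only notable deficit on MGSM.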
