GPT-4o mini

OpenAI

GPT-4o mini is OpenAI's budget-friendly model, intended to broaden access to AI. It surpasses earlier models such as GPT-3.5 Turbo in text comprehension and multimodal reasoning, and it supports both text and vision inputs within a 128K-token context window, enabling real-time, cost-effective applications such as customer-service chatbots. Pricing is a significant improvement over previous models: 15 cents per million input tokens and 60 cents per million output tokens. GPT-4o mini also ships with robust safety features and enhanced defenses against security risks.
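
To put those rates in concrete terms, here is a short Python sketch that estimates the cost of a single request from the per-token prices quoted above. The token counts are illustrative assumptions, not measurements.

    # Pricing sketch for GPT-4o mini, using the rates quoted above:
    # $0.15 per 1M input tokens, $0.60 per 1M output tokens.
    INPUT_USD_PER_MILLION = 0.15
    OUTPUT_USD_PER_MILLION = 0.60

    def request_cost(input_tokens: int, output_tokens: int) -> float:
        """Estimated cost in USD of a single API call."""
        return (input_tokens * INPUT_USD_PER_MILLION
                + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000

    # Illustrative chatbot turn: 1,200 prompt tokens in, 300 tokens out.
    print(f"${request_cost(1_200, 300):.6f}")  # -> $0.000360

At these rates, a million such exchanges would cost roughly $360, which is the arithmetic behind the cost-effective chatbot claim.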

Model Specifications

Technical details and capabilities of GPT-4o mini

Core Specifications

Input / output tokens: 128.0K / 16.4K
Knowledge cutoff date: September 30, 2023
Release date: July 17, 2024

Capabilities & License

Multimodal support: Supported (text and vision)
Web browsing: No
License: Proprietary
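
The multimodal support noted above means a single request can mix text and images. Below is a minimal sketch using the OpenAI Python SDK's Chat Completions endpoint; the image URL is a placeholder, and the client is assumed to read OPENAI_API_KEY from the environment.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # One request mixing text and an image; GPT-4o mini accepts both.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is shown in this image."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
            ],
        }],
    )
    print(response.choices[0].message.content)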

Resources

API Reference
https://platform.openai.com/docs/api-reference
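
As a starting point alongside the API reference above, here is a minimal text-only call. Streaming is enabled because the model is pitched at real-time use cases; this is a sketch assuming the current openai Python SDK, not a definitive integration, and the prompt is an invented example.

    from openai import OpenAI

    client = OpenAI()

    # Stream tokens as they are generated -- suits real-time chat UIs.
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Summarize this support ticket in two sentences."}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk may carry no content
            print(delta, end="", flush=True)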

Performance Insights

Check out how GPT-4o mini handles various AI tasks through comprehensive benchmark results.

HumanEval: 87.2
MGSM: 87
MMLU: 82
DROP: 79.7
MATH: 70.2
MMMU: 59.4
MathVista: 56.7
GPQA: 40.2

Model Comparison

See how GPT-4o mini stacks up against other leading models across key performance metrics.

Benchmark   GPT-4o mini   GPT-4 Turbo   Llama 3.3 70B Instruct   Claude 3 Opus   GPT-4o   Gemini 1.5 Flash
MMLU        82            86.5          86                       86.8            88.7     78.9
HumanEval   87.2          87.1          88.4                     84.9            90.2     74.3
GPQA        40.2          48            50.5                     50.4            53.6     51
MGSM        87            88.5          91.1                     90.7            90.5     82.6
MATH        70.2          72.6          77                       60.1            76.6     77.9

Detailed Benchmarks

Dive deeper into GPT-4o mini's performance across specific task categories. Expand each section to see detailed metrics and comparisons.

Coding

HumanEval: 87.2 (average across compared models: 82.8)

Reasoning

DROP: 79.7 (average: 79.7)

Knowledge

GPQA: 40.2 (average: 43.4)
MATH: 70.2 (average: 69.6)

Non-categorized

MGSM: 87 (average: 83.9)
MathVista: 56.7 (average: 54.1)

Providers Pricing (Coming Soon)

We're working on gathering comprehensive pricing data from all major providers for GPT-4o mini. Compare costs across platforms to find the best pricing for your use case.


