o1-preview

OpenAI

This research model showcases advanced mathematical and logical reasoning. It excels at tasks that demand methodical, step-by-step thinking, such as solving complex math problems and generating code, and its enhanced formal reasoning comes with robust performance across a variety of general AI tasks.

Model Specifications

Technical details and capabilities of o1-preview

Core Specifications

Input / Output tokens: 128K / 32.8K
Knowledge cutoff date: November 30, 2023
Release date: September 11, 2024

Capabilities & License

Multimodal Support: Not supported
Web Access: No
License: Proprietary

Resources

Research Paper: https://cdn.openai.com/o1-system-card-20240917.pdf
API Reference: https://platform.openai.com/docs/models
Code Repository: https://github.com/openai
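
For orientation, here is a minimal sketch of calling o1-preview through the Chat Completions endpoint with the official openai Python SDK (see the API Reference above). It assumes an OPENAI_API_KEY environment variable; note that at release, o1-preview did not accept a system message and used max_completion_tokens rather than max_tokens.

```python
# Sketch: query o1-preview via the OpenAI Python SDK (pip install openai).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        # o1-preview accepts only user/assistant messages (no system role)
        # and reasons step by step internally before answering.
        {"role": "user", "content": "Prove that the sum of two odd integers is even."}
    ],
    max_completion_tokens=2048,  # o1 models use max_completion_tokens, not max_tokens
)

print(response.choices[0].message.content)
```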

Performance Insights

Check out how o1-preview handles various AI tasks through comprehensive benchmark results.

All scores are out of 100.

MMLU: 90.8
MGSM: 90.8
MATH: 85.5
GPQA: 73.3
LiveBench: 52.3
SimpleQA: 42.4
AIME 2024: 42.0
SWE-bench: 41.3
Codeforces: 31.4

Model Comparison

See how o1-preview stacks up against other leading models across key performance metrics.

Benchmark    o1-preview   o1     GPT-4o   o3-mini   Phi-4
MATH             85.5    96.4     76.6      97.9     80.4
MMLU             90.8    91.8     88.7      86.9     84.8
GPQA             73.3    78.0     53.6      79.7     56.1
MGSM             90.8    89.3     90.5      92.0     80.6
SimpleQA         42.4    42.6     61.8      13.8      3.0

Detailed Benchmarks

Dive deeper into o1-preview's performance across specific task categories, with the average score of compared models for each benchmark.

Math

AIME 2024: 42.0 (average across compared models: 63.7%)

Coding

Codeforces: 31.4 (compared models ranged from 11.0% to 90.0%; average 52.4%)

Knowledge

MATH: 85.5 (average across compared models: 82.4%)
MMLU: 90.8 (average across compared models: 87.0%)
GPQA: 73.3 (average across compared models: 70.7%)

Uncategorized

MGSM: 90.8 (average across compared models: 86.2%)
LiveBench: 52.3 (average across compared models: 54.7%)
SimpleQA: 42.4 (average across compared models: 37.8%)

Providers Pricing Coming Soon

We're working on gathering comprehensive pricing data from all major providers for o1-preview. Compare costs across platforms to find the best pricing for your use case.


