o1-mini

OpenAI

o1-mini is OpenAI's resource-conscious language model, engineered for sophisticated reasoning while demanding minimal computing power, making it a cost-effective alternative to larger models.

Model Specifications

Technical details and capabilities of o1-mini

Core Specifications

Input / output tokens: 128K / 65.5K
Knowledge cutoff date: August 31, 2023
Release date: September 12, 2024

Capabilities & License

Multimodal support: Not supported
Web access: No
License: Proprietary

Resources

Research Paper: https://cdn.openai.com/o1-system-card-20240917.pdf
API Reference: https://openai.com/api/o1-mini
Playground: https://platform.openai.com/playground
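
For quick reference, here is a minimal sketch of calling o1-mini through the official openai Python SDK (v1+). The prompt and token budget below are illustrative only; note that OpenAI's reasoning models cap output with max_completion_tokens rather than max_tokens.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# o1-mini accepts up to 128K input tokens and can emit up to 65.5K
# output tokens; reasoning models use max_completion_tokens (not
# max_tokens) to cap the completion.
response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
    max_completion_tokens=2048,  # illustrative budget, well under the 65.5K cap
)
print(response.choices[0].message.content)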

Performance Insights

Check out how o1-mini handles various AI tasks through comprehensive benchmark results.

Benchmark scores (higher is better):

HumanEval: 92.4%
MATH-500: 90.0%
MMLU: 85.2%
SuperGLUE: 75.0%
GPQA: 60.0%
Codeforces: 41.3%
Cybersecurity CTFs: 28.7%

Model Comparison

See how o1-mini stacks up against other leading models across key performance metrics.

Benchmark scores (%):

Model                     HumanEval   MMLU   GPQA
o1-mini                   92.4        85.2   60.0
Claude 3.5 Sonnet         92.0        90.4   59.4
Grok-2                    88.4        87.5   56.0
GPT-4o                    90.2        88.7   53.6
Gemini 1.5 Pro            84.1        85.9   59.1
Llama 3.1 405B Instruct   89.0        87.3   50.7

Detailed Benchmarks

Dive deeper into o1-mini's performance across specific task categories, with detailed metrics and comparisons for each.

Math

MATH-500: o1-mini scores 90.0%, just below the 93.4% average of comparable models.

Coding

Codeforces: o1-mini scores 41.3%, against other models ranging from 11.0% to 79.0% (average 52.9%).

HumanEval: o1-mini scores 92.4%, above the 85.1% average of comparable models.

Knowledge

MMLU: o1-mini scores 85.2%, above the 83.7% average of comparable models.

GPQA: o1-mini scores 60.0%, just below the 61.7% average of comparable models.

Provider Pricing Coming Soon

We're gathering comprehensive pricing data for o1-mini from all major providers, so you'll soon be able to compare costs across platforms and find the best option for your use case.

