
Grok-2 mini

xAI

Grok-2 mini is a streamlined version of Grok-2, designed for speed without sacrificing quality. Though more compact, it's still a powerful tool for reasoning, coding, and engaging in conversation. It provides a strong balance of efficiency and performance.

Model Specifications

Technical details and capabilities of Grok-2 mini

Core Specifications

Input / Output tokens: 128K / 128K
Knowledge cutoff date: July 31, 2024
Release date: August 12, 2024

Capabilities & License

Multimodal Support: Supported
Web Hydrated: Yes
License: Proprietary

Resources

API Reference
https://x.ai/api
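As a quick illustration of how the model might be called programmatically, here is a minimal request-building sketch. Note the endpoint URL (`https://api.x.ai/v1/chat/completions`), the OpenAI-style chat payload, and the model identifier `grok-2-mini` are all assumptions not confirmed by this page; consult the official API reference above before use.

```python
import json
import os
import urllib.request

# Assumed endpoint -- verify against the API reference at https://x.ai/api.
API_URL = "https://api.x.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "grok-2-mini") -> urllib.request.Request:
    """Build a chat-completion HTTP request for the assumed xAI endpoint."""
    payload = {
        "model": model,  # hypothetical model identifier
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Content-Type": "application/json",
        # The API key is read from the environment; never hard-code it.
        "Authorization": f"Bearer {os.environ.get('XAI_API_KEY', '')}",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(payload).encode("utf-8"), headers=headers
    )

# Build (but do not send) a request; sending it would require a valid key.
req = build_request("Summarize Grok-2 mini's knowledge cutoff in one sentence.")
print(req.get_full_url())
```

To actually send the request, pass `req` to `urllib.request.urlopen` and decode the JSON response.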

Performance Insights

See how Grok-2 mini performs across a range of AI tasks, based on comprehensive benchmark results.

Benchmark    Score
DocVQA       93.2
MMLU         86.2
HumanEval    85.7
MATH         73.0
MMLU-Pro     72.0
MathVista    68.1
MMMU         63.2
GPQA         51.0
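As a quick illustration, the scores listed above can be aggregated into a single unweighted mean. This figure is not reported by the page itself; it is a simple arithmetic summary of the eight benchmarks:

```python
# Grok-2 mini benchmark scores as listed above.
scores = {
    "DocVQA": 93.2,
    "MMLU": 86.2,
    "HumanEval": 85.7,
    "MATH": 73.0,
    "MMLU-Pro": 72.0,
    "MathVista": 68.1,
    "MMMU": 63.2,
    "GPQA": 51.0,
}

# Unweighted mean across all eight benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # -> 74.05
```

An unweighted mean treats very different tasks (document QA, math, coding) as equally important, so it is only a rough overall indicator.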

Model Comparison

See how Grok-2 mini stacks up against other leading models across key performance metrics.

Benchmark | Grok-2 mini | Llama 3.1 405B Instruct | Llama 3.3 70B Instruct | GPT-4o | Grok-2 | Phi-4
GPQA      | 51.0        | 50.7                    | 50.5                   | 53.6   | 56.0   | 56.1
MMLU      | 86.2        | 87.3                    | 86.0                   | 88.7   | 87.5   | 84.8
MMLU-Pro  | 72.0        | 73.3                    | 68.9                   | 72.6   | 75.5   | 70.4
MATH      | 73.0        | 73.8                    | 77.0                   | 76.6   | 76.1   | 80.4
HumanEval | 85.7        | 89.0                    | 88.4                   | 90.2   | 88.4   | 82.6

Detailed Benchmarks

Dive deeper into Grok-2 mini's performance across specific task categories. Expand each section to see detailed metrics and comparisons.

Coding

HumanEval (average across compared models: 81.6%)

Knowledge

GPQA (average across compared models: 53.0%)
MMLU (average across compared models: 84.6%)
MATH (average across compared models: 71.5%)

Uncategorized

MMLU-Pro (average across compared models: 70.3%)
MathVista (average across compared models: 61.6%)
DocVQA (average across compared models: 92.1%)

Providers Pricing Coming Soon

We're working on gathering comprehensive pricing data from all major providers for Grok-2 mini. Compare costs across platforms to find the best pricing for your use case.


Share your feedback

Hi, I'm Charlie Palars, the founder of Deepranking.ai. I'm always looking for ways to improve the site and make it more useful for you. You can write me through this form or directly through X at @palarsio.

Your feedback helps us improve our service
