
Grok-3 Mini
xAI
Grok 3 Mini, the lightweight sibling of xAI's flagship model, shows that "small" doesn't have to mean "shallow." Built on a leaner architecture and optimized for speed, Mini still reasons well, particularly when the Think setting lets it spend extra compute at inference time.
On AIME 2024, a competition-math benchmark, Grok 3 Mini (Think) scores 95.8%, ahead of rivals such as Gemini 2.0 Flash Thinking (73.3%) and even Grok 3 Beta (93.3%). It also leads in code generation with 80.4% on LiveCodeBench, edging out Grok 3 Beta (79.4%) and beating Claude 3.5 Sonnet (66.3%) and GPT-4o (32.3%) by wide margins. Its 84.0% on GPQA indicates strong performance on graduate-level, "Google-proof" science questions.
These results position Mini not as a fallback but as a front-line option for STEM-heavy tasks where speed, cost, and accuracy all matter, such as on-device tutoring, fast coding agents, or high-throughput evaluation pipelines. Critically, Grok 3 Mini (Think) benefits from large-scale RL fine-tuning, which lets it reason in multiple steps with error correction and backtracking, something previously reserved for frontier models.
While it lacks the full contextual breadth of Grok 3 (1M-token context, advanced multimodal fluency), Mini still posts competitive scores on MMLU-Pro (78.9%) and MMMU (69.4%), reflecting robust general knowledge and image understanding. In short, Grok 3 Mini isn't a "cut-down" model; it's a recalibrated one. For researchers and builders who want compute efficiency without giving up reasoning depth, Grok 3 Mini (Think) offers an unusually capable tradeoff.
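To make the "edging out" and "wide margins" comparisons concrete, here is a minimal Python sketch that computes the percentage-point leads from the scores quoted above. The `SCORES` table and the `margins` helper are illustrative names introduced here, not part of any xAI or Deepranking tooling, and the only data used are the figures already cited in the text.

```python
# Scores (percent) copied from the text above; no other data is assumed.
SCORES = {
    "AIME 2024": {
        "Grok 3 Mini (Think)": 95.8,
        "Grok 3 Beta": 93.3,
        "Gemini 2.0 Flash Thinking": 73.3,
    },
    "LiveCodeBench": {
        "Grok 3 Mini (Think)": 80.4,
        "Grok 3 Beta": 79.4,
        "Claude 3.5 Sonnet": 66.3,
        "GPT-4o": 32.3,
    },
}


def margins(benchmark, model="Grok 3 Mini (Think)"):
    """Return the percentage-point lead of `model` over every other model."""
    scores = SCORES[benchmark]
    base = scores[model]
    # Round to one decimal to match the precision of the quoted scores.
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != model
    }


if __name__ == "__main__":
    for benchmark in SCORES:
        for rival, gap in margins(benchmark).items():
            print(f"{benchmark}: +{gap} points over {rival}")
```

Read this way, the lead over Grok 3 Beta is small on both benchmarks, while the gaps to the other listed models are much larger.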
Model Specifications
Technical details and capabilities of Grok-3 Mini
Performance Insights
Check out how Grok-3 Mini handles various AI tasks through comprehensive benchmark results.
Model Comparison
See how Grok-3 Mini stacks up against other leading models across key performance metrics.
Detailed Benchmarks
Dive deeper into Grok-3 Mini's performance across specific task categories.
Math: AIME 2024, AIME 2025
Coding: LiveCodeBench
Knowledge: GPQA