
Gemma 3 27B


Google

Gemma 3 marks a decisive leap forward in open vision-language models. The 27B instruction-tuned variant (Gemma-3-27B-IT) scores 1338 Elo on the LMSys Chatbot Arena, placing in the top 10 across all models and outperforming most open-weight peers, including DeepSeek-V3, Qwen2.5-72B, and even Meta's 405B LLaMA 3.1. On standard benchmarks it posts strong results: 67.5 on MMLU-Pro, 29.7 on LiveCodeBench, and 54.4 on Bird-SQL, showing robust reasoning and coding ability, while 89.0 on MATH and 74.9 on FACTS Grounding reflect precision in symbolic tasks and factual alignment.

These gains come from a novel post-training pipeline that blends distillation, RLHF (with reward-model techniques such as BOND and WARP), and extensive multilingual tuning across 140+ languages, built on a SentencePiece tokenizer with a 262K-entry vocabulary.

Architecturally, Gemma 3 handles long contexts (up to 128K tokens) efficiently through RoPE scaling and a 5:1 local-to-global attention layering that cuts KV-cache memory by up to 85% versus global-only designs without hurting perplexity. Multimodal input is powered by a frozen 400M-parameter SigLIP vision encoder, enhanced at inference time with Pan & Scan, which helps Gemma 3 excel on real-world image tasks (e.g., +17 points on InfoVQA with P&S).

The release spans 1B to 27B dense models in both pre-trained and instruction-tuned variants, all deployable via Hugging Face, MLX, or llama.cpp. With day-zero support across tooling, near-SOTA performance, and strong safety benchmarks, Gemma 3 is a high-performing, accessible alternative to Gemini 1.5 Pro and a defining model at the open frontier.
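The KV-cache saving from interleaving local (sliding-window) and global attention layers can be sketched with back-of-the-envelope arithmetic. The layer count and 1024-token window below are illustrative assumptions, not confirmed specs for the 27B model; the point is that with a 5:1 ratio, only one layer in six has to cache the full 128K context:

```python
def kv_cache_tokens(context_len, n_layers, locals_per_global=5, window=1024):
    """Total tokens held in the KV cache across all layers for an
    interleaved local/global attention stack (sizes are assumptions
    for illustration, not confirmed Gemma 3 specs)."""
    group = locals_per_global + 1
    n_global = n_layers // group           # one global layer per group
    n_local = n_layers - n_global          # sliding-window layers
    return n_global * context_len + n_local * min(window, context_len)

ctx, layers = 131072, 60                   # 128K context; layer count assumed
global_only = kv_cache_tokens(ctx, layers, locals_per_global=0)
mixed = kv_cache_tokens(ctx, layers)       # 5:1 local:global layering
saving = 1 - mixed / global_only
print(f"KV cache reduction: {saving:.0%}")
```

With these assumed numbers the reduction lands around 83%, consistent with the "up to 85%" figure above; the exact value depends on context length, window size, and layer count.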

Model Specifications

Technical details and capabilities of Gemma 3 27B

Core Specifications

27.4B Parameters

Model size and complexity

14.0T Training Tokens

Amount of data used in training

131.1K / 131.1K

Input / Output tokens

March 11, 2025


Release date

Capabilities & License

Multimodal Support
Supported
Web Hydrated
No
License
Gemma (custom Gemma Terms of Use)

Resources

Research Paper
https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
API Reference
https://huggingface.co/google/gemma-3-27b-it
Playground
https://huggingface.co/chat/models/google/gemma-3-27b-it
Code Repository
https://github.com/google/gemma_pytorch
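The Hugging Face checkpoint listed above loads through the standard transformers multimodal chat interface. A minimal sketch, assuming transformers >= 4.50 (the first release with Gemma 3 support) and hardware that fits the 27B weights; the image URL and question are placeholders:

```python
def build_messages(image_url, question):
    """Gemma 3 IT models use the standard chat message format; images
    are passed as content parts alongside the text prompt."""
    return [{
        "role": "user",
        "content": [
            {"type": "image", "url": image_url},
            {"type": "text", "text": question},
        ],
    }]

messages = build_messages("https://example.com/chart.png", "Summarize this chart.")

# The heavy part downloads the full 27B checkpoint, so it is left commented:
# from transformers import pipeline
# pipe = pipeline("image-text-to-text", model="google/gemma-3-27b-it")
# print(pipe(text=messages, max_new_tokens=128)[0]["generated_text"])
```

For smaller machines, the same message format works with the quantized GGUF builds served by llama.cpp or the MLX conversions mentioned above.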

Performance Insights

Check out how Gemma 3 27B handles various AI tasks through comprehensive benchmark results.

Benchmark        Score
GSM8K            95.9
IFEval           90.4
MATH             89.0
HumanEval        87.8
BBH              87.6
N2C              84.5
MMLU             76.9
GMMLU-Lite       75.1
MBPP             74.4
HiddenMath       56.0
WMT24++          53.4
LiveCodeBench    39.0
BBEH             19.3
ECLeKTic         16.7

Model Comparison

See how Gemma 3 27B stacks up against other leading models across key performance metrics.

Model                         MMLU   MBPP   HumanEval   GSM8K   MATH
Gemma 3 27B                   76.9   74.4   87.8        95.9    89.0
Qwen2.5 14B Instruct          79.7   82.0   83.5        94.8    80.0
Qwen2.5 32B Instruct          83.3   84.0   88.4        95.9    83.1
Qwen2 72B Instruct            82.3   80.2   86.0        91.1    59.7
Phi-3.5-MoE-instruct          78.9   80.8   70.7        88.7    59.5
Qwen2.5-Coder 32B Instruct    75.1   90.2   92.7        91.1    57.2

Detailed Benchmarks

Dive deeper into Gemma 3 27B's performance across specific task categories. Expand each section to see detailed metrics and comparisons.

Math

MATH: 89.0 (average across compared models: 82.6%)

Knowledge

MMLU: 76.9 (average across compared models: 77.3%)

Non-categorized

HiddenMath: 56.0 (average: 50.2%)
BBH: 87.6 (average: 83.2%)
IFEval: 90.4 (average: 87.2%)

Providers Pricing Coming Soon

We're working on gathering comprehensive pricing data from all major providers for Gemma 3 27B. Compare costs across platforms to find the best pricing for your use case.

