Efficient multimodal AI for text, image, audio, and video on low-resource devices.
50K+
Gemma 3n is a compact, multimodal AI model from Google DeepMind, designed for efficiency on low-resource devices. It supports text, image, audio, and video input, with open weights and support for over 140 languages. With optimized parameter usage and strong safety features, it builds on the Gemma family to extend lightweight, high-performance foundation models.
Gemma 3n models are designed for:
| Attribute | Details |
|---|---|
| Provider | Google DeepMind |
| Architecture | Gemma 3n |
| Cutoff date | June 2024 |
| Languages | 140+ |
| Tool calling | ❌ |
| Input modalities | Text |
| Output modalities | Text |
| License | Gemma Terms |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---|---|---|---|---|---|
ai/gemma3n:4B-F16 | 6.9B | MOSTLY_F16 | 33K tokens | 9.32 GiB | 12.79 GB |
¹: VRAM estimated based on model characteristics.
To run the model:
docker model pull ai/gemma3n
Then launch it:
docker model run ai/gemma3n
More details in the Docker Model Runner documentation.
| Category | Benchmark | 2B Value | 4B Value |
|---|---|---|---|
| General | MMLU | 60.1 | 64.9 |
| DROP | 53.9 | 60.8 | |
| BIG-Bench Hard | 44.3 | 52.9 | |
| ARC-Challenge (25-shot) | 51.7 | 61.6 | |
| Multilingual | Global-MMLU | 55.1 | 60.3 |
| MGSM | 53.1 | 60.7 | |
| STEM & Code | HumanEval | 66.5 | 75.0 |
| MBPP | 56.6 | 63.6 | |
| GPQA (Diamond) | 24.8 | 23.7 |
Content type
Model
Digest
sha256:f45ebd23a…
Size
8.3 GB
Last updated
8 months ago
docker model pull ai/gemma3n:2B-F16Pulls:
620
Last week