ai/gemma3-qat

Verified Publisher · By Docker · Updated 6 months ago

Google’s latest Gemma, in its QAT (quantization-aware trained) variant.

Categories: Model · Machine learning & AI

ai/gemma3-qat repository overview

Gemma 3 QAT (Quantization Aware Trained) - Instruct

GGUF version by Unsloth


These are the instruction-tuned variants of the Quantization Aware Trained (QAT) Gemma 3 checkpoints. Thanks to QAT, they preserve quality close to half precision (bfloat16) while requiring roughly 3x less memory to load the model.
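The roughly 3x figure is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming an effective rate of about 4.5 bits per parameter for Q4_K_M (an approximation; exact GGUF file sizes vary):

```python
# Back-of-envelope memory comparison for a 4B-parameter model.
params = 4e9

bf16_gb = params * 2 / 1e9      # bfloat16: 2 bytes per parameter
q4_gb = params * 4.5 / 8 / 1e9  # Q4_K_M: ~4.5 bits per parameter (assumption)

print(f"bf16 ~{bf16_gb:.1f} GB, Q4_K_M ~{q4_gb:.2f} GB, "
      f"ratio ~{bf16_gb / q4_gb:.1f}x")
```

This is consistent with the variant table: the 4B BF16 artifact is 7.23 GB versus 2.36 GB for the Q4_K_M build, also about a 3x reduction.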

Gemma is a versatile AI model family designed for tasks like question answering, summarization, and reasoning. Its open weights permit responsible commercial use, and it supports combined image-and-text input, a 128K-token context window, and over 140 languages.

Intended uses

The Gemma 3 4B model can be used for:

  • Text generation: Create poems, scripts, code, marketing copy, and email drafts.
  • Chatbots and conversational AI: Enable virtual assistants and customer service bots.
  • Text summarization: Produce concise summaries of reports and research papers.
  • Image data extraction: Interpret and summarize visual data for text-based communication.
  • Language learning tools: Aid in grammar correction and interactive writing practice.
  • Knowledge exploration: Assist researchers by generating summaries and answering questions.

Characteristics

| Attribute         | Details            |
|-------------------|--------------------|
| Provider          | Google DeepMind    |
| Architecture      | Gemma3             |
| Cutoff date       | -                  |
| Languages         | 140 languages      |
| Tool calling      |                    |
| Input modalities  | Text, Image        |
| Output modalities | Text, Code         |
| License           | Gemma Terms        |

Available model variants

| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---|---|---|---|---|---|
| ai/gemma3-qat:4B (also :4B-UD-Q4_K_XL, :latest) | 4B | MOSTLY_Q4_K_M | 131K tokens | 3.88 GiB | 2.36 GB |
| ai/gemma3-qat:270M-F16 | 270M | MOSTLY_F16 | 33K tokens | 1.59 GiB | 511.46 MB |
| ai/gemma3-qat:27B-UD-Q4_K_XL | 27B | MOSTLY_Q4_K_M | 131K tokens | 18.52 GiB | 15.66 GB |
| ai/gemma3-qat:4B-BF16 | 4B | MOSTLY_BF16 | 131K tokens | 8.75 GiB | 7.23 GB |
| ai/gemma3-qat:12B-Q4_K_M | 12B | MOSTLY_Q4_K_M | 131K tokens | 9.28 GiB | 6.92 GB |
| ai/gemma3-qat:270M-UD-Q4_K_XL | 270M | MOSTLY_Q4_K_M | 33K tokens | 1.33 GiB | 236.27 MB |

¹: VRAM estimated based on model characteristics.

Tags: latest, 4B

Use this AI model with Docker Model Runner

First, pull the model:

docker model pull ai/gemma3-qat

Then run the model:

docker model run ai/gemma3-qat

For more information on Docker Model Runner, explore the documentation.
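Beyond the CLI, Docker Model Runner also exposes an OpenAI-compatible HTTP API, so any OpenAI-style client can query the model. A minimal sketch, assuming host TCP access is enabled; the base URL and port below are assumptions, so check your Model Runner configuration:

```python
import json
from urllib import request

# Assumed Model Runner endpoint -- verify against your local setup.
BASE_URL = "http://localhost:12434/engines/v1"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt: str) -> str:
    """POST the payload to the chat-completions endpoint and return the reply."""
    payload = build_chat_request("ai/gemma3-qat", prompt)
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running Model Runner with the model pulled):
# print(chat("Summarize quantization-aware training in one sentence."))
```

Because the API follows the OpenAI wire format, the official OpenAI SDKs can be pointed at the same base URL instead of hand-rolling requests.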

Benchmark performance

| Category     | Benchmark        | Value |
|--------------|------------------|-------|
| General      | MMLU             | 59.6  |
|              | GSM8K            | 38.4  |
|              | ARC-Challenge    | 56.2  |
|              | BIG-Bench Hard   | 50.9  |
|              | DROP             | 60.1  |
| STEM & Code  | MATH             | 24.2  |
|              | MBPP             | 46.0  |
|              | HumanEval        | 36.0  |
| Multilingual | MGSM             | 34.7  |
|              | Global-MMLU-Lite | 57.0  |
|              | XQuAD (all)      | 68.0  |
| Multimodal   | VQAv2            | 63.9  |
|              | TextVQA          | 58.9  |
|              | DocVQA           | 72.8  |

Tag summary

  • Content type: Model
  • Digest: sha256:efe9562a8
  • Size: 3.2 GB
  • Last updated: 6 months ago


Pulls this week: 968