Google releases Gemma 4 with Apache 2.0 license, enabling unrestricted local AI on devices

Reviewed by Nidhi Govil

Google has launched Gemma 4, its latest generation of open-weight AI models, marking a significant shift to the Apache 2.0 license from its previous restrictive terms. The release includes four model variants optimized for everything from smartphones to enterprise servers, with the 31B model ranking third on the Arena AI leaderboard. This licensing change removes commercial deployment barriers and positions Gemma 4 as a domestic alternative to Chinese open-weight models.

Google Shifts Strategy with Fully Permissive Licensing

Google has released Gemma 4, its latest generation of open-weight AI models, under the Apache 2.0 license, a dramatic departure from the restrictive custom license that governed its predecessor, Gemma 3 [1]. This licensing shift grants developers and enterprises near-total freedom to use, modify, and redistribute the models for any purpose without royalty requirements, addressing long-standing frustrations with AI licensing restrictions [2]. The move enables enterprise and commercial use without fear of Google terminating access, making Gemma 4 a viable option for organizations with strict data privacy and sovereignty requirements [3].

Source: Ars Technica

Developed by Google DeepMind using the same research and technology that powers Gemini 3, Gemma 4 arrives as Chinese competitors like Moonshot AI, Alibaba, and Z.AI flood the market with open-weight models rivaling OpenAI's GPT-5 [3]. Google positions Gemma 4 as a domestic alternative that won't harvest sensitive corporate data to train future models, a critical consideration for healthcare providers and enterprises bound by regulatory restrictions.

Four Models Optimized for Different Deployment Scenarios

Gemma 4 comprises four distinct variants designed to address use cases ranging from edge devices to high-performance servers. The 31B Dense model focuses on maximizing output quality and currently ranks third on the Arena AI open-model leaderboard, behind only GLM-5 and Kimi 2.5 [1]. Despite its capabilities, the 31B model is a fraction of the size of competing models, making local AI deployment significantly more cost-effective. It can run unquantized in bfloat16 format on a single 80GB Nvidia H100 GPU, and when quantized to 4-bit precision, it fits on consumer graphics cards like the Nvidia RTX 4090 [3].
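A rough back-of-the-envelope check on those memory claims (an illustrative calculation, not from the article): at 2 bytes per weight, 31 billion bfloat16 parameters need about 62 GB, which fits on an 80 GB H100, while 4-bit quantization brings the weights to roughly 15.5 GB, comfortably within an RTX 4090's 24 GB.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory to hold model weights alone
    (ignores KV cache and activation overhead)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # GB, using 1 GB = 1e9 bytes

# 31B dense model in bfloat16 (16 bits) vs. 4-bit quantization
print(weight_memory_gb(31, 16))  # 62.0  -> fits an 80 GB H100
print(weight_memory_gb(31, 4))   # 15.5  -> fits a 24 GB RTX 4090
```

Real deployments need headroom beyond the weights themselves (KV cache grows with context length), which is why the H100 figure leaves an 18 GB margin.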

Source: Wccftech

The 26B Mixture-of-Experts model prioritizes low latency over raw quality, activating only 3.8 billion of its 26 billion parameters during inference to deliver higher tokens-per-second throughput [1]. This architecture proves particularly valuable for applications requiring fast responses, such as coding assistants and agentic workflows, though the reduced active parameter count does lower output quality compared to dense models [3]. Both larger models feature a 256,000-token context window, making them appropriate for complex code generation tasks [3].
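The latency advantage of the MoE design follows from simple arithmetic (an illustrative sketch; actual speedups depend on hardware, routing overhead, and batch size): per-token compute scales with the active parameters, not the total.

```python
def active_fraction(active_billion: float, total_billion: float) -> float:
    """Fraction of parameters that participate in each forward pass
    of a mixture-of-experts model."""
    return active_billion / total_billion

frac = active_fraction(3.8, 26)
print(f"{frac:.1%} of parameters active per token")
# Per-token FLOPs scale roughly with active parameters, so this MoE does
# about 1/frac (~6.8x) less compute per token than a 26B dense model would.
```

Note that memory is the exception: all 26 billion parameters must still be resident, since any expert can be routed to on any token.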

AI on Smartphones Without Internet Connectivity

The Effective 2B and Effective 4B models target smartphones and edge devices like the Raspberry Pi and Jetson Nano, and were developed in collaboration with the Pixel team, Qualcomm, and MediaTek [4]. These models use per-layer embeddings to reduce their effective sizes to 2.3 billion and 4.5 billion parameters respectively, despite actual parameter counts of 5.1 billion and 8 billion [3]. This technique enables on-device AI that runs entirely offline, using minimal memory during inference and consuming up to 60% less battery than previous versions [5].

Source: Mashable

Google touts near-zero latency for these edge models, with the E2B running three times faster than the E4B [5]. Both support multimodality, natively processing video, image, and audio inputs for speech recognition, with a 128,000-token context window [3]. These models will also serve as the foundation for Gemini Nano 4, Google's next-generation on-device model for Android devices, launching later this year [5].

Improved Reasoning and Enterprise-Ready Features

All Gemma 4 variants incorporate improved reasoning capabilities for mathematics and instruction-following, support for more than 140 languages, and native function calling with structured JSON output [3]. These enhancements position the models for agentic AI workflows where autonomous decision-making is required. Google claims significant performance improvements across AI benchmarks compared to Gemma 3, though, as with any vendor-supplied figures, those benchmarks warrant appropriate skepticism [3].
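Native function calling generally means the model emits JSON that conforms to a tool schema the caller supplies, which the application then parses and executes. The schema and call shape below are a generic illustration of that pattern, not Google's documented format for Gemma 4:

```python
import json

# Hypothetical tool schema a caller might hand to the model
get_weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# A structured tool call the model might emit as its response
model_output = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
call = json.loads(model_output)

# The application validates the call against the schema, then dispatches it
assert call["name"] == get_weather_tool["name"]
print(call["arguments"]["city"])  # Berlin
```

Because the output is machine-parseable rather than free text, agent frameworks can chain such calls without brittle string matching.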

Since the first Gemma release in February 2024, developers have downloaded the models over 400 million times, creating a vibrant ecosystem of more than 100,000 community variants [2]. The shift to a permissive license is expected to accelerate adoption further, particularly among enterprises that can now legitimately bundle the AI with products, services, and devices [2].

Availability and Deployment Options

Gemma 4 is immediately available through Hugging Face, Kaggle, and Ollama, with the larger models accessible via Google AI Studio and the edge models through AI Edge Gallery [5]. Google claims day-one support for more than a dozen inference frameworks, including vLLM, SGLang, llama.cpp, and MLX [3]. Hugging Face co-founder Clément Delangue described the Apache 2.0 licensing decision as "a huge milestone," while Google DeepMind CEO Demis Hassabis called the new models "the best open models in the world for their respective sizes" [5].
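For local experimentation, the workflow would follow the usual pattern for these tools. The model tags and repo id below are illustrative guesses, not confirmed names:

```shell
# Pull and chat with a quantized Gemma 4 variant via Ollama
# (tag "gemma4:31b" is an assumption)
ollama pull gemma4:31b
ollama run gemma4:31b "Summarize the Apache 2.0 license in one sentence."

# Or serve the full-precision weights with vLLM on an 80 GB GPU
# (Hugging Face repo id "google/gemma-4-31b" is an assumption)
vllm serve google/gemma-4-31b --dtype bfloat16
```

The vLLM route exposes an OpenAI-compatible HTTP endpoint, which is the common path for swapping a local model into existing tooling.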

While Gemma 4 carries the Apache 2.0 license, it remains "open-weight" rather than fully open-source, as Google has not released the complete training dataset, scripts, infrastructure code, or detailed methodologies required for full reproducibility [4]. For most developers, this distinction matters little, as the license still permits all forms of commercial use, modification, redistribution, and deployment, with only attribution required [4].
