Google DeepMind has officially released Gemma 4, the latest generation of its open-source model family. Built on the same research as Gemini 3, Gemma 4 is designed to bring “frontier-level” intelligence to developers, offering a significant leap in reasoning, multimodal capabilities, and agentic workflows under a commercially permissive Apache 2.0 license.
What is Gemma 4?
Gemma 4 is a family of highly efficient, lightweight open models. Unlike its predecessors, it is natively multimodal, meaning it can process text, images, and audio (in specific variants) without needing external plugins. It comes in several sizes to balance performance and speed:
31B (Dense): The flagship model for complex logic and deep reasoning.
26B–A4B (MoE): A Mixture-of-Experts model that uses only 4B active parameters per task, offering high quality with 6x faster throughput.
E4B & E2B (Edge): Optimized for mobile and IoT devices, with the E2B model running up to 3x faster for low-latency tasks.
Key Capabilities & Features
“Thinking Mode”: Gemma 4 can perform internal chain-of-thought reasoning before providing a final answer.
Massive Context Window: The edge models support 128K tokens, while the 26B and 31B versions boast a 256K context window, perfect for analyzing entire code repositories or long documents.
Native Multimodality: It excels at OCR (Optical Character Recognition), chart understanding, and handwriting recognition. The E2B/E4B models also support 30 seconds of native audio input for speech translation.
Agentic Workflows: With built-in support for function calling and structured JSON output, Gemma 4 can act as an autonomous agent, interacting with APIs to complete real-world tasks.
Global Fluency: Support for over 140 languages, making it one of the most linguistically diverse open models available.
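To make the agentic workflow concrete, here is a minimal sketch of the function-calling loop described above. The tool schema, the model tag, and the model's JSON reply are all illustrative assumptions; Gemma 4's actual function-calling format may differ, so treat this as the general pattern rather than the official API.

```python
import json

# Illustrative tool definition in the JSON-schema style common to
# function-calling APIs; Gemma 4's exact schema may differ.
get_weather_tool = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> dict:
    # Stand-in for a real weather API call.
    return {"city": city, "temp_c": 21}

# A hypothetical structured reply from the model: instead of free-form
# text, it names the tool to invoke and supplies JSON arguments.
model_reply = '{"tool": "get_weather", "arguments": {"city": "Berlin"}}'

call = json.loads(model_reply)
if call["tool"] == get_weather_tool["name"]:
    result = get_weather(**call["arguments"])
    print(result)  # {'city': 'Berlin', 'temp_c': 21}
```

The key point is the round trip: the model emits machine-readable JSON, your code dispatches it to a real function, and the function's result can be fed back to the model for the next step of the task.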
How to Use Gemma 4 Right Now
You can start experimenting with Gemma 4 today across various platforms:
- Google AI Studio: The fastest way to test the 31B and 26B models for free in a web-based playground.
- Local Execution: Use Ollama, LM Studio, or llama.cpp to run Gemma 4 on your own hardware. The E2B model can run on as little as 5GB of RAM.
- Cloud Deployment: For enterprise scale, Gemma 4 is available on Vertex AI, Google Kubernetes Engine (GKE), and NVIDIA NIM for optimized inference on Blackwell GPUs.
- Mobile Development: Android developers can access Gemma 4 through the AI Core Developer Preview or the ML Kit GenAI Prompt API.
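For the local-execution route, a sketch of calling a model through Ollama's standard local REST endpoint is shown below. The endpoint and response shape follow Ollama's existing `/api/generate` API; the `gemma4:e2b` model tag is a hypothetical name, so check `ollama list` for the actual tag once the model is available in the registry.

```python
import json
import urllib.request

# Hypothetical model tag; pull it first with e.g.:
#   ollama pull gemma4:e2b
payload = {
    "model": "gemma4:e2b",
    "prompt": "Summarize the Gemma 4 model lineup in one sentence.",
    "stream": False,  # return one complete JSON object instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # Ollama's default local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Requires a running Ollama server with the model pulled.
try:
    with urllib.request.urlopen(req, timeout=60) as resp:
        print(json.loads(resp.read())["response"])
except OSError as exc:
    print(f"Ollama server not reachable: {exc}")
```

Because everything runs against `localhost`, no data leaves your machine, which is the main appeal of the local option for the smaller E2B and E4B variants.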