Gemma is designed for local execution, with models small enough to run on phones, Raspberry Pis, and consumer GPUs. This enables applications that prioritize privacy, low latency, and offline functionality.
Google released Gemma 4 under the Apache 2.0 license in direct response to community feedback, signaling a strong commitment to open source. The "Gemmaverse" concept encourages developers to fine-tune and build upon the base models for specialized tasks.
The discussion highlights a practical architecture in which a small, local model like Gemma handles the majority (70-80%) of user tasks, while more complex queries are routed to a larger, more capable cloud model like Gemini. This approach balances cost, speed, and intelligence.
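The local-first routing idea can be sketched in a few lines of Python. This is a minimal illustration, not the architecture described in the discussion: the keyword list, the word-count threshold, and the names `choose_route` and `answer` are all hypothetical, and the two models are stand-in callables rather than real Gemma or Gemini clients.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical heuristics: a real router might use a classifier or the
# local model's own confidence instead of keywords and length.
COMPLEX_HINTS = ("prove", "analyze", "compare", "multi-step")
MAX_LOCAL_WORDS = 40


@dataclass
class Route:
    target: str  # "local" or "cloud"
    reason: str


def choose_route(query: str) -> Route:
    """Decide whether a query stays on-device or goes to the cloud model."""
    lowered = query.lower()
    if len(lowered.split()) > MAX_LOCAL_WORDS:
        return Route("cloud", "long query")
    if any(hint in lowered for hint in COMPLEX_HINTS):
        return Route("cloud", "complexity keyword")
    return Route("local", "simple query")


def answer(
    query: str,
    local_model: Callable[[str], str],
    cloud_model: Callable[[str], str],
) -> str:
    """Send the query to whichever model the router selects."""
    route = choose_route(query)
    model = local_model if route.target == "local" else cloud_model
    return model(query)


if __name__ == "__main__":
    # Stand-in models; replace with real inference calls.
    local = lambda q: f"[local] {q}"
    cloud = lambda q: f"[cloud] {q}"
    print(answer("Turn on the flashlight", local, cloud))
    print(answer("Analyze this contract and compare the clauses", local, cloud))
```

Keeping the routing decision in a separate function makes it easy to log which share of traffic stays local and to swap the heuristic without touching the model calls.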
The Gemma family incorporates multimodal capabilities: the smaller models understand audio, video, and images, and the larger ones offer advanced vision. This makes sophisticated, multi-sensory AI accessible to developers without requiring massive computational resources.
Gemma models possess agentic capabilities such as function calling, allowing them to interact with APIs and control device functions (e.g., turning on a phone's flashlight). This demonstrates that even small, local models can perform useful, automated tasks.
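On the application side, function calling usually comes down to parsing a structured request from the model and dispatching it to real code. The sketch below assumes the model emits a JSON object with `name` and `arguments` fields; that format, the `set_flashlight` tool, and the `dispatch` helper are hypothetical illustrations, not Gemma's actual output format or API.

```python
import json
from typing import Any, Callable, Dict


def set_flashlight(on: bool) -> str:
    """Hypothetical device function; a real app would call a platform API."""
    return "flashlight on" if on else "flashlight off"


# Registry of functions the model is allowed to call.
TOOLS: Dict[str, Callable[..., Any]] = {"set_flashlight": set_flashlight}


def dispatch(model_output: str) -> Any:
    """Run a tool call if the output is one, else return the text unchanged."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # ordinary text reply, nothing to execute
    if not isinstance(call, dict) or "name" not in call:
        return model_output
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return func(**call.get("arguments", {}))


if __name__ == "__main__":
    # Assumed shape of a tool call emitted by the model.
    print(dispatch('{"name": "set_flashlight", "arguments": {"on": true}}'))
    print(dispatch("Sure, here is a plain answer."))
```

Restricting execution to an explicit registry means the model can only trigger functions the developer has deliberately exposed.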