- Google is building Gemini Nano, the smallest of its AI models, directly into the Chrome desktop client, starting with Chrome 126.
- Google is making it possible for many of the advanced APIs in Chrome to translate, add captions, and transcribe text in the browser using its Gemini model.
At the Google I/O 2024 developer conference on Tuesday, Google announced that it is building its smallest AI model, Gemini Nano, directly into the Chrome desktop client, starting with Chrome 126.
Gemini Nano AI model into Chrome on the desktop
Gemini, coming up with 3 types like Gemini Ultra, Gemini Pro and Gemini Nano, is Google’s long-promised, next-gen GenAI model family, developed by Google’s AI research labs DeepMind and Google Research. Gemini Nano, a smaller “distilled” model, runs on mobile devices like the Pixel 8 Pro. The company says it’s the recent work on WebGPU and WASM support in Chrome that enables these models to run at a reasonable speed on a wide set of hardware.
During a briefing prior to Tuesday’s announcement, Jon Dahlke, Google’s Chrome product management director, mentioned that discussions were ongoing with other browser providers to implement this capability—or a comparable one—in their respective browsers as well.
“We have started to engage with other browsers and will be opening up an early preview program for developers,” Dahlke wrote in Tuesday’s announcement. “With webGPU, WASM, and Gemini built into Chrome, we believe the web IS AI-ready. ”
Also read: Google’s Gemini expected to land on Android phones next year
Also read: A look at Alphabet’s Gemini, the AI model aiming to challenge ChatGPT-4
Writing helper
Google is enabling numerous advanced APIs within Chrome to utilize its Gemini model for tasks such as translation, captioning, and text transcription directly within the browser. According to the company, this will empower developers to leverage the on-device model for their own AI functionalities. Google intends to utilize this enhanced capability to support features such as the current “help me write” tool from Workspace Lab in Gmail.
So far, it powers a couple of features on the Pixel 8 Pro, Pixel 8 and Samsung Galaxy S24, including Summarize in Recorder and Smart Reply in Gboard. The Recorder app, which allows users to simply press a button to record and transcribe audio, now features a Gemini-driven summary of your recorded discussions, interviews, presentations, and other segments.
Dahlke said during developer keynote at I/O, “Now we want to give you access to Gemini models in Chrome. Our vision to give you the most power AI models in Chrome to reach billions of users without having to worry about prompt engineering, fine-tuning, capacity and cost. All you have to do is call a few high-level APIs – translate, caption, transcribe. This is a big shift for the web and we want to get it right.”