Presentation dedicated to Gemini at Google I/O 2024 – GOOGLE
May 14. (Portaltic/EP) –
Google is betting everything on artificial intelligence (AI) with Gemini, its family of large language models. Gemini paves the way to the universal intelligent assistant that the company has previewed with Project Astra, to agents that perform tasks for users, and to the new capabilities offered by a larger context window in its flagship model, Gemini 1.5 Pro.
This Tuesday, Google held a new edition of its annual developer event, Google I/O, at which it confirmed that Gemini is its path to artificial general intelligence, that is, responsible AI that is useful to people in their day-to-day lives.
Currently, the Gemini family of models powers the main AI functions of the company’s services. Gemini 1.5 Pro, with its context window of up to one million tokens, offers more advanced reasoning, planning and understanding.
Gemini 1.5 Pro is available from this Tuesday to all developers globally and within the Gemini Advanced subscription. Its context window will expand to 2 million tokens by the end of the year, available first to developers through a private preview.
With the help of Google DeepMind, the family of models grows with a new addition: Gemini 1.5 Flash, a lighter version than Pro that is optimized for common tasks such as summarization or translation. It can be tested in Google AI Studio and Vertex AI with a context window of one million tokens.
Gemini also powers agents, intelligent systems that show reasoning, planning and memory capacity to help the user in a wide variety of tasks, with the support of Google services such as Gmail or Chrome.
Google has also updated the generative AI tools grouped under Generative Media, which are dedicated to creating images, music and video and on which it has worked in recent months.
Imagen 3, in testing in Labs, now offers a more photorealistic result: it creates highly detailed, high-quality images from descriptions to which the user can add as much nuance as they want.
Music AI Sandbox, for its part, offers a set of AI tools for creating professional-quality songs, while Veo generates high-quality video (1080p) from text, image and video prompts, and incorporates effects with the experimental VideoFX feature.
Google DeepMind has also previewed Project Astra, which the company hopes will become a true universal assistant in the future. In the demonstration it shared, the company described it as a multimodal assistant built on Gemini that sees the world through the smartphone's camera so that the user can ask questions about what it sees.
GEMMA 2
Google has also presented the sixth generation of its Tensor Processing Units (TPU), Trillium, which increases peak compute performance by 4.7 times and is behind the training of models such as Gemini 1.5 Flash and Imagen 3, as well as Gemma 2.
Google's family of open language models has been expanded with PaliGemma, a vision-language model for tasks such as image captioning, visual question answering and understanding text in images.
The family will soon be expanded with Gemma 2, a new generation that will be available with 27 billion parameters (27B), a size that offers performance on par with Meta's Llama 3, which has 70 billion parameters. It is optimized to run on Nvidia GPUs or on a single TPU host on Vertex AI.