For yet another year, Google focuses its developer event entirely on artificial intelligence, a day after OpenAI dazzled with GPT-4o.
The pressure on Google over artificial intelligence is evident. Just one day after OpenAI unveiled GPT-4o and showed what multimodal AI can do like never before, Google presented its own AI advances at Google I/O 2024.
A year has passed since the announcement of Bard, later renamed Gemini, and Google has integrated its artificial intelligence into many of its products. Still, despite Google's 2 billion users, OpenAI – and ChatGPT in particular – remains the benchmark for AI worldwide.
Sundar Pichai, Google's CEO, said during his presentation that more than 1.5 million developers use Gemini. Its AI is already built into millions of phones through the Google app on Android and iOS, and it will also come to Google Photos, where more than 6 billion photos and videos are uploaded every day.
Gemini in more flavors, integrated into all its products
Google's first big announcement is that Gemini 1.5 Pro is now available to all developers worldwide in 35 languages, with a context window of 2 million tokens, double that of the previous version.
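For developers, access goes through Google's generative AI SDK. As a rough idea – a minimal sketch assuming the public google-generativeai Python package, with a placeholder API key and a hypothetical file – a long-context call to Gemini 1.5 Pro could look something like this:

```python
# Minimal sketch: a long-context request to Gemini 1.5 Pro via the
# google-generativeai SDK. API key, file name and prompt are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# The File API attaches documents far too large for a normal prompt;
# the 2-million-token window is what makes this kind of input viable.
doc = genai.upload_file(path="huge_report.txt")  # hypothetical file

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [doc, "Summarize the key findings of this report."]
)
print(response.text)
```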
Google demonstrated Gemini 1.5 Pro with multimodal audio: easy-to-understand, completely natural speech that you can interrupt to ask for more information. It was strikingly similar to the demos OpenAI gave with GPT-4o, a sign that Google is getting very close to OpenAI.
It also demonstrated what it can do in Google Photos, whose mobile app will gain a conversation window where you can ask questions about the people or events in your photos.
Google also presented Gemini 1.5 Flash, a model designed to be more efficient and cheaper for developers, especially interesting for chatbots or applications that need to extract data from files and documents.
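To illustrate that chatbot and document-extraction use case – again only a hedged sketch on the same public SDK, where invoice.pdf and the API key are invented placeholders – Gemini 1.5 Flash could be wired up like this:

```python
# Hedged sketch: a small extraction chatbot on Gemini 1.5 Flash using
# the same SDK. invoice.pdf and the API key are invented placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-1.5-flash")
chat = model.start_chat()

invoice = genai.upload_file(path="invoice.pdf")  # hypothetical document
reply = chat.send_message(
    [invoice, "Extract the invoice number, date and total amount."]
)
print(reply.text)

# Follow-up questions reuse the same session, as a chatbot would:
print(chat.send_message("Now format that as JSON.").text)
```

Flash's lower price per token is precisely what would make this kind of high-volume job viable.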
Google will also let you create custom versions of Gemini, called Gems: iterations trained for specific tasks, very much like OpenAI's GPT Store.
Google’s mission: organize the world’s information and make it universally accessible and useful… with the help of AI
Demis Hassabis, CEO of Google DeepMind – appearing for the first time at a public Google event – announced Project Astra, a universal AI agent designed to be "truly useful in everyday life." For now it is a prototype, but it is an assistant that, using the phone's camera, can identify and give context to what it is seeing.
It can remember what it has seen, provide context for what it is currently seeing, and even solve problems. Google will integrate some of these features into the Gemini app by the end of the year.
Again – and this was a constant throughout the presentation – many of these functions mirror the capabilities of GPT-4o.
Of course, Google also showed advances in tools that compete with Midjourney, Sora and Suno:
- Imagen 3 is a new image generator driven by text prompts, available in ImageFX.
- Veo is a 1080p video generator driven purely by text prompts, integrated into VideoFX.
- Music AI Sandbox is a suite of AI tools for creating music, designed for artists.
Google wants to change the way we search on Google forever
The Google search engine is evolving. Google recently brought the "Circle to Search" feature to Android on the Galaxy S24 and Pixel, and it is expanding to more phones. But the way we search on Google is about to change forever.
Soon you will be able to search by recording a video: the engine will automatically interpret what it sees and hears, for example as you describe aloud why something is not working the way it should.
Google will give you a summarized answer built from the millions of data points it has scanned from websites, saving you a click – something that, as a news outlet like Computer Hoy, we don't quite know how to feel about.
This is possibly the best example Google showed during the entire Google I/O – or rather, Google I/A – of how its AI, even if slightly inferior to OpenAI's, can be more useful simply by being integrated into products that millions of people use.
Google does not lead in AI, it simply responds
Google's fear of OpenAI is obvious: even accounting for the resources Google has in talent and money, the vast majority of the announcements at Google I/O feel like responses to OpenAI.
Everything Google showed during Google I/O was impressive, nobody can doubt that, but the fact that Google is unable to surpass OpenAI says a lot about where the company stands right now.
Imagen 3 and Veo are responses to Midjourney and Sora. Gemini Gems is a response to the GPT Store. Astra is very similar to what GPT-4o can do with the phone's camera. Gemini 1.5 Flash, the fastest and most flexible model, is a direct response to the GPT-4 updates.
Google's problem is called OpenAI, and little by little the company led by Sundar Pichai is being forced to invest more to keep up. The good news for Pichai is that OpenAI doesn't have services used by 2 billion people.