Google will begin deploying Gemini on Wednesday, its new artificial intelligence (AI) model that is expected to allow the company to better compete with OpenAI (the creator of ChatGPT) and Microsoft, from applications aimed at the general public to to computing capacities for companies.
• Also read: Meta, IBM and their allies are calling for an “open” future for AI
• Also read: Could AI replace your job?
“It is our most consistent, most talented and also most general AI model,” assured Eli Collins, vice president of Google DeepMind, the Californian company’s AI research laboratory, during a presentation to the press.
He then posted a video in which a user shows Gemini objects, drawings and videos. The AI system verbally comments on what it “sees,” identifies objects, plays music, and answers questions that require a certain level of analysis, thereby justifying its “reasoning.”
Given the image of a rubber duck that has to choose between two paths – the left leads to another duck drawn on paper and the right to a threatening-looking bear – Gemini suggests the left path because “that’s how it is.” Better to make friends than enemies.”
The video also shows that Gemini can recognize references with very little context, such as a scene from the movie “The Matrix” played by a person pretending to dodge bullets in slow motion.
The new model is “multimedia from the start, has sophisticated reasoning capabilities and can program at an advanced level,” explained Eli Collins.
According to him, Gemini is the first AI model to outperform human experts in an industry-standard test, the “MMLU”, used to assess the thinking skills of these computer programs in various fields from mathematics to science, history and law.
Since the launch of ChatGPT a year ago, the giants of Silicon Valley have been engaged in a frantic race to develop so-called generative AI, which makes it possible to obtain texts, images or lines of code equivalent to those of humans, upon simple request in everyday language.
Google, the AI leader, was surprised by the phenomenal success of ChatGPT and responded notably with its own chatbot, Bard.
But everything happens at the level of models, the computer systems that underlie these applications, first stuffed with texts collected online and now fed with all kinds of data to process requests with images and discuss verbally with their users.
OpenAI said in September that it had added language and vision to ChatGPT to make it more “intuitive.”
Gemini: “This is another step towards our vision: to offer you the best AI collaborator in the world,” emphasized Sissie Hsiao, Google Vice President responsible for Bard, on Wednesday.
Bard has to be free from Wednesday, but always with written requests and only in English.
You will have to wait until 2024 for other functions and formats, such as extended help for solving mathematical problems.
Less known than ChatGPT, Bard has the opportunity to make up ground against its competitor that was a victim of its success: in mid-November, OpenAI actually suspended subscriptions to the paid version due to overwhelming demand.
Google will also give customers access to an initial version of Gemini in the cloud (remote computing) on December 13, including developers who use its Vertex AI platform to build their own AI applications.
In this area, the Internet giant is in direct competition with Microsoft, the main investor in OpenAI and world number 2 in the cloud behind Amazon.
The two American groups spent the year adding generative AI tools to their respective software (search engine, office and productivity software, cloud platform, etc.).
“This new era of models represents one of the greatest scientific and engineering efforts we have undertaken as a society,” Sundar Pichai, the head of Google, was quoted as saying in a press release.