As pressure grows on Google to explain how it plans to monetize artificial intelligence, the company is launching what it believes to be its largest and most capable model.
Three sizes will be available for the large language model Gemini: Gemini Ultra, which is the largest and most capable category; Gemini Pro, which scales across a variety of tasks; and Gemini Nano, which is reserved for specialized tasks and mobile devices.
For the time being, the company intends to license Gemini to clients via Google Cloud so they can utilize it in their apps. Developers and enterprise clients can use Google Cloud Vertex AI or Google AI Studio’s Gemini API to access Gemini Pro as of December 13. Gemini Nano will also be available for use by Android developers. Additionally, Gemini will power Google products like its Bard chatbot and Search Generative Experience (SGE), which attempts to provide conversational-style text responses to search queries (though SGE is not yet generally available).
Businesses and corporations could use it to identify trends for product advertising, as well as for more sophisticated customer service engagement through chatbots and product recommendations. Gemini can also be used for productivity apps that need to produce code for developers or summarize meetings, or for content creation in the case of a business wanting to write blog posts or marketing campaigns.
The company provided examples, such as how Gemini could snap a picture of a chart, analyze hundreds of pages of research, and then update the chart. Another example was analyzing a photo of someone’s math homework, identifying the correct answers and pointing out the incorrect ones.
The company announced in a blog post that Gemini Ultra is the first model to surpass human experts on MMLU (massive multitask language understanding), which tests both problem-solving and general knowledge across 57 subjects including math, physics, history, law, medicine, and ethics. It is said to be able to comprehend reasoning and subtleties in complicated subjects.
Google CEO Sundar Pichai wrote in a blog post, “Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video.”
Starting today, Google’s chatbot Bard will use Gemini Pro to assist with advanced reasoning, planning, understanding, and other skills. On a call with reporters on Tuesday, executives announced that “Bard Advanced,” which will use Gemini Ultra, will launch early next year. It is the most significant update yet to Bard, Google’s ChatGPT-like chatbot.
Eight months have passed since the search giant debuted Bard, and a year has passed since OpenAI introduced ChatGPT, built on GPT-3.5. The startup headed by Sam Altman introduced GPT-4 in March of this year. On Tuesday, executives said that Gemini Pro performed better than GPT-3.5, but they avoided discussing how it compared to GPT-4.
However, a white paper Google published claims that in a few benchmarks, Gemini’s Ultra model performed better than GPT-4.
Eli Collins, vice president of product at Google DeepMind, said during a press briefing that he “suspects” Gemini has novel capabilities compared with current-generation LLMs, but that research is still ongoing to determine what those capabilities are.
Google reportedly delayed the release of Gemini because it was not ready, which brought back memories of the company’s troubled rollout of its artificial intelligence tools at the start of the year.
When several reporters asked about the delay, Collins responded that testing the more sophisticated models takes longer. Collins said that Gemini has “the most comprehensive safety evaluations” of any Google model and is the most thoroughly tested AI model the company has ever created.
Collins stated that Gemini Ultra is substantially less expensive to serve even though it is the company’s largest model. He declared, “It’s more efficient—not just more capable. While training Gemini still takes a lot of processing power, we’re making great progress in training these models.”
Collins stated that the parameter count will not be released, but the company will provide a technical white paper that includes more model details.
Google also unveiled its next-generation tensor processing unit for AI model training. According to Google, the TPU v5p chip offers better performance for the price than the TPU v4 chip, which was announced in 2021. Salesforce and startup Lightricks have started using it. However, the company withheld details regarding performance compared with Nvidia, the industry leader.
The chip announcement follows the demonstration of custom silicon aimed at AI by cloud rivals Microsoft and Amazon a few weeks ago.
Investors pressed Google executives further during the company’s third-quarter earnings conference call in October, asking more questions about how AI will be used to generate real profit.
Search is still a significant source of revenue for Google, which is why in August it introduced Search Generative Experience, or SGE, as an “early experiment” to show users what a generative AI experience would look like when using the search engine. The results are more conversational, in line with the chatbot era. SGE is still in the experimental stage and has not yet been made available to the general public.
Since May, when the company first revealed the experiment at its annual developer conference, Google I/O, investors have been requesting a timeline for SGE. SGE was hardly mentioned in the Gemini announcement, and when asked about plans for a public launch, executives said only that Gemini would be integrated into it “in the next year.”
Pichai said in a blog post, “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company. I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.”