Google introduced Gemini 1.5 Pro – a new, more efficient AI model

On Thursday, Google introduced Gemini 1.5 Pro, an improved AI model that surpasses the previous version in all respects.


In a blog post, Google CEO Sundar Pichai and DeepMind CEO Demis Hassabis try to reassure readers that AI is ethical and safe, while at the same time touting the rapidly growing capabilities of their models.

“Our teams continue to push the boundaries of the latest models, putting safety first,” the post says.

The company needs to emphasize safety to AI skeptics and government regulators. But it also needs to highlight the models’ growing performance for AI developers, potential clients, and investors concerned about Google’s lack of response to the success of OpenAI’s ChatGPT. OpenAI yesterday unveiled an AI model for generating photorealistic videos.

According to Pichai and Hassabis, Gemini 1.5 Pro achieves results comparable to Gemini 1.0 Ultra while using less compute. Its multimodal capabilities cover text, image, video, audio, and code processing. As AI develops, models will offer more and more universal functionality in a single query window.

Gemini 1.5 Pro can also process up to a million tokens (the units of data a model analyzes) in a single request. Google says that is enough for more than 700,000 words, an hour of video, 11 hours of audio, or more than 30,000 lines of code. The company claims it has already successfully tested the model with up to 10 million tokens.
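To put these figures in perspective, here is a minimal sketch of the arithmetic they imply. It assumes the ratio of 700,000 words per million tokens quoted above holds for ordinary text, which is only an approximation. The function names are hypothetical, and a real tokenizer would give different counts.

```python
# Ratio implied by the article's figures: about 700,000 words per 1,000,000 tokens.
# This is an assumption for illustration; real tokenizers vary by language and content.
WORDS_PER_MILLION_TOKENS = 700_000
MILLION = 1_000_000


def estimate_tokens(word_count: int) -> int:
    """Estimate tokens for a given word count, rounding up (integer arithmetic only)."""
    return (word_count * MILLION + WORDS_PER_MILLION_TOKENS - 1) // WORDS_PER_MILLION_TOKENS


def fits_in_context(text: str, context_tokens: int = MILLION) -> bool:
    """Return True if the estimated token count of `text` fits in the context window."""
    word_count = len(text.split())
    return estimate_tokens(word_count) <= context_tokens


if __name__ == "__main__":
    # Roughly how many words fit in a 128,000-token window under this ratio?
    print(128_000 * WORDS_PER_MILLION_TOKENS // MILLION)
    print(fits_in_context("word " * 700_000))
```

Under this rough ratio, a 128,000-token window holds about 89,600 words, while the million-token window holds the full 700,000.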


The company notes that Gemini 1.5 Pro maintains high accuracy even as the context window grows. In the “Needle In a Haystack” test, the model showed impressive results in searching for a small piece of text hidden within a large block of data. Google reported that Gemini 1.5 Pro found the text 99% of the time, even in blocks of data up to a million tokens long.
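The idea behind a test of this kind can be sketched in a few lines. This is not Google's evaluation harness, only an illustration under stated assumptions. The filler text, the "magic number" needle, and both stand-in "models" are hypothetical, and a real run would replace `ask_model` with a call to an actual model.

```python
from typing import Callable, Sequence

FILLER = "The sky was clear and the grass was green."  # hypothetical filler sentence
NEEDLE = "The magic number is 48213."                  # hypothetical hidden fact


def build_haystack(n_sentences: int, depth: float) -> str:
    """Repeat the filler n_sentences times and hide the needle at relative depth 0..1."""
    sentences = [FILLER] * n_sentences
    sentences.insert(int(depth * n_sentences), NEEDLE)
    return " ".join(sentences)


def retrieval_rate(ask_model: Callable[[str, str], str],
                   n_sentences: int,
                   depths: Sequence[float]) -> float:
    """Fraction of depths at which the model's answer contains the hidden number."""
    hits = 0
    for depth in depths:
        answer = ask_model(build_haystack(n_sentences, depth), "What is the magic number?")
        if "48213" in answer:
            hits += 1
    return hits / len(depths)


def toy_model(context: str, question: str) -> str:
    """Stand-in for a model that reads the whole context: a plain text search."""
    for sentence in context.split(". "):
        if "magic number" in sentence:
            return sentence
    return "I don't know"


def truncating_model(context: str, question: str) -> str:
    """Stand-in for a model with a tiny context window: sees only the first 200 characters."""
    return toy_model(context[:200], question)


if __name__ == "__main__":
    depths = [0.0, 0.5, 1.0]
    print(retrieval_rate(toy_model, 1000, depths))
    print(retrieval_rate(truncating_model, 1000, depths))
```

The truncating stand-in finds the needle only when it sits at the very start of the text, which is the kind of failure a long context window is meant to avoid.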

Google says Gemini 1.5 Pro can reason about various details in a 402-page transcript of the Apollo 11 mission to the moon. It can also analyze the plot and events of a 44-minute silent film starring Buster Keaton.

“Since 1.5 Pro’s long context window is the first of its kind for a large model, we continue to develop new evaluation criteria to test its novel capabilities,” Google writes.

At launch, Gemini 1.5 Pro supports a context window of 128,000 tokens, the same limit that OpenAI’s publicly announced GPT-4 tops out at. Hassabis says the company will soon introduce pricing tiers for requests of up to 1 million tokens.

Gemini 1.5 Pro can also learn new skills from long prompts without additional training. In the “Machine Translation from One Book” test, the model learned the grammar of Kalamang, a language spoken by fewer than 200 people in the world that it had not previously seen. According to Google, Gemini 1.5 Pro translated from English to Kalamang at a level similar to that of a person learning from the same material.

Google says Gemini 1.5 Pro can solve problems across longer blocks of code.

When given a prompt containing 100,000 lines of code, the model reasons better across examples, suggests useful modifications, and explains how different parts of the code work.

In terms of ethics and safety, Google promises the same responsible approach as for Gemini 1.0. This includes developing methods to test for problems, with the team essentially playing devil’s advocate by testing “a wide range of potential harms.” In addition, the company carefully checks areas such as content safety and bias.

Google is initially offering Gemini 1.5 Pro in early access to developers and enterprise customers, with plans to make it more widely available in the future. Gemini 1.0 is already available to consumers, along with a paid Pro version for $20 per month.


