Curated by THEOUTPOST
On Tue, 8 Apr, 12:02 AM UTC
8 Sources
[1]
Google's best Gemini AI model is showing up in more places
Google made waves with the release of Gemini 2.5 last month, rocketing to the top of the AI leaderboard after previously struggling to keep up with the likes of OpenAI. That first experimental model was just the beginning. Google is deploying its improved AI in more places across its ecosystem, from the developer-centric Vertex AI to the consumer Gemini app.

Gemini models have been arriving so quickly that it can be hard to grasp Google's intended lineup. Things are becoming clearer now that the company is beginning to move its products to the new branch. At the Google Cloud Next conference, Google announced initial availability of Gemini 2.5 Flash. This model is based on the same code as Gemini 2.5 Pro, but it's faster and cheaper to run. You won't see Gemini 2.5 Flash in the Gemini app just yet -- it's starting out in the Vertex AI development platform. The experimental wide release of Pro helped Google gather data on how people interacted with the new model, which in turn informed the development of 2.5 Flash.

The Flash versions of Gemini are smaller than the Pro versions, though Google doesn't disclose specific parameter counts. Flash models provide faster answers for simpler prompts, which has the side effect of reducing costs. We do know that 2.5 Pro (Experimental) was the first Gemini model to implement dynamic thinking, a technique that allows the model to modulate the amount of simulated reasoning that goes into an answer. 2.5 Flash is also a thinking model, but a more advanced one.

We recently spoke with Google's Tulsee Doshi, who noted that the 2.5 Pro (Experimental) release was still prone to "overthinking" its responses to simple queries. However, the plan was to further improve dynamic thinking for the final release, and the team also hoped to give developers more control over the feature. That appears to be happening with Gemini 2.5 Flash, which includes "dynamic and controllable reasoning."
The newest Gemini models will choose a "thinking budget" based on the complexity of the prompt. This helps reduce wait times and processing for 2.5 Flash. Developers even get granular control over the budget to lower costs and speed things along where appropriate. Gemini 2.5 models are also getting supervised tuning and context caching for Vertex AI in the coming weeks.

In addition to the arrival of Gemini 2.5 Flash, the larger Pro model has picked up a new gig. Google's largest Gemini model is now powering its Deep Research tool, which was previously running Gemini 2.0 Pro. Deep Research lets you explore a topic in greater detail simply by entering a prompt. The agent then goes out onto the Internet to collect data and synthesize a lengthy report. Google says that the move to Gemini 2.5 has boosted the accuracy and usefulness of Deep Research.

The stats Google cites for its advantage over OpenAI's deep research tool are based on user evaluations (not synthetic benchmarks) and show a greater than 2-to-1 preference for Gemini 2.5 Pro reports. Deep Research is available for limited use on non-paid accounts, but you won't get the latest model. Deep Research with 2.5 Pro is currently limited to Gemini Advanced subscribers. However, we expect before long that all models in the Gemini app will move to the 2.5 branch. With dynamic reasoning and new TPUs, Google could begin lowering the sky-high costs that have thus far made generative AI unprofitable.
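The "thinking budget" idea can be illustrated with a toy heuristic in Python. This is purely conceptual -- Gemini's actual budget selection is internal to the model, and the marker words and scaling factors below are invented for illustration.

```python
def choose_thinking_budget(prompt: str, max_budget: int = 1024) -> int:
    """Toy heuristic: scale a simulated-reasoning token budget with a
    rough estimate of prompt complexity. Illustrative only -- not how
    Gemini actually allocates its budget."""
    words = prompt.split()
    # Hypothetical signals that a prompt likely needs multi-step reasoning.
    complex_markers = {"prove", "derive", "compare", "optimize", "why", "plan"}
    score = len(words) + 20 * sum(w.lower().strip("?.,") in complex_markers for w in words)
    # Simple prompts get a zero budget; complex ones scale up to the cap.
    return min(max_budget, max(0, score - 10) * 8)

print(choose_thinking_budget("What is 2 + 2?"))  # → 0 (no simulated reasoning)
print(choose_thinking_budget("Compare two caching strategies and explain why one scales better"))
```

On the real API, this trade-off is exposed to developers as a settable budget parameter rather than a client-side heuristic like this one.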
[2]
Google's newest Gemini AI model focuses on efficiency | TechCrunch
Google is releasing a new AI model designed to deliver strong performance with a focus on efficiency. The model, Gemini 2.5 Flash, will soon launch in Vertex AI, Google's AI development platform. The company says it offers "dynamic and controllable" computing, allowing developers to adjust processing time based on the complexity of queries. "[You can tune] the speed, accuracy, and cost balance for your specific needs," Google wrote in a blog post provided to TechCrunch. "This flexibility is key to optimizing Flash performance in high-volume, cost-sensitive applications."

Gemini 2.5 Flash arrives as the cost of flagship AI models continues trending upward. Lower-priced, performant models like 2.5 Flash present an attractive alternative to costly top-of-the-line options at the cost of some accuracy. Gemini 2.5 Flash is a "reasoning" model along the lines of OpenAI's o3-mini and DeepSeek's R1. That means it takes a bit longer to answer questions in order to fact-check itself. Google says that 2.5 Flash is ideal for "high-volume" and "real-time" applications like customer service and document parsing. "This workhorse model is optimized specifically for low latency and reduced cost," Google said in its blog post. "It's the ideal engine for responsive virtual assistants and real-time summarization tools where efficiency at scale is key."

Google didn't publish a safety or technical report for Gemini 2.5 Flash, making it more challenging to see where the model excels and falls short. The company previously told TechCrunch that it doesn't release reports for models it considers to be "experimental." Google also announced on Wednesday that it plans to bring Gemini models like 2.5 Flash to on-premises environments starting in Q3. The company's Gemini models will be available on Google Distributed Cloud (GDC), Google's on-prem solution for clients with strict data governance requirements.
Google says it's working with Nvidia to bring Gemini models to GDC-compliant Nvidia Blackwell systems that customers can purchase through Google or their preferred channels.
[3]
Google debuts more Gemini updates: New Workspace tools, Gemini 2.5 Flash, and agentic AI
To no one's surprise, there have been a lot of AI-related announcements at the Google Cloud Next event. Even less surprising: Google's annual cloud computing conference has focused on new versions of its flagship Gemini model and advances in AI agents. So, for those following the whiplash competition between AI heavy hitters like Google and OpenAI, let's unpack the latest Gemini updates.

On Wednesday, Google announced Gemini 2.5 Flash, a "workhorse" that has been adapted from its most advanced Gemini 2.5 Pro model. Gemini 2.5 Flash has the same build as 2.5 Pro but has been optimized to be faster and cheaper to run. The model's speed and cost-efficiency are possible because of its ability to adjust or "budget" its processing power based on the desired task. This concept, known as "test-time compute," is the emerging technique that reportedly made DeepSeek's R1 model so cheap to train. Gemini 2.5 Flash isn't available just yet, but it's coming soon to Vertex AI, AI Studio, and the standalone Gemini app.

On a related note, Gemini 2.5 Pro is now available in public preview on Vertex AI and the Gemini app. This is the model that has recently topped the leaderboards in the Chatbot Arena. Google is also bringing these models to Google Workspace for new productivity-related AI features. That includes the ability to create audio versions of Google Docs, automated data analysis in Google Sheets, and something called Google Workspace Flows, a way of automating manual workflows like managing customer service requests across Workspace apps.

Agentic AI, a more advanced form of AI that reasons across multiple steps, is the main driver of the new Google Workspace features. But it's a challenge for all models to access the requisite data to perform tasks.
Yesterday, Google announced that it's adopting the Model Context Protocol (MCP), an open-source standard developed by Anthropic that enables "secure, two-way connections between [developers'] data sources and AI-powered tools," as Anthropic explained. "Developers can either expose their data through MCP servers or build AI applications (MCP clients) that connect to these servers," read a 2024 Anthropic announcement describing how it works.

Now, according to Google DeepMind CEO Demis Hassabis, Google is adopting MCP for its Gemini models. This will effectively allow Gemini models to quickly access the data they need, producing more reliable responses. Notably, OpenAI has also adopted MCP.

And that was just the first day of Google Cloud Next. Day two will likely bring even more announcements, so stay tuned.
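Anthropic's description of MCP servers and clients can be sketched as a toy in-process registry. The real protocol is JSON-RPC over transports such as stdio or HTTP; the classes and the URI below are hypothetical stand-ins for the roles involved, not the actual MCP API.

```python
class ToyMCPServer:
    """Stand-in for an MCP server: exposes named data resources."""
    def __init__(self):
        self._resources = {}

    def expose(self, uri: str, content: str) -> None:
        self._resources[uri] = content

    def list_resources(self) -> list:
        return sorted(self._resources)

    def read(self, uri: str) -> str:
        return self._resources[uri]


class ToyMCPClient:
    """Stand-in for an AI application (MCP client) that pulls
    grounding data from a server before answering."""
    def __init__(self, server: ToyMCPServer):
        self.server = server

    def gather_context(self) -> str:
        return "\n".join(self.server.read(u) for u in self.server.list_resources())


# Hypothetical data source: a CRM ticket exposed as a resource.
server = ToyMCPServer()
server.expose("crm://tickets/42", "Ticket 42: customer reports a login failure.")
client = ToyMCPClient(server)
print(client.gather_context())  # → Ticket 42: customer reports a login failure.
```

The point of the standard is that the server side (data owner) and client side (AI app) can be written by different vendors, as long as both speak the protocol.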
[4]
Google Brings Key Updates to AI Agents, Announces Gemini 2.5 Flash
The company announced several new techniques and frameworks to build and deploy AI agents. At the Google Cloud Next 2025 event on Wednesday, Google announced numerous key updates regarding AI agents. The newly announced tools, protocols, and frameworks help developers and enterprises effectively build, connect, and deploy agentic systems across their ecosystems. The company also announced another model in the Gemini 2.5 family -- Gemini 2.5 Flash.

To begin with, Google announced the new Agent2Agent (A2A) protocol, which facilitates the connection of AI agents within an enterprise ecosystem. The protocol enables AI agents to communicate with one another, irrespective of the framework or vendor on which they are built. "A2A facilitates communication between a 'client' agent and a 'remote' agent. A client agent is responsible for formulating and communicating tasks, while the remote agent is responsible for acting on those tasks in an attempt to provide the correct information or take the correct action," Google stated. A2A is an open protocol that complements Anthropic's Model Context Protocol (MCP), which helps connect data sources with AI systems. The company has already established partnerships with over 50 organisations, including Deloitte, Accenture, Oracle, Salesforce, and others, to collaborate on the protocol.

At the event, Google announced the Agent Development Kit (ADK), an open-source framework for designing AI agents and multi-agent systems while maintaining fine-grained control over agent behaviour. While it currently supports Python, more languages will be supported later this year. The company also announced 'Agent Garden', a collection of pre-built agent patterns and components to 'accelerate' the development process. ADK also allows agents to be connected directly to enterprise systems.
"This includes over 100 pre-built connectors, workflows built with Application Integration, or data stored within your systems like AlloyDB, BigQuery, NetApp and much more," Google said in the announcement.

The company also announced Agent Engine, which allows developers to deploy AI agents built using any framework and AI model. "No more rebuilding your agent system when moving from prototype to production. Agent Engine handles agent context, infrastructure management, scaling complexities, security, evaluation, and monitoring," it added. Agents hosted on Agent Engine can also be registered on Google Agentspace, for which the company announced a few updates as well.

Google Agentspace offers an enterprise search agent that answers questions using company data, alongside a central hub for custom AI agents that help employees research, create content, and automate tasks. Google also announced that, starting today, Agentspace will be integrated with the Chrome Enterprise edition, meaning employees can use its search capabilities directly from Chrome. Google is also introducing a new Agent Gallery, which gives employees a single view of all available agents across the enterprise, and an Agent Designer in preview, which offers a no-code interface for creating agents that connect to enterprise data sources to perform various tasks.

At the event, Google also announced the Gemini 2.5 Flash model, which will be available on Google's Vertex AI and Google AI Studio. Google promotes Gemini 2.5 Flash as a "workhorse model" designed for low latency and cost efficiency. The model is said to adjust reasoning time based on query complexity, and users can control reasoning by adjusting speed, accuracy, and cost. Gemini 2.5 Flash is the successor to Gemini 2.0 Flash, a model that outperforms Llama 3.3 70B and OpenAI's GPT-4o mini.
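Google's quoted description of the A2A client/remote split can be mirrored in a small sketch. The dicts below are conceptual placeholders -- the actual protocol defines a richer task lifecycle and wire format, and the field names here are invented for illustration.

```python
def client_agent_formulate(goal: str) -> dict:
    """Client agent role: formulates and communicates a task."""
    return {"role": "client", "task": goal, "status": "submitted"}

def remote_agent_handle(task: dict) -> dict:
    """Remote agent role: acts on the task and returns a result,
    regardless of what framework the client was built on."""
    return {
        "role": "remote",
        "status": "completed",
        "artifact": f"report for task: {task['task']}",
    }

# One round trip between the two roles.
task = client_agent_formulate("summarise open support tickets")
response = remote_agent_handle(task)
print(response["status"])  # → completed
```

The key design idea is that only the message shape is shared: either side can be swapped for an agent from another vendor without the other side changing.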
Recently, Google released the Gemini 2.5 Pro, which leads several benchmarks as the top frontier AI model. Numerous developers across the internet have hailed Gemini 2.5 Pro as the best AI model for coding. To help users choose the appropriate model, Google is also introducing an experimental Vertex AI Model Optimiser that generates the highest-quality response for each prompt based on the desired balance of cost and quality.

Google also announced that Gemini will be available on Google Distributed Cloud (GDC), with public preview starting in Q3 2025. The company has partnered with NVIDIA to bring Gemini models to NVIDIA Blackwell systems. GDC extends Google Cloud's infrastructure and services to on-premises data centers and edge locations, enabling organisations to run applications closer to their data sources.

Furthermore, Google unveiled the seventh generation of its tensor processing unit (TPU), Ironwood. The TPU supports two configurations -- one with 256 chips and another with 9,216 chips. Ironwood's performance-per-watt is twice that of the sixth generation, Trillium, and it offers 192 GB of memory per chip, six times that of Trillium.
[5]
Google's new Gemini 2.5 Flash is proof AI doesn't have to be slow
Google is rolling out Gemini 2.5 Flash, a speedier, more efficient AI model announced at the Google Cloud Next conference, expanding the reach of its latest AI architecture beyond its initial experimental phase. After Gemini 2.5 Pro turned heads last month, Google is now deploying improved AI across its ecosystem, clarifying its model lineup. Flash joins Pro, starting life on the developer-focused Vertex AI platform rather than the consumer Gemini app.

Gemini 2.5 Flash stems from the same base code as Gemini 2.5 Pro but is designed to be faster and cheaper to operate. Google gathered user interaction data from the experimental release of Pro, which helped shape the development of Flash. Flash models are smaller than their Pro counterparts, accelerating answers for simpler queries and reducing operational costs, though Google doesn't disclose specific parameter counts.

Both 2.5 Pro and Flash feature dynamic thinking, enabling the AI to adjust its simulated reasoning effort based on the query. According to Ars Technica, Google's Tulsee Doshi said the initial experimental 2.5 Pro occasionally "overthought" simple requests. Gemini 2.5 Flash incorporates more advanced "dynamic and controllable reasoning" to address this, choosing a "thinking budget" relative to prompt complexity to cut down wait times and processing needs. Developers using Vertex AI gain granular control over this thinking budget, allowing further cost reduction and speed optimization. Google also plans to add supervised tuning and context caching for Gemini 2.5 models on Vertex AI in the coming weeks.

Separately, the larger Gemini 2.5 Pro model now powers Google's Deep Research tool, upgrading it from the previous Gemini 2.0 Pro. Deep Research uses prompts to gather internet data and synthesize detailed reports on a topic. Google states the upgrade to Gemini 2.5 Pro has enhanced the accuracy and utility of Deep Research reports.
Citing user evaluations, Google claims a greater than two-to-one preference for its reports compared to those from OpenAI's similar tool. While Deep Research is available for limited use on free accounts, the version running Gemini 2.5 Pro is currently restricted to Gemini Advanced subscribers.
[6]
Google Claims Its New Gemini AI Is Smarter and Cheaper Than OpenAI's Best
On April 4, Google made its most powerful AI model yet, Gemini 2.5 Pro, available to software developers. The new model outperforms offerings from OpenAI and Anthropic across several benchmarks while being cheaper to use. Google first revealed Gemini 2.5 Pro in late March, the first in a line of Gemini 2.5 models. You can expect other models in the 2.5 line to have names that denote their price and capabilities, similar to Gemini 2.0 Flash, a lower-cost model revealed in February.

All models in the 2.5 line will be thinking models, meaning they can reason through the best way to answer a question by using an internal dialogue. Gemini 2.5 Pro was initially only available on the Gemini app and website, but is now available for commercial use through an API.

According to Google, Gemini 2.5 Pro exhibits high levels of capability in math and science. The model outperformed OpenAI, Anthropic, xAI, and DeepSeek's latest models in benchmarks that test high-level science and math. And in a benchmark meant to test the model's agentic coding skills, Gemini 2.5 Pro came in second, behind only Anthropic's Claude 3.7 Sonnet. In brief, 2.5 Pro seems to be a whiz at those subjects.

In a blog post announcing Gemini 2.5 Pro's API debut, Google senior product manager Logan Kilpatrick wrote that the model had been "priced competitively." Like most other AI APIs, developers will need to pay a fee to Google every time the model processes a new input and creates a new output. This is done through a process called "tokenization," in which input data is broken up into a series of "tokens," to be processed by the model. The number of tokens in an input/output determines the size of the API fee. Essentially, more data means more tokens, which means more money. Google lists the precise pricing scheme in the blog post.
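The token-based billing described above reduces to simple arithmetic: tokens divided by one million, times the listed per-million price, summed over input and output. The token counts below are made up for illustration; the rates are the base prices Google lists for Gemini 2.5 Pro.

```python
def api_cost(input_tokens: int, output_tokens: int,
             price_in_per_m: float, price_out_per_m: float) -> float:
    """Fee = (tokens / 1,000,000) * per-million-token price,
    summed over the input and output sides of a request."""
    return (input_tokens / 1_000_000 * price_in_per_m
            + output_tokens / 1_000_000 * price_out_per_m)

# Hypothetical request: 3,000 input tokens and 800 output tokens,
# at $1.25 / $10 per million (Gemini 2.5 Pro base rates).
print(round(api_cost(3_000, 800, 1.25, 10.0), 6))  # → 0.01175
```

Note the asymmetry: output tokens cost several times more than input tokens, so verbose responses dominate the bill even for long prompts.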
[7]
Gemini 2.5 Pro Now Available in Public Preview With Higher Rate Limits
Google said Gemini 2.5 Pro will soon be rolled out to Vertex AI
The tech giant will charge higher prices beyond 200,000 tokens
Gemini 2.5 Pro is also available to users via Gemini Advanced

Google moved the Gemini 2.5 Pro artificial intelligence (AI) model to public preview last week. The Mountain View-based tech giant released its latest flagship-tier AI model last month as an experimental preview with limited rate limits. Now, those using the Gemini application programming interface (API) or accessing the model via Google AI Studio will get higher rate limits. The company also announced the pricing for the higher rate limits, and claimed that it is keeping them competitive.

In a blog post, Google said that it has seen early adoption of Gemini 2.5 Pro by developers, and due to this high demand, it is expanding access to the large language model (LLM). The tech giant said the foundation model is now available as a public preview in the Gemini API in Google AI Studio with higher rate limits. Notably, the AI model is yet to be released on Vertex AI, and the company said it will be added to the platform shortly.

Coming to the pricing, Gemini 2.5 Pro is now available in two tiers. The first applies to prompts of 200,000 tokens or less; the second tier kicks in once a prompt exceeds that threshold. Within the regular tier, Gemini 2.5 Pro costs $1.25 (roughly Rs. 107) per million input tokens and $10 (roughly Rs. 858) per million output tokens. Input tokens include text, images, audio, and video, and the output price accounts for reasoning tokens. Beyond the 200,000-token threshold, developers will be charged $2.50 (roughly Rs. 214) per million input tokens and $15 (roughly Rs. 1,290) per million output tokens. Notably, the experimental version of the Gemini model remains free with lower rate limits.
The tech giant says the model has been priced "competitively," and comparing its prices to competitors, Gemini 2.5 Pro does appear significantly cheaper. For instance, Anthropic's flagship model, Claude 3.7 Sonnet, costs $3 per million input tokens and $15 per million output tokens. Google's major rival in the AI space, OpenAI, charges even more for its reasoning models: the o1 AI model costs $15 per million input tokens and $60 (roughly Rs. 5,150) per million output tokens, though the company does offer cached input tokens at a discounted $7.50 (roughly Rs. 640) per million.
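The two-tier scheme above can be sketched as a small calculator. It assumes, for illustration, that the higher rate applies to requests whose prompt exceeds the 200,000-token threshold; the exact tier mechanics are Google's, and the function name is ours.

```python
TIER_THRESHOLD = 200_000  # prompt-size cutoff between the two tiers

PRICES_USD_PER_M = {  # (input, output) prices per million tokens, from the article
    "base":  (1.25, 10.0),
    "large": (2.50, 15.0),
}

def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Illustrative cost in USD under the two-tier pricing described above."""
    tier = "base" if input_tokens <= TIER_THRESHOLD else "large"
    p_in, p_out = PRICES_USD_PER_M[tier]
    return input_tokens / 1_000_000 * p_in + output_tokens / 1_000_000 * p_out

print(round(gemini_25_pro_cost(100_000, 10_000), 4))  # base tier  → 0.225
print(round(gemini_25_pro_cost(300_000, 10_000), 4))  # large tier → 0.9
```

Even at the higher tier, these figures sit below the Claude 3.7 Sonnet and o1 rates quoted above for the same token volumes.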
[8]
Gemini 2.5 Flash Unveiled as a Faster and Cheaper AI Model for Developers
Gemini 2.5 Flash comes with native reasoning capability
Google said it will be added to Vertex AI and AI Studio soon
Gemini 2.5 Flash can also be used to build AI agents

Google released its second artificial intelligence (AI) model in the Gemini 2.5 family on Thursday. Dubbed Gemini 2.5 Flash, it is a cost-efficient, low-latency model designed for tasks requiring real-time inference, conversations at scale, and those which are generalist in nature. The Mountain View-based tech giant will soon make the AI model available on both Google AI Studio and Vertex AI so that users and developers can access Gemini 2.5 Flash and build applications and agents with it.

In a blog post, the tech giant detailed its latest large language model (LLM). Alongside announcing the debut of the Flash model, the post also confirmed that the Gemini 2.5 Pro model is now available on Vertex AI. Differentiating between the use cases of the two models, Google said the Pro model is ideal for tasks that require intricate knowledge, multi-step analyses, and nuanced decisions, while the Flash model prioritises speed, low latency, and cost efficiency. Calling it a workhorse model, the tech giant said it is an "ideal engine for responsive virtual assistants and real-time summarisation tools where efficiency at scale is key."

While launching the 2.5 Pro model, Google specified that all LLMs in this series would feature natively built reasoning or "thinking" capability. This means 2.5 Flash also comes with "dynamic and controllable reasoning." Developers can adjust the processing time for a query based on its complexity, giving them granular control over response generation times.

For its enterprise clients, Google is also introducing the Vertex AI Model Optimiser tool. Available as an experimental feature within the platform, it takes away the confusion of choosing a specific model when users are not sure which to pick.
The feature can automatically generate the highest-quality response for each prompt based on factors such as quality and cost. Google did not release a technical paper or model information card alongside the release, so information about the model's architecture, pre- and post-training processes, and benchmark scores is not known. The company might release these details later, when making the model available to end consumers.

Meanwhile, the tech giant is also adding new tools to support agentic application building on Vertex AI. The company is adding a new Live application programming interface (API) for Gemini models that will allow AI agents to process streaming audio, video, and text with low latency and complete tasks in real time. The Live API, which is powered by Gemini 2.5 Pro, also supports resumable sessions longer than 30 minutes, multilingual audio output, time-stamped transcripts for analysis, tool integration, and more.
Google introduces Gemini 2.5 Flash, a new AI model optimized for speed and efficiency, alongside updates to its AI ecosystem and agent technologies.
Google has unveiled its latest AI model, Gemini 2.5 Flash, at the Google Cloud Next conference, marking a significant advancement in the company's AI capabilities. This new model, derived from the same base code as Gemini 2.5 Pro, is designed to be faster and more cost-effective, addressing the growing demand for efficient AI solutions [1][2].
Gemini 2.5 Flash introduces several innovative features:
Dynamic and Controllable Reasoning: The model can adjust its "thinking budget" based on the complexity of the query, reducing processing time and costs for simpler tasks [1][4].
Optimized Performance: Flash models are smaller than their Pro counterparts, enabling quicker responses to straightforward prompts [1].
Developer Control: Through Vertex AI, developers can fine-tune the balance between speed, accuracy, and cost, making it ideal for high-volume, cost-sensitive applications [2][4].
Initially, Gemini 2.5 Flash will be available on Google's Vertex AI and AI Studio platforms [4]. The company plans to integrate it into the consumer-facing Gemini app in the future [1]. Google is also working to bring Gemini models to on-premises environments through Google Distributed Cloud (GDC) starting in Q3 2025 [2][4].
Alongside Gemini 2.5 Flash, Google announced several updates to its AI agent technologies:
Agent2Agent (A2A) Protocol: This new protocol facilitates communication between AI agents within an enterprise ecosystem, regardless of their underlying framework or vendor [4].
Agent Development Kit (ADK): An open-source framework for designing AI agents and multi-agent systems [4].
Agent Engine: A platform that allows developers to deploy AI agents built using any framework and AI model [4].
The introduction of Gemini 2.5 Flash and the updates to AI agent technologies significantly enhance Google's AI offerings:
Improved Deep Research Tool: Gemini 2.5 Pro now powers Google's Deep Research tool, boosting its accuracy and usefulness in generating detailed reports [1][3].
Workspace Integration: New AI features in Google Workspace, such as audio versions of Google Docs and automated data analysis in Sheets, leverage the capabilities of Gemini models [3].
Model Context Protocol (MCP) Adoption: Google is implementing this open-source standard to enable secure connections between data sources and AI-powered tools [3].
The release of Gemini 2.5 Flash and the associated AI agent updates represent Google's strategic move to compete more effectively in the AI market:
Cost Efficiency: By focusing on speed and efficiency, Google addresses the growing concern over the high costs associated with running advanced AI models [1][2].
Enterprise Focus: The new offerings cater to businesses seeking to integrate AI into their operations without compromising on performance or incurring excessive costs [2][4].
Competitive Positioning: These advancements help Google maintain its position as a leading AI provider, competing directly with companies like OpenAI and Anthropic [1][3].
As the AI landscape continues to evolve rapidly, Google's latest innovations demonstrate its commitment to developing more efficient, versatile, and accessible AI technologies for both developers and end-users.
© 2025 TheOutpost.AI All rights reserved