Curated by THEOUTPOST
On Fri, 4 Oct, 4:03 PM UTC
3 Sources
[1]
Gemini 1.5 Flash-8B Becomes the Cheapest Gemini-Powered AI Model
Gemini 1.5 Flash-8B is said to be optimised for speed and efficiency Gemini 1.5 Flash-8B, the latest entrant in the Gemini family of artificial intelligence (AI) models, is now generally available for production use. On Thursday, Google announced the general availability of the model, highlighting that it was a smaller and faster version of the Gemini 1.5 Flash which was introduced at Google I/O. Due to being fast, it has a low latency inference and more efficient output generation. More importantly, the tech giant stated that the Flash-8B AI model is the "lowest cost per intelligence of any Gemini model". In a developer blog post, the Mountain View-based tech giant detailed the new AI model. The Gemini 1.5 Flash-8B was distilled from the Gemini 1.5 Flash AI model, which was focused on faster processing and more efficient output generation. The company now claims that Google DeepMind developed this even smaller and faster version of the AI model in the last few months. Despite being a smaller model, the tech giant claims that it "nearly matches" the performance of the 1.5 Flash model across multiple benchmarks. Some of these include chat, transcription, and long context language translation. One major benefit of the AI model is its price effectiveness. Google said that the Gemini 1.5 Flash-8B will offer the lowest token pricing in the Gemini family. Developers will have to pay $0.15 (roughly Rs. 12.5) per one million output tokens, $0.0375 (roughly Rs. 3) per one million input tokens, and $0.01 (roughly Rs. 0.8) per one million tokens on cached prompts. Additionally, Google is doubling the rate limits of the 1.5 Flash-8B AI model. Now, developers can send up to 4,000 requests per minute (RPM) while using this model. Explaining the decision, the tech giant stated that the model is suited for simple, high-volume tasks. Developers who wish to try out the model can do so via Google AI Studio and the Gemini API free of charge.
[2]
Gemini 1.5 Flash-8B Debuts with Lowest Cost in Google AI Family
Google launches Gemini 1.5 Flash-8B, an affordable, high-speed AI tool optimized for smartphones that is now globally available Google LLC has announced the general availability of the Gemini 1.5 Flash-8B, marking a significant advancement in the accessibility of artificial intelligence (AI) technology. The Gemini 1.5 Flash-8B, introduced at Google I/O 2024, is an optimized version of the earlier Gemini models, offering high-speed processing and efficient output generation at a 50% reduction in price, making it one of the most affordable AI solutions on the market. Google's Gemini 1.5 Flash-8B is a low-powered AI tool designed for use on smartphones and sensors, demonstrating their commitment to making AI more accessible and cost-effective for developers worldwide, highlighting their ongoing innovation.
[3]
Google's lightweight Gemini 1.5 Flash-8B hits general availability - SiliconANGLE
Google's lightweight Gemini 1.5 Flash-8B hits general availability Google LLC is making a new version of its popular Gemini 1.5 Flash artificial intelligence model available that's said to be "smaller and faster" than the original. It's called Gemini 1.5 Flash-8B, and it's much more affordable, at half the price. Gemini 1.5 Flash is the lightweight version of Google's Gemini family of large language models, optimized for speed and efficiency and designed to be deployed on low-powered devices such as smartphones and sensors. The company first announced Gemini 1.5 Flash at Google I/O 2024 in May, and it was released to some paying customers a few weeks later, before becoming available for free via the Gemini mobile app, albeit with some restrictions on use. It finally hit general availability at the end of June, offering competitive pricing and a 1 million-token context window combined with high-speed processing. At the time of its launch, Google noted that its input size is 60-times larger than that of OpenAI's GPT-3.5 Turbo, and 40% faster on average. The original version was designed to provide a very low token input price, making it price-competitive for developers, and was adopted by customers such as Uber Technologies Inc., powering the Eats AI assistant in that company's UberEats food delivery service. With Gemini 1.5 Flash-8B, Google is introducing one of the most affordable lightweight LLMs available on the market, with a 50% lower price and two-times higher rate limits compared to the original 1.5 Flash. In addition, it also offers lower latency on small prompts, the company said. Developers can access Gemini 1.5 Flash-8B for free via the Gemini API and Google AI Studio. In a blog post, Gemini API Senior Product manager Logan Kilpatrick explained that the company has made "considerable progress" in its efforts to improve 1.5 Flash, taken into consideration feedback from developers and "testing the limits" of what's possible with such lightweight LLMs. He explained that the company announced an experimental version of Gemini 1.5 Flash-8B last month, and it has since been refined further, and is now generally available for production-use. According to Kilpatrick, the 8-B version can almost match the performance of the original 1.5 Flash model on many key benchmarks, and has shown to be especially useful in tasks such as chat, transcription and long context language translation. "Our release of best in class small models continues to be informed by developer feedback and our own testing of what is possible with these models," Kilpatrick added. "We see the most potential for this model in tasks ranging from high volume multimodal use cases to long context summarization tasks." Kilpatrick added that Gemini 1.5 Flash-8B offers the lowest cost per intelligence of any Gemini model released so far: The pricing compares very well with equivalent models from OpenAI and Anthropic PBC. In the case of OpenAI, its cheapest model is still GPT-4o mini, which costs $0.15/1M input, though that drops by 50% for reused prompt prefixes and batched requests. Meanwhile, Anthropic's most affordable model is Claude 3 Haiku at $0.25/M, though the price drops to $0.03/M for cached tokens. In addition, Kilpatrick said the company is doubling 1.5 Flash-8B's rate limits, in an effort to make it more useful for simple, high-volume tasks. As such, developers can now send up to 4,000 requests per minute, he said.
Share
Share
Copy Link
Google has launched Gemini 1.5 Flash-8B, a smaller and faster version of its Gemini AI model, offering high performance at the lowest cost in the Gemini family. This new model is designed for efficiency and affordability in AI development.
Google has announced the general availability of Gemini 1.5 Flash-8B, a new addition to its Gemini family of artificial intelligence (AI) models. This latest offering represents a significant advancement in making AI technology more accessible and cost-effective for developers worldwide [1][2].
Gemini 1.5 Flash-8B is an optimized version of the earlier Gemini models, specifically designed for speed and efficiency. Despite its smaller size, Google claims that it "nearly matches" the performance of the original 1.5 Flash model across multiple benchmarks, including chat, transcription, and long context language translation [1][3].
The model is particularly suited for:
One of the most notable aspects of Gemini 1.5 Flash-8B is its pricing structure, which Google touts as the "lowest cost per intelligence of any Gemini model" [1]. The pricing details are as follows:
This pricing model represents a 50% reduction compared to previous versions, making it one of the most affordable AI solutions on the market [2][3].
Google has doubled the rate limits for Gemini 1.5 Flash-8B, allowing developers to send up to 4,000 requests per minute (RPM). This increase is designed to support simple, high-volume tasks more effectively [1][3].
Developers can access and experiment with Gemini 1.5 Flash-8B through:
Gemini 1.5 Flash-8B is a low-powered AI tool optimized for use on smartphones and sensors. It offers lower latency on small prompts and is particularly effective for tasks such as:
The release of Gemini 1.5 Flash-8B positions Google competitively in the AI market. Its pricing and performance compare favorably to offerings from other major players:
Google's Senior Product Manager for Gemini API, Logan Kilpatrick, emphasized that the development of Gemini 1.5 Flash-8B was informed by developer feedback and extensive testing. The company sees significant potential for this model in various applications and continues to refine its capabilities based on real-world usage and requirements [3].
As AI technology continues to evolve, the introduction of more efficient and cost-effective models like Gemini 1.5 Flash-8B is likely to accelerate the adoption and integration of AI across various industries and applications.
Reference
[1]
[2]
Google has announced the release of new Gemini models, showcasing advancements in AI technology. These models promise improved performance and capabilities across various applications.
2 Sources
Google has introduced Gemini 1.5 Flash, a significant upgrade to its AI chatbot. This update brings faster and smarter responses to free-tier users across 230 countries, enhancing the AI's capabilities in various tasks.
5 Sources
Google has announced significant updates to its AI offerings, including the integration of Gemini 1.5 into enterprise contact centers and new AI-powered features for Google Workspace. These advancements aim to revolutionize customer engagement and boost productivity in the workplace.
9 Sources
Recent leaks suggest Google is preparing to launch Gemini 2.0, a powerful AI model that could rival OpenAI's upcoming o1. The new model promises enhanced capabilities in reasoning, multimodal processing, and faster performance.
5 Sources
Google DeepMind's Gemini AI platform has attracted over 1.5 million developers worldwide, with India emerging as one of the largest user bases. The company is expanding access to advanced AI models and emphasizing India's potential in driving AI innovation.
4 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2024 TheOutpost.AI All rights reserved