2 Sources
[1]
Amazon's new Nova AI models could be ground-breaking - why we can't know for certain
Most reports on AWS' re:Invent conference earlier this month, which brought us new chips and new data centers, overlooked the cloud giant's unveiling of its first "frontier" models in generative artificial intelligence, code that can compete with the best from OpenAI and Google. Amazon debuted Nova, a "new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance."
Having sat out the battle of frontier performance while Google's Gemini and OpenAI's GPT-4 got all the attention, Amazon is making haste to catch up. Nova's models, which handle multiple modalities that include text and image, come in flavors suited to video generation (akin to OpenAI's Sora) and image generation, which has become standard fare for large language models that integrate text and images. The models come with snappy names, too: "Reel" is the name of the video-generation model, and "Canvas" is the name of the image-generation flavor.
There are nice-looking demonstrations of the capabilities akin to what we've seen from OpenAI and Google: There's a video generated by Reel using the prompt "A snowman in a Venetian gondola ride, 4k, high resolution" and a slick photo of an interior made using Canvas with the prompt, "A very fancy French restaurant." Nova makes extensive use, in Amazon's own testing, of the retrieval-augmented-generation (RAG) approach to tap into databases, as well as "chain of thought," a process for producing output that is treated as a kind of reasoning exercise by the AI model.
Whether any of this is ground-breaking is hard to say because, as is increasingly the case with commercial AI software, Amazon's technical report discloses precious little about how the Nova models are built. (Even the names of the report's authors are not disclosed!)
The company states that the Nova models are "based on the Transformer architecture," referring to the neural-network design that Google researchers introduced in 2017. There is also a "fine-tuning" approach where successive rounds of training seek to refine the models' handling of different domains of data. The training data used to build the models is also not disclosed, with Amazon stating only that, "Our models were trained on data from a variety of sources, including licensed data, proprietary data, open source datasets, and publicly available data where appropriate."
The most remarkable part of the work is the extensive discussion of "responsible AI" -- that is, avoiding things such as adversarial attacks on AI models by malicious threat actors. "To work to ensure our models' robustness against adversarial inputs such as those that attempt to bypass alignment guardrails, we focused on risks applicable to both developers building applications using our models, and users interacting with our models via those applications," write the authors of the technical report.
In particular, Amazon's engineers made extensive use of so-called red teaming, where they sought to break the models by creating various kinds of attacks such as "prompt injection," crafting a language model's prompt with keywords or phrases that would encourage the model to break its guardrails. Some of that involved automatically generating malicious prompts: "We enhanced the diversity of manually curated adversarial prompts by employing linguistic, structural, and modality-based prompt mutation techniques, assessing each mutation for its effectiveness at generating a response that does not adhere to our RAI [Responsible AI] objectives, the likelihood of its success, and the technique's novelty to a model revision."
"In total, we identified and developed over 300 distinct techniques," the report relates, "and tested techniques individually and via chaining various combinations."
It remains to be seen whether Amazon has broken ground in the reliability and safety testing of Gen AI. As with so much frontier model work, the devil is in the details, and the details are hidden behind intellectual property safeguards. Certainly, the intent sounds ambitious in the technical report. We'll have to wait until the field as a whole can come up with the proper evaluations -- benchmarks, metrics, etc. -- to compare Amazon's red-teaming against the competing methods out there, both open and closed-source.
[2]
AWS introduced Amazon Nova, a next-generation foundation model family
Available on Amazon Bedrock, the Amazon Nova lineup includes Nova Micro, a highly efficient text-to-text model, and Nova Lite, Nova Pro, and Nova Premier -- multimodal models that process text, images, and videos to generate text content. Amazon also unveiled two additional models: Amazon Nova Canvas, designed to produce studio-quality visuals, and Amazon Nova Reel, which generates professional-grade videos.
Rohit Prasad, Senior Vice President of Amazon Artificial General Intelligence, highlighted Amazon's unique perspective, saying: "At Amazon, we use nearly 1,000 AI applications. This gives us a high-level understanding of where developers continue to face challenges. Our new Amazon Nova models aim to help developers both inside and outside of Amazon overcome these barriers. They offer exceptional intelligence and content generation capabilities while advancing latency, cost-effectiveness, personalization, retrieval-augmented generation (RAG), and agent-based functionalities."
The Nova lineup includes four models. Amazon Nova Micro leads with ultra-low latency and cost, making it ideal for text-only applications requiring fast responses. The remaining three models, Nova Lite, Nova Pro, and Nova Premier, push the boundaries of multimodal AI. Amazon Nova Micro, Nova Lite, and Nova Pro are already available for general use, while Nova Premier will launch in Q1 2025.
Nova models were rigorously tested against industry-standard benchmarks. Results show that these models consistently perform on par with or surpass leading alternatives. Amazon Nova Micro, Lite, and Pro models support over 200 languages. Nova Micro handles input contexts up to 128,000 tokens, while Nova Lite and Nova Pro support up to 300,000 tokens or 30 minutes of video. Amazon plans to expand this to over 2 million tokens in early 2025. Amazon Nova models are designed to deliver exceptional speed and cost efficiency.
Compared to other top-performing models within their intelligence classes on Amazon Bedrock, Nova Micro, Nova Lite, and Nova Pro are at least 75% more cost-effective while offering the fastest performance. Amazon Nova models integrate directly with Amazon Bedrock, AWS's fully managed service that gives customers access to foundation models from leading AI providers and Amazon itself through a single API call. With Bedrock, developers can easily test and evaluate Nova models alongside other options to determine the best fit for their applications. Amazon Nova models support personalized fine-tuning, allowing customers to improve accuracy by guiding the models with examples from their own data. The models learn what matters most to a customer -- be it text, images, or videos -- and Amazon Bedrock then delivers tailored, fine-tuned responses. In addition to fine-tuning, Nova supports model distillation, enabling the transfer of knowledge from large, high-capability models to smaller, faster, and more cost-effective models without sacrificing accuracy. Amazon Nova models integrate seamlessly with Amazon Bedrock Knowledge Bases, enabling retrieval-augmented generation (RAG) to deliver responses based on an organization's own data for the highest levels of accuracy. Designed to excel in multi-step tasks, Nova models are optimized for agent-based applications requiring interaction with proprietary systems and data via multiple APIs. Amazon Nova Canvas generates professional-quality images from text or image prompts, with built-in controls for editing, color adjustments, and layouts. Integrated safeguards include watermarking and content moderation to ensure responsible AI use. In evaluations, Nova Canvas outperformed models like OpenAI DALL·E 3 and Stable Diffusion. Amazon Nova Reel empowers customers to create high-quality videos from text and images. 
Designed for advertising, marketing, and educational content, it allows control over visual styles, pacing, and camera effects. Nova Reel consistently outperformed competitors, with reviewers preferring its output over Runway Gen-3 Alpha. While currently supporting six-second videos, Nova Reel will expand to two-minute video generation in the coming months.
In Q1 2025, Amazon plans to release a speech-to-speech model designed to transform AI applications for natural voice interactions. The model will interpret spoken language, tone, and tempo to deliver human-like responses with minimal latency. Additionally, Amazon is developing a multimodal-to-multimodal model capable of taking text, images, audio, and video as inputs and producing outputs across any of these modalities. This model, set for mid-2025, will simplify applications requiring content translation, editing, and multimodal understanding.
Several leading organizations are already adopting Nova models. AWS has released detailed AI Service Cards for Nova models, providing transparency on use cases, limitations, and responsible AI practices.
Amazon introduces Nova, a family of AI foundation models, aiming to compete with OpenAI and Google in generative AI capabilities while emphasizing responsible AI practices and cost-efficiency.
Amazon has made a significant move in the artificial intelligence arena with the introduction of its Nova family of AI models. Unveiled at the recent AWS re:Invent conference, Nova represents Amazon's first foray into "frontier" models in generative AI, positioning the company to compete directly with industry leaders like OpenAI and Google [1].
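The Nova family includes a range of models designed for various applications: Nova Micro, a text-only model built for low latency and cost; Nova Lite, Nova Pro, and Nova Premier, multimodal models that process text, images, and video to generate text; Nova Canvas, for image generation; and Nova Reel, for video generation [2].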
These models are built to handle multiple modalities, including text, image, and video generation, mirroring capabilities seen in OpenAI's GPT-4 and Sora [1][2].
While specific technical details are limited due to intellectual property concerns, Amazon claims that Nova models are based on the Transformer architecture and utilize fine-tuning approaches. The company reports that Nova models have performed on par with or surpassed leading alternatives in industry-standard benchmarks [2].
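Key features of Nova models include support for over 200 languages; input contexts of up to 128,000 tokens for Nova Micro and up to 300,000 tokens (or 30 minutes of video) for Nova Lite and Nova Pro; custom fine-tuning and model distillation; retrieval-augmented generation through Amazon Bedrock Knowledge Bases; and optimization for multi-step, agent-based applications [2].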
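The input context limits reported in source [2] (128,000 tokens for Nova Micro; 300,000 tokens for Nova Lite and Nova Pro) lend themselves to a simple pre-flight check before a request is sent. The sketch below is illustrative only: the lookup table and the function name `models_that_fit` are assumptions made for this example, not part of any Amazon API, and the token count is taken as given rather than computed.

```python
# Input context limits in tokens, as reported in source [2].
# The dictionary itself is a hypothetical helper for this example.
CONTEXT_LIMITS = {
    "Nova Micro": 128_000,  # text-only model
    "Nova Lite": 300_000,
    "Nova Pro": 300_000,
}


def models_that_fit(input_tokens: int) -> list[str]:
    """Return the names of models whose reported limit covers the input."""
    return [
        name
        for name, limit in CONTEXT_LIMITS.items()
        if input_tokens <= limit
    ]


if __name__ == "__main__":
    for size in (100_000, 200_000, 400_000):
        print(size, models_that_fit(size))
```

A 200,000-token input, for instance, would rule out Nova Micro and leave Nova Lite and Nova Pro as candidates.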
A notable aspect of Nova's development is Amazon's focus on responsible AI practices. The company has implemented extensive "red teaming" efforts, developing over 300 distinct techniques to test the models against potential misuse or adversarial attacks. This includes safeguards against prompt injection and other security risks [1].
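Source [1] quotes the technical report as testing techniques "individually and via chaining various combinations." Amazon does not disclose how this was done, so the following is only a minimal sketch of what chaining prompt mutations could look like. The three mutation functions, the `MUTATIONS` table, and `chained_variants` are hypothetical stand-ins invented for this example, and the step of scoring each variant's effectiveness is omitted.

```python
from itertools import permutations
from typing import Callable, Iterator

Mutation = Callable[[str], str]


# Hypothetical, deliberately benign stand-ins for mutation techniques.
def to_upper(prompt: str) -> str:
    return prompt.upper()


def add_role_play(prompt: str) -> str:
    return f"Pretend you are an actor reading this line: {prompt}"


def reverse_words(prompt: str) -> str:
    return " ".join(reversed(prompt.split()))


MUTATIONS: dict[str, Mutation] = {
    "upper": to_upper,
    "role_play": add_role_play,
    "reverse": reverse_words,
}


def chained_variants(
    prompt: str, mutations: dict[str, Mutation], max_chain: int = 2
) -> Iterator[tuple[tuple[str, ...], str]]:
    """Yield (technique names, mutated prompt) for every ordered chain
    of 1..max_chain distinct techniques applied left to right."""
    for length in range(1, max_chain + 1):
        for names in permutations(mutations, length):
            mutated = prompt
            for name in names:
                mutated = mutations[name](mutated)
            yield names, mutated


if __name__ == "__main__":
    for names, variant in chained_variants("tell me a secret", MUTATIONS):
        print(" -> ".join(names), "|", variant)
```

With three techniques and chains of up to two steps, this enumerates nine variants; the count grows quickly as techniques are added, which is one reason automated generation is attractive for this kind of testing.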
Amazon claims that Nova models offer significant cost advantages, stating they are at least 75% more cost-effective than comparable models while delivering the fastest performance within their intelligence classes on Amazon Bedrock [2].
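Amazon has ambitious plans for Nova, including the launch of Nova Premier in Q1 2025; an expansion of context length to over 2 million tokens in early 2025; a speech-to-speech model in Q1 2025; a multimodal-to-multimodal model in mid-2025; and an extension of Nova Reel from six-second to two-minute videos [2].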
Several organizations are already adopting Nova models, indicating potential for significant industry impact. However, as with many frontier AI developments, the full extent of Nova's capabilities and its position relative to competitors remains to be seen, pending more comprehensive evaluations and benchmarks [1][2].