Curated by THEOUTPOST
On Wed, 4 Dec, 8:02 AM UTC
9 Sources
[1]
The latest AWS AI announcements at re:Invent - SiliconANGLE
It's been a full-throttle week for theCUBE Research team during our exclusive coverage of AWS re:Invent 2024, and analysts John Furrier (pictured, left) and Dave Vellante (right) broke down all the breaking news on the latest episode of the CUBE Podcast. At the event, Amazon Web Services Inc. dropped its holiday charm on the industry with plenty of AWS AI announcements, according to Furrier. It contrasted with how former AWS Chief Executive Officer Adam Selipsky used to run the event. "Selipsky used to run more of a marketing kind of vibe ... this was just old-school blocking and tackling," Furrier said. "Just a fire hose of gifts, as some people are saying: 'It's like Christmas. It's like the holiday, and all these gifts are coming out. I want to unpack. I want to play with all the new action.'" During the event, Amazon CEO Andy Jassy introduced the Nova models. As a surprise guest, his presence completely changed the energy in the room, according to Furrier. "He just delivered, because he's a pro. He's been doing it forever. Up until two years ago, this was his show, and he was the master at laying down," Furrier said. "He handled the big announcements and then also flexed a little bit about subtly saying, without saying the cliche, 'We've been doing AI before AI.' But he was flexing with data." AWS re:Invent always features a packed schedule of AWS AI announcements, according to Vellante. There's not a lot of hand-waving that goes on. "Even I've been saying all week, the SageMaker stuff, to me, is really interesting," Vellante said. "And they're like, 'Hey, we see AI and analytics coming together, because they're using the same data.' Then right into the weeds, and people eat it up." Last week, Intel Corp. CEO Pat Gelsinger "retired" amid foundry and growth challenges. Any longtime listener of the CUBE Podcast knows that this has long been a subject of conversation for Furrier and Vellante.
"Intel's been trying to market its way out of the problem, but you can't market your way out of the problem of math," Vellante said. Intel's problem started in 2012 when PC volumes peaked, according to Vellante. The real problem started when the company passed on the iPhone. "The reason why those two milestones are really important is because when PC volumes peaked, that set in motion ARM volumes overtaking PC volumes -- and they did," Vellante said. "In semiconductor manufacturing, volumes win. We've been saying this over and over and over. Everybody knows Moore's Law. We talk about Wright's Law. I'm not going to go into Wright's Law, but look it up. That's what the problem was." This week at AWS re:Invent, in addition to his unscheduled appearance, Jassy also sat down for an exclusive interview with theCUBE. It was the only media exclusive one-on-one with Jassy at the conference, according to Furrier. "We wanted to try to get at the heart of how he's thinking, why he came back, what his mood is, whether he's leaning in, and what he thinks about the next generation of cloud," Furrier said. The interview touched on the growth of the AWS business, which happened very quickly, according to Jassy. It took about eight or nine years to build an AWS annual run rate of about $4 billion. "If you look at AI, that'll happen in just a couple of years," Jassy said. "It's growing incredibly quickly, but the reality is they're very much connected. Because first of all, if you want to use AI, you have to have your data organized and architected in such a way that you can access it." It's nearly impossible to do AI from a mainframe, so companies need to have their infrastructure modernized in the cloud and their data accessible to run AI, according to Jassy. From there, there's more to consider. "The reality is that I think you've heard us talk for years about, every application is compute," he said. "Virtually every application is storage.
Almost every application is database and analytics and content. Another one of those core building blocks is going to be inference. Every application is going to have generative AI and inference infused in it, so it's very much a building block."
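Vellante's passing reference to Wright's Law can be made concrete. Wright's Law (the experience curve) holds that every doubling of cumulative production cuts unit cost by a roughly constant percentage, which is why volumes win in semiconductor manufacturing. A minimal sketch; the 20% learning rate below is an illustrative assumption, not a figure from the discussion:

```python
import math

def wrights_law_cost(first_unit_cost: float, cumulative_units: float,
                     learning_rate: float = 0.20) -> float:
    """Unit cost of the nth unit under Wright's Law: every doubling of
    cumulative production cuts unit cost by `learning_rate` (e.g. 20%)."""
    b = -math.log2(1.0 - learning_rate)  # experience exponent
    return first_unit_cost * cumulative_units ** (-b)

# With a 20% learning rate, unit cost falls to 80% at each doubling:
print(round(wrights_law_cost(100.0, 1), 2))  # 100.0
print(round(wrights_law_cost(100.0, 2), 2))  # 80.0
print(round(wrights_law_cost(100.0, 4), 2))  # 64.0
```

The producer with the most cumulative volume rides furthest down this curve, which is Vellante's point about ARM volumes overtaking PC volumes.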
[2]
At AWS re:Invent 2024, AI innovations fall across markets - SiliconANGLE
Attending AWS re:Invent 2024 was like watching a forest grow and decay at 10,000-times time-lapse speed. With each major breakthrough release falling, Amazon Web Services Inc. might crush a wide swath of products, including a few of its own, while sprouting even more new startups in its wake. Just pick out any one announcement such as AWS Bedrock Data Automation -- which could overshadow a few dozen intelligent document processing and business workflow solutions, at least in the view of AWS customers that aren't already automating that kind of work. Indeed, while AWS Bedrock was being infused as the new substrate for multiple artificial intelligence models, large language models, retrieval-augmented generation and so on that will someday run everything, the real progress will still be among vendors and enterprises making cloud infrastructure better support application innovation and operations. AI is just today's bleeding edge, driving the advancement of scalable and interoperable cloud architectures, massive resilient and secure data infrastructures, and development tools and technologies that are ready for change. "The ability to scale down cloud infrastructure is just as powerful as the ability to scale up, so we can focus our optimization where it needs to be," said Ed Peters, vice president and distinguished engineer at Capital One Financial Corp., a company that famously started its journey to cloud with AWS way back in 2016. Here are just a few of those disruptive AWS projects, partner vendors and startups I managed to catch up with in Las Vegas: A key takeaway -- beyond the advances in speeds and feeds in silicon technology, clustering and storage tiers -- was how AWS is striving for interoperability these days, in terms of more open-source involvement, integrations with other vendors, and tools to help companies get their arms around cost and complexity.
For me, the newly announced "Buy with AWS" marketplace release externalizes this massive ecosystem nicely. Just by embedding a widget, vendors no longer need to send their customers over to the main AWS Marketplace to consolidate purchases. There's no reason vendors can't curate their own marketplaces, offering tools that compete with some AWS offerings and leverage others. If chief information officers and architects build their shopping lists here, AWS still gets a cut of that IT budget, even if customers choose alternative management, observability and deployment tooling. Sumo Logic Inc. presented its new "Mo" generative AI assistant, built on foundation models in AWS Bedrock. With a natural language prompt, an engineer, developer or business process owner kicks off a "conversation" of multiple virtual site reliability engineering agents behind the scenes, scanning telemetry data and presenting anomalies with a complete service map of dependencies and root cause analysis. An automation service can then suggest the resulting agentic AI consensus of remediation steps that combine security information and event management or SIEM, security orchestration, automation and response or SOAR, and observability workflows. Vega Cloud Inc. was in Vegas hunting for "cloud whales" -- enterprises and managed services providers that might spend millions of dollars annually on cloud and data infrastructure services. After aggregating a custom FinOps data lake with usage and payment data, its service recommends alternative rate options and deployment footprints, including multi-vendor negotiations, availability scheduling actions and alternative solution packaging to meet scale requirements and service level agreements at lower costs.
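The core of the FinOps approach Vega Cloud describes -- aggregate usage records, then compare observed spend against alternative rate plans -- can be sketched in miniature. Everything below (plan names, rates, commitment terms) is hypothetical for illustration, not Vega's actual model or pricing:

```python
from dataclasses import dataclass

@dataclass
class RatePlan:
    name: str
    hourly_rate: float     # $ per instance-hour
    commitment_hours: int  # hours billed regardless of actual use

def best_plan(usage_hours: int, plans: list[RatePlan]) -> tuple[str, float]:
    """Pick the plan with the lowest effective cost for observed usage."""
    def cost(p: RatePlan) -> float:
        billable = max(usage_hours, p.commitment_hours)  # commitments bill their floor
        return billable * p.hourly_rate
    winner = min(plans, key=cost)
    return winner.name, cost(winner)

plans = [
    RatePlan("on-demand", 0.40, 0),
    RatePlan("1yr-commit", 0.25, 6000),  # hypothetical committed-use discount
]
print(best_plan(8000, plans))  # heavy usage: the commitment wins
print(best_plan(1000, plans))  # light usage: on-demand wins
```

Real recommendation engines layer in scheduling, multi-vendor rates and SLA constraints, but the comparison above is the kernel of the rate-option analysis.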
"AWS announcements on Bedrock, Q and infrastructure to support training and inference for other models have really pulled them into the generative AI conversation," said Puneet Gupta, chief executive of Amberflo.io Inc., which introduced new interdepartmental IT chargebacks into its application programming interface and AI metering and monetization solution here. "We're seeing huge tailwinds for usage-based pricing and metering. As customers incorporate gen AI into their application stack, they have to fundamentally rethink how they sell their products and services to their customers because of variable cost vectors." Remember when S3 buckets used to be considered sort of a catch-all "junk drawer" for cloud object storage? Now we are seeing huge advances from AWS through announcements such as S3 Tables and S3 Metadata, along with chipsets and networking, to transform those buckets into high-performance, queryable data lakes. Observability and security vendors are reacting with their own offerings, provisioning Iceberg-style data lakes, lower-cost/faster object stores in S3, and resilient storage and backups to power new AI-augmented search and telemetry. "The overall scale of this event is impressive, and we're having lots of conversations with folks trying to build AI-powered applications on top of existing data stores, or support new use cases," said Steve Kearns, general manager of search at Elastic N.V. "It's been interesting to see where people are in their adoption curve, their understanding, and their learning on this journey. The further they get down the path to production for building generative AI applications, the more nuanced their understanding of the capabilities they are going to need." ScyllaDB Inc. was there demonstrating its unique method of dynamically scaling its NoSQL-style database by sharding data "closer to the metal" across distributed instances for lower latency and cloud scaling costs, and quicker release of unused resources.
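Hash-based sharding of the kind ScyllaDB describes can be illustrated with a toy router. A stable hash of the partition key lets every client pick the same owning shard without coordination; this is only the core idea, and a simplification -- Scylla actually uses token rings and shard-aware drivers rather than simple modulo routing:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Route a partition key to a shard with a stable hash, so every
    client independently agrees on the owner."""
    digest = hashlib.sha256(key.encode()).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % num_shards

# A good hash spreads keys roughly evenly across shards:
shards = {i: [] for i in range(8)}
for key in (f"user:{n}" for n in range(1000)):
    shards[shard_for(key, 8)].append(key)
print(sorted(len(v) for v in shards.values()))
```

Even spread is what lets each shard (in Scylla's case, each CPU core) own its slice of data exclusively, avoiding cross-core locking and making it cheap to release unused capacity.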
With streaming event data volumes sometimes nearing daily petabyte scales, operators and security analysts now want to keep everything without bleeding their entire budget. ChaosSearch Inc. delivers a combined analytics, search and SQL query database platform atop low-priced S3 tiers geared toward optimizing ingest costs and real-time search cost-to-performance ratios when compared with leading SIEM and observability vendor data lake offerings. Backups can comprise as much as 10% to 20% of a company's total IT spend for management and storage. Eon Inc.'s immersive "time travel" expo booth demonstrated how its cloud backup management platform tags and retains records according to declarative content policies, retrieving snapshots or individual records with global search and disaster recovery capabilities across AWS and other cloud storage resources. "We're seeing a lot of interest here in moving enterprise data to hybrid cloud, with on-premise systems working alongside private and public cloud infrastructure," said Karthik Ranganathan, CEO of Yugabyte Inc. "Here at re:Invent, AWS just announced Aurora DSQL -- a PostgreSQL-compatible, multi-region, horizontally scalable database which validates our core thesis and the movement of bringing distributed data capabilities into the future." Never has the maxim of enabling people, process and technology, in that order, been more true than today. Developers and engineers need help mastering the many details of building and securing new cloud services with AI futures in mind without breaking their existing critical applications, or breaking the IT budget. Integrail Corp. offers a drag-and-drop platform for designing and delivering agentic AI workflows. Its low-code, process-oriented orchestration layer combines multiple AI agents, which may come from different sources, to complete complicated tasks.
For instance, you might have different agents recognizing the real-world context and source of a product image, writing a text description of it, and tagging and dropping it in an online catalog, so a natural-language LLM prompt can answer user questions about it. I've definitely heard of synthetic data for software testing and observability, but Gretel Labs Inc. was there with a novel approach for AI builders, creating synthetic datasets for training and fine-tuning AI models and LLMs. The product leverages different inference models to generate anonymized real-world data, or create datasets from scratch. I didn't expect to see a new rapid application development or RAD tool enter the market, but I visited Retool Inc. and found its modern approach to low-code React component app building and snap-together integrations with back-end Git-style source control, permissions, co-pilots, and deployment workflow capabilities might just fill the bill. Even skilled developers appreciate spending less time coding internal and external apps that are portable to cloud or local infrastructure. In a cloud development world where permissions often go unused and identities can lie dormant, Apono Inc. offers DevOps teams and engineers a cloud identity and access management platform that allows them to embed "access flow" permissions with just-in-time policy monitoring that dynamically validates least-privilege user access in the workflow context of the application. WorkOS Inc. offers an access management platform with self-service user activation, role-based access control and identity that seems specifically useful for business-to-business or software-as-a-service startups, with single sign-on, authentication and directory sync priced on connections and usage, instead of unpredictable per-user costs.
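Just-in-time, least-privilege access of the kind Apono describes boils down to grants that carry an expiry and are validated at the moment of use, rather than standing permissions that lie dormant. A minimal sketch; the class and method names here are hypothetical illustrations, not Apono's API:

```python
import time
from dataclasses import dataclass

@dataclass
class AccessGrant:
    user: str
    resource: str
    action: str
    expires_at: float  # epoch seconds; the grant is useless after this

class AccessFlow:
    def __init__(self) -> None:
        self._grants: list[AccessGrant] = []

    def grant(self, user: str, resource: str, action: str, ttl_s: float) -> None:
        """Issue a short-lived grant instead of a standing permission."""
        self._grants.append(AccessGrant(user, resource, action, time.time() + ttl_s))

    def allowed(self, user: str, resource: str, action: str) -> bool:
        """Validate least-privilege access at use time, pruning expired grants."""
        now = time.time()
        self._grants = [g for g in self._grants if g.expires_at > now]
        return any(g.user == user and g.resource == resource and g.action == action
                   for g in self._grants)

flow = AccessFlow()
flow.grant("alice", "prod-db", "read", ttl_s=900)  # 15-minute access window
print(flow.allowed("alice", "prod-db", "read"))    # True while the grant is live
print(flow.allowed("alice", "prod-db", "write"))   # False: action never granted
```

The design point is that nothing needs to revoke access: once the window closes, the permission simply stops validating.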
Since cloud-native development patterns must include a strategy for dealing with legacy technologies and code, I really enjoyed a conversation about modernization with Bill Platt, a general manager at AWS. AWS has a pragmatic approach to enterprise cloud enablement, including free advisory, training and an expansive system integrator partner program. Some customers are already using its Amazon Q agentic AI development assistant to document codebases and map out a plan for which dependencies to start on. "Development teams seek out the best tools, and whether they are choosing a language, or using an AI model to help them do their job, they want standardization and repeatability," said Sunil Mallya, co-founder and chief technology officer of Flip AI, formally Flip Technology Corp. "All LLMs have slightly different prompts and interpret instructions differently," he said. "The idea of developers freely choosing between AI models and orchestrating them is an illusion. Why create five different codebases to maintain across the team, when the models could be updated at any time?" My last AWS re:Invent coverage in 2019 was a tough slog -- I felt overwhelmed with features and overcrowded by every vendor doing anything related to software and infrastructure in general. Perhaps at the time, being surrounded by more than a hundred thousand people with an oncoming pandemic exacerbated my agoraphobia, but I swore never to return. Now, five years later, I approached with an open mind and found the environment offered much easier routes to understanding and collaboration. If AWS demonstrated one thing really well here, it's that it cares about the partners who are its channel to the world, and even more so, the developers and operators who are its end customers. With so many engineers being forced to return to the office or transform their own skill sets, let's hope that commitment to human ingenuity augmented by AI potential continues.
Or, maybe my bio-brain was digitized upon event entry, and this article is just another product of multimodal agentic AI trained on a RAG of my collected writings and the commentary of other autonomous agents having conversations with each other in a simulated re:Invent environment. Only only time will tell tell tell [glitch].... Jason English is a partner and principal analyst at Intellyx B.V., an analyst firm that advises enterprises on their digital transformation initiatives, and publishes the weekly Cortex and BrainCandy newsletters. He wrote this article for SiliconANGLE.
[3]
Andy Jassy returns to AWS re:Invent as Amazon beefs up its AI chops - SiliconANGLE
Amazon Web Services' re:Invent conference in Las Vegas dominated the news this week, as the cloud giant debuted credible new artificial intelligence models dubbed Nova, new Trainium AI chips and cloud instances, and numerous new additions to its Bedrock and SageMaker AI application platforms. Although a lot of new services are still in preview, all in all, AWS -- viewed as behind AI leaders such as Google and Microsoft with OpenAI -- looks to be closing the gap quickly. And if anyone doubted its AI resolve, Amazon CEO Andy Jassy (pictured) was the one to introduce the Nova models, signaling in no uncertain terms that Amazon intends to be among the major players in AI. John Furrier and Dave Vellante nabbed an exclusive interview on theCUBE: Do. Not. Miss. It. Is Intel heading for oblivion? Sure seems like it after CEO Pat Gelsinger "retired" (read: was canned) as the board realized his chip foundry ambitions weren't going to work as planned, and investors have bailed. It's hard to put too much blame on Gelsinger, who inherited a disaster from a board that dithered for decades with a series of ineffective CEOs who lost Intel's edge. It's not clear what happens next, but the future for the chipmaking pioneer doesn't look good -- and by extension the U.S.'s attempt to rebuild its domestic chip manufacturing base doesn't look good either. Enterprise software and hardware companies logged pretty good quarters overall, including Salesforce, Hewlett Packard Enterprise, Pure Storage and chipmaker Marvell. So did UiPath, but investors weren't as impressed. Bitcoin passed $100,000 on confidence in Trump, who likes crypto no doubt because he has his own crypto scheme and just appointed his donor David Sacks to be some kind of AI and crypto czar. I'm sure this will end well. Veeam raised a huge $2 billion round ahead of an initial public offering, lifting its valuation to $15 billion.
Next week, earnings are due from Oracle, MongoDB, C3 AI, Adobe and Broadcom. John and Dave will air their podcast theCUBE Pod Friday, and look for Dave's weekly deep dive Breaking Analysis this weekend. Here's all the news and analysis this week from SiliconANGLE and beyond:
John Furrier and Dave Vellante scored a coup with an interview with Amazon's CEO, a longtime fixture on theCUBE at re:Invent: Exclusive: Amazon CEO Andy Jassy reveals AWS' strategy for building the enterprise AI platform
And check out Dave's analysis on theCUBE Research of Jassy's portion of the marquee keynote: Reinventing AI: highlights from Andy Jassy's re:Invent keynote
AWS re:Invent 2024: CEO Matt Garman unveils the future of cloud with generative AI and agentic workflows
Dave Brown talks Trainium2: AWS' secret weapon for generative AI leadership
And for more stories and AWS executive interviews as well as all the news and analysis from re:Invent, check out our Special Report: AI and the Cloud, including pre-re:Invent interviews from the previous week.
Scaling up: At re:Invent, AWS focuses on enterprise AI with advances in Bedrock and launch of new models
Fast-moving and disruptive: Top AWS technologist outlines vision for AI's enterprise future
AWS looks to dominate AI and cloud infrastructure services: TheCUBE Research weighs in
Hybrid cloud computing, gen AI driving next wave of enterprise transformation
Amazon introduces Nova family of multimodal AI foundation models
And Simon Willison says they look pretty damn good.
AWS transforms Amazon SageMaker into a single platform for AI and data analytics
Amazon Bedrock gets better safeguards and the ability to orchestrate multiple AI agents
AWS details Project Rainier AI compute cluster with hundreds of thousands of chips
AWS upgrades Amazon Connect with new generative AI features
AWS enhances Q Developer AI assistant to reduce tedium, accelerate work
AWS expands Q Business gen AI assistant features and integrates with its business intelligence platform
Amazon Q unlocks new generative AI capabilities for business users
Amazon SageMaker HyperPod cooks up recipes and flexible training plans to accelerate AI development
Amazon Bedrock's new marketplace kicks off with more than 100 AI models
Exploring Databricks' vision for an AI-driven data intelligence future
How AWS drives practical AI adoption through enterprise collaboration
AWS details materials science collaboration with Orbital Materials
Nvidia and AWS team up to accelerate AI deployments in the cloud
New Relic integrates with Amazon Q Business AI assistant to make observability understandable
PagerDuty teams up with AWS to enhance incident management with generative AI integrations
AI and DevOps: Transforming incident management in the age of complexity
Poolside brings its powerful generative AI coding models to Amazon
Sumo Logic brings generative AI to DevSecOps observability
JPMorgan leverages scalable AI for enhanced risk management and operational innovation
Breaking cloud barriers: Changing multicloud strategies redefining the future of computing
New physical AWS Data Transfer Terminals let you upload to the cloud faster
OpenAI debuts ChatGPT Pro plan with reasoning-optimized o1 pro mode LLM
OpenAI to host 12-day product announcement series with new reasoning model, Sora expected
Elon Musk's xAI raises $6B to scale up GPU infrastructure
Google DeepMind's latest AI models: super-accurate weather forecasting and playable 3D worlds
MLCommons releases new AILuminate benchmark for measuring AI model safety
SurePath AI launches Discovery service to enhance enterprise generative AI oversight
Elon Musk tries to block OpenAI's for-profit transition in latest legal tussle
Google expands Vertex AI with video generator AI model Veo
Couchbase introduces Capella AI services for AI agent design and deployment
Clarifai's new AI compute orchestration platform spans every cloud and private data center environment
Kore.ai looks to boost business workers' productivity with AI agents for information retrieval and more
Database company Weaviate speeds up AI development with flexible vector embeddings service
Enterpret raises $20.8M to interpret customer feedback everywhere with AI models
Who is David Mayer? ChatGPT faces scrutiny over censorship of public figures
There's even more AI and big data news on SiliconANGLE
The big enterprise news of the week:
Intel CEO Pat Gelsinger retires amid foundry, growth challenges
Intel's CEO-shortlist candidates include former board member Lip-Bu Tan, sources say (per Reuters)
AWS unveils next-gen Trainium3 custom AI chips and cloud Trainium2 instances
AWS expands Amazon S3 with features to support Apache Iceberg and metadata management
AWS introduces Amazon Aurora DSQL and DynamoDB enhancements for multi-region workloads
Embracing 'simplexity:' Amazon CTO Werner Vogels offers key principles for managing the complex IT world
AWS integrates liquid cooling and simplifies electrical distribution to lower data center power consumption
Oracle database on AWS cloud enters limited preview
Red Hat expands AWS alliance with additional Ansible and OpenShift offerings
SurrealDB's Surreal Cloud debuts on AWS S3 for scalable multimodel DBaaS
Meta to invest $10 billion in Louisiana AI data center
AI data center builder Nebius raises $700M in funding
AI chip startup Tenstorrent raises $693M, deepening rivalry with Nvidia
CDW strengthens cloud offerings through Mission Cloud acquisition
Cloud software firm ServiceTitan seeks to raise up to $502M in initial public offering
Tractian nabs $120M to reduce industrial equipment downtime
Biden administration hits out at China's chip industry with export controls
Arista redefines the network stack
Microsoft faces £1B UK lawsuit over software licensing fees
Broadcom introduces 3.5D XDSiP chip packaging for data center processors
Salesforce reports mixed third-quarter results, but stock jumps on upbeat forecast
Shares of Pure Storage jump on impressive earnings beat and strong guidance
Marvell tops quarterly expectations thanks to strong ASIC, interconnect demand
Rising AI server demand drives another strong earnings and revenue beat for HPE
Shares of UiPath sink despite solid earnings and revenue beat
Synopsys shares decline as fiscal 2025 outlook falls below analyst expectations
SaaS providers Asana, Smartsheet, Docusign and Domo issue quarterly earnings, with varied results
IoT cloud platform provider Samsara sees revenue growth, shares slide on guidance
We have more news on cloud, infrastructure and apps
AWS expands cybersecurity portfolio with AWS Security Incident Response
Rubrik expands cybersecurity and AI solutions with new AWS and Amazon Bedrock capabilities
Balbix expands cybersecurity offerings with three AI-powered tools on AWS Marketplace
US urges use of encrypted messaging apps following Salt Typhoon hack
Vaultree open-sources its technology for computation on encrypted data
Skyhawk Security introduces interactive cloud threat detection for enhanced zero-trust security
New Thales solution integrates Imperva technologies to enhance data risk management
Sweet Security introduces unified Cloud Native Detection and Response platform
Bitdefender enhances GravityZone XDR with new sensor for SaaS application security
$2B secondary funding gives Veeam $15B valuation on path to IPO
Cloud security firm Upwind nabs $100M funding at $900M valuation
SentinelOne shares fall 13% as loss expands despite revenue beat
Okta shares surge over 15% on strong revenue and earnings beats
Rubrik shares soar 18% on strong earnings results and upbeat outlook
Zscaler's in-line revenue guidance underwhelms investors and shares fall
Threat detection startup Tuskira launches with $28.5M from investors including Intel
System Two Security raises $7M to automate detection engineering with AI
More cybersecurity news here
[4]
AWS re:Invent 2024: Builder ethos embraces simplicity - SiliconANGLE
AWS re:Invent 2024 this past week ushered in a transformative chapter for both Amazon Web Services Inc. and the broader tech ecosystem. This year's event marked the debut of Matt Garman as chief executive of AWS, stepping into a role that aligns with what John Furrier aptly describes as a "wartime CEO" -- a technically adept leader and trusted consigliere. Garman's keynote set the tone for AWS' strategic focus: doubling down on core infrastructure capabilities across silicon, compute, storage and networking, while extending its Graviton chip playbook to graphics processing units and potentially large language models. Key highlights included the unveiling of six new large language models, formerly codenamed "Olympus" and now named Nova, introduced by none other than Andy Jassy, the godfather of cloud and now CEO of Amazon.com Inc., who made a cameo appearance with a substantive keynote. Jassy's return underscored Amazon's commitment to innovation at large scale, bridging its heritage in infrastructure with the future of AI. A notable emphasis on simplification was palpable at this year's event. Though the company remains steadfast in offering primitives, or granular services, and service flexibility, it's now streamlining AI, analytics and data workflows into a unified view. This vision materialized with advancements in SageMaker, positioning it as a hub for machine learning, analytics and AI. Additionally, the event showcased a deluge of new features and services across AWS' expansive ecosystem. In this Breaking Analysis, we explore the critical takeaways from re:Invent 2024. Leveraging insights from Enterprise Technology Research data, we'll unpack AWS' strategy, the implications for the broader ecosystem, and how we believe the next several years will unfold. One of the recurring themes in AWS' journey has been its "primitives for builders" ethos -- a foundational philosophy shaped by Amazon Chief Technology Officer Werner Vogels.
At re:Invent 2021, this philosophy came under the spotlight as AWS navigated a market shift from tech-first early adopters to mainstream corporate developers who demand simplicity and pre-integrated solutions. During the event, Vogels articulated the essence of AWS' approach: Primitives provide unparalleled choice and power, enabling builders to craft virtually anything. However, he also acknowledged a critical tradeoff: As primitives proliferate without being inherently designed to fit together, they introduce complexity. This bottom-up approach starkly contrasts with the top-down strategy employed by competitors such as Microsoft Corp., which prioritizes simplification and abstracts away the complexity of underlying primitives. During his tenure, former AWS CEO Adam Selipsky attempted to address this tension by introducing a simplification narrative, beginning at re:Invent 2021. However, these efforts are akin to what Bill Gates said many years ago: When you have an architecture or a business model that you try to shift radically, it's like kicking dead whales down the beach. In other words, it's a monumental shift that takes time to execute fully. Now, under Garman's leadership, AWS is positioned to rationalize these seemingly competing philosophies, leveraging its traditional strengths while addressing the complexity concerns of mainstream enterprises. AWS' challenge remains clear: Integrating its extensive portfolio of primitives into cohesive solutions without sacrificing the flexibility that has long defined its identity. As the market continues to demand simplification, AWS must reconcile these competing forces to maintain its leadership. AWS, under its new CEO Garman, is reasserting its dominance in core infrastructure while embracing the simplification imperative, for certain parts of its estate.
This marks a significant evolution, blending AWS' traditional ethos of "primitives for builders" with a shift toward pre-integrated, simplified solutions tailored for corporate developers. One of the standout strategies involves leveraging Nitro and applying the AWS Graviton playbook -- originally devised to lower x86 compute costs -- to GPUs and perhaps even LLMs. This year, AWS introduced six new LLMs called Nova, along with enhancements in SageMaker. These announcements underscore AWS' commitment to lowering the cost of AI training and inference while reducing its reliance on Nvidia Corp.'s expensive GPUs. With Graviton, Trainium and Inferentia silicon offerings, AWS is making strides toward a more independent and cost-efficient AI strategy. This approach has also allowed AWS to get on the generative AI training and inference learning curve at lower cost, using its own infrastructure (silicon and networking). We'll come back to this point later in the post, but we see this as a significant competitive differentiator. The irony of this shift is that AWS' original "primitives-first" ethos left the perception that it was lagging behind in LLM infrastructure and, like others, was forced to rely on Nvidia. However, for the past two years, AWS has been able to gain critical learnings by evolving its AI infrastructure, using its lower-cost chips. Moreover, Garman's leadership is now reaping the benefits of predecessor Selipsky's simplification groundwork. AWS' renewed focus on user-friendly, integrated solutions positions it to compete more effectively in the evolving enterprise market, in our view. As AWS evolves, the simplification and choice positioning will remain central to its strategy. We'll continue monitoring how this balance shapes AWS' trajectory in AI, analytics and infrastructure. Garman brings a deeply technical background, having led EC2 for years.
His leadership reflects AWS' strategy of combining technical depth with a pragmatic approach to infrastructure. His tenure marks a shift from Selipsky's operational focus, positioning AWS to capitalize on its extensive systems expertise. As the industry increasingly relies on AI-driven innovation, AWS is reasserting its dominance in silicon, networking and infrastructure, aiming to reduce dependency on Nvidia's GPUs for AI training and inference. This strategy stands in contrast to Microsoft's approach, which has been limited by GPU constraints and a reliance on Nvidia's supply chain. AWS' silicon strategy changed dramatically when it started working with Annapurna Labs and eventually acquired the firm in early 2015. Its silicon roadmap, including Graviton, Trainium and Inferentia, is focused on lowering costs and optimizing performance for AI workloads. The result? AWS is able to position itself aggressively as a leader in scalable, cost-effective AI infrastructure. Microsoft got started much later in its silicon journey and as such is more exposed to supplies of Nvidia GPUs. Google LLC, like Amazon, has been on a custom silicon journey for more than a decade with its tensor processing unit or TPU, first deployed around the 2015 timeframe. Subsequently, Google has rolled out other silicon innovations and has a leading position in the space. AWS's Nitro system is a core differentiator. The company is able to deliver high-performance, Ethernet-based networking infrastructure optimized for AI workloads, avoiding reliance on Nvidia's InfiniBand infrastructure. Nitro exemplifies AWS' ability to integrate hardware and software into cohesive systems, and to a certain degree, replicate Nvidia's systems-level approach to AI. Though Nvidia continues to lead in training at large scale -- leveraging software beyond CUDA to optimize massive clusters -- AWS is catching up by porting its entire Model Garden to Trainium-based clusters. 
Anthropic PBC, with AWS as its primary cloud sponsor, represents a pivotal partnership. Though we believe much of the $8 billion investment in Anthropic includes cloud credits, the collaboration showcases AWS' ability to support advanced models such as Claude 3.5 on Inferentia. This capability is a key step toward cracking Nvidia's moat around inference, which has historically been a bottleneck for cloud providers. In short, despite some naysayers, we believe the Anthropic investment is working for both companies. The gen AI revolution is reshaping every phase of the software development lifecycle. AWS, once perceived as a laggard in AI, now appears well-positioned to compete at large scale. By leveraging Nitro and its silicon expertise, AWS is aligning its infrastructure to support gen AI workloads effectively, even as it plays catch-up in areas such as large-scale training. However, AWS' systems expertise gives it a critical edge. While Microsoft struggles with GPU constraints and Google continues to lead in AI functionality, AWS is building a roadmap to independence. By optimizing its silicon and networking capabilities, AWS is on track to challenge Nvidia's dominance and expand its AI offerings within its own ecosystem. AWS' focus on silicon and systems integration signals a long-term strategy to redefine AI infrastructure. Though Nvidia remains the leader in large-scale training, AWS' accelerated progress in silicon and networking could erode that dominance, at least within the AWS installed base. With its Model Garden and advanced development tools maturing, AWS is proving that even lower-cost infrastructure can drive meaningful innovation. Garman's leadership will be pivotal in navigating this transition. AWS is no longer just following Nvidia's footsteps; it is leveraging its systems expertise to chart a path forward that could reshape the competitive landscape of AI infrastructure. 
At re:Invent, the introduction of the Nova family of gen AI models marked a significant milestone for AWS, signaling its intention to build leadership in the foundation model space. Jassy's return to the stage to unveil these models was both a symbolic and strategic move, emphasizing AWS' commitment to innovation and reinforcing its cultural and technical legacy. The Nova family, much like Graviton's positioning against x86 chips, is designed to provide customers with optionality and competitive price-performance.

AWS indicated to theCUBE Research that Nova models will not be exclusively tied to any AWS services, maintaining its ethos of flexibility and customer choice. This, however, could also reflect an internal bias against Nova, which comes out of Amazon.com's Alexa team. Our understanding is that at least two internal teams competed for LLM leadership, and the Nova team, judged to be further along, won the resource battle. Nonetheless, the roadmap and benchmarks for these models demonstrate Amazon's ambition to compete head-on with industry leaders such as OpenAI, Google and Anthropic in gen AI, again providing potentially lower-cost solutions for customers.

AWS' entry into the foundation model game can be considered a strategic necessity. Just as the acquisition of Annapurna Labs laid the groundwork for in-house silicon development (such as Nitro and Graviton), the Nova family provides AWS with critical skills in gen AI, ensuring it remains competitive in a rapidly evolving market. This approach also positions AWS to offer differentiated solutions while gradually reducing its dependence on third parties for LLMs and training infrastructure. The fine-tuning of Nova models on Trainium represents a deliberate effort by AWS to control costs and build independence in its AI strategy.
As noted, though pretraining on Nvidia hardware remains standard due to its compute intensity, fine-tuning and inference -- less resource-intensive phases -- are being transitioned to AWS' silicon. This shift highlights AWS' focus on cost-effectiveness, contrasting sharply with the high costs associated with training competitors' models. For instance, Google's Gemini Ultra reportedly cost nearly $200 million to train, underscoring the financial burden of building large-scale gen AI models.

AWS appears committed to a long-term strategy of innovation in gen AI. While still leveraging Nvidia for pretraining, its investments in Trainium and other in-house silicon solutions position it to gradually reduce reliance on external vendors. The Nova models are an essential step in building the technical and operational capabilities required to compete in an increasingly AI-driven world. With a clear roadmap, competitive pricing and a focus on optionality, AWS is signaling that it is not just participating in the gen AI race but is determined to shape its trajectory.

AWS continues to push forward in the data and analytics space, balancing its core ethos of powerful primitives with the industry's demand for simplification. At re:Invent, Amazon unveiled significant updates to SageMaker, signaling its ambition to redefine the data and analytics landscape. Historically, AWS' data platform struggled to compete as a cohesive solution. The updates position SageMaker as a robust, unified platform, aimed at simplifying workflows for data and analytics professionals while preserving flexibility for developers.

Summary of Snowflake's take on the AWS announcements: Snowflake underscored its growing partnership with AWS, citing 68% year-over-year growth in bookings and recognition with nine AWS re:Invent 2024 awards, including "Global Data and Analytics ISV Partner of the Year."
The collaboration spans integrations like Snowpark for Python with SageMaker and joint work on Apache Iceberg to enhance data interoperability, reduce complexity and mitigate vendor lock-in. Snowflake's unified platform approach, supported by its Datavolo acquisition, accelerates AI application development and maintains robust governance through its Horizon Catalog. Highlighting AWS SageMaker's recent improvements and its shared commitment to Apache Iceberg, Snowflake announced a multiyear partnership with Anthropic to integrate Claude 3.5 models into its Cortex AI platform on AWS and expressed plans to deepen collaboration in analytics and AI innovation.

Our take on Snowflake's response: It's no surprise the company would put forth an optimistic stance. Moreover, Snowflake and AWS (and Databricks) have shown this is not a zero-sum game. At the same time, we continue to believe Snowflake's primary competitive advantage is its integrated experience and the quality of its core database. As we've previously discussed, the value is shifting up the stack toward the governance catalog, which is becoming more open. This, in our view, will push Snowflake in new directions, which its response acknowledges.

From the Databricks perspective, we believe what AWS announced validates much of what Databricks CEO Ali Ghodsi has been espousing: Open formats will win, and customers want to bring any compute to any data source. As AWS moves more toward a unified data platform, metadata emerges as the key value layer. This harmonization or semantic layer transforms technical metadata into business-relevant information -- people, places, things and their relationships. Skepticism remains about AWS' ability to execute at this level of integration. Industry experts have previously doubted Amazon's capability to unify its platform effectively. However, the reimagined SageMaker, with its unified storage and metadata layers, represents a meaningful step forward.
The updates will intensify competition with Snowflake, Databricks and others in the battle for enterprise data dominance. As AWS builds out its platform, market confusion could grow as enterprises weigh the benefits of SageMaker versus other well-established data platforms. The new and improved SageMaker signals AWS' commitment to delivering a unified platform that simplifies workflows without sacrificing power. While competitors like Snowflake and Databricks have led the charge in unifying tools and storage, AWS is rapidly closing the gap by converging data, analytics and AI capabilities under one umbrella. AWS' roadmap points toward greater integration, including knowledge graph capabilities and semantic layers. By bridging its infrastructure expertise with application-centric design, AWS is positioned to redefine how enterprises harness the power of their data. This evolution will determine whether AWS can shift from a fragmented toolkit provider to a leader in unified data platforms. The stakes are high because we believe ultimate agentic architectures won't scale without this harmonization capability. We believe Amazon is laying the foundation with SageMaker and AWS is prepared to compete at the highest levels of the data ecosystem. AWS continues to bridge its core ethos of "primitives for builders" with the need for usability and integration. A standout example of this strategy is Q Developer, a suite aimed at transforming the software development lifecycle by introducing agents that automate noncoding tasks such as documentation, unit testing and code reviews. This represents a major step forward in simplifying cloud and AI development, making these capabilities more accessible to developers and enterprises alike. At the same time, AWS is pursuing legacy modernization efforts, targeting workloads like mainframes, Windows/.NET, and VMware, with varying degrees of success and challenges. 
Though these efforts showcase AWS' ambition, the real headline is the growing ecosystem around gen AI-powered development agents, which AWS is integrating into its services, signaling a new era of productivity and accessibility for developers. As cloud evolves, so too does the role of developers. While they have long been the "kingmakers" in cloud, the rise of gen AI and low-code tools signals a shift in focus toward data-centric application development.

In this context, the interplay between gen AI agents and harmonized metadata is central to this usability revolution. By integrating operational and technical metadata, AWS is creating a framework where agents can navigate seamlessly across datasets and services. This approach aligns with the broader vision of an AI-driven development ecosystem.

AWS' Q Developer and its integration of gen AI agents signal a paradigm shift in how cloud development is approached. By combining the power of primitives with enhanced usability, AWS is paving the way for a more accessible and productive development environment. This effort positions AWS to compete not just with traditional cloud competitors but also with platforms that emphasize low-code, AI-driven workflows. As the market evolves, the battle will increasingly focus on usability and integration, where AWS, Microsoft, Salesforce and others compete to define the next generation of software development. AWS' success will hinge on its ability to harmonize its deep infrastructure expertise with the growing demand for developer and enterprise simplicity.

Q Business represents a nascent attempt to bring "Copilot"-like functionality to the enterprise, echoing Microsoft 365 Copilot's promise. This early iteration from AWS seeks to unify collaboration and business data under a single index, enabling LLM queries.
The potential benefits are substantial but, in our view, will take more time to materialize. Q Business also faces significant hurdles, and as it continues to evolve, addressing these challenges will be crucial to realizing its vision of streamlined information access and automation.

The chart below provides a classic view into spending momentum and market penetration within the machine learning and AI ecosystem. Using ETR data from 1,775 enterprise information technology decision-makers, the vertical axis reflects Net Score, a measure of spending momentum -- essentially the percentage of customers increasing their spending on a platform. The horizontal axis, labeled as Overlap, represents penetration within the surveyed accounts. Together, these metrics highlight both platform momentum and adoption across the ML/AI market. The red line at 40% on the vertical axis represents a highly elevated Net Score.

This snapshot of the ML/AI sector highlights an evolving competitive landscape where foundational model players such as OpenAI, Meta Llama and Anthropic are redefining the space. Data-centric platforms such as Databricks and Snowflake are integrating AI to remain competitive, while legacy players such as IBM and Oracle are fighting to stay relevant. AWS' Nova models may move the needle, but it's likely Anthropic will continue to be the dominant platform in the ETR data. AWS will, in our view, continue to exert influence through Bedrock and its other partnerships, and its optionality strategy is playing out as planned. As enterprises increasingly view AI as foundational to their strategies, the momentum and penetration of these platforms will shape the next wave of innovation in the ML/AI ecosystem.

The evolving software stack (shown below) is pushing beyond traditional silos of infrastructure, platform services and software-as-a-service applications.
We believe the imperative for the next decade is the creation of a harmonization layer -- a transformative new abstraction that orchestrates end-to-end business processes, aligning activities such as demand forecasting, internal operations and supplier coordination into a seamless whole. We describe this vision as a "virtual assembly line of expertise," which seeks to achieve for managerial and collaborative work what the physical assembly line did for manufacturing a century ago: Optimize throughput, eliminate inefficiencies and enable expertise to flow seamlessly across the organization. This new harmonization layer represents a seismic shift in enterprise IT. By extending the concepts of databases and enterprise resource planning systems (for example, Oracle and SAP SE) across the entire data and application estate, this layer creates a predictive, interconnected framework that transcends traditional silos. The emergence of the harmonization layer is nothing short of transformative, redefining how enterprises align their processes and leverage expertise. However, this evolution is still in its early stages, with significant challenges in governance, agent integration and metadata harmonization. As AWS, Microsoft and startups such as Glean, Celonis and Kubiya push the boundaries of this sector, the journey will define the next generation of application platforms. Though the payoff may take a decade or more, the potential to revolutionize enterprise productivity and collaboration is undeniable. What do you think? Did you catch the action at AWS re:Invent 2024? What caught your attention that we didn't touch upon? Are there things in this research note you disagree with or to which you can add context?
[5]
How AWS is leaning into AI with chips, code, models
Cloud colossus reckons it can clarify hallucinations, get your apps off Microsoft's OS at pleasing speed

re:Invent Amazon Web Services CEO Matt Garman opened the cloud computing service's 13th annual re:Invent conference by observing: "Generative AI has the potential to disrupt every single industry out there." While that may be worrisome for those defending against disruption, it's an appealing apocalypse for AWS, not just on its own Arm-based servers, but also for its x86 fleet, and AI-oriented silicon services, too. It'll happily sell you the tools for that disruption. "In 2018, we saw a trend in compute," recalled Garman. "We were looking out there, and we saw that Arm cores were getting faster. Most of them were in mobile, but they were getting more powerful. And we had this idea that there's this opportunity that maybe we could go combine that technology curve with our knowledge of what's most important to customers running inside of AWS and develop a custom general purpose processor." Today, that processor - the Graviton - is widely used by almost every AWS customer through EC2 instances in one way or another. "Graviton delivers 40 percent better price performance than x86," claimed Garman. "It uses 60 percent less energy." In 2019, AWS was a $35 billion business, said the AWS chief, adding: "Today, AWS has more Graviton capacity than we had total capacity in 2019." According to Dave Brown, VP of AWS Compute, over the last two years, more than 50 percent of AWS's additional CPU capacity has been Graviton-based. Coincidentally, Intel, creator of the x86 architecture, is looking for a new CEO. The biggest compute problem currently, Garman argued, involves generative AI. Graviton is aimed at general computation, but AWS has more AI-focused offerings. "Now, the vast majority of generative AI workloads today run on Nvidia GPUs, and AWS is by far the best place anywhere in the world to run GPU workloads," boasted Garman.
"Part of the reason is because AWS and Nvidia have been collaborating together for 14 years to ensure that we're really great at operating and running GPU workloads." AWS is doubling down on that partnership, Garman declared, with plans to launch instances next year featuring Nvidia's forthcoming Blackwell chips. Garman claimed those "P6" instances "will give you up to 2.5 times faster compute than the current generation of GPUs." More immediately, Garman said Trainium2 (Trn2), the second coming of AWS's AI training chip, has reached general availability. The Amazon-designed silicon delivers 30 to 40 percent better price performance than current GPU systems, he claimed. "These are purpose-built for the demanding workloads of cutting-edge generative AI training and inference," said Garman. There will also be EC2 Trn2 UltraServers, which consist of four Trn2 instances tied together using the company's NeuronLink interconnect for a total of 64 Trainium2 chips capable of up to 83.2 FP8 petaflops. According to Garman, Databricks, an AWS partner, expects to use Trainium2 to reduce its total cost of ownership (TCO) for joint customers by up to 30 percent. Benoit Dupin, senior director of machine learning and AI at Apple, made a brief appearance to plug Apple Intelligence and - rather unusually for the inwardly focused iGiant - Apple's relationship with AWS. "We're in the early stages of evaluating Trainium 2, and we expect our model numbers to gain up to 50 percent improvement in efficiency in pre-training," said Dupin. "With AWS, we found that working closely together and taking advantage of the latest technologies has helped us be more efficient in the cloud. AWS expertise, guidance, and services have been instrumental in supporting our scale and growth."
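As a quick sanity check on those UltraServer figures, the quoted numbers imply about 1.3 FP8 petaflops per Trainium2 chip. A short back-of-the-envelope in Python (the 4-instance, 64-chip, 83.2-petaflop figures come from the keynote; the per-chip value is derived, not quoted by AWS):

```python
# Back-of-the-envelope for the EC2 Trn2 UltraServer specs quoted above.
# 4 Trn2 instances x 16 Trainium2 chips each = 64 chips per UltraServer.
instances_per_ultraserver = 4
chips_per_instance = 16
total_chips = instances_per_ultraserver * chips_per_instance

total_fp8_pflops = 83.2  # keynote figure for one UltraServer
pflops_per_chip = total_fp8_pflops / total_chips  # derived per-chip figure

print(total_chips, round(pflops_per_chip, 2))  # 64 1.3
```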
We can't recall the last time an Apple exec showed up at a public event and gushed about a tech supplier - particularly in a way that highlights the sort of external technical dependency that the iBiz typically tries to avoid.

Buckets of cloudy upgrades

Also at the conf, Amazon announced its S3 cloud storage service has a new flavor: S3 Tables, which promises 3x faster query performance, and 10x more transactions per second for Apache Iceberg tables. There's also S3 Metadata in preview, for managing metadata in S3. Amazon Aurora, the cloud-based MySQL/PostgreSQL relational database management system, is celebrating its ten-year anniversary, and has a new sibling called Aurora DSQL. The "D" in this instance stands for distributed. The associated service manages to provide distributed data storage with low latency, which for those knowledgeable in the ways of databases and networks represents a bit of a technical achievement. "What we did is we added a hardware reference clock in every single EC2 instance all around the world," explained Garman. "And those hardware reference clocks sync with satellite-connected atomic clocks. So that means that every EC2 instance now has microsecond-precision accurate time that's in sync with any instance anywhere in the world." Aurora DSQL provides distributed storage with low-latency reads and writes, as well as strong consistency across multiple regions. According to Garman, it does so 4 times faster than Google Spanner - a comparable distributed database management service - given a 10-statement SQL transaction. Amazon DynamoDB (NoSQL) tables now also support multi-region strong consistency, as a preview capability.

Of course there's AI news, too

Amazon Bedrock, AWS's managed AI model service, also received some attention. Amazon Bedrock Model Distillation is a new service, in preview, to reduce the size and cost of AI models. "What model distillation does is you take this large frontier model," explained Garman.
"In this example, it's a Llama 405B model. You take this highly capable model, and you send it all your prompts and all the questions that you might want to ask it. Then you take all of the data and the answers that come out of that and together with the questions, you use that to train a smaller model, in this case a Llama 8B model, to be an expert at that one particular thing." The result is a smaller, faster model that's more affordable - up to 500 percent faster and up to 75 percent less expensive than the original model, according to Amazon, with an accuracy loss of less than two percent on use cases like Retrieval Augmented Generation (RAG). Bedrock has also gained another preview capability: multi-agent collaboration. It allows customers to design agents for specific tasks that can run in parallel or in sequence under a supervisory agent. Garman described beta testing work done with financial service provider Moody's to develop a proof-of-concept application capable of generating financial risk reports for its customers. "Before this proof of concept, this was a workflow that would take one of their agents or one of their employees about a week to do," said Garman. "They ran this proof of concept for this multi-agent collaboration. They were able to accomplish the same task in one hour, and the ability to seamlessly scale it across any number of companies in parallel."

Seeing things

Perhaps most impressive of all, AWS claims to have an answer to one of generative AI's most pernicious problems: hallucination. Bedrock Guardrails has gained a new capability in preview called Automated Reasoning Checks. "Automated Reasoning Checks prevent factual errors due to model hallucinations," explained Garman. "So, when you implement one of these automated reasoning checks, what happens is Bedrock can actually check that the factual statements made by models are accurate."
The idea is that this will allow businesses to deploy AI for customer-facing applications where responding correctly is critical. As an example, Garman described how an insurance company would want to be sure that an AI help system could correctly answer a customer query about whether or not insurance coverage applied to a particular situation.

Model building

Andy Jassy, CEO of Amazon and former AWS boss, took a turn on stage to announce Amazon Nova - a set of foundation models. They include Nova Micro (text-based), Nova Lite (multimodal), Nova Pro (multimodal), and Nova Premiere (multimodal, Q1 2025), as well as Nova Canvas (image generation) and Nova Reel (video generation). They are, by Amazon's assessment, "at least 75 percent less expensive than the best performing models in their respective intelligence classes in Amazon Bedrock." An Amazon Nova speech-to-speech model is planned for 2025, along with an any-to-any modality model. Amazon developed the Nova models itself, without the involvement of Anthropic, AWS's preferred AI partner and recipient of some $7 billion of investment. The Nova model card and AI Service cards attest to various responsible AI tests and security evaluations that have been conducted, but offer little in the way of safety comparisons with other models or current vulnerabilities. That all said, AWS teased Project Rainier, a supercomputing cluster said to contain "hundreds of thousands" of Trainium2 chips, due to power up in 2025, which Anthropic will use for building models.

New toys for coders

Software developers were not forgotten in all this. Amazon's Q coding assistant has been bestowed with appealing new capabilities such as generating unit tests, documentation, and performing code reviews. "We think that these agents can materially reduce a lot of the time that's spent on these really important, but maybe undifferentiated tasks, and allow your developers to spend more time on those value-added activities," claimed Garman.
Customers would love an easy button to get off of Windows

The artificially intelligent Q Developer has become more adept at code migration - it can convert Windows-only .NET Framework applications to modern .NET Core (which can be hosted on Linux) far faster than was previously the case, can transform workloads built to run under VMware to cloud native architectures, and can even assist with the conversion of mainframe applications. "Customers would love an easy button to get off of Windows," observed Garman. "They're tired of constant security issues, the constant patching, all the scalability challenges that they have to deal with. And they definitely hate the onerous licensing costs." Garman estimated that Windows refugees can save as much as 40 percent. He said Signaturit - a European digital transaction firm - worked with an early beta of Q Developer to move from Windows .NET applications to Linux. He boasted: "A project that they estimated was gonna take six to eight months, they actually completed in just a few days." More detail on the goodies in Q Developer can be found at DevClass. Amazon's summary of all the stuff it's teased and talked up at re:Invent so far can be found here, for those who want to dive into the individual announcements. We'll have more coverage this week, including what's going on with SageMaker. ®
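The model distillation recipe Garman describes above (send your prompts to a large teacher model, collect its answers, then train a small student on the resulting question-answer pairs) can be sketched generically in Python. Everything below is a toy stand-in for illustration, not the Bedrock Model Distillation API: the "teacher" and "student" here are trivial objects standing in for a frontier model such as Llama 405B and a small model such as Llama 8B.

```python
# Generic teacher-student distillation sketch. Toy stand-ins only: a real
# pipeline would call a large frontier model and fine-tune a smaller one
# on the collected (prompt, answer) pairs.

def teacher_model(prompt: str) -> str:
    """Stand-in for the large, highly capable frontier model."""
    return prompt.upper()  # pretend "expert" behavior

def build_distillation_set(prompts: list[str]) -> list[tuple[str, str]]:
    """Send the application's prompts to the teacher; keep (prompt, answer) pairs."""
    return [(p, teacher_model(p)) for p in prompts]

class StudentModel:
    """Stand-in for the smaller model trained to mimic the teacher on one task."""
    def __init__(self) -> None:
        self._memory: dict[str, str] = {}

    def fine_tune(self, pairs: list[tuple[str, str]]) -> None:
        self._memory.update(pairs)  # real training would update model weights

    def answer(self, prompt: str) -> str:
        return self._memory.get(prompt, "out of scope")

pairs = build_distillation_set(["what does my policy cover"])
student = StudentModel()
student.fine_tune(pairs)
print(student.answer("what does my policy cover"))  # WHAT DOES MY POLICY COVER
```

The design point is the narrowing of scope: the student is only ever expected to be good at the distribution of prompts it was distilled on, which is why the result is cheaper and faster while staying accurate on that one task.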
[6]
Amazon's AI-powered cloud services are beating the competition - SiliconANGLE
Amazon Web Services Inc.'s annual re:Invent conference was flooded with announcements this week as the company increases its dominance in artificial intelligence models, data analytics and AI-powered cloud services. The introduction of the Nova suite of AI multimodal models "makes Microsoft Ignite look like a picnic," according to theCUBE Research's John Furrier. Other announcements include enhancements for Amazon Bedrock and the data analytics platform, SageMaker, and the general availability of Trainium2-powered Elastic Compute Cloud for machine learning training and inference. "I've been covering Amazon for years, and one of the things I've been noticing is the growth of building these platforms and having these tech stacks to help with these modernization efforts," said Paul Nashawaty (pictured), analyst at theCUBE Research. "Amazon is delivering. They're putting together these unified packages that don't offer this bag of bits anymore, and they're offering a single way of delivering ... this tech stack." Nashawaty spoke with Furrier for theCUBE's "Cloud AWS re:Invent Coverage," during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. They discussed AI-powered cloud services and how Amazon is tackling the cloud infrastructure problem and the future of multicloud. (* Disclosure below.) Organizations tend to use multiple clouds, according to Nashawaty, so AWS' efforts are aimed at unifying the high-performance, multicloud infrastructure with a developer-friendly interface. In particular, SageMaker could enable an "invisible" cloud architecture. "Ninety-four percent of organizations use two or more clouds," Nashawaty said. "Sixty-five percent use four or more clouds ... if they're using these multicloud or distributed cloud environments, that means that there needs to be a way to use that tech stack across these different cloud environments.
So, the [Cloud Native Computing Foundation] and the projects that are being put in place, all that's doing is amplifying that ability to use that scale up architecture, but making the cloud underneath it invisible." CNCF's open-source projects will all support a unified infrastructure of federated clouds, Nashawaty believes. The other pillar of Amazon's new releases is, of course, AI, which he predicts will revolutionize application development. Nashawaty emphasized that developers will still have jobs, but warns that companies will need to welcome AI into the tech stack or face extinction. "In our research, we see that 67% of organizations are hiring generalists over specialists, which is [to say] they're pushing back to organizations like Amazon," he said. "We also see in the same study, of more than 800 respondents, that 24% of organizations want to release code on an hourly basis, yet only 8% can do so." Re:Invent tends to focus more on developers, according to Nashawaty, who believes that AWS needs to put more of an emphasis on business value to let companies know what it is actually offering. "There's very much a developer-centric view here, the practitioner view here," he said. "They're missing a lot of that business value messaging. That's a big takeaway."
[7]
Nine notable innovations from AWS CEO Matt Garman's re:Invent keynote - SiliconANGLE
Amazon Web Services Inc. Chief Executive Matt Garman delivered a three-hour keynote at the company's annual re:Invent conference to an audience of 60,000 attendees in Las Vegas and another 400,000 watching online, and they heard a lot of news from the new leader, who became CEO earlier this year after joining the company in 2006. The conference, dedicated to builders and developers, offered 1,900 in-person sessions and featured 3,500 speakers. Many of the sessions were led by customers, partners and AWS experts. In his keynote, Garman (pictured) announced a litany of advancements designed to make developers' work easier and more productive. Here are nine key innovations he shared: Garman kicked off his presentation by announcing the general availability of the company's latest Trainium chip -- Trainium2 -- along with EC2 Trn2 instances. He described these as the most powerful instances for generative artificial intelligence thanks to custom processors built in-house by AWS. He said Trainium2 delivers 30% to 40% better price performance than current graphics processing unit-powered instances. "These are purpose-built for the demanding workloads of cutting-edge gen AI training and inference," Garman said. Trainium2 gives customers "more choices as they think about the perfect instance for the workload they're working on." Beta tests showed "impressive early results," according to Garman. He said the organizations that did the testing -- Adobe Inc., Databricks Inc. and Qualcomm Inc. -- all expect the new chips and instances will deliver better results and a lower total cost of ownership. He said some customers expect to save 30% to 40% over the cost of alternatives. "Qualcomm will use the new chips to deliver AI systems that can train in the cloud and then deploy at the edge," he said. When the announcement was made, many media outlets painted Trn2 as Amazon looking to go to war with Nvidia Corp.
I asked Garman about this in the analyst Q&A, and he emphatically said that was not the case. The goal with its own silicon is to make the overall AI silicon pie bigger where everyone wins. This is how Amazon approaches the processor industry, and clickbait headlines aside, there is no reason to assume it will change how it handles partners. More Nvidia workloads are run in the AWS cloud, and I don't see that changing. Today's models have become very big and very fast, with hundreds of billions to trillions of parameters. That makes them too big to fit on a single server. To address that, AWS announced EC2 Trainium2 UltraServers. These connect four Trainium2 instances -- 64 Trainium2 chips -- all interconnected by high-speed, low-latency NeuronLink connectivity. This gives customers a single ultranode with over 83 petaflops of compute power from a single compute node. Garman said this will have a "massive impact on latency and performance." It enables very large models to be loaded into a single node to deliver much better latency and performance without having to break it up across multiple nodes. Garman said Trainium3 chips will be available in 2025 to keep up with gen AI's evolving needs and provide the capacity customers need for their inference workloads. Garman said AWS is the easiest, most cost-effective way for customers to use Nvidia's Blackwell architecture. AWS announced a new P6 family of instances based on Blackwell. Coming in early 2025, the new instances featuring Nvidia's latest GPUs will deliver up to 2.5 times faster compute than the current generation of GPUs. AWS's collaboration with Nvidia has led to significant advancements in running generative AI workloads. Bedrock gives customers model choice: It's not one model to rule them all but a single source for a wide range of models, including AWS' newly announced Nova models. There won't be a divide between applications and gen AI applications.
Gen AI will be part of every application, using inference to enhance, build or change an application. Garman said Bedrock resonates with customers because it provides everything they need to integrate gen AI into production applications, not just proofs of concept. He said customers are starting to see real impact from this. Genentech Inc., a leading biotech and pharmaceutical company, wanted to accelerate drug discovery and development by using scientific data and AI to rapidly identify and target new medicines and biomarkers for their trials. Finding all this data required scientists to scour many external and internal sources. Using Bedrock, Genentech devised a gen AI system so scientists can ask detailed questions about the data. The system can identify the appropriate databases and papers from a huge library and synthesize the insights and data sources. It summarizes where it gets the information and cites the sources, which is incredibly important so scientists can do their work. It used to take Genentech scientists many weeks to do one of these lookups. Now, it can be done in minutes. According to Garman, Genentech expects to automate five years of manual efforts and deliver new medications more quickly. "Leading ISVs, like Salesforce, SAP, and Workday, are integrating Bedrock deep into their customer experiences to deliver GenAI applications," he said. Garman said AWS is making it easier for companies to take a large, highly capable frontier model and send it all their prompts for the questions they want to ask. "Then you take all of the data and the answers that come out of that, and you use that output and your questions to train a smaller model to be an expert at one particular thing," he explained. "So, you get a smaller, faster model that knows the right way to answer one particular set of questions. This works quite well to deliver an expert model but requires machine learning involvement. 
You have to manage all of the data workflows and training data. You have to tune model parameters and think about model weights. It's pretty challenging. That's where model distillation in Bedrock comes into play. Distilled models can run 500% faster and 75% more cheaply than the model from which they were distilled. This is a massive difference, and Bedrock does it for you," he said. This difference in cost can turn the ROI of a gen AI application around, from being too expensive to roll out in production to being very valuable. You send Bedrock sample prompts from your application, and it does all of the work. But getting the right model is just the first step. "The real value in generative AI applications is when you bring your enterprise data together with the smart model. That's when you get really differentiated and interesting results that matter to your customers. Your data and your IP really make the difference," Garman said. AWS has expanded Bedrock's support for a wide range of formats and added new vector databases, such as OpenSearch and Pinecone. Bedrock enables users to get the right model, accommodates an organization's enterprise data, and sets boundaries for what applications can do and what the responses look like. Bedrock Guardrails make it easy to define the safety of applications and implement responsible AI checks. "These are guides to your models," said Garman. "You only want your gen AI applications to talk about the relevant topics. Let's say, for instance, you have an insurance application, and customers come and ask about various insurance products you have. You're happy to have it answer questions about policy, but you don't want it to answer questions about politics or give healthcare advice, right? You want these guardrails saying, 'I only want you to answer questions in this area.'" This is a huge capability for developing production applications, Garman said. "This is why Bedrock is so popular," he explained.
"Last year, lots of companies were building POCs for gen AI applications, and capabilities like Guardrails were less critical. It was OK to have models 'do cool things.' But when you integrate gen AI deeply into your enterprise applications, you must have many of these capabilities as you move to production applications." Garman said AWS wants to help developers innovate and free them from undifferentiated heavy lifting so they can focus on the creative things that "make what you're building unique." Gen AI is a huge accelerator of this capability. It allows developers to focus on those pieces and push off some of that undifferentiated heavy lifting. Q Developer, which debuted in 2023, is the developers' "AWS expert." It's the "most capable gen AI assistant for software development," he said. Q Developer helped Datapel Systems "achieve up to 70% efficiency improvements. They reduced the time needed to deploy new features, completed tasks faster, and minimized repetitive actions," Garman said. But it's about more than efficiency. The Financial Industry Regulatory Authority (FINRA) has seen a 20% improvement in code quality and integrity by using Q Developer to help it create better-performing and more secure software. Amazon Q has the "highest reported acceptance rate of any multi-line coding assistant in the market," said Garman. However, a coding assistant is just a tiny part of what most developers need. AWS research shows that developers spend just one hour a day coding. They spend the rest of the time on other end-to-end development tasks. According to Garman, autonomous agents for generating user tests, documentation and code reviews are now generally available. The first enables Amazon Q to generate end-to-end user tests automatically. It leverages advanced agents and knowledge of the entire project to provide developers with full test coverage. The second can automatically create accurate documentation.
"It doesn't just do this for new code," Garman said. "The Q agent can apply to legacy code as well. So, if a code base wasn't perfectly documented, Q can understand what that code is doing." The third new Q agent can perform automatic code reviews. It will "scan for vulnerabilities, flag suspicious coding patterns, and even identify potential open-source package risks that might be present," said Garman. It will identify where it sees a deployment risk and suggest mitigations to make deployment safer. "We think these agents can materially reduce a lot of the time spent on really important, but maybe undifferentiated tasks and allow developers to spend more time on value-added activities," he said. Garman also announced a new "deep integration between Q Developer and GitLab." Q Developer functionality is now deeply embedded in GitLab's platform. "This will help power many of the popular aspects of their Duo Assistant," he said. Teams can access Q Developer capabilities, which will be natively available in GitLab workflows. Garman said more will be added over time. Another new Q Developer capability is mainframe modernization, which Garman called "by far the most difficult to migrate to the cloud." Q Transformation for Mainframe offers several agents that can help organizations streamline this complex and often overwhelming workflow. "It can do code analysis, planning, and refactor applications," he said. "Most mainframe code is not very well-documented. People have millions of lines of COBOL code, and they have no idea what it does. Q can take that legacy code and build real-time documentation that lets you know what it does. It helps you know which applications you want to modernize." Garman said it's not yet possible to make mainframe migration a "one-click process," but with Q, instead of a multiyear effort, it can be a "multiquarter process."
Garman introduced the next generation of Amazon SageMaker, which he called "the center for all your data, analytics and AI needs." He said AWS is expanding SageMaker by adding "the most comprehensive set of data, analytics, and AI tools." SageMaker scales up analytics and now provides "everything you need for fast analytics, data processing, search, data prep, AI model development and generative AI" in a single view of your enterprise data. He also introduced SageMaker Unified Studio, "a single data and AI development environment that allows you to access all the data in your organization and act on it with the best tool for the job." Garman said SageMaker Unified Studio, which is currently in preview, "consolidates the functionality that analysts and data scientists use across a wide range of standalone studios in AWS today." It offers standalone query editors and a variety of visual tools, such as EMR, Glue, Redshift, Bedrock and all the existing SageMaker Studio capabilities. Even with all these new and upgraded products, solutions and capabilities, Garman promised more to come.
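Garman's insurance example for Bedrock Guardrails — answer policy questions, refuse politics and healthcare — comes down to attaching a named guardrail to each model invocation. The sketch below only assembles the request payload and makes no AWS call; the field names (`guardrailConfig`, `guardrailIdentifier`, `guardrailVersion`) follow the Bedrock Converse API as I understand it and should be checked against current AWS documentation, and the model and guardrail IDs are placeholders.

```python
# Build a Converse-style Bedrock request with a guardrail attached.
# No network call is made; this only shows the request shape.

def build_converse_request(model_id, guardrail_id, guardrail_version, user_text):
    """Assemble a request that restricts the model to approved topics."""
    return {
        "modelId": model_id,
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
        # The guardrail limits what the model will discuss -- e.g. insurance
        # policy questions, but not politics or healthcare advice.
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
        },
    }

request = build_converse_request(
    model_id="amazon.nova-lite-v1:0",           # placeholder model ID
    guardrail_id="insurance-topics-guardrail",  # placeholder guardrail ID
    guardrail_version="1",
    user_text="What does my policy cover for flood damage?",
)
print(sorted(request))  # -> ['guardrailConfig', 'messages', 'modelId']
```

In a real application, this payload would be passed to the Bedrock runtime client, and the same guardrail identifier could be reused across every entry point so the topic policy is enforced uniformly.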
[8]
The biggest news from Amazon re:Invent 2024
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Cloud computing leader Amazon Web Services' (AWS) annual re:Invent conference for 2024 is taking place this week in Las Vegas, Nevada, and it's shaping up to be the biggest of the series since it launched 12 years ago. Why? Generative AI, of course, and the increasing competition between tech giants and startups to offer useful tools to enterprises -- AWS's bread and butter. VentureBeat's senior AI reporter Emilia David is reporting directly from the conference and is joined remotely by the rest of us covering the most important news for business leaders and those looking to embrace and deploy the latest, most useful AWS technology. Here's the biggest news we've found from the show so far: AWS Brings Multi-Agent Orchestration to Bedrock: AWS has introduced multi-agent orchestration to its Bedrock platform, allowing enterprises to build collaborative AI agents and streamlined workflows. This upgrade enables companies like Moody's to achieve more accurate analyses by coordinating specialized agents for complex tasks. AWS Bedrock Upgrades to Add Model Teaching, Hallucination Detection: New features on Amazon Bedrock include Model Distillation for training smaller, faster AI models and Automated Reasoning Checks to reduce hallucinations. These tools aim to improve response accuracy and enable enterprises to create tailored models for specific needs. AWS SageMaker Transforms Into a Combined Data and AI Hub: AWS unveiled the next generation of SageMaker, integrating analytics and ML tools into a unified platform. The upgrades, including Lakehouse and Unified Studio capabilities, allow enterprises to seamlessly link data from various sources for faster AI app development.
Amazon Launches Nova AI Model Family for Generating Text, Images, and Video: Amazon debuted the Nova family of generative AI models at re:Invent 2024, targeting text, image, and video creation. The Nova models, integrated with Bedrock, offer businesses customizable tools for creative content development and advanced AI applications. This year's AWS announcements highlight the company's efforts to empower enterprises with advanced AI, data analytics, and generative tools. Explore these innovations to stay ahead in the AI race.
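The multi-agent orchestration described above — a supervisor coordinating specialized agents on a complex task — can be illustrated with a toy sketch. This is a conceptual stand-in, not the Bedrock Agents API; the agent names, routing keys, and plan format are all invented for the example.

```python
# Toy multi-agent orchestration: a supervisor routes each sub-task to the
# specialist agent that can handle it, then collects the results in order.

def risk_agent(task):
    # Stand-in for an agent specialized in risk analysis.
    return f"risk assessment for {task}"

def summary_agent(task):
    # Stand-in for an agent specialized in summarization.
    return f"summary of {task}"

# Registry mapping a capability name to the agent that provides it.
SPECIALISTS = {"risk": risk_agent, "summarize": summary_agent}

def supervisor(plan):
    """Dispatch each (capability, task) step to the matching specialist."""
    results = []
    for capability, task in plan:
        agent = SPECIALISTS[capability]
        results.append(agent(task))
    return results

report = supervisor([
    ("risk", "Q3 bond portfolio"),
    ("summarize", "analyst notes"),
])
print(report)
```

The design point is the separation of concerns: each agent stays narrow and testable, while the supervisor owns the plan, which mirrors how a coordinating agent can delegate to domain specialists in production systems.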
[9]
Everything announced at AWS re:Invent 2024 you might have missed
With AWS re:Invent underway, the company has already unveiled a whole host of news, updates, upgrades and other announcements. Given everyone's busy schedules, you may have missed some of the more niche or interesting announcements - but don't worry, we've got you covered. If you want to track all the news and updates as they happen - follow our AWS re:Invent 2024 live blog for the breaking news - and don't worry, we'll keep this piece updated with anything else we spot that we don't have time for a full article on! Our special guest in the day one keynote was a familiar face for AWS fans - former CEO, and now Amazon boss Andy Jassy. He revealed Nova, a new family of foundation models, which it hopes will push Amazon to the forefront of the AI arms race. We've covered the image and video generation models Canvas and Reel in our news story - but Nova is actually four tiers of model, available on Amazon Bedrock now. The full line-up is Amazon Nova Micro (a very fast, text-to-text model); and Amazon Nova Lite, Amazon Nova Pro, and Amazon Nova Premier (multi-modal models that can process text, images, and videos to generate text). Amazon's analytics platform SageMaker has long been one of the company's biggest success stories, and it got a host of upgrades at re:Invent 2024. AWS CEO Matt Garman unveiled "a new generation of SageMaker", which the company says will bring together the capabilities customers need for fast SQL analytics, petabyte-scale big data processing, data exploration and integration, model development and training, and generative artificial intelligence into one integrated platform. Another part of this new generation was SageMaker Unified Studio - a new, single environment to access all of your organization's data, and act on it in new ways. The last bit of the SageMaker triple-header was SageMaker Lakehouse - a new way to simplify analytics and AI with an open, unified and secure data lakehouse - providing easy access right across all of your data.
AI is becoming an indispensable tool for developers everywhere, and Amazon Q for Developers is getting a host of new tools. This includes generating end-to-end user tests, accurate documentation, and code reviews - all of which should cut down on developer workloads immensely. Q Developer is also getting transformation support for .NET apps, a great help for modernizing legacy Windows applications, allowing a quick way to transform apps from Windows to Linux. There's also a new elastic service, moving VMware subscriptions to AWS, but also a new tool allowing transforming VMware workloads to cloud native architectures. Trying to track down all the information you need from your company's data can be tricky, especially if you're a big organization with multiple data silos. To make this less painful, AWS revealed Amazon Q Business integration with Amazon QuickSight, bringing together structured and unstructured data out of their silos. Amazon Q Business is also now offering more than 50 new actions for some of the most popular third-party apps it uses, like creating a task in Asana or sending a private message in Teams. There's even a new capability that uses agents to execute complex workflows, helping employees automate both simple and complex tasks.
Amazon Web Services (AWS) made significant AI-related announcements at its re:Invent 2024 conference, including new AI models, chips, and enhancements to existing services, signaling a strong push into the AI market.
At the AWS re:Invent 2024 conference, Amazon Web Services (AWS) made a series of significant announcements, demonstrating its commitment to artificial intelligence (AI) and cloud computing. Amazon CEO Andy Jassy made a surprise appearance to introduce the Nova family of AI models, signaling the company's determination to compete in the AI space [1][3].
AWS unveiled Trainium2, the second generation of its AI training chip, which has now reached general availability. According to AWS CEO Matt Garman, Trainium2 offers 30 to 40 percent better price performance than current GPU systems [5]. The company also announced plans to launch new EC2 instances featuring NVIDIA's upcoming Blackwell chips, promising up to 2.5 times faster compute than the current generation of GPUs [5].
The conference saw numerous updates to AWS's existing services:
Amazon Bedrock, AWS's managed AI model service, introduced Model Distillation, a preview capability to reduce the size and cost of AI models [5].
S3, AWS's cloud storage service, now offers S3 Tables for faster query performance and S3 Metadata for improved metadata management [5].
Amazon Aurora, the cloud-based relational database management system, introduced Aurora DSQL for distributed data storage with low latency [5].
DynamoDB now supports multi-region strong consistency in preview mode [5].
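The Model Distillation idea noted in the Bedrock update above — query a large "teacher" model and use its answers to train a smaller, cheaper "student" specialized in one task — can be sketched as a toy. Both models here are pure-Python stand-ins invented for illustration; the real Bedrock feature automates this workflow against actual foundation models.

```python
# Toy distillation: the teacher's answers become the student's training set,
# producing a narrow model that only handles the questions it was trained on.

def teacher(prompt):
    # Stand-in for a large frontier model answering domain questions.
    return prompt.lower().rstrip("?").strip() + " -> answer"

def distill(prompts):
    """Build the student's training data by collecting teacher outputs."""
    return {p: teacher(p) for p in prompts}

class StudentModel:
    """A small 'expert' that only knows its distilled question set."""

    def __init__(self, training_data):
        self.memory = training_data

    def answer(self, prompt):
        # In scope: reproduce the teacher's answer. Out of scope: refuse.
        return self.memory.get(prompt, "out of scope")

student = StudentModel(distill(["What is my deductible?"]))
print(student.answer("What is my deductible?"))
```

The economics follow from the structure: the expensive teacher is consulted once per training prompt, while the cheap student serves all repeat production traffic, which is the cost and latency gap the distillation feature targets.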
AWS is working to simplify its offerings while maintaining its "primitives for builders" philosophy. The company is streamlining AI, analytics, and data workflows into a unified view, with advancements in SageMaker positioning it as a hub for machine learning, analytics, and AI [4].
AWS emphasized its ongoing collaboration with NVIDIA for GPU workloads [5]. Interestingly, Apple's senior director of machine learning and AI, Benoit Dupin, made an appearance to discuss Apple's use of AWS services, including plans to evaluate Trainium2 [5].
While AWS has been perceived as lagging behind in AI compared to competitors like Google and Microsoft, the announcements at re:Invent 2024 suggest that the company is rapidly closing the gap [3]. The introduction of Nova models and enhancements to Bedrock and SageMaker demonstrate AWS's commitment to becoming a major player in the AI market.
The innovations presented at re:Invent 2024 have the potential to disrupt various industries. AWS's advancements in AI chips and models could lead to more efficient and cost-effective AI solutions for businesses across sectors [1][2]. The company's focus on simplifying complex technologies may also make AI more accessible to a broader range of enterprises.
As AWS continues to evolve its AI offerings and infrastructure, it is positioning itself as a comprehensive platform for enterprises looking to leverage AI technologies. The coming years will likely see increased competition in the cloud AI space, with AWS's latest announcements setting the stage for significant market shifts.
© 2025 TheOutpost.AI All rights reserved