2 Sources
[1]
The modern data stack was built for humans asking questions. Google just rebuilt it for agents taking action.
Enterprise data stacks were built for humans running scheduled queries. As AI agents increasingly act autonomously on behalf of businesses around the clock, that architecture is breaking down -- and vendors are racing to rebuild it. Google's answer, announced at Cloud Next on Wednesday, is the Agentic Data Cloud. The architecture has three pillars:

* Knowledge Catalog. Automates semantic metadata curation, inferring business logic from query logs without manual data steward intervention.
* Cross-cloud lakehouse. Lets BigQuery query Iceberg tables on AWS S3 via a private network with no egress fees.
* Data Agent Kit. Drops MCP tools into VS Code, Claude Code and Gemini CLI so data engineers describe outcomes rather than write pipelines.

"The data architecture has to change now," Andi Gutmans, VP and GM of Data Cloud at Google Cloud, told VentureBeat. "We're moving from human scale to agent scale."

From system of intelligence to system of action

The core premise behind Agentic Data Cloud is that enterprises are moving from human-scale to agent-scale operations. Historically, data platforms have been optimized for reporting, dashboarding, and some forecasting -- what Google characterizes as "reactive intelligence." In that model, humans interpret data and decide what to do. Now, with AI agents increasingly expected to take actions directly on behalf of the business, Gutmans argued that data platforms must evolve into systems of action.

"We need to make sure that all of enterprise data can be activated with AI, that includes both structured and unstructured data," Gutmans said. "We need to make sure that there's the right level of trust, which also means it's not just about getting access to the data, but really understanding the data."

The Knowledge Catalog is Google's answer to that problem. It is an evolution of Dataplex, Google's existing data governance product, with a materially different architecture underneath.
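As a rough illustration of what "inferring business logic from query logs" can mean in practice, the toy sketch below scans a hypothetical query log for expressions that analysts repeatedly alias the same way and proposes them as glossary entries. The log contents, the support threshold and the parsing regex are all invented for illustration; Google has not published how the Knowledge Catalog actually does this.

```python
import re
from collections import Counter

# Hypothetical query log; in a real system this would come from the
# warehouse's job history, not a hard-coded list.
QUERY_LOG = [
    "SELECT (revenue - cogs) / revenue AS gross_margin FROM finance.orders",
    "SELECT region, (revenue - cogs) / revenue AS gross_margin FROM finance.orders GROUP BY region",
    "SELECT customer_id, SUM(revenue) AS total_revenue FROM finance.orders GROUP BY customer_id",
]

def infer_glossary(queries, min_support=2):
    """Propose glossary entries for expressions analysts alias repeatedly."""
    seen = Counter()
    # Crude pattern: an expression chain followed by "AS <alias>".
    pattern = re.compile(r"([^\s,]+(?:\s*[-+/*]\s*[^\s,]+)*)\s+AS\s+(\w+)", re.IGNORECASE)
    for q in queries:
        for expr, alias in pattern.findall(q):
            seen[(alias.lower(), expr)] += 1
    # Only expressions seen often enough become candidate business terms.
    return {alias: expr for (alias, expr), n in seen.items() if n >= min_support}

print(infer_glossary(QUERY_LOG))
# → {'gross_margin': '(revenue - cogs) / revenue'}
```

The point of the sketch is the shape of the idea, not the mechanics: repeated analyst behavior becomes candidate semantics, which is what lets the catalog scale past what human data stewards can curate by hand.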
Where traditional data catalogs required data stewards to manually label tables, define business terms and build glossaries, the Knowledge Catalog automates that process using agents. The practical implication for data engineering teams is that the Knowledge Catalog scales to the full data estate, not just the curated subset that a small team of data stewards can maintain by hand. The catalog covers BigQuery, Spanner, AlloyDB and Cloud SQL natively, and federates with third-party catalogs including Collibra, Atlan and Datahub. Zero-copy federation extends semantic context from SaaS applications including SAP, Salesforce Data360, ServiceNow and Workday without requiring data movement.

Google's lakehouse goes cross-cloud

Google has had a data lakehouse, BigLake, since 2022. Initially it was limited to Google's own data, but in recent years it gained limited federation capabilities that let enterprises query data in other locations. Gutmans explained that the previous federation worked through query APIs, which limited the features and optimizations BigQuery could bring to bear on external data. The new approach is storage-based sharing via the open Apache Iceberg format. Whether the data sits in Amazon S3 or in Google Cloud, he argued, it makes no difference. "This truly means we can bring all the goodness and all the AI capabilities to those third-party data sets," he said.

The practical result is that BigQuery can query Iceberg tables sitting on Amazon S3 via Google's Cross-Cloud Interconnect, a dedicated private networking layer, with no egress fees and price-performance Google says is comparable to native AWS warehouses. All BigQuery AI functions run against that cross-cloud data without modification. Bidirectional federation in preview extends to Databricks Unity Catalog on S3, Snowflake Polaris and the AWS Glue Data Catalog using the open Iceberg REST Catalog standard.
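The difference between query-API federation and storage-based sharing can be caricatured in a few lines. In this toy model (every data structure is invented), the storage-based path lets the local engine prune files using Iceberg-style per-file column statistics, while the query-API path treats the remote side as a black box:

```python
# Stand-in for Iceberg data files on a remote object store, each carrying
# min/max column statistics the way Iceberg manifests do.
REMOTE_FILES = {
    "part-0": {"min_year": 2020, "max_year": 2021, "rows": [(2020, "a"), (2021, "b")]},
    "part-1": {"min_year": 2022, "max_year": 2023, "rows": [(2022, "c"), (2023, "d")]},
}

def query_api_federation(year):
    """Remote engine executes the query; the local engine only sees results
    and cannot apply its own file-level optimizations."""
    scanned = sum(len(f["rows"]) for f in REMOTE_FILES.values())
    rows = [r for f in REMOTE_FILES.values() for r in f["rows"] if r[0] == year]
    return rows, scanned

def storage_federation(year):
    """Local engine reads files directly and prunes with file statistics."""
    rows, scanned = [], 0
    for f in REMOTE_FILES.values():
        if f["min_year"] <= year <= f["max_year"]:  # skip files the predicate rules out
            scanned += len(f["rows"])
            rows.extend(r for r in f["rows"] if r[0] == year)
    return rows, scanned

print(storage_federation(2023))   # same answer, half the rows scanned
print(query_api_federation(2023))
```

Real Iceberg manifests do carry per-file column bounds that enable exactly this kind of pruning, which is the sense in which storage-level access lets an engine like BigQuery bring its own optimizations to external data.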
From writing pipelines to describing outcomes

The Knowledge Catalog and cross-cloud lakehouse solve the data access and context problems. The third pillar addresses what happens when a data engineer actually sits down to build something with all of it.

The Data Agent Kit ships as a portable set of skills, MCP tools and IDE extensions that drop into VS Code, Claude Code, Gemini CLI and Codex. It does not introduce a new interface. The architectural shift it enables is a move from what Gutmans called a "prescriptive copilot experience" to intent-driven engineering. Rather than writing a Spark pipeline to move data from source A to destination B, a data engineer describes the outcome -- a cleaned dataset ready for model training, a transformation that enforces a governance rule -- and the agent selects whether to use BigQuery, the Lightning Engine for Apache Spark or Spanner to execute it, then generates production-ready code.

"Customers are kind of sick of building their own pipelines," Gutmans said. "They're truly more in the review kind of mode, than they are in the writing the code mode."

Where Google and its rivals diverge

The premise that agents require semantic context, not just data access, is shared across the market. Databricks has Unity Catalog, which provides governance and a semantic layer across its lakehouse. Snowflake has Cortex, its AI and semantic layer offering. Microsoft Fabric includes a semantic model layer built for business intelligence and, increasingly, agent grounding. The dispute is not over whether semantics matter -- everyone agrees they do. The dispute is over who builds and maintains them.

"Our goal is just to get all the semantics you can get," Gutmans explained, noting that Google will federate with third-party semantic models rather than require customers to start over.
Google is also positioning openness as a differentiator, with bidirectional federation into Databricks Unity Catalog and Snowflake Polaris via the open Iceberg REST Catalog standard.

What this means for enterprises

Google's argument -- and one echoed across the data infrastructure market -- is that enterprises are behind on three fronts:

* Semantic context is becoming infrastructure. If your data catalog is still manually curated, it will not scale to agent workloads -- and Gutmans argues that gap will only widen as agent query volumes increase.
* Cross-cloud egress costs are a hidden tax on agentic AI. Storage-based federation via open Iceberg standards is emerging as the architectural answer across Google, Databricks and Snowflake. Enterprises locked into proprietary federation approaches should be stress-testing those costs at agent-scale query volumes.
* The pipeline-writing era is ending, Gutmans argues. Data engineers who move toward outcome-based orchestration now will have a significant head start.
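The outcome-driven flow Gutmans describes for the Data Agent Kit can be sketched as a toy planner that maps a described outcome to one of the engines named in the article. The routing heuristics below are entirely invented; the real agent's selection logic is not public.

```python
def plan(intent: dict) -> str:
    """Pick an execution engine for a described outcome (toy heuristics).

    The engine names come from the article; the decision rules are
    hypothetical stand-ins for whatever the real agent does.
    """
    if intent.get("workload") == "transactional":
        return "Spanner"  # operational, row-oriented work
    if intent.get("format") == "files" or intent.get("size_tb", 0) > 100:
        return "Lightning Engine for Apache Spark"  # large file-based transforms
    return "BigQuery"  # default for SQL-shaped analytical work

print(plan({"outcome": "cleaned dataset for model training", "size_tb": 250}))
# → Lightning Engine for Apache Spark
```

The engineer's artifact is the `intent` dict (the described outcome); the generated pipeline code is what they review rather than write, which is the "review mode" Gutmans describes.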
[2]
Google delivers connective tissue for autonomous AI agents to access data without restrictions - SiliconANGLE
Google Cloud is turning the traditional enterprise data platform on its head, unveiling the Agentic Data Cloud infrastructure platform that aims to act as a kind of central nerve center for the era of artificial intelligence agents.

In a blog post, Andi Gutmans, Google's vice president and general manager of Data Cloud, explains that existing data infrastructures were designed to act as "static repositories," where information just sits until a human asks it a question. But in the era of AI, this kind of "human-scale" infrastructure is no longer fit for purpose. To that end, Google has designed the Agentic Data Cloud to work as a "system of action" that evolves data infrastructure into a dynamic reasoning engine, enabling autonomous agents to get to work rather than just think about the problems they're trying to solve.

Announced at Google Cloud Next 2026 this week in Las Vegas, the Agentic Data Cloud will provide the connective tissue AI agents need to work across the enterprise without hindrance. It's built on three main pillars: a universal context engine that aims to prevent agents from "hallucinating," a suite of agentic-first developer tools, and a cross-cloud lakehouse platform that unifies data from any cloud environment.

According to Gutmans, one of the biggest hurdles in deploying AI agents today is the so-called "context gap." If an agent doesn't understand a company's specific definition of something like "gross margin," it's probably going to end up making expensive mistakes. To fix this, Google has evolved its Dataplex Universal Catalog into the Knowledge Catalog, a kind of map of business meaning that's meant to inform AI agents of the peculiarities of the organization they serve.
The catalog scans all of a company's documents, including its accounts, PDFs, PowerPoint presentations and images, extracting entities and studying the relationships within them to build a navigable schema that agents can use. Also helping here are BigQuery Measures and a new LookML Agent, which bake business logic into the entire Agentic Data Cloud stack. By aggregating all of these metrics into a single, governed data foundation, Google says that when an AI agent queries company data, it will use the same "source of truth" each time. This new context engine is already powering Google's new Deep Research Agent, enabling it to perform multistep reasoning across web assets and internal documents to create complex research reports that would take human analysts weeks.

The lives of developers are being made easier, too. The company announced a new Google Cloud Data Agent Kit that brings "agentic skills" directly into the tools developers already use, including platforms such as Claude Code and VS Code. With the Data Agent Kit, developer environments can autonomously orchestrate outcomes, including selecting frameworks such as Apache Spark or dbt, while generating production-ready code based on Google's best practices.

Three new, highly specialized AI agents were also announced to make life easier for developers: a Data Engineering agent for building and governing complex data transformations, a Data Science agent for automating AI model lifecycles across BigQuery and Spark, and a Database Observability agent that acts like a "guardian," tasked with diagnosing and repairing data infrastructure issues. Gutmans said Google has embraced the Model Context Protocol to ensure these agents play nicely with one another.
"[It] provides a secure, universal interface that allows any agent to safely discover and use your data assets across our core engines, including: BigQuery, Spanner (Preview), AlloyDB, Cloud SQL (GA) and Looker MCP (Preview)," he said. "MCP for Google Cloud uses our security stack, governing agent interactions based on your existing IAM policies, VPC Service Controls, and data residency requirements."

Finally, Google is trying to address the problem of AI agent "gravity." This refers to how agents lose their autonomy when they're slowed down by cross-cloud latency or prevented from accessing data trapped in other cloud platforms. Gutmans introduced the new cross-cloud lakehouse, which aims to provide a borderless data environment for AI agents. It integrates Google's Cross-Cloud Interconnect service directly into the data plane and employs the Apache Iceberg REST catalog to connect to the Amazon Web Services and Microsoft Azure clouds. What this means is that AI agents can treat data stored in an Azure data lake or an S3 bucket as if it were sitting locally in Google Cloud, without the usual headaches associated with data migration and egress fees.

To aid data mobility further, Google also introduced bi-directional federation capabilities for Databricks Unity Catalog, Snowflake Polaris and AWS Glue to break down proprietary data silos. It's also unchaining its Spanner Omni database, allowing it to run on-premises or in rival clouds.
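As a loose sketch of what policy-governed tool access might look like, the toy registry below only lets an agent discover and call the tools its roles permit. The tool names, role strings and policy shape are all hypothetical; the real MCP for Google Cloud enforces this through IAM policies and VPC Service Controls, not a Python dict.

```python
# Hypothetical tool registry mapping MCP-style tool names to a required role.
# Role strings mimic IAM naming but are used here purely as labels.
TOOLS = {
    "bigquery.query": {"required_role": "roles/bigquery.user"},
    "spanner.read": {"required_role": "roles/spanner.databaseReader"},
}

def discover(agent_roles):
    """An agent only sees the tools its roles permit -- discovery is governed,
    not just invocation."""
    return sorted(t for t, meta in TOOLS.items() if meta["required_role"] in agent_roles)

def call(tool, agent_roles):
    """Invocation re-checks policy rather than trusting the discovery step."""
    if TOOLS[tool]["required_role"] not in agent_roles:
        raise PermissionError(f"{tool} denied by policy")
    return f"executed {tool}"

print(discover({"roles/bigquery.user"}))  # this agent never even sees spanner.read
```

Gating discovery as well as invocation matters at agent scale: an agent that cannot see a tool cannot waste turns trying to use it or leak its existence into a plan.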
Google Cloud announced Agentic Data Cloud at Cloud Next 2026, a data infrastructure platform designed to enable autonomous AI agents to operate at scale. The platform addresses the shift from human-driven queries to agent-driven actions with three core components: Knowledge Catalog for automated metadata curation, a cross-cloud lakehouse that queries data across AWS and Azure without egress fees, and Data Agent Kit that lets engineers describe outcomes instead of writing code.
Google Cloud unveiled its Agentic Data Cloud at Cloud Next 2026 in Las Vegas, fundamentally rethinking how enterprise data stacks serve autonomous AI agents rather than human analysts [1][2]. The data infrastructure platform represents a shift from reactive intelligence to systems of action, where AI agents take direct business actions around the clock instead of waiting for humans to interpret dashboards. Andi Gutmans, VP and GM of Data Cloud at Google Cloud, told VentureBeat that "the data architecture has to change now" as companies move from human scale to agent scale [1]. The platform provides what Gutmans describes as the "connective tissue for AI agents" to access enterprise data without restrictions [2].
Source: VentureBeat
The Knowledge Catalog addresses what Google identifies as the "context gap" -- when agents misinterpret business-specific definitions and make costly errors [2]. Evolved from Dataplex, Google's existing data governance product, the catalog automates semantic metadata curation by inferring business logic from query logs without manual data steward intervention [1]. This architectural shift means data engineering teams can scale to their full data estate rather than just the curated subset a small team can maintain manually. The catalog scans documents including accounts, PDFs, PowerPoint presentations and images, extracting entities and studying relationships to build a navigable schema [2]. It covers BigQuery, Spanner, AlloyDB and Cloud SQL natively, and federates with third-party catalogs including Collibra, Atlan and Datahub [1]. Zero-copy federation extends semantic context from SaaS applications including SAP, Salesforce Data360, ServiceNow and Workday without requiring data movement.

The cross-cloud lakehouse tackles what Google calls "data gravity" -- when agents lose autonomy due to cross-cloud latency or data trapped in other platforms [2]. BigQuery can now query Apache Iceberg tables sitting on AWS S3 via Google's Cross-Cloud Interconnect, a dedicated private networking layer, with no egress fees and price-performance comparable to native AWS warehouses [1]. Gutmans explained the previous federation worked through query APIs, limiting optimizations BigQuery could apply to external data. The new storage-based sharing approach means AI agents can treat data stored in Azure data lakes or S3 buckets as if it were local in Google Cloud [2]. Bidirectional federation in preview extends to Databricks Unity Catalog on S3, Snowflake Polaris and AWS Glue Data Catalog using the open Iceberg REST Catalog standard [1].
Source: SiliconANGLE
The Data Agent Kit introduces agent-centric tooling directly into developer workflows, shipping as portable MCP tools and IDE extensions for VS Code, Claude Code, Gemini CLI and Codex [1]. Rather than writing Spark pipelines to move data, engineers describe outcomes -- a cleaned dataset for model training or a transformation enforcing governance rules -- and agents select whether to use BigQuery, Lightning Engine for Apache Spark or Spanner, then generate production-ready code. "Customers are kind of sick of building their own pipelines," Gutmans said, noting they're "truly more in the review kind of mode" [1]. Google announced three specialized agents: a Data Engineering agent for building complex data transformations, a Data Science agent for automating AI model lifecycles across BigQuery and Spark, and a Database Observability agent that diagnoses and repairs infrastructure issues [2]. The Model Context Protocol provides a secure, universal interface allowing any agent to safely discover and use data assets, with interactions governed by existing IAM policies, VPC Service Controls and data residency requirements [2].

Summarized by Navi