Databricks unveils LTAP architecture to eliminate data pipelines slowing AI agents

3 Sources

Databricks announced two products at its Data + AI Summit that aim to solve a decades-old infrastructure challenge. The company's LTAP architecture and Lakehouse//RT engine eliminate the need for separate data pipelines between operational and analytical systems, delivering millisecond query latency directly on Delta Lake and Apache Iceberg tables. The move addresses a critical bottleneck as enterprises scale AI agents that need to reason and act on live data without delays.

Databricks tackles the data pipeline bottleneck at Data + AI Summit

At the Data + AI Summit on Tuesday, Databricks announced a fundamental shift in how enterprises handle operational and analytical data, introducing two products designed to eliminate infrastructure that has slowed AI agents [1]. The company unveiled Lake Transactional/Analytical Processing (LTAP) and Lakehouse//RT, technologies that promise to collapse the decades-old separation between transactional databases and analytical systems [2].

Reynold Xin, co-founder of Databricks, described a simpler data stack as "the holy grail for agents," arguing that as users generate more applications, AI agents reasoning analytically need the underlying infrastructure out of the way to move fast [1]. The challenge is structural: a system that reasons continuously and acts on live data cannot tolerate a pipeline between itself and the information it needs to act on.

LTAP delivers a unified platform for operational and analytical data without ETL pipelines

LTAP stores PostgreSQL-native transactional data in Delta Lake and Apache Iceberg format from the point of write, eliminating the ETL pipelines that have connected operational and analytical systems for decades [1]. The architecture builds on Lakebase, Databricks' serverless, cloud-based PostgreSQL database service, which became generally available in February and is built on technology from the Neon acquisition [3].

Shanku Niyogi, Databricks' vice president of product management, has dubbed Change Data Capture (CDC) "continuous data corruption," reflecting widespread frustration with pipeline reliability. "CDC was slow, and it was buggy, and it was expensive. Pipelines break down. Schemas change," Niyogi said during an interview at the summit [3]. He cited a large banking customer maintaining hundreds of thousands of Postgres databases, each requiring CDC pipelines to bring data back to the lake [2].

The LTAP approach unifies data at the storage layer rather than the engine level, distinguishing it from earlier HTAP (Hybrid Transactional/Analytical Processing) attempts. "HTAP to us is kind of more of a failure of the industry rather than a success," Xin noted [1]. Instead of converging engines, LTAP maintains PostgreSQL compatibility for transactional workloads while simultaneously writing data in columnar formats like Delta Lake and Apache Iceberg that analytical engines can read directly.
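To make the row-versus-column distinction concrete, here is a minimal sketch in plain Python. It is not Databricks' implementation and uses no Delta Lake or Iceberg API; the function `rows_to_columns` and the sample `orders` data are hypothetical, chosen only to show how the same records can be laid out row by row for transactional access and column by column for analytical scans.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list of values per column."""
    columns: dict[str, list[Any]] = {}
    for i, row in enumerate(rows):
        # Register any column seen for the first time, padding earlier rows with None.
        for name in row:
            if name not in columns:
                columns[name] = [None] * i
        # Append this row's value to every known column (None if the row lacks it).
        for name, values in columns.items():
            values.append(row.get(name))
    return columns


# Hypothetical transactional records, as an OLTP engine would store them: one row at a time.
orders = [
    {"order_id": 1, "status": "paid", "amount": 40.0},
    {"order_id": 2, "status": "paid", "amount": 15.5},
    {"order_id": 3, "status": "refunded", "amount": 15.5},
]

columns = rows_to_columns(orders)

# An analytical query touches only the columns it needs, not whole rows.
total_paid = sum(
    amount
    for amount, status in zip(columns["amount"], columns["status"])
    if status == "paid"
)
print(total_paid)
```

The aggregate reads two columns and never touches `order_id`, which is the property that lets analytical engines scan columnar files efficiently.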

Lakehouse//RT enables millisecond query latency without separate serving infrastructure

Lakehouse//RT delivers sub-100ms latency at 12,000 queries per second, with response times as low as 10ms on smaller datasets and up to 16 times better performance than existing dedicated serving stacks [1]. The product is powered by a new execution engine called Reyden, built specifically for high-concurrency, low-latency serving that queries Delta Lake and Apache Iceberg tables directly without moving data out of the lakehouse.

Niyogi described Lakehouse//RT as "the biggest innovation we've had since we started the lakehouse" in 2020, noting that it removes the need for separate serving infrastructure while delivering real-time data access [3]. Every query runs within Unity Catalog's governance framework with no separate permissions layer, no data copies and no ingestion pipelines [1].
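For a rough sense of what the quoted latency and throughput figures imply together, Little's law (average requests in flight = arrival rate × average latency) gives a back-of-the-envelope estimate. The sketch below is an illustration, not a Databricks benchmark: it assumes a steady-state workload and treats the quoted 100ms and 10ms figures as average latencies, and the function name `in_flight_queries` is hypothetical.

```python
def in_flight_queries(queries_per_second: float, latency_ms: float) -> float:
    """Little's law: average concurrent queries = arrival rate * average latency."""
    return queries_per_second * latency_ms / 1000


# Figures quoted in the article: 12,000 queries per second, under 100 ms each.
upper_bound = in_flight_queries(12_000, 100)

# The same rate at the 10 ms latency quoted for smaller datasets.
small_dataset = in_flight_queries(12_000, 10)

print(upper_bound, small_dataset)
```

Because the article says "sub-100ms," the first figure is an upper bound under these assumptions: at that throughput the engine would be handling at most roughly 1,200 queries concurrently.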

Why eliminating data pipelines matters for AI agents at scale

The urgency stems from explosive growth in code generation. "This year, the amount of code being written in the world has gone up 50x. We think in the next 12 months, more code will be written than in the history of coding," Niyogi said [3]. These applications, increasingly powered by AI agents, need to read, analyze and act upon data in near real-time, making traditional architectures with separate transactional systems, analytical systems and serving layers inadequate [2].

"Agents need the best data," Niyogi explained. "If they're getting stale or wrong data, they act poorly" [2]. The central engineering challenge is latency: object storage carries response times in the seconds range, far too slow for OLTP workloads requiring sub-millisecond performance. Lakebase handles this through a caching layer between Postgres compute instances and object storage, with idle CPU capacity performing row-to-column conversion before data lands in object storage. When data is converted from row to column format, it typically compresses by more than 10 times, substantially reducing network costs [1].
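The general idea of a cache sitting between compute and slow storage can be sketched as a small read-through cache. This is an illustration of the pattern only, not Lakebase's actual design: the class `ReadThroughCache`, the in-memory `object_store` dictionary standing in for object storage, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Serve repeated reads from memory; fall back to slow storage on a miss."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int) -> None:
        self._fetch = fetch          # slow path, e.g. a read from object storage
        self._capacity = capacity    # maximum number of cached entries
        self._entries: OrderedDict[str, bytes] = OrderedDict()
        self.misses = 0              # number of times the slow path was taken

    def get(self, key: str) -> bytes:
        if key in self._entries:
            # Cache hit: mark as most recently used and skip the slow path.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: read from backing storage, then remember the result.
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value


# Hypothetical stand-in for object storage; a real read would be a network call.
object_store = {"page-1": b"alpha", "page-2": b"beta", "page-3": b"gamma"}

cache = ReadThroughCache(fetch=lambda key: object_store[key], capacity=2)
first = cache.get("page-1")   # miss: goes to the backing store
second = cache.get("page-1")  # hit: served from memory
print(first == second, cache.misses)
```

Only the first read of a page pays the storage round trip; later reads are served from memory until the entry is evicted.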

Databricks plans to open-source technology that enables PostgreSQL data to be stored in Apache Parquet format while preserving compatibility, reinforcing its commitment to open formats [2]. As enterprises grapple with scaling AI agents, the ability to eliminate pipeline complexity while maintaining governance and performance will determine which organizations can deploy autonomous systems effectively.

© 2026 TheOutpost.AI All rights reserved