Unity Catalog at Data+AI Summit 2026: AI Governance and Multimodal Data

Databricks Unity Catalog data governance visualization showing interconnected catalog nodes, AI agents, and data assets with blue and white color scheme

Unity Catalog at Databricks Data+AI Summit 2026

Databricks Data+AI Summit is running this week in San Francisco, and most coverage has focused on the keynote: agents hitting production scale, OpenSharing, the big partner demos. Fair enough. But the feature set that will actually change how teams build data platforms is in the Unity Catalog update — and it deserves a closer look.

Three things stand out: AI runtime governance that finally has teeth, external engine writes to managed Delta tables, and multimodal data support baked into the catalog itself. Here’s what changed and what to do with it.

AI Governance That Stops Things, Not Just Logs Them

The Unity AI Gateway has been around for a while as a proxy layer for model calls. At this summit, Databricks upgraded it to handle runtime enforcement — the kind that enterprise security teams have been asking for since AI agents started touching production systems.

The headline addition is hard spend caps. When a budget threshold is crossed, requests stop. Not an alert to a Slack channel, not an email to finance — the request stops. For teams running multi-agent pipelines calling external models like GPT-5.5 or Claude at scale, this is the difference between a predictable bill and a surprise charge.

More interesting are the Contextual Service Policies, now in beta. These are runtime guardrails defined at the catalog level. An admin can now enforce “this agent requires approval before pushing any code to GitHub” or “no writes to directories under /prod/” — and it’s enforced at the gateway, not in a system prompt that the model can be jailbroken into ignoring. The policies can also block PII from appearing in responses and detect prompt injection attempts.

Unity AI Gateway also now governs managed MCP services — Google Drive, Jira, Confluence, Slack, GitHub, SharePoint — through the same catalog. Every tool call goes through the same governed telemetry layer, so you get a full audit trail: what agent called what tool, when, and what came back. That’s the traceability production AI needs.

External Engines Can Finally Write to Managed Tables

This one removes a long-standing frustration. Unity Catalog managed Delta tables were write-locked for external engines — if you were running Apache Spark or Flink outside of Databricks compute, you could read but not write. That changes with the external access Public Preview.

Apache Spark, Apache Flink, and DuckDB can now create and write to managed Unity Catalog Delta tables. It’s built on Delta Lake’s catalog commits feature, which coordinates concurrent writes through the catalog — so you get safe multi-engine writes with audit trails rather than the chaos of uncoordinated Delta writes.

Credential vending is now GA as well, with M2M OAuth support and automatic credential refresh for long-running pipelines. Worth noting what’s still off the table: external engines can’t run OPTIMIZE, VACUUM, or ANALYZE, and there’s no support for shallow clones or generated columns. You handle those through Databricks compute.

For multi-engine data platforms — Spark for heavy ETL, Flink for streaming, Databricks for transformation and serving — this makes Unity Catalog a genuine multi-engine governance layer rather than a Databricks-only concept.

Multimodal Data Gets Real Governance

Databricks added a FILE type to Delta and Iceberg tables. PDFs, images, audio, and video can now be governed as native catalog assets, not shunted to separate object storage with separate access policies and no lineage.

This matters for the AI pipelines that are actually running in 2026 — document ingestion, image classification, audio transcription — where the raw unstructured inputs have always been a governance gap. Now the same Unity Catalog access policies and lineage tracking that apply to your structured feature tables apply to the PDF that generated the training labels.

It’s in beta, so production adoption should wait. But the architectural direction is clear: Unity Catalog is becoming the single governance layer for all data and AI assets, structured or not.

What’s GA Right Now

Two capabilities moved to GA and are ready to use: geospatial types in Delta and Iceberg v3 (requires Databricks Runtime 18.2+), enabling native geometry and geography types with Spatial SQL; and External Lineage, which tracks data provenance beyond Databricks to upstream sources and downstream BI tools automatically through Lakeflow Connect.

The Bigger Picture

Unity Catalog started as data governance — access controls for tables and columns. It’s now expanding into AI infrastructure governance: who can call which model, what tools an agent can use, how much it can spend, and what it can do with the results.

That’s a significant architectural bet. Databricks is positioning the catalog as the control plane for enterprise AI, not just enterprise data. Whether that resonates depends on whether your organization has reached the point where AI agent governance is a real ops problem — and for companies running agents in production, it increasingly is.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.

Unity Catalog at Data+AI Summit 2026: AI Governance and Multimodal Data

AI Governance That Stops Things, Not Just Logs Them

External Engines Can Finally Write to Managed Tables

Multimodal Data Gets Real Governance

What’s GA Right Now

The Bigger Picture

Tokenomics Foundation: Linux Foundation Takes on the AI Cost Crisis

Amazon Bedrock AgentCore: Deploy Agents in 4 Commands

Leave a reply Cancel reply

More in:AI & Development

NVIDIA Nemotron-Labs TwoTower: 2.42x Faster Inference, No Retraining Required

Buzz by Block: Open-Source Workspace Where AI Agents Own Their Actions

Copilot Cloud Agent for Linear Is Now GA: Setup and Limits

AI Is Buying Rare Books and Shredding Them. Here’s Why

Kimi K3 Open Weights Live: Self-Host or Use the API?

ExploitGym: OpenAI’s AI Escaped Its Sandbox and Breached Hugging Face

Categories

AI Governance That Stops Things, Not Just Logs Them

External Engines Can Finally Write to Managed Tables

Multimodal Data Gets Real Governance

What’s GA Right Now

The Bigger Picture

Share

You may also like

Leave a reply Cancel reply

More in:AI & Development

Categories

Latest Posts