Conference

Unity Catalog: Open and Universal Governance for the Lakehouse and Beyond

Abstract

Enterprises are increasingly adopting the Lakehouse architecture to manage their data assets due to its flexibility, low cost, and high performance. While the catalog plays a central role in this architecture, it remains underexplored, and current Lakehouse catalogs exhibit key limitations, including inconsistent governance, narrow interoperability, and lack of support for data discovery. Additionally, there is growing demand to govern a broader range of assets beyond tabular data, such as unstructured data and AI models, which existing catalogs are not equipped to handle. To address these challenges, we introduce Unity Catalog (UC), an open and universal Lakehouse catalog developed at Databricks that supports a wide variety of assets and workloads, provides consistent governance, and integrates efficiently with external systems, all with strong performance guarantees. We describe the primary design challenges and how UC's architecture meets them, and share insights from usage across thousands of customer deployments that validate its design choices. UC's core APIs and both server and client implementations have been available as open source since June 2024.

Authors

Chandra R; Chen H; Matharu R; Cai S; Chen J; Dutta P; Ghita B; Greenstein T; Holla G; Huang P

Pagination

pp. 310-322

Publisher

Association for Computing Machinery (ACM)

Publication Date

June 22, 2025

DOI

10.1145/3722212.3724459

Name of conference

Companion of the 2025 International Conference on Management of Data