Databricks Buys Tabular to Create Unified Data Lakehouse


Databricks, a leader in analytics and AI, has purchased data management firm Tabular for an undisclosed amount. (CNBC reports that Databricks paid over $1 billion for the acquisition.)

According to Tabular co-founder Ryan Blue, he and the other two co-founders, Daniel Weeks and Jason Reid, will join Databricks in various roles. Their objective will be to integrate Tabular’s and Databricks’ customer bases and communities.

“Being part of Databricks will bring more contributions from our new colleagues,” Blue writes in a blog post. “While doing this, we assure that our approach to [our community] is not changing.”

Founded in 2021 by Blue, Weeks, and Reid, Tabular offers data management solutions built on Apache Iceberg, a project initially developed by Blue and Weeks at Netflix and later donated to the Apache Software Foundation. Iceberg is an open-source, high-performance table format for very large analytic datasets. It defines how a table’s data files and metadata are organized so that multiple data engines can work efficiently with the same tables.
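To make the idea of a table format concrete, here is a minimal sketch in Python. It is not Iceberg’s actual API or metadata layout; the `ToyTable` and `Snapshot` names and the file names are hypothetical and used only for illustration. It shows the general principle of tracking a table as a series of snapshots, each listing the immutable data files that make up the table at that point.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of the table: the full set of data files it contains."""
    snapshot_id: int
    data_files: tuple[str, ...]


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table format's metadata layer (not a real library)."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def current_files(self) -> tuple[str, ...]:
        # The newest snapshot defines what readers see right now.
        return self.snapshots[-1].data_files if self.snapshots else ()

    def append(self, *new_files: str) -> Snapshot:
        # A commit never rewrites existing files; it records a new snapshot
        # that lists the old files plus the newly written ones.
        snap = Snapshot(len(self.snapshots) + 1, self.current_files() + new_files)
        self.snapshots.append(snap)
        return snap

    def files_as_of(self, snapshot_id: int) -> tuple[str, ...]:
        # Older snapshots stay readable, which is what makes "time travel" possible.
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.data_files
        raise KeyError(snapshot_id)


table = ToyTable()
first = table.append("part-0001.parquet")
table.append("part-0002.parquet", "part-0003.parquet")
```

In this sketch, a query engine would ask the metadata layer which files belong to the table instead of listing a storage directory itself, so any engine that reads the same metadata sees the same table.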

Iceberg competed with Databricks’ Delta Lake in the format wars for data lakehouses, architectures designed to store massive amounts of raw data while providing structure and management functionality. Though both Iceberg and Delta Lake use the Apache Parquet data storage format, they differ in several key respects, including how they track table metadata.

However, with this acquisition, Delta Lake and Iceberg are expected to converge over time toward a unified standard. Databricks and Tabular say they are committed to this common framework.

“[We will be] working to enhance Iceberg support throughout the Databricks platform,” Blue said. “Our goal is to improve interoperability so that you can leverage the innovative work of both communities without worrying about the underlying format.”

The market for data lakehouses is vast: according to MIT Technology Review, around 74% of organizations have one. From Databricks’ perspective, acquiring Tabular was a logical decision. Having fewer competing data lakehouse formats, or platforms that robustly support multiple formats, enhances Databricks’ appeal to corporate clients, even when these formats are not proprietary.

In a blog post co-authored by Databricks CEO Ali Ghodsi and chief architect Reynold Xin, Databricks states its intention to “work closely” with both the Iceberg and Delta Lake communities to “achieve interoperability between the formats.”

“This acquisition underscores our dedication to open formats and open-source data in the cloud,” the blog post states. “This will be a long journey, likely taking several years to achieve in [the data lakehouse] communities.”

Before the acquisition, Tabular, based in San Jose, had raised $37 million in venture capital from investors, including Andreessen Horowitz, Zetta Venture Partners, and Altimeter Capital. Databricks anticipates that the acquisition will be finalized in Q2 2024, subject to customary closing conditions.
