Notes: Snowflake - Open Storage Competition (Pt.1)

Summary

  • Fully embracing Iceberg will likely be a net positive for Snowflake, but with a number of caveats.
  • The Tabular acquisition highlights the weaknesses and past missteps of Databricks. Integration will be a big challenge.
  • The future of Iceberg remains uncertain, but with broad ecosystem adoption it is unlikely that Databricks will repeat the same, largely unsuccessful, strategy it adopted for Delta Lake.
  • The data catalog is a big missing piece for open storage format adoption. Polaris Catalog has great potential to become the standard, but Databricks/Tabular will be fighting back too.
  • In Part 1 of this SNOW Notes series, we discuss what the adoption of Iceberg means for Snowflake.
  • In Part 2 and Part 3, we will discuss Databricks' Tabular acquisition and its impact on Snowflake, along with the implications of Snowflake introducing the Polaris Catalog.

Since we last covered the Modern Data Stack (MDS) in the SNOW initiation report, it has become evident that the hype in this space has peaked, and the pace of innovation has rapidly declined. As a result, we are witnessing consolidation and washout in the industry. For the remaining vendors, competition has become even more aggressive as they vie for market share and customer attention in a space where growth is quickly decelerating.

The hottest topic within the MDS space is the storage standard war, which Databricks has played very well since introducing Delta in 2017 and later open-sourcing it as Delta Lake. The timeline of significant events in this space is as follows:

• 2017: Databricks introduces Delta (open-sourced as Delta Lake in 2019)

• 2021: Apache Iceberg is released and quickly becomes the standard, as we correctly anticipated in 2022

• Mid-2023: Both Databricks and Snowflake (SNOW) release their data governance and catalog solutions, Unity Catalog and Horizon respectively

• June 2024: SNOW announces the release of Polaris Catalog, an open-source catalog for Iceberg, to be publicly available in 90 days

• Two days later: Databricks announces a staggering $1bn+ acquisition of Tabular, the open-core company behind Iceberg, which had raised just $37 million and had only 44 employees

• Eight days after SNOW's announcement: Databricks open-sources its Unity Catalog entirely, immediately publishing the Unity Catalog 0.1 source code on GitHub

So, what's going on?

  • Will SNOW begin to decline now that it has to embrace Iceberg instead of its proprietary storage format?
  • Will Databricks completely overtake SNOW by owning Tabular, a company founded by Iceberg's creators?
  • What is a data catalog and what does the launch of Polaris Catalog mean for SNOW?
  • What should we expect next in the intensifying competition?

In Part 1 of our SNOW Notes report, we address the first question.
