Discussion about this post

Chris Jackson

While I enjoyed this read, I think one key point it misses is that databases don’t just store data; they also process it. The main processing paradigm in many databases, including Snowflake, is SQL, and it isn’t just used to get data out for end-user queries. In the analytics world, huge volumes of grunt transformation work, some of it quite sophisticated, are done in SQL. There are decades of history in optimising this stuff, which means engineers have to worry a lot less about tuning than in other processing paradigms. Even in an ML pipeline, perhaps 80-90% of the work is most easily done in SQL.

Of course, not everyone likes working in that paradigm (though it’s interesting that almost every data lake alternative has felt the need to bolt on SQL-like processing), which is where the ability to use Scala and Java directly on or in Snowflake fleshes out the capability.

Vivek

A critical point to keep in mind is ‘ownership of data’: companies, businesses and other entities want to own their data without having to move it around. A platform like Databricks provides that to those customers.
