The Endless Data Buffet
Data Mesh + Champagne Brunch = Heaven.
I have always maintained that brunch is the best meal. Why? Because there is nothing you can’t eat at brunch. Pancakes, pad thai, steak, tacos, ice cream, salad, pastry... Whatever you’re in the mood for, it’s all fair game at brunch. And there’s a special brand of brunch that I would argue is truly the best: ✨the champagne brunch buffet✨ These magical places allow you to choose from an array of incredible brunch options, all prepared incredibly well. There are stations for each type of food, you can choose exactly what you want, and add a bit of bubbly 🍾 Heaven!
So how does a champagne brunch buffet come together?
This nirvana is truly greater than the sum of its parts. You’ve got pastry chefs baking, a team running the omelette station, a sushi chef, folks chopping veggies for salads, a pasta station, carving station, charcuterie, etc. Each kitchen team focuses on the ingredients they need to pull together the product - the eggs or fruit or flour or specific baking dishes. The end product - what ends up on my plate - is a delightful mix of what I like - a bit of crunch, a bit of savory, and a bit of sweet 🍳🍣🥐🍝🌮 And, of course, champagne 🥂
A data buffet
What does this have to do with data? More than you’d think! As the saying goes, data is the new oil: it’s everywhere, and there’s more of it every day. But data by itself, while fun to collect, isn’t truly valuable until you analyze it to build business insights. There are also questions about how to organize and process data, and who should “own” it at each stage, and that’s where things get interesting.
Think of data as the raw ingredients, the line cooks and chefs as the data engineers preparing the data dishes, and the analysts as the brunch customers, picking and choosing dishes that will make the perfect plate of analytics and business value. And of course, the champagne is the business value that takes it to the next level 📈
How do you get from raw ingredients to the plate full of delicious business value? Turns out, it’s just like a brunch buffet!
➡️ Just as a chef has their focus and uses a team of line cooks to create a food product - omelettes, pastries, tacos - each data domain has its business focus and employs data engineers to create data products 🧑🏽‍🍳🧑🏿‍🍳👩🏻‍🍳
➡️ And in both food and data, teams can help each other out by sharing ingredients and even food/data products. One prepared dish (say, salsa) can be used in another dish (a Mexican omelette); one data product (say, customer records) can be used in producing another (a lookalike analysis of target markets).
➡️ Each team of chefs presents their food product on the buffet, and each team of data producers presents their data product on the data buffet.
This is exactly like the new hotness blowing up the data world right now, the 🔥Data Mesh🔥.
Let’s look at the analogy:
Ingredients ➡️ Data. This one’s pretty straightforward! Just as food and individual ingredients are the building blocks that ultimately end up in dishes on your plate, raw data is the base that ultimately produces delectable business insights.
Cuisine type ➡️ Domain. On a buffet you’ve got charcuterie, pasta, pastries, desserts, etc. In a business, a domain would be sales, marketing, clickstream, finance, etc. These are the business functions that produce data, just like the sushi team produces sushi.
Chef ➡️ Data Product Owner. In the same way a chef runs a kitchen team and decides what dishes to make, in Data Mesh the data product owner directs the domain’s data strategy and ultimately is responsible for its data products.
Line cooks ➡️ Domain Data Product Developers (aka Data Engineers). Just as a line cook takes raw ingredients (eggs, veggies, cheese) and makes a finished dish (a frittata), these are the folks that do the technical work of using data to build data products for the data consumers.
Buffet options ➡️ Data Products. Buffet options for each cuisine will be things like “pasta” and are made up of spaghetti, tortellini, and farfalle. Data products for each domain will be things like “regional sales activity” and are made up of things like daily, quarterly, and annual aggregates. Note that the diner (data consumer) doesn’t really care how the dishes in their brunch buffet (data products) were created as long as it’s high quality food (high quality data), nor is that essential information for their enjoyment of the food (business analysis).
Kitchen ➡️ Self-service Infrastructure. While there may be slight variations in the tools the cooks and chefs use on each team, the cooks don’t need to know how a kitchen is built, how a counter is installed, or how a freezer works - they focus on prepping the food. Similarly, a data product developer can use the self-service data infrastructure created and maintained by the central IT group to focus on building the data product, without worrying about installing or maintaining software or hardware.
Plate of delicious brunch foods ➡️ High-value business analyses. The end product in a brunch is a plate full of the foods you enjoy and are able to eat - maybe you skip dishes with certain allergens or ingredients, or maybe your ticket doesn’t include the open bar. For the Data Mesh, the end product is a buffet of data products from which you can pick and choose the data you’re allowed to see, and produce whatever business analytics you require. Delicious!
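For the more code-minded diners, here is a minimal Python sketch of the mapping above. It is only an illustration under my own assumptions: the `DataProduct` class, its fields, and the owner names are hypothetical, not part of any Data Mesh tool. It shows a data product tagged with its domain (cuisine) and owner (chef), and one product used as an ingredient in another, just like salsa in a Mexican omelette.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """One dish on the data buffet (a hypothetical, illustrative model)."""
    name: str
    domain: str   # the "cuisine type", e.g. sales or marketing
    owner: str    # the "chef", i.e. the data product owner
    inputs: tuple["DataProduct", ...] = ()  # other products used as ingredients


def upstream_domains(product: DataProduct) -> set[str]:
    """Collect every domain that contributes to this product, directly or indirectly."""
    domains = {product.domain}
    for source in product.inputs:
        domains |= upstream_domains(source)
    return domains


# Hypothetical example: customer records (sales) feed a lookalike analysis (marketing).
customer_records = DataProduct(
    name="customer_records", domain="sales", owner="sales_product_owner"
)
lookalike = DataProduct(
    name="lookalike_analysis",
    domain="marketing",
    owner="marketing_product_owner",
    inputs=(customer_records,),
)

# The "buffet": a simple catalog where consumers look products up by name.
buffet = {p.name: p for p in (customer_records, lookalike)}
print(sorted(upstream_domains(buffet["lookalike_analysis"])))
```

The diner only ever touches `buffet`; who cooked what, and with which ingredients, stays with the domain teams.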
Let’s take this a little further:
Each team of chefs and kitchen workers should use the ingredients and tools that make sense to their dishes - you don’t need an ice cream maker at the sushi bar. Teams should focus on their specialities - you wouldn’t ask folks from the meat carving station to start baking croissants, would you?
The end product is a finished dish - I want an omelette, not a set of ingredients that I need to prepare and learn how to cook myself.
The food should all be prepared in a kitchen built and serviced by people who focus on building and servicing kitchens, not the cooks. Also it makes sense that the buffet is the place where customers get their food - you don’t want some food located in the kitchen and some in the parking lot, it should all be easy to access.
Lastly - if I pay for the bottomless champagne brunch and endless buffet, I should have access to everything, but what if I only pay for the salad bar and water? It should be clear (by my plate color or my ticket color perhaps) what I’m allowed to eat, and that’s all I can get. Moreover, there are food safety regulations to think about - you don’t want to leave dishes out too long, or forget to store things at the right temperature. Someone needs to keep an eye on these things and enforce the rules.
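That last point, the ticket colors and the food safety rules, can be sketched in a few lines of Python. Treat this as a rough illustration under my own assumptions: the tier names, the `Dish` class, and `can_serve` are all hypothetical, and a real governance setup would be far richer. The idea is simply that the same two rules (are you allowed to have it, and is it still fresh?) apply to every dish on the buffet, no matter which team made it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical ticket tiers, ordered from least to most access.
TIER_RANK = {"salad_bar": 0, "full_buffet": 1, "champagne": 2}


@dataclass(frozen=True)
class Dish:
    name: str
    required_tier: str    # lowest ticket tier allowed to take this dish
    set_out_at: datetime  # when the dish (dataset) was last refreshed
    max_age: timedelta    # how long it may sit out before it counts as stale


def can_serve(dish: Dish, ticket_tier: str, now: datetime) -> bool:
    """Apply two buffet-wide rules: the ticket must cover the dish, and the dish must be fresh."""
    allowed = TIER_RANK[ticket_tier] >= TIER_RANK[dish.required_tier]
    fresh = now - dish.set_out_at <= dish.max_age
    return allowed and fresh


# Arbitrary example values, chosen only for illustration.
now = datetime(2024, 1, 7, 11, 0)
salad = Dish("salad", "salad_bar", now - timedelta(minutes=30), timedelta(hours=2))
mimosa = Dish("mimosa", "champagne", now - timedelta(minutes=5), timedelta(hours=1))
old_sushi = Dish("sushi", "full_buffet", now - timedelta(hours=5), timedelta(hours=2))

for dish in (salad, mimosa, old_sushi):
    print(dish.name, can_serve(dish, "salad_bar", now))
```

Because the rules live in one shared function rather than in each kitchen, every station enforces them the same way.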
There are other places to get food & data, right?
Of course there are other meals that will fill your basic human need for nourishment, and some that will even taste good or may work better for you:
Perhaps you love to cook (you’re a more technical analyst and understand the raw data incredibly well) - in that case maybe a meal prep kit where you receive the ingredients and prep the meal yourself is a better fit. I’d call that the data lake use case - the data is all there, but you need to figure it out on your own.
On the other hand, perhaps you want a traditional restaurant meal, where you have a plate full of dishes that were chosen for you, with few or no substitutions. This is more like a classic data warehouse in my mind - which is also convenient and delicious at times. But if you’re a group of diners (analysts) with wildly different tastes (use cases), it’ll be hard to find a restaurant that fits the bill.
Sometimes I just can’t deal, and I end up eating popcorn for dinner. And sometimes, you just need a spreadsheet 📊 - it happens to the best of us! Just keep in mind that popcorn for dinner isn’t something you can do every night, and it’s certainly not what most adults would serve to a friend 😉
The real difference between these cases is who decides what goes on your plate, and who does the cooking. In the data lake / meal prep kit case, it’s up to you to choose which meals you want from the list - but you have to prepare the food (datasets) yourself. In the data warehouse / restaurant case, the choices are there but you’re limited by what the chefs (data engineers) know how to cook, and if you’re in the mood for sushi at an Italian restaurant, well, you’re out of luck.
The bottom line
If you want your end-users and analysts to have their choice of delicious data products, the answer is to provide them with a bottomless data buffet in the form of a Data Mesh. There are four guiding principles that reflect and encompass these buffet-like concepts within the Data Mesh:
Domain-driven data ownership and architecture
Data as a product
Self-serve infrastructure as a platform
Federated computational governance
The bottomless brunch buffet also satisfies these four principles, and it’s where you get the biggest bang for your buck: everyone can prepare the meal that suits their specific dietary restrictions and tastes. And the beauty of a Data Mesh is that all the users can choose the data that they need and build the analysis that brings the most business value.
In my mind, nothing will ever satisfy like a champagne brunch buffet driving a data-driven business. Care to discuss? Find me on Twitter @ctartow!