A few years back, Annika and I started this blog because we had ideas about data maturity. We wrote about what data maturity was in Part 1, and how that extended to analytics in Part 2. Over the years I keep coming back to this idea of data maturity, and it’s become incredibly relevant once again now that we’re in the midst of the genAI hype cycle. So buckle up, it’s time for Part 3 🙌
Data Maturity, previously
The conclusion we came to in our previous articles (spoiler alert!) was: ✨to be mature in analytics, a company must be mature in data✨ So what does Data Maturity even mean? In the past we divided organizations into low and high data maturity:
➡️ Low data maturity
A low data maturity, or data immature, organization doesn’t know a lot about data, and it typically hasn’t been a priority for them. They have very few experienced data people, and they’re not using much beyond Excel / Powerpoint / an old school ERP or basic accounting system. They’ve likely got reporting under control but when it comes to anything beyond very simple dashboards or SQL queries against source DBs, they’re just not there.
➡️ High data maturity
A highly data mature team, on the other hand, has a solid handle on data engineering, and data science is a priority. Data is treated like a product. Many people are proficient in Python & SQL. They focus on accelerating the time-to-value for data, and the company thinks ahead when it comes to data. They’re strategic about data technology, and data is a steady investment target/priority amongst leadership.
Adding deep learning to the equation
So how does AI fit in? In my previous article of March 2021, I said “For years I have been annoying people who want to bring data science into their orgs by telling them that 90% of the work is the data.” I still say this today 3+ years later, but now in reference to deep learning and generative AI. Particularly when attempting AI, having a mature data strategy - meaning the people, process, and technologies to harness the value of data - is paramount. Data engineering and curation of data into data products remains a key activity of data mature organizations.
It’s all about the unstructured data
Unstructured data can’t be put into a table format, encompassing audio 🔉, video 📹, images 🩻 (my beloved memes!), documents 📑, etc. And it’s really the key to genAI - being able turn unstructured data into math is the innovation that makes generative AI possible. It is necessary for organizations to store, catalog, curate, and process both structured and unstructured data into data products to “do” AI. This mean your data strategy needs the technology to store and process far more data (like >400% more). It also means no more tiering data into hot and cold - you need all the data readily available in order to do AI. This is a shift in a lot of dimensions - but at the very least your tech stack and strategy will need to be scalable. 📈
The bottom line: Data Maturity and AI
As use cases for data expand to include generative AI and other advanced deep learning techniques, data mature organizations need to focus on all of their data - structured, semi-structured, and unstructured.
While BI and SQL are built for structured data, data maturity goes beyond that now: data maturity includes handling and activating unstructured data.
A data immature organization needs to get their data house in order before they can build an AI-focused data strategy. If an org can’t get a handle on their structured data and provide basic analytics in a mature way, that should be the primary focus before attempting more advanced data work like genAI.
Of course, it’s not just about where you are on the data maturity scale - what really matters is how an organization can use data to answer questions and drive business.
Whether that’s with reporting, BI, or AI, a mature strategy for the underlying data is critical.
What are your thoughts? Comment here or reach out on LinkedIn and let’s discuss 🗣️