The hardest part of deploying gen AI for most companies is having data that’s ready

The challenge that's preventing tech leaders from deploying AI is not actually generating a model; it's having data that's ready.
In a recent global study of more than 1,300 tech and data executives, just 18% of companies say they're fully ready for AI deployment, meaning their data is fully accessible and unified.
Businesses also need to complete complicated data tagging and classification, primarily to keep private data within the right confines.

While boards of directors are calling for the deployment of artificial intelligence, IT executives like chief information officers know there's more to the story than having a solid AI use case.

The challenge that's preventing tech leaders from deploying AI is not actually generating a model and rolling it out, said Prukalpa Sankar, co-founder of data catalog and governance software Atlan. Instead, she said it's failing to have data ready for AI. "Everybody's ready for AI except your data," Sankar said.

In a recent global study of more than 1,300 tech and data executives, just 18% of companies say they're fully ready for AI deployment, meaning their data is fully accessible and unified (another 40% consider themselves mostly ready, but not quite there).

In order to get to that point of readiness, Sankar said companies must overcome several hurdles. The first is finding and organizing all of your data, a job primarily for data engineers. "You're looking to bring together data that was otherwise siloed in different business units to actually deploy for a specific use case," she said.

Businesses also need to complete complicated data tagging and classification, primarily to keep private data within the right confines. "Depending on who's asking the question, I can change the data that goes behind it," said Sankar. For example, a human resources chatbot may be able to use payroll data while an overall chatbot cannot.
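The idea that the data behind an answer changes with who is asking can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the tag names, the role names, the `ALLOWED_TAGS` table and the `visible_datasets` helper are all hypothetical, chosen only to mirror the HR chatbot example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    tags: frozenset  # classification tags attached during data tagging


# Hypothetical policy: which tags each chatbot role is allowed to read.
ALLOWED_TAGS = {
    "hr_chatbot": frozenset({"public", "internal", "payroll"}),
    "general_chatbot": frozenset({"public", "internal"}),
}

# Hypothetical catalog entries, for illustration only.
CATALOG = [
    Dataset("office_locations", frozenset({"public"})),
    Dataset("salary_records", frozenset({"internal", "payroll"})),
]


def visible_datasets(role, datasets):
    """Return the datasets whose every tag is permitted for the given role.

    Unknown roles get an empty allow-list, so they see nothing by default.
    """
    allowed = ALLOWED_TAGS.get(role, frozenset())
    return [d for d in datasets if d.tags <= allowed]


hr_view = [d.name for d in visible_datasets("hr_chatbot", CATALOG)]
general_view = [d.name for d in visible_datasets("general_chatbot", CATALOG)]
```

In this sketch the HR chatbot can draw on both data sets, while the general chatbot never sees the one tagged as payroll. Denying access to unknown roles by default is a design assumption here, not something the executives quoted in this article prescribed.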

With AI, data governance isn't so cut and dried

All of this falls under the umbrella of data governance, or how a business manages data assets through policies, processes and standards. Matt Carroll, CEO and co-founder of data security platform Immuta, said data governance is not new, but AI changes how it's done.

"When you think of traditional business intelligence, which we've been doing for 30 years, governance was always a structured, well-oiled machine," said Carroll. "As you introduce AI, you can't do it the same way."

This is because businesses need to constantly add new data to support AI models from both inside and outside sources.

Ultimately, Carroll said, AI readiness boils down to three things: "They need to be able to find the data, they need to use it, and they need to be able to observe how it's being used."

Having a mature data governance pipeline isn't common across industries, or at least not yet. A 2024 AI readiness report from MIT found that data governance, trust and security are a greater focus in government and financial institutions than in other industries. Carroll said this practice needs to extend well beyond banks and government, as they're not the only industries handling sensitive data. All businesses pursuing generative or other types of AI solutions need to coordinate among IT, legal and broader organizational executives, as well as the departments those decisions trickle down to.

Moreover, Carroll wants to see more businesses keep working on data readiness even after deploying AI. One way companies can do that is through an AI hotline, which can be a full-on hotline in a large company, or a more attainable managed Slack channel in a smaller company. What's important is that domain experts have a direct line to the engineering team to report issues such as hallucinations or incorrect data tagging.

"They need that feedback loop, so maybe a model review board can take it down or reevaluate it, or potentially flag it for retraining and revalidation," Carroll said, "which is, by the way, not a negative thing. That's just the game."

This is, of course, in addition to continuous testing on models to look for anomalous behavior and make sure they meet the company's quality standards.

Companies get creative in getting ready for AI

From the start of AI deployment journeys, Sankar said she's seeing companies create AI readiness scores to help quantify the process of getting their data in order. The measurable score for AI readiness might rank a data set out of 5.0, for example, based on a range of factors. "Unless you measure it, nothing moves," she said.
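A readiness score of this kind could be computed as a weighted average scaled to 5.0. The sketch below is one possible reading, not a method Sankar or Atlan described: the factor names, the weights and the `readiness_score` function are assumptions made for illustration, with each factor rated between 0.0 and 1.0.

```python
# Hypothetical factors and integer weights; a real program would pick its own.
WEIGHTS = {"accessible": 4, "unified": 3, "classified": 3}


def readiness_score(ratings, weights=WEIGHTS):
    """Scale a weighted average of factor ratings (each 0.0 to 1.0) to 0-5."""
    for factor in weights:
        rating = ratings[factor]  # raises KeyError if a factor is missing
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for {factor!r} must be between 0 and 1")
    weighted = sum(weights[f] * ratings[f] for f in weights)
    return 5.0 * weighted / sum(weights.values())


# Example: a data set that is fully accessible but only half unified and
# half classified.
example = readiness_score({"accessible": 1.0, "unified": 0.5, "classified": 0.5})
```

With these assumed weights the example data set scores 3.5 out of 5.0. The value of such a number lies less in the exact formula than in tracking it over time, in line with Sankar's point that nothing moves unless it is measured.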

Another trend experts are seeing is adding a secondary title of data steward to an employee's primary role. "You're in the business, you happen to know the domain, but now, all of a sudden, you're going to be owning this data set that may or may not be used for AI," said Carroll. Additionally, he said, highly specialized data governors (who might hold an official title such as data governance executive or data management engineer) are hard to find but increasingly important, and companies may hire more of them in the future.

Sankar likens the data infrastructure ecosystem to a marketplace. "On one side of the marketplace you have business-ready AI use cases," she said. "And on the other hand is your complicated data infrastructure."

For organizations pursuing AI, experts agree that data readiness must come first. But even the broad category of data readiness breaks down further. Before even tackling step one, Carroll said, it's worth asking what may be an unpopular question in the C-suite: "In data readiness, there also is a question of, should you do it at all?" By this, Carroll means there's an ethical decision all companies must make as to whether they should expose certain types of data to their AI systems. Only with that go-ahead can companies truly pursue AI readiness.