Enterprise AI Has a Foundation Problem

For decades, I've watched companies wrestle with the promises of automation, integration, and data. When AI arrived, the pattern repeated: AI agents are delivering real results in engineering, but they are underperforming or stalling everywhere else. This is not a failure of the models - it's a failure of the foundation they're deployed into. That is the structural problem we set out to solve.

Executive Summary

The reason engineering is succeeding is simple: it's a structured environment with clear guardrails. AI writes code, catches bugs, and accelerates releases. If the code doesn't compile, the AI failed. If the tests don't pass, something is wrong. These guardrails make AI tractable.

The business side operates without them. Enterprise data is fragmented, incomplete, and riddled with relationships nobody has mapped. Business processes are informal, inconsistent, and full of exceptions. Access to information is governed by policies that AI tools routinely ignore. The documents that describe how the business works are, at best, partial snapshots of a reality that has since moved on.

The result: enterprise AI deployments are underperforming at scale. Gartner estimates that 60% of AI projects will be abandoned due to poor data readiness. Half of generative AI pilots never make it to production.

This post isn't an in-depth analysis - it's a framing of the five structural problems that make business AI so difficult to deploy, problems that predate AI by decades. It's also our statement on why we built what we did: an architectural approach built around intentional data, learned process, and how work actually gets done.

Why Now?

The problems described in this post are not new. Enterprise data has been a mess for decades. Business processes have always resisted automation. Data governance has been a known challenge since the first LAN.

What is new is the convergence. Four structural forces are colliding simultaneously in a way that makes this moment different from every previous cycle of enterprise technology investment - and makes getting AI right inside the business more urgent, and more consequential, than it has ever been.

Organizations are getting leaner - fast. The middle layer of most organizations is thinning. Management layers are being removed. Spans of control are widening. But the conventional response - giving managers better dashboards and reporting tools - misses the point entirely. When the management layer thins, you no longer have a boss telling you to update Salesforce every Friday afternoon. The leverage point shifts to the person doing the work. Software that exists primarily to serve upward reporting becomes a burden, not an asset. The bar has moved from "helps managers see what's happening" to "helps the doer get it done." That is a fundamentally different requirement, and most enterprise software was not built for it.

Cost pressure is now structural, not cyclical. The Rule of 40 - the long-standing benchmark that a software company's growth rate and profit margin should sum to 40% - has been replaced by more demanding expectations. Rule of 50, 60, even 70 is now the bar for well-run software businesses, and that pressure flows downstream to every customer. Post-sales teams - customer success, account management, support - are being scrutinized for efficiency in ways they haven't been before. The tools that serve those teams need to do more with less. But that efficiency will not come from consolidation or oversight. It will come from empowering the people doing the work to do it better, faster, and with less friction.

The function-specific software stack is not going away - and it shouldn't. For the past twenty years, enterprise software expanded horizontally - a dedicated tool for every function, every workflow, every team. That is not going to reverse, nor should it. You cannot run your entire company on one tool. Finance, legal, support, engineering, and account management each operate differently, and the tools that serve them reflect that. The data those tools hold is necessarily distributed. The mistake is treating that distribution as a problem to be solved through consolidation. It is not. The competitive advantage does not lie in where data is stored - it lies in the ability to knit it together across functions to answer the questions that actually matter: are we delivering on our commitments to customers, and are they going to grow?

The GTM model itself is changing. Pre-sales, the customer is largely dealing with one part of your business: sales. Post-sales, the entire company effectively comes online. The customer is now interacting with support, services, account management, product, and engineering - often simultaneously. Forward-thinking companies - including high-growth leaders like ClickHouse - are recognizing that this hand-off model is broken. Adoption is a revenue event. Expansion is a sales motion. But capturing that opportunity requires more than unifying sales and customer success under one owner. It requires connecting the full distributed reality of every function that now touches the customer - and making that picture legible to the people responsible for the relationship.

These four forces are not independent trends. They are reinforcing each other. The window to get this right - and to capture the position that comes with it - is now.

The organizations that navigate this convergence well will be leaner, more efficient, and closer to their customers than their competitors. Those that don't will find that AI investments compounded the structural problems they already had, rather than solving them.

Understanding why business AI is so hard is the first step to deploying it in a way that actually works.

Introduction: Two Environments, One Illusion

There is a reason AI agents are succeeding in engineering and stalling in the rest of the business. Engineering is a structured environment. There are compilers, linters, test suites, and version control systems. Every output can be validated. When an AI model generates bad code, you know almost immediately. The feedback loop is tight, the guardrails are built in, and course correction is fast.

In business, you don't get a compiler error when the AI gives you a wrong answer. You often don't find out at all.

Think of it like bowling. Engineering AI is bowling with bumpers - the ball might wobble, but it stays in the lane and hits something. Business AI is bowling without them. Without structure to bounce off, the ball goes wherever the floor takes it.

The engineering codebase is also a fundamentally different data environment. It is small, consolidated, purpose-built, and continuously tested. It exists to solve a specific problem, and it does.

Enterprise business data is none of these things. It is vast, fragmented, inconsistent, and largely unmapped - the accumulated residue of every system, team, and process that has ever touched the business. The data that AI needs to work with exists across dozens of systems in forms that were never designed to talk to each other, and in documents that capture a fraction of the story on a good day.

The problems are not technical. They are structural. And they predate AI by decades.

Problem 1: Enterprise Structured Data Is a Mess

Start with the basics. A typical mid-size enterprise might run six Salesforce instances, three Jira instances, multiple instances of Zendesk, Linear, HubSpot, and a handful of other tools that have accumulated over years of acquisitions, team autonomy, and organic growth. Nobody planned this. Nobody maintains a single source of truth across it.

Salesforce's own research puts the scale of the problem in stark terms: 91% of CRM data is incomplete, and 70% of it becomes inaccurate every year. In the average enterprise contact database, 90% of contacts are incomplete and more than a quarter are outright duplicates. This is not a data hygiene problem at the margins. It is the default state of enterprise data.
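The shape of the problem is easy to reproduce. Below is a toy pass that flags likely duplicate contacts by normalized email - a sketch with entirely hypothetical records, but the kind of check that, per the numbers above, would light up a quarter of a real CRM.

```python
# Toy duplicate detection: flag contacts that collide on a normalized email.
# All contact records here are hypothetical.
from collections import defaultdict

contacts = [
    {"id": 1, "name": "Pat Lee",   "email": "Pat.Lee@acme.com"},
    {"id": 2, "name": "P. Lee",    "email": " pat.lee@ACME.com "},
    {"id": 3, "name": "Sam Ortiz", "email": "sam@globex.io"},
    {"id": 4, "name": "Sam Ortiz", "email": ""},  # incomplete: no email at all
]

def norm(email: str) -> str:
    """Normalize an email address for comparison."""
    return email.strip().lower()

buckets = defaultdict(list)
for contact in contacts:
    if norm(contact["email"]):          # incomplete records can't be checked
        buckets[norm(contact["email"])].append(contact["id"])

duplicates = {email: ids for email, ids in buckets.items() if len(ids) > 1}
print(duplicates)  # {'pat.lee@acme.com': [1, 2]}
```

Note record 4: with no email, it cannot even be tested for duplication. Incompleteness and duplication compound each other, which is part of why the numbers above are so stubborn.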

When you ingest this mess into an AI system, you don't get a smarter version of it. You get a faster, more confident version - an AI that synthesizes contradictory information and delivers inconsistent answers with apparent authority. The garbage-in, garbage-out principle does not disappear with a large language model. It accelerates.

The real-world consequences are already showing up. One company we spoke with deployed AI agents on their customer support desk - a relatively contained use case with a well-defined knowledge base. The agents performed badly. The team dug in and found the cause: the underlying knowledge base was disorganized, outdated, and internally contradictory. They spent months cleaning and restructuring it before the agents became usable. That was a knowledge base. Imagine the same exercise across every database in your enterprise.

Gartner now predicts that 60% of AI projects will be abandoned through 2026 due to lack of AI-ready data. That is not a forecast about the future. It is a description of what is already happening.

None of this is new. The data management industry - MDM, data cataloguing, data discovery, enrichment pipelines - has spent two decades and billions of dollars trying to solve it. The tools exist. But cleaning and governing enterprise data at scale is expensive, slow, and never fully done. AI doesn't fix this. It inherits it.

Problem 2: The Relationships Between Data Are Unknown

Bad data quality is only the first layer of the problem. The deeper issue is that even when individual data points are accurate, nobody knows how they relate to each other.

Take a simple question: what is the complete picture of a specific customer? To answer it, you need to connect their records in Salesforce, their support history in Zendesk, their product usage in your data warehouse, and their open issues in Jira - across multiple instances of each system, with no shared identifier, and no agreed definition of what a "customer" even means across those systems.

This is not a hypothetical edge case. It is the standard operating environment of most enterprises, and it is precisely why a multi-billion dollar integration industry exists - led by companies like Boomi, Informatica, and MuleSoft. That market now exceeds $17 billion and continues to grow. It exists because the problem remains unsolved.
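To make the difficulty concrete, here is a minimal sketch of the first step any such integration has to take: linking records that share no identifier. The systems, fields, and sample rows are illustrative assumptions, not a description of any particular deployment - and the matching key, email domain, is the crudest one available.

```python
# Toy entity resolution: link customer records across two systems that share
# no common identifier. All field names and records are hypothetical.
from collections import defaultdict

def normalize_domain(value: str) -> str:
    """Reduce an email address or website to a comparable domain key."""
    value = value.strip().lower()
    if "@" in value:
        value = value.split("@", 1)[1]
    return value.removeprefix("www.")

salesforce_accounts = [
    {"sf_id": "001A", "name": "Acme Corp", "website": "www.acme.com"},
    {"sf_id": "001B", "name": "Globex",    "website": "globex.io"},
]
zendesk_orgs = [
    {"zd_id": 9001, "name": "ACME Corporation", "email": "support@acme.com"},
    {"zd_id": 9002, "name": "Initech",          "email": "help@initech.net"},
]

# Index one side by the normalized key, then probe with the other.
by_domain = defaultdict(list)
for account in salesforce_accounts:
    by_domain[normalize_domain(account["website"])].append(account)

for org in zendesk_orgs:
    matches = by_domain.get(normalize_domain(org["email"]), [])
    print(org["name"], "->", [m["sf_id"] for m in matches] or "no match")
```

Even this trivial join rests on an assumption - that email domains identify customers - which breaks for resellers, subsidiaries, shared domains, and personal addresses. Real entity resolution is far messier, which is the point.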

Two example approaches illustrate opposite ends of the solution space - and the limits of what is currently possible:

Cross-referencing fields. Some organizations add lookup fields in one system that point to records in another - Salesforce IDs embedded in Zendesk tickets, for example. This works at small scale, but it is brittle, requires constant maintenance, and breaks the moment either system changes. It also only works where somebody thought to build the bridge in the first place.

Data sampling, discovery, and cataloguing. More sophisticated organizations use statistical inference and automated scanning tools to infer relationships between datasets - examining patterns in the data to deduce what is connected to what. This is genuinely powerful, but it is resource-intensive and expensive. It is the preserve of the most advanced data engineering teams with significant budgets.
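To illustrate what the second approach means mechanically, here is a minimal sketch of one common technique: measuring value containment between columns to guess which fields in one table refer to which fields in another. The tables, columns, and threshold are hypothetical.

```python
# Toy relationship discovery: guess join-like links between tables by checking
# how much of one column's values appear in another's. Data is hypothetical.
def containment(candidate: list, target: list) -> float:
    """Fraction of distinct candidate values that also appear in target."""
    cand, targ = set(candidate), set(target)
    return len(cand & targ) / len(cand) if cand else 0.0

columns = {
    "tickets.account_ref": ["001A", "001B", "001A", "001C"],
    "tickets.assignee":    ["kim", "raj", "kim", "lee"],
    "accounts.sf_id":      ["001A", "001B", "001C", "001D"],
    "users.username":      ["kim", "raj", "lee", "ana"],
}

# Score every cross-table column pair; high containment suggests a relationship.
for src in columns:
    for dst in columns:
        if src.split(".")[0] != dst.split(".")[0]:
            score = containment(columns[src], columns[dst])
            if score >= 0.9:
                print(f"{src} -> {dst}  (containment {score:.0%})")
```

Real discovery tools run this kind of profiling across thousands of columns, with sampling, type inference, and confidence scoring - which is exactly why the approach is resource-intensive and expensive.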

Even where both techniques are applied, the result is a partial and often stale picture. They produce structural inferences - not a reflection of how the business actually operates day to day. The connections they surface are snapshots of a data architecture that has already moved on.

The cost of solving this comprehensively - centralizing, cleaning, and actively managing data relationships across all of an enterprise's disparate systems - is prohibitively high for all but the largest organizations. Despite decades of investment, this problem remains largely unsolved. AI doesn't change the underlying challenge. It just makes the gap between "data relationships we understand" and "data relationships we need" more visible, and more consequential.

Problem 3: Unstructured Data Is an Even Bigger Mess

Structured data - the records in your CRM, the tickets in your support tool - at least has a schema. There is a defined shape to the mess. Unstructured data has no such consolation.

Every Google Doc, Word file, Notion page, and Dropbox Paper in your enterprise was created to serve a specific moment: closing a deal, securing budget, writing an account plan, documenting a decision. Each one captured part of the story - some more than others, most less than people think. Then the moment passed, the document was filed, and nobody updated it.

IBM research estimates that 80% of all enterprise data is dark - collected, stored, and never actively used. Splunk's research found that at least 55% of enterprise data, on average, is dark, with a third of organizations reporting that more than 75% of their data falls into this category. Most enterprise documents contain somewhere between 10% and 70% of the full story. Of what they do capture, a significant portion is now incorrect or irrelevant.

The result is a vast, haphazardly correct content layer - partial truths, outdated context, and superseded decisions, all sitting in your file system waiting to be retrieved.

When AI ingests this corpus, it does not apply judgement about what is current. It treats a strategy document from three years ago with the same weight as a live account plan. It synthesizes from sources that no longer reflect reality and delivers the output with the same confidence it would apply to accurate information.

Unlike structured records, documents have no schema, no "last validated" timestamp, and no field-level integrity check. There is no equivalent of a database constraint to catch a value that has drifted out of range. The document simply sits there, aging quietly, until an AI retrieves it and treats it as truth.
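One way to see what is missing is to imagine the check that would exist if documents behaved like database rows. The sketch below is that imaginary check - a freshness gate a retrieval pipeline might apply before handing a document to a model. The threshold and the use of modification time as a proxy are our assumptions; nothing like this lives in the documents themselves.

```python
# Illustrative freshness gate for retrieved documents. Documents carry no
# "last validated" field - that absence is the point - so this sketch falls
# back on modification time as a weak proxy for validity.
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=365)   # assumed staleness threshold
now = datetime(2026, 1, 15)

documents = [  # hypothetical retrieval candidates
    {"title": "FY22 strategy deck", "modified": datetime(2022, 3, 1)},
    {"title": "Live account plan",  "modified": datetime(2025, 11, 2)},
]

for doc in documents:
    age = now - doc["modified"]
    verdict = "pass to model" if age <= MAX_AGE else "flag as stale"
    print(f'{doc["title"]}: {verdict} (age {age.days} days)')
```

Even this crude gate would stop a three-year-old strategy deck from being weighted like a live plan. But modification time is a poor proxy for truth - a stale document can be freshly touched - which is why the problem is structural rather than a missing feature.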

Problem 4: The Governance Crisis Beneath the Surface

Even if your data were clean, complete, and well-connected, there is a problem that would still stop enterprise AI from working safely at scale: you cannot give an AI access to everything.

Ask an AI agent to research a customer and it will need to pull information across multiple systems. But which systems? Which records? Pulling everything raises an immediate and serious question - does the person asking have the right to see all the information the AI is drawing on?

In most enterprise deployments, the answer is: sometimes yes, often no, and nobody has checked.

Enterprise data access is governed by role-based permissions, team boundaries, and compliance policies - controls that exist for legal, ethical, and regulatory reasons. An employee in sales should not see HR records. A customer-facing rep often cannot see support data about their own customers. These boundaries are carefully maintained in individual systems. They do not automatically transfer to an AI layer sitting on top of those systems.

The governance violation is invisible. A user receives an answer. They have no way of knowing which data sources generated it, or whether they were permitted to see those sources.

This is not a theoretical risk. One of our customers experienced a serious data leak in which an employee about to be placed on a performance improvement plan was inadvertently exposed to confidential HR information about their own case - surfaced through an LLM tool that had been connected to internal systems without sufficient permission scoping. The employee saw data they had no right to see. The organization had no record of it happening.

The Samsung case, which became public in 2023, illustrates the same risk from a different angle: engineers entering source code and internal meeting notes into ChatGPT while debugging, inadvertently exporting proprietary data outside the organization. A 2024 report found that 20% of UK companies had already experienced sensitive corporate data exposure through employee use of generative AI.

Skyflow has articulated the structural reason this is hard: unlike traditional software, where you can control data access by restricting which rows and columns a user sees, an LLM has no rows and columns to manage. The model doesn't expose the data it drew on to produce an answer. It just produces an answer. Enforcing access control at the output layer is fundamentally different - and much harder - than enforcing it at the data layer.

The implication is direct: you cannot simply connect your enterprise tools to an LLM and assume that existing access controls will be respected. Data governance must be a first-class architectural concern, built into the system from the start - not retrofitted after something goes wrong.
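What "first-class architectural concern" looks like in practice is enforcement at the data layer, before the model sees anything. Here is a minimal sketch of the shape of that idea - filtering retrieval candidates against the asking user's entitlements - with roles, sources, and ACLs that are entirely hypothetical.

```python
# Minimal sketch of permission-scoped retrieval: filter candidate sources
# against the requesting user's entitlements BEFORE building the LLM context.
# Roles, documents, and ACL tags are hypothetical.

ACL = {
    "sales_rep":  {"crm", "tickets"},
    "hr_partner": {"crm", "hr"},
}

candidates = [
    {"source": "crm",     "text": "Renewal due Q3; champion changed roles."},
    {"source": "hr",      "text": "Employee case notes (restricted)."},
    {"source": "tickets", "text": "Two open P1 escalations this month."},
]

def build_context(role: str, docs: list) -> list:
    """Return only the sources this role may see, keeping the source label
    so the eventual answer stays auditable."""
    allowed = ACL.get(role, set())
    return [d for d in docs if d["source"] in allowed]

print([d["source"] for d in build_context("sales_rep", candidates)])
# ['crm', 'tickets'] - the HR source never reaches the model
```

Filtering at this layer also produces the thing the incident above lacked: a record of exactly which sources fed which answer.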

This governance problem also compounds the consistency problem. Two people asking the same question will get different answers not just because of hallucination or framing, but because they have access to different data. The AI is synthesizing correctly from different inputs - but the organization experiences it as unreliability. Because it is.

Problem 5: Business Processes Are Ad Hoc and Undefined

Software engineering has established, repeatable processes. There are release cycles, code review workflows, incident response protocols, and deployment pipelines - refined over decades. AI can slot into them because the structure was already there.

Business processes are different in kind, not just in degree. They are often informal, partially documented, and riddled with exceptions that get resolved through human collaboration rather than system logic. The happy path is defined. Everything else is handled by a Slack message, a phone call, or institutional knowledge held by one person who has been at the company long enough to know how things actually work.

This is not a failure of business operations. It is a reflection of the genuine complexity of running a company. Customers are not uniform. Deals are not identical. Processes that work for an enterprise account in one region don't apply to an SMB in another. The variation is not a bug - it is how the business adapts to reality.

This is why enterprise collaboration tools disrupted workflow automation. Slack didn't win because it systematized processes better than BPM tools. It won because it made it easier for people to collaborate around the exceptions - and businesses run on exceptions.

AI is now entering this space, and in doing so, it is walking into one of the historically hardest problems in enterprise software: business process management and workflow automation. BPM vendors have spent decades attempting to codify, automate, and standardize business processes, with mixed results and enormous implementation costs. The challenge is not technical. It is that the processes themselves resist full codification.

The fact that OpenAI is now deploying a large team of enterprise sales and implementation staff to help customers build AI-powered workflows is instructive. It reflects an acknowledgement that this is not a software problem you can solve by shipping a product. It requires deep, sustained engagement with how each organization actually operates - because every organization operates differently.

Even with fully capable agents, the effort required to discover, document, and reliably automate a single business process - even a relatively stable one - is substantial. Multiply that across the processes of an enterprise, and the scope of the undertaking becomes clear. This is not an argument against pursuing it. It is an argument for being honest about what it costs.


Why the Customer Is the Right Place to Start - and Why the Old Approach Falls Short

Given these five structural problems, the question is not whether enterprise AI is hard. Clearly it is. The question is where to start.

We focused on the customer relationship because it represents the highest-value problem - and because it exposes all five challenges in concentrated form. Every complexity described above is present in the customer context: fragmented CRM data, unknown cross-system relationships, outdated documents, governance risks, and ad hoc processes.

The conventional answer has been the Customer 360: a centralized data warehouse containing enriched, unified master data about every customer. It is an appealing concept. In practice, it has significant limits.

The traditional Customer 360 is fundamentally an audit trail. It records what happened - deal stages, support tickets, renewal dates, usage metrics. It is an operational view of what was delivered and when, not a picture of how the customer relationship is actually doing, or where it is headed.

More fundamentally, these systems are lossy. They take the richness of every human interaction with a customer - the nuance of a difficult renewal conversation, the context of an escalation, the relationship dynamics that turned a risk account into a champion - and compress it into forms and fields. What survives the compression is the transactional skeleton: a closed date, a stage change, a ticket status. The reasoning, the relationship, the risk signals - almost none of it makes it through.

The collaboration and context that surrounds those records is siloed - scattered across email threads, Slack messages, meeting notes, and the memory of whoever was in the room. It is rarely captured, and almost never structured. The result is that the data you have about a customer is not just incomplete. It is a pale, lossy shadow of the actual relationship.

As an industry, we took the richness of the human experience of every customer interaction and squeezed it into forms and fields - with some very siloed collaboration bolted on the side. This is not a criticism of CRM vendors. It reflects the inherent limitations of trying to represent a complex, ongoing human relationship in a structured database. The problem is not that the tools are bad. The problem is that the model is wrong.

A Different Approach: The Customer Fabric

The customer fabric is not a better Customer 360. It is a different concept - one designed to address the structural problems described in this post rather than work around them. It has three core components.

Component 1: Continuous Context Ingestion

The first instinct when deploying enterprise AI is to connect everything: ingest all the data, give the model access to every system. This is the wrong approach - for data quality reasons, for governance reasons, and for practical ones. As noted, the vast majority of enterprise data is archival. It is not in active use. Ingesting it adds noise, not signal.

The customer fabric takes a different approach: it ingests intentional data - the records, files, and signals that are actively driving outcomes. The deals in motion. The customer success plans in use. The support issues being actively worked. The accounts being renewed.

Rather than trying to infer data quality from the database schema - which is what MDM tools do - we assess quality by intention. Data that is actively being used to drive work is, almost by definition, more relevant and more current than data sitting in archival storage. We use that as the signal.
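As a minimal sketch of what quality-by-intention could mean mechanically - with activity signals and a threshold that are our illustrative assumptions, not a description of the actual system:

```python
# Illustrative "intentional data" filter: ingest only records that show signs
# of active use. The signals and the 30-day window are assumptions.
from datetime import datetime, timedelta

ACTIVE_WINDOW = timedelta(days=30)
now = datetime(2026, 1, 15)

records = [  # hypothetical rows from a CRM export
    {"id": "opp-1", "status": "open",   "last_touched": datetime(2026, 1, 10)},
    {"id": "opp-2", "status": "closed", "last_touched": datetime(2023, 6, 2)},
    {"id": "csp-7", "status": "open",   "last_touched": datetime(2025, 12, 28)},
]

def is_intentional(record: dict) -> bool:
    """A record counts as intentional if it is open AND recently touched."""
    return (record["status"] == "open"
            and now - record["last_touched"] <= ACTIVE_WINDOW)

print([r["id"] for r in records if is_intentional(r)])
# ['opp-1', 'csp-7'] - the archival row never enters the system
```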

This approach sidesteps both the data quality problem and the context collapse problem. By ingesting less - but the right less - the system maintains a current, accurate, and actionable picture of what is actually happening with customers, without attempting to process the entirety of an enterprise's data estate.

Component 2: Graph, Decision Traces, and Temporal Bias

The second component addresses the relationships problem - but from a different angle than traditional data integration.

Rather than going to the database to understand how data relates, we go to the humans. We look at how people in the business actually interact with data in the course of real work - which records they pull together, which systems they consult when making a decision, which combinations of information they find meaningful. That behavioral signal reveals relationships that no data schema can surface.
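A minimal sketch of what that behavioral signal might look like mechanically: records opened together in a working session become edges in a graph, and repeated co-access strengthens them. The session log and the weighting scheme are hypothetical.

```python
# Toy behavioral graph: records touched in the same working session get an
# edge; co-access across sessions strengthens it. Sessions are hypothetical.
from collections import Counter
from itertools import combinations

sessions = [  # records one person touched while doing one piece of work
    ["sf:acct-001A", "zd:org-9001", "jira:SHIP-42"],
    ["sf:acct-001A", "zd:org-9001"],
    ["sf:acct-001B", "jira:SHIP-77"],
]

edges = Counter()
for session in sessions:
    for a, b in combinations(sorted(set(session)), 2):
        edges[(a, b)] += 1  # repeated co-access = stronger inferred relationship

for (a, b), weight in edges.most_common():
    print(f"{a} <-> {b}  weight={weight}")
```

The strongest edge here - the Salesforce account and the Zendesk organization - is exactly the relationship no schema ever declared.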

As work happens, the system captures decision traces - the patterns of how people navigate information, make judgements, and handle exceptions. These traces are codified into memory and applied to future interactions, with recent traces weighted more heavily than older ones - the temporal bias in this component's name. Over time, the AI develops a model of how work actually gets done, at the level of the individual user, the team, and the organization.

This means the system does not require business processes to be pre-defined. It learns them by observation. Processes that are ad hoc, regional, or individual-specific are captured as they occur - not imposed in advance through rigid workflow automation. The result is an AI that feels like it understands how you work, because it has learned from watching you do it.

Component 3: Closed-Loop Tool Understanding

The third component addresses how work actually gets done - not how tools say it should be done.

The emerging standard for connecting AI to enterprise tools is MCP - the Model Context Protocol. MCP answers the question: which tool can perform which function? It is a useful foundation. But for most business users, it answers the wrong question.

The right question is: which tool does this user use to perform this function? In most enterprises, the answers are specific, non-obvious, and invisible to anyone who hasn't worked there.

A concrete example. At a shipping company we work with, customer support tickets are not submitted via Zendesk's interface or API. They are submitted by internal email. An internal routing system picks up those emails and creates the Zendesk tickets automatically - applying routing logic, tagging, and priority scoring in the process. If you submit a ticket via the Zendesk MCP server, you bypass all of that. The ticket lands in the wrong queue. The routing is broken.

Another example: the account management team at the same company doesn't create Jira issues by opening Jira. They run a Salesforce Flow - a workflow built specifically for their team - which creates the Jira issue with the correct tags, assignments, and tracking fields already populated. An engineer at the same company does open Jira directly. Same action, entirely different execution path, depending entirely on who is doing it.
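The difference between the two questions can be made concrete as a routing layer keyed on who is acting, not just on what the action is. The teams and paths below loosely mirror the examples above and are illustrative only.

```python
# Illustrative execution routing: the same logical action takes a different
# path depending on who performs it. Teams and paths are hypothetical,
# loosely modeled on the examples above.

ROUTES = {
    # (team, action) -> how that team actually performs the action
    ("support", "create_ticket"):     "send internal email; router files the Zendesk ticket",
    ("account_mgmt", "create_issue"): "run the team's Salesforce Flow; Flow creates the tagged Jira issue",
    ("engineering", "create_issue"):  "open Jira directly",
}

def execution_path(team: str, action: str) -> str:
    """Resolve the learned, team-specific path, falling back to the naive
    tool-level default (what a generic MCP call would do)."""
    return ROUTES.get((team, action), f"call {action} via the tool's API")

print(execution_path("account_mgmt", "create_issue"))
print(execution_path("engineering", "create_issue"))
print(execution_path("finance", "create_issue"))   # unlearned: naive default
```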

The customer fabric learns these patterns. It builds a picture of how individuals, teams, and the organization as a whole actually interact with their tools - not how those tools' documentation says they should be used. This is what closes the loop between AI capability and real-world execution.

Conclusion

Enterprise AI is not failing because the models are bad. The models are remarkable. It is failing because the environment into which they are being deployed is structurally unprepared - and the scope of that unpreparedness is larger than most organizations have recognized.

By deploying AI into the enterprise, organizations simultaneously encounter five problems that have resisted solution for decades: the data quality crisis, the data relationships crisis, the unstructured data problem, the governance problem, and the business process problem. Each one is hard in isolation. Together, they represent an extraordinary structural challenge.

The organizations that succeed with enterprise AI will not be the ones that move fastest. They will be the ones that are most honest about what they are actually attempting - and most disciplined about building the foundations before they build the application.

That means not ingesting everything. It means understanding which data is actually in use and building from there. It means learning how processes work by watching them, rather than trying to codify them in advance. It means taking data governance seriously as an architectural constraint, not a compliance afterthought. And it means understanding how your organization actually uses its tools - not how those tools say they should be used.

The path forward is not cleaning up everything. It is understanding what matters, how it relates, and how work actually gets done.

The customer relationship is the right place to start. It is where the problems are most concentrated, the value is highest, and the consequences of getting it wrong are most visible. Solving it well - really well, not just at demo quality - requires a different architectural foundation than the industry has been building toward.

That is what the customer fabric is designed to provide. And that's why we built Noded AI.
