There is a story the AI industry tells about itself, and there is a story the people who actually ship AI inside large companies could tell, if anyone asked them. The two stories are not the same. The gap between them is the most expensive misunderstanding in enterprise software right now.

The first story is about models. New benchmarks. New parameter counts. New capabilities that arrive every few weeks and demand a new round of "we should be using this." It is the story you read on Twitter, see on stage at conferences, and hear from vendors competing for your AI budget.

The second story is more boring. It is about a service ticket system that has been customised by seven different teams over twelve years. It is about a finance dataset where the column called customer_id means three different things depending on which year the row was written. It is about a single sign-on flow that nobody fully understands, an approval workflow that exists only in someone's head, and a vendor contract that quietly forbids the data from leaving a particular region.

The model is not the hard part. The model has never been the hard part.

Where pilots actually die

I have spent the last several years inside enterprise applications engineering. My team supports the systems that real businesses run on — Salesforce, NetSuite, Coupa, Boomi, Snowflake, Tableau, plus the long tail of internal tools nobody outside the company has ever heard of. When AI initiatives arrive in this environment, they almost always follow the same arc.

A pilot is announced. A model is selected — usually a frontier LLM, sometimes a fine-tuned open one. A demo is built. The demo is impressive. Stakeholders nod. Budget is allocated. And then, for the next four to six months, almost nothing visible happens.

What is happening, invisibly, is integration.

Someone is figuring out how to get the model access to data without violating the data residency rules. Someone is mapping fields between two systems that disagree about what a customer is. On one recent project my team spent weeks reconciling the definition of "customer" between the CRM and the billing system, only to discover each used a different identifier format and a different piece of business logic underneath. Someone is writing the retry logic for when the API times out at 3 a.m. Someone is fighting with the SSO team about service accounts. Someone is realising that the "knowledge base" the model is supposed to read is actually four knowledge bases, two of which are out of date, one of which is in a wiki nobody can edit anymore, and one of which lives in a shared drive folder that has not been organised since 2019.

By the time the integration work is done, the demo from month one looks naive. By the time it is in production, the original model has been superseded by a better one. By the time anyone measures the impact, the team has already moved on to the next pilot, with no real lessons captured from the last. On that same customer-identifier project, the unresolved gaps came back to bite us in downstream systems for months — exactly the kind of problem that a written-down post-mortem would have caught early. We didn't have one. Most teams don't.

The model gets the credit when it works. The integration gets the blame when it doesn't. Both attributions are wrong.

What the integration layer actually does

There is a useful mental model I keep coming back to. In any enterprise AI project, you can roughly split the work into three layers:

  1. The model layer. Pick the right model, prompt it well, fine-tune if needed. This is where most public discourse lives.
  2. The application layer. Wrap the model in a useful interface. A chat box, a recommendation, a triage suggestion, a summary in a ticket.
  3. The integration layer. Connect the application to the systems and data and people and policies that already exist. This is where almost all the time, money, and risk actually live.

The model layer changes every quarter. The application layer changes every year or two. The integration layer changes on the timescale of the company itself — which is to say, slowly, and only with great effort.

Most AI commentary is about the layer that is moving fastest, and therefore most photogenic. Most enterprise AI value is unlocked in the layer that is moving slowest, and therefore most invisible.

What this looks like from the inside

A few patterns I see repeatedly:

The data is not where the model expects it. The model assumes a clean, queryable corpus. Reality is a fragmented mess across SaaS systems, file shares, an old SharePooint that has been corrected to SharePoint, and a Confluence space that three teams have stopped maintaining. The pilot proves the model works on a curated sample. Production proves the curated sample was the easy 10% — and the other 90% is where the duplicate records, the gaps, the silent schema drift, and the expensive surprises live.
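One cheap defence against silent drift is to validate production records against the assumptions the pilot baked in, before they ever reach the model. A minimal sketch, with hypothetical column names and types standing in for whatever the curated sample actually assumed:

```python
# Schema the pilot implicitly assumed (hypothetical names and types).
EXPECTED = {"customer_id": str, "amount": float, "region": str}

def check_row(row: dict) -> list[str]:
    """Return a list of drift problems for one record; empty means clean."""
    problems = []
    for col, typ in EXPECTED.items():
        if col not in row:
            problems.append(f"missing column: {col}")
        elif not isinstance(row[col], typ):
            problems.append(
                f"{col}: expected {typ.__name__}, got {type(row[col]).__name__}"
            )
    for col in row:
        if col not in EXPECTED:
            problems.append(f"unexpected column: {col}")
    return problems
```

Nothing here is clever; the point is that a check like this runs in production from day one and turns the "other 90%" into a queue of named problems rather than a pile of mysterious model failures.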

The workflow does not yet exist. The model is supposed to "help with ticket triage" or "summarise customer interactions" — but the triage process and the interaction logging are themselves underdefined. People have been doing the work using tribal knowledge. The AI surfaces this gap immediately. Now you are not just shipping AI; you are also redesigning the underlying business process. Which nobody budgeted for.

The audit and compliance trail is non-trivial. Every output the model produces, and every input it sees, has to be logged in a way that survives a SOC 2 audit, a data subject request, or a regulator's question two years from now. Skip this and the cost is not theoretical — it's regulatory penalties, reputational damage, and an inability to defend what your systems did when someone asks. Unglamorous, expensive, and absolutely required. It rarely shows up in the pilot.
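The shape of that logging is not exotic, it just has to exist from the first production call. As a sketch only — field names are hypothetical, and a real system would also handle retention, access control, and PII redaction — one append-only JSON line per model interaction, with a content hash so a stored record can later be shown to be unaltered:

```python
import hashlib
import json
import time

def audit_record(model: str, prompt: str, output: str, user: str) -> str:
    """Serialise one model interaction as a JSON line for an append-only log.

    Hashing the serialised payload lets you demonstrate later that a
    stored record was not modified after the fact.
    """
    body = {
        "ts": time.time(),   # when the call happened
        "model": model,      # which model version answered
        "user": user,        # who triggered the call
        "prompt": prompt,    # exactly what the model saw
        "output": output,    # exactly what the model produced
    }
    payload = json.dumps(body, sort_keys=True)
    body["sha256"] = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps(body, sort_keys=True)
```

Writing one of these per call is trivial; retrofitting it two years later, when the regulator asks, is not.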

The humans on the other end have opinions. The agents whose tickets are being triaged, the analysts whose work is being summarised, the managers whose decisions are being recommended — they are not passive recipients. They will adopt the system, work around it, weaponise it against each other, or quietly ignore it, depending on factors that no model card will ever tell you.

What actually makes things ship

I am not arguing that models do not matter. They obviously do. A bad model produces a bad system no matter how good the integration is. But once you cross a certain capability threshold — and in 2026, most of the things enterprises actually want to do are over that threshold — the differentiator stops being the model and starts being everything around it.

The teams that ship enterprise AI well, in my experience, are not the teams with the best ML researchers. They are the teams that take the integration layer seriously as a first-class engineering problem. They map where the data actually lives and who owns it, write down the business process they are about to change, build the retry, logging, and audit plumbing before the demo rather than after, and treat the humans on the other end as stakeholders rather than recipients.

None of this is novel advice. It is the same advice you would give about any non-trivial enterprise software project. Which is the point. AI does not exempt you from the rest of software engineering. If anything, it raises the cost of skipping the parts that look boring.

Why I keep writing about this

Most of the public AI conversation happens in a register that is useless to people who actually work inside large companies. It is either too hyped — every release is a paradigm shift — or too academic to translate into a Monday morning standup. There is a missing register: practitioner-level honesty about what the work is actually like.

That is the gap this site is trying to fill. Field notes, not predictions. Observations from a place where the model is the easy part, and the integration is where careers, budgets, and quarters are quietly won and lost.

If you are working somewhere in this layer — and a lot more people are than the discourse suggests — I hope these notes are useful. Or at least, I hope they make you feel slightly less alone.