
Knowledge engineering is doing the work models cannot

The unfashionable truth about enterprise RAG: most failures happen at the retrieval layer, not the model layer. The knowledge layer is where the real engineering lives.

by Skygena Editorial

We spend a lot of time arguing about which model is better. We should be spending it on the layer underneath the model, because that is where almost every enterprise AI failure we have audited this past year actually originated.

The unglamorous truth: most enterprise RAG systems hallucinate not because the model is bad but because the retrieval layer is feeding it junk. Stale documents. The wrong document. The right document, but the wrong paragraph in the retrieved chunk. No ontology, so the model cannot disambiguate “Q3 revenue” between group, segment, and legal entity. No citation enforcement, so the model invents a number that never existed.

The model is not the problem. The model is doing exactly what it was asked to do, which is “guess plausibly given this terrible context”. The problem is the context.

What knowledge engineering actually means

In our practice it means four discrete deliverables:

1. A domain ontology. Not a database schema. A small, business-literate model of the entities, relationships and definitions the agent will reason about. We typically write it with the controlling team or the domain experts, not the engineers. It takes between two and six weeks. Skipping it is the single most common reason RAG projects fail.

2. Document understanding pipelines. Real enterprise documents are PDFs of varying quality, scans, drawings, transcripts, slides, spreadsheets exported as PDFs. Every one needs an extraction strategy. Off-the-shelf “PDF loaders” handle about 30% of what we see in the field. The rest requires domain-aware extraction.

3. Hybrid retrieval. Pure vector search is one tool. For most enterprise use cases, hybrid (vector + lexical + structured filters) outperforms it substantially, especially when the question is concrete (“what was the U-value on building 3-A in the 2024 audit?”). The right mix is not theoretical — it is empirical, measured on a golden set.

4. Citation enforcement at runtime. The agent is not allowed to answer without grounding. No source, no answer. This sounds obvious. Almost no production RAG system we have seen actually enforces it.
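The ontology in item 1 does not need to be elaborate. A minimal sketch of what “disambiguating Q3 revenue between group, segment and legal entity” can look like in code (the names, scopes and source systems here are illustrative assumptions, not a prescribed schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    name: str           # canonical business term
    scope: str          # "group" | "segment" | "legal_entity": the disambiguation axis
    definition: str     # one-line, business-authored definition
    source_system: str  # where the authoritative number lives (hypothetical names)

# A three-entry ontology is already enough to stop "Q3 revenue" from being ambiguous.
ONTOLOGY = {
    ("revenue", "group"): Metric(
        "revenue", "group",
        "Consolidated revenue across all segments, IFRS.", "group_reporting"),
    ("revenue", "segment"): Metric(
        "revenue", "segment",
        "Revenue per operating segment before eliminations.", "segment_ledger"),
    ("revenue", "legal_entity"): Metric(
        "revenue", "legal_entity",
        "Statutory revenue of a single legal entity.", "statutory_accounts"),
}

def resolve(term: str, scope: str) -> Metric:
    """Force the agent to commit to one meaning before any retrieval happens."""
    metric = ONTOLOGY.get((term, scope))
    if metric is None:
        raise KeyError(f"'{term}' is not defined for scope '{scope}'; extend the ontology.")
    return metric
```

The point is not the data structure; it is that the definitions are authored with the business, and that an unresolvable term fails loudly instead of being guessed at.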
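The hybrid mix in item 3 can be sketched in a few lines. This toy version stands in for real components (the lexical score is a crude proxy for BM25, and the 0.6/0.4 weights are placeholder assumptions you would tune on a golden set):

```python
import math
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    embedding: list[float]                    # precomputed dense vector
    meta: dict = field(default_factory=dict)  # e.g. {"year": 2024, "building": "3-A"}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def lexical_score(query: str, text: str) -> float:
    # Toy stand-in for BM25: fraction of query terms present in the document.
    q_terms = set(query.lower().split())
    return len(q_terms & set(text.lower().split())) / len(q_terms) if q_terms else 0.0

def hybrid_search(query, query_emb, docs, filters=None, w_vec=0.6, w_lex=0.4, k=3):
    # 1. Structured filters first: hard constraints, not soft scores.
    if filters:
        docs = [d for d in docs
                if all(d.meta.get(key) == val for key, val in filters.items())]
    # 2. Fuse dense and lexical scores with weights measured on a golden set.
    scored = [(w_vec * cosine(query_emb, d.embedding)
               + w_lex * lexical_score(query, d.text), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]
```

For a concrete question like the U-value example, the structured filter (year, building) does most of the work before any similarity score is even computed.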
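Item 4's “no source, no answer” rule is a small amount of code. A minimal sketch of the runtime gate, assuming the model call returns an answer plus the doc IDs it cites (the dict shapes are illustrative, not a specific framework's API):

```python
def answer_with_citations(question, retrieved, generate):
    """Refuse to answer unless every citation resolves to a retrieved source."""
    if not retrieved:
        return {"answer": None, "refusal": "No grounded source found."}
    sources = {d["doc_id"]: d["text"] for d in retrieved}
    # generate(question, sources) -> {"answer": str, "citations": [doc_id, ...]}
    draft = generate(question, sources)
    cited = set(draft.get("citations", []))
    # Hard gate: no citations, or citations to documents that were never
    # retrieved, means no answer leaves the system.
    if not cited or not cited <= set(sources):
        return {"answer": None, "refusal": "Answer lacked verifiable citations."}
    return {"answer": draft["answer"], "citations": sorted(cited)}
```

The interesting design choice is that this lives outside the prompt: asking the model to cite is a suggestion, checking its citations in code is enforcement.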

The boring observation that would save the industry millions

If you spent half your AI budget this year on the model and half on the retrieval layer, you got the ratio backwards. Spend two thirds of your budget on the knowledge layer. That is where the quality lives. The model on top is almost a commodity at this point — substitutable, cheap, and improving without your effort. The knowledge layer is where the moat is, and where the hallucinations are.

What we tell clients

Three rules of thumb we now repeat in every kickoff:

  • No agent without a knowledge layer. If we cannot find or build a grounded, traceable source for the information the agent needs, we do not build the agent. We build the knowledge layer first, then the agent.

  • No knowledge layer without an evaluator. Every retrieval call gets measured against a golden set. We do not ship a retrieval layer that has not been evaluated.

  • No evaluator without business co-authorship. If the business team has not authored the test set, the test set is wrong. Engineers do not know what the right answer is. The business does.
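The evaluator in the second rule can start as small as a recall@k loop over the business-authored golden set. A sketch, where the tuple shapes are assumptions and `retrieve` is whatever retrieval function is under test:

```python
def recall_at_k(golden_set, retrieve, k=5):
    """golden_set: list of (query, {relevant_doc_ids}); retrieve: query -> ranked doc IDs.

    Returns the fraction of queries where at least one relevant document
    appears in the top k, plus the queries that missed, for review with
    the business team.
    """
    hits, misses = 0, []
    for query, relevant in golden_set:
        if set(retrieve(query)[:k]) & relevant:
            hits += 1
        else:
            misses.append(query)
    return hits / len(golden_set), misses
```

Returning the missed queries, not just the score, is deliberate: the misses are exactly the artifact you walk through with the people who authored the test set.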

There is nothing exotic in any of this. It is the engineering discipline the field has been postponing because models are sexy and ontologies are boring. The teams that flip that ratio in 2026 will ship.

If you have a stuck RAG project and want a second opinion on the retrieval layer, we are happy to do a half-day audit — write to [email protected].

Thinking about AI in your business?

Skygena is a boutique European AI studio engineering autonomous agents and LLM products. If you're wrestling with where to start — or where to stop — we can help.

Book a 30-minute call