opinion 13 April 2026 5 min read

Five AI governance controls that actually work in production

Most AI governance frameworks die quietly on Confluence pages nobody opens. These five controls survive contact with engineering teams — because they are wired into the places engineers already live.

by Skygena Editorial

Every enterprise AI programme we’ve audited in 2026 has a governance framework. Almost none of them are actually followed. The documents are detailed, thoughtful, and written by serious people — and they live on Confluence pages that nobody opens after the quarterly review.

The issue is not that governance is unnecessary. It is that most governance frameworks are designed by people who do not ship software, for people who do, and the mismatch is terminal. Controls that require engineers to remember a separate portal, fill in a spreadsheet, or attend a monthly review will be ignored within six weeks.

Here are the five controls we see surviving. All of them share one property: they live where engineers already work — in pull requests, CI pipelines, observability dashboards, incident channels.

1. The evaluation harness as a CI gate

Every agent has a golden test set of 50–500 labelled question/answer pairs that the business owner has signed off on. The set lives in the repo. On every pull request, a CI job runs the agent against the set and blocks the merge if agreement with the labelled answers drops below the agreed threshold (typically 95%).
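A minimal sketch of such a gate, assuming the golden set is a list of records with `question` and `answer` keys and that agreement means a normalised exact match. Both are assumptions for illustration, as are the function names; a real harness may need a fuzzier scorer.

```python
from typing import Callable

THRESHOLD = 0.95  # assumed value; the article cites 95% as typical


def agreement_rate(golden: list[dict], agent: Callable[[str], str]) -> float:
    """Fraction of golden questions where the agent's answer matches the label."""
    if not golden:
        raise ValueError("golden set is empty")
    hits = sum(
        1
        for case in golden
        # Assumed scoring rule: case-insensitive exact match after trimming.
        if agent(case["question"]).strip().lower() == case["answer"].strip().lower()
    )
    return hits / len(golden)


def gate(golden: list[dict], agent: Callable[[str], str],
         threshold: float = THRESHOLD) -> int:
    """Return a process exit code: 0 lets the merge proceed, 1 blocks it."""
    rate = agreement_rate(golden, agent)
    print(f"agreement: {rate:.1%} (threshold {threshold:.0%})")
    return 0 if rate >= threshold else 1
```

In a CI job, the script would load the versioned golden set from the repo, call `gate`, and pass the result to `sys.exit`, so that a non-zero exit code fails the check and blocks the merge.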

This one control kills an entire class of silent regressions. Changes that “felt fine in my head” but actually degrade accuracy never ship, because the machine rejects them first. Engineers do not resent it because it is the same pattern they already accept for unit tests.

What to avoid: a golden set that lives in a Google Sheet that nobody maintains. Version it in git, treat it as code, let the business owner open PRs against it.

2. The append-only audit log

Every material decision the agent makes — every tool call, every generation, every escalation — is written to an append-only log with a cryptographic hash chain. The log is queryable but not mutable.
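The hash chain can be sketched as follows: each entry stores the hash of the previous entry, so changing any past record breaks every hash after it. This is an in-memory illustration only. The class name, field names, and the choice of SHA-256 are assumptions, and a production log would also need write-once storage.

```python
import copy
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def append(self, event: dict) -> str:
        """Add an event, chaining it to the hash of the previous entry."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev_hash, event)
        self.entries.append(
            {"event": copy.deepcopy(event), "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry returns False."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

Running `verify()` on a schedule, or before handing the log to an auditor, is what turns "not mutable" from a convention into something checkable.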

This control serves three masters at once:

Compliance: the log is regulator-ready without additional work.
Debugging: when an engineer needs to know what the agent did yesterday at 3am, the answer is one query away.
Tuning: weekly analysis of the log surfaces patterns humans would never see.

What to avoid: scattering logs across three different tools (application logs, CRM events, a “governance journal”). One log, canonical, append-only.

3. The one-click kill switch

Every agent in production has a documented kill switch: a single action, exposed in the same dashboard the on-call engineer already uses, that disables the agent within 60 seconds. The action is logged, auditable, reversible, and everyone on the team knows how to fire it.
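As a rough sketch, the switch can be a flag that the request path checks before every call, with each flip recorded. The names here (`KillSwitch`, `handle_request`, the fallback message) are hypothetical. In a multi-instance deployment the flag would live in a shared store that every instance re-reads well inside the 60-second window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

FALLBACK = "Agent disabled; request routed to a human."  # assumed fallback behaviour


@dataclass
class KillSwitch:
    agent_id: str
    enabled: bool = True
    history: list[dict] = field(default_factory=list)

    def _record(self, action: str, actor: str, reason: str) -> None:
        # Every flip is logged with who did it, why, and when.
        self.history.append({
            "agent_id": self.agent_id,
            "action": action,
            "actor": actor,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def kill(self, actor: str, reason: str) -> None:
        self.enabled = False
        self._record("kill", actor, reason)

    def restore(self, actor: str, reason: str) -> None:
        # Reversible: the same control re-enables the agent.
        self.enabled = True
        self._record("restore", actor, reason)


def handle_request(switch: KillSwitch, agent: Callable[[str], str], prompt: str) -> str:
    """Check the switch on every request so a kill takes effect on the next call."""
    if not switch.enabled:
        return FALLBACK
    return agent(prompt)
```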

The kill switch is almost never used. Its existence changes behaviour anyway — engineers ship more confidently when they know they can stop the bleeding instantly. Executives approve more deployments when they know the abort path is real.

What to avoid: a kill switch that requires filing a ticket with infrastructure, or that only the CTO can pull. If it is not within 30 seconds of the on-call engineer’s fingertip, it does not exist.

4. Monthly review — on the actual metrics

Each production agent gets a real monthly review, attended by the business owner, the engineering lead, and one person from risk or compliance. It runs for 45 minutes. The agenda is fixed: evaluation set performance, override rate, drift signals, incidents, and one topic of the owner’s choice.

Attendance is mandatory. Minutes are brief. Outputs are either “keep running” or a dated action item.

The review dies immediately unless all three of these conditions are met:

The metrics on the screen must be generated automatically, not manually compiled by a junior the day before.
The business owner must attend. If engineering alone is reviewing, the discussion devolves into technical detail.
The review must have the authority to actually change something. If every decision needs approval elsewhere, the review is theatre.
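The first condition, automatically generated metrics, might look like a small function run over the audit log before each review. This is a sketch under an assumed record schema: the `type` and `overridden` fields are hypothetical, and override rate is taken here to mean the share of agent decisions that a human reversed.

```python
def review_metrics(records: list[dict]) -> dict:
    """Summarise a month of log records for the review screen."""
    decisions = [r for r in records if r.get("type") == "decision"]
    overridden = sum(1 for r in decisions if r.get("overridden"))
    incidents = sum(1 for r in records if r.get("type") == "incident")
    return {
        "decisions": len(decisions),
        # Guard against an empty month so the report never divides by zero.
        "override_rate": overridden / len(decisions) if decisions else 0.0,
        "incidents": incidents,
    }
```

Because the numbers come straight from the log, nobody has to compile them by hand the day before.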

What to avoid: a “quarterly AI governance steering committee” with 15 attendees, a 40-slide deck, and no operating authority. We have never seen one of those change anything.

5. The one-page policy per agent

Each production agent has a one-page policy document. It contains: what the agent is for, what it is not for, which data it can access, what the evaluation threshold is, who the business owner is, and what the kill-switch procedure is.

One page. Not twenty.

The policy is reviewed and signed by the business owner, the engineering lead, and the data protection officer. Changes trigger a PR against the document, which needs the same sign-offs.
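Since policy changes already arrive as PRs, a CI check can confirm the page still covers the six items listed above. A minimal sketch, assuming the policy is kept as structured data that parses into a dictionary; the field names are hypothetical.

```python
# Hypothetical field names, one per item the one-page policy must contain.
REQUIRED_FIELDS = (
    "purpose",
    "out_of_scope",
    "data_access",
    "evaluation_threshold",
    "business_owner",
    "kill_switch_procedure",
)


def missing_fields(policy: dict) -> list[str]:
    """Return the required fields that are absent or left empty."""
    return [
        name
        for name in REQUIRED_FIELDS
        if policy.get(name) in (None, "", [], {})
    ]
```

A CI job would fail the PR whenever `missing_fields` returns a non-empty list. The sign-offs themselves are better enforced by the repository's required-reviewer rules than by code.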

This is the only document in the governance framework that engineers actually read. They read it because it is one page and it tells them what they need to know to do their job.

What to avoid: a 50-page “AI System Documentation” template borrowed from the Big Four. Nobody reads it. Nobody updates it. It does not help anyone.

What is missing from this list

Notably absent: an AI governance platform, an AI risk committee, a written framework, a maturity model, a RACI matrix. These artifacts are not wrong — they just are not controls. Controls are things that physically prevent bad outcomes. A written framework prevents nothing on its own; it requires the five things above to make it real.

The pattern

Look at the five controls together. Every single one of them is wired into the places engineers already spend their day: pull requests, CI pipelines, dashboards, paging systems. None of them requires engineers to open a new tool, log into a new portal, remember a new process, or attend a new meeting without a clear outcome.

That is the entire secret of AI governance that actually works. Meet the engineers where they are, give them controls that fit their existing workflow, and hold them to metrics that are generated automatically.

Everything else is theatre.

If you are building or rebuilding an AI governance programme and want a practitioner perspective, we do one-day audits — write to [email protected].
