Policy as Code (OPA, Gatekeeper & Kyverno)

On this page

Someone shipped a privileged container
What policy as code actually means
Where the engine sits
OPA / Gatekeeper vs Kyverno
A real policy: deny privileged pods
Catch it in CI, not just at the cluster
Preventive vs detective enforcement
Common mistakes that cost hours
Takeaways
Where to go next

TL;DR

Stop trusting code review to catch unsafe manifests. Encode guardrails as policy with OPA, Gatekeeper, or Kyverno, then enforce them automatically in CI and at cluster admission, pairing preventive gates with detective audits.

Someone shipped a privileged container

On a Tuesday afternoon a developer needed their pod to mount a host path for a quick debug session. They added securityContext: { privileged: true }, opened a PR, got a thumbs-up from a busy reviewer who skimmed the diff, and merged. The manifest sailed through CI. It deployed to production. Nobody caught it.

A privileged container can escape to the host node. From there it can read other tenants' secrets, tamper with the kubelet, or pivot across the cluster. The control that was supposed to stop this, a human reading a YAML diff at 4pm, is the least reliable control you own. Reviewers get tired, files get long, and privileged: true is one word buried in a 300-line manifest.

The fix is not more discipline. It is policy as code: write the rule once, and let a machine enforce it on every change, forever, without getting tired.

Who this is for

Platform, DevOps, and security engineers who own a shared Kubernetes cluster or an IaC pipeline and are tired of guardrails that live in a wiki page nobody reads. You should be comfortable with Kubernetes manifests and YAML; Rego is explained from scratch.

What policy as code actually means

Policy as code is your security and compliance guardrails written as machine-readable rules, version-controlled, tested, and enforced automatically, so the rule is the gate, not a human's memory of the rule.

Every team already has policies: "no privileged containers," "every workload must set resource limits," "images must come from our registry," "every resource needs a cost-center label." The question is only *where those policies live*. In most orgs they live in a Confluence page and in the heads of two senior engineers. Policy as code moves them into a Git repo and into an engine that checks them on every change.

Building inspection	Policy as code
A written building code (no exposed wiring, fire exits required)	Your policy rules, version-controlled in Git
An inspector who physically blocks unsafe work from continuing	The policy engine that denies a non-compliant resource
Inspection at the permit desk, before a wall is ever built	Preventive enforcement: admission control and CI gates
A walkthrough of finished buildings to flag violations after the fact	Detective enforcement: audit scans of running resources

Policy as code is an automated building inspector wired into the construction site itself.

The inspector never has a bad day, never skims, and applies the same code to the billionaire's tower and the corner shop. That consistency, not raw intelligence, is what makes the control trustworthy.

Where the engine sits

A policy engine intercepts a change *before* it takes effect, evaluates it against your rules, and returns a single verdict: allow or deny. There are two natural places to put that gate: in your CI pipeline (before a manifest ever reaches the cluster) and at the Kubernetes admission stage (the last line of defense, catching anything that bypasses CI).

A change flows through one or more policy gates; each returns allow or deny before the resource is created.

1. A change is authored. A developer writes or edits a Kubernetes manifest or a Terraform plan and pushes it.
2. CI evaluates it first. A pipeline step runs the policies against the rendered manifests. A violation fails the build, so the change never merges. This is the fast feedback loop.
3. The API server intercepts the apply. If something reaches the cluster anyway (a kubectl from a laptop, a Helm install, a compromised pipeline), the admission webhook fires before the object is persisted.
4. The engine returns a verdict. The policy engine evaluates the request against every matching policy and answers allow or deny in milliseconds.
5. Allow creates, deny blocks. On allow, the resource is created. On deny, the user gets a clear message ("privileged containers are not allowed") and nothing is created.

OPA / Gatekeeper vs Kyverno

Two tools dominate Kubernetes policy enforcement, and the choice mostly comes down to one question: are you willing to learn a policy language? OPA (Open Policy Agent) is a general-purpose policy engine that uses a language called Rego; Gatekeeper is the project that wires OPA into Kubernetes admission control. Kyverno is Kubernetes-native and writes policies as plain YAML, with no new language to learn.

Dimension	OPA / Gatekeeper	Kyverno
Policy language	Rego (purpose-built logic language)	YAML (Kubernetes-style resources)
Kubernetes-native	Runs in k8s, but OPA itself is platform-agnostic	Built only for Kubernetes; speaks CRDs natively
Beyond admission	Validate and mutate; no resource generation	Validate, mutate, generate & image verification
Reuse outside k8s	Same Rego works for IaC, APIs, microservices	Kubernetes only
Learning curve	Steeper; Rego takes practice	Gentle; if you know manifests, you can write policy
Best when	You want one engine for k8s + Terraform + app authz	You want fast, readable, k8s-focused guardrails

Pick OPA/Gatekeeper for cross-platform reach; pick Kyverno for a YAML-native, lower-friction start.

A reasonable default

If your scope is "guardrails for our Kubernetes clusters," start with Kyverno: the YAML policies are easier for the whole team to read and contribute to. Reach for OPA/Rego when you need the *same* policy logic to also gate Terraform plans, CI, and application API authorization.

A real policy: deny privileged pods

Here is the rule that would have stopped our Tuesday incident, written as a Kyverno policy. It is just a Kubernetes resource: you kubectl apply it like anything else, and from that moment every Pod admission request is checked.

disallow-privileged.yaml

yaml

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce   # Audit while testing, Enforce to block
  background: true                    # also scan existing resources
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged containers are not allowed."
        # privileged is a container-level field, so check every container list.
        # =(...) means "if this key is present, it must match".
        pattern:
          spec:
            =(ephemeralContainers):
              - =(securityContext):
                  =(privileged): "false"
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"

The same intent in Rego (for OPA/Gatekeeper) reads as logic rather than a pattern. A violation is produced for any container that asks for privilege:

privileged.rego

rego

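package kubernetes.admission

# Gatekeeper passes the object under review as input.review.object.
# Pre-1.0 Rego syntax; OPA 1.0+ spells the head "violation contains ... if".
violation[{"msg": msg}] {
  # Walk regular, init, and ephemeral containers alike.
  field := ["containers", "initContainers", "ephemeralContainers"][_]
  container := input.review.object.spec[field][_]
  container.securityContext.privileged == true
  msg := sprintf("container %v runs privileged; not allowed", [container.name])
}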
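
On its own, a Rego file blocks nothing in the cluster. With Gatekeeper, the rule is usually wrapped in a ConstraintTemplate and switched on by a Constraint. The sketch below shows roughly what that pairing could look like, with the rule body written to walk regular, init, and ephemeral containers. The names k8sdisallowprivileged, K8sDisallowPrivileged, and disallow-privileged are illustrative assumptions, not from the original, and the Rego uses the pre-1.0 syntax that Gatekeeper templates accept.

```yaml
# ConstraintTemplate: defines a new constraint kind and the Rego behind it.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdisallowprivileged        # must be the lowercased kind below
spec:
  crd:
    spec:
      names:
        kind: K8sDisallowPrivileged  # hypothetical kind name
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdisallowprivileged

        violation[{"msg": msg}] {
          field := ["containers", "initContainers", "ephemeralContainers"][_]
          container := input.review.object.spec[field][_]
          container.securityContext.privileged == true
          msg := sprintf("container %v runs privileged; not allowed", [container.name])
        }
---
# Constraint: an instance of the kind above, scoped to Pods.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sDisallowPrivileged
metadata:
  name: disallow-privileged
spec:
  enforcementAction: deny   # dryrun records violations without blocking
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
```

Apply the template first and the constraint second, since the constraint's kind only exists once Gatekeeper has processed the template.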

The same shape covers the boring-but-critical rules too. Requiring a cost-center label is a Kyverno validate rule that asserts the key exists under metadata.labels: a two-line change to the pattern that makes every untagged resource impossible to create.
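
As a rough sketch of that label rule, the policy below requires a non-empty cost-center label on Pods. The policy and rule names are made up for illustration, the match is limited to Pods to keep the example small, and it starts in Audit mode on the assumption that you would want to see existing violations before blocking anything.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-center-label      # hypothetical name
spec:
  validationFailureAction: Audit       # report first; switch to Enforce later
  background: true
  rules:
    - name: check-cost-center          # hypothetical name
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Every Pod needs a non-empty cost-center label."
        pattern:
          metadata:
            labels:
              cost-center: "?*"        # ?* means at least one character
```

To cover more than Pods, add the other kinds you care about (Deployment, Service, and so on) to the kinds list.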

Test your policies like code

Both ecosystems have test runners, kyverno test and opa test, that take a policy plus example resources and assert which pass and which fail. A policy with no tests is a guess. Commit the policy and its test fixtures together.
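
For the Kyverno side, a test could look something like the two files below. This is a sketch that assumes a recent Kyverno CLI (older releases use a slightly different test schema), and the file name pods.yaml, the pod names, and the image references are hypothetical. The policy and rule names are the ones from disallow-privileged.yaml.

```yaml
# kyverno-test.yaml: declares which resources should pass or fail which rule.
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: disallow-privileged-tests
policies:
  - disallow-privileged.yaml
resources:
  - pods.yaml
results:
  - policy: disallow-privileged-containers
    rule: privileged-containers
    kind: Pod
    resources:
      - safe-pod
    result: pass
  - policy: disallow-privileged-containers
    rule: privileged-containers
    kind: Pod
    resources:
      - privileged-pod
    result: fail
```

```yaml
# pods.yaml: one fixture that should be allowed and one that should be denied.
apiVersion: v1
kind: Pod
metadata:
  name: safe-pod
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
---
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
    - name: debug
      image: registry.example.com/debug:1.0
      securityContext:
        privileged: true
```

With both files next to the policy, running kyverno test . in that directory compares the actual results with the expected ones and reports any mismatch.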

Catch it in CI, not just at the cluster

Admission control is your safety net, but the *fastest* feedback is in CI, before a change is ever merged. Running the same policies against rendered manifests in the pipeline means a developer sees "Privileged containers are not allowed." in their PR check, not after a failed deploy. Same rule, earlier gate.

.github/workflows/policy.yml

yaml

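name: policy-check
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # The runner image does not ship the Kyverno CLI; this assumes the
      # kyverno/action-install-cli action. Pin a version you have vetted.
      - name: Install Kyverno CLI
        uses: kyverno/action-install-cli@v0.2.0
      - name: Render manifests
        run: helm template ./chart > /tmp/rendered.yaml
      - name: Run Kyverno policies
        run: |
          # Exits nonzero when any resource fails a policy, failing the check.
          # --audit-warn=false (the default) keeps Audit-mode failures as failures.
          kyverno apply ./policies/ \
            --resource /tmp/rendered.yaml \
            --audit-warn=false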

The same idea applies to infrastructure. With OPA and Conftest you can gate a Terraform plan: export the plan as JSON with terraform show -json plan.out and let Rego deny, say, an S3 bucket without encryption or a security group open to 0.0.0.0/0. Bad infrastructure-as-code never reaches apply. This is exactly the kind of misconfiguration that a Cloud Security Posture Management (CSPM) tool would otherwise find *after* it is already running and exposed.
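
A pipeline for that Terraform gate might look roughly like the workflow below. Treat it as a sketch under several assumptions: the Terraform configuration sits at the repository root, the job already has whatever cloud credentials terraform plan needs, the Rego deny rules live in a ./policy directory, and Conftest is run from its public container image rather than installed on the runner. The workflow and job names are illustrative.

```yaml
name: terraform-policy-check
on: [pull_request]
jobs:
  plan-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false   # keep stdout clean for the JSON redirect
      - name: Render the plan as JSON
        run: |
          terraform init -input=false
          terraform plan -input=false -out=plan.out
          terraform show -json plan.out > plan.json
      - name: Evaluate the plan with Conftest
        run: |
          # Assumption: Rego deny rules live in ./policy (Conftest's default directory).
          docker run --rm -v "$PWD:/project" openpolicyagent/conftest \
            test plan.json --policy policy
```

Conftest exits nonzero when any deny rule fires, so a violating plan fails the pull request check before anyone can run apply.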

Preventive vs detective enforcement

Every control is one of two kinds, and you need both. Preventive controls stop the bad thing from ever happening: a CI gate that fails the build, or an admission webhook that denies the request. Detective controls find the bad thing that already happened: a background scan that flags running pods which violate a policy you added later.

Preventive is stronger but stricter: if your policy is wrong, you block legitimate work and people route around you. Detective is safer to roll out but reactive: by the time it alerts, the privileged container has been running for an hour. The mature pattern is to start detective, then graduate to preventive. Deploy a policy in audit mode, watch what it *would* have blocked for a week, fix the false positives, then flip it to enforce. In Kyverno that is one field: validationFailureAction: Audit becomes Enforce.

Audit is a phase, not a destination

A policy stuck in audit mode forever is theater: it generates reports nobody reads while violations keep shipping. Audit exists to build confidence so you can safely turn enforcement on. If a policy has been in audit for months, either enforce it or delete it.

Common mistakes that cost hours

1. Audit-only forever. Teams deploy in audit mode "to be safe" and never flip to enforce. The guardrail looks present but blocks nothing, so the privileged container still ships. Set a date to enforce when you deploy.
2. No policy tests. A policy is code; untested code lies. Without kyverno test / opa test fixtures you ship a regex that matches nothing, or a Rego rule that blocks every pod. Commit tests with the policy.
3. Over-strict blocking on day one. Turning on twenty enforce-mode policies at once breaks deploys org-wide and burns your credibility in an afternoon. Roll out one policy at a time, audit first.
4. No exception path. Real systems have legitimate edge cases. If the only way past a policy is to disable it, people disable it. Build a scoped, reviewed exception mechanism (a namespace label, an annotation) so the policy survives contact with reality.
5. Vague denial messages. "validation error: pattern mismatch" tells a developer nothing. A good message names the rule and the fix: "privileged containers are not allowed; remove securityContext.privileged".
6. CI-only or admission-only. CI alone misses anything applied outside the pipeline; admission alone gives slow feedback. Run the same policies in both places.
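
For the exception path in particular, one possible shape in Kyverno is an exclude block keyed on a namespace label, so an exemption is a reviewed label change rather than a disabled policy. The sketch below applies that idea to the privileged-container rule, shown with only the regular containers check for brevity. The label key policy.example.com/allow-privileged is a hypothetical name, and the approach assumes that only a small, trusted group can label namespaces.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      # Scoped exception: Pods in namespaces carrying this label are skipped.
      exclude:
        any:
          - resources:
              namespaceSelector:
                matchLabels:
                  policy.example.com/allow-privileged: "true"   # hypothetical label
      validate:
        message: "Privileged containers are not allowed; remove securityContext.privileged."
        pattern:
          spec:
            containers:
              - =(securityContext):
                  =(privileged): "false"
```

Keeping the label in Git alongside the namespace definition means every exception shows up in a diff and can be reviewed like any other change.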

Takeaways

The whole article in seven lines

Policy as code turns guardrails into version-controlled, tested rules an engine enforces automatically.
A policy engine intercepts a change and returns one verdict: allow or deny.
Gate in two places: CI for fast feedback, admission control as the last line of defense.
OPA/Gatekeeper uses Rego and reaches beyond Kubernetes; Kyverno uses YAML and is Kubernetes-native.
Preventive controls block before; detective controls find after. Start detective, then graduate to preventive.
Test every policy with example resources, and ship clear denial messages.
Avoid audit-only-forever, untested policies, and over-strict day-one blocking.

Where to go next

Policy as code is most powerful once you understand the cluster it is guarding and the broader security posture it fits into. Build the foundation, then add the guardrails.

Get hands-on with the API server and admission flow in the kubectl lab.
See how these guardrails fit a real cluster in Kubernetes in Production: Beyond the Tutorial.
Pair preventive policy with detective scanning in Cloud Security Posture Management (CSPM).
Put it all together on the DevOps Engineer career path.

Check your understanding

1. What does "policy as code" replace as the enforcement mechanism?

2. How do OPA/Gatekeeper and Kyverno differ in how you write policies?

Frequently asked questions

Why is code review not enough to catch unsafe manifests?

A human reading a YAML diff at 4pm is the least reliable control you own: reviewers get tired, files get long, and something like privileged: true is one word buried in a 300-line manifest. The fix is not more discipline; it is encoding the rule so a machine enforces it on every change.

Where does a policy engine sit in my workflow?

A policy engine intercepts a change before it takes effect, evaluates it against your rules, and returns a single verdict of allow or deny. There are two natural places for the gate: in your CI pipeline before a manifest reaches the cluster, and at the Kubernetes admission stage as the last line of defense.

Should I choose OPA/Gatekeeper or Kyverno?

The choice mostly comes down to whether you are willing to learn a policy language. OPA is a general-purpose engine that uses Rego, with Gatekeeper wiring it into Kubernetes admission control, while Kyverno is Kubernetes-native and writes policies as plain YAML.

Why is a privileged container so dangerous?

A privileged container can escape to the host node, and from there it can read other tenants' secrets, tamper with the kubelet, or pivot across the cluster. That is why a policy that denies privileged pods is a common first rule to enforce automatically.
