Stop relying on humans to catch bad manifests in code review. Encode your security and compliance guardrails as policy, and let an engine block unsafe changes automatically, at admission and in CI.
On a Tuesday afternoon a developer needed their pod to mount a host path for a quick debug session. They added `securityContext: { privileged: true }`, opened a PR, got a thumbs-up from a busy reviewer who skimmed the diff, and merged. The manifest sailed through CI. It deployed to production. Nobody caught it.
A privileged container can escape to the host node. From there it can read other tenants' secrets, tamper with the kubelet, or pivot across the cluster. The control that was supposed to stop this, a human reading a YAML diff at 4pm, is the least reliable control you own. Reviewers get tired, files get long, and `privileged: true` is one line buried in a 300-line manifest.
The fix is not more discipline. It is policy as code: write the rule once, and let a machine enforce it on every change, forever, without getting tired.
Who this is for
Platform, DevOps, and security engineers who own a shared Kubernetes cluster or an IaC pipeline and are tired of guardrails that live in a wiki page nobody reads. You should be comfortable with Kubernetes manifests and YAML; Rego is explained from scratch.
What policy as code actually means
Policy as code is your security and compliance guardrails written as machine-readable rules, version-controlled, tested, and enforced automatically, so the rule is the gate, not a human's memory of the rule.
Every team already has policies: "no privileged containers," "every workload must set resource limits," "images must come from our registry," "every resource needs a cost-center label." The question is only *where those policies live*. In most orgs they live in a Confluence page and in the heads of two senior engineers. Policy as code moves them into a Git repo and into an engine that checks them on every change.
Policy as code is an automated building inspector wired into the construction site itself.

| Building inspection | Policy as code |
| --- | --- |
| A written building code (no exposed wiring, fire exits required) | Your policy rules, version-controlled in Git |
| An inspector who physically blocks unsafe work from continuing | The policy engine that denies a non-compliant resource |
| Inspection at the permit desk, before a wall is ever built | Preventive enforcement: admission control and CI gates |
| A walkthrough of finished buildings to flag violations after the fact | Detective enforcement: audit scans of running resources |
The inspector never has a bad day, never skims, and applies the same code to the billionaire's tower and the corner shop. That consistency, not raw intelligence, is what makes the control trustworthy.
Where the engine sits
A policy engine intercepts a change *before* it takes effect, evaluates it against your rules, and returns a single verdict: allow or deny. There are two natural places to put that gate: in your CI pipeline (before a manifest ever reaches the cluster) and at the Kubernetes admission stage (the last line of defense, catching anything that bypasses CI).
A change flows through one or more policy gates; each returns allow or deny before the resource is created.
1. A change is authored. A developer writes or edits a Kubernetes manifest or Terraform configuration and pushes it.
2. CI evaluates it first. A pipeline step runs the policies against the rendered manifests. A violation fails the build, so the change never merges. This is the fast feedback loop.
3. The API server intercepts the apply. If something reaches the cluster anyway (a `kubectl apply` from a laptop, a Helm install, a compromised pipeline), the admission webhook fires before the object is persisted.
4. The engine returns a verdict. The policy engine evaluates the request against every matching policy and answers allow or deny in milliseconds.
5. Allow creates, deny blocks. On allow, the resource is created. On deny, the user gets a clear message, `privileged containers are not allowed`, and nothing is created.
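The verdict step above can be sketched in a few lines of Python. This is only an illustration of the allow-or-deny shape, not how Gatekeeper or Kyverno is implemented: `Verdict`, `evaluate`, and the rule signature are hypothetical names, and the request is assumed to be an already-parsed Pod manifest.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A rule inspects one resource and returns a denial message, or None if it passes.
Rule = Callable[[dict], Optional[str]]


@dataclass
class Verdict:
    allowed: bool
    messages: List[str] = field(default_factory=list)


def no_privileged_containers(resource: dict) -> Optional[str]:
    """Deny any Pod with a container that requests privileged mode."""
    if resource.get("kind") != "Pod":
        return None
    spec = resource.get("spec") or {}
    for key in ("containers", "initContainers", "ephemeralContainers"):
        for container in spec.get(key) or []:
            security = container.get("securityContext") or {}
            if security.get("privileged") is True:
                return "privileged containers are not allowed"
    return None


def evaluate(resource: dict, rules: List[Rule]) -> Verdict:
    """Run every rule; a single denial message is enough to deny."""
    messages = [msg for rule in rules if (msg := rule(resource)) is not None]
    return Verdict(allowed=not messages, messages=messages)


pod = {
    "kind": "Pod",
    "metadata": {"name": "debug"},
    "spec": {"containers": [{"name": "app", "securityContext": {"privileged": True}}]},
}
verdict = evaluate(pod, [no_privileged_containers])
print(verdict.allowed, verdict.messages)
```

The same `evaluate` call could sit behind a CI step or an admission webhook; only the place it is invoked from changes, which is the point of the two-gate design.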
OPA / Gatekeeper vs Kyverno
Two tools dominate Kubernetes policy enforcement, and the choice mostly comes down to one question: are you willing to learn a policy language? OPA (Open Policy Agent) is a general-purpose policy engine that uses a language called Rego; Gatekeeper is the project that wires OPA into Kubernetes admission control. Kyverno is Kubernetes-native and writes policies as plain YAML, no new language to learn.
| Dimension | OPA / Gatekeeper | Kyverno |
| --- | --- | --- |
| Policy language | Rego (purpose-built logic language) | YAML (Kubernetes-style resources) |
| Kubernetes-native | Runs in k8s, but OPA itself is platform-agnostic | Built only for Kubernetes; speaks CRDs natively |
| Capabilities | Validate and mutate | Validate, mutate, generate & image verification |
| Reuse outside k8s | Same Rego works for IaC, APIs, microservices | Kubernetes only |
| Learning curve | Steeper; Rego takes practice | Gentle; if you know manifests, you can write policy |
| Best when | You want one engine for k8s + Terraform + app authz | You want fast, readable, k8s-focused guardrails |
Pick OPA/Gatekeeper for cross-platform reach; pick Kyverno for a YAML-native, lower-friction start.
A reasonable default
If your scope is "guardrails for our Kubernetes clusters," start with Kyverno: the YAML policies are easier for the whole team to read and contribute to. Reach for OPA/Rego when you need the *same* policy logic to also gate Terraform plans, CI, and application API authorization.
A real policy: deny privileged pods
Here is the rule that would have stopped our Tuesday incident, written as a Kyverno policy. It is just a Kubernetes resource: you `kubectl apply` it like anything else, and from that moment every Pod admission request is checked.
disallow-privileged.yaml

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce  # Audit while testing, Enforce to block
  background: true                  # also scan existing resources
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            # privileged is a container-level field, so check every container list
            =(ephemeralContainers):
              - =(securityContext):
                  =(privileged): "false"
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"
```
The same intent in Rego (for OPA/Gatekeeper) reads as logic rather than a pattern: the rule iterates over the pod's containers and produces a violation for any container whose `securityContext.privileged` is true.
The same shape covers the boring-but-critical rules too. Requiring a `cost-center` label is a Kyverno `validate` rule that asserts the key exists under `metadata.labels`, a small change to the pattern that makes every untagged resource impossible to create.
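To make the label rule concrete, here is a rough Python sketch of the check it performs. The function name `require_label` and the message wording are illustrative assumptions; a real Kyverno policy would express this as a pattern under `metadata.labels` rather than as code.

```python
from typing import Optional


def require_label(resource: dict, key: str) -> Optional[str]:
    """Return a denial message when metadata.labels has no non-empty value for key."""
    metadata = resource.get("metadata") or {}
    labels = metadata.get("labels") or {}
    if not str(labels.get(key) or "").strip():
        name = metadata.get("name", "<unnamed>")
        return f"{name}: label '{key}' is required; add metadata.labels.{key}"
    return None


tagged = {"metadata": {"name": "api", "labels": {"cost-center": "platform"}}}
untagged = {"metadata": {"name": "worker"}}

print(require_label(tagged, "cost-center"))
print(require_label(untagged, "cost-center"))
```

Note that the sketch treats a missing `labels` map, a missing key, and an empty value the same way, since all three leave the resource untagged in practice.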
Test your policies like code
Both ecosystems have test runners, `kyverno test` and `opa test`, that take a policy plus example resources and assert which pass and which fail. A policy with no tests is a guess. Commit the policy and its test fixtures together.
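The idea behind those test runners, example resources paired with an expected result, can be sketched as a small table-driven check. This is not the `kyverno test` or `opa test` file format; `FIXTURES`, `run_fixtures`, and `deny_privileged` are hypothetical names used only to show the shape.

```python
from typing import Callable, List, Optional, Tuple

Rule = Callable[[dict], Optional[str]]


def deny_privileged(pod: dict) -> Optional[str]:
    """The policy under test: reject containers that set privileged: true."""
    for container in (pod.get("spec") or {}).get("containers") or []:
        if (container.get("securityContext") or {}).get("privileged") is True:
            return "privileged containers are not allowed"
    return None


# Each fixture pairs an example resource with the result the policy must produce.
FIXTURES: List[Tuple[str, dict, str]] = [
    ("plain pod", {"spec": {"containers": [{"name": "app"}]}}, "pass"),
    ("explicit false", {"spec": {"containers": [
        {"name": "app", "securityContext": {"privileged": False}}]}}, "pass"),
    ("privileged pod", {"spec": {"containers": [
        {"name": "dbg", "securityContext": {"privileged": True}}]}}, "fail"),
]


def run_fixtures(rule: Rule, fixtures: List[Tuple[str, dict, str]]) -> List[str]:
    """Return the names of fixtures whose actual result differs from the expected one."""
    mismatches = []
    for name, resource, expected in fixtures:
        actual = "fail" if rule(resource) is not None else "pass"
        if actual != expected:
            mismatches.append(name)
    return mismatches


print(run_fixtures(deny_privileged, FIXTURES))
```

Including fixtures that must pass matters as much as the ones that must fail: a rule that accidentally blocks every pod would satisfy a fail-only test suite.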
Catch it in CI, not just at the cluster
Admission control is your safety net, but the *fastest* feedback is in CI, before a change is ever merged. Running the same policies against rendered manifests in the pipeline means a developer sees `privileged containers are not allowed` in their PR check, not after a failed deploy. Same rule, earlier gate.
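A CI gate of this kind boils down to "scan the rendered manifests, print violations, exit non-zero". The sketch below assumes the manifests are already parsed into dictionaries; `POD_SPEC_PATHS`, `violations`, and `ci_gate` are hypothetical names, and in practice the pipeline step would more likely call the engine's own CLI, such as `kyverno apply` or `conftest test`.

```python
from typing import Iterable, List

# Where each workload kind keeps its pod spec. Only a few kinds are covered here.
POD_SPEC_PATHS = {
    "Pod": ("spec",),
    "Deployment": ("spec", "template", "spec"),
    "StatefulSet": ("spec", "template", "spec"),
    "DaemonSet": ("spec", "template", "spec"),
    "Job": ("spec", "template", "spec"),
    "CronJob": ("spec", "jobTemplate", "spec", "template", "spec"),
}


def pod_spec(manifest: dict) -> dict:
    """Walk to the pod spec for a known kind; return {} for anything else."""
    path = POD_SPEC_PATHS.get(manifest.get("kind", ""))
    if path is None:
        return {}
    node = manifest
    for key in path:
        node = node.get(key) or {}
    return node


def violations(manifests: Iterable[dict]) -> List[str]:
    """Collect one message per privileged container found in the manifests."""
    found = []
    for manifest in manifests:
        name = (manifest.get("metadata") or {}).get("name", "<unnamed>")
        for container in pod_spec(manifest).get("containers") or []:
            if (container.get("securityContext") or {}).get("privileged") is True:
                found.append(
                    f"{manifest.get('kind')}/{name}: privileged containers are not allowed"
                )
    return found


def ci_gate(manifests: Iterable[dict]) -> int:
    """Print each violation and return a process exit code (non-zero fails the build)."""
    found = violations(manifests)
    for line in found:
        print(line)
    return 1 if found else 0


rendered = [
    {"kind": "Deployment", "metadata": {"name": "web"},
     "spec": {"template": {"spec": {"containers": [
         {"name": "web", "securityContext": {"privileged": True}}]}}}},
    {"kind": "Service", "metadata": {"name": "web"}, "spec": {}},
]
exit_code = ci_gate(rendered)
```

A pipeline script would finish with `sys.exit(ci_gate(manifests))`. The lookup table is there because CI usually sees Deployments and other controllers rather than bare Pods, so a check that only looks at `spec.containers` would miss most real workloads.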
The same idea applies to infrastructure. With OPA and Conftest you can gate a Terraform plan: export it with `terraform show -json plan.out` and let Rego deny, say, an S3 bucket without encryption or a security group open to 0.0.0.0/0. Bad infrastructure-as-code never reaches apply. This is exactly the kind of misconfiguration that a Cloud Security Posture Management (CSPM) tool would otherwise find *after* it is already running and exposed.
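As a rough illustration of what such a plan check looks at, the Python sketch below scans plan JSON for security groups with ingress open to the world. It assumes the `resource_changes` and `change.after` layout of Terraform's JSON plan output and the inline `ingress` blocks of the AWS provider's `aws_security_group` resource; rules declared through separate rule resources are not covered, and `open_ingress_violations` is a hypothetical name.

```python
import json
from typing import List

OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_ingress_violations(plan: dict) -> List[str]:
    """List security groups in the plan whose ingress allows any source address."""
    found = []
    for change in plan.get("resource_changes") or []:
        if change.get("type") != "aws_security_group":
            continue
        # "after" is null for resources being destroyed, hence the fallback.
        after = (change.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            cidrs = set(rule.get("cidr_blocks") or [])
            cidrs |= set(rule.get("ipv6_cidr_blocks") or [])
            if cidrs & OPEN_CIDRS:
                found.append(f"{change.get('address')}: ingress open to the internet")
                break  # one message per security group is enough
    return found


# A trimmed stand-in for the output of `terraform show -json plan.out`.
plan_json = """
{
  "resource_changes": [
    {"address": "aws_security_group.web", "type": "aws_security_group",
     "change": {"actions": ["create"],
                "after": {"ingress": [{"from_port": 22, "to_port": 22,
                                       "cidr_blocks": ["0.0.0.0/0"]}]}}},
    {"address": "aws_security_group.db", "type": "aws_security_group",
     "change": {"actions": ["create"],
                "after": {"ingress": [{"from_port": 5432, "to_port": 5432,
                                       "cidr_blocks": ["10.0.0.0/16"]}]}}}
  ]
}
"""
print(open_ingress_violations(json.loads(plan_json)))
```

With Conftest the same logic would live in a Rego `deny` rule evaluated against the plan JSON; the Python version is only meant to show which fields the rule would inspect.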
Preventive vs detective enforcement
Every control is one of two kinds, and you need both. Preventive controls stop the bad thing from ever happening: a CI gate that fails the build, an admission webhook that denies the request. Detective controls find the bad thing that already happened: a background scan that flags running pods which violate a policy you added later.
Preventive is stronger but stricter: if your policy is wrong, you block legitimate work and people route around you. Detective is safer to roll out but reactive: by the time it alerts, the privileged container has been running for an hour. The mature pattern is to start detective, then graduate to preventive. Deploy a policy in audit mode, watch what it *would* have blocked for a week, fix the false positives, then flip it to enforce. In Kyverno that is one field: `validationFailureAction: Audit` becomes `Enforce`.
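The audit-then-enforce rollout can be pictured as one rule with a mode switch. In this hedged Python sketch the `Policy` class and its `admit` method are hypothetical; the two mode strings simply mirror the `Audit` and `Enforce` values named above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Policy:
    name: str
    mode: str  # "Audit" or "Enforce"
    report: List[str] = field(default_factory=list)

    def check(self, pod: dict) -> Optional[str]:
        """Return a violation message for a privileged container, else None."""
        for container in (pod.get("spec") or {}).get("containers") or []:
            if (container.get("securityContext") or {}).get("privileged") is True:
                return "privileged containers are not allowed"
        return None

    def admit(self, pod: dict) -> bool:
        """Return True if the pod may be created under the current mode."""
        message = self.check(pod)
        if message is None:
            return True
        pod_name = (pod.get("metadata") or {}).get("name", "<unnamed>")
        self.report.append(f"{self.name}: {pod_name}: {message}")
        # Audit records the violation but lets the pod through; Enforce blocks it.
        return self.mode != "Enforce"


policy = Policy(name="disallow-privileged-containers", mode="Audit")
bad_pod = {
    "metadata": {"name": "debug"},
    "spec": {"containers": [{"name": "dbg", "securityContext": {"privileged": True}}]},
}

print(policy.admit(bad_pod))  # audit phase: recorded, not blocked
policy.mode = "Enforce"
print(policy.admit(bad_pod))  # enforce phase: blocked
print(len(policy.report))
```

The report collected during the audit phase is what you review for false positives before flipping the mode; the rule itself does not change between phases.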
Audit is a phase, not a destination
A policy stuck in audit mode forever is theater: it generates reports nobody reads while violations keep shipping. Audit exists to build confidence so you can safely turn enforcement on. If a policy has been in audit for months, either enforce it or delete it.
Common mistakes that cost hours
Audit-only forever. Teams deploy in audit mode "to be safe" and never flip to enforce. The guardrail looks present but blocks nothing, so the privileged container still ships. Set a date to enforce when you deploy.
No policy tests. A policy is code; untested code lies. Without kyverno test / opa test fixtures you ship a regex that matches nothing, or a Rego rule that blocks every pod. Commit tests with the policy.
Over-strict blocking on day one. Turning on twenty enforce-mode policies at once breaks deploys org-wide and burns your credibility in an afternoon. Roll out one policy at a time, audit first.
No exception path. Real systems have legitimate edge cases. If the only way past a policy is to disable it, people disable it. Build a scoped, reviewed exception mechanism (a namespace label, an annotation) so the policy survives contact with reality.
Vague denial messages. `validation error: pattern mismatch` tells a developer nothing. A good message names the rule and the fix: `privileged containers are not allowed; remove securityContext.privileged`.
CI-only or admission-only. CI alone misses anything applied outside the pipeline; admission alone gives slow feedback. Run the same policies in both places.
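The scoped exception path from the list above might look something like the following sketch. Everything here is an assumption for illustration: the annotation key `policy.example.com/exempt`, the comma-separated format, and the `decide` function are invented, and real engines ship their own exception mechanisms.

```python
from typing import Optional, Set

EXEMPT_ANNOTATION = "policy.example.com/exempt"  # hypothetical annotation key


def exempted_policies(namespace: dict) -> Set[str]:
    """Parse the comma-separated policy names a namespace has been exempted from."""
    annotations = (namespace.get("metadata") or {}).get("annotations") or {}
    raw = annotations.get(EXEMPT_ANNOTATION, "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def decide(policy_name: str, violation: Optional[str], namespace: dict) -> str:
    """Return 'allow', 'deny', or 'allow (exempt)' for one policy result."""
    if violation is None:
        return "allow"
    if policy_name in exempted_policies(namespace):
        return "allow (exempt)"  # kept distinct so exemptions stay visible in reports
    return "deny"


debug_ns = {"metadata": {"name": "node-debug", "annotations": {
    EXEMPT_ANNOTATION: "disallow-privileged-containers"}}}
prod_ns = {"metadata": {"name": "prod"}}
message = "privileged containers are not allowed"

print(decide("disallow-privileged-containers", message, debug_ns))
print(decide("disallow-privileged-containers", message, prod_ns))
```

The design choice worth copying is that the exemption names one policy in one namespace, so granting it never requires switching the policy off for everyone else.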
Takeaways
The whole article in six lines
Policy as code turns guardrails into version-controlled, tested rules an engine enforces automatically.
A policy engine intercepts a change and returns one verdict: allow or deny.
Gate in two places: CI for fast feedback, admission control as the last line of defense.
OPA/Gatekeeper uses Rego and reaches beyond Kubernetes; Kyverno uses YAML and is Kubernetes-native.
Test every policy with example resources, and ship clear denial messages.
Avoid audit-only-forever, untested policies, and over-strict day-one blocking.
Where to go next
Policy as code is most powerful once you understand the cluster it is guarding and the broader security posture it fits into. Build the foundation, then add the guardrails.
Get hands-on with the API server and admission flow in the kubectl lab.