Configuration Management

Ansible, Chef, Puppet, and SaltStack enforce desired state across fleets of servers — eliminating configuration drift and replacing error-prone manual SSH sessions with repeatable, auditable automation.

🎯 Key Takeaways

  • Configuration management tools enforce desired state: idempotent runs converge servers to the correct state without manual intervention
  • Configuration drift (servers diverging from their intended state) is the core problem; continuous execution of playbooks/recipes prevents it
  • Push tools (Ansible) are agentless and simple; pull tools (Chef/Puppet) scale better for continuous convergence at thousands of nodes
  • Configuration management, IaC (Terraform), and containers are complementary: Terraform provisions, Ansible configures the OS, Docker runs the app
  • All server changes must go through version-controlled configuration management, with no raw SSH changes in production

What Configuration Management Solves

Before configuration management tools existed, engineers maintained servers by SSHing in and running commands manually. Every engineer had slightly different habits. Every server slowly diverged. This is called configuration drift — and it is the silent killer of production reliability.

Desired State vs Imperative Commands

Imperative: "run these commands in this order." Declarative/desired state: "this is how the machine should look." Configuration management tools take the declarative approach — you describe the end state, and the tool figures out how to get there. If the state is already correct, the tool does nothing.

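The core concept is idempotency: running a playbook or recipe once or 100 times leaves the server in the same state. You can therefore re-run configuration on every server, on a schedule, without fear of breaking things that are already correct. This holds only when tasks describe state through the tool's modules or resources; raw shell commands are re-executed on every run unless you guard them yourself.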
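
To make the difference concrete, here is a minimal sketch that writes the same line in two ways. The file paths and the setting are hypothetical and exist only for illustration. The first task is imperative and appends on every run; the second describes the desired state and does nothing once the line is present.

```yaml
# idempotency.yml: illustrative sketch; paths and the setting are hypothetical
- hosts: webservers
  tasks:
    # Imperative: the command runs on EVERY play, so the file gains a line each time
    # and the task always reports "changed".
    - name: Append setting with a shell command (not idempotent)
      ansible.builtin.shell: echo "max_connections=200" >> /tmp/demo_shell.conf

    # Declarative: the line is added only if it is missing; later runs report "ok".
    - name: Ensure setting is present (idempotent)
      ansible.builtin.lineinfile:
        path: /tmp/demo_lineinfile.conf
        line: "max_connections=200"
        create: true
```

After three runs, /tmp/demo_shell.conf holds three copies of the line while /tmp/demo_lineinfile.conf still holds one.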

What configuration management tools manage

  • Packages — Install nginx 1.24, nodejs 20, python3.11 — pinned versions, always.
  • Files — Deploy /etc/nginx/nginx.conf from a template. If someone edits it manually, the next run restores it.
  • Services — Ensure nginx is running and enabled on boot. If it crashes, the next run restarts it.
  • Users and permissions — Create deploy user, set SSH keys, restrict sudo access.
  • System settings — Set kernel parameters, configure firewalls, manage cron jobs.
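
As a rough sketch, several of the categories above might look like this in a single Ansible play. The version pin, user name, key file, kernel parameter, and script path are all illustrative assumptions, and the pinned version must actually exist in the host's package repositories. The two ansible.posix modules come from the ansible.posix collection, which is assumed to be installed.

```yaml
# baseline.yml: sketch only; names, versions, and values are illustrative
- hosts: webservers
  become: true
  tasks:
    - name: Install a pinned nginx version
      ansible.builtin.apt:
        name: "nginx=1.24.*"
        state: present
        update_cache: true

    - name: Ensure nginx is running and starts on boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

    - name: Create the deploy user
      ansible.builtin.user:
        name: deploy
        shell: /bin/bash

    - name: Authorize an SSH key for the deploy user
      ansible.posix.authorized_key:
        user: deploy
        key: "{{ lookup('file', 'files/deploy.pub') }}"  # hypothetical key file

    - name: Set a kernel parameter
      ansible.posix.sysctl:
        name: net.core.somaxconn
        value: "1024"
        state: present

    - name: Manage a cron job
      ansible.builtin.cron:
        name: rotate app logs
        minute: "0"
        hour: "3"
        job: /usr/local/bin/rotate-logs.sh  # hypothetical script
```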

The Golden Rule of Config Management

No human should ever SSH into a production server and make a change without capturing that change in configuration management. "Just this once" exceptions compound into drift. If you made a change manually, your next step is to encode it in the playbook.

Push vs Pull Models

Configuration management tools fall into two architectural camps: push (controller pushes changes to nodes) and pull (nodes pull their config from a central server). Each has trade-offs that affect how you operate at scale.

| Model | How It Works | Tools | Pros | Cons |
| --- | --- | --- | --- | --- |
| Push | Central controller SSHs into nodes and applies config on demand | Ansible | Agentless (no software on nodes), simple to start, great for ad-hoc runs | Controller must reach all nodes; slower at scale; no continuous drift correction |
| Pull | Agent on each node polls central server every N minutes and applies its config | Chef, Puppet, SaltStack (pull mode) | Scales well, continuous drift correction, nodes self-heal | Agents must be installed and maintained; more infrastructure to run |

In practice, Ansible is a common first choice because of its simplicity and agentless design, which suits teams starting out or managing heterogeneous environments. Chef and Puppet are often preferred at large enterprises (thousands of nodes) where continuous pull-based convergence and enterprise support matter. SaltStack supports both push and pull and is built for fast parallel execution across large fleets.

Ansible Playbook Example

    ---
    # nginx.yml
    - hosts: webservers
      become: true
      tasks:
        - name: Install nginx
          apt:
            name: nginx
            state: present

        - name: Deploy config
          template:
            src: nginx.conf.j2
            dest: /etc/nginx/nginx.conf
          notify: Restart nginx   # fires only when the rendered file changes

      handlers:
        - name: Restart nginx
          service:
            name: nginx
            state: restarted

Running:

    ansible-playbook nginx.yml -i inventory

This is idempotent: running it 10 times has the same effect as running it once. The handler restarts nginx only on runs where the template task actually changes the file.

Configuration Drift: The Silent Killer

Configuration drift happens when the actual state of a server diverges from its intended state. It starts small — a developer adds a debug flag, an on-call engineer installs a package, a failed deploy leaves a half-configured file. Over time, every server becomes a unique snowflake.

Signs You Have a Drift Problem

"Works on my server but not theirs." Intermittent failures that only affect some instances. "Do not restart that server — something will break." Engineers afraid to scale because new instances behave differently. These are all symptoms of unmanaged configuration drift.

Configuration management solves drift through continuous convergence: run your tool on a schedule (every 30 minutes by default with the Puppet and Chef agents, or via cron or a CI schedule with Ansible) and any drift gets corrected on the next run. Most tools also have a dry-run mode that reports divergence without changing anything, such as ansible-playbook --check or puppet agent --noop, which is useful for compliance audits.
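
One way to schedule Ansible is a cron entry on the control node, itself managed by Ansible. The sketch below assumes the playbooks live in a hypothetical /opt/infra checkout containing an inventory and a site.yml, and that the cron user already has SSH access to the fleet.

```yaml
# schedule.yml: run against the control node; all paths are hypothetical
- hosts: localhost
  connection: local
  tasks:
    - name: Re-apply the site playbook every 30 minutes
      ansible.builtin.cron:
        name: config convergence
        minute: "*/30"
        job: >-
          cd /opt/infra &&
          ansible-playbook -i inventory site.yml
          >> $HOME/ansible-converge.log 2>&1
```

Because the cron entry is itself declared as desired state, re-running this play does not create duplicate entries.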

Config management vs containers vs IaC — when to use each

  • Configuration management (Ansible/Chef/Puppet): Best for managing long-lived VMs and bare metal servers. Configures OS-level concerns (packages, services, files, users) across a fleet. Ideal when you cannot containerize everything.
  • Containers (Docker/Kubernetes): Best for stateless application workloads. Configuration is baked into the image at build time, and containers are replaced rather than modified in place, so drift has little chance to accumulate. Preferred for modern microservices.
  • Infrastructure as Code (Terraform/Pulumi): Best for provisioning infrastructure (networks, VMs, databases, cloud resources). Works at the resource level, not the OS configuration level. Complementary to config management.
  • The winning combo: Terraform provisions the VM, Ansible configures the OS and installs software, Docker runs the application. All three tools play different roles.
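
The handoff in the last bullet could be sketched as follows. It assumes the VM already exists (for example, created by Terraform), that the host is Debian or Ubuntu, and that the community.docker collection is installed on the control node. The container name, image, and port are hypothetical.

```yaml
# app-host.yml: sketch; the VM is assumed to have been provisioned already
- hosts: webservers
  become: true
  tasks:
    - name: Install Docker and its Python SDK (Debian/Ubuntu package names)
      ansible.builtin.apt:
        name:
          - docker.io
          - python3-docker
        state: present
        update_cache: true

    - name: Ensure the Docker daemon is running
      ansible.builtin.service:
        name: docker
        state: started
        enabled: true

    - name: Run the application container
      community.docker.docker_container:
        name: myapp                                # hypothetical
        image: registry.example.com/myapp:1.4.2    # hypothetical image and tag
        state: started
        restart_policy: unless-stopped
        published_ports:
          - "8080:8080"
```

Ansible owns the host here, while the application and its own configuration stay inside the image.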

Ansible in Practice: Roles, Inventory, and Vault

Beyond simple playbooks, production Ansible use involves three key concepts: inventory (which servers exist), roles (reusable bundles of tasks), and Vault (encrypted secrets).

1. Inventory: Define your servers in static files or dynamic inventory scripts (AWS, GCP plugins). Group them: [webservers], [databases], [workers]. Playbooks target groups.

2. Roles: Organize tasks into reusable roles: roles/nginx/tasks/main.yml, roles/nginx/templates/nginx.conf.j2. Roles are composable: a playbook applies multiple roles to a host.

3. Variables: Use group_vars/ and host_vars/ to customize behavior per environment. dev has debug: true, prod has debug: false. Same role, different behavior.

4. Vault: Encrypt secrets with ansible-vault encrypt_string. Store encrypted values in vars files. Run with --ask-vault-pass or --vault-password-file. Secrets in version control, safely encrypted.

5. CI integration: Run ansible-playbook --check (dry run) in CI on every PR. Run --diff to show what would change. Gate merges on successful dry runs.
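
A possible layout for the first four pieces is sketched below as four separate files shown in one block. Hostnames, the common role, and variable names are hypothetical. Instead of inline encrypt_string values, the sketch uses a variation: the secret sits in a separate vault.yml next to vars.yml, encrypted as a whole file with ansible-vault encrypt, and is referenced by name.

```yaml
# Four separate files shown together; hostnames and values are hypothetical.

# inventory/hosts.yml
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web-dev1.example.com:
    prod:
      hosts:
        web1.example.com:
    dev:
      hosts:
        web-dev1.example.com:
---
# inventory/group_vars/dev/vars.yml
debug: true
---
# inventory/group_vars/prod/vars.yml
debug: false
# vault_db_password is defined in group_vars/prod/vault.yml,
# which is encrypted with "ansible-vault encrypt".
db_password: "{{ vault_db_password }}"
---
# site.yml
- hosts: webservers
  become: true
  roles:
    - common   # hypothetical shared role
    - nginx    # roles/nginx/tasks/main.yml, roles/nginx/templates/nginx.conf.j2
```

The same nginx role runs on both hosts; only the group variables differ between dev and prod.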

Test Your Playbooks with Molecule

Molecule is the standard testing framework for Ansible roles. It spins up Docker containers or VMs, runs your role, then verifies the result. Add Molecule tests to every role: they catch broken playbooks before they reach production.
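
As an illustration, a Molecule scenario's verify playbook could assert on the converged instance as shown below. The file location follows Molecule's default scenario layout; the check itself is only an example and assumes the role under test installs a package named nginx.

```yaml
# roles/nginx/molecule/default/verify.yml: sketch of a verification playbook
- name: Verify
  hosts: all
  tasks:
    - name: Gather installed packages
      ansible.builtin.package_facts:
        manager: auto

    - name: Assert nginx is installed
      ansible.builtin.assert:
        that:
          - "'nginx' in ansible_facts.packages"
```

Running molecule test from the role directory creates the instance, applies the role, runs this playbook, and destroys the instance.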

How to Start with Configuration Management

Most teams start with Ansible because it requires no agents — just Python on the managed nodes and SSH access from the controller.

1. Audit one server: Pick one production server. Document every package, file, and service that is not default. This is your baseline.

2. Write a playbook: Translate that baseline into an Ansible playbook. Run it against a fresh VM and verify the result matches the production server.

3. Run against all servers: Apply the playbook to your entire fleet. Fix any failures; they reveal existing drift.

4. Automate execution: Set up a cron job or CI pipeline that runs the playbook on a schedule. Now drift is continuously corrected.

5. Code review all changes: Mandate that all server changes go through the playbook via pull request. No more raw SSH changes.
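
The pull request gate could be wired up roughly as follows. This sketch assumes GitHub Actions, a repository with inventory and site.yml at its root, and a runner that already has network access and SSH credentials for the target hosts; none of that comes from the lesson itself.

```yaml
# .github/workflows/ansible-check.yml: sketch assuming GitHub Actions
name: ansible-dry-run
on:
  pull_request:
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Ansible
        run: pip install ansible
      - name: Syntax check
        run: ansible-playbook -i inventory site.yml --syntax-check
      - name: Dry run with diff
        run: ansible-playbook -i inventory site.yml --check --diff
```

The syntax check needs no connectivity, so it can still run on forks or in environments where the dry run is not possible.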

How this might come up in interviews

In DevOps and SRE interviews, config management comes up in "how do you manage server fleets" questions. System design rounds may ask how you ensure consistency across hundreds of instances. Behavioral questions may probe past incidents caused by configuration drift.

Common questions:

  • What is the difference between push and pull configuration management? When would you choose Ansible over Puppet?
  • How do you handle secrets in Ansible? Walk me through Ansible Vault.
  • What is configuration drift and how do you prevent it?
  • How does configuration management fit alongside containers and Terraform in a modern infrastructure stack?
  • Describe how you would migrate a fleet of manually-managed servers to Ansible.

Strong answer: Explaining idempotency clearly. Knowing push vs pull trade-offs. Mentioning Ansible Vault for secrets. Understanding that config management, IaC, and containers are complementary, not competing. Describing how continuous execution prevents drift.

Red flags: Thinking config management is obsolete because of containers (containers solve the app layer; VMs, bare metal, and OS config still need it). Confusing Ansible playbooks with shell scripts (idempotency and desired state are the key differences). Not knowing what idempotency means.

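Quick check

You run an Ansible playbook that installs nginx and deploys a config file. You run the same playbook 5 minutes later without any changes. What happens?

Answer: nothing changes. Each task reports ok because the package is already installed and the rendered config matches the file on disk, so any restart handler is not triggered.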

🧠 Mental Model

💡 Analogy

Configuration management is like a building inspector with a checklist. Every week, the inspector visits every apartment (server) and checks: Is the smoke detector installed? Is the fire extinguisher present? Is the door lock working? If anything is missing or broken, the inspector fixes it on the spot. The inspector does not care who broke what — they just make sure every apartment matches the spec. You (the landlord) never need to manually visit apartments — you trust the inspector to keep everything consistent.

⚡ Core Idea

Describe the desired state of your servers once, in code. Run the tool to converge every server to that state. Idempotency means runs are safe to repeat. Drift is automatically corrected. No more snowflake servers.

🎯 Why It Matters

Configuration drift is insidious — it builds slowly and only becomes visible during incidents. By the time "that server" has a problem, it may have hundreds of undocumented manual changes. Configuration management makes servers boring and predictable, which is exactly what you want in production.
