Documentation

How Enroll works

The mental model, output modes, and the knobs you'll actually use day-to-day.

Mental model

Enroll is intentionally simple: it collects facts first, then renders Ansible from those facts.

1) Harvest
Snapshot state into a bundle
  • Detect installed packages and services
  • Collect config that deviates from packaged defaults (where possible)
  • Grab relevant custom/unowned files in service dirs
  • Capture non-system users & SSH public keys, .bashrc files, etc.
2) Manifest
Generate an Ansible repo structure
  • Roles with files/templates and defaults
  • Playbooks to apply the captured state
  • Optional inventory structure for multi-host runs: each host gets its own playbook
Typical flow
$ enroll harvest --out /tmp/enroll-harvest
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
$ ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml
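
Before applying a generated playbook for real, it can help to preview what Ansible would change. The sketch below wraps the typical flow in a hypothetical helper, enroll_preview (not part of Enroll), and runs the playbook with Ansible's standard --check and --diff flags so that nothing is modified. The function name and directory arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Enroll): harvest, render, then preview.
# Usage: enroll_preview <harvest-dir> <ansible-dir>
enroll_preview() {
  harvest_dir="$1"
  ansible_dir="$2"

  # 1) Harvest: snapshot the host into a bundle.
  enroll harvest --out "$harvest_dir" || return 1

  # 2) Manifest: render an Ansible repo from the bundle.
  enroll manifest --harvest "$harvest_dir" --out "$ansible_dir" || return 1

  # 3) Preview only: --check makes no changes, --diff shows what would change.
  ansible-playbook -i "localhost," -c local --check --diff \
    "$ansible_dir/playbook.yml"
}
```

For example: enroll_preview /tmp/enroll-harvest /tmp/enroll-ansible. Tasks that don't support check mode are skipped by Ansible, so treat the preview as a guide rather than a guarantee.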

How harvesting works

At a high level, this is what happens when enroll harvest runs on a host:

  • Detects the OS and its package backend (e.g. dpkg vs rpm)
  • Detects what packages are installed
  • For each package, it tries to detect files in /etc that differ from the defaults shipped with the package.
  • It detects running/enabled services and timers via systemd. For each one, it looks for the unit file, any 'drop-in' files, environment variable files and so on, as well as the executable it runs. It then tries to map each systemd service to the packages it learned about earlier. That way, the resulting Ansible roles can also carry 'handlers' that restart the service if/when its config changes.
  • Aside from known packages already learned, it optimistically tries to capture extra system configuration in /etc that is common for config management. This is stuff like the apt or dnf configuration, crons, logrotate configs, networking settings, hosts files, etc.
  • For applications that commonly make use of symlinks (think Apache2 or Nginx's sites-enabled or mods-enabled), it notes what symlinks exist so that it can capture those in Ansible
  • It also looks for other snowflake stuff in /etc not associated with packages/services or other typical system config, and will put these into an etc_custom role.
  • Likewise, it looks in /usr/local for stuff, on the assumption that this is an area that custom apps/configs might've been placed in. These go into a usr_local_custom role.
  • It captures non-system user accounts, their group memberships and files such as their .ssh/authorized_keys, and .bashrc, .profile, .bash_aliases, .bash_logout if these files differ from the skel defaults
  • It takes into account anything you set with --exclude-path or --include-path. Anything extra that you include goes into an 'extra_paths' role. The location can be anywhere, e.g. something in /opt, /srv, whatever you want.
  • It writes the state.json and captures the artifacts.
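
To make the "differs from the skel defaults" step concrete, here is a minimal sketch of that kind of comparison. It is not Enroll's code: differs_from_skel is a hypothetical helper, and it assumes the defaults live in /etc/skel and that a byte-for-byte comparison is good enough.

```shell
#!/bin/sh
# Hypothetical helper (not part of Enroll): succeed when a dotfile has been
# customised, i.e. its content differs from the copy shipped in the skel dir.
# Usage: differs_from_skel <home-dir> <dotfile-name> [skel-dir]
differs_from_skel() {
  home_file="$1/$2"
  skel_file="${3:-/etc/skel}/$2"

  [ -f "$home_file" ] || return 1   # no such dotfile, nothing to capture
  [ -f "$skel_file" ] || return 0   # no default to compare against
  if cmp -s "$home_file" "$skel_file"; then
    return 1                        # identical to the skel default
  fi
  return 0                          # content differs
}

# Report customised dotfiles for the current user.
for name in .bashrc .profile .bash_aliases .bash_logout; do
  if differs_from_skel "${HOME:-/root}" "$name"; then
    echo "customised: ${HOME:-/root}/$name"
  fi
done
```

Files reported as customised are the ones you would expect to see picked up for that user; files identical to the skel copy carry no host-specific information.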

Other things to be aware of:

  • You can pass --exclude-path multiple times to skip the bits you don't want. You can also comment roles out of playbook.yml, or delete generated roles you don't need, once you've run enroll manifest.
  • Safety measures: it doesn't traverse into symlinks, and it has an 'IgnorePolicy' that makes it ignore most binary files (except the binary GPG keys used with apt). If you add paths with --include-path and also pass --dangerous, it will skip some of those policy checks, such as which types of content to ignore.
  • It will skip files that are too large, and it currently has a hardcoded cap on the number of files it will harvest (4000 for /etc, /usr/local/etc and /usr/local/bin, and 500 files per 'role'), to avoid unintentional 'runaway' situations.
  • If you are using the 'remote' mode to harvest, and your remote user requires a password for sudo, you can pass --ask-become-pass (or -K) and it will prompt for the password. If you forget, and the remote host requires a password for sudo, it will still fall back to prompting for one, but will be a bit slower to do so.
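
If you plan to add a large tree with --include-path, a rough pre-flight count can tell you whether the file caps mentioned above are likely to bite. The sketch below is a hypothetical check, not part of Enroll. It counts every regular file, whereas Enroll only harvests a subset, so read the number as an upper bound. The /opt/myapp path is just an example.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Enroll): count regular files
# under a directory without following symlinks, and compare against a cap.
# Usage: check_file_cap <dir> <cap>
check_file_cap() {
  dir="$1"
  cap="$2"
  # find does not follow symlinks by default; -type f counts regular files only.
  count=$(find "$dir" -type f 2>/dev/null | wc -l | tr -d ' ')
  if [ "$count" -gt "$cap" ]; then
    echo "$dir: $count files (over cap of $cap)"
    return 1
  fi
  echo "$dir: $count files (within cap of $cap)"
}

# Caps quoted in the text above; the second path is an example.
check_file_cap /etc 4000 || true
check_file_cap /opt/myapp 500 || true
```
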
Does Enroll use Ansible community/galaxy roles?
No, Enroll doesn't have any knowledge of Ansible Galaxy roles or community plugins. It generates all the roles itself. If you really want to use roles from the community, Enroll may not be the tool for you, other than perhaps to help get you started.

Keep in mind that a lot of software config files are also good candidates for being Jinja templates with abstracted vars for separate hosts.

Enroll does use my companion tool JinjaTurtle if it's installed, but JinjaTurtle only recognises certain types of files (.ini style, .json, .xml, .yaml, .toml, but not special ones like Nginx or Apache conf files which have their own special syntax). When Enroll can't turn a config file into a template, it copies the raw file instead and uses it with ansible.builtin.copy in role tasks.

State schema

Enroll writes a state.json file describing what was harvested. The canonical definition of that file format is the JSON Schema below.

You can also validate a harvest state file against the schema by using enroll validate /path/to/harvest.

Single-site vs multi-site

Manifest output has two styles. Choose based on how you'll use the result.

Single-site (default)

Best when you are enrolling one host, or you're producing a reusable "golden" role set that could be applied anywhere.

  • Roles are self-contained
  • Raw files live in each role's files/
  • Template variables live in defaults/main.yml
Multi-site (--fqdn)

Best when you want to enroll several existing servers quickly, especially if they differ.

  • Roles are shared; raw files live in host-specific inventory
  • Inventory decides what gets managed on each host (files/packages/services)
  • Non-templated files go under inventory/host_vars/<fqdn>/<role>/.files
Multi-site example
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
$ ansible-playbook /tmp/enroll-ansible/playbooks/"$(hostname -f)".yml
Tip: role tags
Generated playbooks tag each role as role_<name> (e.g. role_users, role_services, role_other). You can target a subset with ansible-playbook ... --tags role_users.
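
When you target several roles at once, the tag list is just the role names with a role_ prefix, joined by commas. A small sketch, where role_tags is a hypothetical helper and not something Enroll ships:

```shell
#!/bin/sh
# Hypothetical helper (not part of Enroll): turn role names into the
# comma-separated role_<name> tag list that --tags expects.
# Usage: role_tags users services  ->  role_users,role_services
role_tags() {
  tags=""
  for role in "$@"; do
    tags="${tags:+$tags,}role_${role}"
  done
  printf '%s\n' "$tags"
}

# Example (commented out so the snippet has no side effects):
# ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml \
#   --tags "$(role_tags users services)"
```

To see which tags a generated playbook actually defines, ansible-playbook's standard --list-tags option prints them without running anything.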

Remote harvesting over SSH

Run Enroll on your workstation to harvest a remote host over SSH. The harvest is pulled back to your workstation.

$ enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-manifest

# Alternatively, run both steps in one go with the 'single-shot' mode:

$ enroll single-shot --remote-host myhost.example.com --remote-user myuser \
  --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible \
  --fqdn myhost.example.com
Tip
If you don't want/need sudo on the remote side, add --no-sudo. However, be aware that you may get a more limited harvest depending on permissions.

If your remote user requires a password for sudo, pass --ask-become-pass or -K and you'll be prompted to enter the password. If you forget, Enroll will still prompt for the password if it detects it's needed, but will be slightly slower to do so.

If your remote host requires additional SSH configuration that you've defined in your ~/.ssh/config, pass --remote-ssh-config ~/.ssh/config. Enroll will understand how to translate the Host alias, IdentityFile, ProxyCommand, ConnectTimeout and AddressFamily values. You must still pass a value for --remote-host that matches the Host value of the entry in the SSH config file.
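
For more than a couple of hosts, the single-shot command can be wrapped in a loop. The sketch below is a hypothetical wrapper, not part of Enroll. It assumes that the same --out directory can be reused across hosts in multi-site mode (shared roles, one playbook per host) and that each host's FQDN doubles as its SSH hostname.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Enroll): run single-shot once per host,
# keeping one harvest directory per host and one shared Ansible output tree.
# Usage: enroll_fleet <remote-user> <harvest-root> <ansible-dir> <host>...
enroll_fleet() {
  remote_user="$1"; harvest_root="$2"; ansible_dir="$3"
  shift 3
  failed=0
  for host in "$@"; do
    echo "==> $host"
    if ! enroll single-shot --remote-host "$host" --remote-user "$remote_user" \
         --harvest "$harvest_root/$host" --out "$ansible_dir" \
         --fqdn "$host"; then
      # Keep going so one unreachable host doesn't block the rest.
      echo "harvest failed for $host" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example: enroll_fleet myuser /tmp/enroll-harvests /tmp/enroll-ansible web1.example.com db1.example.com. The function returns non-zero if any host failed.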

JinjaTurtle integration

If JinjaTurtle (one of my other projects) is installed, Enroll can also produce Jinja2 templates for ini/json/xml/yaml/toml-style config and extract variables cleanly into Ansible, instead of just storing the 'raw' files.

Modes
  • --jinjaturtle to force on
  • --no-jinjaturtle to force off
  • Default is auto
Where variables land
  • Single-site: roles/<role>/defaults/main.yml
  • Multi-site: inventory/host_vars/<fqdn>/<role>.yml

INI config file

If you're repeating flags (include/exclude patterns, SOPS settings, etc.), store defaults in enroll.ini and keep your muscle memory intact.

Discovery order
You can pass -c/--config, set ENROLL_CONFIG, or let Enroll auto-discover ./enroll.ini, ./.enroll.ini, or ~/.config/enroll/enroll.ini.
[enroll]
# (future global flags may live here)

[harvest]
dangerous = false
include_path =
  /home/*/.bashrc
  /home/*/.profile
exclude_path = /usr/local/bin/docker-*, /usr/local/bin/some-tool
# remote_host = yourserver.example.com
# remote_user = you
# remote_port = 2222

[manifest]
no_jinjaturtle = true
sops = 00AE817C24A10C2540461A9C1D7CDE0234DB458D

[diff]
# ignore noisy drift
exclude_path = /var/spool/anacron
ignore_package_versions = true
# enforce = true  # requires ansible-playbook on PATH
Note
In INI sections, option names use underscores (e.g. include_path) even when the CLI flag uses hyphens (e.g. --include-path).
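
If you're unsure which config file will be picked up, the lookup described under "Discovery order" can be approximated in a few lines of shell. This is a hypothetical re-implementation, not Enroll's own code, and it assumes the sources are tried in the order they are listed: explicit -c/--config first, then ENROLL_CONFIG, then the auto-discovered paths.

```shell
#!/bin/sh
# Hypothetical re-implementation (not Enroll's code) of the config lookup.
# Assumption: sources are tried in the order the documentation lists them.
# Usage: find_enroll_config [explicit-path]
find_enroll_config() {
  explicit="${1:-}"
  if [ -n "$explicit" ]; then            # value given to -c/--config
    printf '%s\n' "$explicit"; return 0
  fi
  if [ -n "${ENROLL_CONFIG:-}" ]; then   # environment variable
    printf '%s\n' "$ENROLL_CONFIG"; return 0
  fi
  for candidate in ./enroll.ini ./.enroll.ini \
      "${HOME:-}/.config/enroll/enroll.ini"; do
    if [ -f "$candidate" ]; then         # auto-discovery
      printf '%s\n' "$candidate"; return 0
    fi
  done
  return 1                               # nothing found
}

find_enroll_config || echo "no enroll.ini found"
```

Run it from the directory where you normally invoke Enroll, since two of the auto-discovered paths are relative to the current directory.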

Drift detection with enroll diff

One of the things I miss from my Puppet days is the way the Puppet 'agent' would check in with the server and realign itself to the declared desired state. With Ansible, it's easy for systems to fall 'out of date', especially if someone is doing the wrong thing and changing things on-the-fly instead of via config management!

The purpose of enroll diff is to compare two 'harvests' and detect what has changed between them: packages added or removed, changes to systemd unit state, files added, modified or removed, and so on.

Notifications for diff
The enroll diff feature supports sending the difference to a webhook of your choosing, or by e-mail. The payload can be sent in json, plain text, or markdown.
Noise suppression
Use --exclude-path to ignore file/dir drift under specific paths (e.g. /var/spool/anacron). Use --ignore-package-versions to ignore routine package upgrades/downgrades while still reporting added/removed packages.
$ enroll diff \
--old /path/to/harvestA \
--new /path/to/harvestB \
--exclude-path /var/spool/anacron \
--ignore-package-versions
Optional: enforce the old harvest state (--enforce)
If drift exists and ansible-playbook is on PATH, Enroll can generate a manifest from the old harvest and apply it locally to restore expected state. It avoids package downgrades, and will often run Ansible with --tags role_... so only the roles implicated by the drift are applied. This is very much like a return to Puppet's agent mode!
$ enroll diff \
--old /path/to/harvestA \
--new /path/to/harvestB \
--enforce

How to run enroll diff automatically on a timer

A great way to use enroll diff is to run it periodically (e.g. via cron or a systemd timer). Below is an example.

Store the below file at /usr/local/bin/enroll-harvest-diff.sh and make it executable.

#!/usr/bin/env bash
set -euo pipefail

# Required env
: "${WEBHOOK_URL:?Set WEBHOOK_URL in /etc/enroll/enroll-harvest-diff}"
: "${ENROLL_SECRET:?Set ENROLL_SECRET in /etc/enroll/enroll-harvest-diff}"

# Optional env
STATE_DIR="${ENROLL_STATE_DIR:-/var/lib/enroll}"
GOLDEN_DIR="${STATE_DIR}/golden"
PROMOTE_NEW="${PROMOTE_NEW:-1}"          # 1=promote new->golden; 0=keep golden fixed
KEEP_BACKUPS="${KEEP_BACKUPS:-7}"        # only used if PROMOTE_NEW=1
LOCKFILE="${STATE_DIR}/.enroll-harvest-diff.lock"

mkdir -p "${STATE_DIR}"
chmod 700 "${STATE_DIR}" || true

# single-instance lock (avoid overlapping timer runs)
exec 9>"${LOCKFILE}"
flock -n 9 || exit 0

tmp_new=""
cleanup() {
  if [[ -n "${tmp_new}" && -d "${tmp_new}" ]]; then
    rm -rf "${tmp_new}"
  fi
}
trap cleanup EXIT

make_tmp_dir() {
  mktemp -d "${STATE_DIR}/.harvest.XXXXXX"
}

run_harvest() {
  local out_dir="$1"
  rm -rf "${out_dir}"
  mkdir -p "${out_dir}"
  chmod 700 "${out_dir}" || true
  enroll harvest --out "${out_dir}" >/dev/null
}

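# A) create golden if missing
if [[ ! -f "${GOLDEN_DIR}/state.json" ]]; then
  tmp_new="$(make_tmp_dir)"  # tracked by the EXIT trap, so a failed harvest is cleaned up
  run_harvest "${tmp_new}"
  rm -rf "${GOLDEN_DIR}"
  mv "${tmp_new}" "${GOLDEN_DIR}"
  tmp_new=""  # moved into place; don't delete it in trap
  echo "Golden harvest created at ${GOLDEN_DIR}"
  exit 0
fi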

# B) create new harvest
tmp_new="$(make_tmp_dir)"
run_harvest "${tmp_new}"

# C) diff + webhook notify
enroll diff \
  --old "${GOLDEN_DIR}" \
  --new "${tmp_new}" \
  --webhook "${WEBHOOK_URL}" \
  --webhook-format json \
  --webhook-header "X-Enroll-Secret: ${ENROLL_SECRET}" # You can send multiple --webhook-header params as you need

# Promote or discard new harvest
if [[ "${PROMOTE_NEW}" == "1" || "${PROMOTE_NEW,,}" == "true" || "${PROMOTE_NEW}" == "yes" ]]; then
  ts="$(date -u +%Y%m%d-%H%M%S)"
  backup="${STATE_DIR}/golden.prev.${ts}"
  mv "${GOLDEN_DIR}" "${backup}"
  mv "${tmp_new}" "${GOLDEN_DIR}"
  tmp_new=""  # don't delete it in trap

  # Keep only latest N backups
  if [[ "${KEEP_BACKUPS}" =~ ^[0-9]+$ ]] && (( KEEP_BACKUPS > 0 )); then
    ls -1dt "${STATE_DIR}"/golden.prev.* 2>/dev/null | tail -n +"$((KEEP_BACKUPS+1))" | xargs -r rm -rf
  fi

  echo "Diff complete; baseline updated."
else
  # tmp_new will be deleted by trap
  echo "Diff complete; baseline unchanged (PROMOTE_NEW=${PROMOTE_NEW})."
fi

Save these environment variables in /etc/enroll/enroll-harvest-diff


# Where to store golden + temp harvests
ENROLL_STATE_DIR=/var/lib/enroll

# 1 = each run becomes new baseline ("since last harvest")
# 0 = compare against a fixed baseline ("since golden")
PROMOTE_NEW=1

# If PROMOTE_NEW=1, keep this many old baselines
KEEP_BACKUPS=7

WEBHOOK_URL=https://example.com/webhook/xxxxxxxx
ENROLL_SECRET=xxxxxxxxxxxxxxxxxxxx

Webhook headers
The --webhook-header parameter can be used multiple times. You can, for example, send X-Enroll-Secret and a secret value of your choice, to help secure your webhook endpoint.
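
If you need a value for ENROLL_SECRET, any long random string will do. A minimal sketch that generates one using only standard tools; generate_secret is a hypothetical helper name, and the chmod suggestion assumes the environment file is owned by root:

```shell
#!/bin/sh
# Generate a 64-character hex string suitable for ENROLL_SECRET.
# Uses only /dev/urandom, head, od and tr, so no extra packages are needed.
generate_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

secret="$(generate_secret)"
echo "ENROLL_SECRET=${secret}"
# Put that line in /etc/enroll/enroll-harvest-diff, then restrict access:
#   sudo chmod 600 /etc/enroll/enroll-harvest-diff
```

Configure the same value on the receiving side, and reject requests whose X-Enroll-Secret header doesn't match.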

Save this systemd unit file to /etc/systemd/system/enroll-harvest-diff.service


[Unit]
Description=Enroll harvest + diff + webhook notify
Wants=network-online.target
After=network-online.target
ConditionPathExists=/etc/enroll/enroll-harvest-diff

[Service]
Type=oneshot
EnvironmentFile=/etc/enroll/enroll-harvest-diff
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
UMask=0077

# Create /var/lib/enroll automatically
StateDirectory=enroll

ExecStart=/usr/local/bin/enroll-harvest-diff.sh

Save this systemd timer to /etc/systemd/system/enroll-harvest-diff.timer


[Unit]
Description=Run Enroll harvest diff hourly

[Timer]
OnCalendar=hourly
RandomizedDelaySec=10m
Persistent=true

[Install]
WantedBy=timers.target

Now you can enable and test it!


sudo systemctl daemon-reload
sudo systemctl enable --now enroll-harvest-diff.timer

# run once now
sudo systemctl start enroll-harvest-diff.service
# watch it in the logs
sudo journalctl -u enroll-harvest-diff.service -n 200 --no-pager

Need help with writing webhooks?
I use Node-RED. Here's a sample Node-RED flow that might help run your webhook, pre-configured to parse the enroll diff JSON payload!

Why did Enroll include/exclude something? enroll explain

When you run enroll harvest, Enroll records why it chose to include or exclude each path in state.json. The enroll explain subcommand summarizes that data so you can quickly sanity-check a harvest, tune include/exclude rules, and understand where packages/services came from.

What can it read?
enroll explain accepts a harvest bundle directory, a direct path to state.json, a .tar.gz/.tgz bundle, or an encrypted .tar.gz.sops bundle.
$ enroll explain /tmp/enroll-harvest

# or point at the state.json path directly
$ enroll explain /tmp/enroll-harvest/state.json

The default output is human-readable text. For scripting or deeper inspection, use JSON output:

$ enroll explain /tmp/enroll-harvest --format json | jq .

# show more example paths per reason
$ enroll explain /tmp/enroll-harvest --max-examples 10

If you stored a harvest as a single SOPS-encrypted bundle, enroll explain can decrypt it on the fly (it will also auto-detect files ending with .sops):

$ enroll explain /var/lib/enroll/harvest.tar.gz.sops --sops

What you get back:

  • A summary of what roles were collected (users, services, package snapshots, etc_custom, usr_local_custom, etc.).
  • Why packages ended up in inventory (observed_via), e.g. user-installed vs referenced by a harvested systemd unit.
  • Breakdowns of managed_files.reason, managed_dirs.reason, and excluded.reason, with a few example paths for each reason.
Tip
Use enroll explain after a first harvest to decide what to exclude (noise) and what to include (snowflake app/config under /opt, /srv, etc.) before you generate a manifest.

Security note: enroll explain doesn't print file contents, but it can print path names and unit/package names. Treat the output as sensitive if your environment uses revealing path conventions (and especially if you harvested with --dangerous).

Tips

Start safe

Default harvesting tries to avoid likely secrets via path rules, content sniffing, and size caps. Use --dangerous only when you've planned where the output will live.

Encrypt at rest

If you plan to keep harvests/manifests long term (especially in git), use --sops to produce a single encrypted bundle file. Note: enroll diff can be passed --sops to decrypt and compare two harvests on-the-fly!

Multi-host safety

For fleets, prefer multi-site output so roles stay generic and host inventory controls what is applied per host - reducing "shared role breaks other host" surprises.

Keep it reproducible

Commit the manifest output, run it in CI, and use enroll diff as a drift alarm (webhook/email).
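
As a starting point for the CI step, here is a sketch that syntax-checks every generated playbook with ansible-playbook's standard --syntax-check option. check_playbooks is a hypothetical helper, not part of Enroll. It assumes the two layouts shown on this page (playbook.yml at the top level for single-site, playbooks/<fqdn>.yml for multi-site) and changes into the repo directory first in case the repo has its own Ansible configuration.

```shell
#!/bin/sh
# Hypothetical CI step (not part of Enroll): syntax-check every generated
# playbook. The function body runs in a subshell so the cd does not leak.
# Usage: check_playbooks <ansible-dir>
check_playbooks() (
  cd "$1" || exit 1
  status=0
  for pb in playbook.yml playbooks/*.yml; do
    [ -f "$pb" ] || continue   # skip unmatched globs and absent layouts
    echo "checking $pb"
    ansible-playbook --syntax-check "$pb" || status=1
  done
  exit "$status"
)
```

For example: check_playbooks /tmp/enroll-ansible. A non-zero exit status means at least one playbook failed the check, which is enough to fail a CI job.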