Mental model
Enroll is intentionally simple: it collects facts first, then renders Ansible from those facts.
Harvest:
- Detect installed packages and services
- Collect config that deviates from packaged defaults (where possible)
- Grab relevant custom/unowned files in service dirs
- Capture non-system users & SSH public keys, .bashrc files, etc.
Manifest:
- Roles with files/templates and defaults
- Playbooks to apply the captured state
- Optional inventory structure for multi-host runs: each host gets its own playbook
$ enroll harvest --out /tmp/enroll-harvest
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
$ ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml
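Before applying a generated playbook for real, it can be worth previewing it first. The following is a minimal sketch, assuming the single-site layout produced by the commands above. enroll_preview is a hypothetical helper name, and --check / --diff are standard ansible-playbook options rather than anything specific to Enroll:

```shell
#!/bin/sh
# Hypothetical helper: harvest, render, then preview the playbook
# without changing the host.
enroll_preview() {
    harvest_dir="$1"
    manifest_dir="$2"
    enroll harvest --out "$harvest_dir" &&
    enroll manifest --harvest "$harvest_dir" --out "$manifest_dir" &&
    # --check reports what would change; --diff shows file-level differences.
    ansible-playbook -i "localhost," -c local --check --diff \
        "$manifest_dir/playbook.yml"
}

# Usage (commented out so that sourcing this file has no side effects):
# enroll_preview /tmp/enroll-harvest /tmp/enroll-ansible
```

Not every Ansible module supports check mode fully, so treat the preview as a hint rather than a guarantee.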
How harvesting works
At a high level, this is what happens when enroll harvest runs on a host:
- Detects the OS and its package backend (e.g. dpkg vs rpm).
- Detects what packages are installed.
- For each package, it tries to detect files in /etc that have been modified from the defaults shipped with the package.
- It detects running/enabled services and timers via systemd. For each of these, it looks for the unit files, any 'drop-in' files, environment variable files, etc., as well as the executable it runs, and tries to map those systemd services to the packages it has already learned about (that way, those 'packages', or future Ansible roles, can also be associated with 'handlers' in Ansible, to restart the services if/when the configs change).
- Aside from known packages already learned, it optimistically tries to capture extra system configuration in /etc that is common for config management. This is stuff like the apt or dnf configuration, crons, logrotate configs, networking settings, hosts files, etc.
- For applications that commonly make use of symlinks (think Apache2 or Nginx's sites-enabled or mods-enabled), it notes what symlinks exist so that it can capture those in Ansible.
- It also looks for other snowflake stuff in /etc not associated with packages/services or other typical system config, and will put these into an etc_custom role.
- Likewise, it looks in /usr/local for stuff, on the assumption that this is an area that custom apps/configs might've been placed in. These go into a usr_local_custom role.
- It captures non-system user accounts, their group memberships and files such as their .ssh/authorized_keys, and .bashrc, .profile, .bash_aliases, .bash_logout if these files differ from the skel defaults.
- It takes into account anything the user set with --exclude-path or --include-path. For anything extra that is included, it will put these into an 'extra_paths' role. The location could be anywhere, e.g. something in /opt, /srv, whatever you want.
- It writes the state.json and captures the artifacts.
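To get a feel for the "differs from the skel defaults" check in the list above, here is a rough approximation you can run by hand. It is not Enroll's implementation: changed_dotfiles is a hypothetical helper, /etc/skel is assumed as the default skel location, and treating a file with no skel counterpart as "changed" is my assumption.

```shell
#!/bin/sh
# Print the dotfiles in a home directory that differ from the skel defaults.
# This only approximates the idea; it is not Enroll's implementation.
changed_dotfiles() {
    home_dir="$1"
    skel_dir="${2:-/etc/skel}"
    for name in .bashrc .profile .bash_aliases .bash_logout; do
        [ -f "$home_dir/$name" ] || continue
        # A file counts as changed if skel has no copy or the bytes differ.
        if [ ! -f "$skel_dir/$name" ] || ! cmp -s "$home_dir/$name" "$skel_dir/$name"; then
            printf '%s\n' "$home_dir/$name"
        fi
    done
}

# Usage (commented out so that sourcing this file has no side effects):
# changed_dotfiles /home/alice
```

Running this against a few home directories before a harvest gives a quick idea of which dotfiles are likely to show up in the users role.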
Other things to be aware of:
- You can use multiple invocations of --exclude-path to skip the bits you don't want. You can also always comment out roles in the playbook.yml, or delete certain roles it generates, once you've run enroll manifest.
- In terms of safety measures: it doesn't traverse into symlinks, and it has an 'IgnorePolicy' that makes it ignore most binary files (except GPG binary keys used with apt) - though if you specify certain paths with --include-path and use --dangerous, it will skip some policy statements such as what types of content to ignore.
- It will skip files that are too large, and it also currently has a hardcoded cap on the number of files that it will harvest (4000 for /etc, /usr/local/etc and /usr/local/bin, and 500 files per 'role'), to avoid unintentional 'runaway' situations.
- If you are using the 'remote' mode to harvest, and your remote user requires a password for sudo, you can pass in --ask-become-pass (or -K) and it will prompt for the password. If you forget, and the remote host requires a password for sudo, it'll still fall back to prompting for a password, but will be a bit slower to do so.
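If you want a rough idea of whether a directory is anywhere near the file caps mentioned above, a quick count helps. This is only a sketch: count_files and check_cap are hypothetical helpers, and whether the 4000 cap applies per directory or across all three is an assumption on my part. Enroll does not harvest every file it sees, so the count is an upper bound.

```shell
#!/bin/sh
# Count regular files under a directory. find does not follow symlinks
# by default, so symlinked trees are not counted.
count_files() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Report the count and return non-zero when it exceeds the given cap.
check_cap() {
    dir="$1"
    cap="$2"
    n=$(count_files "$dir")
    if [ "$n" -gt "$cap" ]; then
        echo "$dir: $n files exceeds cap of $cap"
        return 1
    fi
    echo "$dir: $n files (cap $cap)"
}

# Usage (commented out so that sourcing this file has no side effects):
# check_cap /etc 4000
```

If a directory is well over the cap, that is a good candidate for --exclude-path.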
ansible.builtin.copy in role tasks.
State schema
Enroll writes a state.json file describing what was harvested. The canonical definition of that file format is the JSON Schema below.
You can also validate a harvest state file against the schema by using enroll validate /path/to/harvest.
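One way to use validation is as a gate before rendering. A minimal sketch, assuming enroll validate exits non-zero when the state file does not match the schema (I have not confirmed the exit codes here); manifest_if_valid is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: only render a manifest if the harvest validates.
# Assumes 'enroll validate' signals failure via a non-zero exit status.
manifest_if_valid() {
    harvest_dir="$1"
    out_dir="$2"
    if enroll validate "$harvest_dir"; then
        enroll manifest --harvest "$harvest_dir" --out "$out_dir"
    else
        echo "harvest at $harvest_dir failed validation; skipping manifest" >&2
        return 1
    fi
}

# Usage (commented out so that sourcing this file has no side effects):
# manifest_if_valid /tmp/enroll-harvest /tmp/enroll-ansible
```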
Single-site vs multi-site
Manifest output has two styles. Choose based on how you'll use the result.
Single-site
Best when you are enrolling one host, or you're producing a reusable "golden" role set that could be applied anywhere.
- Roles are self-contained
- Raw files live in each role's files/
- Template variables live in defaults/main.yml
Multi-site (--fqdn)
Best when you want to enroll several existing servers quickly, especially if they differ.
- Roles are shared; raw files live in host-specific inventory
- Inventory decides what gets managed on each host (files/packages/services)
- Non-templated files go under inventory/host_vars/<fqdn>/<role>/.files
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
$ ansible-playbook /tmp/enroll-ansible/playbooks/"$(hostname -f)".yml
Roles are tagged as role_<name> (e.g. role_users, role_services, role_other). You can target a subset with ansible-playbook ... --tags role_users.
Remote harvesting over SSH
Run Enroll on your workstation and harvest a remote host over SSH. The harvest is pulled locally.
$ enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
$ enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-manifest
# Alternatively, run both commands combined together with the 'single-shot' mode:
$ enroll single-shot --remote-host myhost.example.com --remote-user myuser \
--harvest /tmp/enroll-harvest --out /tmp/enroll-ansible \
--fqdn myhost.example.com
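For several hosts, the single-shot command above can be wrapped in a loop. This is a sketch under assumptions: enroll_fleet is a hypothetical helper, the per-host harvest path is my own naming, and I am assuming that repeated multi-site runs against the same --out directory add each host's inventory and playbook alongside the others.

```shell
#!/bin/sh
# Hypothetical helper: harvest several hosts into one multi-site manifest tree.
# Usage: enroll_fleet <remote-user> <out-dir> <host>...
enroll_fleet() {
    remote_user="$1"
    out_dir="$2"
    shift 2
    for host in "$@"; do
        # One harvest directory per host; the manifest output is shared.
        enroll single-shot --remote-host "$host" --remote-user "$remote_user" \
            --harvest "/tmp/enroll-harvest-$host" --out "$out_dir" \
            --fqdn "$host" || echo "failed: $host" >&2
    done
}

# Usage (commented out so that sourcing this file has no side effects):
# enroll_fleet myuser /tmp/enroll-ansible web1.example.com db1.example.com
```

A failed host is reported on stderr and the loop carries on, so one unreachable server does not block the rest.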
If you don't want to use sudo on the remote host, pass --no-sudo. However, be aware that you may get a more limited harvest depending on permissions.
If your remote user requires a password for sudo, pass --ask-become-pass or -K and you'll be prompted to enter the password. If you forget, Enroll will still prompt for the password if it detects it's needed, but will be slightly slower to do so.
If you want to use settings from your ~/.ssh/config, pass --remote-ssh-config ~/.ssh/config. Enroll will understand how to translate the Host alias, IdentityFile, ProxyCommand, ConnectTimeout and AddressFamily values. You must still pass a value for --remote-host that matches the Host value of the entry in the SSH config file.
JinjaTurtle integration
If JinjaTurtle (one of my other projects) is installed, Enroll can also produce Jinja2 templates for ini/json/xml/toml-style config and extract variables cleanly into Ansible, instead of just storing the 'raw' files.
- --jinjaturtle to force on
- --no-jinjaturtle to force off
- Default is auto
Where the extracted template variables live:
- Single-site: roles/<role>/defaults/main.yml
- Multi-site: inventory/host_vars/<fqdn>/<role>.yml
INI config file
If you're repeating flags (include/exclude patterns, SOPS settings, etc.), store defaults in enroll.ini and keep your muscle memory intact.
Pass -c/--config, set ENROLL_CONFIG, or let Enroll auto-discover ./enroll.ini, ./.enroll.ini, or ~/.config/enroll/enroll.ini.
[enroll]
# (future global flags may live here)
[harvest]
dangerous = false
include_path =
    /home/*/.bashrc
    /home/*/.profile
exclude_path = /usr/local/bin/docker-*, /usr/local/bin/some-tool
# remote_host = yourserver.example.com
# remote_user = you
# remote_port = 2222
[manifest]
no_jinjaturtle = true
sops = 00AE817C24A10C2540461A9C1D7CDE0234DB458D
[diff]
# ignore noisy drift
exclude_path = /var/spool/anacron
ignore_package_versions = true
# enforce = true # requires ansible-playbook on PATH
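To show how the config file stands in for repeated flags, here is a small sketch that writes a cut-down enroll.ini and then relies on ENROLL_CONFIG or auto-discovery to pick it up. write_enroll_ini is a hypothetical helper, and the keys are taken from the example above rather than from an exhaustive list:

```shell
#!/bin/sh
# Write a minimal enroll.ini (keys taken from the example above) to a path.
write_enroll_ini() {
    cat > "$1" <<'EOF'
[harvest]
exclude_path = /usr/local/bin/docker-*

[diff]
ignore_package_versions = true
EOF
}

# Usage (commented out so that sourcing this file has no side effects):
# write_enroll_ini ./enroll.ini
# enroll harvest --out /tmp/enroll-harvest    # auto-discovers ./enroll.ini
# ENROLL_CONFIG=/etc/enroll/enroll.ini enroll harvest --out /tmp/enroll-harvest
```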
INI keys use underscores (e.g. include_path) even when the CLI flag uses hyphens (e.g. --include-path).
Drift detection with enroll diff
One of the things I miss from my Puppet days is the way the Puppet 'agent' would check in with the server and realign itself to the declared desired state. With Ansible, it's easy for systems to fall 'out of date', especially if someone is doing the wrong thing and changing things on-the-fly instead of via config management!
The purpose of enroll diff is to compare two 'harvests' and detect what has changed - be it programs being added or removed, changes to systemd unit state, files being added, modified or removed, and so on.
The enroll diff feature supports sending the difference to a webhook of your choosing, or by e-mail. The payload can be sent as JSON, plain text, or markdown.
Use --exclude-path to ignore file/dir drift under specific paths (e.g. /var/spool/anacron). Use --ignore-package-versions to ignore routine package upgrades/downgrades while still reporting added/removed packages.
$ enroll diff \
--old /path/to/harvestA \
--new /path/to/harvestB \
--exclude-path /var/spool/anacron \
--ignore-package-versions
With --enforce, if ansible-playbook is on PATH, Enroll can generate a manifest from the old harvest and apply it locally to restore the expected state. It avoids package downgrades, and will often run Ansible with --tags role_... so only the roles implicated by the drift are applied. This is very much like a return to Puppet's agent mode!
$ enroll diff \
--old /path/to/harvestA \
--new /path/to/harvestB \
--enforce
How to run enroll diff automatically on a timer
A great way to use enroll diff is to run it periodically (e.g. via cron or a systemd timer). Below is an example.
Store the below file at /usr/local/bin/enroll-harvest-diff.sh and make it executable.
#!/usr/bin/env bash
set -euo pipefail
# Required env
: "${WEBHOOK_URL:?Set WEBHOOK_URL in /etc/enroll/enroll-harvest-diff}"
: "${ENROLL_SECRET:?Set ENROLL_SECRET in /etc/enroll/enroll-harvest-diff}"
# Optional env
STATE_DIR="${ENROLL_STATE_DIR:-/var/lib/enroll}"
GOLDEN_DIR="${STATE_DIR}/golden"
PROMOTE_NEW="${PROMOTE_NEW:-1}" # 1=promote new->golden; 0=keep golden fixed
KEEP_BACKUPS="${KEEP_BACKUPS:-7}" # only used if PROMOTE_NEW=1
LOCKFILE="${STATE_DIR}/.enroll-harvest-diff.lock"
mkdir -p "${STATE_DIR}"
chmod 700 "${STATE_DIR}" || true
# single-instance lock (avoid overlapping timer runs)
exec 9>"${LOCKFILE}"
flock -n 9 || exit 0
tmp_new=""
cleanup() {
  if [[ -n "${tmp_new}" && -d "${tmp_new}" ]]; then
    rm -rf "${tmp_new}"
  fi
}
trap cleanup EXIT
make_tmp_dir() {
  mktemp -d "${STATE_DIR}/.harvest.XXXXXX"
}
run_harvest() {
  local out_dir="$1"
  rm -rf "${out_dir}"
  mkdir -p "${out_dir}"
  chmod 700 "${out_dir}" || true
  enroll harvest --out "${out_dir}" >/dev/null
}
# A) create golden if missing
if [[ ! -f "${GOLDEN_DIR}/state.json" ]]; then
  tmp="$(make_tmp_dir)"
  run_harvest "${tmp}"
  rm -rf "${GOLDEN_DIR}"
  mv "${tmp}" "${GOLDEN_DIR}"
  echo "Golden harvest created at ${GOLDEN_DIR}"
  exit 0
fi
# B) create new harvest
tmp_new="$(make_tmp_dir)"
run_harvest "${tmp_new}"
# C) diff + webhook notify
enroll diff \
--old "${GOLDEN_DIR}" \
--new "${tmp_new}" \
--webhook "${WEBHOOK_URL}" \
--webhook-format json \
--webhook-header "X-Enroll-Secret: ${ENROLL_SECRET}" # You can send multiple --webhook-header params as you need
# Promote or discard new harvest
if [[ "${PROMOTE_NEW}" == "1" || "${PROMOTE_NEW,,}" == "true" || "${PROMOTE_NEW}" == "yes" ]]; then
  ts="$(date -u +%Y%m%d-%H%M%S)"
  backup="${STATE_DIR}/golden.prev.${ts}"
  mv "${GOLDEN_DIR}" "${backup}"
  mv "${tmp_new}" "${GOLDEN_DIR}"
  tmp_new="" # don't delete it in trap
  # Keep only latest N backups
  if [[ "${KEEP_BACKUPS}" =~ ^[0-9]+$ ]] && (( KEEP_BACKUPS > 0 )); then
    ls -1dt "${STATE_DIR}"/golden.prev.* 2>/dev/null | tail -n +"$((KEEP_BACKUPS+1))" | xargs -r rm -rf
  fi
  echo "Diff complete; baseline updated."
else
  # tmp_new will be deleted by trap
  echo "Diff complete; baseline unchanged (PROMOTE_NEW=${PROMOTE_NEW})."
fi
Save these environment variables in /etc/enroll/enroll-harvest-diff
# Where to store golden + temp harvests
ENROLL_STATE_DIR=/var/lib/enroll
# 1 = each run becomes new baseline ("since last harvest")
# 0 = compare against a fixed baseline ("since golden")
PROMOTE_NEW=1
# If PROMOTE_NEW=1, keep this many old baselines
KEEP_BACKUPS=7
WEBHOOK_URL=https://example.com/webhook/xxxxxxxx
ENROLL_SECRET=xxxxxxxxxxxxxxxxxxxx
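Before wiring up the timer, it can help to check that your webhook endpoint accepts the shared-secret header at all. A hedged sketch using curl: webhook_smoke_test is a hypothetical helper, and the JSON body is a placeholder of my own, not the real enroll diff payload.

```shell
#!/bin/sh
# Hypothetical smoke test: POST a dummy JSON body with the shared-secret header.
# The body here is a placeholder, not the real enroll diff payload.
webhook_smoke_test() {
    url="$1"
    secret="$2"
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx).
    curl --fail --silent --show-error \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-Enroll-Secret: $secret" \
        -d '{"test": true}' \
        "$url"
}

# Usage (commented out so that sourcing this file has no side effects):
# webhook_smoke_test "$WEBHOOK_URL" "$ENROLL_SECRET"
```

If the endpoint rejects requests that lack the header or carry the wrong value, the secret check is doing its job.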
The --webhook-header parameter can be used multiple times. You can, for example, send X-Enroll-Secret and a secret value of your choice, to help secure your webhook endpoint.
Save this systemd unit file to /etc/systemd/system/enroll-harvest-diff.service
[Unit]
Description=Enroll harvest + diff + webhook notify
Wants=network-online.target
After=network-online.target
ConditionPathExists=/etc/enroll/enroll-harvest-diff
[Service]
Type=oneshot
EnvironmentFile=/etc/enroll/enroll-harvest-diff
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
UMask=0077
# Create /var/lib/enroll automatically
StateDirectory=enroll
ExecStart=/usr/local/bin/enroll-harvest-diff.sh
Save this systemd timer to /etc/systemd/system/enroll-harvest-diff.timer
[Unit]
Description=Run Enroll harvest diff hourly
[Timer]
OnCalendar=hourly
RandomizedDelaySec=10m
Persistent=true
[Install]
WantedBy=timers.target
Now you can enable and test it!
sudo systemctl daemon-reload
sudo systemctl enable --now enroll-harvest-diff.timer
# run once now
sudo systemctl start enroll-harvest-diff.service
# watch it in the logs
sudo journalctl -u enroll-harvest-diff.service -n 200 --no-pager
enroll diff JSON payload!
Why did Enroll include/exclude something? enroll explain
When you run enroll harvest, Enroll records why it chose to include or exclude each path in state.json. The enroll explain subcommand summarizes that data so you can quickly sanity-check a harvest, tune include/exclude rules, and understand where packages/services came from.
enroll explain accepts a harvest bundle directory, a direct path to state.json, a .tar.gz/.tgz bundle, or an encrypted .tar.gz.sops bundle.
$ enroll explain /tmp/enroll-harvest
# or point at the state.json path directly
$ enroll explain /tmp/enroll-harvest/state.json
The default output is human-readable text. For scripting or deeper inspection, use JSON output:
$ enroll explain /tmp/enroll-harvest --format json | jq .
# show more example paths per reason
$ enroll explain /tmp/enroll-harvest --max-examples 10
If you stored a harvest as a single SOPS-encrypted bundle, enroll explain can decrypt it on the fly (it will also auto-detect files ending with .sops):
$ enroll explain /var/lib/enroll/harvest.tar.gz.sops --sops
What you get back:
- A summary of what roles were collected (users, services, package snapshots, etc_custom, usr_local_custom, etc.).
- Why packages ended up in inventory (observed_via), e.g. user-installed vs referenced by a harvested systemd unit.
- Breakdowns of managed_files.reason, managed_dirs.reason, and excluded.reason, with a few example paths for each reason.
Use enroll explain after a first harvest to decide what to exclude (noise) and what to include (snowflake app/config under /opt, /srv, etc.) before you generate a manifest.
enroll explain doesn't print file contents, but it can print path names and unit/package names. Treat the output as sensitive if your environment uses revealing path conventions (and especially if you harvested with --dangerous).
Tips
Default harvesting tries to avoid likely secrets via path rules, content sniffing, and size caps. Use --dangerous only when you've planned where the output will live.
If you plan to keep harvests/manifests long term (especially in git), use --sops to produce a single encrypted bundle file. Note: enroll diff can be passed --sops to decrypt and compare two harvests on-the-fly!
For fleets, prefer multi-site output so roles stay generic and host inventory controls what is applied per host - reducing "shared role breaks other host" surprises.
Commit the manifest output, run it in CI, and use enroll diff as a drift alarm (webhook/email).