Removing stale facts from PuppetDB

PuppetBoard and PuppetExplorer are both excellent tools, but they can be slowed down significantly when PuppetDB holds a very large number of facts. I recently had an issue with some legacy facts that tracked stats about mounted filesystems and caused a significant amount of bloat; this is how I cleaned them up.

The problem

A long time ago, someone decided it would be useful to have some extra fact data recording which filesystems were mounted, their types and how much space was being used on each. The facts were recorded like this:

fstarget_/home=/dev/mapper/system-home
fstype_/home=ext4
fsused_/home=4096

It turned out that none of these were ever used for anything useful, but not before we had amassed 1,900 unique filesystems being tracked across the estate; at three facts each, that accounted for almost 6,000 useless facts.
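
For a quick sense of the scale on an individual machine, you can count the offending facts directly with facter. This is just a sketch: it assumes the legacy facts all share the fstarget_/fstype_/fsused_ prefixes shown above, and that custom facts are loaded with the -p flag.

# Count the legacy filesystem facts reported by this node
facter -p 2>/dev/null | grep -cE '^(fstarget|fstype|fsused)_'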

Too many facts!

The PuppetDB visualisation tools both have a page that lists all the unique facts, retrieved from the PuppetDB API via the /fact-names endpoint. Retrieving and rendering several thousand records delayed page loads in each tool by around 30 seconds, and typing into the real-time filter box could take minutes to update, with characters appearing one at a time.
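
You can reproduce the problem outside the GUIs by hitting the endpoint directly. A rough sketch with curl and jq, assuming a hypothetical PuppetDB at puppetdb.example.com:8080 and the v4 query API (older releases expose the same data at /v3/fact-names):

# Total number of unique fact names known to PuppetDB
curl -s 'http://puppetdb.example.com:8080/pdb/query/v4/fact-names' | jq 'length'

# How many of them come from the legacy filesystem facts
curl -s 'http://puppetdb.example.com:8080/pdb/query/v4/fact-names' | jq -r '.[]' | grep -cE '^(fstarget|fstype|fsused)_'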

Removing the facts

Modifying the code so the facts are no longer present on the machines is the easy part. Since /fact-names reports the unique fact names across all nodes, to make them disappear completely we must make sure every node checks in with an updated fact list that omits the removed facts.

How you do this depends on your setup. Perhaps you have the puppet agent running on a regular schedule; maybe you have mcollective or another orchestration tool running on all your nodes; failing either of those, a mass-SSH run will do.
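
As a sketch of the last option, something along these lines works; hosts.txt is a hypothetical file listing one node per line, and the exact sudo/agent invocation will depend on your environment:

# Trigger a one-off agent run on every node so each reports its trimmed fact list
while read -r host; do
  ssh -n "$host" 'sudo puppet agent --test'
done < hosts.txt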

So we update all the nodes and refresh PuppetExplorer… and it’s still slow. Damn, missed something.

Don’t forget the deactivated nodes!

If we take a closer look at the documentation for the /fact-names endpoint, we see the line:

This will return an alphabetical list of all known fact names, including those which are known only for deactivated nodes.

Ah ha! The facts are still present in PuppetDB for all the deactivated nodes, but since those nodes are no longer active we can't do a puppet run on them to update their fact lists. We're going to have to remove them from the database entirely.
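
You can confirm this for an individual decommissioned machine from the puppet master; with the PuppetDB terminus installed, the node face has a status action that reports whether a node is active or deactivated (node01.example.com is a placeholder certname):

puppet node status node01.example.com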

Purging old nodes from PuppetDB

By default, PuppetDB never removes deactivated nodes, which means their facts hang around forever. You can change this by enabling node-purge-ttl in PuppetDB's database.ini. As a one-off tidy-up, I set node-purge-ttl = 1d and restarted PuppetDB. Tailing the logs, I saw that PuppetDB runs a garbage collection on startup, and all of my deactivated nodes were purged immediately.
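
For reference, the setting sits in the [database] section of database.ini, alongside the existing connection settings (the file is typically under PuppetDB's conf.d directory, e.g. /etc/puppetdb/conf.d/database.ini on the open source packages of that era; adjust for your install):

[database]
# ... existing connection settings ...
# Purge deactivated nodes (and their facts) one day after deactivation
node-purge-ttl = 1d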

Success!

Now… to deal with the thousand entries from the built-in network facts…


Comments

4 responses to “Removing stale facts from PuppetDB”

  1. For reference, the Puppet Enterprise defaults set node-ttl to 7d and node-purge-ttl to 0s, meaning that nodes will be deactivated after 7 days and removed immediately after deactivation.

    1. >>node-purge-ttl to 0s, meaning that nodes will be deactivated after 7 days and removed immediately after deactivation.
      When node-purge-ttl is 0s “auto-deletion of nodes is disabled”
      https://docs.puppet.com/puppetdb/latest/configure.html#node-purge-ttl

  2. Did you ever figure out how to remove the stale facts?

    I have a similar problem with AWS EC2 network interface facts, which are named after the MAC address, so every new node that connects to Puppet generates a new, useless fact!

    1. Hi Robin,

      Removing the stale facts is as described above. I think you’re asking more about classifying which facts are stale?

      In my case the network facts aren’t stale, there’s just a massive number of network-related facts because of the number of hosts reporting in. I’m thinking of the interface_*, ipaddress_*, macaddress_* etc facts.

      In the case of AWS I imagine you're getting one set of these facts per node. Each time the node checks in, it will update the facts stored about it to the latest values. If for any reason you change the network interfaces on the node, the old facts should be removed at that point. So I can only imagine that if you have lots of stale entries in PuppetDB, you're not cleaning up after decommissioned hosts at all?

      Use “puppet node deactivate $certname” to mark the hosts as decommissioned and, as discussed above, set node-ttl and/or node-purge-ttl to have them completely removed from the database that long after they're decommissioned. They'll still show up in the /fact-names call until the entries are totally purged.
