Migrating to a new Puppet certification authority

At Songkick all our servers are managed using Puppet, an open source configuration management tool. We use it in client-server mode and recently had the need to replace the certification authority certificates on all our nodes. I couldn’t find much information on how to do this without logging onto every machine, so I’ve documented my method.

What is this Puppet CA anyway?

If you’re using puppet in its typical client-server or agent-master setup, then when the puppet master is first started it will create a certification authority (CA) which all clients that connect to it need to be trusted by and must trust. This usually happens transparently, so often people aren’t aware that this certification authority exists.

The CA is an attempt to have trust between the agents and the master, so that an attacker cannot set up malicious puppet masters and tell puppet agents to do his or her bidding and also prevent malicious clients being able to see configuration data for other clients. Agents should only connect to masters that have certificates signed by its CA and masters should only send configuration information to clients that have certificates signed by the same CA.

There’s a more comprehensive explanation of Puppet SSL written by Brice Figureau which goes into far more detail than we have space for. The main thing to understand is that the CA is an important part of maintaining security and that you can only have one CA across a set of machines that access the same secured resources.

Why would I want to migrate to a new CA?

  • Your current CA certificate is about to expire. By default, CA certificates have a validity period of 5 years, so fairly early adopters of puppet will need to replace them.
  • You’ve had multiple CAs in use and need to consolidate on one.
  • You believe that your certificates and private keys are in the hands of people who could cause mischief with them.
  • You have fallen foul of bugs relating to the fact that you use a CA created in an older version of puppet.

It was in fact the second of these reasons that applied to Songkick; we’d previously been using multiple puppet masters, each with their own CA. We wanted to start using exported resources, stored in the same PuppetDB instance for all nodes. This meant that each master needed to be trusted by the same CA that signed the PuppetDB instance; hence we needed to consolidate on one CA.

How do I do it?

Set up NEW puppet master(s)

Set up at least one new puppet master server, with a new CA certificate.

If you have a lot of existing hosts managed by puppet, then it’s worth considering enabling the autosign option, even if only temporarily, as you’ll have a number of certificate requests to approve manually otherwise.

Configure AGents to connect to THe new master(S)

We’re assuming here that you’re managing the puppet agent configuration through puppet. and that changes to the puppet configuration cause an automatic restart of the puppet agent.

Change the configuration of your puppet agents, to connect to the new master(s) and use a different ssldir:

server = <new server hostname> 
ssldir = /var/lib/puppet/ssl2

Be careful not to apply this change to your newly created puppet master.

Your clients should reconfigure themselves, restart and when they start up, connect to your new puppet master, forgetting their old ssl configuration, including the CA certificates.

If you have autodiscovery records for puppet in DNS, e.g. an A record for ‘puppet’ or the SRV records, then you should leave them in place for now. Agents that have not been migrated to the new CA may need it.

It is a good idea to test this on a handful of nodes and check that it works in a completely automated fashion before applying to every node.

Tidying up (part 1)

Once every node has checked in with the new master and been issued with a new certificate, it’s time to start the process of tidying up. It’s a good idea to revert back to using the default ssldir, so that when agents bootstrap themselves with the default config, they do not then switch to the new ssldir and thus forget their old certificates. This will cause the master to refuse to talk to them, as this looks like a spoofing attempt.

On each client, we mirror the new ssldir to the old one:

file { '/var/lib/puppet/ssl': 
  source => 'file:///var/lib/puppet/ssl2',
  recurse => true, 
  purge => true, 
  force => true, 

Be careful not to apply this change to your newly created puppet master.

Tidy up (part 2)

Once that’s shipped everywhere, we remove the ssldir configuration, fall back on the default ssldir and remove the above resource definiton to copy the ssldir.

Tidy up (part 3)

You can now update your autodiscovery DNS entries, to point to the new servers and remove the autosign configuration, if desired.

Finally, we ship a change to the clients that removes the temporary /var/lib/puppet/ssl2 directory.

And that’s it, everything has been migrated to the new CA, with no need to do anything outside of puppet.