Tuesday, 8 December 2015

Puppet in the Pipeline

I gave a talk at the recent PuppetConf called "Puppet in the Pipeline" - a round-up of workflow planning, deployment pipelines, and integration points. I start out with a very basic setup and walk through various stages of complexity, talking through technical options and things to consider. I can't seem to get it written down as a satisfactory blog post, so for now I will just link to the video and slides:

Video: https://www.youtube.com/watch?v=4jXGmxkEoeM

Slideshare: http://www.slideshare.net/AnnaKennedy11/puppet-in-the-pipeline-55953094

Thursday, 27 August 2015

Good ticket guide

I wrote this guide at $job-1 (where it seems to be still in circulation) and thought maybe it could be a useful thing to share further.


We love a good ticket. It makes the job easier, it gets done faster, and it keeps everyone a bit happier. Here's a quick guide to what makes good tickets.

Describe the problem - not the solution

  • Whilst ideas about what might be wrong can be helpful, the absolute best thing you can do is describe the problem in its entirety

A descriptive ticket title

  • so that we can find your ticket at a glance

Relevant information and examples

  • If something is broken, then please outline how we can test it out for ourselves. How does it normally work? What happens now? When did it break?
  • If you need something changing, what is currently configured, and what would you prefer?
  • If you need something new, is it like anything that currently exists?

What’s the timescale?

  • Any indication of due dates or potential blockers is useful
  • Set the ticket priority as appropriate
  • If it’s urgent, it’s usually best to log a ticket and then come over in person

What’s the value?

  • If something is broken or not working correctly, what is the impact to the business?
  • If the ticket is for some new infrastructure, what project is it supporting? 
  • Can you justify your request? How does it fit into the organisational priorities?
  • This information helps us do the most important tickets first.

Other helpful things to include:

  • Error messages
  • Screenshots
  • Example URLs

What not to include:

  • A direct set of commands to run without a full explanation of the problem. 
  • You don’t need to say thank you! At least not on the ticket, as this re-opens closed tickets. You’re welcome to come and say thank you in person (or buy us a beer at the pub).

Friday, 31 July 2015

Automated server testing with Serverspec, output for Logstash, results in Kibana. Part 2: Logging

At the end of Part 1 we had a Serverspec installation running tests which were stored alongside our configs.
Command-line arguments passed in the name of the VM and a list of modules to be tested.

Next, we want to look carefully at the output generated by Serverspec so that we can track and visualise our tests. We need to track our data carefully so that we can cope with the results of many different VMs.

Serverspec outputs

Serverspec has a number of output options. The 'documentation' style is what we've seen printed to screen so far; there are also json and html reports. It is possible to get all of these formatting options at once by adding the following line to your Rakefile:

 t.rspec_opts = "--format documentation --format html --out /opt/serverspec/reports/#{$host}.html --format json --out /opt/serverspec/reports/#{$host}.json"

So now we have two files at /opt/serverspec/reports: www.example.com.html and www.example.com.json.
The json file is the one we're going to pick up and turn into our log.

Logging format

If we inspect the contents of the www.example.com.json report, we can see that it is of the format:
    "examples": [
            "description": "should be installed",
            "file_path": "/opt/puppetcode/modules/ntp/serverspec/init_spec.rb",
            "full_description": "Package \"ntp\" should be installed",
            "line_number": 4,
            "run_time": 2.525189129,
            "status": "passed"
    "summary": {
        "duration": 2.609159102,
        "example_count": 1,
        "failure_count": 0,
        "pending_count": 0
    "summary_line": "4 examples, 0 failures"
Each test is an element in the 'examples' array, and at the end we have a summary and a summary_line.

We're going to pick up every test as a separate json object, insert some identifying metadata, and output each test as a line in /var/log/serverspec.log.

Apart from the host and module identifiers, it might also be helpful to know, for example, what the OS version of the host was, which git branch it came from, and maybe a UUID unique to a test (which could encompass multiple VMs).

With this in mind, we re-write our /opt/serverspec/Rakefile as follows:

require 'rake'
require 'rspec/core/rake_task'
require 'json'

# Command line variables
$uuid       = ENV['uuid']
$host       = ENV['host']
$modulelist = File.readlines(ENV['filename']).map(&:chomp)
$branch     = ENV['branch']
$osrel      = ENV['osrel']

task :spec => ["spec:#{$host}", "output"]

# Run the Serverspec tests
namespace :spec do
  desc "Running serverspec on host #{$host}"
  RSpec::Core::RakeTask.new($host) do |t|
    ENV['TARGET_HOST'] = $host
    t.pattern = '/opt/puppetcode/modules/{' + $modulelist.join(",") + '}/serverspec/*_spec.rb'
    t.fail_on_error = false
    t.rspec_opts = "--format documentation --format html --out /opt/serverspec/reports/#{$host}.html --format json --out /opt/serverspec/reports/#{$host}.json"
  end
end

# Edit the serverspec json file to add in useful fields
task :output do
  File.open("/var/log/serverspec.log", "a") do |f|
    # Read in the json file that serverspec wrote
    ss_json = JSON[File.read("/opt/serverspec/reports/#{$host}.json")]
    ss_json.each do |key, val|
      if key == 'examples'
        val.each do |test|
          # Work out the module name from the test's file path
          modulename = test["file_path"].gsub(/\/opt\/puppetcode\/modules\//, "").gsub(/\/serverspec\/.*/, "")
          test["module"] = modulename
          insert_metadata(test)
          f.puts test.to_json
        end
      end
    end
  end
end

# Add in the rest of our useful data
def insert_metadata(json_hash)
  json_hash["time"]   = Time.now.strftime("%Y-%m-%d-%H:%M")
  json_hash["uuid"]   = $uuid
  json_hash["host"]   = $host
  json_hash["branch"] = $branch
  json_hash["osrel"]  = $osrel
end

Now we can run

rake spec host=www.example.com filename=/opt/serverspec/modulelist branch=dev osrel=7.1 uuid=12345

And see in /var/log/serverspec.log

{"description":"should be installed","full_description":"Package \"ntp\" should be installed","status":"passed","file_path":"/opt/puppetcode/modules/ntp/serverspec/init_spec.rb","line_number":4,"run_time":0.029166597,"module":"ntp","time":"2015-07-31-12:21","uuid":"12345","host":"www.example.com","branch":"dev","osrel":"7.1"}

This log can now be collected by Logstash, indexed by Elasticsearch, and visualised with Kibana.
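As an aside (not part of the original setup), because every line in serverspec.log is a self-contained JSON object, any tool can consume the log line by line before it even reaches the ELK stack. A minimal Ruby sketch, assuming only the "module" and "status" keys shown in the sample line above, that tallies failed tests per module:

```ruby
require 'json'

# Sketch: tally failed tests per module from serverspec-style log lines.
# Each line is expected to be one JSON object with "module" and "status" keys.
def failures_per_module(log_lines)
  tally = Hash.new(0)
  log_lines.each do |line|
    test = JSON.parse(line)
    tally[test["module"]] += 1 if test["status"] == "failed"
  end
  tally
end
```

This sort of quick check is handy for, say, failing a CI job on any module with failures, without waiting for the data to be indexed.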

Automated server testing with Serverspec, output for Logstash, results in Kibana. Part 1: Serverspec

Whether you're spawning VMs to cope with spikes in traffic, or you want to verify your app works on a range of operating systems, it's incredibly useful to have some automated testing to go with your automated VM creation and configuration.

This is a quick run-down of one way to implement such automated testing with Serverspec and get results back that are ultimately visualisable in Kibana. NB the orchestration of the following steps is beyond the scope of this article - maybe a CI tool like Jenkins, an orchestration tool like vRO, or some custom software.

  • Automagically create VMs (AWS, OpenStack, etc)
  • Configure the VMs with some config management tool (Puppet, Chef, etc)
  • Perform functional testing of VMs with Serverspec 
  • Output logs that are collected by Logstash
  • Visualise output in Kibana

The first two points are essentially prerequisites to this article: create some VMs and install them with whatever cloud and config magic you like. For the purposes of this article, it doesn't really matter. I'm just going to assume that your VMs are 'normal', ie running and contactable.
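If you want a quick sanity check that a VM really is contactable before pointing tests at it, a minimal Ruby sketch might look like the following (the helper name is my own, and it assumes ssh on its default port):

```ruby
require 'socket'
require 'timeout'

# Sketch: report whether a host accepts TCP connections on the ssh port.
# Returns false on refusal, timeout, or DNS failure.
def reachable?(host, port = 22, seconds = 3)
  Timeout.timeout(seconds) { TCPSocket.new(host, port).close }
  true
rescue StandardError
  false
end
```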

Functional testing with Serverspec

Serverspec, if you've not used it, is an rspec-based tool to perform functional testing. It's ruby-based and has quite an easy set-up, and it doesn't require anything to be installed on the target servers; it just needs to be able to ssh into them with an ssh key.

Install and set up a la the Serverspec documentation

# gem install serverspec
# mkdir /opt/serverspec
# cd /opt/serverspec
# serverspec-init

This will have created you a basic directory structure with some files to get you started.

Right now we have:

# ls /opt/serverspec
Rakefile  spec

The default setup of Serverspec is that you define a set of tests for each and every server and then run the contents of each directory against the matching host. However this doesn't really fit the workflow we're setting up here.

Re-organise Serverspec from host-based to app-based layout

To get started, let's delete the www.example.com directory - we don't want to define a set of tests per host like this, we want to make an app-based layout.

In my opinion, one of the easiest ways to organise the layout for your functional tests is to store it alongside your config management code. With this in mind, let's write a simple ntp test.

Writing a Serverspec test

Our ntp Puppet config is found at the following path, and looks like:

# cat /opt/puppetcode/modules/ntp/manifests/init.pp

class ntp {
  package { 'ntp':
    ensure => installed,
  }
}

So alongside this directory we can make a sister Serverspec directory, and put our first test in there:

# cat /opt/puppetcode/modules/ntp/serverspec/init_spec.rb

require 'spec_helper'

describe package('ntp') do
  it { should be_installed }
end

Making Serverspec run our test

Now we need to edit the Rakefile to reflect this restructuring:

# cat /opt/serverspec/Rakefile

require 'rake'
require 'rspec/core/rake_task'

$host       = 'www.example.com'
$modulelist = %w( ntp )

task :spec => "spec:#{$host}"

namespace :spec do
  desc "Running serverspec on host #{$host}"
  RSpec::Core::RakeTask.new($host) do |t|
    ENV['TARGET_HOST'] = $host
    t.pattern = '/opt/puppetcode/modules/{' + $modulelist.join(",") + '}/serverspec/*_spec.rb'
    t.fail_on_error = false
  end
end

Yes, we did just hard-code the host name and modulelist to test. Don't worry, we'll switch these out in a bit.
Note that we provide a pattern path with a glob to the directory containing our tests. Essentially, when we run this file, we pick up every test that matches the pattern and run those tests against the desired host.
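The braces in the pattern act as alternation when globbed: {ntp,ssh} matches either module name. A quick throwaway demonstration of the glob behaviour itself, using temporary directories rather than the real module tree:

```ruby
require 'tmpdir'
require 'fileutils'

# Build a fake module tree in a temp dir, then glob it with a brace pattern.
matched = Dir.mktmpdir do |root|
  %w( ntp ssh ).each do |mod|
    FileUtils.mkdir_p("#{root}/modules/#{mod}/serverspec")
    FileUtils.touch("#{root}/modules/#{mod}/serverspec/init_spec.rb")
  end
  # {ntp,ssh} matches either module directory
  Dir.glob("#{root}/modules/{ntp,ssh}/serverspec/*_spec.rb").sort
end

puts matched.length   # both spec files matched
```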

Run the test

Now, making sure we are standing in the /opt/serverspec/ directory, we can run
# rake spec
Package 'ntp'
  should be installed
Green means that the test ran and passed. So as it stands, we can test our one www.example.com host with our one ntp test. Great! 

Rewrite the Rakefile to take command-line options rather than hard-coding variables

Right now, our host identifier and our list of modules to test are hard-coded in the Rakefile. Let's rewrite so these are passed in on the command line.

# cat /opt/serverspec/Rakefile

require 'rake'
require 'rspec/core/rake_task'

$host       = ENV['host']
$modulelist = File.readlines(ENV['modulelist']).map(&:chomp)

task :spec => "spec:#{$host}"

namespace :spec do
  desc "Running serverspec on host #{$host}"
  RSpec::Core::RakeTask.new($host) do |t|
    ENV['TARGET_HOST'] = $host
    t.pattern = '/opt/puppetcode/modules/{' + $modulelist.join(",") + '}/serverspec/*_spec.rb'
    t.fail_on_error = false
  end
end

Now to run the tests we need to do 
# rake spec host=www.example.com modulelist=/opt/serverspec/modulelist
# cat /opt/serverspec/modulelist
ntp
The modulelist file can be one you write yourself, or generated from something like a server's /var/lib/puppet/classes.txt. It's a way to narrow down which tests are run against each server, as not every module is implemented on every server.
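For instance, a small Ruby helper (the function name is illustrative, not from the original setup) could boil the class names in a classes.txt down to unique top-level module names, ready to write out as the modulelist file:

```ruby
# Sketch: reduce Puppet class names (e.g. "ntp::config") from classes.txt
# to a sorted, de-duplicated list of top-level module names.
def modulelist_from_classes(class_names)
  class_names.map(&:chomp)
             .map { |c| c.split("::").first }
             .uniq
             .sort
end

# Hypothetical usage, with illustrative paths:
# modules = modulelist_from_classes(File.readlines("/var/lib/puppet/classes.txt"))
# File.write("/opt/serverspec/modulelist", modules.join("\n") + "\n")
```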

Part 2: Generate logs that can be collected by Logstash, indexed by Elasticsearch, and visualised in Kibana

Friday, 8 May 2015

Recovering from puppet cert clean --all

If you just did 'puppet cert clean --all' because reasons and now everything is broken like:
test-server:~# puppet agent -vt
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Error 400 on SERVER: Could not retrieve facts for test-server.vm: Failed to find facts from PuppetDB at puppetmaster.example.com:8081: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [certificate revoked for /CN=puppetmaster.example.com]
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to submit 'replace facts' command for test-server.vm to PuppetDB at puppetmaster.example.com:8081: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [certificate revoked for /CN=puppetmaster.example.com]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

STOP PANICKING: we can fix this.

If you have a backup of the puppetmaster's /var/lib/puppet directory, do a restore and hopefully all will be well.

If not, let's fix the puppetmaster (NB here I'm using a monolithic installation - if your puppetmaster and puppetdbs are on separate machines you'll have to adapt this a little bit).

Cleaning all the certificates means that the puppetmaster's own certificate is missing too, so re-generate it with

puppetmaster:~# puppet cert generate puppetmaster.example.com

Now the puppetmaster has new ssl bits and bobs but puppetdb has the old ones. Clean out the puppetdb ssl directory:

puppetmaster:~# rm -rf /etc/puppetdb/ssl/*

And use the handy ssl-setup script to copy the new ones to the right places

puppetmaster:~# puppetdb ssl-setup
PEM files in /etc/puppetdb/ssl are missing, we will move them into place for you
Copying files: /var/lib/puppet/ssl/certs/ca.pem, /var/lib/puppet/ssl/private_keys/puppetmaster.example.com.pem and /var/lib/puppet/ssl/certs/puppetmaster.example.com.pem to /etc/puppetdb/ssl
Setting ssl-host in /etc/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-port in /etc/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-key in /etc/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-cert in /etc/puppetdb/conf.d/jetty.ini already correct.
Setting ssl-ca-cert in /etc/puppetdb/conf.d/jetty.ini already correct.

Restart all the things:

puppetmaster:~# service puppetmaster restart
puppetmaster:~# service puppetdb restart

Now, let's fix the nodes.

Start with a test node (preferably not in production), to verify all the steps so far worked as expected.
Remove the existing ssl certs with

test-server:~# rm -rf /var/lib/puppet/ssl/*

Now run puppet manually with

test-server:~# puppet agent -vt
Info: Creating a new SSL key for test-server.vm
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for test-server.vm
Info: Certificate Request fingerprint (SHA256): 92:A9:A6:B1:88:7B:DB:A7:65:00...
Info: Caching certificate for ca
Exiting; no certificate found and waitforcert is disabled

Sign the certificate on the master as usual

puppetmaster:~# puppet cert sign test-server.vm

Now your node should run as usual

test-server:~# puppet agent -vt
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for test-server.vm
Info: Applying configuration version '1431084513'

The final step is to re-generate certificates for all the rest of your nodes.
Option 1: log into every server and repeat the above.
Option 2: automate option 1 - think ssh, clusterssh, etc
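A minimal sketch of option 2 (the helper name and node names are illustrative): generate the per-node commands from the steps above and feed them to ssh, clusterssh, or whatever you use.

```ruby
# Sketch: build the recovery command list for one node, mirroring the
# manual steps above. The last command must run on the puppetmaster.
def cert_recovery_commands(node)
  [
    "ssh root@#{node} 'rm -rf /var/lib/puppet/ssl/*'",  # wipe the stale certs
    "ssh root@#{node} 'puppet agent -vt'",              # generate a new CSR
    "puppet cert sign #{node}",                         # run on the puppetmaster
  ]
end

# Hypothetical node list:
%w( web01.vm db01.vm ).each do |node|
  puts cert_recovery_commands(node)
end
```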

Good luck!

PS I lied - the final, final step is to set up proper backup and restore of your certificate store at /var/lib/puppet/ssl and delete the clean --all line from your command history so you can't accidentally run it again.

References: https://docs.puppetlabs.com/puppetdb/latest/install_from_source.html