Infrastructure as Code – A comprehensive overview

I've been tracking infrastructure as code for a few years now. Over the years it has gotten closer to real code.

Close, but no cigar yet. We've come a long way, but when you compare it to real languages it still feels like it is in its infancy. In this updated overview, which I gave at the ABUG, I went through:

  • the basic concepts of infrastructure as code
  • the differences/concepts in the languages (chef, puppet, ...)
  • the editors, syntax checkers, highlighting
  • integration with git version control
  • integration with CI systems
  • the different forms of testing (syntax, compile, unit, smoke testing)
  • using vagrant, veewee and the tools in that eco-system
  • debugging and profiling your code

This talk is probably the most comprehensive tool list that I've seen/made about the subject. But feel free to post and add your findings in the comments!

Note that at the end of the presentation there are many extra links that still need to be sorted, as well as some slightly outdated tools.

I've given previous versions of this talk at Devoxx 2012 and Jax2012. Enjoy the Jax2012 video here:

What if Config Management was created by Game Designers

Ever wondered what Config Management would look like if it was created by Game Designers?

Enjoy my Ignite session from Devopsdays London 2013


(n) Steps to Configuration Management Paradise…

Over the past five years I’ve come to experience the delights of Puppet, CFEngine and Chef across a wide range of deployments ranging from a couple of web servers to host two or three hundred sites, to thousands of servers underpinning an OpenStack-based cloud solution. I’d like to share a couple of thoughts on what [...]

Vagrant and Drupal, a winning team

While heading back home from DrupalCon Munich after 4 days of good interaction with lots of Drupal folks, I realized to my big surprise that a lot of people are using Vagrant to make sure that developers are not working on platforms they invented on their own. Lots of people have realized that "It works on my computer" is not something they want to hear from a developer, and are reaching out to give developers viable ways to work on shared and reproducible environments.

There were 2 talks proposing solutions to the problem.

The first one was Fearless development with Drush, Vagrant and Aegir by Christopher Gervais. He talked about Drush Vagrant Integration and how extensions to Drush allow for easy Vagrant integration. Bridging this gap allows Drupal developers to use a tool they are already familiar with.

The second one was by Jochen Lillich, who explained how he is using Vagrant and Chef for this purpose. His talk, titled Use datacenter tools to make your dev life easier, has been posted already.

During the Vagrant BoF, I briefly ran over @patrickdebois' old slides on Vagrant, after which people started discussing their use cases. Two other projects came up.

The first is Project Oscar, which aims at providing developers with a default Drupal development environment in a jiffy. They do this by providing a bunch of Puppet manifests that set up a working environment.

The second one is Ariadne, which is a standardized virtual machine development environment for easily developing Drupal sites in a local sandbox that is essentially identical to a fully-configured hosted solution. It attempts to emulate a dedicated Acquia/Pantheon server as closely as possible, with added development tools. Project Ariadne is, just like the examples from Jochen Lillich, based on Chef.

With all of these tools and examples around, there should be no excuses anymore for Drupal developers to hack on their own machine and tell the systems people "It works on my machine" (let alone to hack in production).

My PHPNE Talk on Vagrant

I’m not a public speaker. In fact, I’m normally found either sitting at the back behind a sound desk or running round fixing technical problems.

However, Anthony Sterling approached me the week before the May 2012 PHP North East meetup and asked if I’d do a talk on Vagrant.

In order to give something back to the group, and knowing it was a product I really like and that I’d be able to come up with plenty of content for, I agreed.

I titled the talk Virtualized Development Environments with Vagrant so I could go into the background and why you would want to use such a tool, as well as how to use it. I also gave a brief introduction to Infrastructure as Code and Configuration Management with Puppet and Chef.

The full contents were:

  • Introduction to:
    • Development Environments
    • Virtualization
  • Why virtualize your development environment?
  • Introduction to Vagrant
  • Using Vagrant
  • Vagrant Plugins
  • Automated Provisioning (Configuration Management)

Watching it back, my public speaking and presentation skills do need quite a bit of work. However, given that it was a first attempt at public speaking, which does not come naturally to me, hopefully it was a good attempt. The feedback was good: plenty of people had positive things to say, and there were questions and further discussion in the pub afterwards, so that was nice. I now also have some points to improve on.

The talk was videoed and is available on Vimeo:

And the slides are available on Slideshare or SpeakerDeck:

Continuous Integration for the world – Agile 2011

In 2008 at Agile Toronto, I did a session on Agile Infrastructure. This is where I met Andrew Shafer (working at Reductive Labs). There wasn't that much attention for it back then, as we were still figuring out the impact and ideas. Now, 3 years later, I was keen on doing a follow-up at the big Agile2011 conference in Utah.

I had the pleasure to co-present with the illustrious Julian Simpson (aka Build Doctor).

The talk was based on showing off a few of the tools that are currently used in the devops space and pointing out the similarities between the release process of code and the release process of servers:

Thanks to the concept of infrastructure as code and virtualization, we can define and build our infrastructure based on text files. Those files can be version-controlled and tested like regular code. The artefact (AMI, image) can then be deployed on an infrastructure. The following image gives you an overview of the similarities.

Some people have asked for an overview of how all these tools relate to each other. I therefore started with the following overview to make that understandable. I'd love to hear your comments on it.

Demo of veewee creating Just Enough Operating System


Demo of vagrant spinning up an ubuntu box


Demo of chef on vagrant spinning up an apache webserver


Demo of cucumber-nagios testing the deployment of the webserver


Demo of mccloud for doing vagrant-alike deployment on ec2


Drupal and Configuration Mgmt, we’re getting there …

For those who haven't noticed yet: I'm into devops. I'm also a little bit into Drupal (blame my last name), so one of the frustrations I've been having with Drupal (and much other software) is the automation of deployment and upgrades of Drupal sites.

So for the past couple of days I've been trying to catch up with the ongoing discussion regarding the results of the configuration mgmt sprint. I've been looking at it mainly from a systems point of view, that is, with the use of Puppet, Chef or similar tools in mind. I know I'm late to the discussion, but hey, some people take holidays in this season :) So below you can read a bunch of my comments and thoughts on the topic.

First of all, to me JSON looks like a valid option.
Initially there was the plan to wrap the JSON in a PHP header for "security" reasons, but that seems to be gone, even though nobody mentioned the problems it would have caused for external configuration management tools.
When thinking about external tools that should be capable of mangling the file, plenty of them support JSON but won't be able to recognize a JSON file with a weird header (thinking e.g. about Augeas (augeas.net)). I'm not talking about IDEs, GUIs etc. here; I'm talking about system-level tools and libraries that are designed to mangle standard files. For Augeas we could create a separate lens to manage these files, but other tools might have bigger problems with the concept.

As catch suggests, a clean .htaccess should be capable of preventing people from accessing the .json files. There are other methods to figure out if files have been tampered with; I'm not sure if this even fits within Drupal (I'm thinking about reusing existing CA setups rather than having yet another security setup to manage).

In general, to me, tools such as Puppet should be capable of modifying config files and then activating that config with no human interaction required. Obviously drush is a good candidate here to trigger the system after the config files have been changed, but contrary to what some people think, having to browse to a web page to confirm the changes is not an acceptable solution. Just think about having to do this on multiple environments: manual actions are error-prone.

Apart from that, I also think the certificates should not be stored as part of the file. What about a meta file with the appropriate checksums? (Also, if I'm using Puppet or any other tool to manage my config files, then the security, i.e. preventing tampering with these files, is already covered by the configuration management tools.) I do understand that people want to build Drupal in the most secure way possible, but I don't think this belongs in any web application.

When I look at other discussions that wanted to provide a similar secure setup, they ran into a lot of end-user problems with these kinds of setups. An alternative approach is to make this configurable and/or pluggable. The default approach should be to have it enabled, but the more experienced users should have the opportunity to disable it, or replace it with another framework. Making it pluggable upfront solves a lot of hassle later.

Someone in the discussion noted:
"One simple suggestion for enhancing security might be to make it possible to omit the secret key file and require the user to enter the key into the UI or drush in order to load configuration from disk."

Requiring the user to enter a key in the UI or drush would be counterproductive to the goal one wants to achieve: the last thing you want as a requirement is manual/human interaction when automating setups. Therefore a feature like this should never be implemented.

Luckily there seems to be a new idea around that doesn't plan on using a mangled JSON file:
instead of storing the config files in a standard place, we store them in a directory that is named using a hash of your site's private key, like sites/default/config_723fd490de3fb7203c3a408abee8c0bf3c2d302392. The files in this directory would still be protected via .htaccess/web.config, but if that protection failed then the files would still be essentially impossible to find. This means we could store pure, native .json files everywhere instead, to still bring the benefits of JSON (human editable, syntax checkable, interoperability with external configuration management tools, native + speedy encoding/decoding functions), without the confusing and controversial PHP wrapper.

Figuring out the directory name for the configs from a configuration mgmt tool could then be done by something similar to:

cd sites/default/conf/$(ls sites/default/conf|head -1)
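
The one-liner above simply picks the first entry it finds. A slightly more defensive take, as a rough sketch only: it assumes the sites/default/config_<hash> naming from the proposal quoted earlier (which may well change), and `find_config_dir` is a name I made up for this example, not anything Drupal or drush provides.

```ruby
require "tmpdir"

# Returns the single config_<hash> directory below site_root.
# ASSUMPTION: the "config_" prefix follows the proposal quoted in the post;
# it is not a confirmed Drupal layout.
def find_config_dir(site_root)
  candidates = Dir.glob(File.join(site_root, "config_*"))
                  .select { |path| File.directory?(path) }
  unless candidates.size == 1
    raise ArgumentError,
          "expected exactly one config_* directory in #{site_root}, " \
          "found #{candidates.size}"
  end
  candidates.first
end

# Demo against a throwaway directory, so the sketch runs anywhere.
Dir.mktmpdir do |root|
  Dir.mkdir(File.join(root, "config_0123abcd")) # hypothetical hash suffix
  puts File.basename(find_config_dir(root))
end
```

Failing loudly when there are zero or several matches seems safer for an automated run than silently taking whatever `head -1` returns.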

In general I think the proposed setup looks acceptable; it definitely goes in the right direction of providing systems people with a way to automate the deployment of Drupal sites and applications at scale.

I'll be keeping an eye on both the direction they are heading in and the evolution of the code!

Command-line cookbook dependency solving with knife exec

Imagine you have a fairly complicated infrastructure with a large number of nodes and roles. Suppose you have a requirement to take one of the nodes and rebuild it in an entirely new network, perhaps even for a completely different organization. This should be easy, right? We have our infrastructure in the form of code. However, our current infrastructure has hundreds of uploaded cookbooks - how do we know the minimum ones to download and move over? We need to find out from a node exactly what cookbooks are needed for that node to be built.

The obvious place to start is with the node itself:

$ knife node show controller
Node Name:   controller
Environment: _default
FQDN:        controller
IP:          182.13.194.41
Run List:    role[base], recipe[apt::cacher], role[pxe_server]
Roles:       pxe_server, base
Recipes      apt::cacher, pxe_dust::server, dhcp, dhcp::config
Platform:    ubuntu 10.04

OK, this tells us we need the apt, pxe_dust and dhcp cookbooks. But what about them - do they have any dependencies? How could we find out? Well, dependencies are specified in two places - in the cookbook metadata, and in the individual recipes. Here’s a primitive way to illustrate this:

bash-3.2$ for c in apt pxe_dust dhcp
> do
> grep -iER 'include_recipe|^depends' $c/* | cut -d '"' -f 2 | sort | uniq
> done
apt::cacher-client
apache2
pxe_dust::server
tftp
tftp::server
utils

As I said - primitive. However, the problem doesn’t end here. In order to be sure, we now need to repeat this for each dependency, recursively. And of course it would be nice to present them more attractively. Thinking about it, it would be rather useful to know what cookbook versions are in use too. This is definitely not a job for a shell one-liner - is there a better way?
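
To make the "repeat recursively" step concrete, here is a minimal sketch of what such a resolver would have to do. It is purely illustrative: the `DEPENDENCIES` map is hypothetical data made up for the example (it is not read from real cookbook metadata), and `cookbook_of` and `resolve` are names invented here.

```ruby
require "set"

# HYPOTHETICAL dependency map, invented for illustration only.
# A real one would have to be parsed from metadata.rb and recipe files.
DEPENDENCIES = {
  "apt"      => [],
  "pxe_dust" => %w[apache2 tftp],
  "dhcp"     => %w[utils],
  "apache2"  => [],
  "tftp"     => [],
  "utils"    => []
}.freeze

# "pxe_dust::server" lives in the "pxe_dust" cookbook.
def cookbook_of(recipe)
  recipe.split("::").first
end

# Walks the dependency map until no new cookbooks turn up.
def resolve(recipes, deps = DEPENDENCIES)
  resolved = Set.new
  queue = recipes.map { |recipe| cookbook_of(recipe) }
  until queue.empty?
    name = queue.shift
    next if resolved.include?(name) # already seen; also stops cycles
    resolved << name
    queue.concat(deps.fetch(name, []))
  end
  resolved.to_a.sort
end

puts resolve(%w[apt::cacher pxe_dust::server dhcp dhcp::config]).join(", ")
# prints "apache2, apt, dhcp, pxe_dust, tftp, utils"
```

Even this toy version ignores version constraints entirely, which is a good part of why doing it by hand is not attractive.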

As it happens, there is. Think about it - the Chef server already needs to solve these dependencies to know what cookbooks to push to API clients. Can we access this logic? Of course we can - clients carry out all their interactions with the Chef server via the API. This means we can let the server solve the dependencies and query it via the API ourselves.

Chef provides two powerful ways to access the API without having to write a RESTful client. The first, Shef, is an interactive REPL based on IRB, which when launched gives access to the Chef server. This isn’t trivial to use. The second, much simpler way is the knife exec subcommand. This allows you to write Ruby scripts or simple one-liners that are executed in the context of a fully configured Chef API Client using the knife configuration file.

knife exec -E '(api.get "nodes/controller/cookbooks").each { |cb| pp cb[0] => cb[1].version }'

The /nodes/NODE_NAME/cookbooks endpoint returns the cookbook attributes, definitions, libraries and recipes that are required for this node. The response is a hash of cookbook name and Chef::CookbookVersion object. We simply iterate over each one, and pretty print the cookbook name and the version.

Let’s give it a try:

$ knife exec -E '(api.get "nodes/controller/cookbooks").each { |cb| pp cb[0] => cb[1].version }'
{"apt"=>"1.1.1"}
{"tftp"=>"0.1.0"}
{"apache2"=>"0.99.3"}
{"dhcp"=>"0.1.0"}
{"utils"=>"0.9.5"}
{"pxe_dust"=>"1.1.0"}

Nifty! :)
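
If you want the result sorted and aligned rather than one hash per line, something along these lines could work. This is a sketch, not part of knife: `version_report` is a made-up helper, and `FakeCookbookVersion` is a stand-in so the example runs without a Chef server. It only assumes what the one-liner already relies on, namely that each value responds to `version`.

```ruby
# Stand-in for Chef::CookbookVersion; only the #version reader is used.
FakeCookbookVersion = Struct.new(:version)

# Takes a hash of cookbook name => object responding to #version and
# returns one aligned "name  version" line per cookbook, sorted by name.
def version_report(cookbooks)
  width = cookbooks.keys.map(&:length).max.to_i
  cookbooks.sort_by { |name, _| name }
           .map { |name, cb| format("%-#{width}s  %s", name, cb.version) }
end

# Sample data copied from the output shown in the post.
sample = {
  "apt"      => FakeCookbookVersion.new("1.1.1"),
  "tftp"     => FakeCookbookVersion.new("0.1.0"),
  "apache2"  => FakeCookbookVersion.new("0.99.3"),
  "dhcp"     => FakeCookbookVersion.new("0.1.0"),
  "utils"    => FakeCookbookVersion.new("0.9.5"),
  "pxe_dust" => FakeCookbookVersion.new("1.1.0")
}

puts version_report(sample)
```

Inside knife exec, the hash would presumably come from the same `api.get "nodes/controller/cookbooks"` call used above instead of the sample data.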
