Beyond Configuration Mgmt
(This post has been sitting in the drafts folder for way too long, so I decided to push the publish button anyhow .. some people might get ideas from it.)
We've all run into the problem: you've puppetized, or euh .. cooked, about every part of your infrastructure, and then there's this one service which has no config files and a broken API that doesn't allow you to configure anything, but a magnificent web GUI to configure all aspects of the service. Magnificent for the eye, full of AJAX and other fancy stuff which wget isn't really keen on. Of course, before it even starts working you need to set its password, from that web GUI.
Sometimes, when you are lucky, they store all their config in a database, which you can dump and parse, replacing all the host-specific parameters for other deployments. But is that an approach you like? For each new version you'll need to reanalyze the DB layout. No matter how you look at it, dumping the DB and restoring it is an ugly hack you don't want.
Other alternatives like sniffing the traffic and replaying the POSTs were considered ... but fancy AJAX stuff and SSL make that less trivial than it seems.
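To make the "replay the POSTs" idea concrete, here is a minimal Ruby sketch that rebuilds a form submission with the standard library only. The URL, the path and the field names are made up for illustration; a real GUI would typically also need session cookies, hidden tokens and the self-signed certificate dealt with, which is exactly where it stops being trivial.

```ruby
require 'net/http'
require 'uri'

# Hypothetical endpoint and field names: substitute what the sniffed traffic shows.
def build_password_post(base_url, password)
  uri = URI.parse(base_url)
  request = Net::HTTP::Post.new(uri.path)
  # Encodes the fields as application/x-www-form-urlencoded, like a browser form.
  request.set_form_data('admin_password' => password, 'confirm' => password)
  request
end

request = build_password_post('https://appliance.example.com/setup/password', 's3cret')
puts request.method
puts request.body

# Sending it would look roughly like this (not executed here):
#   http = Net::HTTP.new('appliance.example.com', 443)
#   http.use_ssl = true
#   response = http.request(request)
```

The request is only built, not sent, so the sketch can be tried without a target system at hand.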
So while discussing this with an upstream project, they proposed to actually screen-scrape their config web GUI.
So screen-scraping the config GUI it is .. but how? I started looking at tools that are typically used for testing rather than for automation, with the purpose of replaying the scenarios one needs to configure the services.
My first attempt was Selenium. It plugs into a browser, so it's easy to actually record what it has to do, and it saves its scenarios in a somewhat readable/editable format.
Having found the export-to-Perl function, it all looked promising. However, the export to Perl isn't really an export to Perl as I expected. I assumed it would just generate the Perl code to run the same scenario, which would be awesome. Instead it generates a Perl script that instructs a Selenium server to run the script.
One of the annoyances I ran into with Selenium is that a browser doesn't accept self-signed certificates, and one can't easily preprovision a browser with those freshly created certificates. (Yes Karl, I already read about certutil ... )
I had heard good things about Cucumber, so I was pretty eager to start testing it ... In short, Cucumber lacks documentation.
I tried a couple of things, but I couldn't get beyond testing whether a certain string was on a page .. I couldn't figure out how to fill in a form etc ...
Maybe someone could point me to some great documentation on how you should write recipes here ... I didn't find the documentation all too easy to find ..
Bummer, as it really looks promising .. especially since it is so lightweight ..
I played with JMeter and Sahi too .. but still
So apart from filing bugs with the upstream project/product and hoping they understand your problem and are willing to open up their API, what other options do you folks suggest?
I gave a short talk about this at Puppetcamp in Amsterdam and the audience came up with a bunch of other potential projects to look at .
The main problem still is that all these are tools to automate testing. They don't provide you with a general-purpose approach to solve the configuration mgmt problem: each time the upstream vendor modifies the layout of their page you have to do the work again, and that .. really doesn't sound promising ..
Amazon Web Services, Hosting in the Cloud and Configuration Management
Amazon is probably the biggest cloud provider in the industry – they certainly have the most features and are adding more at an amazing rate.
Amongst the long list of services provided under the AWS (Amazon Web Services) banner are:
- Elastic Compute Cloud (EC2) – scalable virtual servers based on the Xen Hypervisor.
- Simple Storage Service (S3) – scalable cloud storage.
- Elastic Load Balancing (ELB) – high availability load balancing and traffic distribution.
- Elastic IP Addresses – re-assignable static IP addresses for EC2 instances.
- Elastic Block Store (EBS) – persistent storage volumes for EC2.
- Relational Database Service (RDS) – scalable MySQL compatible database services.
- CloudFront – a Content Delivery Network (CDN) for serving content from S3.
- Simple Email Service (SES) – for sending bulk e-mail.
- Route 53 – high availability and scalable Domain Name System (DNS).
- CloudWatch – monitoring of resources such as EC2 instances.
Amazon provides these services in 5 different regions:
- US East (North Virginia)
- US West (North California)
- Europe (Ireland)
- Asia Pacific – Tokyo
- Asia Pacific – Singapore
Each region has its own pricing and set of available features.
Within each region, Amazon provides multiple “Availability Zones”. These different zones are completely isolated from each other – probably in separate data centers, as Amazon describes them as follows:
Q: How isolated are Availability Zones from one another?
Each availability zone runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. Common points of failures like generators and cooling equipment are not shared across Availability Zones. Additionally, they are physically separate, such that even extremely uncommon disasters such as fires, tornados or flooding would only affect a single Availability Zone.
However, unless you have been offline for the past few days, you will have no doubt heard about the extended outage Amazon has been having in their US East region. The outage started on Thursday, 21st April 2011, taking down some big name sites such as Reddit, Quora, Foursquare & Heroku, and the problems are still ongoing now, nearly 2 days later – with Reddit and Quora still running in an impaired state.
I have to confess, my first reaction was that of surprise that such big names didn’t have more redundancy in place – however, once more information came to light, it became apparent that the outage was affecting multiple availability zones – something Amazon seems to imply above shouldn’t happen.
You may well ask why such sites are not split across regions to give more isolation against such outages. The answer to this lies in the implementation of the zones and regions in AWS. Although isolated, the zones within a single region are close enough together that low cost, low latency links can be provided between the different zones within the same region. Once you start trying to run services across regions, all inter-region communication will go over the normal internet and is therefore comparatively slow, expensive and unreliable, so it becomes much more difficult and expensive to keep data reliably synchronised. This, coupled with Amazon’s above claims about the isolation between zones and best practices, has led to the common setup being to split services over multiple availability zones within the same region – and what makes this outage worse is that US East is the most popular region, due to it being a convenient location for sites targeting both the US and Europe.
On the back of this, many people are giving both Amazon and cloud hosting a good bashing in both blog posts and on Twitter.
Where Amazon has let everyone down in this instance is that they allowed a problem (which in this case is largely centered around EBS) to affect multiple availability zones, thus screwing everyone who either had not implemented redundancy or had followed Amazon’s own guidelines and assurances of isolation. I also believe that their communication has been poor: had customers been aware it would take so long to get back online, they may have been in a position to look at measures to get back online much sooner.
In reality though, both Amazon and cloud computing have less to do with this problem, and more specifically the blame associated with it, than the bashing suggests. At the end of the day, we work in an industry that is susceptible to failure. Whether you are hosting on bare metal or in the cloud, you will experience failure sooner or later, and as part of the design of any infrastructure you need to take that into account. Failure will happen – it’s all about mitigating the risk of this failure through measures like backups and redundancy. There is a trade-off between the cost, time and complexity of implementing multiple levels of redundancy versus the risk of failure and downtime. On each project or infrastructure setup, you need to work out where on this sliding scale you are.
In my opinion, cloud computing provides us an easy way out of such problems. Cloud computing gives us the ability to quickly spin up new services and server instances within minutes, pay by the hour for them and destroy them when they are no longer required. Gone are the days of having to order servers or upgrades and wait in a queue for a data center technician to deal with hardware. It was the norm to incur large setup costs and/or get locked into contracts. In the cloud, instances can be resized, provisioned or destroyed in minutes and often without human intervention, as most cloud computing providers also provide an API so users can handle the management of their services programmatically. Under load, instances can be upgraded or additional instances brought online, and in quiet periods instances can be downgraded or destroyed, yielding a significant cost saving. Another huge bonus is that instances can be spun up for development, testing or to perform an intensive task and thrown away afterwards.
Being able to spin new instances up in minutes is however less effective if you have to spend hours installing and configuring each instance before it can perform its task. This is especially true if more time is wasted chasing and debugging problems because something is set up differently or was missed during the setup procedure. This is where configuration management tools or the ‘infrastructure as code’ principles come in. Tools such as Puppet and Chef were created to allow you to describe your infrastructure and configuration in code and have machines or instances provisioned or updated automatically.
Sure, with virtual machines and cloud computing, things have got a little easier thanks to re-usable machine images. You can set up a certain type of system once and re-use the image for any subsequent systems of the same type. This is however greatly limiting: it’s very time consuming to later update that image with small changes or to cope with small variations between systems, and almost impossible to keep track of what changes have been made to which instances.
Configuration Management tools like Puppet and Chef manage system configuration centrally and can:
- Be used to provision new machines automatically.
- Roll out a configuration change across a number of servers.
- Deal with small variations between systems or different types of systems (web, database, app, dns, mail, development etc).
- Ensure all systems are in a consistent state.
- Ensure consistency and repeatability.
- Easily allow the use of source code control (version control) systems to keep a history of changes.
- Easily allow the provisioning of development and staging environments which mimic production.
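The "consistent state" and "repeatability" points boil down to idempotence: you describe the desired end state, and applying the description a second time changes nothing. The following is a plain-Ruby illustration of that idea only, not how Puppet or Chef are implemented; `ensure_file` is a hypothetical helper invented for this sketch.

```ruby
require 'tmpdir'

# Bring a file to the desired content and report whether anything had to change.
def ensure_file(path, desired_content)
  current = File.exist?(path) ? File.read(path) : nil
  return :unchanged if current == desired_content

  File.write(path, desired_content)
  :changed
end

Dir.mktmpdir do |dir|
  motd = File.join(dir, 'motd')
  first  = ensure_file(motd, "Managed by configuration management\n")
  second = ensure_file(motd, "Managed by configuration management\n")
  puts "first run: #{first}, second run: #{second}"
end
```

Because the function reports whether it changed anything, a run over many such resources doubles as a drift report.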
As time permits, I’ll publish some follow-up posts which go into Puppet and Chef in more detail and look at how they can be used. I’ll also be publishing a review of James Turnbull’s new book, Pro Puppet, which is due to go to print at the end of the month.
Vagrant Testing, Testing, One Two
Now that we have Vagrant up and running with our favorite Config Management, let's see how we can integrate testing into our workflow.
Given our awesome project from my 'Using Vagrant as a Team' post we have the following components:
[DIR] awesome-vagrant (2)
- [DIR] awesome-frontend
- [DIR] awesome-datastore
- [DIR] awesome-data
- [DIR] awesome-chefrepo (1a)
- [DIR] awesome-puppetrepo (1b)
What do we test?
As awesome-{frontend,datastore,data} are considered traditional software components, they would include the usual unit and integration tests of their own. You can find ample information on the web for your favorite software component.
Cucumber and friends
Testing your configuration management is not that common yet, let's explore our options there:
Most of the current tools are inspired by 'cucumber', a 'behavior driven development' tool. Lindsay Holmwood's great presentation at devopsdays 2009 on 'cucumber-nagios' inspired a lot of the authors to use it.
A good book on Cucumber is the RSpec book, and there is a great slideshare presentation on 'Writing software not code with cucumber' and some caveats in 'You're cuking it wrong'.
Alternatively there is another framework called Babushka that sets out with its own testing DSL. I find it refreshing to see another approach being built upon.
Puppet testing options
For Puppet you have 'cucumber-puppet', written by Nikolay Sturm, a testing framework for your manifests.
- It relies on the 'noop' parameter of the puppet standalone client to simulate a run and see the results of that run.
- It also checks if the catalog of the next puppet run compiles OK.
- Tom Sulston did a great videocast for InfoQ: BDD with Puppet and Cucumber.
- Dean Wilson added additional steps to check providers with it; they extend cucumber-puppet by using puppet to actually test things on a provisioned system.
Chef testing options
As Chef did not implement a noop mode, I guess it took some time to have an equivalent.
My first thought was to have puppet noop runs against a chef install, but that seemed limited for the business behavior and would only test if chef did its job.
Recently hedgehog announced he is writing chef steps for cucumber. The good thing is he's packaging these steps, plus those from cucumber-nagios and others, into a new gem called 'Cuken' (pronounced Cookin). The origin of the cuken project is Aruba, a set of cucumber steps to test a CLI application.
Also do check out Stephen Nelson-Smith's videocast on doing TDD with Chef and Cucumber with LXC containers on EC2 (http://skillsmatter.com/podcast/home/cucumber-chef/js-1541).
Integration testing
For our project we took another route: instead of testing our chef recipes as a standalone piece, we test the whole of our deployed stack: the provisioned/configured system plus all applications and data deployed. You have to see this as complementary to your recipe/manifest tests:
- Testing all components together allows you to test the interaction/integration,
- whereas if you only test the recipes themselves, you would not catch integration problems (like sessions not being generated). The advantage of recipe-level tests, though, is that you have a better idea of where things are failing.
This is very similar to how unit tests and BDD tests complement each other: test inside out, and outside in.
Installing cucumber
cucumber is a rubygem: this means that besides the 'vagrant' gem, we now also need cucumber and cuken to be installed. Note that we will include only the cucumber-nagios steps and not the cuken part, as they still conflict in their ssh steps.
To avoid having to communicate the exact version of every gem we need, now or later, to every team member, we set out to create a 'Gemfile' that can be used by bundler. Our Gemfile would look like this:
    source 'http://rubygems.org'
    gem 'vagrant', '0.7.2'
    gem 'cuken'
    gem 'cucumber'
    gem 'cucumber-nagios'
I tried to make cuken (which has the chef steps) work from the latest git repo:
    gem 'cuken', :git => "git://github.com/hedgehog/cuken.git"
    gem 'ssh-forever', :git => "git://github.com/mattwynne/ssh-forever.git"
But it complains about ssh-forever not being there because that version was yanked. So no chef steps yet....
Update: 31/03/2011: It should work, and was probably a temporary fluke in my gemset
Now let's continue the installation of our gems using bundler.
We use a global gemset with rvm to install the bundler gem for all subsequent projects, and then run bundler on our awesome-vagrant gemset:
    $ rvm gemset use @global
    $ gem install bundler
    $ bundle install
    $ rvm gemset use awesome-vagrant
So now instead of doing 'gem install', you do:
$ bundle install
And it will install all the versions you specified in the Gemfile into the awesome-vagrant gemset. We add the Gemfile to the git repo of awesome-vagrant so people can add things if they need to.
You should now be able to run the cucumber command:
$ cucumber
Setting up our feature structure
In contrast to using cucumber with other frameworks such as rails, we have to do some work to get it working. We need to create a feature directory similar to the one below.
[DIR] awesome-vagrant
- Vagrantfile
- Gemfile
- awesome-{frontend,datastore,data,chefrepo} git repos
- features
  - steps
    (steps go here)
  - support
    env.rb
  - (features go here)
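If you would rather script that layout than create it by hand, a small Ruby helper can do it. This is only a sketch under the assumption that the tree above is all you need; `create_feature_skeleton` is a made-up name and not part of vagrant or cucumber.

```ruby
require 'fileutils'
require 'tmpdir'

# Create features/steps and features/support/env.rb under the given project root.
def create_feature_skeleton(project_root)
  steps_dir   = File.join(project_root, 'features', 'steps')
  support_dir = File.join(project_root, 'features', 'support')
  FileUtils.mkdir_p([steps_dir, support_dir])

  env_file = File.join(support_dir, 'env.rb')
  # Do not overwrite an env.rb that somebody already edited.
  File.write(env_file, "# requires go here\n") unless File.exist?(env_file)
  env_file
end

# Demonstrate in a throwaway directory instead of a real project.
Dir.mktmpdir do |root|
  create_feature_skeleton(root)
  puts Dir.glob(File.join(root, '**', '*')).map { |path| path.sub(root + '/', '') }.sort
end
```

Running it a second time is harmless: existing directories are left alone and an existing env.rb is kept.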
In env.rb you can put all the necessary requires for libraries you want to include :
    require 'bundler'
    begin
      Bundler.setup(:default, :development)
    rescue Bundler::BundlerError => e
      $stderr.puts e.message
      $stderr.puts "Run `bundle install` to install missing gems"
      exit e.status_code
    end

    $LOAD_PATH.unshift(File.dirname(__FILE__) + '/../../lib')

    # Disabling cuken until it gets less conflicting with other parts
    # require 'cuken/ssh'
    # require 'cuken/cmd'
    # require 'cuken/file'
    # require 'cuken/chef'

    # We don't include all nagios steps only the http , but there are of-course more
    # require 'cucumber/nagios/steps'

    # Disable the following line if you want to use the extended ssh_steps
    require 'cucumber/nagios/steps/ssh_steps'
    require 'cucumber/nagios/steps/http_steps'
    require 'cucumber/nagios/steps/http_header_steps'

    require 'rspec/expectations'

    # We use mechanize as this doesn't require us to be a rack application
    require 'mechanize'
    require 'webrat'

    World(Webrat)

    World do
      Webrat::Session.new(Webrat::MechanizeAdapter.new)
    end
Using SSH to run commands
Our first feature using cucumber ssh steps
Let's write our first feature that checks our apache. Based on the example described on the cucumber nagios blogpost
    Feature: Executing commands
      In order to test a running system
      As an administrator
      I want to verify the apache behavior

      Scenario: Checking if apache is running
        When I ssh to "localhost" with the following credentials:
          | username | password |
          | vagrant  | vagrant  |
        And I run "ps -ef |grep http|grep -v grep"
        Then I should see "http" in the output
Now run (assuming you have apache of course)
$ cucumber
The problem with the standard cucumber-nagios steps is that they assume ssh listens on port 22, while vagrant has mapped our ssh port to another one. See the ssh_steps code for details.
Our enhanced version of the ssh steps
We decided to extend the ssh steps to add a few more wrinkles to them.
- Download our extended ssh steps file and put it into the steps directory we created earlier as filename 'ssh_extended_steps.rb'. It extends the ssh_steps to be able to specify the ssh_port, and to capture stderr, stdout and the exit code too.
- And do the same for 'vagrant_steps.rb': this will make your ssh steps vagrant-aware.
Note: To avoid conflicts with cucumber-nagios, be sure to disable the "cucumber/nagios/steps/ssh_steps" require in your 'env.rb'.
    Feature: Executing commands
      In order to test a running system
      As an administrator
      I want to verify the apache behavior

      @apache2
      Scenario: Checking if apache is running through vagrant
        Given I have a vagrant project in "."
        When I ssh to vagrantbox "default" with the following credentials:
          | username | password |
          | vagrant  | vagrant  |
        And I run "ps -ef |grep apache2|grep -v grep"
        Then I should see "apache2" in the output
        And it should have exitcode 0
        And I should see "apache2" on stdout
        And there should be no output on stderr
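The last three assertions of this scenario only work if stdout, stderr and the exit code are captured separately. The sketch below shows that separation locally with Open3 from the Ruby standard library; it is not the code of ssh_extended_steps.rb, which does the equivalent over SSH, and `run_and_capture` is a name invented for this example.

```ruby
require 'open3'
require 'rbconfig'

# Run a command and keep stdout, stderr and the exit code apart.
def run_and_capture(*command)
  stdout, stderr, status = Open3.capture3(*command)
  { :stdout => stdout, :stderr => stderr, :exitcode => status.exitstatus }
end

# Use the current ruby binary as a stand-in for a remote command.
result = run_and_capture(RbConfig.ruby, '-e', 'puts "apache2"; warn "oops"; exit 3')
puts result.inspect
```

Keeping the three values in one hash makes each of them easy to assert on in its own step.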
The step 'Given I have a vagrant project' loads the vagrant environment:
    Given /^I have a vagrant project in "([^\"]*)"$/ do |path|
      @vagrant_env = Vagrant::Environment.new(:cwd => path)
      @vagrant_env.load!
    end
And the step 'When I ssh to vagrantbox' calculates the port it needs to ssh to:
    unless @vagrant_env.multivm?
      port = @vagrant_env.primary_vm.ssh.port
    else
      port = @vagrant_env.vms[boxname.to_sym].ssh.port
    end
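To see the port lookup in isolation, here is a sketch that runs without Vagrant installed. The `Fake*` structs are stand-ins invented for this example and only mimic the few methods the step touches; the real objects come from `Vagrant::Environment`.

```ruby
# Stand-ins for the parts of the Vagrant environment the step touches.
FakeSsh = Struct.new(:port)
FakeVm  = Struct.new(:ssh)
FakeEnv = Struct.new(:vms) do
  def multivm?
    vms.size > 1
  end

  def primary_vm
    vms.values.first
  end
end

# Same decision as in the step: a single VM uses the primary, multi-VM looks up by name.
def ssh_port_for(env, boxname)
  if env.multivm?
    env.vms[boxname.to_sym].ssh.port
  else
    env.primary_vm.ssh.port
  end
end

single = FakeEnv.new({ :default => FakeVm.new(FakeSsh.new(2222)) })
multi  = FakeEnv.new({ :web => FakeVm.new(FakeSsh.new(2200)),
                       :db  => FakeVm.new(FakeSsh.new(2201)) })

puts ssh_port_for(single, 'default')
puts ssh_port_for(multi, 'db')
```

The port numbers are arbitrary examples; in a real project they are whatever vagrant picked for the forwarded ssh port.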
On a side note, you might notice the @apache2: these are tags in cucumber that you can use to run only certain scenarios. This will only run the scenarios tagged @apache2:
    $ cucumber --tags @apache2
And this is how the step 'When I do a vagrant provision' is implemented:
    And /^I do a vagrant provision$/ do
      Vagrant::CLI.start(["provision"], :env => @vagrant_env)
    end
Running component unit tests from within the machine
You can use the same mechanism to run your components' tests inside the machine itself. You can have your application tests mounted inside the VM and run the tests from there. We use it as a complement to our 'vagrant project' tests. The advantage of the vagrant tests is that they do an actual network connect without working through loopback, and that they allow you to orchestrate which VM you need to log in to in a multi-VM setup.
    Feature: Executing commands
      In order to test a running system
      As an administrator
      I want to verify the apache behavior

      @unittests
      Scenario: Checking if componentX unittests ok
        Given I have a vagrant project in "."
        When I ssh to vagrantbox "default" with the following credentials:
          | username | password |
          | vagrant  | vagrant  |
        And I run "cd /opt/awesome-frontend; RAILS_ENV=test rake"
        And it should have exitcode 0
Testing HTTP access to a vagrant box
Besides running commands on the box, we wanted to be able to check HTTP things. The two main web testing gems in Ruby/Rails land are webrat and the newcomer on the block, Capybara. Both implement different 'browser' types to check your content: they have adaptors for real browsers (Firefox, Chrome, Safari) through Selenium or alike. We needed only simple HTTP testing, no DOM checking. The usual suspect is 'rack/test', but as we don't have a rack application, that failed miserably. We found that webrat has another option through mechanize. The gem comes installed when you install cucumber-nagios. Also, the webrat web steps are implemented in the http_steps of cucumber-nagios.
Update 31/03/2011: if you are using Capybara, there are two projects that look like an alternative to webrat:
- akephalos adapter that aims to be a headless unit testing framework - https://github.com/bernerdschaefer/akephalos
- mechanize adapter : https://github.com/jeroenvandijk/capybara-mechanize
A feature would look like this:
    Scenario: Surf to apache
      Given I go to "http://localhost:9000"
      Then I should see "It works"
Similar to our ssh problem, you see that we have to point the URL at the port mapped by vagrant. And this would also fail for virtual hosts, as it would not send the correct 'Host' header to the server.
Our enhanced vagrant version adds the 'Given I go to vagrant "url"' syntax:
    @vagrant
    Scenario: Surf to apache via vagrant
      Given I have a vagrant project in "."
      Given I go to vagrant "http://www.sample.com"
      Then I should see "It works"

    Given /^I go to vagrant "([^\"]*)"$/ do |url|
      virtual_visit(url)
    end
The following snippet implements that virtual_visit:
- it assumes @vagrant_env is loaded
- it corrects the Host: header accordingly to make the request virtual-host aware
- it maps the url port to the forwarded port on the host machine
- the function is added to the webrat module so it is accessible in your steps
    module Webrat #:nodoc:
      class Session #:nodoc:
        def virtual_visit(url, data=nil, options = {})
          # Options = Headers in regular visit
          uri = URI.parse(url)
          # We default to the same port
          port = uri.port

          # Now we translate url port to vagrant port
          # These mappings of ports are global and not per machine
          if @vagrant_env.nil?
            raise "No vagrant environment got loaded"
          end
          @vagrant_env.config.vm.forwarded_ports.each do |name, mapping|
            if mapping[:guestport] == uri.port
              port = mapping[:hostport]
            end
          end

          # Override the hostname in the headers
          headers = options.merge({ 'Host' => uri.host + ":" + port.to_s })

          # For the extended get method we need to wrap it
          # Traditional get method works
          # => with an URL as first arg
          # => and second = parameters (methods I guess)
          # But given some other arguments the get command behaves differently
          # See http://mechanize.rubyforge.org/mechanize/Mechanize.html for the source
          # https://github.com/brynary/webrat/blob/master/lib/webrat/adapters/mechanize.rb
          # https://github.com/brynary/webrat/blob/master/lib/webrat/core/session.rb
          # def get(options, parameters = [], referer = nil)
          @response = get({ :headers => headers,
                            :url => "#{uri.scheme}://localhost:#{port}#{uri.path}?#{uri.query}",
                            :verb => :get }, nil, options['Referer'])
        end
      end
    end
Now we can use the standard URL and behind the scenes the URL is translated to the correct http request.
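The translation itself can be tried without Vagrant or webrat. The sketch below is a simplified, standalone variant: `translate_for_vagrant` and the sample port mapping are invented for illustration. Unlike the snippet above it sends the bare host name in the Host header and only appends the query string when there is one; both are choices made for this sketch, not a description of the original steps file.

```ruby
require 'uri'

# forwarded_ports mimics the name => {:guestport, :hostport} mapping used above.
def translate_for_vagrant(url, forwarded_ports)
  uri  = URI.parse(url)
  port = uri.port
  forwarded_ports.each_value do |mapping|
    port = mapping[:hostport] if mapping[:guestport] == uri.port
  end

  path = uri.path.empty? ? '/' : uri.path
  local_url = "#{uri.scheme}://localhost:#{port}#{path}"
  # Only append the query string when there is one, to avoid a dangling "?".
  local_url += "?#{uri.query}" if uri.query

  { :url => local_url, :headers => { 'Host' => uri.host } }
end

ports  = { 'web' => { :guestport => 80, :hostport => 9000 } }
result = translate_for_vagrant('http://www.sample.com/status?verbose=1', ports)
puts result[:url]
puts result[:headers]['Host']
```

A port without a forwarding rule is passed through unchanged, which mirrors the "default to the same port" behaviour of the original.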
Final note:
This is pretty much work in progress. I hope to contribute the vagrant and ssh steps to the cuken project to make them uniformly available. Also, while writing this blogpost it occurred to me that we need a vagrant-cucumber plugin that will generate the feature structure and integrate cucumber as a subcommand.
Also I'm aware that these are bad examples of BDD, as they don't express Business talk unless your customer is a Sysadmin :)
I've cut off this blogpost here. I did promise you the integration with Jenkins as a CI server, so that's the next blogpost.
Hope to hear from you if you found this useful.
Using Vagrant as a Team
This blogpost goes into detail on how we leverage Vagrant in our day-to-day work. We use it with a team of 7 people to integrate a pretty complex application. To get an idea of the complexity:
- We have a nodejs server talking to a redis database
- a grails application that reads from the redis database and writes to a mysql db
- a rails frontend that reads from the grails rest services and writes to a mysql db
- a perl application importing data into the mysql db from an external source
- the nodejs logs via flume to a hadoop storage
- we extract data via sqoop from the hadoop storage
And all this is done on one Vagrant machine. We can't even imagine having to synchronize this setup on all the different development machines without Vagrant.
So thank you "Mitchell Hashimoto" and "John Bender" for this awesome tool!
We hope this blogpost (and the next ones in this series) will inspire you to do great things with it.
Preparing yourself for takeoff
Standard requirements
Vagrant, as described on the website, is a tool for building and distributing virtualized development environments.
- In order to use it, you need to have some things in place:
  - you need to have Virtualbox 4.0.X installed
  - have ruby running
- We like to add the following to the mix (not strictly required):
  - RVM (we recommend its use)
  - some version control (we use git) installed
Installing rvm (optional)
RVM is a great way of managing various things of ruby on a system. We really like it because:
- It does everything in userland (no sudo for gems)
- it allows the use of separate gemsets for each (project/customer) individually
- allows you to use different versions of ruby on the same machine
Installing it is plain easy:
$ bash < <( curl http://rvm.beginrescueend.com/releases/rvm-install-head )
To have your shell pick it up you can
$ source "$HOME/.rvm/scripts/rvm"
or to make it permanent add it to your .bash_profile
    # This loads RVM into a shell session.
    [[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
Setting up rvm
Up until now we only have the RVM scripts and no ruby yet. To install, for instance, ruby 1.9.2 on your system you can now run:
$ rvm install 1.9.2
Setting up a vagrant project called 'awesome' with rvm
Create a directory structure and change into it
    $ mkdir awesome-vagrant
    $ cd awesome-vagrant
Now create a file called .rvmrc
    $ echo "rvm_gemset_create_on_use_flag=1" > .rvmrc
    $ echo "rvm gemset use awesome-vagrant" >> .rvmrc
    $ echo "rvm use 1.9.2" >> .rvmrc
Go back one directory
$ cd ..
Trigger the read of the .rvmrc (this works through bash hooks). This will ask you to trust your new .rvmrc file
    $ cd awesome-vagrant
    ...
    RVM has encountered a not yet trusted .rvmrc file in the current working directory which may contain nasty code.
    ...
You should see the correct ruby and gem version now
    $ ruby --version
    $ gem --version
So now every time you enter the 'awesome-vagrant' directory it will have the correct gemset 'awesome-vagrant' loaded, and have the ruby version you like. Pretty cool, no?
Installing git (optional)
Most OSes now have a package available for git. Just use your favorite yum, apt, dpkg or whatever to install it.
On Mac OS X you can use macports; we use homebrew because you don't need root rights (it installs stuff in /usr/local/bin).
Alternatively, rvm provides a script based install of git
bash < <( curl http://rvm.beginrescueend.com/install/git )
Firing up the engines
Vagrant 101
Now that all the prerequisites are in place we can move on to the most basic example of using vagrant.
The example on the vagrant website goes like this
    $ cd awesome-vagrant
    $ gem install vagrant
    $ vagrant box add base http://files.vagrantup.com/lucid32.box
    $ vagrant init
    $ vagrant up
    $ vagrant ssh
Et voilà, that's all it takes to get you up and running as a developer with a lucid box! Pretty neat, eh? Under the covers the following happens:
- vagrant box add base http://files.vagrantup.com/lucid32.box
  - it will download the lucid32.box file
  - extract the lucid32.box file into your $HOME/.vagrant/boxes directory
  - and give it the name 'base'
- vagrant init:
  - creates a file called 'Vagrantfile' in your current directory
  - when you look at the file, it will contain the directive config.vm.box = "base"; this is what makes the link to the box we called 'base'
  - you can further edit the Vagrantfile before you start it
- vagrant up:
  - up until now, no virtual machine was created
  - therefore vagrant will import the disks from the box 'base' into Virtualbox
  - map via NAT the port 22 from your VM to a free local port
  - it will create a .vagrant file: a file that contains a mapping between your description 'base' and the UUID of the virtual machine
  - if you want to follow the magic, just start Virtualbox and you will see the machine being created
- vagrant ssh:
  - this will look up the ssh port mapping and execute the SSH process to log into the machine
  - it uses the private key of the user vagrant to log in to the box, which has the user vagrant set up with the matching public key in the virtual machine
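If you are curious about the .vagrant file mentioned in the 'vagrant up' step, the sketch below reads such a mapping with Ruby. It assumes the file is a small JSON document with an "active" section keyed by VM name; that layout is an assumption and may differ between Vagrant versions, so check your own copy first. The UUID in the sample and the helper name `uuid_for` are invented.

```ruby
require 'json'

# Assumed layout of the .vagrant file; the UUID is a made-up example value.
SAMPLE_DOTFILE = '{"active":{"default":"2f3a1c9e-0000-4000-8000-123456789abc"}}'

# Return the VirtualBox UUID recorded for a VM name, or nil when none is recorded.
def uuid_for(dotfile_json, vm_name)
  data = JSON.parse(dotfile_json)
  active = data['active'] || {}
  active[vm_name]
end

puts uuid_for(SAMPLE_DOTFILE, 'default')
puts uuid_for(SAMPLE_DOTFILE, 'missing').inspect
```

In a real project you would pass `File.read('.vagrant')` instead of the sample string.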
What about windows?
Some of our team members are not using a MacOSX or Linux variant but are running Windows.
There are some excellent instructions on getting Vagrant running on Windows as a host:
- Vagrant and Windows
- Vagrant and Windows 64-Bit
- Jruby, Winole32, Vagrant and Windows
We used the following:
- Install Java 64 Bit version
- Set $JAVA_HOME environment variable to the 64 Bit version
- Put $JAVA_HOME/bin in your path
- Install the Jruby 64 Bit (Ole version)
- Put $Jruby/bin in your path
- install the vagrant gem
- use Putty instead of vagrant ssh subcommand
- import the vagrant private key into your putty
Starting the vagrant command is a lot slower than under Linux/Mac OS X. I don't know why, but it slows the interaction down.
- We found that destroying a Windows box sometimes requires you to manually clean up the Virtualbox Machine directory of that virtual machine.
Running Windows as a VM managed by Vagrant is currently still a dream. But the Winrm project is making good progress toward becoming an ssh alternative for Windows machines. The Opscode guys are already integrating winrm in chef/knife. Maybe I'll start writing a winrm vagrant plugin for that soon.
A word on Vagrant baseboxes
Up until recently, finding Vagrant baseboxes was a matter of searching the internet and finding the URLs on different individuals' websites. Gareth Rushgrove has done a great job by setting up vagrantbox.es, where you can submit your own baseboxes to a central directory.
Those baseboxes are great, but you have to trust the one who packaged the box. In the future we might see vendors providing baseboxes for their setup similar to providing official AMI's on Amazon, but we're not there yet.
You can create a virtualbox virtual machine yourself (manual install, pxe install, or starting from an existing basebox), and then export it as a vagrant box
$ vagrant package --base my_base_box
Introducing veewee : an easy way to bootstrap new baseboxes
An alternative is to use veewee to bootstrap a machine automatically from scratch. This is a Vagrant plugin I created that eases the creation of baseboxes from scratch. It simulates a manual install by leveraging VRDP to type a Linux boot string and by serving the kickstart/preseed file over HTTP.
The following is a rundown on how to create an ubuntu basebox with veewee
Install the gem:
$ gem install veewee
List the veewee basebox definitions available:
$ vagrant basebox templates
The following templates are available:
...
vagrant basebox define '<boxname>' 'ubuntu-10.10-server-i386'
vagrant basebox define '<boxname>' 'ubuntu-10.10-server-i386-netboot'
...
Define a new box , this creates a definition directory
$ vagrant basebox define 'myubuntu' 'ubuntu-10.10-server-i386'
Have a look at the definition directory and change them if you want
$ ls definitions/myubuntu
definition.rb postinstall.sh preseed.cfg
Build the box. Note this will download the necessary iso file if needed
$ vagrant basebox build 'myubuntu'
Export the created vm as a basebox. This will finally create a myubuntubox.box
$ vagrant basebox export 'myubuntu'
It's still experimental, but we have automated installation working for various versions of Archlinux, CentOS, Debian, FreeBSD and Ubuntu. I think the benefit is that you don't need a PXE environment to set up machines, and it allows you to test your preseed and kickstart files and to version control the behavior of your basebox.
Remember it's code
Now is a good time to version control your awesome-vagrant project
$ cd
$ git init
$ git add Vagrantfile
$ git commit -m "This was just my first commit"
Taking a test flight
Getting your code on board
Now that you have your basebox running and are able to login to it, I know you are eager to start development. So let's grab that code you already did from git.
$ cd awesome-vagrant
$ git clone git@somerepo:/var/git/awesome-datastore
$ git clone git@somerepo:/var/git/awesome-frontend
$ git clone git@somerepo:/var/git/awesome-data
This results in the following structure
[DIR] awesome-vagrant
- [DIR] awesome-datastore (component1)
- [DIR] awesome-frontend (component2)
- [DIR] awesome-data (component3)
- Vagrantfile
Each directory shown here is a git repository that is checked out separately. Now we can mount these as directories inside our virtual machine. This is what Vagrant calls shared folders.
Our Vagrantfile looks like this
config.vm.share_folder "awesome-datastore", "/home/vagrant/awesome-datastore", "./awesome-datastore"
config.vm.share_folder "awesome-frontend", "/home/vagrant/awesome-frontend", "./awesome-frontend"
config.vm.share_folder "awesome-data", "/home/vagrant/awesome-data", "./awesome-data"
This will set up the directories inside your vm so you can edit them using your favorite IDE on your laptop and have the files instantly available inside your VM without the need for sync.
After editing the file you need to 'reboot' the machine for these settings to take effect
$ vagrant reload
We've hit quite a few problems with writing to shared folders. By default Vagrant uses the VirtualBox Guest Additions to share a folder of your local/host machine with the virtual machine. There have been numerous complaints about its stability, so you might want to check out the use of NFS folders to share the directories. Just add the NFS flag at the end. Please note that this requires an nfs-client to be installed in the basebox first.
config.vm.share_folder "awesome-frontend", "/home/vagrant/awesome-frontend", "./awesome-frontend",{:nfs => true}
...
The communication between host and VM is done over a host-only network, so your NFS shares will not get exposed to the outside world. Therefore you need to enable host-only networking by adding the following to your Vagrantfile
config.vm.network "33.33.33.10"
Don't forget to reload after changing that
$ vagrant reload
Adding config management to the mix
It might be tempting to log in to your new Vagrant box and install a bunch of packages manually to get things started. You should all remember Willem van den Ende saying 'Server login considered harmful'.
The real power of vagrant is that it promotes the use of configuration management for that. Infrastructure as code, FTW!
Vagrant currently supports Chef-Solo, Chef, Puppet, Puppet-Server and bash scripting as 'provisioners'. Provisioners are different from traditional installation scripts, as they follow the idempotence principle: they can be run over and over again and get the same results.
The vagrant command to run this is:
$ vagrant provision
and provisioning is also run when you do a
$ vagrant up
If you don't want it to run, you can specify
$ vagrant up --no-provision
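To make the idempotence principle mentioned above a bit more concrete, here is a minimal sketch in plain Ruby. It is not Chef or Puppet code: `ensure_line` is a hypothetical helper, and the hosts-style line is just an example value. The helper only appends a line when it is missing, so running it a second time changes nothing.

```ruby
require "tempfile"

# Hypothetical helper illustrating idempotence: append `line` to the file at
# `path` only if it is not already present. Returns true when the file was
# changed and false when there was nothing to do.
def ensure_line(path, line)
  existing = File.exist?(path) ? File.readlines(path).map(&:chomp) : []
  return false if existing.include?(line)

  File.open(path, "a") { |f| f.puts(line) }
  true
end

# Run the same step twice against a throwaway file.
file = Tempfile.new("hosts")
first  = ensure_line(file.path, "33.33.33.10 awesome-box")
second = ensure_line(file.path, "33.33.33.10 awesome-box")
puts "first run changed the file:  #{first}"
puts "second run changed the file: #{second}"
```

A plain installation script would typically append the line on every run; a provisioner describes the desired end state instead.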
Chef-Solo sample setup
The setup and explanation of Chef is beyond the scope of this blogpost. There is a great description on the Opscode website on how to set up a chef repository.
awesome-chefrepo
[DIR]cookbooks #those that come from opscode
[DIR]site-cookbooks #or your own
A sample Vagrantfile snippet looks like this:
config.vm.provision :chef_solo do |chef|
  chef.cookbooks_path = ["awesome-chefrepo/cookbooks",
                         "awesome-chefrepo/site-cookbooks"]
  chef.log_level = "debug"
  chef.add_recipe("nfs-client")
  chef.json.merge!({
    :mysql => {
      :server_root_password => "supersecret",
      :server_repl_password => "supersecret",
      :server_debian_password => "supersecret"},
    :java => {
      :install_flavor => "sun"}
  })
end
Running 'vagrant provision' will:
- share the cookbooks_path entries (and role paths, ...) in the virtual machine
- generate a solo.rb config file and transfer it to /tmp
- generate a dna.json file: a merge of Vagrant's own JSON information and the JSON you provided
- log in to the virtual machine as vagrant and do a
sudo chef-solo -c solo.rb -j dna.json
More detailed notes can be found on the Vagrant Provisioner website section
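The dna.json merge step can be pictured roughly like this. This is a sketch under assumptions, not Vagrant's actual implementation: `deep_merge` is a helper written for this example, and the contents of `vagrant_json` are placeholders for whatever the provisioner adds itself.

```ruby
require "json"

# Recursively merge two attribute hashes; values from `override` win, and
# nested hashes are merged instead of being replaced wholesale.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Placeholder for the information the provisioner adds (assumed shape).
vagrant_json = { "instance_role" => "vagrant", "recipes" => ["nfs-client"] }

# Attributes like the ones passed to chef.json in the Vagrantfile snippet.
user_json = {
  "mysql" => { "server_root_password" => "supersecret" },
  "java"  => { "install_flavor" => "sun" }
}

dna = deep_merge(vagrant_json, user_json)
puts JSON.pretty_generate(dna)
```

The resulting document is what chef-solo receives through its -j option.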
Puppet sample setup
James Turnbull wrote the puppet provisioner
We set up our puppet-repo like this:
awesome-puppetrepo
[DIR] manifests
[DIR] modules
mybox.pp
With the following corresponding puppet section in the Vagrantfile
config.vm.provision :puppet do |puppet|
  puppet.pp_path = "/tmp/vagrant-puppet"
  puppet.manifests_path = "./awesome-puppetrepo/manifests"
  puppet.module_path = "./awesome-puppetrepo/modules"
  puppet.manifest_file = "./awesome-puppetrepo/mybox.pp"
end
Where mybox.pp contains the manifest (for instance apache2) to be run on that box
package { "apache2": ensure => 'installed' }
Running 'vagrant provision' will:
- share the manifests_path and module_path directories in the virtual machine
- transfer your manifest_file to /tmp
- log in to the virtual machine as vagrant and do a
sudo puppet --modulepath awesome-puppetrepo/modules mybox.pp
More detailed notes can be found on the Vagrant Provisioner website section
Opening up the box - Network
Now that you have both your code and your environment inside the VM setup, the next step is to gain access to some of the network services. Vagrant makes this damn easy by mapping ports inside the VM to ports on your local system.
# Forward a port from the guest to the host, which allows for outside
# computers to access the VM, whereas host only networking does not.
# config.vm.forward_port "http", 80, 8080
config.vm.forward_port "awesome-datastore", 8080, 8080
config.vm.forward_port "awesome-frontend", 8000, 80
Again, to make these mappings take effect you need to restart the Vagrant box
$ vagrant reload
Now you can surf to your http://localhost:8080 and access your frontend inside the box
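If you want to check from a script whether a forwarded port actually answers, a small Ruby sketch like the following could do. `port_open?` is a hypothetical helper, and the host ports are read off the forward_port lines above on the assumption that the last argument is the port on the host side.

```ruby
require "socket"
require "timeout"

# Return true when a TCP connection to host:port succeeds within `seconds`.
def port_open?(host, port, seconds = 2)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue SystemCallError, SocketError, Timeout::Error
  # Connection refused, unreachable, unresolvable or too slow.
  false
end

# Host-side ports assumed from the forward_port lines above.
{ "awesome-datastore" => 8080, "awesome-frontend" => 80 }.each do |name, port|
  state = port_open?("localhost", port) ? "reachable" : "not reachable"
  puts "#{name} on localhost:#{port} is #{state}"
end
```

This only tells you that something accepts connections on the port, not that the service behind it is healthy.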
Overcoming bad network performance
We noticed that some of our network services would perform badly when accessed from the outside and be fast from the inside. At first we suspected the VirtualBox network NAT of slowing things down, but it turned out that DNS resolving was causing the delays. We had been told before that 'Everything is a Freaking DNS problem' and yes:
- depending on the network you were running on, DNS was badly set up for resolving internal IPs. Check your resolver
- we had libavahi installed (apparently it came with java), so we had to disable that to speed up the resolving
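A quick way to see whether name resolution is the slow part is to time a lookup from inside the box. The sketch below is one way to do that in Ruby; `lookup_time` is a hypothetical helper. It goes through the system resolver via `Socket.getaddrinfo`, so whatever the box is configured to consult is included in the measurement.

```ruby
require "socket"
require "benchmark"

# Time a single lookup. `resolver` is any callable taking a hostname; by
# default it asks the system resolver and returns the first address found.
def lookup_time(name, resolver = ->(n) { Socket.getaddrinfo(n, nil).first[3] })
  address = nil
  seconds = Benchmark.realtime do
    address = begin
      resolver.call(name)
    rescue SocketError
      nil # the name could not be resolved
    end
  end
  [address, seconds]
end

address, seconds = lookup_time("localhost")
puts format("localhost -> %s in %.3fs", address.inspect, seconds)
```

Replace "localhost" with one of the internal hostnames your services look up; a lookup that takes seconds rather than milliseconds points at the resolver setup.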
Tuning your engines
Customizing Vagrantfile
The great thing about the Vagrantfile is that it is actual Ruby code.
Settings for only some hosts
The following snippet allows us to still share the Vagrantfile but allows people to use NFS if they need it
# Switching to nfs for only those who want it
nfs_hosts=%w(mylaptop1 ruben-meanmachine)
require 'socket'
my_hostname=Socket.gethostname.split(/\./)[0]
if nfs_hosts.include?(my_hostname)
  # Assign this VM to a host only network IP, allowing you to access it
  # via the IP.
  config.vm.network "33.33.33.10"
  share_flags={:nfs => true}
else
  share_flags={}
end
Enabling different settings based on environment
Besides people developing code or configuration management code, we also have people who use a Vagrant machine to give demos at various places. They pull the latest version from git and are able to have the VM built with the latest features enabled.
For the demos they don't need to have the shared directories of all the code components available. We introduced the notion of
- vagrant_env : development, test, production, demo
- awesome_mode : a flag indicating what mode our applications should run in
Both can be set as environment variables and are picked up by the Vagrantfile
if (vagrant_env=="development" && ENV["AWESOME_MODE"]!="demo")
Setting these variables is as easy as prepending them to the vagrant command
$ vagrant_env=development vagrant up
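How the Vagrantfile picks these up could look roughly like the sketch below. The helper names and the default values are assumptions made for this example, and so is the idea that the condition guards the shared code directories.

```ruby
# Read the two settings from an environment hash, falling back to defaults.
# The default values are assumptions made for this sketch.
def awesome_settings(env = ENV)
  {
    vagrant_env:  env.fetch("vagrant_env", "development"),
    awesome_mode: env.fetch("AWESOME_MODE", "normal")
  }
end

# Decide whether the code checkouts should be shared into the VM, mirroring
# the condition shown above.
def share_code?(settings)
  settings[:vagrant_env] == "development" && settings[:awesome_mode] != "demo"
end

settings = awesome_settings
puts "sharing code directories: #{share_code?(settings)}"
```

Passing the environment in as an argument keeps the decision easy to try out with a plain hash, without touching the real environment.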
And now as a team please :
Some observations
We've been using vagrant as a team for about 2 months now and here are some observations we made:
- It clearly helps everybody to have a consistent environment to develop against, the latest version is just one git pull away.
The central approach drives people to a) do frequent commits and b) do stable commits.
The task of writing recipes/manifests is not picked up by all team members, and seems to remain the main job of the system-oriented people on the team.
- Reading manifests helps people understand what is needed and makes it easy to point out what needs to be changed. But learning the skills to write recipes/manifests is a blocking factor, just like having a backend developer write frontend code.
- When manifests/recipes are modified during a sprint, provisioning an existing virtual machine might fail as we don't take migrations from one VM state to the other into account. In that case, a box destroy and full provision is required.
The test the admins do before committing their manifests is that they destroy their own 'development' box and re-provision a new box to see if this works.
The longer the provision takes, the less frequently people do it. It's important to keep that process as fast as possible. It's all about feedback and we want it fast.
- Installation problems would get noticed far sooner in the process.
- People would only do a full rebuild in the morning when getting their coffee.
Having both development and production mode running on the Vagrant box
In our environment we have both the development and the production version running.
For instance, for our Rails component we have:
- a share of the awesome-frontend inside the box (and when this starts it runs on port 3000)
- we have the production mode running on port 80 (pulled from git as the latest tagged production)
This allows us to easily have both versions running. The production version is installed by a manifest/recipe and the share is started manually.
In summary
Vagrant rocks, but by now you should know!
Don't worry, the journey continues...
In the next post, I'll show you how we set up Vagrant with testing and use a continuous integration environment to have it build a new box, run the tests and make everybody happy. So stay tuned!
For additional inspiration:
- Vagrant main website
- Mitchell Hashimoto Blog
- John Bender's blog
- Cloudspace: Setting up development environments with vagrant and the opscode platform
- My vagrant workflow - by Dougal Matthews
- Bring the cloud on your desktop with Vagrant
- Start using Vagrant by Theodo
- Vagrant - Spatula
- Rvm with vagrant
- Instant Rails in A Virtualbox
- Openstack - Single node nova installation using Vagrant and chef
- Upgrading to Movable Type5 using chef and vagrant
- Haskell and Vagrant's Middleware
- Getting the most out of chef with Scalarium and Vagrant
- Gareth Rushgrove - talk at Fosdem on Configuration Management for development environments
- Gareth Rushgrove - default Recipes for Vagrant Virtual Machines - Using Gitrepos for your recipes
- Gareth Rushgrove - A continuous deployment example setup
The Impact of Amazon’s new CloudFormation service
Let me put to rest the worst of the FUD. This was never a master plan by Amazon to wipe out Chef and Puppet in a hostile takeover of the configuration management territory. Opscode were part of the CloudFormation Beta, and deeper integration with Chef is very much part of the future roadmap. So don’t worry - this is not an apocalyptic disaster - it’s an overwhelmingly good and exciting development that promises to make the task of complex orchestration a little bit easier.
CloudFormation is a service that simplifies the process of firing up a complete AWS stack. Instead of making individual API calls to set up EC2 instances, elastic load balancers, scaling groups and other offerings, we simply make one call. This is great - because previously making these calls was a bit of a pain. Your options ranged from using the AWS console, which is pretty unpleasant, through using tools such as the Java-based EC2 command line tools, through to scripting a series of calls with a library such as Fog or Boto.
Does that sound a lot like Chef or Puppet to you? No. Sure, knife has EC2 management capabilities because it wraps Fog, but that’s peripheral, and is really just recognition of the fact that Amazon hadn’t produced a fully featured and consistent way to drive their API.
The main point of confusion here is that people are equating provisioning and configuration management. Provisioning is going to the shop and buying a server. Racking it and cabling it. Putting it in the right VLAN. Giving it a port and an IP address and sticking an operating system on it. Outside of the cloud this is a pretty major undertaking, but the cloud makes all this very easy. Configuration management is policy driven. It’s deciding what software goes onto the machine, how it’s configured, how it should behave in certain circumstances, and enforcing that. You need both - CloudFormation provides the former.
Let’s be clear - I’m not downplaying the significance or awesomeness of the service. What Amazon have done with CloudFormation is make it much much easier to do this at a stack level rather than for each individual component of an AWS infrastructure. Together with Elastic Beanstalk, Amazon are doing some important and innovative stuff in this space.
For me the area which is of most interest is the mechanism for creating these stacks. CloudFormation uses JSON templates to specify the infrastructure components and interdependencies. Amazon have provided some sample templates for provisioning popular opensource stacks such as Drupal, Wordpress and Redmine. I think this is what has caused all the excitement. However, it’s important to remember that this is purely image-based - there’s no ongoing management of the essential configuration of these machines.
What excites me about all this is that it’s… JSON. We like JSON - JSON is used throughout Chef, and CloudFormation opens up lots of possibilities for creative interplay. Far from competing with or replacing Chef, CloudFormation plays directly to its strengths. Chef metadata can be passed from a JSON template, including role information, validation key and Chef server URL. The end result is a fully configured and managed AWS infrastructure, from scratch, with one call.
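To give a feel for what that interplay could look like, here is a rough Ruby sketch that builds a minimal CloudFormation template and passes Chef metadata to the instance as user data. Everything specific in it is a placeholder: the resource name, the AMI id, the Chef server URL and the role are invented for illustration, and the validation key is deliberately left out.

```ruby
require "json"

# Chef bootstrap data handed to the instance; all values are placeholders.
chef_metadata = {
  "chef_server_url" => "https://chef.example.com",
  "run_list"        => ["role[webserver]"]
}

# A minimal template with a single EC2 instance resource.
template = {
  "AWSTemplateFormatVersion" => "2010-09-09",
  "Description" => "Single web server that receives Chef metadata as user data",
  "Resources" => {
    "WebServer" => {
      "Type" => "AWS::EC2::Instance",
      "Properties" => {
        "ImageId"      => "ami-00000000", # placeholder AMI id
        "InstanceType" => "m1.small",
        "UserData"     => { "Fn::Base64" => JSON.generate(chef_metadata) }
      }
    }
  }
}

puts JSON.pretty_generate(template)
```

The image still needs something that reads the user data at boot and hands it to chef-client; the template only delivers the metadata.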
The other exciting thing is that this JSON can just be stored in a databag. This suddenly makes it really rather easy to manage and control some of the more complicated and powerful AWS services such as the queueing service, or CloudWatch alarms from the very heart of your configuration management tool.
So: is CloudFormation awesome? Yes. Exciting? Absolutely. Powerful? You bet! A replacement? A threat? Absolutely not - what we have here is the next generation in server automation and provisioning, in a form which slots in perfectly with next generation system integration and configuration management. Bring it on.
Opscode Chef Fundamentals Training 2011

It’s configuration management season in Europe! Prior to the eagerly anticipated Fosdem Config Management Dev Room, Opscode’s technical evangelist Joshua Timberman will be in London on the 31st of January and the 1st and 2nd of February, to give his highly regarded Chef Fundamentals training course.
For those of you who want the full menu, there is the opportunity to follow up this course with advanced Chef training, hosted by Patrick Debois, in Gent on the 3rd and 4th.
The Course
Chef Fundamentals is a 3-day comprehensive class covering the basic architecture of Chef and all of the underlying components. We will be covering installation basics of Chef Client and Chef Solo. Other topics will include: creating Chef repositories, creating cookbooks and advanced use of the command line utility, Knife. This class will include lecture, labs and some comprehensive case studies.
Pricing
Atalanta Systems is able to offer a significant reduction against the usual pricing of £500 per day, and offer a special community cost of £500 + VAT all inclusive for the whole three day course.
To register, contact Dee Strutt at Atalanta Systems. Places are limited, and last autumn’s class filled up quickly, so book soon!
Location

Training will take place at the prestigious Radisson Edwardian Hampshire Hotel in Leicester Square, right in the heart of London’s theatre land.
Trainer
Joshua Timberman is a technologist, focused on automation and continual improvement of software processes. As such, he has become an Agile practitioner. With over 10 years experience in Linux and Unix system administration, Joshua has worked for companies from 5 person startups, up to the largest IT company in the world. His background includes deploying highly available enterprise application environments and providing internal infrastructure services and team-based training. Joshua currently works for Opscode, where he is an infrastructure cooking expert with Chef. He speaks at local user group meetings and has a passion for teaching people how to make the most out of automation.
Chef and Encrypted Data Bags – Revisited
In my previous post here I described the logic behind wanting to store data in an encrypted form in our Chef data bags. I also described some general encryption techniques and gotchas for making that happen.
I've since done quite a bit of work in that regard and implemented this at our company. I wanted to go over a bit of detail about how to use my solution. Fair warning, this is a long post. Lots of scrolling.
A little recap
As I mentioned in my previous post, the only reliable way to do the encryption of data bag items in an automated fashion is to handle key management yourself outside of Chef. I mentioned two techniques:
- storing the decryption key on the server in a flat file
- calling a remote resource to grab the key
Essentially the biggest problem of this issue is key management and, in an optimal world, how to automate it reliably. For this demonstration, I've gone with storing a flat text file on the server. As I also said in my previous post, this assumes you tightly control access to that server. We're going with the original assumption that if a malicious person gets on your box, you're screwed no matter what.
Creating the key file
I used the knife command to handle my key creation for now:
knife ssh '*:*' interactive
echo "somedecryptionstringblahblahblah" > /tmp/.chef_decrypt.key
chmod 0640 /tmp/.chef_decrypt.key
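On the reading side, whatever consumes the key file might want to refuse a file with permissions that are too loose. A minimal sketch of that idea follows; `read_key` is a hypothetical helper and not part of the cookbook, and the demo uses a throwaway file rather than the real key path.

```ruby
require "tempfile"

# Read the passphrase from `path`, refusing files that other users can read.
def read_key(path)
  mode = File.stat(path).mode & 0o777
  raise "#{path} is world-readable (mode #{mode.to_s(8)})" if (mode & 0o004) != 0

  File.read(path).strip
end

# Demonstration with a throwaway file instead of /tmp/.chef_decrypt.key.
key_file = Tempfile.new("chef_decrypt")
key_file.write("somedecryptionstringblahblahblah\n")
key_file.flush
File.chmod(0o640, key_file.path)
puts "key has #{read_key(key_file.path).length} characters"
```

The mode 0640 used above passes this check because it grants nothing to other users.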
Setting up the databags and the rake tasks
One of the previous things I mentioned is knowing when and what to encrypt. Be sensible and keep it simple. We don't want to throw out the baby with the bath water. The Chef platform has lots of neat search capabilities that we'd like to keep. In this vein, I've created a fairly opinionated method for storing the encrypted data bag items.
We're going to want to create a new databag called "passwords". The format of the data bag is VERY simple:
We have an "id" that we want to use and the plaintext value that we want to encrypt.
Rake tasks
In my local chef-repo, I've created a 'tasks' folder. In that folder, I've added the following file:
As you can see, this requires a rubygem called encrypted_strings. I've done a cursory glance over the code and I can't see anything immediately unsafe about it. It only provides an abstraction to the native OpenSSL support in Ruby with an additional String helper. However I'm not a cryptographer by any stretch so you should do your own due diligence.
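The rake task itself is not reproduced here. As a rough idea of what the encryption step involves, the following sketch uses Ruby's standard OpenSSL bindings directly rather than the encrypted_strings gem, so it is a swapped-in technique and not the code from my task. The cipher, the key derivation settings and the helper names are assumptions, and there is no integrity check on the ciphertext.

```ruby
require "openssl"
require "json"

CIPHER_NAME = "aes-256-cbc"
KDF_ROUNDS  = 20_000

# Derive a 32-byte key from the passphrase and a per-item random salt.
def derive_key(passphrase, salt)
  OpenSSL::PKCS5.pbkdf2_hmac_sha1(passphrase, salt, KDF_ROUNDS, 32)
end

# Encrypt `plaintext`; salt and IV are prepended to the ciphertext and the
# whole thing is Base64-encoded so it fits in a JSON string.
def encrypt_value(plaintext, passphrase)
  cipher = OpenSSL::Cipher.new(CIPHER_NAME)
  cipher.encrypt
  salt = OpenSSL::Random.random_bytes(16)
  cipher.key = derive_key(passphrase, salt)
  iv = cipher.random_iv
  [salt + iv + cipher.update(plaintext) + cipher.final].pack("m0")
end

# Reverse of encrypt_value: split off salt and IV, then decrypt the rest.
def decrypt_value(encoded, passphrase)
  raw = encoded.unpack("m0").first
  salt, iv, data = raw[0, 16], raw[16, 16], raw[32..-1]
  cipher = OpenSSL::Cipher.new(CIPHER_NAME)
  cipher.decrypt
  cipher.key = derive_key(passphrase, salt)
  cipher.iv = iv
  (cipher.update(data) + cipher.final).force_encoding("UTF-8")
end

# Encrypt the "data" field of an item shaped like the ones in this post.
passphrase = "somedecryptionstringblahblahblah"
item = { "id" => "supersecretpassword", "data" => "mysupersecretpassword" }
crypted = item.merge("data" => encrypt_value(item["data"], passphrase))
puts JSON.generate(crypted)
puts decrypt_value(crypted["data"], passphrase)
```

Only the "data" field is touched, so the "id" stays searchable on the Chef server.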
At the end of your existing Rakefile, add the following:
load File.join(TOPDIR, 'tasks','encrypt_databag_item.rake')
If you now run rake -T you should see the new task listed:
rake encrypt_databag[databag_item] # Encrypt a databag item in the passwords databag
If you didn't already create a sample data bag and item, do so now:
mkdir data_bags/passwords/
echo '{"id":"supersecretpassword","data":"mysupersecretpassword"}' > data_bags/passwords/supersecretpassword.json
Now we run the rake task:
rake encrypt_databag[supersecretpassword]
Found item: supersecretpassword. Encrypting
Encrypted data is <some ugly string>
Uploading to Chef server
INFO: Updated data_bag_item[supersecretpassword_crypted.json]
You can test that the data was uploaded successfully:
knife data bag show passwords supersecretpassword
{
"data": "<some really ugly string>",
"id": "supersecretpassword"
}
Additionally, you should have in your 'data_bags/passwords' directory a new file called 'supersecretpassword_crypted.json'. The reason for keeping both files around is for key management. Should you need to change your passphrase/key, you'll need the original file around to reencrypt with the new key. You can decide to remove the unencrypted file if you want, as long as you have a way of recreating it.
Using the encrypted data
So now that we have a data bag item uploaded that we need to use, how do we get it on the client?
That will require two cookbooks:
- databag_decrypt
- A cookbook which needs the decrypted data. example
- include the decryption recipe
include_recipe "databag_decrypt::default"
password = search(:passwords, "id:supersecretpassword").first
decrypted_password = item_decrypt(password[:data])
From there, it's no different than any other recipe. Here's an example of how I use it to securely store Amazon S3 credentials as databag items:
include_recipe "databag_decrypt::default"
s3_access_key = item_decrypt(search(:passwords, "id:s3_access_key").first[:data])
s3_secret_key = item_decrypt(search(:passwords, "id:s3_secret_key").first[:data])
s3_file erlang_tar_gz do
  bucket "our-packages"
  object_name erlang_file_name
  aws_access_key_id s3_access_key
  aws_secret_access_key s3_secret_key
  checksum erl_checksum
end
Changing the key
Should you need to change the key, you'll need to jump through a few hoops:
- Update the passphrase on each client. Ease depends on your method of key distribution
- Update the passphrase in the rake task
- Reencrypt all your data bag items.
I store all of my data bag items in large json files and use split-em.rb to break them into individual files. Those files I upload with knife:
bin/split-em.rb -f data_bags/passwords/passwords.json -d passwords -o
Parsing data for svnpass into file data_bags/passwords/svnpass.json
Parsing data for s3_access_key into file data_bags/passwords/s3_access_key.json
Parsing data for s3_secret_key into file data_bags/passwords/s3_secret_key.json
#Run the following command to load the split bags into the passwords in chef
for i in svnpass s3_access_key s3_secret_key; do knife data bag from file passwords $i.json; done
You could then run that through the rake task to reupload the encrypted data:
for i in svnpass s3_access_key s3_secret_key; do rake encrypt_databag[$i]; done
Limitations/Gotchas/Additional Tips
Take note of the following, please.
Key management
The current method of key management is somewhat cumbersome. Ideally, the passphrase should be moved outside of the rake task. Additionally, the rekey process should be made a distinct rake task. I imagine a workflow similar to this:
- rake accepts a path to the encryption key
- additional rake task to change the encryption key in the form of oldpassfile/newpassfile.
- Existing data is decrypted using oldpassfile, reencrypted using newpassfile and sent back to the chef server.
Optimally, the rake task would understand the same attributes that the decryption cookbook does so it can handle key management on the client for you. I'd also like to make the cipher selection configurable as well and integrate it into the above steps.
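The rekey workflow described above could be sketched roughly like this. Nothing here exists in the current rake task: rekey_items is a hypothetical name, and the decrypt and encrypt arguments are any callables taking (data, passphrase), standing in for whatever cipher the rake task and the cookbook share. Sending the result back to the chef server is left out.

```ruby
# Hypothetical sketch of a rekey step.
# items       - array of hashes shaped like data bag items ("id", "data")
# oldpassfile - path to the file holding the current passphrase
# newpassfile - path to the file holding the new passphrase
# decrypt     - callable (data, passphrase) -> plaintext
# encrypt     - callable (plaintext, passphrase) -> data
def rekey_items(items, oldpassfile, newpassfile, decrypt:, encrypt:)
  old_pass = File.read(oldpassfile).strip
  new_pass = File.read(newpassfile).strip

  items.map do |item|
    # Decrypt with the old passphrase, reencrypt with the new one.
    plaintext = decrypt.call(item["data"], old_pass)
    { "id" => item["id"], "data" => encrypt.call(plaintext, new_pass) }
  end
end
```

Keeping the cipher behind two callables is a design choice for the sketch only; it would also make the cipher selection easy to swap, which is the configurability mentioned above.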
Duplicate work
Seth Falcon at Opscode is already in the process of adding official support for encrypted data bags to Chef. His method involves converting the entire databag sans "id" to YAML and encrypting it. I wholeheartedly support that effort but that would obviously require a universal upgrade to Chef as well. The purpose of my cookbook and tasks is to work with the existing version.
AWS IAM
If you're an Amazon EC2 user, you should start using IAM NOW. Stop putting your master credentials into recipes and limit your risk. I've created a 'chef' user to which I give limited access to certain AWS operations. You can see the policy file here. It gives the chef user read-only access to 'my_bucket' and 'my_other_bucket'.
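The linked policy file isn't reproduced in this post, so here is only a sketch of what a read-only policy of that kind might look like, built as a Ruby hash and printed as JSON. The bucket names come from the text above; the choice of actions (s3:ListBucket and s3:GetObject) and the helper name read_only_s3_policy are my assumptions about what "read-only" covers, not a copy of the real policy.

```ruby
require "json"

# Build an IAM policy document granting list and read access to the
# given buckets. The action list is an assumption for illustration.
def read_only_s3_policy(buckets)
  {
    "Version" => "2012-10-17",
    "Statement" => [
      {
        # Listing applies to the bucket itself.
        "Effect" => "Allow",
        "Action" => ["s3:ListBucket"],
        "Resource" => buckets.map { |b| "arn:aws:s3:::#{b}" }
      },
      {
        # Reading applies to the objects inside the bucket.
        "Effect" => "Allow",
        "Action" => ["s3:GetObject"],
        "Resource" => buckets.map { |b| "arn:aws:s3:::#{b}/*" }
      }
    ]
  }
end

puts JSON.pretty_generate(read_only_s3_policy(%w[my_bucket my_other_bucket]))
```

The resulting JSON would be attached to the 'chef' user; anything not explicitly allowed stays denied.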
If you wanted to get REALLY sneaky, you could use fake two-factor authentication to store your key in S3:
- Encrypt data bag items with the "credential B" password, except for one item, "s3_credentials"
- s3_credentials (credential A) is encrypted with a passphrase and managed similar to this article
- Use transient credentials to access S3 and grab a passphrase file (credential B)
- Decrypt data with secondary credentials
File-based passphrases
I'm not a big fan of the file-based passphrase method. While we agreed that you should consider yourself screwed if someone gets on the box, that still leaves poorly coded applications running as an attack vector. Imagine you have an application that must run as root. Now it can read the passphrase. Should that application become remotely exploitable, the passphrase file is vulnerable. I'm leaning toward a private server that allows RESTful access to grab the key. I've already added support in the cookbook for a passphrase type of 'url'.
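To make the difference between the two passphrase sources concrete, here is a small sketch of a lookup that handles both. It is not the cookbook's code: fetch_passphrase is a hypothetical name, and while 'url' is the type mentioned above, the 'file' type name and the argument layout are assumptions.

```ruby
require "net/http"
require "uri"

# Hypothetical dispatcher over passphrase sources.
# type     - "file" (assumed name) or "url"
# location - a path on disk or an http(s) URL
def fetch_passphrase(type, location)
  case type
  when "file"
    # Anything that can read this file can read the passphrase.
    File.read(location).strip
  when "url"
    # The passphrase lives on a separate server and is fetched on demand.
    response = Net::HTTP.get_response(URI.parse(location))
    unless response.is_a?(Net::HTTPSuccess)
      raise "passphrase server returned #{response.code}"
    end
    response.body.strip
  else
    raise ArgumentError, "unknown passphrase type: #{type}"
  end
end
```

With the 'url' type the passphrase never has to sit on the client's disk, though the server then needs its own access control, which this sketch does not attempt.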
Wrapup
I think that covers everything. I'd love some feedback on what people think. We've already implemented this in a limited scope for using IAM credentials in our cookbooks. I can easily revoke those should they get compromised without having to generate all new master keys.