
Archive → December, 2011

What is devops?

I'm parsing the responses to the Deploying Drupal survey I started a couple of months ago (more on that later).

One of the questions in the survey is "What is devops?". Apparently, when you ask a zillion people (ok ok, just a large bunch of Tweeps..), you get a wide variety of answers, ranging from totally wrong to spot on.

So let's go over them and see what we can learn from them ..

The most wrong definition one can give is probably:

  • A buzzword

I think we've long passed the buzzword phase, especially since it isn't new: it's a new term we put on an existing practice, a term that gives a lot of people who were already doing devops a common word to discuss it. Also, lots of people still seem to think that devops is a specific role, a job description, that it points to a specific group of people doing a certain job. It's not. Yes, you'll see a lot of organisations looking for devops people and giving them a devops job title, but it's kind of hard to be the only one doing devops in an organisation.

I described one of my current roles as Devops Kickstarter; it pretty much describes what I'm doing and it does contain devops :)

But devops also isn't

  • The connection between operations and development.
  • people that keep it running
  • crazy little fellows who find beauty in black/white letters( aka code) rather than a view like that of Taj in a full moon light.
  • the combination of developer and operations into one overall functionality
  • The perfect mixture between a developer and a system engineer. Someone who can optimize and simplify certain flows that are required by developers and system engineers, but sometimes are just outside of the scope for both of them.
  • Proxy between developer and management
  • The people in charge of the build/release cycle and planning.
  • A creature, made from 8-bit cells, with the knowledge of a seasoned developer, the skillset of a trained systems engineer and the perseverence of a true hacker.
  • The people filling the gap between the developer world and the sysadmin world. They understand dev. issues and system issues as well. They use tools from both world to solve them.

Or

  • Developers looking at the operations of the company and how we can save the company time and money

And it's definitely not

  • Someone who mixes both a sysop and dev duties
  • developers who know how to deploy and manage sites, including content and configuration.
  • I believe there's a thin line line between Ops and Devs where we need to do parts of each others jobs (or at least try) to reach our common goal..
  • A developer that creates and maintains environments tools to help other developers be more successful in building and releasing new products
  • Developers who also do IT operations, or visa versa.
  • Software developers that support development teams and assist with infrastructure systems

So no, developers taking on systems roles next to their own role and going for NoOps isn't feasible at all .. you really want collaboration, you want people with different skillsets who (try to) understand each other and (try to) work together towards a common goal.

Devops is also not just infrastructure as code

  • Writing software to manage operations
  • system administrators with a development culture.
  • Bring code management to operations, automating system admin tasks.
  • The melding of the art of Systems Administration and the skill of development with a focus on automation. A side effect of devops is the tearing down of the virtual wall that has existed between SA's and developers.
  • Infrastructure as code.
  • Applying some of the development worlds techniques (eg source control, builds, testing etc) to the operations world.
  • Code for infrastructure

Sure, infrastructure as code is a big part of the Automation part of CAMS, but just because you are doing puppet/chef doesn't mean you are doing devops.

Devops is also not just continuous delivery

  • A way to let operations deploy sites in regular intervals to enable developers to interact on the systems earlier and make deployments easier.
  • Devops is the process of how you go from development to release.

Obviously lots of people doing devops also try to achieve Continuous Delivery, but just like Infrastructure as Code, devops is not limited to that :)

But I guess the truth is somewhere in the definitions below ...

  • That sweet spot between "operating system" or platform stack and the application layer. It is wanting sys admins who are willing to go beyond the normal package installers, and developers who know how to make their platform hum with their application.
  • Breaking the wall between dev and ops in the same way agile breaks the wall between business and dev e.g. coming to terms with changing requirements, iterative cycles
  • Not being an arsehole!
  • Sysadmin best-practise, using configuration as code, and facilitating communication between sysadmins and developers, with each understanding and participating in the activities of the other.
  • Devops is both the process of developers and system operators working closer together, as well as people who know (or who have worked in) both development and system operations.
  • Culture collaboration, tool-chains
  • Removing barriers to communication and efficiency through shared vocabulary, ideals, and business objectives to to deliver value.
  • A set of principles and good practices to improve the interactions between Operations and Development.
  • Collaboration between developers and sysadmins to work towards more reliable platforms
  • Building a bridge between development and operations
  • The systematic process of building, deploying, managing, and using an application or group of applications such as a drupal site.
  • Devops is collaboration and Integration between Software Development and System Administration.
  • Devops is an emerging set of principles, methods and practices for communication, collaboration and integration between software development (application/software engineering) and IT operations (systems administration/infrastructure) professionals.[1] It has developed in response to the emerging understanding of the interdependence and importance of both the development and operations disciplines in meeting an organization's goal of rapidly producing software products and services.
  • bringing together technology (development) & content (management) closer together
  • Making developers and admins understand each other.
  • Communication between developers and systems folk.
  • a cultural movement to improve agility between dev and ops
  • The cultural extension of agile to bring operations into development teams.
  • Tight collaboration of developers, operations team (sys admins) and QA-team.

But I can only conclude that there is a huge amount of evangelisation that still needs to be done. Lots of people still don't understand what devops is, or have a totally different view on it.

A number of technology conferences have taken up devops as part of their program, inviting experienced people from outside their focus field to talk about how they improve the quality of life!

There is still a large number of devops related problems to solve, so that's what I'll be doing in 2012.

Yes, but how do I get the domain signed?

I’ve had pdnssec running for this domain and  a few others for some time now and the domains have been signed “locally” in preparation for signing by the parent domain. This works really nicely and now that I’m doing consultancy and contract work I thought I’d look into how I get the domain signed by […]

Value of DevOps Culture: It’s not just hugs and kumbaya

The importance of culture is a recurring theme in most DevOps discussions. It’s often cited as the thing you should start with and the thing you should worry about the most.

But other than the rather obvious idea that it’s beneficial for any company to have a culture of trust, communication, and collaboration… can using DevOps thinking to change your culture actually provide a distinct business advantage?

Let’s take the example of Continuous Deployment (or its sibling, Continuous Delivery). This is an operating model that embodies a lot of the ideals that you’ll hear about in DevOps circles and is impossible to properly implement if your org suffers from DevOps problems.

Continuous Deployment is not just a model where companies can release services quicker and more reliably (if you don’t understand why that is NOT a paradox, please go read more about Continuous Deployment). Whether or not you think it could work for your organization, Continuous Deployment is a model that has been proven to unleash the creative and inventive potential of other organizations. Because of this, Continuous Deployment is a good proxy for examining the effects of solving DevOps problems.

Eric Ries sums it up better than I can when he describes the transformative effect that takes place the further you can reduce the cost, friction, and time between releases (i.e. tests to see if you can better please the customer).

“When you have only one test, you don’t have entrepreneurs, you have politicians, because you have to sell. Out of a hundred good ideas, you’ve got to sell your idea. So you build up a society of politicians and salespeople. When you have five hundred tests you’re running, then everybody’s ideas can run. And then you create entrepreneurs who run and learn and can retest and relearn as opposed to a society of politicians.”

-Eric Ries
  The Lean Startup (pg. 33) 

That’s a business advantage. That’s value derived from a DevOps-style change in culture.

 


Markdown to Confluence Convertor

Recently in Confluence 4.0 the Wiki Markup Editor was removed for various engineering reasons. I like to type my text in wiki style, and most of all using Markdown.

This code is a quick hack for converting Markdown to the Atlassian Confluence markup language, which you can still insert via the menu.

It's not a 100% full conversion, but I find it rather usable already. I will continue to improve it where possible.

The gem is based on Kramdown.

Installation:

Via gem

$ gem install markdown2confluence

From github:

$ gem install bundler
$ git clone git://github.com/jedi4ever/markdown2confluence.git
$ bundle install vendor

Usage:

If using Gem:

$ markdown2confluence <inputfile>

If using bundler:

$ bundle exec bin/markdown2confluence <inputfile>

Extending/Improving it:

There is really only one class to edit:

  • lib/markdown2confluence/convertor/confluence.rb

Feel free to enhance or improve tag handling.

Behavioral testing with Vagrant – Take 2

A big thanks to Atlassian for allowing me to post this series!!

Running tests from within the VM

After I covered Puppet Unit Testing, the logical step is writing about Behavioral testing.

While writing this, I came up with a good example of why BDD needs to complement your unit tests: I had installed the Apache Puppet module and provisioning ran OK. It wasn't until I tested the webpage with lynx http://localhost that I realised I needed to create a default website. This is of course a trivial example, but it shows you that BDD can help you catch logical errors.

When this topic arises, most people are familiar with Cucumber Nagios. It contains a series of Cucumber steps that allow you to test http requests, amqp, dns, ssh and commands.

From what I found, most people would execute these tests on the VMs directly. This requires you to install cucumber and all of its dependent gems in the VM. Gareth Rushgrove wrote a great blogpost on packaging cucumber-nagios with fpm.

Running tests from outside the VM - Take 1

In some situations, the required gems and libraries might lead to conflicts or introduce dependencies you would rather not have on your production machine. They would also become one more thing to maintain on your production machines.

So in a previous blogpost, Vagrant Testing, Testing One Two, I already described using modified Cucumber-Nagios steps that interact with Vagrant over ssh.

Running tests from outside the VM - Take 2

But I had a problem with the previous approach. Depending on the situation I would need to run the same tests via different connection methods: vagrant uses ssh, ec2 via fog, openvz via vzctl etc...

So I came up with a new flexible approach: use a configurable command to connect to a vm and have it execute the same steps.

With a little Aruba help

While Cucumber-Nagios slowly moves into Cuken, the SSH steps are getting converted to Aruba steps for local execution, in combination with the ssh-forever steps for ssh interaction.

The Aruba gem is a set of CLI steps for Cucumber. You can use it to interact with a process interactively or just do a single run. Example steps could look like:

Given I run "ssh localhost -p 2222" interactively
And I type "apache2ctl configtest"
And the exit status should be 0

Making it connection neutral

As you can see in the previous step, the connection details are still in the feature. Not great if we want to run it locally as well. I rephrased it to:

Feature: apache check

  Scenario: see if the apache header is served
    Given I execute `lynx http://localhost --dump` on a running system
    Then the output should match /It works/
    Then the exit status should be 0

  Scenario: check if the apache config is valid
    Given I execute `apache2ctl configtest` on a running system
    Then the exit status should be 0

Writing the logic

Here is the logic to make this work (put it in features/support/step_definitions/remote_system_connect_steps.rb). It uses two environment variables:

SYSTEM_EXECUTE: the command used to execute a single command on the system
SYSTEM_CONNECT: the command used to connect to the system interactively

Example for vagrant would be:

SYSTEM_EXECUTE: "vagrant ssh_config | ssh -q -F /dev/stdin default"
SYSTEM_CONNECT: "vagrant ssh"

This can also be your favorite knife ssh, vzctl 33 enter or mc-ssh somehost.


When /^I execute `([^`]*)` on a running system$/ do |cmd|
  @execute_command=ENV['SYSTEM_EXECUTE']
  @execute_failed=false
  unless @execute_command.nil?
    steps %Q{ When I run `#{@execute_command} "#{cmd}"` }
  else
    @execute_failed=true
    raise "No SYSTEM_EXECUTE environment variable specified"
  end
end

When /^I connect to a running system interactively$/ do
  @connect_command=ENV['SYSTEM_CONNECT']
  @connect_failed=false
  unless @connect_command.nil?
    steps %Q{
        When I run `#{@connect_command}` interactively
    }
  else
    @connect_failed=true
    raise "No SYSTEM_COMMAND environment variable specified"
  end
end

When /^I disconnect$/ do
  steps %Q{ When I type "exit $?" }
end

Monkey Patching Aruba

By default, Aruba uses Shellwords to parse the command lines you pass, and it seems to have an issue with "|" symbols. This is the patch I came up with (in features/support/env.rb):

require 'aruba/cucumber'
require 'shellwords'

# Here we monkey patch Aruba to work with pipe commands
module Aruba
  class Process
    include Shellwords

    def initialize(cmd, exit_timeout, io_wait)
      @exit_timeout = exit_timeout
      @io_wait = io_wait

      @out = Tempfile.new("aruba-out")
      @err = Tempfile.new("aruba-err")
      @process = ChildProcess.build(cmd)
      @process.io.stdout = @out
      @process.io.stderr = @err
      @process.duplex = true
    end
  end
end

After this, a regular cucumber run should work (note: use a recent cucumber version, 1.1.x).

Automating it with Rake

The last part is automating this for Vagrant. For this we create a little rake task:

require "cucumber/rake/task"
task :default => ["validate"]

# Usage rake validate
# - single vm: rake validate
# - multi vm: rake validate vm=logger
Cucumber::Rake::Task.new(:validate) do |task|
    # VM needs to be running already
    vm_name=ENV['vm'] || ""
    ssh_name=ENV['vm'] || "default"
    ENV['SYSTEM_CONNECT']="vagrant ssh #{vm_name}"
    ENV['SYSTEM_EXECUTE']="vagrant ssh_config #{vm_name}| ssh -q -F /dev/stdin #{ssh_name}"
    task.cucumber_opts = ["-s","-c", "features" ]
end

Final words

This solution allows you to reuse the command execution steps, running them locally, over ssh, or via some other connection command.

  • This only works for commands that run over ssh, but I think that is already powerful. If you would require amqp testing, you could probably find a command based check as well.
  • Shell escaping is not 100% correct; this needs more work to handle special characters or quotes inside quotes.
  • When testing, I sometimes miss the context of how a server was created (f.i. the params passed to the puppet manifest or the facts); maybe I could expose this in the puppet manifests. Not sure about this yet.
  • If there is an interest, I could turn this into a vagrant plugin, to make it really easy.

All code can be found at the demo project: https://github.com/jedi4ever/vagrant-guard-demo

Lisa 2011

Last week I was in Boston for my 1st and their 25th edition of the Large Installation System Administration (LISA) conference.
Lisa was pretty much all I expected from it: old Unix wizards with long hair and white beards, the usual suspects, and a mix of devops practitioners at a devops themed conference, with on one side awesome and well positioned content and on the other side absolutely basic stuff.

On Tuesday I had a devops BoF scheduled for 2 hours.

My goal for the session was not to talk myself, but to let the audience figure out the 4 key components of devops as documented by @botchagalupe and @damonedwards: Culture, Automation, Measurement and Sharing. I have to admit it took me a while to get them to that point .. but they figured it out themselves .. the BoF was standing room only, and there was a good discussion going on.

On Wednesday I gave my talk titled "Devops: the Future is here, it's just not evenly distributed yet".

During my talk I realized that the crowd needed some more explanation about Vagrant ... so I proposed a BoF on that topic too ... I used @patrickdebois's awesome slides and hosted a small BoF on Vagrant on Thursday evening.

Friday morning I was scheduled to be on a panel featuring a #devops guy, a storage guy and a network guy.
As my voice was starting to break down I wasn't really confident; however, by the time the panel started I could talk normally again :)
The setup was weird .. it was basically 3 people with totally different backgrounds discussing a variety of topics. There were no really opposing views; mostly we agreed with each other, so I'm not really sure if the audience was really entertained :)

Anyhow, 2 BoFs, a talk and a panel later .. I was exhausted and ready to fly back to Belgium.

Tomorrow I have another presentation together with Patrick at the BeJug .. problem is .. I'm still looking for my voice ;(

So worst case .. I'm just gonna turn on the recording that the Usenix folks made of my talk ...

Must admit .. I've given better talks ..

Common Messaging Patterns Using Stomp – Part 5

This is a post in a series about Middleware for Stomp users; please read the preceding parts, starting at part 1, before continuing below.

Today I'm changing things around a bit: not so much talking about using Stomp from Ruby, but rather about how we would monitor ActiveMQ. The ActiveMQ broker has a statistics plugin that you can interact with over Stomp, which is particularly nice – being able to interrogate it over the same protocol as you use it.

I’ll run through some basic approaches to monitor:

  • The size of queues
  • The memory usage of persisted messages on a queue
  • The rate of messages through a topic or a queue
  • Various memory usage statistics for the broker itself
  • Message counts and rates for the broker as a whole

These are the standard kinds of things you need to know about a running broker, in addition to things like monitoring the length of garbage collections, which is standard when dealing with Java applications.

Keeping an eye on your queue sizes is very important. I’ve focused a lot on how Queues help you scale by facilitating horizontally adding consumers. Monitoring facilitates the decision making process for how many consumers you need – when to remove some and when to add some.

First you’re going to want to enable the plugin for ActiveMQ, open up your activemq.xml and add the plugin as below and restart when you are done:

<plugins>
   <statisticsBrokerPlugin/>
</plugins>

A quick word about the output format of the messages you’ll see below. They are a serialized JSON (or XML) representation of a data structure. Unfortunately it isn’t immediately usable without some pre-parsing into a real data structure. The Nagios and Cacti plugins you will see below have a method in them for converting this structure into a normal Ruby hash.

The basic process for requesting stats is a Request Response pattern as per part 3.

stomp.subscribe("/temp-topic/stats", {"transformation" => "jms-map-json"})
 
# request stats for the random generator queue from part 2
stomp.publish("/queue/ActiveMQ.Statistics.Destination.random_generator", "", {"reply-to" => "/temp-topic/stats"})
 
puts stomp.receive.body

First we subscribe to a temporary topic that you first saw in Part 2 and we specify that while ActiveMQ will output a JMS Map it should please convert this for us into a JSON document rather than the java structures.

We then request Destination stats for the random_generator queue and finally wait for the response and print it, what you’ll get from it can be seen below:

{"map":{"entry":[{"string":"memoryUsage","long":0},{"string":"dequeueCount","long":13},{"string":"inflightCount","long":0},{"string":"messagesCached","long":0},
{"string":"averageEnqueueTime","double":0.46153846153846156},{"string":["destinationName","queue:\/\/mcollective.nodes"]},{"string":"size","long":0},
{"string":"memoryPercentUsage","int":0},{"string":"producerCount","long":0},{"string":"consumerCount","long":56},{"string":"minEnqueueTime","double":0},
{"string":"maxEnqueueTime","double":1},{"string":"dispatchCount","long":13},{"string":"expiredCount","long":0},{"string":"enqueueCount","long":13},
{"string":"memoryLimit","long":83886080}]}}

Queue Statistics
Queue sizes are basically as you saw above: hit the Stats Plugin at /queue/ActiveMQ.Statistics.Destination.<queue name> and you get stats back for the queue in question.

The table below lists the meaning of these values as far as I understand them – it is quite conceivable I am wrong about the specifics of ones like enqueueTime, so I am happy to be corrected in the comments:

destinationName The name of the queue in JMS URL format
enqueueCount Amount of messages that were sent to the queue and committed to it
inflightCount Messages sent to the consumers but not yet consumed – they might be sitting in the prefetch buffers
dequeueCount The opposite of enqueueCount – messages sent from the queue to consumers
dispatchCount Like dequeueCount but includes messages that might have been rolled back
expiredCount Messages can have a maximum life; these are the ones that expired
maxEnqueueTime The maximum amount of time a message sat on the queue before being consumed
minEnqueueTime The minimum amount of time a message sat on the queue before being consumed
averageEnqueueTime The average amount of time a message sat on the queue before being consumed
memoryUsage Memory used by messages stored in the queue
memoryPercentUsage Percentage of available queue memory used
memoryLimit Total amount of memory this queue can use
size How many messages are currently in the queue
consumerCount Consumers currently subscribed to this queue
producerCount Producers currently producing messages


I have written a nagios plugin that can check the queue sizes:

$ check_activemq_queue.rb --host localhost --user nagios --password passw0rd --queue random_generator --queue-warn 10 --queue-crit 20
OK: ActiveMQ random_generator has 1 messages

You can see there’s enough information about the specific queue to be able to graph message rates, consumer counts and all sorts of useful information. I also have a quick script that will return all this data in a format suitable for use by Cacti:

$ activemq-cacti-plugin.rb --host localhost --user nagios --password passw0rd --report exim.stats
size:0 dispatchCount:168951 memoryUsage:0 averageEnqueueTime:1629.42897052992 enqueueCount:168951 minEnqueueTime:0.0 consumerCount:1 producerCount:0 memoryPercentUsage:0 destinationName:queue://exim.stats messagesCached:0 memoryLimit:20971520 inflightCount:0 dequeueCount:168951 expiredCount:0 maxEnqueueTime:328585.0

Broker Statistics
Getting stats for the broker is more of the same: just send a message to /queue/ActiveMQ.Statistics.Broker and tell it where to reply to, and you'll get a message back with these properties. I am only listing the ones not seen above; the meanings are the same, except that in the broker stats these are totals for all queues and topics.

storePercentUsage Total percentage of storage used for all queues
storeLimit Total storage space available
storeUsage Storage space currently used
tempLimit Total temporary space available
brokerId Unique ID for this broker that you will see in Advisory messages
dataDirectory Where the broker is configured to store its data for queue persistence etc
brokerName The name this broker was given in its configuration file


Additionally, there will be a value for each of your connectors listing its URL, including protocol and port.
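
For completeness, requesting the broker statistics over Stomp looks just like the destination request shown earlier; a minimal sketch:

# same temporary topic subscription as before
stomp.subscribe("/temp-topic/stats", {"transformation" => "jms-map-json"})

# ask for the broker's own statistics rather than those of a destination
stomp.publish("/queue/ActiveMQ.Statistics.Broker", "", {"reply-to" => "/temp-topic/stats"})

puts stomp.receive.body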

Again I have a Cacti plugin to get these values out in a format usable in Cacti data sources:

$ activemq-cacti-plugin.rb --host localhost --user nagios --password passw0rd --report broker
stomp+ssl:stomp+ssl storePercentUsage:81 size:5597 ssl:ssl vm:vm://web3 dataDirectory:/var/log/activemq/activemq-data dispatchCount:169533 brokerName:web3 openwire:tcp://web3:6166 storeUsage:869933776 memoryUsage:1564 tempUsage:0 averageEnqueueTime:1623.90502285799 enqueueCount:174080 minEnqueueTime:0.0 producerCount:0 memoryPercentUsage:0 tempLimit:104857600 messagesCached:0 consumerCount:2 memoryLimit:20971520 storeLimit:1073741824 inflightCount:9 dequeueCount:169525 brokerId:ID:web3-44651-1280002111036-0:0 tempPercentUsage:0 stomp:stomp://web3:6163 maxEnqueueTime:328585.0 expiredCount:0

You can find the plugins mentioned above in my GitHub account.

In the same location is a generic checker that publishes a message and waits for its return within a specified number of seconds – a good turnaround test for your broker.

I don’t really have good templates to share, but the above plugins are what I used to build my own Cacti graphs.

Common Messaging Patterns Using Stomp – Part 4

This is an ongoing post in a series of posts about Middleware for Stomp users; please read parts 1, 2 and 3 of this series before continuing below.

Back in Part 2 we wrote a little system to ship metrics from nodes into Graphite via Stomp. This solved the problem we had then, but now let's see what to do when our needs change.

Graphite is like RRD in that it summarizes data over time and eventually discards old data. Contrast that with OpenTSDB, which never summarizes or deletes data and can store billions of data points. Imagine we want to use Graphite as a short term reporting service for our data, but we also need to store the data long term without losing anything. So we really want to store the data in 2 locations.

We have a few options open to us:

  • Send the metric twice from every node, once to Graphite and once to OpenTSDB.
  • Write a software router that receives metrics on one queue and then route the metric to 2 other queues in the middleware.
  • Use facilities internal to the middleware to do the routing for us

The first option is an obviously bad idea and should just be avoided – this would be the worst case scenario for data collection at scale. The 3rd seems like the natural choice here, but first we need to understand the facilities the middleware provides. Today's article will explore what ActiveMQ can do for you in this regard.

The 2nd seems an odd fit, but as you'll see below the capabilities for internal routing at the middleware layer aren't all that exciting – useful in some cases, but I think most projects will reach for some kind of message router in code sooner or later.

Virtual Destinations
If you think back to part 2, you'll remember we have a publisher that publishes data into a queue and any number of consumers that consume the queue. The queue will load balance the messages for us, thus helping us scale.

In order to also create OpenTSDB data we essentially need to double up the consumer side into 2 groups. Ideally each set of consumers will be horizontally scalable, and both sets should get a copy of all the data – in other words we need 2 queues with all the data in them, one for Graphite and one for OpenTSDB.

You will also remember that Topics have the behavior of duplicating data they receive to all consumers of the topics. So really what we want is to attach 2 queues to a single topic. This way the topic will duplicate the data and the queues will be used for the scalable consumption of the data.

ActiveMQ provides a feature called Virtual Topics that solves this exact problem by convention. You publish messages to a predictably named topic and then you can create any number of queues that will all get a copy of that message.

The convention is:

  • Publish to /topic/VirtualTopic.metrics
  • Create consumers for /queue/Consumer.Graphite.VirtualTopic.metrics

Create as many of these consumer queues as you want, changing Graphite for some unique name; each of the resulting queues will behave like a normal queue with all the load balancing, storage and other queue-like behaviors, but all the queues will get a copy of all the data.

You can customize the name pattern of these queues by changing the ActiveMQ configuration files. I really like this approach to solving the problem versus approaches found in other brokers, since this is all done by convention and you do not need to change your code to set up a bunch of internal structures that describe the routing topology. I consider routing topology that lives in the code of the consumers to be a form of hard coding. Using this approach, all I need to do is make sure the names of the destinations to publish to and consume from are configurable strings.

Our Graphite consumer would not need to change, other than the name of the queue it reads from, and ditto for the producer.
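
To make this concrete, here is a minimal sketch using the Ruby stomp gem and the conventional names above; the broker address and the metric payload are made up for illustration:

require 'rubygems'
require 'stomp'

stomp = Stomp::Client.new("", "", "localhost", 61613)

# the producer publishes once, to the virtual topic
stomp.publish("/topic/VirtualTopic.metrics", "load.1min 0.72 #{Time.now.to_i}")

# Graphite consumers share one queue and load balance amongst themselves...
stomp.subscribe("/queue/Consumer.Graphite.VirtualTopic.metrics") do |msg|
  puts "graphite got: #{msg.body}"
end

# ...while OpenTSDB consumers get their own copy of every message on a second queue
stomp.subscribe("/queue/Consumer.OpenTSDB.VirtualTopic.metrics") do |msg|
  puts "opentsdb got: #{msg.body}"
end

stomp.join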

If we find that we simply cannot change the code for the consumers/producer, or that it just is not a configurable setting, you can still achieve this behavior by using something called Composite Destinations in ActiveMQ, which can describe this behavior purely in the config file with arbitrarily named queues and topics.

Selective Consumers
Imagine we wish to give each one of our thousands of servers a unique destination on the middleware so that we can send machines a command directly. You could simply create queues like /queue/nodes.web1.example.com and keep creating one queue per server.

The problem with this approach is that internally to ActiveMQ each queue is a thread. So you’d be creating thousands of threads – not ideal.

As we saw before in Part 3 messages can have headers – there we used the reply-to header. Below you’ll find some code that sets an arbitrary header:

stomp.publish("/queue/nodes", "service restart httpd", {"fqdn" => "web1.example.com"})

We are publishing a message with the text service restart httpd to a queue and we are setting a fqdn header.

Now, with the knowledge of queues you have at this point, if every server in our estate subscribed to this one queue, this restart request would end up being sent to some random one of our servers – not ideal!

The JMS specification allows for something called selectors to be used while subscribing to a destination:

stomp.subscribe("/queue/nodes", {"selector" => "fqdn = 'web1.example.com'"})

The selector header sets the logic applied to every message, which decides whether or not you get the message on your subscription. The selector language is defined by the SQL 92 standard and you can generally apply logic to any header in the message.

This way we set up a queue for all our servers without the overhead of 1000s of threads.

The choice between a queue like this and a traditional queue comes down to weighing the overhead of validating all the SQL statements against that of creating all the threads. There are also some side effects if you have a cluster of brokers – the queue traffic gets duplicated to all cluster brokers, whereas with traditional queues the traffic only gets sent to a broker if that broker actually has subscribers interested in this data.

So you need to carefully consider the implications and do some tests with your work load, message sizes, message frequencies, amount of consumers etc.

Conclusion
There is a 3rd option that combines these 2 techniques. You’d create queues sourcing from the topic based on JMS Selectors deciding what data hits what queue. You would set up this arrangement in the ActiveMQ config file.

This, as far as I am aware, covers all the major areas internal to ActiveMQ that you can use to apply some routing and duplication of messages.

These methods are useful and solve some problems, but as I pointed out they are not really that flexible. In a later part of this series I will look into software routers such as Apache Camel and how to write your own.

From a technology choices point of view, future self is now thanking past self for building the initial metrics system using MOM: rather than going back to the drawing board when our needs changed, we were able to solve our problems by virtue of the fact that we built on a flexible foundation using well known patterns, without changing much if any actual code.

This series continues in part 5.

Test Driven Infrastructure with Vagrant, Puppet and Guard

This is a repost of my SysAdvent blogpost. It's merely here for archival purposes, or for people who read my blog but didn't see the sysadvent blogpost.


Why

Lots has been written about Vagrant. It simply is a great tool: people use it as a sandbox to develop their Chef recipes or Puppet manifests in a safe environment.

The workflow usually looks like this:

  • you create a vagrant vm
  • share some puppet/chef files via a shared directory
  • edit some files locally
  • run a vagrant provision to see if this works
  • and if you are happy with it, commit it to your favorite version control repository

Specifically for puppet, thanks to the great work by Nikolay Sturm and Tim Sharpe, we can now also complement this with tests written in rspec-puppet and cucumber-puppet. You can find more info at Puppet unit testing like a pro.

So we've got code and we've got tests; what else are we missing? Automation of this process: it's funny, if you think about it, that we automate the hell out of server installations, but haven't automated the previously described process.

The need to run vagrant provision or rake rspec actually breaks my development flow: I have to leave my editor to run a shell command and then come back to it depending on the output.

Would it not be great if we could automate this whole cycle? And have it run tests and provision whenever files change?

How

The first tool I came across was autotest: it allows one to automatically re-execute tests when files change. The downside is that it can only run either cucumber tests or rspec tests, not both.

So enter Guard; it describes itself as a command line tool to easily handle events on file system modifications (FSEvent / Inotify / Polling support). Just what we wanted!

Installing Guard is pretty easy: you require the following gems in your Gemfile:

gem 'guard'
gem 'rb-inotify', :require => false
gem 'rb-fsevent', :require => false
gem 'rb-fchange', :require => false
gem 'growl', :require => false
gem 'libnotify', :require => false

As you can tell by the names, it uses different strategies to detect changes in your directories. It uses growl (if correctly set up) on Mac OS X and libnotify on Linux to notify you whether your tests pass or fail. Once installed, you get a guard command.

Guard uses a configuration file, Guardfile, which can be created by guard init. In this file you define different guards based on different helpers: for example there is guard-rspec, guard-cucumber and many more. There is even a guard-puppet (which we will not use because it only works for local provisioning).

To install one of these helpers you just include it in your Gemfile. We are using only two here:

gem 'guard-rspec'
gem 'guard-cucumber'

Each of these helpers has a similar way of configuring themselves inside a Guardfile. A vanilla guard for a ruby gem with rspec testing would look like this:

guard 'rspec' do
  watch(%r{^spec/.+_spec\.rb$})
  watch(%r{^lib/(.+)\.rb$})     { |m| "spec/lib/#{m[1]}_spec.rb" }
  watch('spec/spec_helper.rb')  { "spec" }
end

Whenever a file that matches a watch expression changes, Guard runs an rspec test. By default, if no block is supplied, the file itself is run; you can alter the path in a block, as in the example.

Once you have a Guardfile, you simply run guard (or bundle exec guard) to have it watch for changes. Simple, huh?

What

Vagrant setup

Enter our sample puppet/vagrant project. You can find the full source at http://github.com/jedi4ever/vagrant-guard-demo. It's a typical vagrant project with the following tree structure (only 3 levels shown):

├── Gemfile
├── Gemfile.lock
├── Guardfile
├── README.markdown
├── Vagrantfile
├── definitions # Veewee definitions
│   └── lucid64
│       ├── definition.rb
│       ├── postinstall.sh
│       └── preseed.cfg
├── iso # Veewee iso
│   └── ubuntu-10.04.3-server-amd64.iso
└── vendor
    └── ruby
        └── 1.8

Puppet setup

The project follows Jordan Sissel's idea of puppet nodeless configuration. To specify the classes to apply to a host, we use a server role, which we read from the file data/etc/server_tags via a custom fact (inspired by self-classifying puppet node).

This allows us to require only one file, site.pp, and we don't have to fiddle with our hostname to get the correct role. Also, if we want to test multiple roles on this one test machine, we just add another role to the data/etc/server_tags file.

├── data
│   └── etc
│       └── server_tags

$ cat data/etc/server_tags
role:webserver=true
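
For context, a minimal, hypothetical sketch of such a custom fact (the real one lives in the truth module under lib/facter) could simply expose the contents of that file, which Vagrant shares into the VM under /data:

# Hypothetical sketch of a server_tags fact; see modules/truth/lib/facter
# in the demo project for the real implementation.
require 'facter'

Facter.add(:server_tags) do
  setcode do
    tags_file = '/data/etc/server_tags'
    File.read(tags_file).chomp if File.exist?(tags_file)
  end
end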

The puppet modules and manifests can be found in puppet-repo. It has class role::webserver which includes class apache.

puppet-repo
├── features # This is where the cucumber-puppet catalog policy feature lives
│   ├── catalog_policy.feature
│   ├── steps
│   │   ├── catalog_policy.rb
│   └── support
│       ├── hooks.rb
│       └── world.rb
├── manifests
│   └── site.pp #No nodes required
└── modules
    ├── apache
    |    <module content>
    ├── role
    │   ├── manifests
    │   │   └── webserver.pp # Corresponds with the role specified
    │   └── rspec
    │       ├── classes
    │       └── spec_helper.rb
    └── truth # Logic of puppet nodeless configuration
        ├── lib
        │   ├── facter
        │   └── puppet
        └── manifests
            └── enforcer.pp

Puppet - Vagrant setup

These are the settings we use in our Vagrant file to make puppet work:

config.vm.share_folder "v-data", "/data", File.join(File.dirname(__FILE__), "data")
# Enable provisioning with Puppet stand alone.  Puppet manifests
# are contained in a directory path relative to this Vagrantfile.
config.vm.provision :puppet, :options => "--verbose"  do |puppet|
  puppet.module_path = ["puppet-repo/modules"]
  puppet.manifests_path = "puppet-repo/manifests"
  puppet.manifest_file  = "site.pp"
end

Puppet tests setup

The cucumber-puppet tests will check if the catalog compiles for the role role::webserver:

Feature: Catalog policy
  In order to ensure basic correctness
  I want all catalogs to obey my policy

  Scenario Outline: Generic policy for all server roles
    Given a node with role "<server_role>"
    When I compile its catalog
    Then compilation should succeed
    And all resource dependencies should resolve

    Examples:
      | server_role |
      | role::webserver |

The rspec-puppet tests will check if the httpd package gets installed:

require "#{File.join(File.dirname(__FILE__),'..','spec_helper')}"
describe 'role::webserver', :type => :class do
  let(:facts) {{:server_tags => 'role:webserver=true',
      :operatingsystem => 'Ubuntu'}}
  it { should include_class('apache') }
  it { should contain_package('httpd').with_ensure('present') }
end

Guard setup

To make Guard work with a setup like our puppet-repo directory we need to change some things. This has mostly to do with conventions used in development projects where Guard is normally used.

Fixing Guard-Cucumber to read from puppet-repo/features

The first problem is that the Guard-Cucumber gem by default reads its features from the features directory. This is actually hardcoded in the gem, but nothing a little monkey patching can't solve:

require 'guard/cucumber'

# Inline extending the ::Guard::Cucumber
# Because by default it only looks in the ['features'] directory
# We have it in ['puppet-repo/features']
module ::Guard
  class ExtendedCucumber < ::Guard::Cucumber
    def run_all
      passed = Runner.run(['puppet-repo/features'], options.merge(options[:run_all] || { }).merge(:message => 'Running all features'))

      if passed
        @failed_paths = []
      else
        @failed_paths = read_failed_features if @options[:keep_failed]
      end

      @last_failed = !passed

      throw :task_has_failed unless passed
    end
  end
end

# Monkey patching the Inspector class
# By default it checks if it starts with /feature/
# We tell it that whatever we pass is valid
module ::Guard
  class Cucumber
    module Inspector
      class << self
        def cucumber_folder?(path)
          return true
        end
      end
    end
  end
end

Orchestration of guard runs

The second problem was to have Guard only execute the Vagrant provision when BOTH the cucumber and rspec tests are OK. Inspired by the comments of Netzpirat, I got it working so that vagrant provision is only executed when both sets of tests pass.

# This block simply calls vagrant provision via a shell
# And shows the output
def vagrant_provision
  IO.popen("vagrant provision") do |output|
    while line = output.gets do
      puts line
    end
  end
end

# Determine if all tests (both rspec and cucumber) have passed
# This is used to only invoke vagrant_provision if all tests show green
def all_tests_pass
  cucumber_guard = ::Guard.guards({ :name => 'extendedcucumber', :group => 'tests'}).first
  cucumber_passed = cucumber_guard.instance_variable_get("@failed_paths").empty?
  rspec_guard = ::Guard.guards({ :name => 'rspec', :group => 'tests'}).first
  rspec_passed = rspec_guard.instance_variable_get("@failed_paths").empty?
  return rspec_passed && cucumber_passed
end

Guard matchers

With all the guards and logic set up, it's time to specify the correct options for our guards.

group :tests do

  # Run rspec-puppet tests
  # --format documentation : for better output
  # :spec_paths to pass the correct path to look for features
  guard :rspec, :version => 2, :cli => "--color --format documentation", :spec_paths => ["puppet-repo"]  do
    # Match any .pp file (but be careful not to include any dot-temporary files)
    watch(%r{^puppet-repo/.*/[^.]*\.pp$}) { "puppet-repo" }
    # Match any .rb file (but be careful not to include any dot-temporary files)
    watch(%r{^puppet-repo/.*/[^.]*\.rb$}) { "puppet-repo" }
    # Match any _rspec.rb file (but be careful not to include any dot-temporary files)
    watch(%r{^puppet-repo/.*/[^.]*_rspec.rb})
  end

  # Run cucumber puppet tests
  # This uses our extended cucumber guard, as by default it only looks in the features directory
  # --strict        : because otherwise cucumber would exit with 0 when there are pending steps
  # --format pretty : to get readable output, default is null output
  guard :extendedcucumber, :cli => "--require puppet-repo/features --strict --format pretty" do

    # Match any .pp file (but be careful not to include any dot-temporary files)
    watch(%r{^puppet-repo/[^.]*\.pp$}) { "puppet-repo/features" }

    # Match any .rb file (but be careful not to include any dot-temporary files)
    watch(%r{^puppet-repo/[^.]*\.rb$}) { "puppet-repo/features" }

    # Feature files are monitored as well
    watch(%r{^puppet-repo/features/[^.]*.feature})

    # This is only invoked on changes, not at initial startup
    callback(:start_end) do
      vagrant_provision if all_tests_pass
    end
    callback(:run_on_change_end) do
      vagrant_provision if all_tests_pass
    end
  end

end

The full Guardfile is on github

Run it

From within the top directory of the project type

$ guard

Now open a second terminal and change some of the files and watch the magic happen.

Final remarks

The setup described here is an idea I only recently started exploring. I'll probably enhance it in the future, and I may run into other problems.

For the demo project, I only call vagrant provision, but this can of course be extended easily. Some ideas:

  • Inspired by Oliver Hookins - How we use Vagrant as a throwaway testing environment:
      • use sahara to create a snapshot just before the provisioning
      • have it start from a clean machine when all tests pass
  • Turn this into a guard-vagrant gem, to monitor files and tests