Puppet Camp – San Francisco 2010
Making machine metadata visible
I’m quite a fan of data and metadata, and of querying them to interact with my infrastructure rather than working by hostname, and I wanted to show how far down this route I am.
This is more of an iterative, ongoing process than a fully baked idea at this point, since the concept of hostnames is so heavily embedded in our sysadmin culture. Today I can’t yet fully break away from it because tools like Nagios still rely heavily on the hostname as the index, but these are things that will improve in time.
The background is that in the old days we attempted to capture a lot of metadata in hostnames, domain names and so forth. This was kind of OK since we had static networks with relatively small numbers of hosts. Today we do ever more complex work on our servers and we have more and more servers. The advent of cloud computing has also brought with it a whole new pain of unpredictable hostnames, rapidly changing infrastructures and a much bigger emphasis on role based computing.
My metadata about my machines comes from 3 main sources:
- My Puppet manifests – classes and modules that get put on a machine
- Facter facts with the ability to add many per machine easily
- MCollective stores the metadata in a MongoDB and lets me query the network in real time
Puppet manifests based on query
When setting up machines I keep some data like database master hostnames in extlookup, but in many cases I am now moving to a search based approach to finding resources. Here’s a sample manifest that will find the master database for a customer’s development machines:
$masterdb = search_nodes("{'facts.customer': '${customer}', 'facts.environment': '${environment}', classes: 'mysql::master'}")
This is a MongoDB query against my infrastructure database. For a given node it finds the name of a node that has the class mysql::master on it; by convention there should be only one per customer in my case. When using it in a template I can get back full objects with all the metadata for a node. Hopefully with Puppet 2.6 I can get full hashes into Puppet too!
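To make the matching rules concrete, here is a minimal Python sketch of what such a query selects. It does not talk to MongoDB or Puppet at all: the node documents, their shape (fqdn, facts, classes) and the helper names are assumptions made up for illustration. It only mimics two behaviours of a MongoDB equality query: dotted keys reach into nested documents, and an array field matches when it contains the wanted value.

```python
# Hypothetical node documents; the real database layout may well differ.
NODES = [
    {"fqdn": "db1.acme.net",
     "facts": {"customer": "acme", "environment": "development"},
     "classes": ["mysql::master", "iptables"]},
    {"fqdn": "web1.acme.net",
     "facts": {"customer": "acme", "environment": "development"},
     "classes": ["apache", "iptables"]},
    {"fqdn": "db1.other.net",
     "facts": {"customer": "other", "environment": "development"},
     "classes": ["mysql::master"]},
]

QUERY = {
    "facts.customer": "acme",
    "facts.environment": "development",
    "classes": "mysql::master",
}


def get_path(doc, dotted):
    """Follow a dotted key such as 'facts.customer' into nested dicts."""
    value = doc
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def matches(doc, query):
    """Equality-only match; a list field matches if it contains the value."""
    for key, wanted in query.items():
        found = get_path(doc, key)
        if isinstance(found, list):
            if wanted not in found:
                return False
        elif found != wanted:
            return False
    return True


def search_nodes(nodes, query):
    """Return the names of all nodes whose document satisfies the query."""
    return [node["fqdn"] for node in nodes if matches(node, query)]
```

With the sample data above, only the acme database master satisfies all three conditions, which is the "one per customer" convention at work.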
Making Metadata Visible
With machines doing a lot of work, filling a lot of roles etc and with more and more machines you need to be able to tell immediately what machine you are on.
I do this in several places, first my MOTD can look something like this:
Welcome to Synchronize Your Dogmas
hosted at Hetzner, Germany
Puppet Modules:
- apache
- iptables
- mcollective member
- xen dom0 skeleton
- mw1.xxx.net virtual machine
I build this up using snippets from my concat module; each important module like apache can just put something like this in:
motd::register{"Apache Web Server": }
Because this is managed by my snippet library, the MOTD will automatically update if you just remove the include line from the manifests.
With a big block of welcome done, I now also need to be able to show in my prompts what a machine does, who it’s for and, importantly, what environment it is in.

Above is a shot of 2 prompts in different environments; you see customer name, environment and major modules. As with the MOTD, I have a prompt::register define that modules use to register into the prompt.
SSH Based on Metadata
With all this metadata in place, MCollective rolled out and everything integrated, it’s now very easy to find and access machines based on it.
MCollective does real time resource discovery, so keeping with the mysql example above from puppet:
$ mc-ssh -W "environment=development customer=acme mysql::master"
Running: ssh db1.acme.net
Last login: Thu Jul 29 00:22:58 2010 from xxxx
$
Here I am ssh’ing to a server based on a query. If it had found more than one machine matching the query, a menu would have been presented offering me a choice.
Monitoring Based on Metadata
Finally, setting up monitoring and keeping it in sync with reality can be a big challenge, especially in dynamic cloud based environments. Again I deal with this through discovery based on metadata:
$ check-mc-nrpe -W "environment=development customer=acme mysql::master" check_load
check_load: OK: 1 WARNING: 0 CRITICAL: 0 UNKNOWN: 0|total=1 ok=1 warn=0 crit=0 unknown=0 checktime=0.612054
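The summary line carries Nagios-style performance data after the pipe character, so it is easy to consume from other scripts. The Python sketch below parses those key=value pairs and folds the counts into a single exit code. The function names are made up, and the precedence rule (critical beats warning beats unknown) is my assumption for illustration, not necessarily how check-mc-nrpe itself decides.

```python
# Summary line copied from the check-mc-nrpe run shown above.
SAMPLE = ("check_load: OK: 1 WARNING: 0 CRITICAL: 0 UNKNOWN: 0"
          "|total=1 ok=1 warn=0 crit=0 unknown=0 checktime=0.612054")


def parse_perfdata(line):
    """Turn the key=value pairs after the '|' into a dict of floats."""
    _, _, perf = line.partition("|")
    counts = {}
    for pair in perf.split():
        key, _, value = pair.partition("=")
        counts[key] = float(value)
    return counts


def overall_status(counts):
    """Map per-state counts onto Nagios exit codes (0 OK, 1 WARNING,
    2 CRITICAL, 3 UNKNOWN). The precedence used here is an assumption."""
    if counts.get("crit", 0) > 0:
        return 2
    if counts.get("warn", 0) > 0:
        return 1
    if counts.get("unknown", 0) > 0:
        return 3
    return 0
```

Because the check is driven by discovery, the total changes as machines matching the query come and go, with no monitoring configuration to edit.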
Summary
This is really the tip of the iceberg. There is a lot more that I already do (like scheduling puppet runs on groups of machines based on metadata) but also a lot more to do; this really is early days down this route. I am very keen to get views from others who are struggling with shortcomings in hostname based approaches and to hear how they deal with it.
Devops Meetups and Devops Dojos
Devopsday USA 2010 and the first Silicon Valley Devops Meetup
In late June/early July this year I went to San Francisco for devopsday USA 2010, which I had the pleasure to co-organize with Damon Edwards, Patrick Debois and Andrew Shafer.
I really enjoyed the experience and am glad so many people came to attend the conference (a special shout-out to the French diaspora: Alexis, Olivier, Patrice and Jérôme). I now look forward to another chance to contribute to the next events!
I was still in the Bay Area on July 6th when the first Silicon Valley Devops Meetup was organized by Dave Nielsen in Mountain View, and I decided to attend.
Patrick and I had been in contact, working on presentations about "Agile and Operations" and "Continuous Deployment pipelines", for months before he coined the devops term and decided to create the first Devopsdays. However, we don't live close enough to one another to see each other regularly, and I don't yet know enough devops-minded people locally to start regular meetups, so it was interesting for me to see what form it would take (sadly I haven't managed to attend the popular London meetups yet).
The meetup started with a little discussion on the group name and on what the content and form should be for the following meetups.
The first Devopsdays was a two-day conference with speakers in the morning and openspaces/unconference in the afternoon. I felt it was a nice format since the morning presentations would raise interest in specific subjects and fuel the afternoon debates without restricting them. (We were more constrained by time, with only one day for devopsday USA 2010, and had plenty of speakers, so we decided to only have panels and a few lightning talks to raise interest in and awareness of other subjects.)
I guess I felt that I was passing on the torch somehow, and since some of the topics and discussions that took place during (and after) Devopsdays came back, it was an opportunity to share what was said and done back then. I think that the biggest benefit of the devops movement is that it enables people to share their experience with one another, and I believe this is one of the ways we can solve the problem I addressed in my first post.
One of the things from Ghent that I mentioned was the very nice experiment by Lindsay Holmwood when he proposed a 1-hour gang-development session on "cucumber as a script language". Not only because the subject was cool, but also because there was actually concrete code produced after this session, and I believe this is great if we can not only exchange ideas but also produce something that goes in the right direction.
Even though the devops movement is very much about people, about having the right mindsets, about breaking silos and about business alignment and change management, it is also about tools. And I think that since developers and ops (and network and security and QA) people meet together during the meetups and conferences, it is also probably the right place for new tools to emerge, tools that can efficiently and elegantly solve the daily pain points and bring people together/help them concentrate on what's really important.
This is why I was really happy to see that the meetup was then followed by a nice presentation by Alex Honor on the "devops toolchain project". I'm glad I had the opportunity to meet Alex several times during my stay in the USA, as he had also been thinking about those issues for a long time. His work on the toolchain helps point out the gaps, the same way the "missing tools?" session during Ghent's Devopsdays did, and there is a lot to do!
Devops Dojos?
Before I was involved in the devops movement, I was very much influenced by the Agile community, thanks to my friend Raphaël Pierquin (I also met Patrick thanks to him). He is the one who introduced me to the notion of "Coding Dojos".
I'm not sure who invented the Coding Dojos in the first place (it might have been Laurent Bossavit et al.), but the idea is roughly "how come you are supposed to become a Java expert after a one-week course when it takes a lifetime of regular training to become a martial arts expert?", and as a martial arts practitioner myself I find this idea sound.
Still, while I'm sure regular training on devops ideas makes sense, I'm not sure exactly how this should be done:
- Do we need to train on a specific problem, a specific tool or on a specific method?
- Maybe we could do retrospectives on a problem we've had and the solution we've implemented, to see how others would have fixed it?
- Maybe this could be an opportunity to design a tool that would solve a specific problem, or a modification on an existing tool so it would be a better fit?
If you guys have an idea about this, I'd be really interested in hearing it!
Monitoring ActiveMQ
I have a number of ActiveMQ servers: 7 in total, 3 in a network of brokers and the rest standalone. For MCollective I use topics extensively, so I don’t really need to monitor them much other than for availability. However, I also do a lot of queued work where lots of machines put data in a queue and others process the data.
In the queue scenario you absolutely need to monitor queue sizes, memory usage and such. You also need to graph things like message rates, consumer counts and memory use. I am busy writing a number of Nagios and Cacti plugins to help with this; you can find them on GitHub.
To use these you need to have the ActiveMQ Statistics Plugin enabled.
First we need to monitor queue sizes:
$ check_activemq_queue.rb --host localhost --user nagios --password passw0rd --queue exim.stats --queue-warn 1000 --queue-crit 2000
OK: ActiveMQ exim.stats has 1 messages
This will connect to localhost monitoring a queue exim.stats warning you when it’s got 1000 messages and critical at 2000.
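The threshold logic behind a check like this is tiny. The following Python sketch shows the general shape of it; it is not the plugin's actual code (the plugin is a Ruby script), the function name is made up, and treating a queue that sits exactly at a threshold as having crossed it is an assumption on my part.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_queue(queue, size, warn, crit):
    """Return (exit_code, message) for a queue depth against two thresholds.

    Assumption: a size equal to a threshold counts as crossing it.
    """
    if size >= crit:
        code = CRITICAL
    elif size >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"{LABELS[code]}: ActiveMQ {queue} has {size} messages"
```

A real plugin would print the message and exit with the code, which is all Nagios needs to raise or clear an alert.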
I still need to add the ability to monitor memory usage; this will come over the next few days.
I also have a plugin for Cacti that can output stats for the broker as a whole and also for a specific queue. First the whole broker:
$ activemq-cacti-plugin.rb --host localhost --user nagios --password passw0rd --report broker
stomp+ssl:stomp+ssl storePercentUsage:81 size:5597 ssl:ssl vm:vm://web3 dataDirectory:/var/log/activemq/activemq-data dispatchCount:169533 brokerName:web3 openwire:tcp://web3:6166 storeUsage:869933776 memoryUsage:1564 tempUsage:0 averageEnqueueTime:1623.90502285799 enqueueCount:174080 minEnqueueTime:0.0 producerCount:0 memoryPercentUsage:0 tempLimit:104857600 messagesCached:0 consumerCount:2 memoryLimit:20971520 storeLimit:1073741824 inflightCount:9 dequeueCount:169525 brokerId:ID:web3-44651-1280002111036-0:0 tempPercentUsage:0 stomp:stomp://web3:6163 maxEnqueueTime:328585.0 expiredCount:0
Now a specific queue:
$ activemq-cacti-plugin.rb --host localhost --user nagios --password passw0rd --report exim.stats
size:0 dispatchCount:168951 memoryUsage:0 averageEnqueueTime:1629.42897052992 enqueueCount:168951 minEnqueueTime:0.0 consumerCount:1 producerCount:0 memoryPercentUsage:0 destinationName:queue://exim.stats messagesCached:0 memoryLimit:20971520 inflightCount:0 dequeueCount:168951 expiredCount:0 maxEnqueueTime:328585.0
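The plugin output is a single line of space-separated key:value fields. If you want to reuse it outside Cacti, note that some values contain colons themselves (destinationName:queue://exim.stats, for instance), so each field has to be split on its first colon only. Here is a small Python sketch of such a parser; the function name is made up and the sample is a shortened copy of the queue report above.

```python
# Shortened copy of the per-queue report shown above.
SAMPLE = ("size:0 dispatchCount:168951 consumerCount:1 "
          "destinationName:queue://exim.stats enqueueCount:168951 "
          "dequeueCount:168951 maxEnqueueTime:328585.0")


def parse_stats(line):
    """Split 'key:value' fields on the first colon only.

    Values that look numeric become floats; anything else stays a string.
    """
    stats = {}
    for field in line.split():
        key, _, raw = field.partition(":")
        try:
            stats[key] = float(raw)
        except ValueError:
            stats[key] = raw
    return stats
```

From there a backlog figure such as enqueueCount minus dequeueCount is one subtraction away, which is the kind of value worth graphing next to the queue size.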
Grab the code on GitHub and follow it there; I expect a few updates in the next few weeks.
DevOps (live) at OSCON
Early reports from OSCON are that DevOps is a topic of much discussion. My fellow dev2ops.org contributor Alex Honor and I are headed to Portland this morning to give DevOps related talks at OSCON. If you are there Wednesday or Thursday, please come by and say hello!
Wednesday (7/21) 1:40pm in room Portland 251 is Alex's presentation...
Open Source Tool Chains for Cloud Computing
Thursday (7/22) 10:40am in room D135 is Damon's presentation...
The IT Philharmonic: How Out of Tune Are Your Operations?
Both talks feature lots of new content (even though the titles and outdated descriptions on the OSCON site are similar to our Velocity talks).
Puppet 2.6.0 is here! It’s alive!
The journey was long and arduous and many fell along the way but Puppet
Labs is proud to announce the 2.6.0 release!
The 2.6.0 release is a major feature release and includes a huge variety
of new features, fixes, updates and enhancements. These include the
complete cut-over from XMLRPC to the REST API, numerous language
enhancements, a complete rewrite of the events and reporting system, an
internal Ruby DSL, a single binary, Windows support, a new HTTP report
processor, and a myriad of other enhancements.
We’ve included release notes below that you can also see at:
http://projects.puppetlabs.com/projects/puppet/wiki/Release_Notes
The release is available for download at:
http://puppetlabs.com/downloads/puppet/puppet-2.6.0.tar.gz
And I am sure packagers will be hard at work in the not too distant future!
Puppet RC4 nearly almost production out
Okay Puppeteers …. we’re almost there with 2.6.0rc4. We’re hoping that this time this will really be the last RC – so please more testing!
The 2.6.0 release is a major feature release and includes a huge variety of new features, fixes, updates and enhancements. These include the complete cut-over from XMLRPC to the REST API, basic Windows support, numerous language enhancements, a complete rewrite of the events and reporting system, an internal Ruby DSL, a single binary, a new HTTP report processor, and a myriad of other enhancements.
You can read the full release notes at:
http://projects.puppetlabs.com/projects/puppet/wiki/Release_Notes
And download RC4 at:
http://puppetlabs.com/downloads/puppet/puppet-2.6.0rc4.tar.gz
Puppet Dashboard 1.0.1 released!
So you probably thought the Dashboard didn’t love you anymore … that
we’d forgotten about you and we’re very sorry for that. But we’re
trying to make up for it … starting with the Puppet Dashboard 1.0.1
release.
The 1.0.1 release is a maintenance release that fixes a lot of the
outstanding bugs and issues with the 1.0 release. We’re planning a 1.1
release in the near future that will add additional features (you can
see the Roadmap here)
http://puppetlabs.com/downloads/dashboard/puppet-dashboard-1.0.1.tgz
Changes in this release:
* Fixed exception in display of audit log messages
* Fixed deletion of nodes to remove their reports, eliminating orphans
* Fixed exception on node group pages if they had associated classes or
groups
* Fixed unwanted pagination of JSON and YAML results
* Fixed reporting of successful and failed nodes
* Added deletion of single reports
* Added labels and placeholders to form fields
* Added local copies of all JavaScript files
* Added run status chart to node list pages (all, successful, failed)
* Added searching to node, class and group index pages
* Added tooltips to node and report status indicators
* Improved README’s installation and configuration instructions
* Improved sidebar with links to classes and groups, added it to homepage
* Improved tabular display of nodes, groups and classes
* Removed empty reports.css to make packagers happy
* Removed loading of seed data by default
* Updated UI with status icons, improved typography and spacing, more
noticeable buttons
* Updated packaging information for DEB and RPM
There are new packages also available. The new packages are available via APT and Yum repositories hosted by Puppet
Labs.
Overall instructions for installing and running the Dashboard can be
found here.
1. Get DEB Packages via APT
a. Add the following to your /etc/apt/sources.list file:
deb http://apt.puppetlabs.com/ubuntu lucid main
deb-src http://apt.puppetlabs.com/ubuntu lucid main
b. Add the new Puppet Labs repository release key to APT (the package is
signed with this key also).
$ gpg --recv-key 4BD6EC30
$ gpg -a --export 4BD6EC30 | sudo apt-key add -
c. Run apt-get update
$ sudo apt-get update
d. Install Puppet Dashboard
$ sudo apt-get install puppet-dashboard
The Dashboard will be installed in /usr/share/puppet-dashboard and
you can run the server from here or create a Passenger
configuration. The updated package contains a simple init and
sysconfig set-up.
2. Get RPM packages via Yum
a. Create a Yum repo entry for Puppet Labs
$ vi /etc/yum.repos.d/puppetlabs.repo
[puppetlabs]
name=Puppet Labs Packages
baseurl=http://yum.puppetlabs.com/base/
enabled=1
gpgcheck=1
gpgkey=http://yum.puppetlabs.com/RPM-GPG-KEY-puppetlabs
b. Install via yum
$ sudo yum install puppet-dashboard
You will be prompted to install the Puppet Labs release key as part
of the installation process and the RPM packages are signed with
that key.
The Dashboard will be installed in /usr/share/puppet-dashboard and
you can run the server from here or create a Passenger configuration.
You can also find the RPM spec file here.
DevOps talk at London QCon 2010
I was invited to London QCon this year to give a talk. I chose to talk about how I’ve helped to build a startup heavily favoring the scenario where developers do support, rollouts and maintenance of their code directly in production.
My talk goes into the approaches I took while thinking about networks, boxes, operating systems, team structure, monitoring and so forth to attain these goals in a way that does not compromise the traditional goals that sysadmins have as a team and profession.
You can watch the talk – 50 minutes roughly – at the InfoQ site.
I should add that I was feeling a bit rough on the day and coming down with a cold, but I think I mostly remained more or less conscious during the talk.
