Category → monitoringsucks

Love, MonitoringLove

Last year we were pretty negative about monitoring. We shouted out that MonitoringSucked ... A year has passed and a lot has changed ... most importantly our newfound love for monitoring, thanks to an inspirational Ignite talk by Ulf Mansson at devopsdays Rome.

Right after Fosdem, about 20 people showed up at the #monitoringlove hacksessions hosted at the Inuits.eu offices to work on Open Source monitoring projects and exchange ideas. Some were completely new, others already had a lot of experience.

Amongst the projects that were worked on: Maciej was packaging Graphite for Debian, other people were fixing bugs in Puppet, and I spent some time with a Vagrant box to deploy Sensu using Puppet. The last time I played with Sensu was on the flight back from PuppetCon. I gave up the fight with RabbitMQ and SSL because I had no internet connection .. and now Ulf just pointed out that I could disable SSL altogether, which resulted in having a POC up and running in no time.

Patrick was hacking on the Chef counterpart of the vagrant-puppet Sensu setup, a part of #monigusto. Ulf Mansson was getting Dashing to display on a Raspberry Pi ... pretty cool stuff.
And Jelle Smet was working on Pyseps, a Python-based Simple Event Processing Server framework that consumes JSON docs from RabbitMQ and forwards them in real time to other queues using MongoDB query syntax.
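
To give a feel for what routing on MongoDB query syntax means, here is a minimal sketch in plain Python. It is not Pyseps code: the `matches` and `route` functions, the queue names and the sample event are all made up for illustration, only a tiny subset of the query syntax (equality, `$gt`, `$lt`, `$in`) is handled, and the RabbitMQ side is left out entirely.

```python
# Hypothetical sketch, not taken from Pyseps. Only equality, $gt, $lt and $in
# are supported; a real implementation would cover far more of the syntax.

OPERATORS = {
    "$gt": lambda value, arg: value is not None and value > arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$in": lambda value, arg: value in arg,
}


def matches(doc, query):
    """Return True if every field condition in query holds for doc."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"load": {"$gt": 4}}
            for op, arg in condition.items():
                if not OPERATORS[op](value, arg):
                    return False
        elif value != condition:
            # Plain equality, e.g. {"type": "metric"}
            return False
    return True


def route(doc, rules):
    """Return the names of all queues whose query matches doc."""
    return [queue for queue, query in rules.items() if matches(doc, query)]


# Hypothetical routing table: queue name -> query.
RULES = {
    "pager": {"severity": {"$in": ["critical", "fatal"]}},
    "graphs": {"type": "metric"},
    "overload": {"type": "metric", "load": {"$gt": 4}},
}

event = {"type": "metric", "host": "web01", "load": 7.5, "severity": "warning"}
print(route(event, RULES))  # ['graphs', 'overload']
```

In a real setup the documents would arrive from a RabbitMQ consumer and each matching queue name would become a publish call; the matching logic is the part the query syntax buys you.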

One of the more interesting discussions was around alerting: modeling business rules and combining input from a lot of different sources
in order to send the right alerts to the right people.

We explored different ideas, like using BPM tools such as Activiti or rules engines like Ruby Rools. There are some SaaS providers that try to solve this need, like PagerDuty and friends, but obviously there is still a lot of work to be done in order to create a viable alerting system based on different input sources.

The monitoring problem is not solved yet .. and it will stay around for a couple of years .. but with the advent of events such as Monitorama it's clear
that an event like our #monitoringlove hacksessions is needed .. and is probably here to stay for a couple of years.

Reliable UDP: The last major Assimilation feature before the first release

I'm still on track for a first release of the Assimilation code by the end of the year. But there is one last interesting (meaning tricky) feature to write before this release. All communication is over UDP, which means the OS doesn't guarantee packet delivery. So we need to do that ourselves. From an availability perspective, we need to acknowledge packets at the application layer anyway, so nothing much is lost. (Why that's the case is worthy of its own post.) The most interesting part of this is that our protocol needs to be resilient to replay attacks. This post explains what a replay attack is, and how we plan on eliminating such attacks.
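
For readers who have not met the term: a replay attack is when someone records a valid packet and sends it again later. A common defence is to authenticate every packet and refuse any sequence number that has already been seen. The sketch below illustrates that general idea only. It is not the Assimilation protocol; the `seal` function, the `ReplayGuard` class, the pre-shared key and the packet layout are all assumptions made up for this example.

```python
import hashlib
import hmac
import struct

KEY = b"shared-secret-for-illustration"  # hypothetical pre-shared key


def seal(seq, payload, key=KEY):
    """Prefix payload with a 64-bit sequence number and append an HMAC-SHA256 tag."""
    body = struct.pack("!Q", seq) + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


class ReplayGuard:
    """Accept each authenticated sequence number at most once, in increasing order."""

    def __init__(self, key=KEY):
        self.key = key
        self.highest_seen = 0

    def accept(self, packet):
        """Return the payload, or None if the packet is forged, stale or replayed."""
        if len(packet) < 8 + 32:
            return None  # too short to hold a sequence number and a tag
        body, tag = packet[:-32], packet[-32:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # tampered with, or signed with a different key
        (seq,) = struct.unpack("!Q", body[:8])
        if seq <= self.highest_seen:
            return None  # already seen: a replay or an old duplicate
        self.highest_seen = seq
        return body[8:]


guard = ReplayGuard()
packet = seal(1, b"heartbeat")
print(guard.accept(packet))  # b'heartbeat'
print(guard.accept(packet))  # None: the same packet sent again is rejected
```

Note that this strict "higher than anything seen so far" rule also drops packets that merely arrive out of order. A reliable protocol on top of UDP would typically pair it with acknowledgements, retransmission and a window of acceptable sequence numbers.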

Assimilation Project Licensing

When I founded the Assimilation project, I chose a license in order to have chosen a license. I always assumed I would make a final license decision before the first release. With that time coming up in the foreseeable future, it seems like time to give thought to a more permanent license decision. This blog entry outlines my thoughts on choice of licenses and related issues.

Assimilation Monitoring LinuxCon Video

I mentioned a few weeks ago that my talk at LinuxCon in San Diego had been very well received. Thanks to some good friends, we also created a video of the event, and this week I want to point you to the final cut of that video. This talk is a great introduction to the Assimilation Monitoring Project.

I see dead servers – in O(1) time

The title for this blog post comes from a T-shirt I had made for the Assimilation Project. I wore a nicer version of it at my recent talk at LinuxCon 2012. The Assimilation project has some significant and unique claims to scalability. Some of these have been discussed before. This blog article will explain the different aspects of the project and how they measure up in terms of scalability.

Injecting Nanoprobes into Servers – What’s that about?

I've recently had some people ask how nanoprobes work – are they clients, or what exactly are they? They start out like clients, and behave in some ways like peers, and maybe a bit like servers. So what the heck are they? The simplest explanation is that they are autonomous delegates of the central management authority. Read on to find out more about how this unconventional model works and why this authority model is key to unprecedented scalability and stealth discovery™ in the discovery-driven Assimilation monitoring project.

An Assimilation type schema in Neo4j

This week I want to talk about an aspect of the Assimilation database schema which is somewhat controversial, an aspect of the schema for which the jury is still out. I chose to represent the Assimilation node type hierarchy with relationships which currently serve no purpose other than to represent the types of nodes in the database. This post will talk about why I put the type hierarchy in, and why it might be a good idea, or maybe not.

Our #monitoringsucks rpm repository is available

Not only have our Rubygems builds changed, but so has my internal #monitoringsucks repository.

You might have noticed a variety of vagrant- projects on my github account:

http://github.com/KrisBuytaert/vagrant-ganglia
http://github.com/KrisBuytaert/vagrant-graphite
http://github.com/KrisBuytaert/vagrant-puppet-logstash

These are the #monitoringsucks part of them. All of those Vagrant projects are basically my test setups to play with those new tools.

They contain a bunch of puppet modules that install and configure these tools. (Note that they mostly consist
of git submodules pointing to other puppet module repositories.)

Given that I also like to have my software cleanly installed from a package, some of these tools had to be packaged, or I had to create a personal / internal repository that makes available the upstream packages that were hiding on the internet.

I've forked this repository off the internal Inuits repository so you all can also benefit from these efforts.
(You gotta love pulp :))

That means you can now install all of the above-mentioned #monitoringsucks tools from our public repo with a yumrepo resource such as:

  yumrepo { 'monitoringsucks':
    baseurl  => 'http://pulp.inuits.eu/pulp/repos/monitoring',
    descr    => 'MonitoringSuck at Inuits',
    gpgcheck => '0',
  }

Patches to both the Vagrant projects and the puppet modules are welcome ...

Discovering Switches: It’s amazing what you can learn just by listening…

We recently added code to discover switches, switch ports and settings - all in the Stealth Discovery™ way - without sending out any packets at all! So now you know which switches and which switch ports every monitored server is plugged into. As a bonus we pick up some interesting configuration information on your switch and your particular switch port - just by perking our ears up and listening... Now when you send someone to the closet to do something to your switch port, there is no doubt which port is yours - regardless of that little mistake in the cross-connects, or that tiny error in documentation. [Anyone want to write an iPad switch mapping app for this?]

Clients, Servers and Dependencies, Oh My!

One of the things that people have gotten most excited about in the discovery area of the Assimilation Monitoring Project is the discovery of clients, servers and particularly dependencies. That code is now in the Assimilation code base. We discover client processes, server processes, and their interconnections. In this post, I'll explore how this works and what it looks like in the Neo4j graph database. These dependencies are discovered without port scanning or packet sniffing - using Stealth Discovery™ methods.