Rough data density calculations

Seagate has just publicly announced 8TB HDDs in a 3.5″ form factor. I decided to do some rough calculations to understand the density a bit better…

Note: I have decided to ignore the distinction between Terabytes (TB) and Tebibytes (TiB), since I always work in base 2, but I hate the -bi naming conventions. Seagate is most likely announcing an 8TB HDD, which is actually smaller than a true 8TiB drive. If you don’t know the difference, it’s worth learning.

Rack Unit Density:

Supermicro sells a high density, double-sided 4U server, which can hold 90 x 3.5″ drives. This means you can easily store:

90 * 8TB = 720TB in 4U,

or:

720TB/4U = 180TB per U.

To store a petabyte of data, since:

1PB = 1024TB,

we need:

1024TB/180TB/U = 5.69 U.

Rounding up, we realize that we can easily store one petabyte of raw data in 6U.

Since an average rack is usually 42U (tall racks can be 48U) that means we can store between seven and eight PB per rack:

42U/rack / 6U/PB = 7PB/rack

48U/rack / 6U/PB = 8PB/rack

If you can provide the power and cooling, you can quickly see that small data centers can easily get into exabyte scale if needed. One raw exabyte would only require:

1EB = 1024PB

1024PB/7PB/rack = 146 racks =~ 150 racks.
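
If you prefer to script these back-of-the-envelope numbers, here is a rough bash sketch of the same math (the drive size, drives per server, and rack height are the assumptions from above; adjust to taste):

#!/bin/bash
# rough density math: 8TB drives, 90 drives per 4U server, 42U racks
drive_tb=8; drives_per_4u=90; rack_u=42
tb_per_4u=$(( drive_tb * drives_per_4u ))                   # 720TB in 4U
tb_per_u=$(( tb_per_4u / 4 ))                               # 180TB per U
u_per_pb=$(( (1024 + tb_per_u - 1) / tb_per_u ))            # 6U per PB, rounded up
pb_per_rack=$(( rack_u / u_per_pb ))                        # 7PB per rack
racks_per_eb=$(( (1024 + pb_per_rack - 1) / pb_per_rack ))  # ~147 racks per EB, rounded up
echo "${tb_per_u}TB/U, ${u_per_pb}U/PB, ${pb_per_rack}PB/rack, ~${racks_per_eb} racks/EB"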

Raid and Redundancy:

Since you’ll most likely have lots of failures, I would recommend having some number of RAID sets per server, and perhaps a distributed file system like GlusterFS to replicate the data across different servers. Suppose you broke each 90 drive server into five separate RAID 6 bricks for GlusterFS:

90/5 = 18 drives per brick.

In RAID 6, you lose two drives to parity, so that means:

18 drives – 2 drives = 16 drives per brick of usable storage.

16 drives * 5 bricks * 8 TB = 640 TB after RAID 6 in 4U.

640TB/4U = 160TB/U

1024TB/160TB/U = 6.4U per PB =~ 6-7PB/rack.

Since I rounded a lot, the result is similar. With a replica count of 2 in a standard GlusterFS configuration, you average a total of about 3-4PB of usable storage per rack. Need a petabyte scale filesystem? One rack should do it!
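
Here is the same kind of rough bash sketch for the usable numbers (RAID 6 bricks and a GlusterFS replica count of 2 are the assumptions from the text; the integer math keeps it intentionally approximate):

#!/bin/bash
# usable capacity after RAID 6 and replica 2; same drive/server assumptions as above
drive_tb=8; drives_per_4u=90; bricks=5; parity=2; replica=2; rack_u=42
drives_per_brick=$(( drives_per_4u / bricks ))       # 18 drives per brick
data_drives=$(( drives_per_brick - parity ))         # 16 usable drives per brick
tb_per_4u=$(( data_drives * bricks * drive_tb ))     # 640TB after RAID 6 in 4U
tb_per_u=$(( tb_per_4u / 4 ))                        # 160TB/U
usable_pb_per_rack=$(( rack_u * tb_per_u / 1024 / replica ))  # ~3PB usable per rack
echo "${tb_per_4u}TB per 4U after RAID 6, ~${usable_pb_per_rack}PB usable/rack with replica ${replica}"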

Other considerations:

  • Remember that you need to take into account space for power, cooling and networking.
  • Keep in mind that SMR might be used to increase density even further (if it’s not already being used on these drives).
  • Remember that these calculations were done to understand the order of magnitude, and not to get a precise measurement on the size of a planned cluster.
  • Petabyte scale is starting to feel small…

Conclusions:

Storage is getting very inexpensive. After the above analysis, I feel safe in concluding that:

  1. Puppet-Gluster could easily automate a petabyte scale filesystem.
  2. I have an embarrassingly small amount of personal storage.

Hope this was fun,

Happy hacking,

James

 

Disclaimer: I have not tried the 8TB Seagate HDDs, or the Supermicro 90 x 3.5″ servers, but if you are building a petabyte scale cluster with GlusterFS/Puppet-Gluster, I’d like to hear about it!

 


Hybrid management of FreeIPA types with Puppet

(Note: this hybrid management technique is being demonstrated in the puppet-ipa module for FreeIPA, but the idea could be used for other modules and scenarios too. See below for some use cases…)

The error message that puppet hackers are probably most familiar with is:

Error: Duplicate declaration: Thing[/foo/bar] is already declared in file /tmp/baz.pp:2; 
cannot redeclare at /tmp/baz.pp:4 on node computer.example.com

Typically this means that there is either a bug in your code, or someone has defined something more than once. As annoying as this might be, a compile error happens for a reason: puppet detected a problem, and it is giving you a chance to fix it, without first running code that could otherwise leave your machine in an undefined state.

The fundamental problem

The fundamental problem is that two or more contradictory declarative definitions might not be able to be properly resolved. For example, assume the following code:

package { 'awesome':
    ensure => present,
}

package { 'awesome':
    ensure => absent,
}

Since the above are contradictory, they can’t be reconciled, and a compile error occurs. If they were identical, or if they would produce the same effect, then it wouldn’t be an issue; however, this is not directly allowed due to a flaw in the design of puppet core. (There is an ensure_resource workaround, to be used very cautiously!)

FreeIPA types

The puppet-ipa module exposes a bunch of different types that map to FreeIPA objects. The most common are users, hosts, and services. If you run a dedicated puppet shop, then puppet can be your interface to manage FreeIPA, and life will go on as usual. The caveat is that FreeIPA provides a stunning web-ui, and a powerful cli, and it would be a shame to ignore both of these.

The FreeIPA webui is gorgeous. It even gets better in the new 4.0 release.


Hybrid management

As the title divulges, my puppet-ipa module actually allows hybrid management of the FreeIPA types. This means that puppet can be used in conjunction with the web-ui and the cli to create/modify/delete FreeIPA types. This took a lot of extra thought and engineering to make possible, but I think it was worth the work. This feature is optional, but if you do want to use it, you’ll need to let puppet know of your intentions. Here’s how…

Type excludes

In order to tell puppet to leave certain types alone, the main ipa::server class has type_excludes. Here is an excerpt from that code:

# special
# NOTE: host_excludes is matched with bash regexp matching in: [[ =~ ]]
# if the string regexp passed contains quotes, string matching is done:
# $string='"hostname.example.com"' vs: $regexp='hostname.example.com' !
# obviously, each pattern in the array is tried, and any match will do.
# invalid expressions might cause breakage! use this at your own risk!!
# remember that you are matching against the fqdn's, which have dots...
# a value of true, will automatically add the * character to match all.
$host_excludes = [],       # never purge these host excludes...
$service_excludes = [],    # never purge these service excludes...
$user_excludes = [],       # never purge these user excludes...

Each of these excludes lets you specify a pattern (or an array of patterns) which will be matched against each defined type, and which, if matched, will ensure that your type is not removed if the puppet definition for it is undefined.

Currently these type_excludes support pattern matching in bash regexp syntax. If there is a strong demand for regexp matching in either python or ruby syntax, then I will add it. In addition, other types of exclusions could be added. If you’d like to exclude based on some type’s value, creation time, or some other property, these could be investigated. The important thing is to understand your use case, so that I know what is both useful and necessary.

Here is an example of some host_excludes:

class { '::ipa::server':
    host_excludes => [
        "'foo-42.example.com'",                  # exact string match
        '"foo-bar.example.com"',                 # exact string match
        "^[a-z0-9-]*\\-foo\\.example\\.com$",    # *-foo.example.com or:
        "^[[:alpha:]]{1}[[:alnum:]-]*\\-foo\\.example\\.com$",
        "^foo\\-[0-9]{1,}\\.example\\.com"       # foo-<\d>.example.com
    ],
}

This example and others are listed in the examples/ folder.
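
If you want a feel for how the bash matching described above behaves, here is a rough standalone sketch you can play with in a shell. It mimics the quoted-string versus regexp distinction from the comments; it is not the module’s actual implementation:

#!/bin/bash
# sketch: quoted patterns are compared as plain strings,
# unquoted patterns are treated as bash regular expressions ([[ =~ ]])
matches_exclude() {
    local fqdn="$1" pattern stripped
    shift
    for pattern in "$@"; do
        if [[ "${pattern}" == \"*\" ]] || [[ "${pattern}" == \'*\' ]]; then
            stripped="${pattern#?}"; stripped="${stripped%?}"    # strip the quotes
            [[ "${fqdn}" == "${stripped}" ]] && return 0         # exact string match
        else
            [[ "${fqdn}" =~ ${pattern} ]] && return 0            # bash regexp match
        fi
    done
    return 1
}

excludes=("'foo-42.example.com'" '^foo\-[0-9]{1,}\.example\.com')
matches_exclude 'foo-13.example.com' "${excludes[@]}" && echo 'excluded (regexp match)'
matches_exclude 'bar.example.com' "${excludes[@]}" || echo 'not excluded, safe to purge'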

Type modification

Each type in puppet has a $modify parameter. The significance of this is quite simple: if this value is set to false, then puppet will not be able to modify the type. (It will still be able to remove the type if it becomes undefined; preventing that removal is what the type_excludes mentioned above are for.)

This $modify parameter is particularly useful if you’d like to define your types with puppet, but allow them to be modified afterwards by either the web-ui or the cli. If you change a user’s phone number, and this parameter is false, then it will not be reverted by puppet. The usefulness of this field is that it allows you to define the type, so that if it is removed manually in the FreeIPA directory, then puppet will notice its absence, and re-create it with the defaults you originally defined.

Here is an example user definition that is using $modify:

ipa::server::user { 'arthur@EXAMPLE.COM':
    first => 'Arthur',
    last => 'Guyton',
    jobtitle => 'Physiologist',
    orgunit => 'Research',
    #modify => true, # optional, since true is the default
}

By default, in true puppet style, the $modify parameter defaults to true. One thing to keep in mind: if you decide to update the puppet definition, then the type will get updated, which could potentially overwrite any manual change you made.

Type watching

Type watching is the strict form of type modification. As with type modification, each type has a $watch parameter. This also defaults to true. When this parameter is true, each puppet run will compare the parameters defined in puppet with what is set on the FreeIPA server. If they are different, then puppet will run a modify command so that harmony is reconciled. This is particularly useful for ensuring that the policy that you’ve defined for certain types in puppet definitions is respected.

Here’s an example:

ipa::server::host { 'nfs':    # NOTE: adding .${domain} is a good idea....
    domain => 'example.com',
    macaddress => "00:11:22:33:44:55",
    random => true,        # set a one time password randomly
    locality => 'Montreal, Canada',
    location => 'Room 641A',
    platform => 'Supermicro',
    osstring => 'RHEL 6.6 x86_64',
    comment => 'Simple NFSv4 Server',
    watch => true,    # read and understand the docs well
}

If someone were to change one of these parameters, puppet would revert it. This detection happens through an elaborate difference engine. This was mentioned briefly in an earlier article, and is probably worth looking at if you’re interested in python and function decorators.

Keep in mind that it logically follows that you must be able to $modify to be able to $watch. If you forget and make this mistake, puppet-ipa will report the error. You can, however, have different values of $modify and $watch per individual type.

Use cases

With this hybrid management feature, a bunch of new use cases are now possible! Here are a few ideas:

  • Manage users, hosts, and services that your infrastructure requires, with puppet, but manage non-critical types manually.
  • Manage FreeIPA servers with puppet, but let HR manage user entries with the web-ui.
  • Manage new additions with puppet, but exclude historical entries from management while gradually migrating this data into puppet/hiera as time permits.
  • Use the cli without fear that puppet will revert your work.
  • Use puppet to ensure that certain types are present, but manage their data manually.
  • Exclude your development subdomain or namespace from puppet management.
  • Assert policy over a select set of types, but manage everything else by web-ui and cli.

Testing with Vagrant

You might want to test this all out. It’s all pretty automatic if you’ve followed along with my earlier vagrant work and my puppet-gluster work. You don’t have to use vagrant, but it’s all integrated for you in case that saves you time! The short summary is:

$ git clone --recursive https://github.com/purpleidea/puppet-ipa
$ cd puppet-ipa/vagrant/
$ vs
$ # edit puppet-ipa.yaml (although it's not necessary)
$ # edit puppet/manifests/site.pp (optionally, to add any types)
$ vup ipa1 # it can take a while to download freeipa rpm's
$ vp ipa1 # let the keepalived vip settle
$ vp ipa1 # once settled, ipa-server-install should run
$ vfwd ipa1 80:80 443:443 # if you didn't port forward before...
# echo '127.0.0.1   ipa1.example.com ipa1' >> /etc/hosts
$ firefox https://ipa1.example.com/ # accept self-sign https cert

Conclusion

Sorry that I didn’t write this article sooner. This feature has been baked in for a while now, but I simply forgot to blog about it! Since puppet-ipa is getting quite mature, it might be time for me to create some more formal documentation. Until then,

Happy hacking,

James

 


3 Reasons Why Your Team Needs Rituals

It’s the same every morning: you get up and grab your morning coffee. No matter whether you brew it at home or fetch it on the road, your morning coffee is a ritual you never want to miss.

A ritual is a practice everyone knows how to do. It’s conducted regularly or on well defined occasions. Rituals help to create an identity for a group of people: nations, sports clubs or teams. How can rituals help form a high performing team?

Rituals Act as Social Glue

Rote repetition of team tasks creates a feeling of togetherness. Often you see teams invent their own, sometimes secret, rituals to more closely bind the group. Executing well known procedures together synchronizes people and strengthens the common ground on which to build trust. If you’re building a new team, or have issues with mistrust, introduce a few rituals like daily stand-ups or weekly planning meetings. Social rituals like team celebrations help as well.

Rituals Pace Your Work

They create a rhythm which structures your work week. You get used to having a daily stand-up every morning at 10 a.m. You know exactly that the next demo will be on Thursday afternoon – no surprises here. Adapting to changing schedules depletes energy, but rituals become second nature to everyone and create an environment where work can flow without friction. There’s no need to constantly watch out for last minute changes in schedules.

Rituals Help Keep Your Discipline

If you haven’t built rituals out of important meetings like retrospectives, it’s very likely you’ll start to skip them. There are always reasons not to do what you should, but if you make a ritual out of the important things, chances are better that you’ll do what’s needed. Following rituals is deeply embedded within humans. By using rituals, you tap into that deep human power which forms the basis of our communities.

But Our Work Environment Changes Too Frequently to Build on Rituals

I hear that a lot. In some work environments you hardly know what you’ll be doing in a few hours. How can you plan for a regular “ritual”? In such cases, it helps to introduce them based on events instead of time. A simple example is celebrating the birthday of a colleague. Make sure every birthday follows a certain ritual: bring everyone together for five minutes, blow out a candle and give a gift.

Another ritual which helps structure your day is the daily stand-up. Come together every day at a fixed time. For a few minutes, talk about what you did yesterday, what you’re doing today and whether you’re having problems. This ritual helps synchronize the team and creates a shared understanding of what’s happening.

One of the Biggest Mistakes is Skipping a Ritual

If you’ve agreed on a certain ritual, stick to it no matter what. People come to expect that the ritual will happen, and they’ll be disappointed if their expectations aren’t met. That’s the power of rituals: people rely on them.

Rituals act as social glue, help pace your work, and foster discipline. Like your regular morning coffee, they give your team structure and stability in an ever changing environment.

Rundeck and Automating Operations at Salesforce (Videos)

A few interesting videos have been posted over on Rundeck.org talking about Salesforce’s internal automation project, codenamed Gigantor. I’ve embedded the videos below.

It’s a great example of using a toolchain philosophy to quickly build effective solutions at scale:

  • Rundeck is the workflow engine and system of record
  • SaltStack is the distributed execution engine (Salesforce’s Kim Ho wrote the SaltStack plugin for Rundeck)
  • Kingpin is a custom front-end that builds Salesforce-specific concepts and additional security constraints into the user experience

 

Kim Ho explains Gigantor’s architecture and gives a demo of the SaltStack Plugin for Rundeck:

 

Alan Caudill presents an overview of how Gigantor works and some of the design choices:

 

 


One minute hacks: the nautilus scripts folder

Master SDN hacker Flavio sent me some tunes. They were sitting on my desktop in a folder:

$ ls ~/Desktop/
uncopyrighted_tunes_from_flavio/

I wanted to listen to them while hacking, but what was the easiest way…? I wanted to use the nautilus file browser to select which folder to play, and the totem music/video player to do the playing.

Drop a file named totem into:

~/.local/share/nautilus/scripts/

with the contents:

#!/bin/bash
# o hai from purpleidea
exec totem -- "$@"

and make it executable with:

$ chmod u+x ~/.local/share/nautilus/scripts/totem

Now right-click on that music folder in nautilus, and you should see a Scripts menu. In it there will be a totem menu item. Clicking on it should load up all the contents in totem and you’ll be rocking out in no time. You can also run scripts with a selection of various files.

Here’s a screenshot:

nautilus is pretty smart and even lets you know that this folder is special

I wrote this to demonstrate a cute nautilus hack. Hopefully you’ll use this idea to extend this feature for something even more useful.

Happy hacking,

James

 


Common Objections to DevOps from Enterprise Operations

I’ve been in many large enterprise companies helping them learn about devops and understand how to improve their service delivery capability. These companies have heard about devops and are looking for help creating a strategy to adopt devops principles, because they need better time to market and higher quality. Not everyone in the company believes in devops, for different reasons. To some, devops sounds like a free-for-all where devs make production changes. To others, devops sounds like a bunch of nice-sounding high ideals, or something that can’t be adopted because the necessary automation tooling does not exist for their domain.

In the enterprise, the operations group is often centralized and supports many different application groups. When it comes to site availability, the buck stops with ops. If there is a performance problem, outage or issue, the ops team is the first line of defense, sometimes escalating issues back to the application team for bug fixes or for help diagnosing a problem.

Enterprises interested in devops are also usually practicing or adopting agile methodology, in which case demands on ops happen more often, during sprints (e.g., to set up a test environment) or after a sprint when ops needs to release software to the production site. The quickened pace puts a lot more pressure on the centralized ops team because they often get the work late in the project cycle (i.e., when it’s time to release to production). Because of time pressure, or because they are overworked, operations teams have difficulty turning requested work around and begin to hear that developers want to do things for themselves. Those users might want to rebuild servers, get shell access, install software, run commands and scripts, provision VMs, modify network ACLs, update load balancers, etc. These users essentially want to do things for themselves and might feel like the centralized ops team needs to get out of their way.

How does the ops team, historically the one responsible for uptime in the production environment, permit or expand access to environments they support? How can they avoid being the bottleneck at the tail end of every application team’s project cycle? How does the business remove the friction but not invite chaos, outages and lack of compliance?

If you’re in this kind of enterprise environment, how do you start approaching devops? If you are a centralized operations team facing the pressure to adopt devops, here are some questions and concerns for the organization to ask or think about. The answer to these questions are important steps to forming your devops strategy.

How does a centralized group handle the work that needs to be done to make applications run in production or across other environments?

Some enterprises begin by creating a specialized team called “devops” whose purpose is to solve “devops problems”. Generally, this means making things more operations-friendly. This kind of team might also be the group that takes the handoff from application development teams and wraps their software in automation tooling, deploys it, and hands it off to the Site Reliability team. Unfortunately, a centralized devops team can become a silo and suffer from the same “late in the cycle” handoff challenges the traditional ops group sees. Also, there are always more developers and development projects than there are devops engineers and devops team bandwidth. A centralized devops team can end up facing the same pressures as a traditional QA department does when it tries “adding quality testing” as a separate process stage.

To make sure an application operates well in production and across other environments, the devops concerns must be baked into the application architecture. This means the work to make applications easy to configure, deploy and monitor is done inside the development stage. The centralized operations group must then learn to develop a shared software delivery process and tool chain. It’s inside the delivery tool chain where the work gets distributed across teams. The centralized ops group can support the tool chain as architects and service providers, giving the application development teams a framework and scaffolding to populate with the artifacts needed to drive their pipeline.

What about our compliance policies?

Most enterprises abide by a change policy that dictates who can make production changes. Many times this policy is interpreted to mean anybody outside of ops is not allowed to push changes. Software must be handed off to an ops person to push the change. This handoff can introduce extra lead time and possibly errors due to lack of information.

These compliance rules are defined by the business, and many times people on the delivery end have never actually read the language of these policies and base their process on assumptions or beliefs formed by tribal knowledge. Over time, tools and processes can morph in arcane ways, twisting into inefficient bureaucracy.

It’s common to find that different compliance rules apply depending on the application or customer type. When thinking about how to reduce delivery cycle time, these differences should be taken into account, because there might be alternative ways of deciding who can make a change and how it gets made.

Besides understanding the compliance rules, it should also be simple and fast to audit your compliance.

This means make it easy to find out:

  • who made the change and were they authorized
  • where the change was applied
  • what change was made and is it acceptable

This kind of query should be instantly accessible and not something done through manual evidence gathering long after the fact (e.g., when something went wrong). Knowing how change was made to an environment should be as visible as seeing a report that shows how busy your servers were in the last 24 hours.

These audit views should contain infrastructure and artifact information because both development and operations people want to know about their environments in software and server terms. A change ticket with a bunch of verbiage and bug links does not paint a complete enough picture.
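
As a purely hypothetical illustration (the file name and record format are made up), if every change were logged as a JSON line, answering those three questions becomes a one-liner instead of an evidence-gathering exercise:

# hypothetical changes.log, one JSON object per change, e.g.:
# {"time":"2014-08-28T14:02:11Z","user":"alice","authorized":true,"host":"web07","change":"deployed app v1.4.2"}

# who changed web07, were they authorized, and what did they change?
jq -r 'select(.host == "web07") | [.time, .user, (.authorized|tostring), .change] | @tsv' changes.log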

How do you open access but not lose controls?

After walking through a software delivery process, it’s easy to see that the flow of work slows anytime the work must be done by a single team that is already past its capacity and is losing effectiveness due to context switching between competing priorities. This is the situation an ops team often finds itself in. Ops teams balance work that comes from application development teams (e.g., participating in agile dev sprints), network operations (e.g., handling outages and production issues), business users (e.g., gathering info for compliance, asset info for finance) and, finally, their own project work to maintain or improve infrastructure.

To free this process bottleneck, the organization must figure out how the work can be redistributed or satisfied by some self-service function. Since deployment, configuration and monitoring are ops concerns that should be designed into the application, distribute this development to the developers. This can really be a collaboration where ops maintains a base set of automation modules and gives developers ways to extend them. Create a development environment and tooling that lets developers integrate their changes into this ops framework in their own project sandboxes.

Provide developer access to create hosted environments easily through a self-service interface that spins up the VMs or containers and lets them test the ops management code.

Build the compliance auditing logs into the ops management framework so you can track what resources are being created and used. This is important if resource conflicts occur, and it lets you learn where more sandboxing is needed or where more fine-grained configuration should be defined.

Moving faster leads to less quality, right?

To the business, moving fast is critical to staying competitive by increasing their velocity of innovation. This need to quicken the software delivery pace is almost always the chief motivation to adopt devops practices.

Devops success stories often begin with how many times deployments are done a day. Ten deploys a day, 1000 deploys a day. To an enterprise these metrics can sound mythical. Some enterprises struggle to make one deploy a month, and I have seen some making major releases on an annual basis, with the rollout of a release to their customers taking over 30 days. That’s thirty days of lag time, and it puts the production environment in an inconsistent state, making it hard for everyone to cope with production issues. “Is it the new version or the old version causing this yet unidentified issue?” A primary reason operations is reluctant to move faster is the problems that occur during or after a change has been made.

When change leads to problems these are typical outcomes:

  • More control process is added (more approval gates, shorter change windows)
  • Change batches get bigger (cram more work into the given change window)
  • Increase in “emergency fixes” (high priority features get fast tracked to avoid the normal change process)
  • High pressure to make application changes quickly results in patching systems directly rather than going through the normal software release cycle.

Given these outcomes, the idea of moving faster seems crazy, because obviously it will lead to breaking more stuff more often.

The question is: how do organizations learn to be good at making change to their systems? First, it is helpful to think about what kinds of safety practices are needed to make change safely. Moving fast means being able to safely change things fast. Here are some general strategies to consider:

Small batches

Large batches of change require more people on hand due to the volume of work, and the work can take longer to get done. The solution is to push less change through, so it’s easier to get done and there is less to check and verify when the change is completed.

Rehearsal

Here’s a good mantra: “Don’t practice until you get it right. Practice until you can’t get it wrong.” Don’t let the production change be the first time you have tried it this way. Your change should have been verified multiple times in non-production environments before you try it in production. Don’t rely on luck. Expect failure.

Verifiable process stages

Whether it is a site build-out or an update to an existing application, be sure you have well-defined checks for your preconditions. This means if you are deploying an application, you have a scripted test that confirms your external or environment dependencies before you do the deployment. If you are building a site, be sure you have confirmed the hardware and network environment before you install the operating platform. Building this kind of automated testing at process stage boundaries adds a great deal of safety by not letting problems slip downstream. You can use these verification checks to decide to “stop the line”.
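
As a trivial bash sketch of such a precondition check (the dependency list is a placeholder), the point is simply that the pipeline stops before the deploy when a dependency is missing:

#!/bin/bash
# pre-deploy gate: verify environment dependencies, "stop the line" on failure
deps=("db.example.com:5432" "cache.example.com:6379")    # placeholder dependencies

for dep in "${deps[@]}"; do
    host="${dep%%:*}"; port="${dep##*:}"
    if ! timeout 5 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "precondition failed: cannot reach ${host}:${port}" >&2
        exit 1    # stop the line; do not proceed to the deployment
    fi
done
echo "all preconditions met, proceeding with deployment"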

Process discipline

What leads to places full of snowflake environments, each full of idiosyncratic, specially customized servers and networks? Lack of discipline. If the organization does not manage change consistently together, everyone ends up doing things their own way. How do you know you have process discipline? Look for how much variation you see. If process differs between environments, that is a variation. Snowflake servers are the symptoms of process variation. Process variation means you don’t have the process under control. There are two simple metrics to understand how much control you have over your process: lead time and scrap rate. Lead time is how long it takes you to make the change. Scrap rate is how often the change must be reworked to make it right. Rehearsal and verifiable process stages will help you bring the process under control by reducing scrap rate and stabilizing lead time. The biggest benefit of process discipline is improving your ability to deliver change predictably. The business depends on predictability. With predictability the business can gauge how fast or slow it can move.

More access into ops managed environments?

The better everyone understands how things perform in production, the better the organization can design its systems to support operations. Making it hard for developers or testers to see how the service is running only delays improvements that benefit the customer and reduce pressure on operations. It should be easy for anyone to know which versions of applications are deployed on which hosts, the host configuration, and the performance of the application.

Sometimes data privacy rules make accessing data less straightforward. Some logs contain customer data and regulations might restrict access to only limited users. Instead of saying no or making the data collection and scrubbing process manual, make this data available as an automated self service so developers or auditors can get it for themselves.

Visibility into the production environment is crucial for developers to make their environments production-like. Modeling the development and test environment so that it resembles production is another example of reducing variability and bringing process under control.

Does this mean shell access for devs?

This question is sometimes the worst one for a traditional enterprise ops team. Often, the question is a symptom of another problem. Why does a dev want shell access to an environment operations is supporting? In a development or early test environment, shell access might be needed to experiment with developing deployment and configuration code. This is a valid reason for shell access.

Is this request for shell access in a staging or production environment? Requests for shell access there could be a sign of ad hoc change methods that undermine the stability of an environment. It’s important that change methods are encapsulated in the automation.

Fundamentally, shell access to live operational environments is a question about risk and trust.


The list doesn’t stop here, but these are the most common questions and concerns I hear. Feel free to share your experiences in the comments below.


Securely managing secrets for FreeIPA with Puppet

Configuration management is an essential part of securing your infrastructure, because it can make sure that it is set up correctly. It is essential that configuration management only enhance security, and not weaken it. Unfortunately, the status quo of secret management in puppet is pretty poor.

In the worst (and most common) case, plain text passwords are found in manifests. If the module author tried harder, sometimes these password strings are pre-hashed (and sometimes salted) and fed directly into the consumer. (This isn’t always possible without modifying the software you’re managing.)

On better days, these strings are kept separate from the code in unencrypted yaml files, and if the admin is smart enough to store their configurations in git, they hopefully separated out the secrets into a separate repository. Of course none of these solutions are very convincing to someone who puts security at the forefront.

This article describes how I use puppet to correctly and securely set up FreeIPA.

Background:

FreeIPA is an excellent piece of software that combines LDAP and Kerberos with an elegant web ui and command line interface. It can also glue in additional features like NTP. It is essential for any infrastructure that wants single sign on, and unified identity management and security. It is a key piece of infrastructure since you can use it as a cornerstone, and build out your infrastructures from that centrepiece. (I hope to make the puppet-ipa module at least half as good as what the authors have done with FreeIPA core.)

Mechanism:

Passing a secret into the FreeIPA server for installation is simply not possible without it touching puppet. The way I work around this limitation is by generating the dm_password on the FreeIPA server at install time! This typically looks like:

/usr/sbin/ipa-server-install --hostname='ipa.example.com' --domain='example.com' --realm='EXAMPLE.COM' --ds-password=`/usr/bin/pwgen 16 1 | /usr/bin/tee >( /usr/bin/gpg --homedir '/var/lib/puppet/tmp/ipa/gpg/' --encrypt --trust-model always --recipient '24090D66' > '/var/lib/puppet/tmp/ipa/gpg/dm_password.gpg' ) | /bin/cat | /bin/cat` --admin-password=`/usr/bin/pwgen 16 1 | /usr/bin/tee >( /usr/bin/gpg --homedir '/var/lib/puppet/tmp/ipa/gpg/' --encrypt --trust-model always --recipient '24090D66' > '/var/lib/puppet/tmp/ipa/gpg/admin_password.gpg' ) | /bin/cat | /bin/cat` --idstart=16777216 --no-ntp --selfsign --unattended

This command is approximately what puppet generates. The interesting part is:

--ds-password=`/usr/bin/pwgen 16 1 | /usr/bin/tee >( /usr/bin/gpg --homedir '/var/lib/puppet/tmp/ipa/gpg/' --encrypt --trust-model always --recipient '24090D66' > '/var/lib/puppet/tmp/ipa/gpg/dm_password.gpg' ) | /bin/cat | /bin/cat`

If this is hard to follow, here is the synopsis:

  1. The pwgen command is used to generate a password.
  2. The password is used for installation.
  3. The password is encrypted with the user’s GPG key and saved to a file for retrieval.
  4. The encrypted password is (optionally) sent out via email to the admin.

Note that the email portion wasn’t shown since it makes the command longer.
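
To make the pattern easier to read in isolation, here is a stripped-down sketch of the same idea (the homedir, key id and password length are just the example values from the command above, and it assumes the recipient’s public key has already been imported into that GPG homedir):

#!/bin/bash
# generate a password, hand it to the caller on stdout, and keep only an
# encrypted copy on disk; mirrors the --ds-password snippet shown above
gpgdir='/var/lib/puppet/tmp/ipa/gpg'     # example homedir from above
keyid='24090D66'                         # example recipient key id

password=$(/usr/bin/pwgen 16 1 | /usr/bin/tee \
    >( /usr/bin/gpg --homedir "${gpgdir}" --encrypt --trust-model always \
       --recipient "${keyid}" > "${gpgdir}/dm_password.gpg" ) | /bin/cat)

# ${password} is what gets passed to ipa-server-install via --ds-password;
# the only copy that persists is the encrypted ${gpgdir}/dm_password.gpg
echo "encrypted copy saved to: ${gpgdir}/dm_password.gpg"

The admin can later recover the password on their own machine with gpg --decrypt on that file.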

Where did my GPG key come from?

Any respectable FreeIPA admin should already have their own GPG key. If they don’t, they probably shouldn’t be managing a security appliance. You can either pass the public key to gpg_publickey or specify a keyserver with gpg_keyserver. In either case you must supply a valid recipient (-r) string to gpg_recipient. In my case, I use my keyid of 24090D66, which can be used to find my key on the public keyservers. Either way, puppet knows how to import it and use it correctly. A security audit is welcome!

You’ll be pleased to know that I deliberately included the options to use your own keyserver, or to specify your public key manually if you don’t want it stored on any key servers.

But, I want a different password!

It’s recommended that you use the secure password that has been generated for you. There are a few options if you don’t like this approach:

  • The puppet module allows you to specify the password as a string. This isn’t recommended, but it is useful for testing and compatibility with legacy puppet environments that don’t care about security.
  • You can use the secure password initially to authenticate with your FreeIPA server, and then change the password to the one you desire. Doing this is outside the scope of this article, and you should consult the FreeIPA documentation.
  • You can use puppet to regenerate a new password for you. This hasn’t been implemented yet, but will be coming eventually.
  • You can use the interactive password helper. This takes the place of the pwgen command. This will be implemented if there is enough demand. During installation, the admin will be able to connect to a secure console to specify the password.

Other suggestions will be considered.

What about the admin password?

The admin_password is generated following the same process that was used for the dm_password. The chance that the two passwords match is probably about:

1/(((26*2)+10)^16) = ~2.1e-29

In other words, very unlikely.
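
If you want to sanity-check that estimate, here is a quick one-liner, assuming a uniform choice from the 62 characters [a-zA-Z0-9] (which pwgen only approximates):

$ python3 -c 'print(1 / (26*2 + 10)**16)'
# prints roughly 2.1e-29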

Testing this easily:

Testing this out is quite straightforward. This process has been integrated with vagrant for easy testing. Start by setting up vagrant if you haven’t already:

Vagrant on Fedora with libvirt (reprise)

Once you are comfortable with vagrant, follow these steps for using Puppet-IPA:

git clone --recursive https://github.com/purpleidea/puppet-ipa
cd vagrant/
vagrant status
# edit the puppet-ipa.yaml file to add your keyid in the recipient field
# if you do not add a keyid, then a password of 'password' will be used
# this default is only used in the vagrant development environment
vagrant up puppet
vagrant up ipa

You should now have a working FreeIPA server. Login as root with:

vscreen root@ipa

yay!

Hope you enjoyed this.

Happy hacking,

James

 


Jenkins, Puppet, Graphite, Logstash and YOU

This is a repost of an article I wrote for the Acquia Blog some time ago.

As mentioned before, devops can be summarized by talking about culture, automation, measurement and sharing (CAMS). Although devops is not about tooling, there are a number of open source tools out there that will be able to help you achieve your goals. Some of those tools will also enable better communication between your development and operations teams.

When we talk about Continuous Integration and Continuous Deployment we need a number of tools to help us there. We need to be able to build reproducible artifacts which we can test. And we need a reproducible infrastructure which we can manage in a fast and sane way. To do that we need a Continuous Integration framework like Jenkins.

Formerly known as Hudson, Jenkins has been around for a while. The open source project was initially very popular in the Java community but has now gained popularity in different environments. Jenkins allows you to create reproducible Build and Test scenarios and perform reporting on those. It will provide you with a uniform and managed way to Build, Test, Release and Trigger the deployment of new Artifacts, both traditional software and infrastructure-as-code-based projects. Jenkins has a vibrant community that builds new plugins for the tool in different kinds of languages. People use it to build their deployment pipelines, automatically check out new versions of the source code, syntax test it and style test it. If needed, users can compile the software, trigger unit tests, and upload a tested artifact into a repository so it is ready to be deployed to a new platform level.

Jenkins then can trigger an automated way to deploy the tested software on its new target platform. Whether that be development, testing, user acceptance or production is just a parameter. Deployment should not be something we try first in production, it should be done the same on all platforms. The deltas between these platforms should be managed using a configuration management tool such as Puppet, Chef or friends.

In a way this means that Infrastructure as code is a testing dependency, as you also want to be able to deploy a platform to exactly the same state as it was before you ran your tests, so that you can compare the test results of your test runs and make sure they are correct. This means you need to be able to control the starting point of your test and tools like Puppet and Chef can help you here. Which tool you use is the least important part of the discussion, as the important part is that you adopt one of the tools and start treating your infrastructure the same way as you treat your code base: as a tested, stable, reproducible piece of software that you can deploy over and over in a predictable fashion.

Configuration management tools such as Puppet, Chef, CFengine are just a part of the ecosystem and integration with Orchestration and monitoring tools is needed as you want feedback on how your platform is behaving after the changes have been introduced. Lots of people measure the impact of a new deploy, and then we obviously move to the M part of CAMS.

Here, Graphite is one of the most popular tools to store metrics. Plenty of other tools in the same area have tried to go where Graphite is going, but when it comes to flexibility, scalability and ease of use, not many tools allow developers and operations people to build dashboards for any metric they can think of in a matter of seconds.

Just sending a keyword, a timestamp and a value to the Graphite platform provides you with a large choice of actions that can be done with that metric. You can graph it, transform it, or even set an alert on it. Graphite takes away the complexity of similar tools and pairs it with an easy-to-use API, so developers can integrate their own self-service metrics into dashboards to be used by everyone.
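
For example, pushing a metric is as simple as writing one line to Graphite’s plaintext listener (the hostname and metric name are placeholders; 2003 is Carbon’s usual plaintext port):

# "metric.path value timestamp", one metric per line
echo "webshop.checkout.duration_ms 142 $(date +%s)" | nc graphite.example.com 2003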

One last tool that deserves our attention is Logstash. Initially just a tool to aggregate, index and search the log files of our platform, which are often a hugely overlooked source of relevant information about how our applications behave, Logstash and its Kibana+ElasticSearch ecosystem are now quickly evolving into a real-time analytics platform, implementing the Collect, Ship+Transform, Store and Display pattern we see emerging a lot in the #monitoringlove community. Logstash now allows us to turn boring old logfiles, which people only searched after a failure, into valuable information that product owners and business managers use to learn about the behavior of their users.

Together with the Graphite-based dashboards we mentioned above, these tools help people start sharing their information and communicate better. When thinking about these tools, think about what you are doing, what goals you are trying to reach and where you need to improve. Because after all, devops is not solving a technical problem, it's trying to solve a business problem and bringing better value to the end user at a more sustainable pace. And in that way the biggest tool we need to use is YOU, as the person who enables communication.