December 14, 2013

Day 14 - What is Packer?

Written By: Mike English (@gazoombo)
Edited By: Michelle Carroll

Packer is a new open source tool for building identical machine images. It's written in Go and maintained by HashiCorp, the creators of Vagrant.

Much like VeeWee (an earlier image-building tool by Patrick Debois from which Packer certainly drew some inspiration), Packer allows you to repeatedly create new VM images from build configurations defined in source code.

Packer takes advantage of an extensible plugin architecture to allow the same source templates to be used with multiple builders, provisioners, and post-processors to create artifacts. For example, a single JSON template and set of Chef cookbooks or Puppet modules could be used to create an AMI, a VirtualBox-based Vagrant basebox, and a VMware image to be deployed to vSphere.
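To make the "one template, several builders" idea concrete, here is a rough sketch. Packer itself only reads a plain JSON file; Python is used here purely to assemble that JSON and sanity-check it. The section names (builders, provisioners, post-processors) and the plugin type names follow the Packer documentation as best I recall it, and older releases called the ISO builders simply "virtualbox" and "vmware", so check them against the version you run. Every value (AMI ID, ISO URL, cookbook path, recipe name) is a hypothetical placeholder, and each builder would need more settings (credentials, checksums, boot commands) before a real build could succeed.

```python
import json

# Hypothetical, deliberately incomplete template: one definition feeding
# three builders, one provisioner, and one post-processor.
template = {
    "builders": [
        {
            "type": "amazon-ebs",              # produces an AMI
            "region": "us-east-1",             # placeholder
            "source_ami": "ami-00000000",      # placeholder
            "instance_type": "t1.micro",       # placeholder
            "ssh_username": "ec2-user",        # placeholder
            "ami_name": "app-server-example",  # placeholder
        },
        {
            "type": "virtualbox-iso",          # produces a VirtualBox image
            "iso_url": "http://example.com/os.iso",  # placeholder
            "ssh_username": "packer",          # placeholder
        },
        {
            "type": "vmware-iso",              # produces a VMware image
            "iso_url": "http://example.com/os.iso",  # placeholder
            "ssh_username": "packer",          # placeholder
        },
    ],
    "provisioners": [
        {
            # The same Chef run is applied to every builder's machine.
            "type": "chef-solo",
            "cookbook_paths": ["cookbooks"],      # placeholder path
            "run_list": ["recipe[app_server]"],   # hypothetical recipe
        }
    ],
    "post-processors": [
        {"type": "vagrant"}  # packages supported builds as Vagrant boxes
    ],
}


def render(tmpl):
    """Serialize the template to the JSON text Packer would read."""
    return json.dumps(tmpl, indent=2, sort_keys=True)


def builder_types(tmpl):
    """List the builder plugin types the template targets, in order."""
    return [builder["type"] for builder in tmpl["builders"]]


if __name__ == "__main__":
    print(render(template))
```

Saving the printed output to a file and pointing `packer build` at it is the intended use. The point of the sketch is only that the provisioning section is written once and shared by every builder.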

It allows for a great deal of code reuse in this regard, as well as portability to other environments. Packer also extends the ideas of infrastructure as code and automation down to the level of OS installation.

What can I use it for?

You can use Packer with as little as a template and a kickstart file (or equivalent) to build JeOS ("Just enough Operating System") images for your organization. This is a great way to make explicit your assumptions about what constitutes a "minimal OS install" for your distribution of choice, including options like SELinux or default language.

You can also use Packer along with higher-level provisioning tools to build images with packages and configuration pre-installed. For example, you may have an application server built atop a JeOS image that isn't ready to accept an application deployment until it's been provisioned for the first time with a 45-minute Chef run. You could use Packer with the Chef-Solo provisioner to build pre-provisioned images so that the whole 45-minute Chef run doesn't need to occur for each individual node.

Depending on how your application manages state, you may even be able to build fully-provisioned images ready to run services on boot. For example, having pre-built, fully-provisioned images could be very useful when auto-scaling quickly.

Traditional image-based deploys

Aren't images a big step backwards?

There are some upsides to the traditional approach to images:

  • Once you get a gold master image, you can keep using it... until your needs change (hint: never change your requirements).
  • Re-using images is better than configuring everything by hand every time.

But, here are some of the downsides often associated with the traditional approach:

  • It takes 3 weeks to get an image after filling out the paperwork to request one! Like most IT problems, this is partly a tooling problem, but mostly an organizational one.
  • Images are often created by hand, meaning they are error-prone, undocumented, and not easily repeatable. This can lead to discrepancies between Production and Development environments, or worse—discrepancies and idiosyncrasies between discrete Production systems.
  • These problems lead to an overprotective attitude toward the "gold master" image, landing us back in a situation where the heavy-handed request processes and long turnaround times are unlikely to be challenged.

Packer is not your [old boss]'s approach to machine images!

Configuration Management

Yeah, so, configuration management tools saved us from all that, right? I thought we were freed from images by tools like Puppet or Chef!

Using tools like Puppet and Chef has drastically improved configuration management by making it repeatable and self-documenting, leading to parity between Production and Development environments and enabling rapid change.

Downsides?

  • It's not quite as easy to "ship" CM as it is to "ship" an image. (Take, for example, launching a new EC2 instance from an AMI vs. launching one with an AMI plus a bunch of CloudFormation scripts…)
  • It can take a really long time to get through the first provisioning run on a new system when you're configuring something especially complex.
  • OS-level configuration is not always well-addressed across platforms. Assumptions about the underlying OS installation are not made explicit in code, and documentation can fall through the cracks.

Images the Packer Way

Using Configuration Management tools with Packer, images can be repeatable, self-documenting, and portable across multiple platforms (thanks to the many builder plugins available). Packer also does more to ensure production/development parity than configuration management alone. With Packer, you're more likely to include OS-installation-level config in the source code (kickstart / preseed / etc.), and it forces a bare minimum of documentation of assumptions about the underlying OS installation.

Immutable Infrastructure and the Question of State

When building images for more than a simple well-defined JeOS base, the inevitable question arises of how you deal with state. This is good. We should all spend a lot more time thinking about this.

Taking Inspiration from Functional Programming

Finding better ways to manage state has led many software developers to adopt a functional programming paradigm. Some of these concepts can be applied to the way we think about our infrastructure as well.

Over the summer, Chad Fowler put forth the interesting idea of Immutable Deployments:

Many of us in the software industry are starting to take notice of the benefits of immutability in software architecture. We’ve seen an increased interest over the past few years in functional programming techniques with rising popularity of languages such as Erlang, Scala, Haskell, and Clojure. Functional languages offer immutable data structures and single assignment variables. The claim (which many of us believe based on informal empirical evidence) is that immutability leads to programs that are easier to reason about and harder to screw up.

So why not take this approach (where possible) with infrastructure? If you absolutely know a system has been created via automation and never changed since the moment of creation, most of the problems I describe above disappear. Need to upgrade? No problem. Build a new, upgraded system and throw the old one away. New app revision? Same thing. Build a server (or image) with a new revision and throw away the old ones.

Not There Yet

Kris Buytaert recently remarked that "...immutable applications are really the exception rather than the rule." That is, we should be careful about thinking that we can start deploying all of our applications as binary virtual appliances.

It's true, most useful applications need to persist a lot of state, and often do so in complex ways. That isn't to say that "Immutable Deployments" are impossible—it just means we have a lot of work to do.

Virtual Appliances

This past year (unfortunately, before Packer was released), I worked on a project to build a Virtual Appliance. The deliverable was a process for creating an OVA containing several related applications suitable for deployment on the most commonly used hypervisors in the enterprise.

From early on in the project, we treated the question of state persistence as a primary concern. Our approach was to make sure we had a good import/export process. Newly launched appliance images were designed to be able to import from backups or go through a first-time configuration process. In this way, new artifacts can be deployed as images while maintaining continuity. Even though we had a good automated build process for our appliance images, we had to do the work to ensure we managed the application state appropriately.
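The import-or-configure decision described above can be sketched in a few lines. This is not the code from that appliance project; it is a minimal illustration under assumptions of my own: state lives in a single JSON file, and the names `first_boot`, `export_state`, and `state.json` are all hypothetical. A real appliance would have databases, uploaded files, and secrets to deal with.

```python
import json
from pathlib import Path

STATE_FILE = "state.json"  # hypothetical name for the appliance's state


def first_boot(state_dir, backup_file, defaults):
    """Decide how a freshly launched image gets its state.

    Returns "existing" if state is already present, "imported" if it was
    restored from a backup, or "configured" if defaults were written.
    """
    state_dir = Path(state_dir)
    state_path = state_dir / STATE_FILE
    if state_path.exists():
        return "existing"  # not a first boot; leave state untouched

    state_dir.mkdir(parents=True, exist_ok=True)
    backup = Path(backup_file)
    if backup.exists():
        # Continuity: carry state over from the appliance being replaced.
        state = json.loads(backup.read_text())
        mode = "imported"
    else:
        # Stand-in for an interactive first-time configuration process.
        state = dict(defaults)
        mode = "configured"

    state_path.write_text(json.dumps(state))
    return mode


def export_state(state_dir, backup_file):
    """Write the current state out as a backup a new image can import."""
    state_path = Path(state_dir) / STATE_FILE
    Path(backup_file).write_text(state_path.read_text())
```

With something like this in place, replacing an appliance becomes: export from the old instance, launch the new image, and let its first boot import the backup. The image stays disposable because the state has its own well-defined path in and out.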

Disaster Recovery and Phoenix Servers

Combining well-managed state with a good automated image build process also provides a great deal of value when it comes to disaster recovery. If you lose all of your production servers, but you have offsite backups that are easily imported to new nodes running your up-to-date image, you can get back online much more quickly.

In his definition of a Phoenix Server, Martin Fowler provides an apt thought experiment:

One day I had this fantasy of starting a certification service for operations. The certification assessment would consist of a colleague and I turning up at the corporate data center and setting about critical production servers with a baseball bat, a chainsaw, and a water pistol. The assessment would be based on how long it would take for the operations team to get all the applications up and running again.

This may be a daft fantasy, but there's a nugget of wisdom here. While you should forego the baseball bats, it is a good idea to virtually burn down your servers at regular intervals. A server should be like a phoenix, regularly rising from the ashes.

In order to prevent your new images from becoming the dreaded stale gold masters of old, consider taking advantage of the repeatability and automation these new tools provide.

In short

Packer is a tool for building identical machine images. It can help to provide repeatability, documentation, parity between production and development environments, portability across platforms, and faster deployments. But like any tool, it still requires a good and thoughtful workflow to be used most effectively.

For an example of how to use Packer, see the Getting Started section of the documentation.

1 comment:

Sarah Elkins said...

Nice overview of some of the history and pros/cons of different approaches. I enjoyed the flashback to gold images (I remember burning the gold CD and then duplicating it).