Saturday, 26 June 2010

Fabric and deployment tools

For some years I have made it a rule that if I ever find myself putting more than a couple of lines into a shell script, I should make it a Python script instead. That way, if I end up reusing it and it inevitably grows in complexity, I don't drift into the land of unmaintainable bash scripts, hundreds of lines of conditionals and loops, but stay with familiar and modular Python modules, classes and functions.

This use of raw Python for shell scripts has evolved over the last month or so into using the Python library Fabric. Now all my shell scripts are metamorphosing into its aptly named fabfiles. There are a number of what may be called shell frameworks around, with Fabric being the leading Python one and Capistrano an equivalent Ruby offering.

The core benefit that these tools offer is a set of handy functions and standard approaches for running ssh-based shell commands across one or more servers. Since everything is in effect controlled locally via ssh, the servers themselves need nothing extra installed on them. Indeed, if they are missing some standard utilities, those utilities can be run locally instead and the resulting preprocessed files punted over to the remote box.
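As a rough illustration of that last point, here is a minimal sketch that compresses a file on the local machine and then hands the result to an upload function. The function names are made up for this example, and `put` is passed in as a parameter so the sketch can be tried without a server; in a fabfile it would be Fabric's own put(), which takes the local path followed by the remote path.

```python
import gzip
import shutil


def preprocess_locally(src_path, dest_path):
    """Gzip a file on the local machine, so the server needs no gzip utility."""
    with open(src_path, 'rb') as src, gzip.open(dest_path, 'wb') as dest:
        shutil.copyfileobj(src, dest)
    return dest_path


def push_preprocessed(src_path, remote_path, put):
    """Preprocess locally, then pass the result to an upload function.

    `put` is an assumption of this sketch: any callable taking
    (local_path, remote_path), the same argument order as Fabric's put().
    """
    local_gz = preprocess_locally(src_path, src_path + '.gz')
    put(local_gz, remote_path)
    return local_gz
```

The remote box only ever receives the finished file, so it does not matter which utilities it has installed.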

The hidden benefit is that by taking a few minutes to always commit any repeated command line to a fabfile, you soon build up a modular library of common shell processes that would otherwise exist as very project- or platform-specific shell scripts or, more inefficiently still, be done manually.

You are now on the road to recording (and hopefully committing to a versioning system) all the processes that surround a project, and hence taking a major step towards its future maintenance, documentation and, hopefully, fully automated build, release, data population, testing etc.

How does it work?

The answer is that it's just Python, and it's just standard command line actions. So, for example, here is perhaps the most basic of functions, one passed no args, to restart Apache on a remote production server ...

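from fabric.api import hosts, settings, sudo

@hosts('', '')
def restart_prod():
  """ Restart the production apaches """
  # warn_only stops Fabric aborting on a failed configtest, so we can report it
  with settings(warn_only=True):
    out = sudo('apache2ctl configtest')
  if out.find('Syntax OK') > -1:
    sudo('apache2ctl graceful')
  else:
    print('Not restarting apache due to config errors - see above')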

This example restarts Apache on the two production servers that deliver the site. It checks that the config is OK first, then uses Fabric's sudo function to run the restart if it is.

Next, you normally want to restart Apache for a reason.
Perhaps you are rolling out a code change, rebuilding data, or clearing server session caches. The first of these is perhaps the most common. The temptation is to roll out code only by updating it from your versioning system, but with a simple Fabric function you can add a local check-in step to this. If it is a production server that is being rolled out to, then a function that tags the current HEAD or TIP as prod_stable and with an auto-incremented release number is also handy.
Of course, if you have written a test suite, then automatically running it on a demo build may be a step you want to add as an acceptance flag for allowing the prod_stable tagging.
You may also need to do some standard actions after checking out code, such as relinking a local var data directory into the checkout location.
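The tagging step might be sketched roughly as follows. This is only an illustration under some assumptions: the versioning system is taken to be git, the `release_` tag prefix and both function names are invented for the example, and `local` is passed in as a parameter (in a fabfile it would be Fabric's local() function) so that the logic can be tried without touching a repository.

```python
import re


def next_release_tag(existing_tags, prefix='release_'):
    """Return the next auto-incremented tag, e.g. release_4 after release_3."""
    pattern = re.compile(r'^%s(\d+)$' % re.escape(prefix))
    matches = (pattern.match(tag) for tag in existing_tags)
    numbers = [int(m.group(1)) for m in matches if m]
    return '%s%d' % (prefix, max(numbers) + 1 if numbers else 1)


def tag_release(existing_tags, local, tests_passed):
    """Tag the current HEAD as prod_stable and as a numbered release.

    `local` is an assumed callable that runs one shell command locally.
    The tags are only applied if the test suite passed, which acts as
    the acceptance flag described above.
    """
    if not tests_passed:
        return None
    tag = next_release_tag(existing_tags)
    local('git tag -f prod_stable')  # move the stable marker to HEAD
    local('git tag %s' % tag)        # record the numbered release
    return tag
```

Keeping the numbering in a small pure function means it can be checked on its own, away from any server or repository.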

All these actions are simple in themselves, but each takes a number of commands, and repeating them across different servers is quite time consuming.
Now you can just run one or two lines to rebuild your data or deploy your code, e.g.
fab deploy:prod

Almost as vital for future maintenance or handover, you also have a versioned script containing all the configuration and deployment details. It is headed with constants that describe all the dependent servers of the project's dev, demo, test, train or production deployments, and it can live with the project code in a build directory.
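A minimal sketch of such a header of constants, and of a deploy function driven by them, could look like this. Everything specific here is a placeholder: the host names, the two deploy commands and the helper names are hypothetical, and `run` is an assumed callable taking a host and a command so that the sketch works without Fabric. With Fabric itself, the command line fab deploy:prod passes "prod" to the deploy task as its first argument.

```python
# Constants describing each deployment's servers.
# All host names here are hypothetical placeholders.
SERVERS = {
    'dev': ['dev.example.com'],
    'demo': ['demo.example.com'],
    'prod': ['www1.example.com', 'www2.example.com'],
}

# Placeholder commands; substitute whatever your project needs.
DEPLOY_COMMANDS = ['git pull', 'apache2ctl graceful']


def hosts_for(target):
    """Return the hosts for a target, failing loudly on an unknown name."""
    if target not in SERVERS:
        raise ValueError('Unknown target %r, expected one of: %s'
                         % (target, ', '.join(sorted(SERVERS))))
    return SERVERS[target]


def deploy(target, run):
    """Run every deploy command on every host of the target.

    `run` is an assumed callable taking (host, command).
    """
    for host in hosts_for(target):
        for command in DEPLOY_COMMANDS:
            run(host, command)
```

Because the server lists sit at the top of the file, whoever inherits the project can see at a glance which machines each deployment touches.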

Where shell frameworks, build tools and configuration management meet

Having said that, there are a number of areas where a shell framework can be used to do things that are really in the realm of either a build tool or a configuration management system. The line that should be drawn here is that if a platform has mature build tools that work with it, e.g. buildout or pip in the case of Python, then don't recreate them in a shell framework. The same goes for configuration management, although this is a difficult line to draw between developer and sys admin. A practical line is that anything specific to the project (code, data fixtures etc.) is suited to a developer's custom shell framework deployment scripting, whilst the VM, OS, core language installation, project layout and data backup are more in the realm of configuration management (puppet, bcfg, et al.). But this may just be because we tend to have release management as a developer role, whilst system failover, rebuild etc. is one for the sys admin.

Of course this is a natural point of conflict, with language build tools, shell frameworks etc. aiming to allow completely replicable build or configuration processes across any platform, whilst config management tends to revolve around building the whole platform. Hence it naturally aims to use the setup tools of whichever platform it is building (e.g. Linux distros' package managers) as the most stable and tested for it.

So as a rule of thumb, if you have to build consistently across different hosters where you don't have access to config management, then you may be tempted to step more on its toes with a shell framework. Similarly, if build tools are somehow restricted for you, then their territory is up for grabs. But on the whole all of these tools can have a place in a configuration and deployment stack, as illustrated in this posting on a deployment blog.