Monday, March 31, 2014

AWS, Chef and scaling a mixed Windows/Linux environment (Part 4)

The AMI Bakery


Here's the part that I really wanted to write about.

We decided it was time to invest in the ability to generate fully configured AMIs for our applications.

The Roles Cookbook


Back to the roles cookbook from a prior post. Each role had a "role_launch" recipe that did everything necessary to take a base CentOS/Windows AMI and turn it into a working instance of that role. We then added a "role_ami" recipe that was very similar, but with some important differences:

  • nothing would start or run during AMI creation: db migrations, service starts, etc. would be set to run at boot time, not at bake time.
  • the ami_prep recipe would install a startup script that would allow the instance to register itself with the chef-server when it came online. 

AMI Prep


Windows. Linux. userdata. So many different ways to do things, and none of them quite as controlled or specific as we wanted. 

Like so many other wheels, I reinvented it. 

  • Each platform has an init script. For Linux it's a standard SysV-style init script, and for Windows it's a PowerShell script that runs at boot via Task Scheduler.
  • The script only runs on the initial boot. It looks for a file named firstboot.txt on the root volume and only runs if that file is present. On completion of the run the file is removed.
  • Both use an SNS handler recipe to generate an alert if the run fails. This way we only hear about it when something bad happens.
  • In both cases the instances use IAM roles and AWS command line tools to fetch a default runlist from an S3 bucket and execute chef-client. 
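The first-boot guard described above can be sketched in a few lines of Python. This is only an illustration, not the actual init scripts (those were SysV shell and PowerShell): run_first_boot, converge and alert are hypothetical names, where converge stands in for "fetch the run list from S3 and execute chef-client" and alert stands in for the SNS notification. Leaving the marker in place after a failed run is an assumption; the post only says the file is removed on completion.

```python
import os


def run_first_boot(marker_path, converge, alert):
    """Run `converge` once, guarded by a marker file.

    marker_path: path to the first-boot marker (e.g. firstboot.txt).
    converge:    hypothetical callable that fetches the run list and
                 runs chef-client; it should raise on failure.
    alert:       hypothetical callable that sends a failure message
                 (SNS in the setup described above).
    Returns "skipped", "ok", or "failed".
    """
    if not os.path.exists(marker_path):
        # Marker is gone, so this is not the initial boot: do nothing.
        return "skipped"
    try:
        converge()
    except Exception as exc:
        # Only make noise when something goes wrong.
        alert("first-boot chef run failed: %s" % exc)
        # Assumption: keep the marker so the failure can be investigated.
        return "failed"
    # Successful run: remove the marker so later boots skip this step.
    os.remove(marker_path)
    return "ok"
```

Passing converge and alert in as callables keeps the guard logic identical on both platforms and makes it easy to exercise without touching AWS or Chef.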

Adding the --bake flag

Our main tool for instance creation is the appropriately named "create_instance.py" script.

Previously this script would:
  • create/update security groups
  • create/update IAM roles
  • create instance
  • bootstrap chef / run the role_launch recipe

With the new --bake flag it could now:
  • create/update security groups
  • create/update IAM roles
  • create instance
  • bootstrap chef / run the role_ami recipe
  • stop the instance
  • create an AMI from the instance
  • wait for the AMI to be available
  • terminate the instance
  • clean up the chef node objects
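
The extra steps that --bake adds (stop, image, wait, terminate, clean up) might look roughly like the sketch below. It is not the real create_instance.py: bake_ami and cleanup_chef are hypothetical names, and the ec2 argument is assumed to behave like a boto3 EC2 client, which is newer than the tooling available when this was written. Security groups, IAM roles, instance creation and the role_ami run are assumed to have happened already.

```python
def bake_ami(ec2, instance_id, ami_name, cleanup_chef=None):
    """Turn an already-configured instance into an AMI.

    ec2:          object with the boto3 EC2 client interface (assumption).
    instance_id:  the instance that has finished its role_ami run.
    ami_name:     name to give the new image.
    cleanup_chef: optional hypothetical callable that deletes the
                  node/client objects from the chef-server.
    Returns the new image id.
    """
    # Stop the instance and block until it is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Create the AMI and block until it is available for launches.
    image_id = ec2.create_image(InstanceId=instance_id, Name=ami_name)["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # The builder instance is no longer needed once the image exists.
    ec2.terminate_instances(InstanceIds=[instance_id])

    # Remove the throwaway node from the chef-server, if asked to.
    if cleanup_chef is not None:
        cleanup_chef(instance_id)
    return image_id
```

With boto3 installed, a call could look like bake_ami(boto3.client("ec2"), "i-0123456789abcdef0", "web-release-42"), where both the instance id and the image name are made up for illustration.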

Post deploy task

Immediately post-deploy, a job in Jenkins is triggered to bundle new AMIs for the current release. That way if we need to scale rapidly, we have an image. More on that when I post about scaling.
