Building a custom Amazon Machine Image (AMI) in Amazon’s EC2 cloud is not difficult. Bundling AMI after AMI and keeping track of the different versions, however, can be. This post will help you better understand the process of “baking” AMIs and provide you with a script that makes the process trivial. It assumes that you are starting with an existing AMI from one of the hundreds (possibly thousands by now) of publicly available AMIs registered in EC2. I wanted a CentOS 5 AMI, so I chose one from the good people at RightScale, as they have a very good reputation. They even publish the scripts that they used to build the AMIs, so you can see exactly what goes into them. You could start there by modifying one of their build scripts to produce your initial AMI, but that extra step is really not necessary for most people. These are the basic steps in getting your custom AMI into EC2 so that you can use it to launch purpose-dedicated instances:
- Launch an instance using an initial public AMI
- Modify the instance to fit your needs
- Make a new AMI (bundle) from the customized instance
- Upload the bundle to Amazon’s S3
- Register the new AMI as either a public or private AMI so that you can use it to launch new machines
At OpenX, our AMIs have evolved quite a bit since our original spec was written, so we started versioning them early on. I place the AMI version number in /etc/motd so that as soon as I log into a machine, I can see the version number and instantly know what I am dealing with.
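Here is a minimal sketch of how such a version stamp could be written. The helper name stamp_motd and the "AMI version:" line format are assumptions made for illustration; they are not taken from the original script.

```shell
#!/bin/bash
# Sketch only: stamp_motd and the "AMI version:" line format are hypothetical.
# Usage: stamp_motd VERSION [MOTD_FILE]
stamp_motd() {
    local version=$1
    local motd=${2:-/etc/motd}
    local tmp
    tmp=$(mktemp) || return 1
    # Drop any stamp left by a previous generation so stamps do not pile up.
    # grep exits non-zero when no lines remain, which is fine here.
    if [ -f "$motd" ]; then
        grep -v '^AMI version: ' "$motd" > "$tmp" || true
    fi
    echo "AMI version: $version" >> "$tmp"
    # Overwrite in place so the motd file keeps its owner and permissions.
    cat "$tmp" > "$motd"
    rm -f "$tmp"
}
```

Calling stamp_motd 1.2 as root just before bundling would leave a single "AMI version: 1.2" line at the end of /etc/motd.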
The script I wrote is in bash and takes the current version number as its argument, e.g. 1.2. OpenX has multiple Amazon EC2 accounts, but we want them all to use the same AMIs, so I need to bundle one for each account. This becomes very time consuming very quickly, hence this script.
I start by creating a directory called ec2/account_name/ for each EC2 account and putting the cert and key pem files inside along with a file called vars which contains all of the information that I need for the bundle and upload procedures:
EC2=backend
EC2_CERT=/ebs/ec2/backend/cert-XXXXXXXXXXXXXXXXXXXX.pem
EC2_PRIVATE_KEY=/ebs/ec2/backend/pk-XXXXXXXXXXXXXXXXXXX.pem
awsUID=XXXX-XXXX-XXXX
awsAID=XXXXXXXXXXXXXXXXXXX
awsSID=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
awss3bucket=base.$bits.backend
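Because every account directory has to define the same set of variables, it can be worth checking a vars file before starting a long bundle run. The following is a minimal sketch under that assumption; check_vars is a hypothetical helper and is not part of the original script.

```shell
#!/bin/bash
# Sketch only: check_vars is a hypothetical helper.
# It sources a vars file in a subshell and reports every expected variable
# that is unset or empty. Returns 0 when all are present, 1 otherwise.
check_vars() {
    local vars_file=$1
    (
        # Start clean so values inherited from the caller cannot mask a gap.
        unset EC2 EC2_CERT EC2_PRIVATE_KEY awsUID awsAID awsSID awss3bucket
        # The vars file expands $bits when it builds the bucket name.
        bits=${bits:-64}
        source "$vars_file" || exit 1
        missing=0
        for name in EC2 EC2_CERT EC2_PRIVATE_KEY awsUID awsAID awsSID awss3bucket; do
            if [ -z "${!name}" ]; then
                echo "missing: $name"
                missing=1
            fi
        done
        exit "$missing"
    )
}
```

For example, check_vars /ebs/ec2/backend/vars prints nothing and returns 0 when the file is complete.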
The first part of my script deals with getting the version number from the command line and sourcing the variables for all the accounts. It also determines whether the machine to be bundled is a 32 bit or 64 bit machine:
#!/bin/bash
#this script will deploy the current ami for each account listed in /ebs/ec2
accounts=`ls /ebs/ec2`
arch=`uname -i`
log=/ebs/log.bundle
version=$1
#sanity checks
if [ -z "$1" ]
then
echo "Missing parameter: version number"
exit 1
fi
#set bits variable according to system arch
if [ "$arch" = "i386" ]
then
bits=32
else
bits=64
fi
cat /dev/null > $log
for i in $accounts
do
source /ebs/ec2/$i/vars
The next step is bundling, which is done with the ec2-bundle-vol command. It is available in the Amazon AMI Tools package and works by creating a snapshot of the current filesystem, then compressing, encrypting and signing that snapshot. Amazon limits the size of an AMI to 10 GB. This obviously limits the number of packages that you can have installed when you bundle your AMI. It is not really a huge limitation, since you don’t want to lock yourself into a lot of “baked in” applications. You can always add them after you launch your instance, and that will give you more flexibility in the long run. The things you want to customize in the AMI are things like base packages, new YUM repositories, different shells, core services that start on boot, the motd, etc.
The second segment of my bash script deals with the bundle command:
#bundle
echo
echo -en '\E[40;32m'"\033[1m"`date +%H:%M:%S` Bundling for $i"\033[0m"
echo `date +%H:%M:%S`" Bundling for $i" >> $log
echo
ec2-bundle-vol -e /ebs -p "baseV"$version"-"$bits.$i -d /mnt -k $EC2_PRIVATE_KEY -c $EC2_CERT -u $awsUID -r $arch
if [ $? -ne 0 ]
then
echo
echo -en '\E[40;31m'"\033[1m"`date +%H:%M:%S` Bundling for $i failed"\033[0m"
echo
echo `date +%H:%M:%S`" Bundle of $i failed " >> $log
else
echo -en '\E[40;32m'"\033[1m"`date +%H:%M:%S`" Bundling of $i completed successfully""\033[0m"
echo
fi
This will generate 73 separate “parts” of your image, along with an XML manifest that tells Amazon the details of your AMI, and put them all into /mnt.
Note: If the manifest or one of the parts is ever deleted from the S3 bucket that we are going to upload to in the subsequent steps, you will not be able to use that AMI again.
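Since a missing part or manifest makes the AMI unusable, a quick look at /mnt before uploading can save a wasted upload. The sketch below is one way to do that. Both helpers are hypothetical, and the PREFIX.part.NN naming of the part files is an assumption; only the PREFIX.manifest.xml name is taken from the script itself.

```shell
#!/bin/bash
# Sketch only: bundle_prefix and check_bundle are hypothetical helpers.
# bundle_prefix mirrors the -p argument the script passes to ec2-bundle-vol.
bundle_prefix() {
    local version=$1 bits=$2 account=$3
    echo "baseV${version}-${bits}.${account}"
}

# Succeeds only if the manifest and at least one part file exist in DIR.
# Assumes parts are named PREFIX.part.NN; adjust the glob if yours differ.
check_bundle() {
    local dir=$1 prefix=$2
    local parts=0 f
    if [ ! -f "$dir/$prefix.manifest.xml" ]; then
        echo "no manifest for $prefix"
        return 1
    fi
    for f in "$dir/$prefix".part.*; do
        # An unmatched glob is left as a literal string, so test for existence.
        if [ -e "$f" ]; then
            parts=$((parts + 1))
        fi
    done
    if [ "$parts" -eq 0 ]; then
        echo "no parts for $prefix"
        return 1
    fi
    echo "$prefix: manifest plus $parts parts"
}
```

Inside the account loop this could be called as check_bundle /mnt "$(bundle_prefix $version $bits $i)" right after the bundle step.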
The next thing to do is to upload the parts and manifest to S3. We use the ec2-upload-bundle command to do this:
ec2-upload-bundle -b S3-BUCKET -m MANIFEST-PATH -a AWS-ACCESS-KEY-ID -s AWS-SECRET-KEY [--acl ACL] [--ec2certificate PATH] [-d DIRECTORY] [--part PART] [--url URL] [--retry] [--skipmanifest]
I used S3 Firefox Organizer to create two buckets in S3 for each account. I called one base.32.account_name and the other base.64.account_name, for 32 and 64 bit AMIs respectively. I wrote my script to automatically upload the parts and manifest to the proper buckets in the proper accounts.
The next section of the script uploads to S3 using the credentials from our ec2/account_name/vars files and the $version and $bits variables from the first segment:
#upload
manifest="baseV"$version"-"$bits.$i."manifest.xml"
echo
echo -en '\E[40;32m'"\033[1m"`date +%H:%M:%S` Uploading to $i:$awss3bucket"\033[0m"
echo `date +%H:%M:%S`" Uploading to $i:$awss3bucket" >> $log
echo
ec2-upload-bundle -b $awss3bucket -a $awsAID -s $awsSID -m /mnt/$manifest
if [ $? -ne 0 ]
then
echo
echo -en '\E[40;31m'"\033[1m"`date +%H:%M:%S`" Upload to $i failed ""\033[0m"
echo
echo `date +%H:%M:%S`" Upload to $i failed " >> $log
echo
echo -en '\E[40;32m'"\033[1m"`date +%H:%M:%S` Re-Uploading to $i:$awss3bucket"\033[0m"
echo `date +%H:%M:%S`" Re-Uploading to $i:$awss3bucket" >> $log
echo
ec2-upload-bundle -b $awss3bucket -a $awsAID -s $awsSID -m /mnt/$manifest
else
echo -en '\E[40;32m'"\033[1m"`date +%H:%M:%S`" Upload to $i completed successfully""\033[0m"
echo
fi
#rm -rf /mnt/"baseV"$version"-"$bits.$i*
done
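The upload step above retries once by hand and never checks whether the second attempt succeeded. One way to tidy that up is a small retry wrapper. This is only a sketch; retry is a hypothetical helper and is not part of the original script.

```shell
#!/bin/bash
# Sketch only: retry is a hypothetical helper.
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    local max=$1
    shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying: $*" >&2
        attempt=$((attempt + 1))
    done
}
```

With this in place, retry 2 ec2-upload-bundle -b $awss3bucket -a $awsAID -s $awsSID -m /mnt/$manifest would give the same "try, then try once more" behaviour, with an exit status that reflects the final attempt. The usage text for ec2-upload-bundle also lists a --retry option, which may be worth a look.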
All that remains is to wrap up the script:
echo
echo -en '\E[05;40;32m'"\033[1m"`date +%H:%M:%S`" Deployment completed""\033[0m"
echo `date +%H:%M:%S`" Completed" >> $log
echo
exit 0
How I use it all:
I created a small EBS volume and copied the bundle.sh script and the ec2/ dir to it. I then made a snapshot for good measure. Any time I need to “bake” a new AMI, I launch an instance using the previous version of the AMI and then attach the EBS volume with the bundling tools on it to the new instance, mounted at /ebs. The mount point is important, as the ec2-bundle-vol command (when called from my script) excludes that directory while bundling, so that my tool box and, more importantly, my sensitive EC2 information do not get baked into the new AMI. I then update the version numbers and make any changes to the instance, “cd” to the /ebs directory and finally do a “history -c” to keep from developing a giant command history over the generations. Actually, there are several things that can grow like cancer from generation to generation, the root authorized_keys file being another. Every time you launch a new instance, EC2 adds the launch_key to the authorized_keys file, so if you don’t clean it out before bundling, you will have an extra key in the file. The next time you’ll have two extra keys, and so on. Mail for root or other accounts is another thing to watch.
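The clean-up described above could be collected into one function, roughly as sketched here. The helper name clean_for_bundle is hypothetical, and the file paths are assumptions about a typical CentOS layout, so check them against your own image before relying on this.

```shell
#!/bin/bash
# Sketch only: clean_for_bundle is a hypothetical helper and the paths below
# are assumptions about a typical CentOS layout.
# Usage: clean_for_bundle [ROOT]
# ROOT defaults to /; pointing it at a scratch directory is handy for testing.
clean_for_bundle() {
    local root=${1:-/}
    local f
    # Strip a trailing slash so the paths below are built cleanly.
    root=${root%/}
    # Truncate rather than delete so ownership and permissions are preserved.
    for f in "$root/root/.ssh/authorized_keys" \
             "$root/root/.bash_history" \
             "$root/var/spool/mail/root"; do
        if [ -f "$f" ]; then
            : > "$f"
            echo "emptied $f"
        fi
    done
}
```

Note that this only empties the history file on disk. The “history -c” still has to be typed in the interactive shell itself, because a script run as a separate process cannot clear that shell's in-memory history.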
You are not quite ready to use your AMI yet. The final step will be to register your AMI with Amazon. That will be described in the next post, with a Perl script to do the heavy lifting for you. But if you can’t wait, look into the ec2-register command.
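For the impatient, here is a small sketch that only prints the registration command for a given bucket and manifest, so nothing is registered until you run the printed line yourself. The helper register_cmd is hypothetical, and the sketch assumes that ec2-register accepts the manifest location in the form BUCKET/MANIFEST; check the usage text of your installed tools before relying on it.

```shell
#!/bin/bash
# Sketch only: register_cmd is a hypothetical helper that prints the command
# instead of running it.
# Assumes ec2-register takes the manifest location as BUCKET/MANIFEST.
register_cmd() {
    local bucket=$1 manifest=$2
    echo "ec2-register $bucket/$manifest"
}

# Example using the naming scheme from the script above.
register_cmd base.32.backend baseV1.2-32.backend.manifest.xml
```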




Hi
This is really great and exactly what I was looking for.
It’s been a long time since you posted this, but is there any chance of seeing the Perl script you use to register the AMI with Amazon – “The final step will be to register your AMI with Amazon. That will be described in the next post, with a perl script to do the heavy lifting for you.”
Cheers
Rob
Rob,
Glad to hear that it was helpful. Sorry to report that while writing the registration script the process changed, the scope of the script started to change and then the company I was working for decided to move our cloud internally. The short story is, I have some snippets in perl but nothing complete to offer.
Cheers to you, sir,
Jeff
Hi, this is a helpful and informative post. One small correction: The current limit on an EC2 AMI is 10 GB, not 10 MB.
Quite right, thank you. I made the correction in my post.
[...] your own Amazon EC2 image Jeff Roberts, the vim-fu guru, does it again with a great post on “Bundling versioned AMIs rapidly in Amazon’s EC2“. It’s a step-by-step guide on how to roll your own AMI, bundle it and upload it to S3, [...]