Launching EC2 On Demand: Video Transcoding

At Grio, we use EC2 to power almost all of our server needs. Amazon hosting provides a convenient means of housing a web server and database server, a wiki, and our client development environment. It’s a cost-efficient solution for companies like ours, in that we can avoid the hassle of purchasing and maintaining hardware. The strategy allows us to add servers only when we need them and remove them when they are no longer needed. Since Amazon’s pricing is based on how long a server is up, we want to make sure that we only run a server when necessary.

If you’re interested in saving money (who isn’t?), your EC2 instances should spend as little time idle as possible. In this blog entry, I’ll discuss how to create an EC2 instance (which is equivalent to a “server”), use it to process computing-intensive tasks, and terminate it when the process is completed. We assume some familiarity with the EC2 API.

As a real-world example, we’ll build a server that transcodes high-definition movies, which are ill-suited for web delivery, into smaller Flash-based FLV files that are more appropriate for that purpose. Once the transcoding process completes, the server will terminate. Transcoding is a very CPU-intensive process, so co-locating your video file manipulation with your web server may not be such a good idea. At the same time, unless you are YouTube and need real-time transcoding, having a dedicated video transcoding EC2 instance running 24/7 may be wasteful and will incur a high hosting cost (at the time of this writing, about $70/month for a small instance and $550/month for an extra large one).

Before we start, let’s pick an AMI (Amazon Machine Image) that will be used as the basis of our work. We’ll use Amazon’s Fedora 8 public AMI (ami-id=ami-f51aff9c). Then, we’ll need to perform the following tasks:

  1. Configure the instance to run a custom script on instance launch.
  2. Bundle and register the configured instance.
  3. Create a script to install the software required for video transcoding.
  4. Write a script to download videos, transcode them, and send the results to a destination such as S3.
  5. Launch the instance, passing it the script.

Let’s get started.

Configure the Instance to Run a Custom Script
Our first objective is to configure an instance so that it can automatically execute a given script on instance creation. Let’s run the base instance using the EC2 API:

ec2-run-instances ami-f51aff9c

Assuming that we’ve authorized port 22, we can now ssh into the instance. Edit /etc/rc.local and add the following lines at the end of the file:

wget http://169.254.169.254/1.0/user-data -O /tmp/autorun
sh /tmp/autorun

Here we are introducing a bootstrapping mechanism into the instance, so that we can run a shell script when the instance is started. The wget call takes Amazon’s magic URL as an argument. Even though this looks a bit strange, every instance uses the same IP address, 169.254.169.254, in the URL. Amazon looks at where the request comes from and returns the data that belongs to the calling instance.

The wget call pulls down the data uploaded by ec2-run-instances when it’s called with the -f argument. So, if the following command is executed locally…

ec2-run-instances ami-xxxxxxxx -f transcoding_bootstrap.sh

…then the file transcoding_bootstrap.sh will be uploaded to Amazon’s servers and become downloadable from within our instance through the magic URL http://169.254.169.254/1.0/user-data. One restriction in this process is that the uploaded script can be no larger than 16 KB (16384 bytes). Otherwise, we’ll get the following error:

Client.InvalidParameterValue: User data is limited to 16384 bytes
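
Since the limit is only reported at launch time, it can be handy to check the script size locally first. The following is a minimal sketch of such a check; the function name check_user_data_size is our own invention and not part of the EC2 tools, and the 16384-byte figure is taken from the error message above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the EC2 tools).
# The limit comes from the error message quoted above.
MAX_USER_DATA_BYTES=16384

# check_user_data_size FILE
# Prints the size and returns 0 if FILE fits, 1 if it is too large,
# 2 if it does not exist.
check_user_data_size() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script" >&2
        return 2
    fi
    # wc pads its output on some systems, so strip the spaces.
    size=$(wc -c < "$script" | tr -d ' ')
    if [ "$size" -gt "$MAX_USER_DATA_BYTES" ]; then
        echo "too large: $size bytes (limit $MAX_USER_DATA_BYTES)" >&2
        return 1
    fi
    echo "ok: $size bytes"
}
```

A typical use would be to run check_user_data_size transcoding_bootstrap.sh and only call ec2-run-instances if it succeeds.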

The “sh /tmp/autorun” line treats the downloaded file (here, transcoding_bootstrap.sh) as a shell script and runs it.
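
The two rc.local lines will run whatever wget leaves in /tmp/autorun, even when the instance was launched without user data or the download failed. A slightly more defensive variant might look like the sketch below. The function names fetch_user_data and run_autorun are hypothetical, and the retry count and timeout are arbitrary values chosen for illustration.

```shell
#!/bin/sh
# Defensive variant of the rc.local bootstrap. Function names are
# hypothetical; the URL and /tmp/autorun come from the lines above.
USER_DATA_URL="http://169.254.169.254/1.0/user-data"
AUTORUN_FILE="/tmp/autorun"

# Download the user data. -t limits the retries, -T sets the timeout
# in seconds (both values are arbitrary choices).
fetch_user_data() {
    wget -q -t 3 -T 10 "$USER_DATA_URL" -O "$1"
}

# Run the file only if it exists and is non-empty. wget -O can leave an
# empty file behind when the download fails, so the -s test matters.
run_autorun() {
    if [ -s "$1" ]; then
        sh "$1"
    else
        echo "no user data found, skipping autorun" >&2
    fi
}

# In /etc/rc.local the two steps would be chained like this:
# fetch_user_data "$AUTORUN_FILE" && run_autorun "$AUTORUN_FILE"
```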

Bundle and register the configured instance

Now that we have an instance that’s capable of running a custom bootstrap script, we are ready to bundle it and register it as an AMI for later use. To bundle the instance, we’ll ssh in and run this line:

ec2-bundle-vol -d /mnt/ --cert [your_certificate] --privatekey [your_privatekey] --user [your_user_access_id]

This will create an image of our instance and place it in the /mnt folder. Now we’ll push the image to S3 so we can register it later:

ec2-upload-bundle --access-key [your_access_key] --secret-key [your_secret_key] --bucket [your_s3_bucket] --manifest /mnt/image.manifest.xml

Now, let’s register the image:

ec2-register [your_s3_bucket]/image.manifest.xml

If all succeeds, we’ll get back an AMI id. Let’s say it’s ami-a123beef.
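
If the registration step is scripted, the AMI id can be captured rather than copied by hand. The sketch below assumes that ec2-register prints a line of the form “IMAGE ami-xxxxxxxx”, which matches our experience with the command-line tools but should be verified against your version. The function name parse_ami_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ec2-register output on stdin and print the
# AMI id. Assumes an output line of the form "IMAGE <ami-id>".
parse_ami_id() {
    awk '$1 == "IMAGE" { print $2; exit }'
}

# Example use (placeholder bucket name, as in the commands above):
# AMI_ID=$(ec2-register [your_s3_bucket]/image.manifest.xml | parse_ami_id)
```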

Installing Transcoding Software

Let’s create a bootstrap script that will be useful in our transcoding scenario. There are two logical parts to this script: installing the required transcoding software and performing the transcoding.

The script will first install FFmpeg and all of its required libraries. We could have taken the approach of installing the software first and bundling the AMI second. However, decoupling the script from the AMI gives us greater control over which versions of the software to use. Furthermore, it’s not a particular burden, as installing the software only takes a few minutes.

Here are the contents of the script:

#
# install ffmpeg
#
yum -y install ncurses-devel gcc gcc-c++ libtool svn git yasm gsm-devel libogg-devel libvorbis-devel libtheora-devel

svn export svn://svn.mplayerhq.hu/ffmpeg/trunk /mnt/ffmpeg-trunk-source
cd /mnt/ffmpeg-trunk-source
git clone git://git.videolan.org/x264.git
cd x264
./configure --prefix=/usr --enable-shared --enable-pthread --disable-asm
make
make install

cd ..
wget http://liba52.sourceforge.net/files/a52dec-0.7.4.tar.gz
tar -zxvf a52dec-0.7.4.tar.gz
cd a52dec-0.7.4
./configure --prefix=/usr --enable-double
make
make install
cd ..

wget http://downloads.sourceforge.net/faac/faac-1.26.tar.gz
tar -zxvf faac-1.26.tar.gz
cd faac
autoreconf -vif
./configure --prefix=/usr
make
make install
cd ..

wget http://downloads.sourceforge.net/faac/faad2-2.6.1.tar.gz
tar -zxvf faad2-2.6.1.tar.gz
cd faad2
autoreconf -vif
./configure --prefix=/usr
make
make install
cd ..

wget http://downloads.sourceforge.net/lame/lame-3.98b8.tar.gz
tar -zxvf lame-3.98b8.tar.gz
cd lame-3.98b8
./configure --prefix=/usr
make
make install
cd ..

wget http://libmpeg2.sourceforge.net/files/mpeg2dec-0.4.1.tar.gz
tar -zxvf mpeg2dec-0.4.1.tar.gz
cd mpeg2dec-0.4.1
./configure --prefix=/usr
make
make install
cd ..

wget http://downloads.xvid.org/downloads/xvidcore-1.1.3.tar.gz
tar -zxvf xvidcore-1.1.3.tar.gz
cd xvidcore-1.1.3/build/generic
./configure --prefix=/usr
make
make install
cd ../../../

wget http://ftp.penguin.cz/pub/users/utx/amr/amrnb-7.0.0.1.tar.bz2
tar -jxvf amrnb-7.0.0.1.tar.bz2
cd amrnb-7.0.0.1
./configure --prefix=/usr
make
make install
cd ..

./configure --prefix=/usr --enable-static --enable-shared --enable-gpl --enable-nonfree --enable-postproc --enable-avfilter --enable-avfilter-lavf --enable-libamr-nb --enable-libfaac --enable-libfaad --enable-libfaadbin --enable-libmp3lame --enable-libvorbis --enable-libx264 --enable-libxvid
make
make install
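
As written, the install script keeps going when a download or build step fails, which can waste instance time on a build that is already broken. One possible refinement is to wrap each step in a small helper that logs the command and reports failure. This is a sketch only: run_step and LOG_FILE are names we made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper for the install script: log each step and report
# failure so the caller can stop early. LOG_FILE is an assumed location.
LOG_FILE="${LOG_FILE:-/tmp/transcoding_bootstrap.log}"

# run_step COMMAND [ARGS...]
# Appends the command line and its output to LOG_FILE.
# Returns 0 on success, 1 if the command failed.
run_step() {
    echo "==> $*" >> "$LOG_FILE"
    if "$@" >> "$LOG_FILE" 2>&1; then
        return 0
    fi
    echo "FAILED: $*" >> "$LOG_FILE"
    return 1
}

# Example use inside the install script:
# run_step ./configure --prefix=/usr || exit 1
# run_step make || exit 1
# run_step make install || exit 1
```

The log file also gives us something to inspect over ssh if an instance does not behave as expected.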

When the script is run, we’ll have a working ffmpeg installation on the instance. The next thing to do is to extend the script so that it automatically pulls videos to transcode, transcodes them, and pushes the results to a storage destination. Here’s the basic pseudocode to perform the tasks:

while more videos are available {
    download [video_file]
    ffmpeg -i [video_file] -f flv -b 350k [output_file]
    push [output_file] to destination
}


There are several ways of supplying the set of video files to transcode. If the videos are from external sources, an RSS feed may be the easiest way to retrieve them. Amazon SQS (Simple Queue Service) is another alternative. The choice depends on how our business and technology platform is structured.

Our script demonstrates the use of ffmpeg to transcode video into an FLV movie at a bit rate of 350 kbps. This bit rate is high enough to produce good quality for web consumption.

Once transcoded, the videos may be uploaded to any destination. A CDN such as Limelight or Akamai will provide an FTP interface for uploading files to its repository. S3 is another popular destination.
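
To make the pseudocode more concrete, here is one way the loop could be written in shell. It is a sketch under several assumptions: the videos are listed one URL per line in a plain text file (rather than an RSS feed or SQS), WORK_DIR and all function names are hypothetical, and upload_result is only a placeholder for a real S3 or FTP upload. The -b flag matches the ffmpeg builds of this era; newer builds expect -b:v instead.

```shell
#!/bin/sh
# Sketch of the transcoding loop. WORK_DIR and all function names are
# hypothetical; the URL list file is an assumed input format.
WORK_DIR="${WORK_DIR:-/mnt/transcode}"

# Map a source URL or file name to the name of the FLV file we produce.
flv_name_for() {
    base=$(basename "$1")
    echo "${base%.*}.flv"
}

# Placeholder: replace the body with a real S3 or FTP upload command.
upload_result() {
    echo "would upload $1"
}

# Download one video, transcode it, upload the result, and clean up.
transcode_one() {
    url="$1"
    src="$WORK_DIR/$(basename "$url")"
    out="$WORK_DIR/$(flv_name_for "$url")"
    wget -q "$url" -O "$src" || return 1
    # ffmpeg reads stdin, which would swallow the URL list in the loop
    # below, so stdin is redirected from /dev/null.
    ffmpeg -y -i "$src" -f flv -b 350k "$out" < /dev/null || return 1
    upload_result "$out" || return 1
    rm -f "$src" "$out"
}

# transcode_all LIST_FILE: process every URL listed in LIST_FILE.
transcode_all() {
    mkdir -p "$WORK_DIR"
    while read -r url; do
        [ -n "$url" ] || continue
        transcode_one "$url" || echo "failed: $url" >&2
    done < "$1"
}
```

Appending a call such as transcode_all /mnt/videos.txt to the end of the bootstrap script would then process the whole list, where /mnt/videos.txt is again just an example path.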

Launch the Instance with the Script

Now we have the script (let’s call it transcoding_bootstrap.sh) and the AMI (ami-a123beef). We are ready to launch it whenever there are videos to be transcoded.

To launch, we’ll execute the following command:

ec2-run-instances ami-a123beef -f /path/to/transcoding_bootstrap.sh

This will launch an instance from our Fedora 8-based AMI, install FFmpeg, download the videos, and transcode them. Once the process has completed, we can terminate the instance with the ec2-terminate-instances command.
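
Terminating by hand means someone has to notice that the job is done. An alternative is to have the bootstrap script halt the machine as its last step. As far as we know, halting the operating system of an S3-backed instance like the one bundled here ends the instance, but please verify that in your own account before relying on it. The sketch below uses a hypothetical HALT_CMD variable so the function can be dry-run without shutting anything down.

```shell
#!/bin/sh
# Hypothetical final step of the bootstrap script. HALT_CMD can be
# overridden (for example with "echo halting") for a dry run.
HALT_CMD="${HALT_CMD:-shutdown -h now}"

# finish_and_halt STATUS
# Reports the final status, then halts the machine.
finish_and_halt() {
    echo "transcoding finished with status $1"
    # Left unquoted on purpose so the command and its arguments split.
    $HALT_CMD
}

# Example use at the very end of transcoding_bootstrap.sh:
# finish_and_halt 0
```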

Conclusions

EC2 provides the flexibility of launching a virtual server at a very attractive price point. However, if it is not used intelligently, you may end up paying as much as, if not more than, you would to host a physical server. Knowing that it’s possible and easy to bring up a server and tear it down will do wonders for your pocketbook. If an instance is idle 50% of the time, launching it on demand instead will save roughly 50% of the hosting cost. At Grio, we love to see you save money.
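
The savings are easy to estimate. The helper below is a rough sketch that scales a monthly always-on cost by the share of time the instance is actually busy. It uses the monthly figures quoted earlier in this post and ignores details such as Amazon’s billing granularity, so treat the result as an approximation. The function name on_demand_cost is our own.

```shell
#!/bin/sh
# on_demand_cost MONTHLY_COST BUSY_PERCENT
# Hypothetical estimate: the monthly cost if the instance only runs
# while it is busy. Ignores billing granularity.
on_demand_cost() {
    awk -v cost="$1" -v busy="$2" \
        'BEGIN { printf "%.2f\n", cost * busy / 100 }'
}

# A small instance ($70/month) that is busy half the time:
# on_demand_cost 70 50    prints 35.00
# An extra large instance ($550/month) that is busy 10% of the time:
# on_demand_cost 550 10   prints 55.00
```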
