Spock Loves Amazon Web Services

Though we don’t usually blog about technical details here, we’re so crushingly enamored with Amazon S3 and EC2 that we felt compelled to write a short post about them.

Since the first months of Spock, when we were busy prototyping early versions of the site, we’ve made heavy use of Amazon Simple Storage Service (S3). When you’ve got millions of profile pictures to display and each photo can have many thumbnail sizes, that’s a lot of data to store, manage, and serve. We started by serving the photos off our web servers, but that quickly became a management nightmare. We did some quick math and realized we’d save quite a bit of money by using S3 to serve our photos instead. Though the initial integration took some work, since we finished it we’ve probably spent less than 15 minutes a month thinking about photo storage. Since then we’ve used S3 for various other tasks, like serving our Spock Challenge data set, which is gigabytes in size and was heavily downloaded.
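For the curious, here is a rough sketch of how one might lay out photo thumbnails as S3 objects. This is an illustration only, not our actual scheme: the thumbnail sizes, the `photos/<profile id>/<size>` key layout, the bucket name, and the helper names `photo_key` and `photo_url` are all assumptions made up for the example.

```python
# Hypothetical thumbnail widths in pixels; the real sizes are not stated here.
THUMBNAIL_SIZES = (32, 64, 128)


def photo_key(profile_id: int, size: int, ext: str = "jpg") -> str:
    """Build the object key for one thumbnail of one profile photo.

    The key layout is an assumed convention: one "folder" per profile,
    one object per thumbnail size.
    """
    if size not in THUMBNAIL_SIZES:
        raise ValueError(f"unsupported thumbnail size: {size}")
    return f"photos/{profile_id}/{size}.{ext}"


def photo_url(bucket: str, profile_id: int, size: int) -> str:
    """Build a public URL for a thumbnail using S3's bucket-as-hostname form."""
    return f"https://{bucket}.s3.amazonaws.com/{photo_key(profile_id, size)}"


if __name__ == "__main__":
    # "example-bucket" is a placeholder bucket name.
    print(photo_url("example-bucket", 42, 64))
```

With a predictable key per photo and size, the web servers only need to emit a URL; storing and serving the bytes is left entirely to S3.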

Aside from S3, we’ve also made good use of EC2. Amazon’s Elastic Compute Cloud (EC2) is a great way of getting computing power when you need it, and more when you need more. And really, when you’re crawling, classifying, indexing, and extracting people info from the entire World Wide Web, you definitely need more. EC2 is great because you can create one virtual machine with all the software you want, and then instantly clone it across hundreds of virtual boxes. The more we use it, the more we begin to wonder why we bother to operate any physical machines ourselves.

So yeah, we’re pretty psyched about Amazon Web Services, and we’re looking forward to using whatever other neat services they roll out in the future.

