setting cache metadata on files in s3
i use Amazon’s AWS Console for uploading files to S3 and i always have to remember to manually add the Cache-Control metadata so browsers actually cache the files and i don’t get boned on bandwidth fees. i wish they would put in a nice default for that, but oh well. i searched around for a better tool that would also let me recursively update the metadata on all my S3 files, but the only one that seemed to have that feature and run on OSX cost $70! youch!
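for reference, the header we’re after looks like this when S3 serves the file (864000 seconds is 10 days, the same value the script below uses):

Cache-Control: max-age=864000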
well, it turns out to be easy to code up a script to do this in python using the boto library, so to save you all some time (and money!!) here’s what i ended up writing :) it recursively runs through all the objects in all the buckets on an S3 account and sets the Content-Type and Cache-Control for JPG and PNG files.
from boto.s3.connection import S3Connection

connection = S3Connection('aws access key', 'aws secret key')

buckets = connection.get_all_buckets()
for bucket in buckets:
    for key in bucket.list():
        print('%s' % key)

        # only touch images; skip everything else
        # (lowercased so .JPG and .PNG match too)
        if key.name.lower().endswith('.jpg'):
            contentType = 'image/jpeg'
        elif key.name.lower().endswith('.png'):
            contentType = 'image/png'
        else:
            continue

        # Content-Type has to be re-sent along with the new
        # Cache-Control, otherwise S3 resets it on copy
        key.metadata.update({
            'Content-Type': contentType,
            'Cache-Control': 'max-age=864000'  # 10 days
        })

        # copying a key onto itself with replacement metadata is
        # how you update headers on an existing S3 object
        key.copy(
            key.bucket.name,
            key.name,
            metadata=key.metadata,
            preserve_acl=True
        )
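if you want to sanity-check that the headers actually took, boto exposes them on the key object once you fetch it fresh. a quick sketch, reusing the same connection from above (the bucket and key names here are just placeholders, swap in your own):

bucket = connection.get_bucket('some-bucket')
key = bucket.get_key('some-image.jpg')
print(key.cache_control)  # should print 'max-age=864000'
print(key.content_type)   # should print 'image/jpeg'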