Migrating the Drupal public filesystem to S3

Problem space

Solution space

How about moving the Drupal filesystem up to The Cloud? One of Amazon's earliest products in AWS was the Simple Storage Service (S3), and one of its core use cases is serving public files like images for websites, which removes the need to store the assets yourself and to spend the (admittedly minimal) compute resources serving them.

We had both of the issues outlined above and have just completed a migration of all our files up to S3, so I thought I'd write down some discoveries.


s3fs module

This is a rather unfortunately named module, since there is another open source project out there with the exact same name. I knew of it first and so assumed that this module had something to do with that, but that's not the case. Btw, I learned this in Steven Merrill's excellent session at DrupalCon in May, check out the video if you're still with me.

In a nutshell, s3fs hijacks the Drupal filesystem. You can put both your public and private filesystems up there, which is simple to do since S3 has a very rich permissions feature set. Just don't deliberately make your private filesystem “public” and you're set. The module then rewrites any URLs that would have pointed to assets on your local filesystem so that they point to their new location in S3.
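To make the URL rewriting concrete, here is a minimal sketch of the kind of mapping involved. It is an illustration, not the module's actual code: the function name `public_uri_to_s3_url` is hypothetical, the bucket name is a placeholder, the key prefix is assumed to match the one used in the sync command in this post, and the URL is assumed to use S3's virtual-hosted style.

```shell
# Illustration only: map a Drupal public:// URI to the S3 URL it might be
# served from. BUCKET, KEY_PREFIX and the function name are assumptions.
BUCKET="NAME_OF_BUCKET"
KEY_PREFIX="nameofsite.com/s3fs-public"

public_uri_to_s3_url() {
  uri="$1"
  # Strip the public:// scheme to get the path relative to the files dir.
  path="${uri#public://}"
  printf 'https://%s.s3.amazonaws.com/%s/%s\n' "$BUCKET" "$KEY_PREFIX" "$path"
}

public_uri_to_s3_url "public://images/logo.png"
```

The point is only that the path after `public://` is preserved and the host and prefix change, which is why the objects in the bucket need to mirror the local directory layout.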

The setup is pretty straightforward, so just a few observations.

aws s3 sync . s3://NAME_OF_BUCKET/nameofsite.com/s3fs-public/ --acl public-read
// from Drupal docroot
$ rsync -av --prune-empty-dirs --include='*/' \
  --include='*.jpg' --include='*.png' --include='*.svg' \
  --include='*.js' --include='*.css' --include='*.gif' \
  --include='*.woff' --include='*.ttf' --include='*.map' \
  --exclude='*' . ~/some/destination/dir

This says “gimme all those file types in the whole Drupal file tree and copy them over to some other dir”, which I can then sync up to S3 with the AWS CLI. Before running this, take the opportunity to delete all your local public:// files, because otherwise they'll get sucked up by this command as well. You won't need them anymore after the migration anyway.
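Before copying anything, it can help to preview which files the extension list would pick up. The sketch below uses `find` rather than rsync and is only an approximation of the rsync filter; the function name `list_static_assets` is made up for this example.

```shell
# List files under a directory (default: current dir) whose extension is
# in the same asset list as the rsync command. Hypothetical helper.
list_static_assets() {
  find "${1:-.}" -type f \( \
    -name '*.jpg' -o -name '*.png' -o -name '*.svg' -o \
    -name '*.js' -o -name '*.css' -o -name '*.gif' -o \
    -name '*.woff' -o -name '*.ttf' -o -name '*.map' \)
}

# From the Drupal docroot: count how many files would be copied.
list_static_assets . | wc -l
```

If the count looks much larger than expected, leftover public:// files are a likely culprit.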

// from ~/some/destination/dir
$ aws s3 sync . s3://NAME_OF_BUCKET --acl public-read

All in all it's fairly simple, and in theory it makes our setup much more portable between environments as well as vendors. Another excellent writeup of this module can be found here.
