Using AWS S3 with WordPress for Media Asset Storage

By | January 15, 2014

As you know, I recently migrated this blog from Google’s Blogger service to a self-hosted WordPress installation running on the AWS Cloud. Since then, I have been working on improving the resilience and scalability of the new platform by using a few additional AWS features.

One of the things I have changed is to move media assets, such as images, off the EC2 instance that runs the WordPress application so that they are hosted externally on AWS S3. This offers a few benefits. Firstly, offloading this aspect of the application reduces memory usage and processor load on the WordPress server. Secondly, it paves the way for future use of AWS Auto Scaling Groups to simplify scaling of the WordPress environment. Finally, it will also allow for future use of the AWS CloudFront content distribution network to reduce content load times.

None of this is strictly required for a very low-traffic site like this one, but for me it’s also a useful exercise in how to transition to an architecture that offers higher availability and scalability, which is a process that many people need to go through at some point.

So, how did I go about making the transition? Firstly, there are a couple of WordPress plugins that you can install to integrate your WordPress environment with AWS S3. These are as follows:

The wp-amazon-web-services plugin, which contains the AWS PHP libraries and manages access keys. This is required by other AWS plugins. You can find this on GitHub here.

The Amazon S3 and CloudFront plugin, which copies files to Amazon S3 as they are uploaded to the Media Library and optionally allows you to configure Amazon CloudFront for faster delivery. This is available from wordpress.org here.

Once you have installed the two plugins you need to provide your AWS credentials to allow the plugin to access AWS S3 to create objects that will hold your images once they have been uploaded to your media library. In order to do this you need to store some AWS access key credentials in an unencrypted file called ‘wp-config.php’ on your WordPress server.

This brings us neatly to the subject of this post, which is how to use AWS Identity and Access Management, or IAM, to restrict the permissions that you allocate to an application, assigning minimal permissions so that it can only access specific AWS services or resources or perform certain actions. This means that if anyone unauthorised gets hold of the access keys that are stored in ‘wp-config.php’, they will be restricted to performing the actions defined within the IAM policies that apply to those keys. They will not be able to do anything else with your account, for example launching high-cost EC2 instances, or reading or deleting data held within other S3 buckets.

In this specific case the WordPress S3 plugin needs to be able to list the buckets that exist within my S3 account in order to allow one to be selected to hold my images and to be able to create objects within the one specific bucket that I am going to use to host the media assets for my WordPress site.
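To make the idea of minimal permissions concrete, here is a rough Python sketch of how an allow-list of this kind behaves. It is a deliberately simplified model, not how IAM actually evaluates policies (real evaluation also covers explicit denies, conditions and more), and the bucket name `example-wp-media` is a made-up placeholder.

```python
from fnmatch import fnmatchcase

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-wp-media"

# Two Allow statements matching the permissions described above:
# list all buckets, and act on objects inside one specific bucket.
STATEMENTS = [
    {"Effect": "Allow", "Action": ["s3:ListAllMyBuckets"], "Resource": ["*"]},
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": [f"arn:aws:s3:::{BUCKET}/*"]},
]


def is_allowed(action, resource, statements=STATEMENTS):
    """Return True if some Allow statement matches both action and resource.

    Simplified model: anything not explicitly allowed is refused. Real IAM
    evaluation also handles explicit Deny, conditions and other policy types.
    """
    for stmt in statements:
        if stmt["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, pat) for pat in stmt["Action"])
        resource_ok = any(fnmatchcase(resource, pat) for pat in stmt["Resource"])
        if action_ok and resource_ok:
            return True
    return False


if __name__ == "__main__":
    checks = [
        ("s3:ListAllMyBuckets", "*"),
        ("s3:PutObject", f"arn:aws:s3:::{BUCKET}/2014/01/logo.png"),
        ("ec2:RunInstances", "*"),
        ("s3:GetObject", "arn:aws:s3:::other-bucket/secret.txt"),
    ]
    for action, resource in checks:
        print(action, resource, is_allowed(action, resource))
```

In this model, uploading to the chosen bucket is permitted, while launching EC2 instances or reading from a different bucket is refused, which is the behaviour you want from a leaked set of keys.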

Here’s a guide on how to achieve this using the AWS Management Console.

Firstly, navigate to the IAM Management Console at https://console.aws.amazon.com/iam/. Once you have this open in your browser, select ‘Users’ underneath the detail heading in the left-hand panel and click the ‘Create New Users’ button that appears.

Next, enter the username that you want to use. It’s a good idea to use something memorable so that you can remember which users are associated with which applications. In this instance I used ‘wp-s3-user’. Make sure that ‘Generate an access key for each User’ is checked and click the ‘Create’ button at the bottom. This will open a pop-up confirming that your user has been created. From here you can either show the security credentials, so that you can copy and paste them into a text file, or download them as a comma-separated text file. Make sure you do one of these, because you are going to need the credentials later to insert into your wp-config.

Close the pop-up and you will be returned to the user list where you will be able to see the new user that you have just created listed. Next click the checkbox next to the newly created user, select the permissions tab in the frame beneath the user list and click on the ‘Attach User Policy’ button.

You will now see three options for creating your user policy. Select ‘Custom Policy’ here. We are going to perform this step twice because we are going to create two distinct policies, one for listing the S3 buckets in the account and one for allowing the new user to make changes to a specific S3 bucket.

Enter a policy name for the first policy, which will allow the user you created to list all of the S3 buckets under your account. I used ‘AllowListAllMyBuckets-wp-s3-user’. Then copy and paste the policy document below into the Policy Document field:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowListAllS3Buckets",
          "Effect": "Allow",
          "Action": [ "s3:ListAllMyBuckets" ],
          "Resource": [ "*" ]
        }
      ]
    }

Click ‘Apply Policy’ to save the first policy. Now select ‘Custom Policy’ again to enter the second policy, which will allow the newly created user to make changes to the specific S3 bucket that you will be using to store the media assets for your WordPress application. I named this policy ‘AmazonS3FullAccesstoBucket-<BUCKET NAME>-wp-s3-user’. Once again, cut and paste the policy document below into the Policy Document field, replacing <BUCKET NAME> with the name of the bucket you are going to use, but leaving the /* on the end because the policy needs to be applied to all objects in the bucket.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "s3:*",
          "Resource": "arn:aws:s3:::<BUCKET NAME>/*"
        }
      ]
    }
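If you would rather script these steps than click through the console, the same setup could be sketched in Python roughly as follows. This is a minimal sketch, not something I ran for this post. It assumes you pass in a boto3 IAM client (for example `boto3.client("iam")`) that is already configured with credentials allowed to manage IAM users. The function names and the bucket name in the usage comment are hypothetical.

```python
import json


def list_buckets_policy():
    """Policy document allowing the user to list all buckets in the account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListAllS3Buckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": ["*"],
            }
        ],
    }


def bucket_objects_policy(bucket):
    """Policy document allowing all S3 actions on objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def create_wp_s3_user(iam, user_name, bucket):
    """Create the user, attach both inline policies, return its access key.

    `iam` is assumed to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    iam.create_user(UserName=user_name)
    iam.put_user_policy(
        UserName=user_name,
        PolicyName=f"AllowListAllMyBuckets-{user_name}",
        PolicyDocument=json.dumps(list_buckets_policy()),
    )
    iam.put_user_policy(
        UserName=user_name,
        PolicyName=f"AmazonS3FullAccesstoBucket-{bucket}-{user_name}",
        PolicyDocument=json.dumps(bucket_objects_policy(bucket)),
    )
    key = iam.create_access_key(UserName=user_name)["AccessKey"]
    return key["AccessKeyId"], key["SecretAccessKey"]


# Example usage (not run here; requires boto3 and AWS credentials):
#   import boto3
#   key_id, secret = create_wp_s3_user(
#       boto3.client("iam"), "wp-s3-user", "example-wp-media")
```

Building the policy documents as Python dictionaries and serialising them with `json.dumps` also avoids a common copy-and-paste problem: typographic quotes creeping into the JSON and causing the policy to be rejected.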

That completes the IAM configuration. Next you need to access your WordPress dashboard and select the AWS Settings option from the left-hand menu. This will provide you with details of the changes that you need to make to your wp-config file to add the credentials that you stored earlier. Once you have edited and saved your wp-config, you can select the ‘S3 and CloudFront’ menu from the AWS menu in your WordPress dashboard and select the S3 bucket that you gave the WordPress application permission to access, and you’re done.

When you upload an image using the Media Library on WordPress you will see that the image appears in the S3 bucket a few minutes later and that any posts that you use that image in will reference the external URL where the image is stored on S3. Importantly, the plugin also creates resized images to use as ‘featured’ images and replaces references for the featured images with external S3 references.

Resized images in an S3 Bucket

The changes we have made mean that all media assets that you upload from this point onwards will be placed in an S3 bucket rather than on your WordPress application servers. However, it’s important to note that the plugin does not move any pre-existing images to S3. If you want to do this you either need to delete and re-upload the images, or manually upload the images to S3 and update your posts to reflect the new image locations. I would recommend the first option of re-uploading because this fixes the references for featured images as well as images that are used within posts.
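If you do go the manual route, updating the image references in your posts is essentially a prefix swap on the URLs. The Python sketch below shows the idea on a single piece of post content. The site address, bucket name and the assumption that uploaded objects keep the same `wp-content/uploads/...` path in S3 are all hypothetical, so check how your own objects are laid out before relying on anything like this.

```python
# Hypothetical values; replace with your own site and bucket details.
LOCAL_PREFIX = "http://blog.example.com/wp-content/uploads/"
S3_PREFIX = "https://example-wp-media.s3.amazonaws.com/wp-content/uploads/"


def rewrite_media_urls(post_html, local_prefix=LOCAL_PREFIX, s3_prefix=S3_PREFIX):
    """Point references to locally hosted uploads at the S3 bucket instead.

    Only URLs that start with local_prefix are changed; other links and
    images in the post are left untouched.
    """
    return post_html.replace(local_prefix, s3_prefix)


if __name__ == "__main__":
    before = (
        '<p><img src="http://blog.example.com/wp-content/uploads/2014/01/logo.png">'
        ' <a href="http://blog.example.com/about/">About</a></p>'
    )
    print(rewrite_media_urls(before))
```

Note that this only covers images referenced inside post content. It does nothing for featured images, which is one reason I prefer re-uploading.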

Thanks for reading, and feel free to ask any questions that you might have in the comments.

8 thoughts on “Using AWS S3 with WordPress for Media Asset Storage”

  1. Nick Abbey

    Nice post, we are looking at something similar but have many sites that exist with hundreds of megs of assets already stored locally. Are you aware of a mechanism to move that existing data to s3 AND change the url to the asset in the database?

    1. Ian Massingham Post author

      Hi Nick. There are a lot of comments and feature requests on the GitHub page for the plugin that address this topic. I would suggest taking a look there. It can certainly be done.

    1. Ian Massingham Post author

      That’s one way to do it, but all of the content that you move and/or upload would still need to be served out via the web server on your EC2 instance or instances, which isn’t the most scalable or cost-efficient way to approach it. It is a quick and easy way to scale the amount of storage that you have for media assets on low-traffic sites, but ideally you should look to completely offload serving the assets from your web/application instances and let S3 and/or CloudFront handle this traffic instead.

  2. Nick Abbey

    Thanks for offering solutions, I really appreciate the input. However, it looks like we’re going with a one-time push of pre-existing assets to the S3 bucket using the AWS CLI s3 sync command, and will use a rewrite rule in the site’s .htaccess to handle any requests for old media. Since we intend to continue using the Amazon S3 and CloudFront plugin to automate the uploads of new media, the rewrite rule has to take that into consideration and only rewrite GET requests, so that local uploads complete normally and the plugin keeps operating as expected. It’s a stopgap solution, and while it does provide a lot of flexibility, we intend to implement a more permanent solution in the form of SQL scripts that will modify the URLs within the database so that all URLs for “old” media point to the S3 bucket. At that point we can take the rewrite rule out of the .htaccess. As it stands, I’ve almost completed the implementation. I’m just hacking around with Apache’s THE_REQUEST variable to get the rewrite for GET requests working properly. I asked the ServerFault community for a little help with that, here:
    http://serverfault.com/questions/610000/apache-rewrite-using-the-request-to-rewrite-the-url-get-request-to-an-s3-bucket

  3. Tomislav Mavrovic

    I encountered a weird behaviour while trying to get this S3/CF to work with WP. Everything seems to be working on the AWS part. As I upload images in my WP via Add Media button, I see those images in my bucket along with resized versions, so that communication seems to be working fine.

    In AWS Amazon S3 and CloudFront settings, the right bucket is selected and under “CloudFront Settings” domain name I have the dxxxx cloudfront net domain.

    I’m using HawkHost to host WP files and in cPanel there is a Simple DNS Zone Editor where I made a User-Defined Record CNAME cdn website com for my dxxxx cloudfront net

    However, media files don’t have the cdn address, they have the dxxxx net one. Also, as soon as I activated the access key in wp config and added AWS plugin, the old images like blog logo disappeared. I can reupload it no problem, but why would they disappear?

    Is it safe to put the cname cdn website com address in CloudFront settings? Thanks!

  4. Ryan

    I found that my image meta captions/alt txt no longer works once this is installed and serving from S3. Is there a way to still get image captions?

