WordPress, Simple Press, and AWS

At the end of last month I set up a site based on WordPress with the Simple Press forum system for a project of two friends of mine and me. I decided to take advantage of the free Amazon AWS tier for this, so I set up a new account for the project with the appropriate domain name to get started immediately.

As a fan of Debian I went for the Debian 7 AMI, 64 bit, for the first AWS t1.micro instance. Since the applications to be installed would require a database, I opted to use the AWS RDS products rather than a locally installed MariaDB instance, mainly because I knew that over time I would add more servers, and a central database would make that much easier in the long run. Installing and setting up a WordPress system is straightforward, so I won't cover much of that here; when it comes to the DB settings, you simply enter the AWS RDS parameters. The first thing I usually do is move the wp-config.php file up one directory, as it really does not need to sit in the directory that is served to the world, and WordPress will still recognise the file in the new location. Note, however, that the directory one level up is the only place where WordPress will look for wp-config.php if it is not in its default location. Within AWS I registered an Elastic IP and pointed it to the new instance.
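
As for the RDS parameters mentioned above, the relevant part of wp-config.php ends up looking roughly like this; the endpoint, database name and credentials below are placeholders, the real values come from the RDS console:

// wp-config.php (excerpt) -- example values only
define('DB_NAME', 'projectdb');
define('DB_USER', 'projectuser');
define('DB_PASSWORD', 'change_me');
// The RDS endpoint is used as DB_HOST instead of localhost
define('DB_HOST', 'projectdb.xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com');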

Once all the required items were configured, I added and activated the Simple Press plugin and set up the forum itself, such as the forum groups, naming, etc. With that done, I installed the "W3 Total Cache" (W3TC) plugin to use its CDN function with AWS S3. The plan is to create an S3 bucket to which user avatars are uploaded from the machine, so that they can be served from the bucket instead. This requires enabling static website hosting in the bucket options, as well as a separate IAM user with AWS permissions and credentials to access the S3 bucket.

Only a minimal set of S3 actions is required; the policy below is enough for the W3TC plugin to work with the S3 bucket:

{
  "Statement": [
    {
      "Action": [
        "s3:ListAllMyBuckets"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Action": ["s3:GetObject","s3:PutObject","s3:PutObjectAcl","s3:PutObjectVersionAcl"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::BUCKET_NAME", "arn:aws:s3:::BUCKET_NAME/*"]
    }
  ]
}

Be careful not to enable the CDN option named "Force over-writing of existing files", otherwise the plugin will keep re-uploading existing files to the bucket at the interval specified in the CDN options, which can of course rack up your S3 usage and therefore the costs. I found this out the hard way after wondering why the S3 PUT/COPY/POST/LIST requests were rising so fast and so high.

Unfortunately the plugin doesn't quite do what I had expected of it. It uploads the items to the S3 bucket as per the specified CDN parameters, but it retains a copy of each file on the local disk and still requires it to be there. As an example: user X uploads avatar #A1; #A1 is uploaded to the Simple Press avatars directory and is immediately visible to the user. On its next synchronisation run, W3TC uploads the file to the S3 bucket, from where it is then served by rewriting the avatar URL. However, I noticed that the Simple Press profile overview always uses the local URL rather than the S3 one, which is a bit annoying, and also that when I delete the local file, the S3 copy is no longer served, even though it is still sitting in the bucket. So in the end the local file is always required, which is rather contrary to what I wanted and in the long run means I may need to add additional disk space.

For the first few days everything was functional, all good and no problems. But what if the server went down one day when I'm not readily available to fix it immediately, or what if a higher load of visitors one day caused the site to slow down?

So a second server would be required, of course. The easiest way is to spin up another t1.micro and configure it the same way as the first one; at $0.02 per hour (AWS bills only in US dollars) it's fairly cheap. However, within AWS you can also request spot instances: instances of the various available sizes at lower prices, based on availability and what customers are willing to pay, somewhat like a bidding system for prices and instances. If the current spot price is lower than your maximum bid, the server runs; if it rises above your bid, the instance is shut down. So I requested a spot instance and set the maximum price to the regular on-demand price of a t1.micro, $0.02. Checking the pricing history, the spot price for a t1.micro is usually $0.0061, so nice and cheap. Once in a while it has spiked to $0.08, four times the regular price, but it went back to $0.0061. According to the AWS history this has only happened a few times in the last month; in the two months before that it stayed stable at $0.0061. So for the most part it's stable enough, and as a secondary server it's good enough.
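
For reference, a roughly equivalent request can also be made with the AWS CLI; the AMI ID, key pair and security group below are placeholders:

# Request a single one-time t1.micro spot instance with a $0.02 bid
# (AMI ID, key name and security group ID are example values)
aws ec2 request-spot-instances \
    --spot-price "0.02" \
    --instance-count 1 \
    --type "one-time" \
    --launch-specification '{"ImageId":"ami-xxxxxxxx","InstanceType":"t1.micro","KeyName":"project-key","SecurityGroupIds":["sg-xxxxxxxx"]}'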

Once that server was started, set up and configured, and an Elastic IP attached to it, some form of load balancer is of course required to distribute the traffic. AWS offers the Elastic Load Balancer (ELB), which is good enough for what this site requires.
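
For reference, creating such a load balancer and registering the two web servers with the AWS CLI looks roughly like this; the name, availability zone and instance IDs are examples:

# Create a classic ELB listening on port 80 and register both web servers
# (name, zone and instance IDs are example values)
aws elb create-load-balancer \
    --load-balancer-name project-elb \
    --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" \
    --availability-zones eu-west-1a
aws elb register-instances-with-load-balancer \
    --load-balancer-name project-elb \
    --instances i-11111111 i-22222222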

Configure it for the required ports and the site, point the DNS (all handled in Route 53 on AWS) at the ELB, and the second server is serving the site. Again, make sure here that you have really configured everything. In my case I spent a while troubleshooting why the site would occasionally fail to load. In the end I visited each server by IP, only to realise that of course the RDS database needed to be granted access from the new machine as well. With that done, both servers finally worked fine.
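
The fix itself is just a matter of allowing the new machine to reach the RDS instance. Depending on the setup that is a security group change or a database grant; with classic RDS DB security groups the CLI call would look roughly like this (group names and account ID are example values):

# Allow the EC2 security group of the web servers to reach the RDS instance
aws rds authorize-db-security-group-ingress \
    --db-security-group-name default \
    --ec2-security-group-name web-servers \
    --ec2-security-group-owner-id 123456789012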

What still needed adjustment were the synchronisation of the forum avatars between the two machines and the load balancer stickiness. As mentioned above, if a forum avatar has been synced to the S3 bucket but is not available locally, it will still fail to show up. So if it is located on one server but not on the other, it will also only show up around 50% of the time. A synchronisation between both machines therefore had to be added as well, in this case done with rsync, which runs every two minutes via a cron job.
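
A minimal sketch of such a cron job, run on each web server and pulling from the other; the avatar path and the peer hostname are assumptions and depend on where Simple Press stores its uploads, and it assumes passwordless SSH between the two machines:

# /etc/cron.d/avatar-sync -- example only, path and peer host are assumptions
# Every two minutes, pull new avatars from the other web server
*/2 * * * * www-data rsync -az bb02:/var/www/site/wp-content/sp-resources/forum-avatars/ /var/www/site/wp-content/sp-resources/forum-avatars/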

This also means that when a user uploads an avatar, it may not immediately be present on the server the ELB forwards them to next. To combat this, I enabled stickiness on the ELB and set it to 5 minutes. This of course will not catch the edge cases, say someone is only on one server for 280 seconds, uploads the image, and upon expiry of the 5 minutes ends up on the other server, but for most cases it will be fine.
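
Stickiness can be set in the console, but the CLI form shows what is actually configured: a load-balancer-generated cookie with a 300 second lifetime, attached to the HTTP listener (load balancer and policy names are examples):

# Create a 5 minute (300 s) LB-cookie stickiness policy and attach it to port 80
aws elb create-lb-cookie-stickiness-policy \
    --load-balancer-name project-elb \
    --policy-name five-minute-stickiness \
    --cookie-expiration-period 300
aws elb set-load-balancer-policies-of-listener \
    --load-balancer-name project-elb \
    --load-balancer-port 80 \
    --policy-names five-minute-stickiness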

I'm currently investigating the options and pricing of keeping the S3 storage as it is, versus adding two further machines to provide an NFS service. However, that would open up a whole new can of worms, as at the end of the day those two NFS machines would also need to be synchronised one way or another. So far, from a cost perspective, S3 will likely remain cheaper until we exceed roughly 1.5 TB of storage; beyond that an NFS cluster may become more cost effective.

Once all this was done, another issue popped up: mailing. Initially all of the site's mails were relayed through one of my private mail servers for delivery, though of course this meant that the reverse DNS did not match the site's address.

I addressed this by adding another t1.micro instance, in this case a reserved instance. With a reserved instance you pay a certain amount upfront for a one- or three-year term for an instance type of your choice, and the subsequent hourly price is much cheaper, so over the course of the reserved period it works out cheaper than a regular on-demand instance. As this machine will be required all the time, I opted for this route, and will likely add a secondary MX machine in the near future, then via a spot instance.

Amazon offers the option to set a custom reverse DNS entry rather than having the default AWS one. This requires that an A record of the same name points to the machine for which you want the reverse DNS set. So in this case I needed to point the A record for example.com at the mailing machine, whereas up to now both example.com and www.example.com were pointed at the load balancer.
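
The resulting Route 53 records then look roughly like this; the IP and the ELB name are of course placeholders:

; Sketch of the relevant DNS records -- IP and ELB name are placeholders
example.com.      A      203.0.113.25   ; Elastic IP of the mail machine, matching the reverse DNS
www.example.com.  CNAME  project-elb-1234567890.eu-west-1.elb.amazonaws.com.  ; or a Route 53 alias to the ELB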

This means that all web traffic to example.com now needs to be redirected somehow. For a small task like this the Apache webserver would be overkill, so nginx was chosen instead. The configuration for the redirect is dead easy as well: remove the existing symlink from /etc/nginx/sites-enabled, create a new file in /etc/nginx/sites-available, enter (and adjust) the config below, symlink it into /etc/nginx/sites-enabled, and restart nginx. That is all that is required, and all incoming web traffic to example.com (the mail machine) is now forwarded to www.example.com (the web server cluster behind the ELB).

server {
    listen   80;
    server_name example.com;
    # preserve the request path when redirecting
    return 301 $scheme://www.example.com$request_uri;
}
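
The corresponding shell steps are the following; the file name redirect-example.com is just an example:

# Disable the default site, enable the redirect config and restart nginx
rm /etc/nginx/sites-enabled/default
ln -s /etc/nginx/sites-available/redirect-example.com /etc/nginx/sites-enabled/redirect-example.com
service nginx restart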

The mailing system itself has been configured using Postfix, Dovecot and a local MySQL installation, with ClamAV for mail virus scanning.
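
A few lines from main.cf sketch how the pieces hang together: Postfix looks up domains and mailboxes in MySQL and hands accepted mail to Dovecot over LMTP. This is only an illustrative excerpt, not the complete configuration, and the MySQL map file names are placeholders (the ClamAV hook is not shown):

# /etc/postfix/main.cf (excerpt) -- illustrative sketch only, map file names are placeholders
myhostname = example.com
virtual_mailbox_domains = mysql:/etc/postfix/mysql-virtual-domains.cf
virtual_mailbox_maps    = mysql:/etc/postfix/mysql-virtual-mailboxes.cf
virtual_alias_maps      = mysql:/etc/postfix/mysql-virtual-aliases.cf
# Hand accepted mail to Dovecot for delivery
virtual_transport = lmtp:unix:private/dovecot-lmtp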

There are still things which will need attention, such as a secondary MX server and a better way of syncing avatars and their availability. I was thinking of either modifying a plugin so that requested images are always rewritten to be served from S3, or using Apache rewrite rules, but in the long run those are just quick fixes and not a full solution.
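
The rewrite variant would be something along these lines on the web servers, sending avatar requests straight to the bucket; the avatar path and bucket name are assumptions:

# Hypothetical quick fix: redirect avatar requests straight to the S3 bucket
# (avatar path and bucket name are assumptions)
RewriteEngine On
RewriteRule ^/?wp-content/sp-resources/forum-avatars/(.*)$ https://BUCKET_NAME.s3.amazonaws.com/wp-content/sp-resources/forum-avatars/$1 [R=302,L]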

However, for now the systems are doing well, and the monitoring tool I use (OMD, the Open Monitoring Distribution) agrees with me, as you can see from the graphs.

Below are the graphs for memory used, CPU load and CPU utilisation for the first web server, the second web server and the mail server. The latter shows a bit more activity due to the regular ClamAV scanning.

[Monthly graphs of memory used, CPU load and CPU utilisation for the first web server (bb01), the second web server (bb02) and the mail server (bbm01).]
