EC2 Backup Ideas: SVN


April 10, 2008 by Tom Maiaroto (0 Comments)

So as many already know, EC2 doesn't save data if the instance is turned off...in fact, as of right now my data isn't backed up. I have an image, but it's dated, meaning this post won't be "safe" if something were to happen.

While I don't expect my instance to crash and while I'm not doing anything that would make it crash, you never know. So everyone is figuring out backup and failover solutions. In fact, it's a requirement for any production site.

So I'm going to jot down a list of possible solutions and tips over time so I can go back to figure out which to employ.

First, Subversion is your friend. I already have this site on SVN. In fact, I don't use FTP (bleh) any longer. All changes to this site are deployed via an SVN post-commit hook script, meaning that after I make a change and commit it to the repository, the hook updates this site with just the changes. This really saves a boatload of time.
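For anyone curious what that hook might look like, here's a minimal sketch in Python. It isn't my actual script. Subversion calls the post-commit hook with the repository path and the new revision number, and the hook just runs an update on the live working copy. The /var/www/site path and the function names are made up for illustration.

```python
import subprocess

# Hypothetical path to the working copy that serves the site.
WORKING_COPY = "/var/www/site"


def build_update_command(working_copy, revision=None):
    """Return the `svn update` command for the live working copy."""
    cmd = ["svn", "update", "--non-interactive"]
    if revision is not None:
        # Pin the update to the revision that triggered the hook.
        cmd += ["-r", str(revision)]
    cmd.append(working_copy)
    return cmd


def post_commit(repos, revision, working_copy=WORKING_COPY, run=subprocess.run):
    """Hook entry point. Subversion passes the repository path and the new
    revision number; only the revision is needed here."""
    return run(build_update_command(working_copy, revision), check=True)
```

The actual hooks/post-commit file in the repository would be a small executable that calls post_commit(sys.argv[1], sys.argv[2]). The run parameter is only there so the function can be tried out without touching a real working copy.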

We can use Subversion and its great hook scripts for more though. In addition to running an update command on the working copy that runs this site, we can also mirror our SVN repository to an S3 account. Not only are we backing up to S3 now (on each change to the site), but we're getting it all under version control.

The downside in my mind is that I make a bunch of changes (especially in development), so this will give us an increased number of requests, which can add up. I really think requests are the silent killer when it comes to S3. So perhaps instead of running on the post-commit, we can simply set up a cron job to sync the repository every day or at whatever interval makes sense.

Remember with SVN: It only updates what has changed, so the transfer in will be much less than if you sent all the files or even the whole changed file. SVN is just sending the changes.

So that's two options.

1. Keeping the site's code completely backed up from the minute it changes.

2. Backing it up on a daily/hourly/etc. basis.

It will completely depend on the level of risk you can or wish to take. I'll most likely (I haven't set this up yet) run a cron job every day. Also important to note: If there are no changes, nothing happens, so no transfer or requests are wasted on a quiet day. So this backup is efficient.
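The daily cron option could be sketched roughly like this in Python. I haven't set this up, so treat it as one possible approach under some assumptions: it uses svnlook youngest to find the newest revision and svnadmin dump --incremental to export only the revisions since the last sync. The function names are hypothetical, and where the last synced revision is stored and how the dump gets uploaded to S3 are left out on purpose.

```python
import subprocess


def youngest_revision(repos, run=subprocess.run):
    """Ask svnlook for the newest revision number in the repository."""
    result = run(
        ["svnlook", "youngest", repos],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def dump_command(repos, last_synced, youngest):
    """Return an incremental `svnadmin dump` command covering only the new
    revisions, or None when there is nothing new to back up."""
    if youngest <= last_synced:
        return None
    revision_range = f"{last_synced + 1}:{youngest}"
    return ["svnadmin", "dump", repos, "--incremental", "-r", revision_range]
```

A cron script would call youngest_revision, compare it with the revision it saved last time, and only run the dump (and the upload) when dump_command returns something. That is the "if there are no changes, nothing happens" part.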

Drawbacks: It won't copy over the database or any images/files that were uploaded outside of SVN. For example, I can attach an image to this post and SVN won't know about it, so it would be lost along with this post. A MySQL replication/backup solution would have to be figured out, along with some sort of rsync of the files (ignoring .svn directories) to catch the images. The rsync would back up the site's scripts as well, but it wouldn't put them under version control, which is a very nice thing to have. With SVN, in the event of a crash on a larger site, not only would there be recovery, but the history of all the changes that were made to the site would still be saved. Very handy if it was a scripting error that caused the instance to crash somehow. You could pinpoint it more easily.
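To cover those gaps, something along these lines might work. It's a rough Python sketch, not something I'm running: it builds an rsync command that skips the .svn directories and a mysqldump command that writes the database to a file. All the paths and the database name are placeholders, and the dump assumes credentials come from somewhere like a ~/.my.cnf file.

```python
import subprocess


def rsync_command(source, destination):
    """Archive-mode rsync that skips Subversion's .svn metadata directories.
    A trailing slash on source copies its contents rather than the directory."""
    return ["rsync", "-a", "--exclude=.svn", source, destination]


def mysqldump_command(database, output_file):
    """Dump one database to a SQL file (credentials are assumed to be
    supplied by the MySQL client configuration)."""
    return ["mysqldump", f"--result-file={output_file}", database]


def run_backup(source, destination, database, output_file, run=subprocess.run):
    """Run the file sync first, then the database dump."""
    run(rsync_command(source, destination), check=True)
    run(mysqldump_command(database, output_file), check=True)
```

For example, run_backup("/var/www/site/", "/backup/site/", "blog", "/backup/blog.sql") would catch uploaded images and the posts themselves, and the resulting files could then be pushed to S3 on the same schedule as the repository.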

Here's more info about how from the free SVN book.

Cleaning Up...


April 9, 2008 by Tom Maiaroto (0 Comments)

There will be some cleaning up going on to this blog application. I will also be changing up the way it looks, probably in line and on track with the templating system...which of course just uses CakePHP's "themes" approach.

The blog's functionality is pretty far along though, so I'm going to be working on some refinements to what I have now. More importantly, I'm going to be setting up load balancing solutions and taking care of other scalability tasks. This way I can move forward with a more prepared system.

The overall goal here, in case anyone was wondering, is to build a blog (media blog) application that is infinitely scalable and hopefully *impossible* to crash. Of course with the ability to extend the base code and have more than just a media blog in the form you see here.

Setup Blog on EC2


April 9, 2008 by Tom Maiaroto (0 Comments)

Ok, so I moved to Amazon EC2 from the old host, www.apthost.com. Apthost was great (and I still have them of course for a while), but oddly enough my sessions started having issues and I couldn't log in to the admin back end of my application...I thought I broke my code...naturally I checked it locally, then on another server of mine. It worked. So I'm going to assume (without spending a ton of time) that Apthost did something to their php.ini or other settings that created an issue.

There are a lot of problems like this with shared hosting and that's ultimately the reason why I'll never go back...once I move all my things off of Apthost, Lunarpages, and some other host that I really shouldn't have. Now, EC2 has its drawbacks of course as well. I hope to outline some of these and other topics related to EC2 in this blog along with posting updates about what I'm doing with Minervablog, which is of course a media blog. Maybe I'll even have some screencasts. Important to note though is that Minervablog isn't dead in the water (for the very few that have seen it so far) and it is probably going to take on some new changes due to EC2...I can now program it to infinitely scale and test that ability out without incurring a ton of charges. Then who knows, you might see an AMI out there of Minervablog =)