Amazon S3 Backups – Large Files From VPS To S3 Using s3cmd

As you’ve hopefully read, I’m pretty anal about backing up my computer and my websites.  I’ve been in the market for a second online location for my backup files.  Over the years I’ve heard fantastic stories about Amazon S3 being a wonderfully inexpensive and reliable storage system, and I’ve been anxious to give it a try.  The fact that you can get 5 GB of storage for free made it even more tempting.  Unfortunately I couldn’t find any easy way to FTP, SFTP, SSH, rsync, or otherwise get files over to my Amazon S3 buckets, and my tech chops are relatively limited 🙁

Fortunately a couple of brilliant people pointed me in a direction that led me to a really great option for storing my backups on S3!  Here are some links, details, and steps that will hopefully help others in their search!

Web Resources:

Basic Setup / Process:

  1. Use a combo of the daily MySQL backup script and other backup processes to create backup files of my databases, sites, etc.
  2. Copy (sync) the backups to Amazon S3 using s3cmd
  3. Auto-archive backups older than 60 days from Amazon S3 to Amazon Glacier (see the lifecycle rule sketch after this list)
  4. Delete local server backups older than 10 days.
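
A note on step 3: the Glacier archiving isn’t something s3cmd does on its own – it’s handled by an S3 lifecycle rule on the bucket, and the easiest place to create one is the bucket’s Lifecycle settings in the S3 console.  If your version of s3cmd is new enough to include the setlifecycle command (check s3cmd --help), you can also apply the same rule from the command line with an XML file.  Here’s a rough sketch – the rule ID is just a placeholder and the empty Prefix applies the rule to the whole bucket:

cat > lifecycle.xml <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>archive-old-backups</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>60</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
EOF

s3cmd setlifecycle lifecycle.xml s3://mybackupbucket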

I get 5 GB of storage free on S3 for the first year.  At most I’ll probably use an additional 5 – 10 GB, but even that is only about $1 – $2 a month!  I’ll then back up maybe 10 – 30 GB to Glacier, which is a whopping $0.01 per GB… YES, a PENNY PER GIG!

Total monthly backup cost: About $3.00

You can do your own pricing calculations using Amazon’s calculator here: http://calculator.s3.amazonaws.com/calc5.html

How I set it all up:

First I read and watched all the resources listed above.  I also have a basic knowledge of PuTTY and unix commands.

Next I signed up with Amazon S3 and created a bucket (there’s plenty of info on the web for this process; I suggest searching YouTube).

After I was set up with Amazon I installed s3cmd on my VPS… well, I had my VPS techs do it, but here are the commands they used:

bash-3.2# cd /etc/yum.repos.d
bash-3.2# wget http://s3tools.org/repo/RHEL_5/s3tools.repo
bash-3.2# yum install s3cmd
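
One step worth noting that isn’t shown in the commands above: before s3cmd can talk to your buckets it needs your AWS access key and secret key.  The interactive configuration walks you through entering them, saves the result to ~/.s3cfg, and offers to run a quick connection test at the end:

s3cmd --configure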

Next I did some tests to make sure I could access S3 using s3cmd by running simple commands like:

s3cmd ls s3://mybackupbucket
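
If you want to go a step further than listing the bucket, you can push a single throwaway file up, confirm it’s there, and then remove it (the file name here is just an example):

s3cmd put /home/account/backups/test.txt s3://mybackupbucket/
s3cmd ls s3://mybackupbucket
s3cmd del s3://mybackupbucket/test.txt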

I then decided to dive in and sync one of my huge backup directories with a bucket on S3:

s3cmd -r sync /home/account/backups s3://mybackupbucket
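
A tip if you’re nervous about what a sync is going to transfer: s3cmd has a --dry-run flag, so you can preview the file list before anything actually moves:

s3cmd -r sync --dry-run /home/account/backups s3://mybackupbucket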

Next thing I knew, I had files flying from my VPS over to Amazon S3!  The only thing that surprised me was that the speeds weren’t nearly as fast as I would have liked or expected.  I was transferring files at an average rate of 250 kB/s.  I’m not sure if that’s a limit on Amazon’s side or my VPS.  I did notice that I could open up additional PuTTY sessions and run more sync commands, each of which would also run at 250 kB/s, so multiple syncs running in parallel didn’t seem to slow each other down.

Right now I’m manually running the sync command, but my plan is to set up a cron job that will run the following commands nightly (a sketch of the script and crontab entry follows the list):

  • My daily MySQL backup script
  • Sync with Amazon S3: s3cmd -r sync /home/account/backups s3://mybackupbucket
  • Delete Old Files: find /path/to/files* -mtime +5 -exec rm {} \;
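
Here’s a rough sketch of what that cron setup could look like.  The script path, backup script name, and bucket name are placeholders; I’ve used a 10-day retention window to match step 4 of my process above (the find example in the list uses +5, so set -mtime to whatever you actually want to keep):

#!/bin/bash
# /home/account/scripts/nightly-backup.sh  (hypothetical path)

# 1. Run the daily MySQL backup script (replace with your own script)
/home/account/scripts/mysql_backup.sh

# 2. Sync the backup directory up to Amazon S3
s3cmd -r sync /home/account/backups s3://mybackupbucket

# 3. Delete local backups older than 10 days
find /home/account/backups -type f -mtime +10 -exec rm {} \;

And a crontab entry (added with crontab -e) to run it every night at 2:30 AM and log the output:

30 2 * * * /home/account/scripts/nightly-backup.sh >> /home/account/logs/nightly-backup.log 2>&1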

I’m far from being an expert with this stuff… which may actually help others who are at the same technical level that I am.

So, what do you think about this backup process and using this system for storing large files on Amazon S3?   If you have any thoughts or comments on what I’m doing or how I’m doing it, please post them below!

UPDATE 4/7/14:
The backup system is working great!  My charges for the past few months have been:

  • November: $0.12
  • December: $0.22
  • January: $0.35
  • February: $0.59
  • March: $0.80

That’s CRAZY!

Here’s a screenshot of my most recent bill:

[Screenshot: AWS bill showing S3 and Glacier charges]

You can see that most of the charge is from my usage in Glacier. What’s crazy is that Amazon just announced a drop in storage pricing!

UPDATE 5/20/15:
Here are my charges over the past year:

[Screenshot: AWS charges over the past year]

I’ve got a TON of old daily backups from a year+ ago that I probably don’t need to keep, but deleting them is almost not worth the effort when it would only save $1 – $2 a month.

 
