Get Heroku pgbackups captures every ten minutes for free! (almost)
You might have heard about the recent AWS outage. Needless to say, the outage affected many websites, as well as many services that piggyback on AWS's cheap cloudy goodness. Heroku is one such service, and it also happens to be the service of choice for the startup I work for. While I applaud the herculean efforts by both Amazon and Heroku to manage and mitigate the disaster, the whole thing caught us a bit with our pants not properly around our proverbial waists. It turns out Heroku's database backup feature, pgbackups, is not a one-click automatic affair.
Before I say anything else, I do want to mention that Heroku is AWESOME. If you are a small company that does not have the resources to manage your own racks of computers somewhere, then I would say Heroku is the best-of-breed cloud-based Rails host.
One of the few things I found a bit confusing, though, is their pgbackups system. You add the free "daily backup" feature with one click, take a look at the docs, and everything looks straightforward and simple. The catch is that adding the feature to your app does not in any way schedule it to actually back anything up. To the best of my knowledge, you basically need to use the Heroku cron service and put a call to capture a backup in there. Maybe this is obvious to some, but given how effortlessly and automatically everything else works on Heroku, I think it is easy to see how it could be overlooked.
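For reference, the scheduled capture itself can be a tiny rake task. This is just a minimal sketch, assuming the Heroku cron add-on's convention of running a task named cron on its schedule, and assuming the heroku gem is available and authenticated; the app name is a placeholder you'd swap for your own:

# lib/tasks/cron.rake
# Minimal sketch: the Heroku cron add-on invokes `rake cron` on its
# schedule. The app name below is a placeholder.
task :cron => :environment do
  puts "capturing pgbackup..."
  `heroku pgbackups:capture --expire --app your-heroku-app-name-here`
end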
Either way, no backups were being captured. AWS went down, Heroku went down, and our site went down.
Over 48 hours later, the majority of services had been restored, except for some sites, like ours, that were using the shared database system. One of the Heroku devs asked if we had a backup we could restore from, as this would let us be back up immediately rather than waiting for Amazon to restore the EBS volumes that our data happened to be sitting on. As already mentioned, we did not.
Fortunately everything did eventually come back to life and no data was lost, but what a gut-wrenching scare it was. I vowed to implement very aggressive backing up of said data.
The solution I came to was to use the staging server we still had around from before our move to Heroku (it's a Slicehost VPS) as a backup fetcher/storer. I wrote a pretty straightforward rake task that would:
- Initiate the capturing of a backup
- Download the backup
- Load the backup into the local database of our app on another machine outside of Heroku (and AWS/EC2)
- Keep around only a handful of backups at a time (so we don't accidentally run out of disk space)
- Let me know immediately if the backup did not get fetched for any reason
Here is the code for the rake task. It's not sexy, and it could probably be considerably better, but it gets the job done and was easy to write:
# Captures a Heroku pgbackup, downloads the dump, restores it locally,
# and prunes old dumps.
task :backup_and_load_db => :environment do
  require 'open-uri' # needed so open() can fetch the backup URL

  begin
    tmp_path = File.expand_path(File.join(Rails.root, 'tmp'))
    app = "your-heroku-app-name-here"

    puts "capturing a new backup..."
    capture_output = `heroku pgbackups:capture --app #{app} --expire`
    # parse the backup id (e.g. "b251") out of the capture command's output
    backup_id = capture_output.split("\n\n").first.split("---> ")[1]

    puts "creating url for backup: #{backup_id}"
    # strip the trailing newline the backticks leave on the URL
    backup_url = `heroku pgbackups:url #{backup_id} --app #{app}`.strip
    puts "created url: #{backup_url}"

    backup_file = File.join(tmp_path, "#{backup_id}.dump")
    puts "downloading dump file..."
    open(backup_url) do |src|
      File.open(backup_file, "wb") do |dst|
        dst.write(src.read)
      end
    end
    puts "download complete. saved to: #{backup_file}"

    puts "loading dump file to database: yourdb_#{Rails.env}"
    # this could be specified differently, but worked out for the places I needed to use this
    user = `whoami`.chomp
    # restore the database from the dump file
    restore_command = "pg_restore --verbose --clean --no-acl --no-owner -h localhost -U #{user} -d yourdb_#{Rails.env} #{backup_file}"
    puts restore_command
    `#{restore_command}`
    puts "load complete!"

    # find all existing dump files
    previous_dumps = Dir.entries(tmp_path).select { |f| f =~ /\.dump\z/ }
    # delete the oldest dump, keeping a max of 5
    if previous_dumps.size > 5
      File.delete(File.join(tmp_path, previous_dumps.sort.first))
      puts "removed oldest backup"
    end
  rescue Exception => e
    # if anything goes wrong, notify me
    # we were already using Hoptoad in our app;
    # if not using Hoptoad this could be just an email
    HoptoadNotifier.notify(
      :error_class   => "HerokuDatabaseBackupError",
      :error_message => e.message
    )
  end
end
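Before wiring it into cron, it's worth running the task once by hand to confirm the heroku gem is authenticated and pg_restore can reach your local database (the paths and env name here are placeholders):

$ cd /path/to/your/rails/root
$ RAILS_ENV=yourenv rake backup_and_load_db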
Then I run the thing with the following cron command:
PATH=/your/path:/stuff/here
*/10 * * * * cd /path/to/your/rails/root/ && RAILS_ENV=yourenv rake backup_and_load_db >> /dev/null 2>&1
For PATH, just type echo $PATH at the command line and copy and paste the result in there. Alternatively, you could put in the absolute path for rake (use 'which rake' to see where the binary lives). Notice the /dev/null bit at the end. This is very important: if you don't do this, whatever user is running the cron job will get its mailbox JAMMED with all of the task's output. If you're not super familiar with cron, I would highly recommend reading one of the many primers out there; it's a somewhat fussy feature in Unix-flavored land and is very easy to screw up (I know I did, repeatedly).
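If you'd rather keep the output around for debugging instead of throwing it away, a variant along these lines works too (the rake and log paths are assumptions; check 'which rake' on your own box):

PATH=/your/path:/stuff/here
*/10 * * * * cd /path/to/your/rails/root/ && RAILS_ENV=yourenv /usr/local/bin/rake backup_and_load_db >> /var/log/heroku_backup.log 2>&1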
With this working, we now get a full dump of our production database, made redundant by copying it to a machine outside of the Heroku ecosystem, every ten minutes, for free (minus the cost of the super small Slicehost box). Happy day.
-UPDATE-
Before running the task via cron you will need to switch over to that Unix user and authenticate it via $ heroku auth:login. Once authenticated, it should run fine as long as you don't change your password or something of that nature.
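Something along these lines, assuming the crontab belongs to a user named backup (the username is just a placeholder):

$ su - backup        # switch to whatever user owns the crontab
$ heroku auth:login  # prompts for your Heroku email and password once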