paperplane.io blog

Using wget to make any site into a static site

Tony Dewan September 18, 2013

I want to make something clear: Paperplane.io doesn't support any kind of server-side scripting language like PHP. But that's on purpose! We believe that many sites don't need that level of sophistication, especially if you speak HTML and CSS.

Lots of sites are built on tools like WordPress, though, and it can be a daunting task to convert a site from WordPress to static HTML and CSS. We certainly wouldn't want to tackle it by hand. Luckily, there's an easy way to back up any public website with a single command, using a tool called wget. Here's how.

The only caveat: you'll need to use the command line. If you're scared of that, don't be! We'll make it easy. (Also, these commands are for Mac OS X. If you're on Windows, some things will be different. Let us know if you need help!)

First step, start the Terminal:

If this is new: the Terminal is the command-line interface to your Mac. More info here.

You should see something that looks like this:

Type or copy the following command into the Terminal, making sure to replace YOURURL.com with the website you want to back up.

wget -p -P ~/Desktop/websitebackup --convert-links -m -nH http://YOURURL.com/

It will look something like this:

This will back up the site to a folder on your desktop called websitebackup.
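If the Terminal answers with "command not found" instead, wget is probably not installed; a stock OS X install does not include it. Here is a small check you can paste in first. The function name check_wget is made up for this post, and the install hint assumes you use Homebrew as your package manager, which is only one of several ways to get wget.

```shell
# check_wget is a hypothetical helper name, not part of wget itself.
# It reports whether a wget executable can be found on your PATH.
check_wget() {
  if command -v wget >/dev/null 2>&1; then
    echo "wget found at: $(command -v wget)"
  else
    # Assumption: Homebrew is installed. Other package managers work too.
    echo "wget not found. With Homebrew installed, try: brew install wget"
  fi
}

check_wget
```

Once it reports that wget was found, the backup command above should run.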

Let me explain a little bit about what's going on here. We're using a standard tool called wget, which you can see at the beginning of the line. We're then passing several flags (or options) to make sure it does exactly what we want. Here are a few of the options we've set and what they mean:

-p
Tells wget to get all the files needed to display the page (images, CSS, etc.)
-P ~/Desktop/websitebackup
Sets the path on your local machine for the output (the new static version of your site)
--convert-links
After the download is complete, converts the links in the downloaded documents to make them suitable for local viewing
-m
Turns on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings.
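-nH
Stops wget from creating a top-level folder named after the host, so the files land directly in the output folder
http://YOURURL.com/
The final parameter tells wget which URL to get.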

You can read all about the different options for wget at Wikipedia or gnu.org.
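If the short flags are hard to remember, wget also accepts a long form for each of them. The sketch below is the same desktop backup with every option spelled out, wrapped in a small shell function so you can reuse it. The name backup_site is made up for this post; the long option names are the ones wget documents for -p, -P, -m and -nH.

```shell
# backup_site is a hypothetical helper name. It runs the same backup as the
# command above, with each option written in its long form.
backup_site() {
  # --page-requisites       same as -p
  # --directory-prefix=...  same as -P
  # --mirror                same as -m
  # --no-host-directories   same as -nH
  wget --page-requisites \
    --directory-prefix="$HOME/Desktop/websitebackup" \
    --convert-links \
    --mirror \
    --no-host-directories \
    "$1"
}

# Usage (replace YOURURL.com with the site you want to back up):
# backup_site http://YOURURL.com/
```

The usage line is commented out on purpose, so pasting the block only defines the function and nothing is downloaded until you call it yourself.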

Now, if you want to back up directly to your Paperplane account, it's as simple as changing the path (-P) option. If, for example, your site's URL is blogbackup.paperplane.io, just use this command:

wget -p -P ~/Dropbox/Apps/paperplane.io/blogbackup.paperplane.io \
  --convert-links -m -nH http://YOURURL.com/
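If you back up more than one site this way, you could wrap the command in a function that takes the Paperplane site name and the source URL. This is only a sketch: paperplane_backup is a made-up name, and it assumes your Dropbox folder lives at ~/Dropbox as in the command above.

```shell
# paperplane_backup is a hypothetical helper, not something Paperplane ships.
# Arguments:
#   $1  your Paperplane site name, e.g. blogbackup.paperplane.io
#   $2  the URL of the site to copy, e.g. http://YOURURL.com/
paperplane_backup() {
  site="$1"
  url="$2"
  if [ -z "$site" ] || [ -z "$url" ]; then
    echo "usage: paperplane_backup SITE URL" >&2
    return 1
  fi
  # Assumption: Dropbox syncs from ~/Dropbox, as in the command above.
  dest="$HOME/Dropbox/Apps/paperplane.io/$site"
  wget -p -P "$dest" --convert-links -m -nH "$url"
}

# Usage:
# paperplane_backup blogbackup.paperplane.io http://YOURURL.com/
```

Calling it without both arguments prints a usage line and downloads nothing.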

This works well for converting sites for archival purposes, but what if you want to make something more editable? Jekyll is a fantastic tool for static blogging (we use it for this blog!). We'll write an in-depth post about it soon!