December 16, 2021

How to Download a Whole Website from the Wayback Machine

As I was looking for ways to download an entire website snapshot from the Wayback Machine, I found this article: 

How to Download Entire Website from the Wayback Machine

If, like me, you're on Windows, you're gonna have to download and install Wget before you can proceed. Here's a link to download it:

Windows binaries of GNU Wget

It comes as a single EXE file, so you're gonna have to copy it to your C:\Windows\ directory (any other folder listed in your PATH works too). From there, it'll be recognized as a command in CMD, and you'll be able to follow the step-by-step guide above to download a full site snapshot from the Wayback Machine.
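If you'd like to double-check that the copy worked, here's a minimal sketch that looks for wget on your PATH. It assumes you have Python 3 installed, and the function name find_wget is just something I made up for illustration.

```python
import shutil


def find_wget():
    """Return the full path of the wget executable, or None if it isn't on PATH."""
    # On Windows, shutil.which also tries the usual extensions such as .exe.
    return shutil.which("wget")


if __name__ == "__main__":
    path = find_wget()
    if path:
        print(f"wget found at {path}")
    else:
        print("wget not found; copy wget.exe into a folder listed in PATH")
```

If it reports that wget wasn't found, opening a fresh CMD window after copying the file is worth a try before anything else.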

Here's the command I used to download an old snapshot of my forum, dating back to 2005:

wget --recursive --no-clobber --page-requisites --convert-links --domains web.archive.org --no-parent http://web.archive.org/web/20060404024947/http://paranormalnetwork.net
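To make the command above easier to reuse for other sites, here's a rough sketch that rebuilds it piece by piece, with a comment on what each flag does. The helper names snapshot_url and wget_args are hypothetical; the flags and the URL layout are the ones from my command. It only prints the command, it doesn't run it.

```python
WAYBACK_PREFIX = "http://web.archive.org/web"


def snapshot_url(timestamp, original_url):
    """Build a Wayback Machine snapshot URL from a timestamp and the original site URL."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def wget_args(timestamp, original_url):
    """Return the wget command as a list of arguments."""
    return [
        "wget",
        "--recursive",        # follow links and download the pages they point to
        "--no-clobber",       # don't re-download files that already exist locally
        "--page-requisites",  # also fetch images, CSS and other files a page needs
        "--convert-links",    # rewrite links so they work in the local copy
        "--domains", "web.archive.org",  # only follow links on this host
        "--no-parent",        # never climb above the starting directory
        snapshot_url(timestamp, original_url),
    ]


if __name__ == "__main__":
    args = wget_args("20060404024947", "http://paranormalnetwork.net")
    print(" ".join(args))
```

If you want Python to launch the download as well, passing the same list to subprocess.run(args, check=True) should do it, assuming wget is on your PATH.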

The command itself works perfectly. The result isn't always great, though: most of the time, it seems the Wayback Machine only archived the site's homepage, so most links don't lead anywhere. I guess it depends on what platform the site is built on.

You'll obviously have more success if the site you're looking for is static. The less dynamic the site, the more likely you are to retrieve pages other than the homepage. 
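Before starting a long download, it might be worth checking which pages the Wayback Machine actually captured. Here's a sketch, not a polished tool, that asks the Wayback Machine's CDX search endpoint for the captured URLs of a domain. The function names are my own, and I'm assuming the JSON output keeps its usual shape, with the first row holding the column names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=50):
    """Build a CDX query listing distinct captured URLs for a domain."""
    params = {
        "url": domain,
        "matchType": "domain",       # this host and its subdomains
        "output": "json",
        "fl": "timestamp,original",  # only return these two columns
        "collapse": "urlkey",        # one row per distinct URL
        "filter": "statuscode:200",  # skip redirects and errors
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_captures(rows):
    """Turn the decoded JSON (header row + data rows) into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def list_captures(domain, limit=50):
    """Fetch and parse the capture list (needs network access)."""
    with urlopen(cdx_query_url(domain, limit)) as response:
        body = response.read().decode("utf-8")
    return parse_captures(json.loads(body)) if body.strip() else []
```

Calling list_captures("paranormalnetwork.net") should return one entry per captured page. If only the homepage shows up, a recursive download probably won't retrieve much more than that.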

Have fun!  

