Web Archiving: Playback Tools


Below I’ve listed some of the tools I found for playing back web archives. The two most popular were OpenWayback and PyWb. In the tests I ran on these two, PyWb came out faster and more accurate, but your experience may differ.

OpenWayback

OpenWayback is an open source project built on top of code used by archive.org for their Wayback Machine. The software provides an interface for browsing and viewing a web archive. It also comes bundled with software for indexing (W)ARCs. It is written in Java and is used by many organisations worldwide, although it seems to have been falling out of favour against PyWb in recent years.

PyWb

PyWb is a Python (2 and 3) web archiving toolkit for replaying web archives large and small as accurately as possible. It performs the same functions as OpenWayback. In my tests I found PyWb to be more accurate in its ability to play back web archives. Like OpenWayback, it also comes with software to index (W)ARCs. It also supports the Memento protocol as described below.
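To give a feel for what an index of (W)ARCs looks like, here is a minimal sketch that parses a single index line. It assumes the CDXJ layout (a sort-friendly URL key, a 14-digit capture timestamp, then a JSON block) and the sample line and its field values are made up for illustration, so check the index your own setup actually produces before relying on it.

```python
import json
from datetime import datetime, timezone


def parse_cdxj_line(line):
    """Split one CDXJ-style line into URL key, capture time and JSON fields."""
    # Assumed layout: "<urlkey> <YYYYMMDDhhmmss> <json>"
    urlkey, timestamp, payload = line.strip().split(" ", 2)
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return {"urlkey": urlkey, "captured": captured, "fields": json.loads(payload)}


# Hypothetical sample line, not taken from a real index.
sample = (
    'com,example)/ 20140127171200 '
    '{"url": "http://example.com/", "mime": "text/html", '
    '"status": "200", "filename": "example.warc.gz"}'
)

entry = parse_cdxj_line(sample)
print(entry["captured"].isoformat(), entry["fields"]["url"])
```

The playback tool uses entries like this to work out which (W)ARC file holds the capture closest to the date you asked for.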

ReplayWeb.page

ReplayWeb.page is a web app for viewing web archives. It comes as a website you can visit or a standalone Electron application. The tool supports the usual WARC/CDX files and is also in the process of supporting the newer WACZ format. It boasts the same feature set as its predecessor, Webrecorder Player, but also has new features such as being able to embed web archives in web pages.

NetCapsule (Deprecated)

NetCapsule is a proof of concept system for playing back web archive content using old browsers that would have been around at the time of the capture. It can play back content using several different browsers, including Mosaic, Netscape, Internet Explorer, Safari, Firefox and Chrome. It uses PyWb for archive playback.

Webrecorder Player (Deprecated)

Webrecorder Player is a desktop player for (W)ARC files. It is compatible with OSX, Windows and Linux and provides a very low barrier to entry for viewing (W)ARC files. Webrecorder Player can only load a single WARC at a time, so large collections are difficult or impossible to browse. It has been deprecated in favour of ReplayWeb.page, described above.

Memento

With Memento, you are able to access a version of a Web resource as it existed at some date in the past, by entering that resource’s HTTP address in your browser like you always do, and by specifying the desired date in a browser plug-in.
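Under the hood, the date is sent in an Accept-Datetime request header to a "TimeGate", which then points you at the closest capture. Here is a minimal sketch of building such a request in Python. The TimeGate address and collection name are hypothetical (I am assuming a local server that exposes a TimeGate at that path), and nothing is actually sent over the network.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_timegate_request(timegate_base, target_url, when):
    """Return the URL and headers for a Memento TimeGate request.

    `when` must be a timezone-aware datetime; it is converted to GMT
    because Accept-Datetime uses the standard HTTP date format.
    """
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return timegate_base + target_url, headers


# Hypothetical local TimeGate; swap in whatever your own server exposes.
url, headers = build_timegate_request(
    "http://localhost:8080/my-coll/",
    "http://example.com/",
    datetime(2015, 1, 1, tzinfo=timezone.utc),
)
print(url)
print(headers)
```

The resulting URL and headers can be handed to any HTTP client, for example urllib.request.Request(url, headers=headers).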

I’m unsure how widely the Memento protocol is supported, so it may be of limited use at this point. Should you wish to have a play, a list of things related to Memento can be found on GitHub.