Posts under this project (3)
Web Archiving: Formats
In this post we’ll look at the different formats that tend to be used around the topic of website archiving. This post is in no way an exhaustive...
16 February 2018 by Thomas Preece
Web Archiving: Playback Tools
Below I’ve listed some of the tools I found for playing back web archives. The two most popular tools that I found were OpenWayback and PyWb. Of the tests...
23 February 2018 by Thomas Preece
Web Archiving: Crawling Tools
Below I’ve listed tools I’ve come across for archiving live websites. All of these tools act as crawlers and vary in the quality they obtain and the amount...
2 March 2018 by Thomas Preece

I’m passionate about archiving the web, which I think is important work. While a trainee at BBC R&D, I organised a placement at BBC Archives to work on web archiving. I learnt a lot during that time and have since applied that knowledge to the projects I’ve joined. In particular, I always make sure there is a process in place to decommission web experiences and archive them. Most recently, this was in the StoryKit team. As StoryKit experiences are impossible to crawl, I worked on a dedicated tool for archiving them. It used inside knowledge of StoryKit to create an industry-standard WARC file that could be passed to BBC Archives.
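
For readers who haven’t met the format, here is a rough idea of what a WARC record looks like when it is written directly rather than produced by a crawler. This is not the StoryKit tool, only a minimal sketch using the Python standard library. The function name `build_warc_resource_record` and the example URL are hypothetical. A real archive would normally also include `warcinfo`, `request` and `response` records, digests, and usually per-record gzip compression.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "text/html") -> bytes:
    """Serialise one WARC/1.0 'resource' record (hypothetical helper)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        # WARC dates are UTC timestamps in ISO 8601 form.
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        # Content-Length counts the bytes of the record block only.
        ("Content-Length", str(len(payload))),
    ]
    # Version line, then named fields, then a blank line before the block.
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Every record is terminated by two CRLF pairs.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


# Example with a made-up URL and payload.
record = build_warc_resource_record("https://example.org/experience", b"hello")
```

Writing `record` to a file opened in binary mode gives a single-record, uncompressed WARC file. Established libraries handle the remaining details and are a better choice for anything beyond illustration.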