Ds Scholarship

Rescuing the Historical Record in a Digital World

Katie: Plus, we are archiving ProPublica’s data journalism apps, its complete catalog if we can. ProPublica is one of our partners in this work. They create really interesting, complex, and powerful websites that query a real-time database. One of them is titled “Are the hospitals near me ready to confront the coronavirus?” It lets you enter your zip code and see how full the hospitals are, which of course was very useful last winter.

ProPublica produces many applications like this, but there is no technology yet that can capture and archive such sites. We work with various partners and develop tools, and we believe that we will eventually be able to capture all of ProPublica’s data journalism applications.

What does the archiving process look like?

Katie: It’s not easy to look at a data journalism website and tell whether it’s archivable. We are working on a flowchart that will help digital archivists and data journalists understand exactly what they have built and which aspects can be preserved. Some things can be archived using Webrecorder, a dynamic, high-fidelity web archiving tool that can capture a lot, but it can present issues when moving the archives into library catalogs and making them available to researchers later. Sometimes it’s not until you reach the QA step and check the archived copy that you realize it didn’t capture important parts of the site.

Vicki: But our tool, ReproZipWeb, enables us to do server-side archiving. Anyone can use it; it’s free and open source. If you have access to a server where the production material, or a copy of it, is hosted, you first start the server, which engages the tool. The tool then keeps track of everything that happens on the server, including the software it touches, the data it uses, the database, the type of database, and so on. It captures a lot of the in-depth metadata required for active and continuous digital preservation. At the end of the process, you get a bundled file that is small and shareable and contains all the assets needed to restart the web application in different environments. Not only is it easy to archive, but it is also easy for others to reuse.

If we don’t have access to different computational environments, such as different operating systems and different servers, in the long run, much of that work becomes moot. If you don’t have a copy of Windows 95 but have a Windows 95 file and you open it now, it will look like Wingdings. Archiving software is an essential part of this work.

It seems counterintuitive that something posted online as recently as last year could really be in danger of getting lost. How widespread is the problem?

Katie: Paradoxically, there are books published 500 years ago that are much more stable and archivable than some of these dynamic sites. Sites can be exceptionally fragile, especially at news organizations such as Vox or Chalkbeat that have no legacy publication behind them. A lot of interesting data journalism created during COVID has already disappeared, and data journalists are sounding the alarm about losing their work. Digital media start-ups are incredibly volatile.
