The Internet Age that vanished

There are no yellowing, faded newspaper Web sites. Except for the Wayback Machine, which is the best we have but admittedly spotty, the whole dawn of Internet newspapers from the mid-1990s on could vanish, never to appear in a yard sale or as a treasure proffered on PBS’s Antiques Roadshow.

A database crash, a decision to shut down some old servers, or even some spirited housecleaning, and, blink, days, months, years of an electronic newspaper could be gone.

It’s already happening. The Wayback archive does have some early Knoxnews home pages, but many of the graphics and photos are either gone or have moved to a different location, so they no longer display. In our early days in Vignette, we aged stories off in a couple of weeks, including online-only stories for which there was no archive elsewhere. They are just gone. The organization of many static packages has been lost. Even if we still have them, we don’t know where they are. When we transferred our articles from our old Fast Forward Web publishing platform to our current Ellington/Django platform, the stories were ported, but all comments were lost.

It’s ironic that the digital forms of paper newspapers are making for a better historical record than the now often more robust online versions. For example, we have an archive solution (in fact, more than one) for the printed coverage of a dramatic jury trial, but the online version probably also contains important supporting documents as PDF files, such as motions and rulings, that would be of use to future researchers. The comments added to the story also provide some glimpse into the public’s view at the time of the events.

Other than posting on the Web site, there is no archiving done with future use in mind. There’s a good chance that over time the links will be broken as some new platform or technology comes along and changes everything.

Michael Miner explores this issue using the dormant site, noting that the newspaper archives being transferred to the Denver library system don’t include those on the Web site. It’s a good piece except for an odd aside about online comment management.

He writes:

The point is that real archiving’s not a business–it’s a public service. The digital newspapers of the early 21st century will be unknown in the 22nd unless they’re aggressively safeguarded. They won’t sit around in boxes until they’re shredded or burned. Simple neglect will destroy them. 

Do any newspapers have explicit archiving strategies for Web content?