On (im)permanence

I got asked a very interesting sort of computer question by the host of the Fourth of July party ashacat and I attended. Her question didn't have anything to do with a balky scanner or a wayward wireless router. She was asking me about preserving documents. Specifically, how to preserve electronic records being generated by an oral history project she's involved with in Suffield. My brother, who is Director of the Wethersfield Historical Society, was sitting with us. I think that my answers surprised them both. Here's what I said.

I told our host that first and foremost, once the project is done, she should secure a ream or two of acid-free paper and print everything out on a laser printer. Print several sets, get them bound with some kind of mechanical binding[1] and distribute them. One to the Suffield public library, one to the historical society, one to the Town Clerk (if he or she would have it), and others in the hands of the principal researchers.

Second, I advised that whatever file formats they were using for storing and editing their documents, they should save copies as either plain text (a.k.a. flat ASCII) or, at most, simple HTML (flat ASCII with <tags> in it).

Last, I said that if they had any color photographs, it would be in their interest to get black and white prints made.

Now, why did I say those things?

First, nothing digital is permanent. Hard drives fail. Old data gets lost, or people don't bother migrating it to new systems. File formats change. Material stored in proprietary binary formats (e.g. MS Word .doc format) is subject to stranding as software versions tick upward and programs come and go. Show me a priceless manuscript written in XyWrite and I'll show you something you need a lucky find in a computer museum to ever read again.

Second, storage media have finite life spans. Magnetic tape such as old mainframe round-reel or square cartridge "3480" tape is good for 20-30 years. Crappy consumer mag tape such as audio cassettes and VHS is good for maybe 20 years.[2] Let us not even talk about floppy disks. The manufacturers claim that recordable CDs and DVDs have a shelf life of 80 to 200 years, depending on the brand. No one really knows, because the technology simply isn't that old. CD-Rs and DVD-R/DVD+Rs have the same problem that limits the shelf life of color film and prints -- the organic dyes they are based on are chemically unstable. Your family photos from the 1970s are probably looking a little muddy by now. Even photographic slides, carefully stored, are subject to decay.[3]

So, what can you do if you want to keep something around for posterity?

We have been making paper long enough to know how to make paper that will be here for the long term. Acid-free cotton bond paper will survive for centuries with decent care. The black toner used in a laser printer or copier is a stable polymer -- a kind of plastic -- that will also be here for a long, long time.[4]

We have also been making photographs long enough to know that black and white negatives and prints -- when processed correctly -- will stand the test of time. The metallic silver that forms the image when a piece of exposed B&W paper is developed is chemically stable. The limiting factor is not the silver in the image but the paper itself. Which, again, we know how to make to archival standards. There are prints and negatives still in existence (and in fine shape, thankyouverymuch) from the US Civil War.[5]

Choosing a file format for your electronic information is a little trickier. For text and simple data, go with the simplest format you can. Flat ASCII (a.k.a. plain text -- .txt) has your best chance of being readable in 200 years (if the storage medium survives). For spreadsheet or database tables I'd suggest dumping copies out to comma-delimited (.csv) or tab-delimited text. And for the love of Dog, include a header row with the field names! Simple HTML is an OK choice because it is essentially flat ASCII with tags in it. A human reader in the 23rd century can weed out the <div class=head align=center> tags and recover the text. Again, providing that successive archivists have migrated the file forward from one storage system to the next over the years.
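
To make the header-row advice concrete, here is a minimal sketch in Python (my choice of language for illustration, nothing the project is known to use). The field names and the two rows are invented; the only real point is that the first line of the file names every column, so a future reader doesn't have to guess what each field means.

```python
import csv
import io
import sys

# Hypothetical interview log. These field names and rows are invented
# purely for illustration; substitute whatever the real table contains.
FIELDS = ["interview_id", "narrator", "recorded_on", "summary"]
ROWS = [
    {"interview_id": "001", "narrator": "Example Narrator A",
     "recorded_on": "2007-05-12", "summary": "Tobacco farming, 1940s"},
    {"interview_id": "002", "narrator": "Example Narrator B",
     "recorded_on": "2007-06-02", "summary": "Main Street shops"},
]


def write_table(out, fields, rows):
    """Write rows as comma-delimited text, header row first."""
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()    # the line that names every column
    writer.writerows(rows)  # values containing commas get quoted for us


def table_as_text(fields, rows):
    """Return the whole table as one plain-text string."""
    buf = io.StringIO()
    write_table(buf, fields, rows)
    return buf.getvalue()


if __name__ == "__main__":
    sys.stdout.write(table_as_text(FIELDS, ROWS))
```

To write a file instead of printing, pass an object opened with open("interviews.csv", "w", newline="") to write_table; the file name is, again, just an example. Open the result in any text editor and confirm that you can read it without the program that made it.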

Images and multimedia files present bigger challenges. I'd love to hear what people are doing to preserve sound and video for the long haul. For images I'd suggest sticking with the most common formats -- TIFF, JPG, or PNG. ashacat suggests that Encapsulated PostScript (.EPS) may be a good choice for preserving graphs and figures. Proprietary formats such as Photoshop .PSD files and especially camera-specific "RAW" formats are a Bad Plan.

If you've gotten the idea that after 18 years in the industry I'm pretty down on the survivability of electronic information, your intellect has not failed you. Paper books are just about the best archival means of storage we have come up with. Stone tablets or laser-engraved titanium sheets are more permanent, but in the real world of trade-offs between survivability and cost, a book or a binder is the most practical way for you and me to leave something lasting for those who will come long after we're gone.

[1] Use a plastic comb binding or 3-ring binder, etc. Glue or tape bindings are a no-go. You don't know what's in that glue (in terms of acids) or how long it will last.

[2] If you have any old VHS tapes kicking around that you want to save, get them copied to DVD. It's not permanent, but it's better than doing nothing.

[3] This is not to say that there is no such thing as an archival color print. The Ilfochrome (née Cibachrome) process is reputed to be very stable. It's also expensive and beyond the reach of the average amateur historian.

[4] Inkjet printing is in the same boat as burned CDs/DVDs and color film and prints. Even the modern stuff (such as HP's Vivera ink and photo paper combo) is only "better" than a drugstore color photo. Inkjets do not produce archival prints.

[5] Many glass-plate negatives from the war were sold off as window glazing for greenhouses. After the war, no one much wanted to see images of dead soldiers heaped up on battlefields, so thousands of plates wound up on greenhouse roofs. The ones that didn't get shattered over the years are as sharp today as they were in 1865.


6 comments
Jul. 15th, 2007 01:20 am (UTC)
If I could be so bold as to have an opinion...

Please do!

I'm several revs behind on the version of Photoshop I'm using (5.5), so I didn't know about DNG. Nice. I also didn't know that Adobe has handed PDF over to ISO. That's a very positive development.
Jul. 13th, 2007 05:41 pm (UTC)
I remember hearing a story on the radio about how some archivist in Washington D.C. decided to encode all the audio onto a vinyl or wax disk. VERY low fidelity, but good enough to hear by dragging a needle on the spinning disk. Limited to three minutes per side, but as long as you have a thin stick, a pin and a paper cup, you'll be able to make it out.
Jul. 15th, 2007 01:23 am (UTC)
Re: Audio
Interesting. Low-tech sometimes carries the day. A "you could actually do this" companion to those jokes about reading a CD with a magnifying glass and a calculator... :-)
Jul. 14th, 2007 09:04 am (UTC)
((blink)) I love it when my generic mental image of the world gets fantastic, unimaginable things added to it such as plates of dead soldiers used in greenhouse roofing. Whoa.

Here by way of matociquala. Appreciate the archival info. Interesting how impermanent all the purely digital things are.
Jul. 15th, 2007 01:27 am (UTC)
Glad to be of service. :-)

