Filing

mapmakr asked me some questions about file naming and long-term stability. He had had some problems with files that had both long file names (longer than the old 8.3 format) and "non-standard" characters (@, &, # and such). Here are my two-or-three cents on creating file names for maximum durability.

  • Go ahead and use long file names: they're just too damn useful. But don't go nuts. Keep them sane -- something like 32 to 64 characters max. You can run into odd limitations of directory structure depth and file name length where Windows will show you the file but will throw up its hands if you ask it to delete the file (observed on both 2000 and XP with NTFS volumes).

  • Make all file names all lower case. UNIX and Linux are strictly case-sensitive. Windows' case-sensitivity is iffy -- sometimes it is and sometimes it isn't. The safest, simplest, and most readable thing to do is to use all lower case characters.

  • No spaces! That's why ASCII gave you the underscore ( _ )! IMO, file names with spaces in them are more prone to name-related problems than 'space free' file names. Spaces are also anathema to easy handling of files in scripts and through CLI tools. Yes, even UNIX is getting better at coping with directory and file names with spaces in them, but just don't do it.

  • No punctuation characters other than the dot (the underscore and the dashes in dates get a pass as separators). Period. Pun intended.
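
If you'd rather have a script enforce the rules above than do it by hand, here is a minimal sketch in Python. The function name durable_name and the 64-character cap are my own illustrative choices, not anything standard, and it assumes you are happy to throw away non-ASCII letters and to keep the hyphen as a separator.

```python
import re

MAX_LENGTH = 64  # assumption: upper end of the 32-to-64 character range


def durable_name(name: str, max_length: int = MAX_LENGTH) -> str:
    """Return a lower-case, space-free, punctuation-free version of name.

    Hypothetical helper for illustration only.
    """
    name = name.strip().lower()
    # Runs of whitespace become a single underscore.
    name = re.sub(r"\s+", "_", name)
    # Drop everything except ASCII letters, digits, underscore, hyphen and dot.
    name = re.sub(r"[^a-z0-9_.-]", "", name)
    # Removing punctuation can leave doubled underscores; collapse them.
    name = re.sub(r"_+", "_", name)
    # Split off the extension so truncation never cuts it off.
    stem, _dot, ext = name.rpartition(".")
    if not stem:  # no dot at all, or only a leading dot
        return name[:max_length]
    keep = max(max_length - len(ext) - 1, 1)
    return f"{stem[:keep]}.{ext}"


if __name__ == "__main__":
    print(durable_name("My Map & Notes #3 (Final).TIF"))
```

Run on "My Map & Notes #3 (Final).TIF" this gives my_map_notes_3_final.tif. It only computes the new name; actually renaming files (and checking for collisions) is left to you.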


And now we come to the "for the love of Dog!" section. All of these deal with creating file names that will sort properly. Remember, computers are incredibly fast, and incredibly simple minded. We must do everything in our power to help them do what we want them to do.

  • For the love of Dog, if you're going to include a serial number in your file name, pad it with zeros! For example: file_001.tif vs. file_1.tif. Think of how many files you could possibly have in this series. Then double it. Then round up to the next power of ten. If that's 400 files, that becomes 800, which rounds to 1,000. So you should start with 0001. That way your files will sort out as 0001, 0002, 0010, 0152, 0407 etc. and not 1, 2, 25, 27, 3, 302, 34...

  • Also, for the love of all that is holy and good, use big-endian dates! This means you write today's date as 2007-11-19. The fourth of July was 2007-07-04 and New Year's was 2007-01-01. Christmas will be 2007-12-25. This is the more human-readable version. If you want to be compact about it you can skip the dashes. Boxing Day would thus become 20071226. Putting the month first (e.g. 11-19-07) means that all of the Novembers from all of the years in your file archive will be sorted in together. This is a complete pain in the ass, and not just for scripts but for other humans too. Put the year first, then the month, then the day.
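
Both sorting rules are easy to hand to a script. What follows is a rough sketch, not a finished tool: pad_width, serial_name and dated_name are names I made up for the example, and pad_width simply follows the "double it, round up to a power of ten" rule of thumb, counting the digits of that power of ten.

```python
from datetime import date


def pad_width(expected_files: int) -> int:
    """Digits to pad to: double the estimate, round up to a power of ten."""
    capacity = expected_files * 2
    exponent = 1
    while 10 ** exponent < capacity:
        exponent += 1
    # 400 -> 800 -> 1,000, and 1,000 has four digits, hence 0001.
    return exponent + 1


def serial_name(prefix: str, number: int, width: int, ext: str) -> str:
    """Build a name like file_0001.tif with a zero-padded serial number."""
    return f"{prefix}_{number:0{width}d}.{ext}"


def dated_name(prefix: str, day: date, ext: str, compact: bool = False) -> str:
    """Build a name with a year-month-day stamp, with or without dashes."""
    stamp = day.strftime("%Y%m%d") if compact else day.isoformat()
    return f"{prefix}_{stamp}.{ext}"


if __name__ == "__main__":
    width = pad_width(400)
    print(serial_name("file", 1, width, "tif"))
    print(dated_name("scan", date(2007, 11, 19), "tif"))
    print(dated_name("scan", date(2007, 12, 26), "tif", compact=True))
    # A plain text sort of unpadded numbers puts 302 before 34.
    print(sorted(["1", "2", "25", "27", "3", "34", "302"]))
```

The last line is the whole argument in miniature: a plain text sort compares character by character, so unpadded numbers and month-first dates come out in the wrong order, while padded numbers and year-first dates sort the way a human expects.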


And that, as they say, is pretty much that. If you follow these conventions your files will sort properly and you won't get into any bizarre twists if you are working cross-platform. FWIW, remember that this advice is free. :-)

Comments

mapmakr
Nov. 20th, 2007 01:56 am (UTC)
Thanks
Very useful thoughts - the basic rules for file naming were in the first class I took in mapmaking school. But since I'm an "old" mapmaker, I was starting to wonder if the world has changed.

So nice to know that ones and zeroes have not.