In the LiveJournal Macosx community, one user noted some odd behaviour in Mac OS X. When you get to the bottom of what's going on, it's an interesting insight into the way a couple of unrelated design decisions can combine to produce unexpected behaviour that's only really predictable in hindsight. There's a bit of a lesson in here.
The user in question was going through O'Reilly's Mac OS X Hacks book, and tried out the whoami command, which is there to tell you who you are. It produces the following output:
gnosis:~ cmiller$ whoami
cmiller
However, he also discovered that the undocumented command whoamI (with an upper-case I), gives much more interesting output:
gnosis:~ cmiller$ whoamI
uid=501(cmiller) gid=501(cmiller) groups=501(cmiller),
79(appserverusr), 80(admin), 81(appserveradm)
Furthermore, this enhanced command only seemed to be available in the bash shell. In tcsh, it was only available if you addressed it by its full path.
[gnosis:~] cmiller% whoamI
tcsh: whoamI: Command not found.
[gnosis:~] cmiller% /usr/bin/whoamI
uid=501(cmiller) gid=501(cmiller) groups=501(cmiller),
79(appserverusr), 80(admin), 81(appserveradm)
The explanation is pretty simple, if a bit long-winded.
- The HFS+ filesystem under OS X is case-preserving, but not case-sensitive. Thus, whoami and whoamI end up addressing exactly the same file.
- A user familiar with Unix will recognise the output of whoamI to be identical to that of the id command. Deeper investigation shows that the id, whoami and groups commands are all hard-links to the same binary, which checks the value of argv[0] to see what it should be running as. Since this is a Unix program, and thus is case-sensitive, it doesn't recognise that whoamI is a legitimate way to call it, and falls back on its default behaviour: id.
- The tcsh shell maintains an internal hash of the programs that are on your default search-path. Often, if you're messing with the contents of your path, you need to call the internal shell-command rehash to have it rebuild the hash. Once again, tcsh is a Unix program, and thus assumes case-sensitivity. whoamI isn't on its search-path, and thus it's not found unless you specify explicitly where to find the file. bash, on the other hand, either doesn't maintain such a hash, or doesn't trust it. It's quite happy to ask the filesystem if whoamI exists, and run it for you.
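The argv[0] check in the second point can be sketched roughly as follows. This is not the source of the real binary, only a minimal illustration in Python; the function name run_as is hypothetical, and the real program may dispatch quite differently.

```python
import os


def run_as(argv0: str) -> str:
    """Pick a personality for a multi-call binary from the name it was invoked by.

    Hypothetical sketch: the comparison is an exact, case-sensitive string match.
    """
    name = os.path.basename(argv0)
    if name == "whoami":
        return "whoami"
    if name == "groups":
        return "groups"
    # Any name the program does not recognise gets the default behaviour.
    return "id"


for invoked in ("whoami", "/usr/bin/whoamI", "groups", "id"):
    print(invoked, "->", run_as(invoked))
```

The filesystem is happy to hand over the same file for whoamI, but the string that arrives in argv[0] is still whatever the user typed, so the exact match fails and the default branch runs.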
So there it is. A series of rational design-decisions in four unconnected components combines to produce unpredictable results. So where's the lesson in all this for programmers?
Each component aside from bash is making an assumption about the behaviour of another component. The filesystem, by definition, is the final arbiter of whether two filenames are identical or not. On the other hand, tcsh and the whoami/id/groups binary each believe that they already know how the filesystem functions, and replicate little bits of its behaviour internally as optimisations and shortcuts.
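The difference between consulting a private copy and asking the filesystem can be shown with a toy model. Everything here is an assumption made for illustration: CaseInsensitiveDir stands in for an HFS+ directory, str.lower() is a crude substitute for the real case-folding rules, and the set is only a stand-in for whatever structure tcsh actually keeps.

```python
class CaseInsensitiveDir:
    """Toy directory: remembers names as entered, matches them ignoring case."""

    def __init__(self, names):
        self._by_key = {n.lower(): n for n in names}

    def listing(self):
        return list(self._by_key.values())

    def exists(self, name):
        return name.lower() in self._by_key


usr_bin = CaseInsensitiveDir(["whoami", "id", "groups"])

# Snapshot taken once (think "rehash"), then compared with exact string matches.
command_hash = set(usr_bin.listing())


def cached_lookup(name):
    """Answer from the private snapshot, assuming names are case-sensitive."""
    return name in command_hash


def direct_lookup(name):
    """Let the filesystem itself decide whether the name exists."""
    return usr_bin.exists(name)


print(cached_lookup("whoamI"), direct_lookup("whoamI"))
```

The snapshot is not stale; it simply answers the question with different rules from the ones the filesystem uses.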
So when the behaviour of the filesystem changes from the Unix default of being case-sensitive to the OS X default of just being case-preserving, it causes unpredictable behaviour in those applications.
It's really just a practical example of the value of Once and Only Once. A system is both more robust and more flexible if each question has an authoritative answer from only one place.
cool article..but i didn't get the difference b/w case-preserving and case-sensitive
Pramod:
Case preserving. If you enter a mixed case file name the case in which you entered it is preserved. For example "Firewall". Because it's not case sensitive, you cannot put a file called "FireWall" in the same location without over-writing the first.
If the filesystem is case-sensitive, then you can have those files in the same directory, and they will be regarded as different files.
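A small sketch may make the distinction concrete. ToyDir is a made-up model, not how any real filesystem is implemented; str.lower() is a simplification of real case folding, and keeping the first spelling when a file is overwritten is a modelling choice here.

```python
class ToyDir:
    """Toy directory; case_sensitive selects how names are compared."""

    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self.entries = {}  # lookup key -> (name as first entered, contents)

    def _key(self, name):
        return name if self.case_sensitive else name.lower()

    def write(self, name, contents):
        key = self._key(name)
        # If the key already exists, keep its spelling and replace the contents.
        stored_name = self.entries.get(key, (name, None))[0]
        self.entries[key] = (stored_name, contents)

    def read(self, name):
        return self.entries[self._key(name)][1]

    def listing(self):
        return sorted(n for n, _ in self.entries.values())


sensitive = ToyDir(case_sensitive=True)
preserving = ToyDir(case_sensitive=False)
for d in (sensitive, preserving):
    d.write("Firewall", "first")
    d.write("FireWall", "second")

print(sensitive.listing())   # two separate files
print(preserving.listing())  # one file, contents overwritten
```

In both cases the name comes back with the capitals it was given; the two modes differ only in whether "Firewall" and "FireWall" count as the same entry.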
Unix/Linux file systems are generally case-sensitive - Windows (recent-ish versions) and (I believe) Mac simply preserve the case, but otherwise don't care what it is.
OS/2 also had a case-preserving but not case-sensitive file system. I tried to rename a file to have the right case once to make some ported Unix tool happy; since I had a bash shell open I used mv. Unfortunately, if you tried to rename abc to ABC, the compatibility layer would check for the existence of ABC and delete it (since OS/2 doesn't allow you to delete a file by renaming over it), then fail because abc didn't exist any more. It wasn't fun.
Bash does have a path cache, but it works differently; when you ask it to run xyz, it remembers where it found the executable, so that when you invoke xyz again, it doesn't need to search the path.
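Roughly, that kind of cache could look like the sketch below. It is an illustration of the idea only, not bash's implementation: RememberingShell, locate and hfs_like_exists are hypothetical names, and the fake filesystem knows about just two files.

```python
class RememberingShell:
    """Toy shell that searches the path on first use and remembers the result."""

    def __init__(self, path_dirs, file_exists):
        self.path_dirs = path_dirs
        self.file_exists = file_exists  # callable standing in for the filesystem
        self.remembered = {}            # command name -> full path found earlier
        self.searches = 0

    def locate(self, name):
        if name in self.remembered:
            return self.remembered[name]
        self.searches += 1
        for directory in self.path_dirs:
            candidate = directory + "/" + name
            # The filesystem, not the shell, decides whether the file exists.
            if self.file_exists(candidate):
                self.remembered[name] = candidate
                return candidate
        return None


def hfs_like_exists(path):
    """Pretend filesystem that matches names without regard to case."""
    return path.lower() in {"/usr/bin/whoami", "/usr/bin/id"}


shell = RememberingShell(["/bin", "/usr/bin"], hfs_like_exists)
print(shell.locate("whoamI"))
print(shell.locate("whoamI"))
print(shell.searches)
```

Because the cache is only filled after the filesystem has said yes, it never has to guess how the filesystem compares names.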