Complexity: The mysterious 'whoamI' command.

In the Livejournal Macosx community, one user noted some odd behaviour in Mac OS X. When you get to the bottom of what's going on, it's a useful insight into how a couple of unrelated design decisions can combine to produce unexpected behaviour that's only really predictable in hindsight. There's a bit of a lesson in here.

The user in question was going through O'Reilly's Mac OS X Hacks book, and tried out the whoami command, which is there to tell you who you are. It produces the following output:

gnosis:~ cmiller$ whoami
cmiller

However, he also discovered that the undocumented command whoamI (with an upper-case I), gives much more interesting output:

gnosis:~ cmiller$ whoamI
uid=501(cmiller) gid=501(cmiller) groups=501(cmiller), 
    79(appserverusr), 80(admin), 81(appserveradm)

Furthermore, this enhanced command only seemed to be available in the bash shell. In tcsh, it was only available if you addressed it by its full path.

[gnosis:~] cmiller% whoamI
tcsh: whoamI: Command not found.
[gnosis:~] cmiller% /usr/bin/whoamI
uid=501(cmiller) gid=501(cmiller) groups=501(cmiller), 
    79(appserverusr), 80(admin), 81(appserveradm)

The explanation is pretty simple, if a bit long-winded.

The HFS+ filesystem under OS X is case-preserving, but not case-sensitive. Thus, whoami and whoamI end up addressing exactly the same file.
A user familiar with Unix will recognise the output of whoamI to be identical to that of the id command. Deeper investigation shows that the id, whoami and groups commands are all hard-links to the same binary, which checks the value of argv[0] to see what it should be running as. Since this is a Unix program, and thus is case-sensitive, it doesn't recognise that whoamI is a legitimate way to call it, and falls back on its default behaviour: id.
The tcsh shell maintains an internal hash of the programs that are on your default search-path. Often, if you're messing with the contents of your path, you need to call the internal shell-command rehash to have it rebuild the hash. Once again, tcsh is a Unix program, and thus assumes case-sensitivity. whoamI isn't in its hash, so it's not found unless you specify explicitly where to find the file.
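bash, on the other hand, doesn't treat its hash as the last word. It does remember the locations of commands it has already run, but when a name isn't in that table it searches the path and asks the filesystem. The filesystem says whoamI exists, so bash runs it for you.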
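
The argv[0] check described above can be sketched in a few lines of Python. This is only an illustration of the technique, not the source of the real binary: the function name personality and the PERSONALITIES table are made up for the example, and the real program may recognise a different set of names.

```python
import os

# Hypothetical table of names the binary knows how to behave as.
# The real id/whoami/groups binary may recognise a different set.
PERSONALITIES = {"whoami", "groups"}


def personality(argv0):
    """Return the command a multi-call binary would act as for this argv[0]."""
    name = os.path.basename(argv0)
    # Exact, case-sensitive comparison, as a typical Unix program would do.
    if name in PERSONALITIES:
        return name
    # Unrecognised name: fall back on the default behaviour.
    return "id"


for invoked_as in ("/usr/bin/whoami", "/usr/bin/groups", "/usr/bin/whoamI"):
    print(invoked_as, "->", personality(invoked_as))
```

The filesystem is happy to hand over the same file for whoami and whoamI, but the string comparison inside the program is not so forgiving, so the third call lands on the default.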

So there it is. A series of rational design-decisions in four unconnected components combines to produce unpredictable results. So where's the lesson in all this for programmers?

Each component aside from bash is making an assumption about the behaviour of another component. The filesystem, by definition, is the final arbiter of whether two filenames are identical or not. On the other hand, tcsh and the whoami/id/groups binary each believe that they already know how the filesystem functions, and replicate little bits of its behaviour internally as optimisations and shortcuts.
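
A toy model makes the difference between the two kinds of lookup concrete. Everything here is an assumption made for illustration: CaseInsensitiveFS is a stand-in for a case-preserving, case-insensitive filesystem, and build_hash, cached_lookup and direct_lookup are invented names, not how tcsh or bash are actually implemented.

```python
class CaseInsensitiveFS:
    """Toy stand-in for a case-preserving, case-insensitive filesystem."""

    def __init__(self, names):
        # Remember the spelling each file was created with.
        self._files = {name.lower(): name for name in names}

    def listdir(self):
        # Listings show the preserved spelling only.
        return list(self._files.values())

    def exists(self, name):
        # The filesystem itself decides whether two names are the same file.
        return name.lower() in self._files


def build_hash(fs):
    """Cache the directory listing, the way a hashing shell might."""
    return set(fs.listdir())


def cached_lookup(command_hash, name):
    # Exact string match: the cache quietly assumes case-sensitivity.
    return name in command_hash


def direct_lookup(fs, name):
    # Defer to the filesystem, the authority on filename identity.
    return fs.exists(name)


usr_bin = CaseInsensitiveFS(["id", "whoami", "groups"])
command_hash = build_hash(usr_bin)

for name in ("whoami", "whoamI"):
    print(name, cached_lookup(command_hash, name), direct_lookup(usr_bin, name))
```

The cached lookup answers the question "are these two names the same?" on its own, and gets it wrong for whoamI. The direct lookup never has to answer that question at all.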

So when the behaviour of the filesystem changes from the Unix default of being case-sensitive to the OS X default of just being case-preserving, it causes unpredictable behaviour in those applications.

It's really just a practical example of the value of Once and Only Once. A system is both more robust and more flexible if each question has an authoritative answer from only one place.

The Fishbowl
