December 2005


27 Dec

My brother Nick and his wife Megan visited Sydney for Christmas.


There was another round in the endless PR debate between Wikipedia and the Encyclopædia Britannica this week, with a Nature study finding that, in a selection of articles, the two contained about the same number of errors.

Andrew Orlowski from The Register shot back, leading to this weird juxtaposition in Google News:

Headline 1: Wikipedia founder shot, according to Wikipedia. Headline 2: Wikipedia as accurate as Britannica

Whenever I see arguments like this, I can't help but think the question is wrong. I'm reminded of something out of a Cory Doctorow speech:

New media don't succeed because they're like the old media, only better: they succeed because they're worse than the old media at the stuff the old media is good at, and better at the stuff the old media are bad at. Books are good at being paperwhite, high-resolution, low-infrastructure, cheap and disposable. Ebooks are good at being everywhere in the world at the same time for free in a form that is so malleable that you can just pastebomb it into your IM session or turn it into a page-a-day mailing list.

(As an aside, you can apply this objection equally to the annoying web app vs desktop app debate. They do different things well, and different things badly. Shouldn't we just vive la difference?)

The biggest lesson of the information age is that all media is to be taken with a critical eye, and that no information is valuable until you also understand its source. (One reason for the success of blogs: the information and the source are intimately related, so you always know where you are.)

A simple numerical comparison of error frequency in each source is meaningless unless it's accompanied by some analysis of how they were wrong. What kind of errors were they, and how did they pass through each publication's (formal or informal) safeguards?

Several Nature reviewers agreed with Panelas' point on readability, commenting that the Wikipedia article they reviewed was poorly structured and confusing. This criticism is common among information scientists, who also point to other problems with article quality, such as undue prominence given to controversial scientific theories.

This paragraph of the Nature article, which was reported as little more than a footnote to the numerical smackdown headlines, sums up the problems I have with Wikipedia. Coming across a Wikipedia article that is both well-written and clearly organised is a moment to be cherished, because it happens so rarely. Half the time I visit the site, I end up on the edit page saying "Right, I'm going to clean this bastard up". Then I realise that this would consume forty-five minutes of my time that would be better spent elsewhere, and I wisely walk away.

But really, what have I lost? It was free, it was linked from Google, I got the information I wanted, it just wasn't as cleanly presented, as "paper-white" as I could have got from a dead tree encyclopædia. Different media good for different things.

The other thing I like to watch is the fanboy side of Wikipedia. While scientific and factual subjects may be heavily peer-reviewed and bludgeoned into respectability, the more you drift towards the fringes, especially to the kind of article that wouldn't make it into Britannica in the first place, the more likely it is that a subject's Wikipedia presence is maintained entirely by its own fans.

Take a walk, for example, through Wikipedia's incredibly detailed coverage of Pokémon, professional wrestling, or fan fiction. No aspect of the miscellany or trivia of their subject-matter is left uncovered. They satisfy Wikipedia's requirement of a "neutral point of view" by including a brief section on criticisms or objections, but are so clearly written from the inside looking out that they demonstrate perfectly how a neutral point of view is not necessarily an objective point of view.

(The Wikipedia article on Killology shows this sort of thing isn't solely restricted to silly forms of entertainment. Looking at the associated discussion page it's clear that editors are aware of the problems with the article -- "Is there any evidence that this word is in wide usage outside that one guy's book?" -- but it's just too much of a fringe subject for anyone to dare tackle it with authority.)

Britannica is good at being respectable, professionally edited, protected from subtle vandalism, and if necessary, useful as a bludgeoning weapon. Wikipedia is good at being free, accessible, up-to-the-minute, and occasionally wacky fun. Both are likely to contain errors, but to different extents, from different sources and for different reasons.

And if we can just get over the "us vs them" obsession for a while, despite the fact that it sells page-views, we might end up making both sources of information better.

This year, the Atlassian Christmas party was held on a boat. Quite a lot of alcohol was consumed, as is traditional at these events, and we found an outlet for our competitive streaks by shooting infra-red guns at flying reflective plastic targets.

For the record, I won the preliminary round with only one miss, but then bombed out in the final thanks to the cumulative effect of two intervening bottles of beer and a new scoring system that valued speed over accuracy.

Translation: If you fuck with me, make sure you shoot first. :)

The number of cameras at the event was frightening. In our society of pervasive digital surveillance, no indiscretion will go un-flickr'd.

See also: My photo-set of the event (Justin's, Jeremy's), and this caption competition.

A bit of a war of words has broken out over Martin Fowler's post about Humane Interface Design, in which he proposes that the interface of a class be designed to maximise its usefulness, rather than to minimise its complexity. As his example, he contrasts the Ruby Array class with its Java equivalent: java.util.List.

(Fowler has been linking to the various sides of the ensuing debate at the bottom of his post, which saves me having to do it here.)

I'm not going to come down on either side of the debate, because I haven't really formed a clear opinion either way. Instead, I'll just throw a little more kerosene on the fire. :)

1. An interface isn't an Interface

In Java, List is an Interface. In Ruby, Array is a class. The distinction may seem to be hair-splitting, but it's important.

A Java interface defines a type: a series of messages to which any object wishing to be called a List must respond correctly. It can provide no implementation for those methods, so for every method that List defines, anyone who wants to provide their own List implementation must implement that method themselves. Sure, there's an AbstractList that can take some of the weight off you, but extending AbstractList locks you into a particular line of superclass inheritance, which might not be helpful.

Java utility classes like java.util.Collections exist just so Java can provide static implementations of functions that can be applied to all implementors of an interface, without any mucking about with inheritance.
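
As a minimal sketch of what that looks like (my example, not Fowler's): the static helpers in Collections work on any List implementation, however unrelated, because they only ever talk to the interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class StaticHelperDemo {
    public static void main(String[] args) {
        // Two unrelated List implementations...
        List<String> arrayBacked = new ArrayList<String>();
        List<String> nodeBacked = new LinkedList<String>();
        Collections.addAll(arrayBacked, "banana", "apple", "cherry");
        Collections.addAll(nodeBacked, "banana", "apple", "cherry");

        // ...and the same static helpers apply to both, because
        // Collections only depends on the List interface. No shared
        // superclass (beyond Object) is required.
        Collections.sort(arrayBacked);
        Collections.reverse(nodeBacked);

        System.out.println(arrayBacked); // [apple, banana, cherry]
        System.out.println(nodeBacked);  // [cherry, apple, banana]
    }
}
```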

Ruby's Array doesn't have this problem. Partly this is because Ruby doesn't have the equivalent of a Java Interface, and partly this is because Ruby allows for mixins as an alternative to multiple inheritance. 21 of Array's methods are, in fact, mixed-in functionality from the Enumerable module.

Java's design affords small interfaces, with utility functions provided as static methods on helper classes. Ruby's design affords larger classes with mixed-in utility methods.

2. Array is a really bad example

Part of the reason this argument could go on forever is that Ruby's Array supplies arguments both for Humane design and against it. Nobody could really dispute the usefulness of last(), join(), sort() or map(). Similarly, Ruby's convention of having two methods for many operations -- foo returning a new object, and foo! performing the same operation but modifying the existing object in place -- is a useful one.

On the other hand, many of Array's methods are harder to defend. Methods like rassoc(), fetch() or pack() bear the strong smell of being Perl or Lisp refugees that don't really belong.

3. List is a really bad example

java.util.List isn't really a shining example of good interface design either.

Take the fact that List defines an add() method. Implementing add() is, according to the Javadoc, optional. This completely defeats the purpose of having an Interface in the first place. Instead of being able to rely on the object's type to determine its capabilities, the only way to find out if you can, in fact, add something to a list is to try to add something and hope it doesn't throw an (unchecked) UnsupportedOperationException.

(All mutators on the List interface -- 9 of List's 25 methods -- are optional in this way.)

Take the custom List implementation returned from java.util.Arrays.asList(): it's a fixed-size view of its backing array, so you can set() existing elements, but any attempt to add() to it blows up at runtime -- something else you can't ask the List interface about beforehand.

So here is a list on which some mutators work and others don't -- presumably because there's no such thing as a PartiallySupportedOperationException.
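
Here's a minimal sketch of the trap (again mine, not one from the debate): the compiler accepts every call below without complaint, and only at runtime do you discover which mutators this particular List actually supports.

```java
import java.util.Arrays;
import java.util.List;

public class OptionalOperations {
    public static void main(String[] args) {
        // A fixed-size List view over an array: set() writes through
        // to the backing array, but anything that changes the list's
        // size is off the table.
        List<String> fixed = Arrays.asList("a", "b", "c");

        fixed.set(0, "z"); // fine: same size, element replaced

        try {
            fixed.add("d"); // compiles happily...
        } catch (UnsupportedOperationException e) {
            // ...but fails at runtime. Nothing in the static type
            // distinguishes this list from a fully mutable ArrayList.
            System.out.println("add() turned out to be 'optional'");
        }
    }
}
```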

The fact that this sort of thing doesn't trip most Java programmers up more than once or twice a year is a convincing argument in favour of dynamic typing. Programmers just make sure they know which kind of List is being passed around where, without the assistance of extra type information.

4. Synonyms are a really bad idea

Fowler:

When you want the length of a list, should you use length or size? Some libraries use one, some the other, Ruby's Array has both, marked as aliases so that either name calls the same code. The rubyist view being that it's easier for the library to have both than to ask the users of a library to remember which one it is.

This is where I have to disagree vehemently.

Having two otherwise equivalent ways to perform the same operation is bad user-interface design, and it's bad library interface design, because the existence of the synonyms actually adds to your cognitive load by making you choose between them.

Say you're scanning a class, trying to work out what method to call. You find the class has two methods with synonymous names. Do they do exactly the same thing, or are they subtly different? Well, now you have to go to the documentation to find out (or the code, if the documentation doesn't explicitly say "these methods are 100% identical"). If there was just the one method, you'd probably have just used it without a second thought.

5. Ruby could be both humane and minimal

In Ruby, libraries can add methods to existing classes. As such, a lot of the less core methods on Array could be harmlessly moved into the standard library, to be re-applied when needed by require 'pack', require 'synonyms' or require 'obscure-lisp-stuff'.

Kirill Grouchnikov asks which is the better hire for a Java project: a good Java developer, or an excellent Perl developer?

Arguably, an excellent non-Java developer can learn Java syntax in 4-5 days. But is syntax everything you need to know to write excellent Java code? How much time will it take until he starts to write Java code that looks like Java code and not like Perl code (don't forget that working on team means that the code is maintained by everybody, and even if somebody leaves, his code stays)? Do big projects really need "stellar" developers, or perhaps a team of good developers with solid Java knowledge does better job in the long run?

For the sake of argument, let's substitute Perl with Python (or Smalltalk, Ruby, Lisp or C++), because even great Perl programmers tend to have strange ideas about object orientation.

We'll also skip the false dichotomy -- a lot of "great x programmers" are also competent Java programmers so it's possible to hire both in the one person -- and ignore the fact that you'd get the best of both worlds hiring one of the many already-great Java programmers out there before you raid the scripting-language gene-pool.

I'll also assume that both applicants are equally committed, and the Python developer won't jump ship the first time they get an offer to work in a language more to their taste.

I think my answer would be: if I was hiring someone for a six-month contract on a straightforward project, I'd go with the merely good Java programmer. If I was picking someone to work full-time with/for me on Confluence, I'd be more tempted by an excellent programmer, whatever language they happened to excel in.

(At this point it is, of course, obligatory to mention that we're hiring, but that I don't actually decide who makes the cut.)

The crux of the matter is the oft-repeated old saw, at least partly grounded in truth, that an excellent developer can be an order of magnitude better than a good one. It may take the Python developer 12-18 months to build a comprehensive knowledge of Java's libraries and syntax, but you'll be seeing the advantages of having them around significantly before that.

As an aside, the "order of magnitude" thing is hard to measure. For certain godlike hackers it's a given -- if you're writing a 3d engine, you'd be better off with John Carmack than you would with a hundred lesser lights -- but in the realm of mere mortals it's not so clear-cut.

A great programmer may not crank out features ten times as fast as a good one, but they may still provide that much benefit overall. For me, what gives great the edge over good is some combination of: attention to detail, which means their features are more completely implemented, with fewer incidental bugs; attention to design, which means they leave the code in a better state than they found it; and inspiration, which means they find solutions to a problem that simply wouldn't occur to other programmers. All of these have really, really powerful flow-on productivity effects for the whole team.

Ideally, I'd hire the great Python programmer and have them pair-program with a competent Java developer. The Java developer could provide expertise on concrete things like the minutiae of the Collections classes, and handle the probably-frequent-at-first "no, in Java you do it this way" moments. The Python nerd could chip in with "what about this case?" and "why don't you do it this way?", and the whole would be greater than the sum of its parts.

You just have to put up with the constant complaining that whatever feature you're working on could be implemented better in five lines of Python.

Happy Birthday to me
Happy Birthday to me
Thr... tw... on... er...
Oh fuck.

(see also: 2004, 2003, 2002)

Definitions


Found on Xooglers:

Non-trivial
It means impossible. Since no engineer is going to admit something is impossible, they use this word instead. When an engineer says something is “non-trivial,” it’s the equivalent of an airline pilot calmly telling you that you might encounter “just a bit of turbulence” as he flies you into a cat 5 hurricane.