June 2004

Apparently, last night I prevented Marc Fleury from hitting somebody, but I can't remember how, or even whether it happened.

Last night's semi-drunken haze was thanks to the rather excellent beer being provided by O'Reilly. I'd like to thank the JBoss folks for helping me rather blatantly crash the party: if anyone is keeping track of these things, by accepting their invitation I've now obviously been bought by JBoss, and my opinions are worthless.

It's the JavaBlogs party tonight, and if you look here and here, you'll see there's a pretty large and interesting group of nerds who've volunteered to turn up. Should be fun.

But for now, it's time to try to elbow my way into the Steve Jobs keynote. I described this to someone the other day as being "kinda like visiting Mecca", and got a very strange look.

Can anyone help me out on this one?

I was looking through my T610 address-book this afternoon, and found a weird entry. So I tried to delete it. Immediately, the phone rebooted... and has continued to reboot every thirty seconds since. I might get ten seconds of interactivity with my phone between reboots, but trying to do anything (including the Master Reset) just causes another reboot.

Lots of hourglass and "please wait" happening.

Tried pulling the battery. Tried pulling the SIM card. Tried booting without the SIM card. Tried leaving it for half an hour without the battery in and booting again. No luck.

Of course, this had to happen while I'm in a foreign country, kinda relying on my phone to keep me in touch with the various people I'm trying to visit, and pretty much as far away from my phone company as it's possible to get.

Is there some key-combination that I can hold down to boot in diagnostic mode, or trigger a master reset without going through the menu, or generally just cause my phone to wipe its memory and restore itself to factory settings?

Or is there somewhere neat in SF where they might be able to give my poor, ailing phone the breath of life?

Preface. I DO NOT HAVE ANY GMAIL INVITATIONS. I have given them all to friends, co-workers, and one random IRC acquaintance.

We know that Google's parceling out of Gmail invitations was a really smart marketing move. The invite system made the service seem that much more exclusive, and thus that much more desirable. However, I'm betting it wasn't their primary motivation. I suspect there was a far more important reason to launch the service this way: stability.

Here's a possible scenario.

Google finish the beta program for Gmail, and open subscriptions to the public. Over the next few days, millions of people subscribe and explore the interface. Many start pumping their mail archives into the service. This creates a load-spike orders of magnitude higher than the service would normally have to maintain.

At the same time, some Google engineer starts discovering that a number of parts of Gmail that they thought would scale linearly, don't.

Over the next month Gmail is down more often than it is up, none of the programmers are getting any sleep because they're too busy putting out fires, and some sub-editor at the New York Times decides "Gfail?" would make a good headline for the post-mortem, which really isn't good publicity for a company looking towards its IPO.

Invite codes are a really smart way to control growth. After a few weeks, anyone "in the know" enough to really want a Gmail account, can have one. At the same time, Google can control the service's rate of growth. There's no explosive rush, and if anyone notices something isn't scaling as well as it should, they can throttle back the issue of new invitations until the problem is solved, without too many sleepless nights.

Unfortunately, it's the sort of tactic that only works if you're as desirable a service as Google: giving Bob three invitations only helps if Bob knows three people who want to sign up. If you're just a startup with a good idea, an invitation program would just kill any chance of your service getting traction.

I wouldn't be surprised, though, if I saw future big-name MMORPGs opening their doors post-beta with an invitation program.

Caption competition:

  • Imagine a Beowulf cluster of....
  • When I asked for a G4 tower on my desktop...
  • So, if one Mac is a lover, what's six stacked on top of each other?
  • Recycled aluminium: 10c per can.
  • Desperately trying to prove we belong at WWDC.
  • Maybe if you hook them all together, they'll be able to run Doom 3?
  • Confluence. Because if you can't trust a room full of crazed Mac bigots, who can you trust?
  • All that's missing are a bunch of monkeys running around the base beating each other with sticks.
  • Now we know what they're doing all day instead of fixing Javablogs...

Edit: I forgot...

  • Hey, one of us has a girlfriend, too!

One trend that's been wandering casually around the Internet lately has been the use of Javascript to highlight words in a page, if you visit that page via a Google search.

Like many of these web-tricks, it was interesting and 'neat' at first, but now I think its 15 minutes is up, and can we please quietly pack that script away and move on to something else?

The benefit of search-term highlighting is that it allows you to see where in a document the match occurred, which may sometimes be hard to spot.

The practical result, however, is different. It's very rare that I ask Google to give me a page that occasionally mentions the terms I am searching for. If it does, then Google is either not doing its job, or it's a really obscure search. What I usually want (and end up with) is a page that is largely about the terms I am searching for.

Even if I'm searching for something really specific that might only be mentioned in one section of a document, I'm probably going to have used several search terms that occur all the way through the page, and then added the specific, narrowing term on the end.

Which means, from experience, that these Javascript-enhanced pages light up like a Christmas tree.

The effect of the highlighting is to completely disrupt the flow of the page. The highlighted terms are dotted pretty evenly through the page (so having your eyes drawn to them is pointless), and the highlighting is usually more colourful and 'interesting' to your eyes than the page's headings, which might be more useful in locating the precise information you were after.
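
To put a rough number on the Christmas-tree effect, here is a minimal sketch in Python rather than Javascript. It assumes the search terms arrive in the "q" parameter of the referrer URL, and both function names are made up for illustration. It measures how much of a page would be lit up, rather than doing any actual highlighting.

```python
import re
from urllib.parse import urlparse, parse_qs


def search_terms(referrer: str) -> list[str]:
    """Pull the search terms out of a search-engine referrer URL.

    Assumption: the query lives in a "q" parameter.
    """
    query = parse_qs(urlparse(referrer).query).get("q", [""])[0]
    return [term.lower() for term in query.split()]


def highlighted_fraction(text: str, terms: list[str]) -> float:
    """Fraction of the words in `text` that a highlighter would light up."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    wanted = set(terms)
    hits = sum(1 for word in words if word in wanted)
    return hits / len(words)


referrer = "http://www.google.com/search?q=postgres+bigint+index"
page = ("Postgres will not use the index on a bigint column "
        "unless the bigint literal is quoted.")
print(highlighted_fraction(page, search_terms(referrer)))
```

On a page that is actually about the thing you searched for, a quarter of the words being highlighted is not unusual, which is the whole problem.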

Here's a quick quiz. You have a table in Postgres 7.3.x with BIGINT primary keys.

Q: What is the difference between the following queries?
  1. select * from foo where pkey = 12345;
  2. select * from foo where pkey = '12345';

A: The latter does an index lookup. The former performs a full table scan.

We discovered this while messing around with Postgres's EXPLAIN command, trying to work out why Javablogs was running so slowly. It turned out that without the quotes, selecting a single blog entry by its primary key cost 5000 of whatever metric Postgres measures query performance in. With the quotes, the cost dropped to... 5.

If you make the Primary Key a NUMERIC instead of a BIGINT, you get the index scan both ways (but it looks like the index scan over the numeric key is slower than it is over the bigint key...)

Anton tried to explain it to me, but my brain seized up. It was something to do with casting between INT4 and INT8. You can get an unofficial patch for the Postgres JDBC driver that fixes the problem with explicit casts, but it's ugly and untested.
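
For the record, an explicit cast in the query itself is the other way to make both sides of the comparison INT8. The sketch below is not the JDBC patch; it is a hypothetical helper (the name and the %s placeholder style are my assumptions) that builds the lookup with the cast written in. Whether the planner then picks the index is something to confirm with EXPLAIN on your own version.

```python
def pkey_lookup_sql(table: str, pkey_column: str) -> str:
    """Build a single-row lookup that casts its parameter to BIGINT.

    The cast makes the comparison int8 = int8 rather than int8 = int4.
    `table` and `pkey_column` must be trusted identifiers, since they are
    interpolated directly; only the key value is left as a placeholder.
    """
    return f"SELECT * FROM {table} WHERE {pkey_column} = CAST(%s AS BIGINT)"


# Hypothetical usage with a DB-API driver that uses %s placeholders:
#     cursor.execute(pkey_lookup_sql("foo", "pkey"), (12345,))
print(pkey_lookup_sql("foo", "pkey"))
```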

Alan: "Now you know why good DBAs get paid so much, and why they are so horrified at the thought of developers writing SQL."

Apropos a conversation with cow-orkers on the way to lunch yesterday:

If those kooky stem-cell researchers were to discover a way to grow human meat in a vat, such that it was never at any point a real, living human being...

Would you eat it?

"Error: Valid requests expected!" has become the new bane of my existence. I've been spending most of the day trying to sort it out, and it's starting to get a little frustrating.

Finally, now my Powerbook is fast enough, I can say goodbye to RedHat on the desktop, and move to OS X as my primary development platform at work. So far everything is both hunky and dory... except when I try to do anything CVS-like from IDEA. Whenever I tell IDEA to run a CVS command, it replies:

"Error: Valid requests expected!"

Here's the setup:

  • Using the command-line CVS client that comes with OS X 10.3.4 (cvs 1.10)
  • CVS_RSH=ssh
  • SSH key in my authorized_keys file on the server
  • Running ssh-agent (specifically, SSHKeychain) for passphrase-entering

Bear in mind this exact same setup worked on my Linux box, except I was using vanilla ssh-agent instead of SSHKeychain.

  • From the command-line, everything Just Works. CVS connects to the box and does its funky thang without requesting any passwords or throwing up any errors
  • In fact, if I copy the (failing) CVS command verbatim from IDEA's console and paste it into a terminal window, the shell performs that command without complaint.
  • I know that IDEA is calling ssh (and ssh making use of ssh-agent) because when I ask IDEA to connect to the CVS repository the first time, the agent asks me for my passphrase.
  • Random aside: to make environment variables in OS X available to programs you're not launching from a shell, they need to go in ~/.MacOSX/environment.plist

Anyone have any idea what the fix is here? I trawled through Google and the IDEA Wiki and the Jetbrains knowledge-base to no avail.

Update: I've now also tried with the command-line ssh-agent, and with no agent and a passphrase-less ssh key, both to no avail.

Sagittarius: Your hard disk light will not go out today. Plan for occasional bursts of activity. Your lucky number is 2.6.5-gentoo-r1.

The things you own end up owning you:

1. Formatted capacity less. 2. Battery life depends on configuration and use. 3. Actual speeds lower. Compatible ISP required. Modem will function according to V.90 standards if V.92 services are not available. 4. Wireless Internet access requires AirPort Extreme Card, AirPort Extreme Base Station or AirPort Base Station, and Internet access (fees may apply). Some ISPs are not currently compatible with AirPort and AirPort Extreme. 5. Weight varies by configuration and manufacturing process. IMPORTANT: Use of this product is subject to acceptance of the software license agreements included in this package. Product contains electronic documentation. Backup copy of software included.

Name: The File/Stream Duality

Context: Your API retrieves data from, or writes data to a file


  • The naive approach is to have the API take in a filename, or a File object to work on.
  • Far too many developers believe this is sufficient
  • Unless the file is random-access, you're going to have to turn it into a byte stream in order to read from or write to it.
  • One day, somebody is going to have a stream of bytes that is not a file, but that they want to pass through your API.
  • When that happens, this person will curse you, your parents, and the town you grew up in.


If you are writing an API that takes a filename, instead provide an API that does precisely the same thing to an arbitrary stream of bytes, and then add "convenience" methods that apply those stream-based methods to files.
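
A minimal sketch of the shape this takes, in Python for brevity. The line-counting task and both function names are invented for illustration; the point is only that the file version is a thin wrapper around the stream version.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union


def count_lines(stream: BinaryIO) -> int:
    """The real API: works on any readable byte stream."""
    return sum(1 for _ in stream)


def count_lines_in_file(path: Union[str, Path]) -> int:
    """Convenience wrapper: open the file, then defer to the stream version."""
    with open(path, "rb") as stream:
        return count_lines(stream)


# A caller whose bytes never touched the disk can still use the API.
in_memory = io.BytesIO(b"first\nsecond\nthird\n")
print(count_lines(in_memory))
```

Anyone holding a socket, a decompressed archive entry, or a buffer in memory calls the stream function directly. Everyone else gets the filename version for free.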

"You need to buy a printer."

"No, I don't. Pretty much everything I do is electronic, all of my records are electronic, why do I need to print anything out?"

"No seriously, you need a printer."

"I've done quite well without one so far. The rare times I need hard copy for a letter or something, I just, er, steal office supplies."

"Go buy a printer."

"Look, you know me. I never throw anything out. Within six months, my life would be completely buried in paper."

"But you need one!"

"We've gone through this. On the computer, everything's sorted, searchable, and even if it's clutter it only takes up virtual space. Why do I need anything on paper?"



"Yeah. You can't scribble on electronic documents. Some documents are easily editable. Some document formats can be annotated. Some let you 'cross items off' a list. Inevitably, though, this means you have to create a document with the specific intention of being editable, or annotated, or things-to-do-ish."

"...whereas any document can be printed out and scribbled on. You've got a point."

"Well, I am you."

"You are? Why am I talking to myself then?"

"Call it a literary conceit."

"Ah, I'm being pretentious again, aren't I."

"Hey, you said it, not me."

(Prediction: bc298e458cdacac56b5247bc5f8f1a62)

Statement 1:

Bob said something that I disagree with. This is why I think he's wrong.

Statement 2:

Bob is an idiot. This is why I think he's an idiot.

I hope it's pretty clear that the first statement is at least vaguely productive, while the second is not.

Which isn't to say I restrict myself to being productive all the time. This is my blog, and what's the point of self-publishing if you can't be self-indulgent and insulting when you feel like it?

What's annoying is when you try quite hard to write a type-1 post, and someone comes in and writes a type-2 comment. Any work you did to stick to the facts is immediately undermined: however balanced your post may or may not have been, the presence of the comment skews any perception of the original post.

What do you do? If you leave the comment un-challenged, you look like you're providing a supportive environment for random flamers. If you try to interject, you're getting into an argument that's really between the commenter and Bob, that you really have no personal interest in being a part of. Ultimately, you don't care, and any time spent debating the matter is time that could be better spent drinking tea and watching documentaries about just how many maggots can legally be contained within a can of tomatoes according to the US FDA[1].

And, of course, if you delete the comment then you feel like you're silencing dissent, which is totally counter to the reason you opened up comments in the first place.

[1] Two, I think.

John Gruber of Daring Fireball dares ask the question:

Why are Windows users besieged by security exploits, but Mac users are not?

Boiled down, his answers are:

  1. Market-share is a factor, but there has to be some other explanation for the fact that Windows' market-share in malware vastly outstrips its market-share on the desktop
  2. There are fewer places to hide bad programs on the Mac
  3. Mac users are far less tolerant of programs that spread malware

I disagree with the first point. You can explain almost all of the relative safety in running Mac OS X with its low market-share.


Gruber: "This argument ignores numerous facts, such as that the Mac’s share of viruses is effectively zero; no matter how you peg the Mac’s overall market share, its share of viruses/worms/Trojans is significantly disproportionate."

In order to spread, viruses, worms and trojans rely on network effects. The value of a network grows as the square of the number of users. Therefore viruses, trojans and other malware are simply orders of magnitude more effective when targeted against a widely deployed platform.

Imagine you send the latest Mac-targeting email trojan to 100 random addresses. If you're lucky, three of them might be Mac users. If you're lucky, one of them might open the attachment, causing the trojan to be sent to all of the people in that person's address-book, most of whom will also be Windows users. Meanwhile all the Windows users will receive this attachment that they can't run, and get back to the person who sent it to them.

The trojan's just not going to get off the ground. The effectiveness of sending a Windows-targeting trojan is just several orders of magnitude higher. Even if your initial mail-out went only to Mac users, it would probably fizzle out after the first generation.
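
Here's the back-of-envelope version as a rough Python sketch. The 3% Mac share and the one-in-three open rate come from the example above. The 50-entry address book and the 97% Windows share are my own assumptions, picked only to make the arithmetic concrete.

```python
def expected_new_infections(address_book_size: int,
                            platform_share: float,
                            open_rate: float) -> float:
    """Expected number of fresh infections caused by one infected machine.

    Each contact is on the target platform with probability `platform_share`
    and, if so, opens the attachment with probability `open_rate`.
    """
    return address_book_size * platform_share * open_rate


def expected_generation_sizes(initial: float, reproduction: float,
                              generations: int) -> list[float]:
    """Expected infections per generation: initial * reproduction ** g."""
    return [initial * reproduction ** g for g in range(generations + 1)]


ADDRESS_BOOK = 50   # assumption: contacts mailed by each infected machine
OPEN_RATE = 1 / 3   # from the example: one of three Mac recipients opens it

mac = expected_new_infections(ADDRESS_BOOK, 0.03, OPEN_RATE)
windows = expected_new_infections(ADDRESS_BOOK, 0.97, OPEN_RATE)

print("Mac:", expected_generation_sizes(1, mac, 3))
print("Windows:", expected_generation_sizes(1, windows, 3))
```

With these numbers each infected Mac passes the trojan on to half a machine on average, so every generation is smaller than the last. Each infected Windows box passes it on to about sixteen.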

Even with spyware and adware that do not propagate over the network, the Mac is a small enough target that it is not worth tackling.

For packaged software, there are market segments. There's value in targeting a product at a small market, so long as the market wants the software, and the competition is perhaps less cut-throat than in the dominant market. That's why software exists for the Mac. Malware has no market segments, because people aren't looking to install malware. If someone has one piece of spyware installed, that doesn't mean they're not going to get another: on the contrary, it means they're more likely to install another. There's no value in targeting malware at a niche market.

I would dispute that there are fewer places for malware to hide on the Mac: I could think of some pretty interesting places you could hide programs in the Unix subsystem, or by playing tricks inside existing Application bundles. I would also dispute that any UI measures make the Mac inherently safer from malware: if you convince someone they really want to open that attachment, or download that "login application" they need to access the porn site, no amount of warning dialogs will make any difference.

I also dispute the "broken windows" theory, just on the basis that it's easy to assume ever-vigilance against something that has not yet shown any sign of existing. Communities exist in the Windows world to warn of adware-infested applications, but there are still too many people who just want to get on the file-sharing network, and don't do their homework.

As Gruber says, even if market-share is the dominant reason for the Mac's relative security, this isn't a bad thing: since that share is unlikely to rise significantly, the Mac will stay safe from general threats.

What I'd like to add, though, is that there is still no room for complacency, because none of this keeps you safe from specific threats. Specific threats get no value from the network effect. If I want to get into your computer, I no longer care about the market-share of your operating system: the only target I care about is you.

Three years or so ago, the only thing on TV was Starship Troopers. Dutifully I succumbed, because when there's only one thing on, you have to watch it. I then wrote a quick review:

Boy did this movie suck.

It sucked badly.

On the suck-o-meter, it rates slightly more suckage than "bowling balls through a garden hose".

I've never spent an action movie wishing that all of the lead characters would just die. Not one of them had a single redeeming feature. I wanted to see each of them eviscerated by a CGI monster. This is a movie where the computer-generated aliens were better actors than the humans. If this is the future of the Earth, then for God's sake just let the bugs kill us all.

Tonight, the only thing on TV is Starship Troopers. So guess what I'm doing?

# From the "Random stuff that was taking up room on my phone" department, we first have a sign encountered whilst stopping at a boathouse for some refreshments:

No Parking. Vehicles will be shredded and turned into beer cans.

# This was the headline in the Daily Telegraph on the day I was flying out to Perth. I wonder if taking random photos of stuff in an airport with a phone-cam gets you put on some kind of list of potential terrorists? Seeing as I don't follow League at all, this one gave me several double-takes.

Wiki Banned for Five Weeks

# Finally, we're tracking how many 1.1-level bugs are left in Confluence on the whiteboard. Given that this time last week it was over 70, we're not doing too bad. I did, however, want to save my rather cute number 9 for posterity after my fixing a table-rendering bug early this evening doomed him to erasure.

A terrible number 9, but a rather cute whale.

# And no, I'm not getting into the paragraph anchor thing as a general rule. It's just a habit I'm trying to cultivate when I make a post that's really about a bunch of different stuff. For regular, single-subject postings, paragraph-level anchors are just a distraction.