19
Jun

Deletionism

  • 1:58 PM

Ask me ten years ago, and I'd say a blog entry, once published, should remain that way. Oh wait, I actually did say that:

I try never to delete anything substantive. Attempting to un-say something by deleting it is really just a case of hiding the evidence. I'd much rather correct myself out in the open than pretend I was never wrong in the first place.

The reasons not to delete come down to:

  • Not wanting to break the web by 404-ing a page
  • Wanting to be honest about what you’ve said in public
  • Keeping a record of who you were at some moment in time.

The counter-arguments are:

  • The web was designed to break. And anyway, the stuff worth deleting is usually the stuff nobody’s linking to.
  • Just how long does a mea culpa have to stand before it becomes self-indulgent?
  • Unless you’re noteworthy and dead, or celebrity and alive, the audience for your years-old personal diaries is particularly limited.
  • Publishing on the web isn’t just something you do, and then have done. It’s an ongoing process. A website isn’t just a collection of pages, it’s a work that is both always complete, and always evolving. And every work can do with the occasional read-through with red pen in hand.

That last point is the most compelling one. I was publishing a website full of things that, however apt they were at the time to the audience they were published for, just aren’t worth reading today.

So to cut a long story short, last weekend I un-published about 700 of the previously 1800 posts on this blog; things that were no longer correct, things that were no longer relevant, things that were no longer interesting even as moments in time, and things that I no longer feel comfortable being associated with. I don't think anything that was removed will be particularly missed, and as a whole the blog is a better experience for readers without them.

The weirdest thing about deleting 700 blog posts is realising you had 1800 to start with. Although to be fair, 1750 of them were Cure lyrics drunk-posted to Livejournal.

Under the hood

It's a testament to the resilience of Movable Type that in the eleven years since I first installed it to run this blog, I've upgraded it exactly twice. If I’d tried that with the competition, I doubt I’d have had nearly as smooth a ride.

Movable Type got me through multiple front-page appearances on Digg, reddit, Hacker News and Daring Fireball without a hitch, or at least would have if I hadn't turned out to be woefully incompetent at configuring Apache for the simple task of serving static files.

But as they say, all good things must come to an end. Preferably with Q showing up in a time travel episode.

I replaced Movable Type with a couple of scripts that publish a static site from a git repo, fully aware that I’m doing this at least five years after it became trendy. The site should look mostly identical, except comments and trackbacks haven't been migrated. They’re in the repo, but I'm inclined to let them stay there.

Look, bad things happen to people in fiction just like bad things happen in real life. And at least the people in fiction aren't real so it didn't really happen to them.

I get that.

And you can have great entertainment where bad things happen to bad people, or bad things happen to good people, or bad things happen to indifferent people who just happened to be in the wrong place at the wrong time.

I get that too.

But at some point you find yourself sitting on a couch watching a drawn-out scene where a child is burned alive screaming over and over for her parents to save her, and you think “Why the fuck am I still watching this show?”

Bad things happen in real life. Bad things have happened throughout history. So what, I'm watching television. If I wanted to experience the reality of a brutal, lawless campaign for supremacy between tribal warlords, there are plenty of places in the world I could go to see that today. I wouldn't survive very long, but at least I'd get what I deserved for my attempt at misery tourism.

Bad things happen in good drama, too. But drama comes with a contract. The bad things are there because they are contributing to something greater. Something that can let you learn, or understand, or experience something you otherwise wouldn't have; leading you out the other side glad that you put yourself through the ordeal, albeit sometimes begrudgingly.

To refresh our memories, here's how George R. R. Martin explained the Red Wedding:

I killed Ned in the first book and it shocked a lot of people. I killed Ned because everybody thinks he's the hero and that, sure, he's going to get into trouble, but then he'll somehow get out of it. The next predictable thing is to think his eldest son is going to rise up and avenge his father. And everybody is going to expect that. So immediately [killing Robb] became the next thing I had to do.

There are increasingly flimsy justifications for the horrors of Game of Thrones. They motivate character A. Or they open up space for character B. But in the end it's obvious that it's really about providing the now-mandated quota of shock, and giving the writers some hipster cred for subverting fantasy tropes.

I did not enjoy watching Sansa Stark’s rape. I did not enjoy watching Shireen Baratheon burned at the stake.

If that's what you want to watch TV for, go for it. But I'm out.

Seen on Twitter:

Either and Promises/Futures are useful and I’ll use them next time they’re appropriate. But outside Haskell does their monad-ness matter?

All code below is written in some made-up Java-like syntax, and inevitably contains bugs/typos. I'm also saying "point/flatMap" instead of "pure/return/bind" because that's my audience. I also use "is a" with reckless abandon. Any correspondence with anything that would be either programmatically or mathematically useful is coincidental.

What is a monad? A refresher.

A monad is something that implements "point" and "flatMap" correctly.

I just made a mathematician scream in pain, but bear with me on this one. Most definitions of monads in programming start with the stuff they can do: sequence computations, thread state through a purely functional program, allow functional IO. This is like explaining the Rubik's Cube by working backwards from how to solve one.

A monad is something that implements "point" and "flatMap" correctly.

So if this thing implements point and flatMap correctly, why do I care it's a monad?

Because "correctly" is defined by the monad laws.

  1. If you put something in a monad with point, that's what comes out in flatMap. point(a).flatMap(f) === f(a)
  2. If you pass flatMap a function that just points the same value into another monad instance, nothing happens. m.flatMap(a -> point(a)) === m
  3. You can compose multiple flatMaps into a single function without changing their behaviour. m.flatMap(f).flatMap(g) === m.flatMap(a -> f(a).flatMap(g))
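For the concretely minded, here is a rough Python sketch of those three laws being checked against a toy Option. Everything in it (the `Option` class, `flat_map`, `half`, `decrement`) is made up for illustration and isn't from any real library. Checking a handful of values proves nothing in general; it just shows what each law is asking for.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Option:
    """Hypothetical minimal Option: either holds a value or is empty."""
    value: Any = None
    defined: bool = False

    @staticmethod
    def point(a):
        # Wrap a plain value without inspecting or altering it.
        return Option(a, True)

    def flat_map(self, f):
        # Empty stays empty; otherwise pass the value to f, which returns an Option.
        return f(self.value) if self.defined else self


EMPTY = Option()


def half(n):
    # Only even numbers can be halved; odd ones produce an empty Option.
    return Option.point(n // 2) if n % 2 == 0 else EMPTY


def decrement(n):
    return Option.point(n - 1)


m = Option.point(8)

# 1. point(a).flatMap(f) === f(a)
law1 = Option.point(8).flat_map(half) == half(8)

# 2. m.flatMap(a -> point(a)) === m
law2 = m.flat_map(Option.point) == m

# 3. m.flatMap(f).flatMap(g) === m.flatMap(a -> f(a).flatMap(g))
law3 = (m.flat_map(half).flat_map(decrement)
        == m.flat_map(lambda a: half(a).flat_map(decrement)))
```

All three comparisons come out true for this toy type, and they stay true if you start from `EMPTY` or from an odd number instead.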

If you don't understand these laws, you don't understand what flatMap does. If you understand these laws, you already understand what a monad is. Saying "Foo implements flatMap correctly" is the same as saying "Foo is a monad", except you're using eighteen extra characters to avoid the five that scare you.

Because being a monad gives you stuff for free.

If you have something with a working point and flatMap (i.e. a monad), then you know that at least one correct implementation of map() is map(f) = flatMap(a -> point(f(a))), because the monad laws don't allow that function to do anything else.

You also get join(), which flattens out nested monads: join(m) = m.flatMap(a -> a) will turn Some(Some(3)) into Some(3).

You get sequence(), which takes a list of monads of A, and returns you a monad of a list of A's: sequence(l) = l.foldRight(point(List()))((m, ml) -> m.flatMap(x -> ml.flatMap(y -> point(x :: y)))) will turn [Future(x), Future(y)] into Future([x, y]).

And so on.
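As a hedged sketch of the "free stuff" above, here are map, join and sequence in Python, written using nothing but `point` and `flat_map`. The types (`Some`, `Nothing`, `Many`) are hypothetical stand-ins for Option and List, and because Python has no type classes, `point` has to be passed in by hand. The same three functions then work unchanged on both monads.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Some:
    value: Any

    def flat_map(self, f):
        return f(self.value)


@dataclass(frozen=True)
class Nothing:
    def flat_map(self, f):
        return self


@dataclass(frozen=True)
class Many:
    """Hypothetical list monad wrapping a tuple of items."""
    items: tuple

    def flat_map(self, f):
        # Apply f to every item and concatenate the results.
        return Many(tuple(y for x in self.items for y in f(x).items))


def many_point(a):
    return Many((a,))


def map_(m, f, point):
    # map(f) = flatMap(a -> point(f(a)))
    return m.flat_map(lambda a: point(f(a)))


def join(mm):
    # join(m) = m.flatMap(a -> a)
    return mm.flat_map(lambda a: a)


def sequence(ms, point):
    # Right fold: start from point([]) and prepend each monad's value.
    def step(m, acc):
        return m.flat_map(lambda x: acc.flat_map(lambda y: point([x] + y)))

    acc = point([])
    for m in reversed(ms):
        acc = step(m, acc)
    return acc


flattened = join(Some(Some(3)))               # Some(3)
all_present = sequence([Some(1), Some(2)], Some)  # Some([1, 2])
one_missing = sequence([Some(1), Nothing()], Some)  # Nothing()
combos = sequence([Many((1, 2)), Many((3,))], many_point)
```

For `Many`, sequence produces every combination (`[1, 3]` and `[2, 3]`), which is the list monad doing its usual thing through exactly the same code path.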

Knowing that Either is a monad means knowing that all the tools that work on a monad will work on Either. And when you learn that Future is a monad too, you'll know that everything that worked on Either because it's a monad will work on Future as well.

Because how do you know it implements flatMap correctly?

If something has a flatMap() but doesn't obey the monad laws, developers no longer get the assurance that any of the things you'd normally do with flatMap() (like the functions above) will work.

There are plenty of law-breaking implementations of flatMap out there, possibly because people shy away from the M-word. Calling things what they are (is a monad, isn't a monad) gives us a vocabulary to explain why one of these things is not like the other. If you're implementing a flatMap() or its equivalent, you'd better understand what it means to be a monad or you'll be lying to the consumers of your API.
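Here is one hypothetical way a flatMap can go wrong, sketched in Python. `NullShy` is an invented Option-like wrapper that treats a wrapped `None` as "empty" and skips the function entirely. It looks like a harmless convenience, but it breaks the first law: `point(a).flat_map(f)` no longer equals `f(a)` when `a` is `None`.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class NullShy:
    """Hypothetical Option-like wrapper that treats None as 'empty'."""
    value: Any = None

    @staticmethod
    def point(a):
        return NullShy(a)

    def flat_map(self, f):
        # The shortcut that breaks the law: a wrapped None never reaches f.
        if self.value is None:
            return self
        return f(self.value)


def describe(a):
    # A function that is perfectly happy to be handed None.
    return NullShy.point("missing" if a is None else f"got {a}")


# Law 1 says these two should be equal for every a, including None.
via_flat_map = NullShy.point(None).flat_map(describe)
direct = describe(None)
law_holds = via_flat_map == direct
```

Here `law_holds` is false: the direct call yields a wrapped `"missing"`, while the flat_map route silently stays empty. Anything built on top of flat_map (like the derived map above) inherits the surprise.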

But Monad is an opaque term of art!

So, kind of like "Scrum", "ORM" or "Thread"?

Or, for that matter, "Object"?

In summary:

As developers, we do a better job when we understand the abstractions we're working with, how they function, and how they can be reused in different contexts.

Think of the most obvious monads that have started showing up in every language¹ over the last few years: List, Future, Option, Either. They feel similar, but what do they all have in common? Option and Either kind of do similar things, but not really. An Option is kind of like a zero-or-one element list, but not really. And even though Option and Either are kind of similar, and Option and List are kind of similar, that doesn't make Either and List similar in the same way at all! And a Future, well, er…

The thing they have in common is they're monads.


¹ Well, most languages. After finding great branding success with Goroutines, Go's developers realised they had to do everything possible to block any proposed enhancement of the type system that would allow the introduction of "Gonads".

The weekend after I posted this article about the pitfalls hiding in a simple shell command-line, I had some free time and decided it might be fun to see what the same functionality would look like in Haskell.

My Haskell experience is limited to tutorials, book exercises, and reading other people’s code, so I wanted to go a little bit off the paved road for once.

So I grabbed System.FilePath.Find and hsexif off Hackage because they seemed to be what I needed, and set about making the type checker happy enough to run my program.

I didn't expect to just knock out a working program in an unfamiliar language in minutes, and I didn't. Nonetheless I was pretty impressed by how, once I had muddled through understanding the pieces I was working with, they joined together in a pleasingly logical way. (Although looking at the code a month later I can get what the various bits do—the types are informative enough for that—but it would take some unravelling for me to remember how.)

Then I ran my resulting program on a real directory full of photos and it instantly died, because the exif parser was built on lazy I/O, and my script was running out of available file-handles before it was getting around to closing them.

Leaky abstractions 1, Charles 0.

This is the kind of post that has a small but annoyingly non-zero chance of setting off somebody's "wrong on the Internet" buzzer. Please don't. It's an anecdote of how I encountered an unshaven yak on a lazy Sunday afternoon, as developers are wont to do, nothing more.

Twitter is a crowded bar. Most of the people who are there, are there to relax, hang out and chat with the friends or colleagues they showed up with.

Twitter is a crowded bar. You're having conversations in a public place. You are surrounded by people having conversations within your earshot. But none of that makes what you are doing or what they are doing "public debate" by any reasonable definition.

Twitter is a crowded bar, not an alternate universe, so try not to say anything that would get you in trouble if it was repeated outside the bar.

Twitter is a crowded bar. Sometimes you run into people you know. And if they're cool with the friends you went to the bar with, everything will be cool.

Twitter is a crowded bar. Some of the people there are being paid to promote a product. Some are there to network. Some are there because they're famous and it's a "place to be seen". You can hang around them if that's your thing, but they're at the bar because everybody else makes it worth them being there, not the other way around.

Twitter is a crowded bar. People can go to bars to meet strangers, to meet new people in a new town, or because they know it's where other people "like them" hang out. But it's never anyone else's responsibility to make sure a stranger feels welcome with them personally, or with their social circle.

Twitter is a crowded bar. If you hear something interesting from a nearby conversation there's nothing stopping you introducing yourself and joining in. But again, it's your job to make sure you're not an unwelcome interruption.

Twitter is a crowded bar. If you're that person who always ends up cornering a stranger who doesn't particularly want to talk to you in a one-on-one conversation, you're not fooling anyone.

Twitter is a crowded bar. If someone says "I didn't come here to argue with you about that", even if it's something they brought up in the first place, you should probably find another conversation. There's plenty of them around.

Twitter is a crowded bar. If you go there to pick fights with strangers, you're an asshole.

Twitter is a crowded bar, but they've kind of skimped on hiring bouncers.

The most educational part of this recent reiteration of the “your software should be like Unix pipes” trope isn’t that it shows how Unix command line tools are actually rather complicated, and can easily turn into baroque magical invocations. Although that's certainly true. The man-page for ‘find’ is 3,700 words. The manual for grep is a comparatively light 1,600 words, but that's because the 3,000 word explanation of regular expressions is in a different file.

The most educational part is the addendum:

Update: added -print0 to find and -0 to xargs to properly handle spaces in file names.

Firstly, this is a really dangerous class of bug. Unsafe handling of spaces in filenames is the kind of shell scripting mistake that will eventually end up deleting half the files on your computer when you just wanted to prune a directory.

It’s no accident that “The day I accidentally rm -rf /’d as root, but recovered because I still had an emacs process running in another terminal.” is the archetypal Unix admin war-story.

Secondly, this is the kind of bug that appears as an emergent behaviour of component-based systems. Every component in the pipeline is working entirely correctly, in the sense that they're all performing exactly the operation they were instructed to perform. The bug comes from the way the pieces have been joined together.

Joining simple components together doesn't guarantee you simplicity. Hook a machine that does three things to a machine that does three things, and you've got a bigger machine that does nine things. Any one of those nine paths could conceal a bug that doesn't live in either component, but in the assumptions made when those components are joined together.

The Unix pipe model, where complex operations are composed out of single-function pieces that consume one stream of bytes and emit another, is magically simple. Every component speaks the same language (bytes in, bytes out) and thus every component is compatible with every other. The components can be developed to a uniform simple flow of common input and output APIs. Complex things like flow control are handled for you: the kernel buffers those bytes, so if you send too fast your writes will eventually block until the next component is free to receive.

At this point I must defer to Jamie Zawinski:

…the decades-old Unix “pipe” model is just plain dumb, because it forces you to think of everything as serializable text, even things that are fundamentally not text, or that are not sensibly serializable.

For a program that produces or consumes a list of items, the problem of how that list is communicated doesn't go away by saying “everything is a stream of bytes”. All that happens is that each program producing or consuming lists has to pick a delimiter, and hope that the other program in the chain doesn't pick a different delimiter and delete all your files.
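To make the delimiter problem concrete, here is a small Python sketch with a pretend producer and two pretend consumers. The function names and file names are invented for illustration. One consumer splits on any whitespace, the other splits on NUL bytes, which is the convention that -print0 and -0 switch find and xargs over to.

```python
def emit(names, delimiter):
    # Producer: serialise a list of file names into a single byte stream.
    return b"".join(name.encode() + delimiter for name in names)


def consume_whitespace(stream):
    # Consumer that assumes items are separated by any whitespace.
    return [item.decode() for item in stream.split()]


def consume_delimited(stream, delimiter):
    # Consumer that agrees with the producer on an explicit delimiter.
    return [item.decode() for item in stream.split(delimiter) if item]


names = ["holiday photos/beach.jpg", "notes.txt"]

# Producer uses newlines, consumer splits on whitespace: the list is mangled.
mangled = consume_whitespace(emit(names, b"\n"))

# Both sides agree on NUL, which can't appear in a file name: the list survives.
intact = consume_delimited(emit(names, b"\0"), b"\0")
```

`mangled` ends up with three items (`holiday`, `photos/beach.jpg`, `notes.txt`), two of which name things that were never in the original list. `intact` has the two names that went in. Neither function has a bug on its own; the damage comes from the mismatch.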

And then there are the assumptions about how a stream of bytes might map to text that are rooted in the 1970s. Or the way programs that want to support pretty-printing to the terminal must do so by silently varying their output based on the identity of the stream they are writing to.
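The "silently varying output" trick usually boils down to asking the stream whether it is a terminal. A minimal Python sketch, with the `render` and `emit` helpers and the green escape code chosen arbitrarily for illustration:

```python
import io
import sys


def render(message, stream):
    # Colour for terminals, plain text for pipes and files.
    if stream.isatty():
        return f"\033[1;32m{message}\033[0m"
    return message


def emit(message, stream=sys.stdout):
    stream.write(render(message, stream) + "\n")


# An in-memory buffer stands in for a pipe here; its isatty() is False.
buffer = io.StringIO()
emit("done", buffer)
piped_output = buffer.getvalue()
```

The same call produces different bytes depending on where the output happens to be going, so whatever you saw on screen is not necessarily what the next program in the pipeline received.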

Simplicity is prerequisite for reliability. — Edsger Dijkstra, How do we tell truths that might hurt?

The Unix pipe model is actually a great example of how a complex system can be made to look simple by pushing complexity downstream, and how doing so can give you a very narrowly defined kind of simplicity at the expense of reliability—the simplicity of a system that mostly does the right thing most of the time.

The New Jersey guy said that the Unix solution was right because the design philosophy of Unix was simplicity and that the right thing was too complex. Besides, programmers could easily insert this extra test and loop. The MIT guy pointed out that the implementation was simple but the interface to the functionality was complex. — Richard Gabriel, The Rise of Worse Is Better

If you only have to worry about mostly doing the right thing most of the time, your components can be simpler because they can pretend edge-cases don't exist or don't matter. For users, the default “happy path” can be simpler because they don’t have to cater to those edge-cases except when they happen and they either remember to insert that extra test for the unhappy path, or are left cleaning up the mess afterwards. And if things do screw up, it’s easy to blame yourself because you forgot you needed a -print0 in there.

There is an obvious analogy to programming language type systems, or pure functions vs side-effects here. Feel free to print out this blog post and scribble one in the margins.

Also, Simplicity isn't simple, a coda.

Going Clear

  • 5:25 PM

One of the great unsolved mysteries of my life is: “Who was the utter tosspot who gave my name to the Church of Scientology?”

I was sixteen-turning-seventeen at the end of my last year of high school in Western Australia, and I got a phone call at home. According to their records, I had purchased a copy of Dianetics, and did I want to take some of my time to maybe get together and talk to them about it?

Even in 1992 this seemed like a particularly silly idea. I didn't really know much about the organisation, but I had walked past the Scientology centre in Perth any number of times. There were rumours of cultish brainwashing. Also, my brother owned a few of the Battlefield Earth books and he told me they were kind of shit.

So I told them a polite “thanks but no thanks”, then went to school and interrogated all of the members of my nerdy, Dungeons-and-Dragons-playing social circle to find out who had let curiosity get the better of them enough to buy a book in my name. I had my suspicions, but none of them fessed up.

From then on, contact from Scientology became a regular, but not overwhelmingly regular thing. Every few months I would get a letter here, a phone call there. On one memorable occasion they invited me to their Summer barbecue. I would politely ask them to stop contacting me. They would assure me I was missing out on something really great, then let me go until next time.

In my first year of University I found a copy of Dianetics in the UWA library. Somebody had neatly printed “This is bullshit” on the first page of the first chapter. I didn't make it much further into the book than that myself.

Not long after, I discovered the Internet.

Reports these days will attribute the Internet’s awareness of, and activism against, Scientology to Anonymous, but that’s just good marketing on anon’s part. Operation Clambake published the now-infamous Xenu documents all the way back in 1996, a direct consequence of the attention drawn to the organisation in 1995 by the death of Lisa McPherson, and the Church’s clumsy attempt to remove the alt.religion.scientology Usenet newsgroup.

As a Law student at the time, the leaked OT-III documents were one of my favourite legal Catch-22s. In order to make a copyright claim to suppress the documents’ publication, the Church of Scientology had to attest legally they were legitimate, and thus verify the Xenu thing was real.

So after a few years of occasional but annoying contact, I wrote a two page letter explaining in detail what I had discovered while investigating their religion, and why there wasn’t even the slightest chance I would show up to their barbecue even if they did have really tasty sausages. This being the early 90s I printed the letter out, put it in an envelope, put a stamp on it and walked down to the shops to put it in a mailbox.

And that was the last contact I had with Scientology.

Even in the late 90s owning your own domain was something of a novelty. I used it everywhere, even when I posted to Usenet. Which was probably a bad idea since I can also now count eighteen years during which my primary email address has been passed from spammer to spammer like an increasingly quaint heirloom.

At the time I had also discovered that Telstra offered wholesale dialup Internet. For the low entry fee of pretending to be a real company, you got a 24/7 dialup connection with as large a statically-routed IP block as you could justify by emailing them to tell them what you were planning on using the addresses for (a /29 was more than enough for me). The bargain rate was the same they charged ISPs for backbone traffic: 19c per megabyte downstream.

Yes, in modern terms that's a lot. For an Australian student in the 90s who was mostly using the connection for IRC and Usenet, and whose primary browser was lynx, it was actually pretty cheap. And it meant on months I was broke I could get away with barely paying anything. Also, because it was a dedicated connection there was zero modem contention and you never got kicked off.

Around the same time I was also volunteering on the K:Line desk of an IRC network, and even as late in the history of the Internet as 1997 there was still the assumption that when you sent an email to the abuse@ address of a domain, a human being would read it and at least tell you why you weren't important enough to care about.

Anyway, during that time I annoyed someone on Usenet enough that they decided to email my Internet Provider to complain about my terrible behaviour. By the above-explained strange twist of fate, that Internet Provider was me.

Sadly I don't have a copy of my response, but from memory it was something like this.

To: complainer@some.isp.example.com
From: abuse@pastiche.org
Subject: Your complaint.

This is in regards to your contact about the behaviour of one of our subscribers on Usenet newsgroup alt.irc.

I would like to reassure you that we consider it important to be good Internet citizens, and take complaints about our subscribers seriously. I would like to thank you for reporting this incident to us.

Having reviewed the material of your complaint, we agree that this behaviour should not be tolerated.

We have identified the user responsible, and had him taken out and shot.

Sincerely,

The Pastiche Abuse Team.

The complainer, sadly, didn't reply.

There's been an awful lot of discussion the last few months around how much the Apple Watch Edition would cost, who would possibly buy an expensive piece of jewelry that would likely be obsolete in a year, and what clever trade-in or upgrade schemes Apple might put in place to make it worthwhile.

Today we know for sure that the answer to the first question is “US$10,000-$17,000, or possibly higher for models not yet on the Apple Store.”

I'd like to suggest the answer to the third question doesn't matter, because the answer to the second question is “people who don’t care about the third question.”

I think it was Gruber who pointed out that one of the interesting things about Apple products for us upper middle-class consumers is that the iPhone we buy is exactly the same one Beyoncé has. No matter how successful she is, she can't get a better iPhone than us.

Empowering for us, kind of annoying for Beyoncé.

At the time I'm writing this, Apple is the most profitable company in the history of everything. Selling luxury watches to that small population of people who can pay for luxury watches isn't going to make that needle move noticeably. Rolex's annual revenue is around $4.5bn. That's less than 10% of what Apple is expecting to pull in this quarter.

Apple isn't modeling their business on Rolex selling watches that will last a lifetime. They are looking at Armani selling couture that will be deliberately outmoded by next season’s fashions. And they are looking at it for exactly the same reason Armani does.

(Armani's annual revenue is a lot less than Rolex's, which is where the analogy falls down a little, but I think the general point is still valid. :) )

Armani Exchange prices its dresses in the low hundreds of dollars. Emporio Armani or Armani Collezioni dresses start to reach the thousands. But if you want something like the Armani gown Cate Blanchett wore to the Oscars, you might be talking six figures.

The number of people who are going to buy a $100,000 dress is vanishingly small (and I'm assuming Blanchett was, like most Oscar attendees, lent hers as a promotion). If you built a business making $100,000 dresses, you are likely to be a cottage industry dependent on the fickle fancies of the very rich. But if you're in the business of selling $100 and $1000 dresses, having a $100,000 dress draped over a famous actress lends an enormous amount of prestige to your brand, and makes people start to think that maybe your $1000 dresses are kind of a bargain?

Apple already knows that with the right tech, it can make hundreds of billions of dollars selling $500-$1000 gadgets with a 24 month life-span. The difference is the iPhone models were differentiated by technical specs, not by appearance. If Apple had just released one watch that was priced around $400, and another that was closer to $1000, with identical specs and different materials, customers would have really wondered whether twice as much money was worth it just for a slightly prettier watch.

Through the simple existence of the Apple Watch Edition, the regular Apple Watch stops being the few-hundred-dollars more expensive model of the entry-level Apple Watch Sport. It becomes the one-tenth-the-price model of the Edition. When the Jay-Z version of a product is $10,000, and you know that you're getting exactly the same functionality as Jay-Z just in steel or aluminium instead of gold, the psychological effect is you're now deciding in terms of the—much less immediately grokkable—difference between 1/10th of the price and 1/20th.

If the Apple Watch is a product people want, and we don't yet know it will be, Apple are going to sell a lot of Editions. The world contains a lot of people who want to express their wealth by buying a better version of the thing everyone else is buying, and the Edition speaks to those consumers. But when it comes to their bottom line, Apple is going to make a whole lot more money from consumers who, just because the Edition exists, unconsciously see the stainless steel Apple Watch as the middle choice.

Vagrant-HOWTO

  • 7:25 PM

  1. Use Google (or just search GitHub directly) to find the Vagrant recipe for the thing you want to install.
  2. Clone the recipe from the only available tag: master.
  3. Run vagrant up
  4. Upgrade Vagrant, because it's been a few months since you did that last and nothing works any more.
  5. Upgrade VirtualBox, because that's a few months old too, and you know there's no chance the newest version of Vagrant will work with something so hideously ancient.
  6. Run vagrant up
  7. Return to Google to find plausible substitutes for the dependencies that have vanished off the Internet since this recipe was written.
  8. Cross fingers.
  9. Run vagrant up
  10. Return to Google to find the extra plugins you need to run this recipe.
  11. Follow the series of redirects to find where said plugins are hosted these days.
  12. Install plugins.
  13. Run vagrant up
  14. Run VAGRANT_LOG=debug vagrant up
  15. Hunt your way up through the pile of Ruby stack traces until you find the "undefined method" error.
  16. Sob quietly into your cup of tea.