If you read my blog with any regularity, you'll know that I really like the Ruby programming language. It's the language I feel most fits the way my brain works, and when I'm writing Ruby code, I feel happier than when I'm writing code in other languages.
That said, I have some pretty serious doubts about the ability of the Ruby interpreter to do real work in its current form.
Take, for example, a 25MB XML file whose contents I wanted to investigate. I thought it would be cool to load it up inside Ruby, because then I could use the interpreter as an interactive shell to play around with the contents of the document.
First, here's my baseline: loading the file into a DOM Document object using dom4j on my 1.5GHz PowerBook:
    Epiphany:/tmp cmiller$ time java -Xmx256M DomTest

    real    0m19.413s
    user    0m17.270s
    sys     0m1.070s
Now, it's Ruby's turn, using REXML to load a DOM tree:
    Epiphany:/tmp cmiller$ ruby -v
    ruby 1.8.2 (2004-07-16) [powerpc-darwin]
    Epiphany:/tmp cmiller$ time ruby read.rb

    real    33m22.680s
    user    14m44.710s
    sys     1m3.670s
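For anyone who wants to try something similar, here is a minimal sketch of what a script like read.rb could look like. It is not the exact script timed above: the `load_dom` helper and the command-line handling are assumptions made for illustration. It relies only on `REXML::Document.new`, which parses the whole input and builds the tree in memory.

```ruby
require 'rexml/document'

# Hypothetical helper: parse the XML file at +path+ into an in-memory
# REXML DOM tree and return the REXML::Document.
def load_dom(path)
  File.open(path) do |file|
    REXML::Document.new(file)
  end
end

# Usage: ruby read.rb some-file.xml
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  doc = load_dom(ARGV[0])
  puts "Root element: #{doc.root.name}"
  puts "Children of root: #{doc.root.elements.size}"
end
```

Requiring the same file from irb and calling `load_dom` there gives an interactive session for poking at the tree.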
OK, that's a pretty huge difference. Going by user CPU time, Ruby is about a factor of fifty slower at parsing this XML, and by wall-clock time it's closer to a factor of a hundred. At first I thought this might just be REXML's fault. It admits only to being "reasonably fast" on its homepage. Maybe there's just some pathological algorithm going on inside that particular library.
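The size of the gap can be checked directly from the two `time` readings. The snippet below is just that arithmetic, with the timings copied from the runs above; the `seconds` helper is a made-up name for illustration.

```ruby
# Convert a "time" reading of minutes and seconds into seconds.
def seconds(minutes, secs)
  minutes * 60 + secs
end

# Timings copied from the Java and Ruby runs above.
java = { real: seconds(0, 19.413), user: seconds(0, 17.270) }
ruby = { real: seconds(33, 22.680), user: seconds(14, 44.710) }

user_ratio = ruby[:user] / java[:user]   # CPU time spent in user code
real_ratio = ruby[:real] / java[:real]   # elapsed wall-clock time

puts format('user CPU ratio:   %.1f', user_ratio)  # 51.2
puts format('wall-clock ratio: %.1f', real_ratio)  # 103.2
```

The wall-clock ratio includes whatever else the machine was doing at the time, so the user CPU ratio is probably the fairer comparison.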
Then, on a hunch, I turned off Ruby's garbage collection with GC.disable and ran the test again.
    Epiphany:/tmp cmiller$ time ruby read-nogc.rb

    real    45m20.425s
    user    5m52.830s
    sys     2m37.110s
The "real" time went up, partly because I was busy doing other stuff at the time, and partly because the amount of memory that was eaten up pushed my PowerBook into swap. The user CPU time, though, dropped from 14m44s to 5m52s. So the microbenchmark seems to show that in the first run, Ruby was spending more than half its CPU time (roughly 60% of user time) doing memory management.
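The same comparison can be made inside a single script. What follows is a rough sketch, not the read-nogc.rb used above: `sample_xml` and `time_parse` are hypothetical helpers, the document is synthetic, and it uses keyword arguments, so it needs a much newer Ruby than 1.8.2. Numbers from a current interpreter will not match the figures in this post.

```ruby
require 'rexml/document'

# Build a synthetic XML string with +count+ <item> elements (made-up data).
def sample_xml(count)
  items = (1..count).map { |i| %(<item id="#{i}">value #{i}</item>) }
  "<items>#{items.join}</items>"
end

# Return the user CPU seconds spent on one REXML parse of +xml+.
# With gc: false the collector is switched off for the parse only.
def time_parse(xml, gc: true)
  GC.start                      # begin each run from a freshly collected heap
  GC.disable unless gc
  started = Process.times.utime
  REXML::Document.new(xml)
  Process.times.utime - started
ensure
  GC.enable                     # always leave the collector switched back on
end

if __FILE__ == $PROGRAM_NAME
  xml = sample_xml(5_000)
  puts format('GC enabled:  %.2fs user', time_parse(xml, gc: true))
  puts format('GC disabled: %.2fs user', time_parse(xml, gc: false))
end
```

Measuring user CPU time rather than elapsed time keeps swapping and other activity on the machine out of the comparison, which is the same reason the "user" line is the interesting one in the runs above.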
Hopefully, this is something that will be fixed in Ruby 2.0. The plans are to move to bytecode compilation and generational GC, both steps in the right direction. For now, though, 2.0 is quite decidedly vaporware.
So yes. As much as I love the Ruby language, I'm not sure that I'd trust it with too much heavy lifting just yet.