Defining Metadata

by Charles Miller on May 26, 2003

I recently posted this to the XP mailing-list. It's pretty basic stuff, but I figured I'd put it here in case I needed to find it again later.

banshee858 propagated the following meme:

Suppose I am driving my car to work and I am stopped at an itersection[sic]. My location, that is the names of the two streets is data. However, the metadata to my location could be the year of my car, name of the car, how many people are in the car and/or their names.

No.

Metadata is “data about data”.

For example: "2" is data. "The number of people in the car" is metadata. It provides meaning to the number "2" and places it in some kind of context.

The confusion comes from the fact that “metadata + data == data”.

“There are 2 people in the car” is a single piece of data that is self-describing because it combines a piece of data (2) with a piece of metadata (this is the number of people in the car). Together, though, it becomes data again.
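The same idea as a rough Python sketch. The `Described` class and the second, outer description are invented here purely for illustration; the point is only that pairing data with metadata yields something you can treat as data again.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Described:
    """Hypothetical wrapper: a value paired with the metadata that explains it."""
    meaning: str  # the metadata
    value: Any    # the data


# On its own, 2 is data with no context.
count = 2

# Data plus metadata: a self-describing fact.
fact = Described(meaning="number of people in the car", value=count)

# The fact is data again, so it can be described in turn.
layered = Described(meaning="observation made at the intersection", value=fact)

print(layered.value.meaning, "=", layered.value.value)
```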

Thus, you can layer metadata miles high. A Java method is a whole heap of data and metadata bundled together in such a way that it instructs the computer to do something. Then, with a tool like XDoclet, I can add some further metadata to this method to say “this method is part of the remote interface of the FooEnterpriseBean”. That's metadata again.
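This is not XDoclet, which works through Javadoc tags on Java source, but a loose Python analogy may make the layering concrete. The `remote_interface` decorator and the `get_balance` function are hypothetical names made up for this sketch; the decorator does nothing except attach one more piece of metadata to an existing function.

```python
def remote_interface(bean_name):
    """Hypothetical decorator: record which bean a function belongs to."""
    def tag(func):
        # The function itself is unchanged; we only hang metadata on it.
        func.remote_interface_of = bean_name
        return func
    return tag


@remote_interface("FooEnterpriseBean")
def get_balance(account_id):
    """Placeholder business method, invented for the example."""
    return 0


# A tool can now read the extra layer without running the method.
print(get_balance.__name__, "->", get_balance.remote_interface_of)
```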

Lisp exploits this layering of data and metadata by treating code as data (and vice versa). This way, at the most basic level, you can take advantage of the abstraction-building techniques of combining data with metadata, and layering the whole thing up into a program.
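Python is not Lisp, but a toy evaluator gives a feel for what “code as data” means. In this sketch, which assumes nothing beyond the standard library, a program is an ordinary nested list; `evaluate` and `OPS` are names invented here. Because the program is plain data, another piece of code can inspect or rewrite it before it runs.

```python
import operator
from functools import reduce

# Hypothetical operator table for the toy language.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested list of the form [operator, arg, arg, ...]."""
    if isinstance(expr, list):
        op, *args = expr
        return reduce(OPS[op], [evaluate(arg) for arg in args])
    return expr  # a bare number evaluates to itself


# The program is just a list: (+ 1 (* 2 3)) in Lisp notation.
program = ["+", 1, ["*", 2, 3]]

# Since it is data, we can rewrite it: swap the outermost operator.
rewritten = ["*"] + program[1:]

print(evaluate(program), evaluate(rewritten))
```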

It's a very pure form of program-building, and helps explain why Lisp programmers are even worse than Smalltalk programmers in the “Why the Hell did my perfect language fail?” department.

XML was also designed to be metadata-rich and self-describing, although most people actually producing XML ignore this (as anyone who has had the pleasure of doing a CVS merge on the XML generated by WebSphere Studio will attest). That self-describing design is also why most XML-based programming languages end up looking like a poor imitation of Lisp.
