Last night I started putting some effort into figuring out how to store data for the birds database that will be on-line eventually. I got perhaps 10% of the job done, but there were a couple of important things I realised before I got too far.

The first is that there is a heck of a lot of data that I want to keep. The usual bird data like common name and species, the specialised data like a unique ID for each bird (often used in bird counts, for example), the official info for each bird (plus references and sources), my own observations on the bird, good web links about the bird… This smells like a job for XML.
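As a rough sketch of what one record might look like, here is a small example. Every element and attribute name in it (`bird`, `common-name`, `official`, `observations`, `links`) is an assumption made up for illustration, not a settled format, and the source, note and link contents are placeholders. Python's standard `xml.etree.ElementTree` stands in here for whichever XML API ends up being used.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: all tag and attribute names are assumptions.
BIRD_XML = """\
<bird id="B0001">
  <common-name>American Robin</common-name>
  <species>Turdus migratorius</species>
  <official>
    <source>placeholder reference</source>
  </official>
  <observations>
    <note>placeholder personal observation</note>
  </observations>
  <links>
    <link href="http://example.org/robin">placeholder link</link>
  </links>
</bird>
"""


def parse_bird(xml_text):
    """Pull the fields of one hypothetical <bird> record into a dict."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "common_name": root.findtext("common-name"),
        "species": root.findtext("species"),
        "sources": [s.text for s in root.findall("official/source")],
        "notes": [n.text for n in root.findall("observations/note")],
        "links": [link.get("href") for link in root.findall("links/link")],
    }


bird = parse_bird(BIRD_XML)
print(bird["common_name"], "-", bird["species"])
```

Keeping the unique ID as an attribute and the free-form material (sources, notes, links) in repeatable child elements is only one possible split; the point is that each kind of data gets its own clearly named slot.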

In the “bad old days” I knew a bit of XSL and XSLT, but I have no intention of brushing the cobwebs off those skills. Even though I try to keep everything XHTML 1.0 compliant, I still think I can do more in less time with Perl scripts and the regular XML APIs than I could with XSL.

The other arguments in favour of XML in this case are that I can define a DTD or schema and use a validator to ensure that my data files are valid, and that I can easily transform the data from one format to another as my needs change. This doesn’t save me all of the up-front work of figuring out an appropriate format for my data, but it does mean that changing the format down the road will be easier.
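To give a feel for the "transform one format to another" part, here is a minimal sketch that turns an XML record into JSON using only the Python standard library. The record layout and the conversion conventions (attributes prefixed with `@`, repeated children collected into lists, a `#text` key for leaf text alongside attributes) are assumptions chosen for illustration, not an established mapping.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an element into plain dicts, lists and strings."""
    # Attributes get an "@" prefix so they cannot clash with child tag names.
    node = {"@" + name: value for name, value in elem.attrib.items()}
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if not node:
            return text  # a leaf without attributes collapses to its text
        node["#text"] = text
        return node
    for child in children:
        # Each child tag maps to a list, so repeated elements are preserved.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


# Hypothetical record; the tag names are assumptions.
SAMPLE = (
    '<bird id="B0001">'
    "<common-name>American Robin</common-name>"
    "<species>Turdus migratorius</species>"
    "</bird>"
)

result = element_to_dict(ET.fromstring(SAMPLE))
as_json = json.dumps(result, indent=2)
print(as_json)
```

Because the conversion walks the tree generically, it keeps working when elements are added to the record later, which is the kind of flexibility the XML route is meant to buy.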

My next job is to figure out whether to use a DTD or a schema to describe the XML application for the data. I know DTDs much better, but namespaces could be useful, and since DTDs are not namespace-aware, a schema looks like the better fit if I want them. We’ll see.


