Taco Hoekwater
Hi,
Andreas Wagner wrote:
Just out of curiosity: What are your reasons for preferring this over TEI?
MODS was a logical choice, mostly because of my background (scientific publishers => MARC databases => MODS), and because Bruce D'Arcus liked it. Btw, his blog is full of bibliographic articles, if you are interested:
http://community.muohio.edu/blogs/darcusb/
(but it looks like he has switched over to RDF now)
Yes, but ...
I am not really set on any particular XML format, and there are more mainstream choices (risx comes to mind).
... I'd say for the design of something like mbib v2 I'd advocate an internal model that abstracts away from any particular concrete representation. So think in terms of maybe one standard input driver, but leave room for easy development of others. There's some work going on for a Python version of my citeproc effort, for example, and its author is planning input drivers for MODS, RDF, BibTeX, etc.
http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/citeproc-py/citeproc/
This makes it easy for someone to write another input driver for some SQL model.
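To make the "internal model plus input drivers" idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Reference` fields, the `input_driver` registry, and the toy tab-separated format are hypothetical and are not taken from mbib or citeproc-py.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Reference:
    """Format-neutral record; the field names are illustrative only."""
    title: str
    authors: List[str] = field(default_factory=list)
    year: str = ""


# Registry mapping a format name to a function that parses raw text
# into a list of Reference objects.
DRIVERS: Dict[str, Callable[[str], List[Reference]]] = {}


def input_driver(name):
    """Decorator that registers a parser under the given format name."""
    def register(func):
        DRIVERS[name] = func
        return func
    return register


@input_driver("tsv")
def read_tsv(text):
    """Toy driver: one record per line, 'title<TAB>author; author<TAB>year'."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        title, authors, year = line.split("\t")
        names = [a.strip() for a in authors.split(";")]
        records.append(Reference(title, names, year))
    return records


def load(fmt, text):
    """Parse text with whichever driver is registered for fmt."""
    return DRIVERS[fmt](text)


# Made-up sample data, only to exercise the driver.
refs = load("tsv", "Some Title\tLucy Allen Paton; Another Author\t1900")
print(refs[0].authors)
```

The rest of the code would only ever see `Reference` objects, so supporting MODS, BibTeX, or an SQL model would mean registering one more function rather than touching the formatter.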
But the few times I've had to work with TEI stuff I found that you can easily get much more than you bargained for. Bibliographic data is not easy on its own, and a format that allows (almost promotes) extra tags to be embedded also is not helping at all.
Look at this:
http://www.tei-c.org/release/doc/tei-p5-doc/html/ref-author.html
Just the 'core' module is already pretty complex, but 'namesdates' and 'linking' are definitely also required for a useful bibliographic database.
The nice, concise examples in the TEI docs are misleading because
<author>Lucy Allen Paton</author>
is useless; more specifics are needed. We need at least this:
<author>
  <persName>
    <forename>Lucy</forename>
    <forename>Allen</forename>
    <surname>Paton</surname>
  </persName>
</author>
But with the use of <persName>, there are suddenly a gazillion ways an author can encode the same name (and it does not preclude any of the other ways to encode a name).
http://www.tei-c.org/release/doc/tei-p5-doc/html/ND.html#NDPER
Etc. etc. Imagine having to support that in a simple context module.
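To give a feel for the support burden, here is a rough Python sketch that handles only the two encodings quoted above. It assumes un-namespaced elements and ignores every other persName variant; the function name `author_parts` is hypothetical, and the last-word-is-the-surname fallback is a guess, which is exactly why the plain form is not enough.

```python
import xml.etree.ElementTree as ET


def author_parts(xml_text):
    """Return (forenames, surname) from a TEI-like <author> element.

    Covers only a direct <persName> child with <forename>/<surname>
    children, or plain text content. Namespaces and all other
    encodings are deliberately ignored.
    """
    author = ET.fromstring(xml_text)
    pers = author.find("persName")
    if pers is not None:
        forenames = [f.text.strip() for f in pers.findall("forename") if f.text]
        surname = (pers.findtext("surname") or "").strip()
        return forenames, surname
    # Plain text: guess that the last word is the surname (unreliable).
    words = (author.text or "").split()
    return words[:-1], (words[-1] if words else "")


plain = "<author>Lucy Allen Paton</author>"
structured = (
    "<author><persName>"
    "<forename>Lucy</forename><forename>Allen</forename>"
    "<surname>Paton</surname>"
    "</persName></author>"
)
print(author_parts(plain))
print(author_parts(structured))
```

Each additional way of encoding a name would need another branch like these, which is the maintenance cost being described.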
In the XML citation style language I designed [1] (which *could* serve as the basis for that "internal model" I mention above), there's an implicit notion that any name can have both a sort form and a display form, and that they may (but in contexts like Eastern Europe or Asia often don't) differ. This makes things in many ways both simpler and more general: it works for organizations, and it is more international-friendly than the traditional first/last split. You just handle the details you note above in the input driver code.
Bruce
[1] http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/csl/schema/trunk/csl.rnc?v...
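The sort-form/display-form idea from the message above could look roughly like this in Python. This is a sketch under stated assumptions, not the CSL schema: the `Name` class, its `sort_key` property, and the `from_parts` helper are hypothetical names, and "Example Press" is a made-up organization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Name:
    """A name as the formatter sees it: a display form plus an optional sort form."""
    display: str
    sort: Optional[str] = None  # None means "sort by the display form"

    @property
    def sort_key(self):
        chosen = self.sort if self.sort is not None else self.display
        return chosen.casefold()


def from_parts(forenames, surname):
    """What an input driver might do with forename/surname markup."""
    given = " ".join(forenames)
    return Name(
        display=f"{given} {surname}".strip(),
        sort=f"{surname}, {given}".strip(", "),
    )


names = [
    from_parts(["Lucy", "Allen"], "Paton"),
    Name("Example Press"),  # organization: one form serves both purposes
]
for name in sorted(names, key=lambda n: n.sort_key):
    print(name.display)
```

The formatter never needs to know whether a name came from forename/surname elements or from a single string; that decision stays in the input driver.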