Re: [NTG-context] Bibliographic Databases

20 Apr 2008

      Taco Hoekwater  writes:
...
Hi,
Andreas Wagner wrote:
...
Just out of curiosity: What are your reasons for preferring this over TEI:
MODS was a logical choice, mostly because of my background (scientific
publishers => MARC databases => MODS), and because Bruce D'Arcus liked it.  Btw,
his blog is full of bibliographic articles, if you are interested:
http://community.muohio.edu/blogs/darcusb/
(but it looks like he has switched over to RDF now)
Yes, but ...
...
I am not really set on any particular XML format, and there are
more mainstream choices (risx comes to mind).
... I'd say for the design of something like mbib v2 I'd advocate an internal
model that abstracts away from any particular more concrete representation. So
think in terms of maybe a standard input driver, but leave room for easy
development of others. 

Work is under way on a Python version of my citeproc effort, for example,
and its developer is planning input drivers for MODS, RDF, BibTeX, etc.

http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/citeproc-py/citeproc/

This makes it easy for someone to write another input driver for some SQL model.
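
As a rough sketch of that layering (not taken from citeproc-py or mbib; the
Reference record, the DRIVERS registry, and the toy "tsv" format are all
hypothetical names chosen for illustration), the internal model and the
pluggable input drivers could look something like this in Python:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Reference:
    """Format-neutral record; the field names are illustrative only."""
    key: str
    title: str
    authors: List[str] = field(default_factory=list)


# Registry mapping a format name to a function that turns raw text
# into a list of Reference objects.
DRIVERS: Dict[str, Callable[[str], List[Reference]]] = {}


def driver(fmt: str):
    """Decorator that registers an input driver under a format name."""
    def register(func):
        DRIVERS[fmt] = func
        return func
    return register


@driver("tsv")
def read_tsv(text: str) -> List[Reference]:
    """Toy driver: one record per line, 'key<TAB>title<TAB>author; author'."""
    refs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        key, title, authors = line.split("\t")
        names = [a.strip() for a in authors.split(";") if a.strip()]
        refs.append(Reference(key, title, names))
    return refs


def load(fmt: str, text: str) -> List[Reference]:
    """Dispatch to whichever driver is registered for the given format."""
    if fmt not in DRIVERS:
        raise ValueError(f"no input driver registered for {fmt!r}")
    return DRIVERS[fmt](text)
```

A driver for MODS, BibTeX, or an SQL model would then be one more function
registered under its own name, and the formatting code would only ever see
Reference objects.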
...
But the few times I've had to work with TEI stuff I found that you
can easily get much more than you bargained for. Bibliographic data
is not easy on its own, and a format that allows (almost promotes)
extra tags to be embedded also is not helping at all.
Look at this:
http://www.tei-c.org/release/doc/tei-p5-doc/html/ref-author.html
Just the 'core' module is already pretty complex, but 'namesdates'
and 'linking' are definitely also required for a useful bibliographic
database.
The nice, concise examples in the TEI docs are misleading because
<author>Lucy Allen Paton</author>
is useless, more specifics are needed. We need at least this:
<author>
  <persName>
    <forename>Lucy</forename>
    <forename>Allen</forename>
    <surname>Paton</surname>
  </persName>
</author>
But with the use of <persName>, there are suddenly a gazillion
ways an author can encode the same name  (and it does not
preclude any of the other ways to encode a name).
http://www.tei-c.org/release/doc/tei-p5-doc/html/ND.html#NDPER
Etc. etc. Imagine having to support that in a simple context module.
In the XML citation style language I designed [1] (which *could* serve as the
basis for that "internal model" I mention above), there's an implicit notion
that any name can have both a sort form and a display form, and that they may
(but in contexts like Eastern Europe or Asia often don't) differ. 

This makes things in many ways both simpler and more general (it works for
organizations, and is more international-friendly than the traditional
first/last split). You just handle the details you note above in the input
driver code.
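
A minimal sketch of that idea in Python, for what it's worth. This is not the
CSL schema: the Name class and name_from_persname function are hypothetical,
the element names are the un-namespaced ones from the example quoted above
(real TEI files normally carry the TEI namespace), and the "surname, forenames"
sort rule is just one possible convention.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Name:
    display: str  # form used when the name is printed
    sort: str     # form used when entries are ordered


def name_from_persname(xml_text: str) -> Name:
    """Reduce an <author> element to a display form and a sort form."""
    root = ET.fromstring(xml_text)
    pers = root if root.tag == "persName" else root.find(".//persName")
    if pers is None:
        # Unstructured name (or an organization): both forms are the same.
        text = " ".join("".join(root.itertext()).split())
        return Name(display=text, sort=text)
    forenames = [e.text.strip() for e in pers.findall("forename") if e.text]
    surname = (pers.findtext("surname") or "").strip()
    given = " ".join(forenames)
    display = " ".join(part for part in (given, surname) if part)
    sort = ", ".join(part for part in (surname, given) if part)
    return Name(display=display, sort=sort)


person = name_from_persname(
    "<author><persName><forename>Lucy</forename>"
    "<forename>Allen</forename><surname>Paton</surname></persName></author>"
)
print(person.display)
print(person.sort)
```

Everything TEI-specific stays inside the driver function; the rest of the code
only needs the two strings.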

Bruce

[1]
http://xbiblio.svn.sourceforge.net/viewvc/xbiblio/csl/schema/trunk/csl.rnc?v...

Bruce D'Arcus