Hi,

I wonder, is there any interest in the following:

- support for http://www.loc.gov/standards/mods/ as basic bibl format
- provide converters from marcs and bibtex to mods
- layer the bib module on top of that

If so, who'd like to join/volunteer for subtasks

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
Hi! Hans Hagen wrote:
- support for http://www.loc.gov/standards/mods/ as basic bibl format - provide converters from marcs and bibtex to mods - layer the bib module on top of that
There was a (short) discussion about this in the thread "croffref in bibtex" from 2006-03-23 onward; see especially Bruce D'Arcus' contribution: http://www.ntg.nl/pipermail/ntg-context/2006/017019.html

Pro
+ having a standard bib format for ConTeXt
+ MODS is a standard format for bibliographers
+ it is XML-based
+ BibTeX conversion already exists (at least on Unix) via bibutils: http://www.scripps.edu/~cdputnam/software/bibutils/bibutils2.html

Con
- it seems that MODS is quite complex (= a lot of work)

See also Bruce's comment. IMHO Bruce's own RDF-based approach looks rather promising, too; see http://xbiblio.sourceforge.net/csl/

Perhaps one should contact him to check the project's progress.

My thoughts.

Cheers
Ulf
Hi Hans,
Hans Hagen wrote:
I wonder, is there any interest in the following:
- support for http://www.loc.gov/standards/mods/ as basic bibl format
I think Ulf's conclusions are right. MODS is expressive, which is why I was originally attracted to it, but it's also more complex than it needs to be for this sort of use case.

The big question becomes, if not MODS, then what? As Ulf pointed out, my solution -- and the one I will be advocating for OpenDocument (I am on the TC) -- is to use a particular RDF serialization. Indeed, I have a draft RELAX NG schema for it, and my formatting system (citeproc) now works with it quite well.

Microsoft, incidentally, is implementing pretty good bib support (that looks suspiciously like what I've been advocating for OpenOffice!), which I've blogged about extensively. Their XML format is not bad, though it is totally flat, which means it won't be as flexible as MODS or RDF. More here: http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2006/06/16/flat-vs-...
- provide converters from marcs and bibtex to mods - layer the bib module on top of that
Curious question: would you be writing it in Lua (closer to the pdftex level), or go more high-level (as now)?
If so, who'd like to join/volunteer for subtasks
I can certainly help with advice and design, particularly if you want to use CSL to configure the output. I've made some changes to it (again) recently, but think I'm zeroing in on freezing it. The more feedback I get, the easier it'll be to do that. Incidentally, I'm considering the possibility of submitting CSL to OASIS for standardization, though only if I can get some industry players involved. Bruce
Bruce D'Arcus wrote:
I think Ulf's conclusions are right. MODS is expressive, which is why I was originally attracted to it, but it's also more complex than it needs to be for this sort of use case.
since we're talking databases here, i think the focus should be on what kind of (intermediate) format suits typesetting best (could be different from the database structure)
The big question becomes, if not MODS, then what? As Ulf pointed out, my solution -- and the one I will be advocating for OpenDocument (I am on the TC) -- is to use a particular RDF serialization. Indeed, I have a draft RELAX NG schema for it, and my formatting system (citeproc) now works with it quite well.
but that's still the database part of it, isn't it ( i never really used rdf - only looked at it)
Microsoft, incidentally, is implementing pretty good bib support (that looks suspiciously like what I've been advocating for OpenOffice!), which I've blogged about extensively. Their XML format is not bad, though it is totally flat, which means it won't be as flexible as MODS or RDF. More here:
http://netapps.muohio.edu/blogs/darcusb/darcusb/archives/2006/06/16/flat-vs-...
- provide converters from marcs and bibtex to mods - layer the bib module on top of that
Curious question: would you be writing it in Lua (closer to the pdftex level), or go more high-level (as now)?
i dunno, some manipulations now done in tex can better be done in lua, so i can imagine something

- interpret xml (tex)
- manipulate data (lua and tex)
- typeset results (tex)

a practical approach would be to start with a copy of m-bib (maybe a pet project of taco and me) and see where we end up
If so, who'd like to join/volunteer for subtasks
I can certainly help with advice and design, particularly if you want to use CSL to configure the output. I've made some changes to it (again) recently, but think I'm zeroing in on freezing it. The more feedback I get, the easier it'll be to do that.
so csl is the formatting spec thing (the oo site is not that informative; i always like to *see* code -)

Hans
Hans Hagen wrote:
since we're talking databases here, i think the focus should be on what kind of (intermediate) format suits typesetting best (could be different from the database structure)
There may be trade-offs, but if you get too typesetting-oriented, you cause other problems. [...]
but that's still the database part of it, isn't it ( i never really used rdf - only looked at it)
An example description:

    <Paper rdf:about="http://ex.net/1">
      <title>Paper</title>
      <author>
        <Person>
          <givenName>John</givenName>
          <familyName>Smith</familyName>
        </Person>
      </author>
      <presentedAt>
        <Conference>
          <title>ABC Conference</title>
          <endDate>2004-03-12</endDate>
          <startDate>2004-03-15</startDate>
        </Conference>
      </presentedAt>
    </Paper>

This is indeed more "database-oriented", complete with the capability to normalize all of that by breaking out persons and such into separate descriptions and linking them. But it's not hard to process for formatting either (easier than MODS I think).
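To illustrate the claim that such a description is not hard to process for formatting, here is a minimal Python sketch that reads the example and prints a one-line reference. It is not citeproc and not part of the thread: the `rdf` namespace declaration was added so the snippet parses (the original omits it), and the function name `format_paper` and the output style are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The description from the message. The xmlns:rdf declaration is an
# assumption added here so that the rdf:about attribute can be parsed.
RECORD = f"""
<Paper xmlns:rdf="{RDF_NS}" rdf:about="http://ex.net/1">
  <title>Paper</title>
  <author>
    <Person>
      <givenName>John</givenName>
      <familyName>Smith</familyName>
    </Person>
  </author>
  <presentedAt>
    <Conference>
      <title>ABC Conference</title>
      <endDate>2004-03-12</endDate>
      <startDate>2004-03-15</startDate>
    </Conference>
  </presentedAt>
</Paper>
"""


def format_paper(xml_text):
    """Return a one-line reference for a Paper description (hypothetical style)."""
    paper = ET.fromstring(xml_text.strip())
    # Each author is a nested Person element, so names stay structured.
    authors = [
        f"{person.findtext('familyName')}, {person.findtext('givenName')}"
        for person in paper.findall("author/Person")
    ]
    title = paper.findtext("title")
    venue = paper.findtext("presentedAt/Conference/title")
    return f"{'; '.join(authors)}. {title}. Presented at {venue}."


if __name__ == "__main__":
    print(format_paper(RECORD))
```

Because the nesting mirrors the bibliographic relations (paper, person, conference), each piece is reachable with a short path expression, which is roughly what a formatter needs.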
so csl is the formatting spec thing (the oo site is not that informative; i always like to *see* code -)
Correct. Subversion repo for the entire project (including schema for CSL, and the start of a Ruby port of citeproc) is here: http://sourceforge.net/svn/?group_id=117435 Bruce
Hi again folks, IMHO a good, flexible bibliographic format that plays well with the other strengths of ConTeXt (e.g. XML support) could be sort of a killer feature... Bruce D'Arcus wrote:
The big question becomes, if not MODS, then what? As Ulf pointed out, my solution -- and the one I will be advocating for OpenDocument (I am on the TC) -- is to use a particular RDF serialization. Indeed, I have a draft RELAX NG schema for it, and my formatting system (citeproc) now works with it quite well. ... I can certainly help with advice and design, particularly if you want to use CSL to configure the output. I've made some changes to it (again) recently, but think I'm zeroing in on freezing it. The more feedback I get, the easier it'll be to do that.
I think the crucial point for any TeX community is the ability to use the rather huge amount of BibTeX legacy DBs. How about the state of CSL (or RDF) to BibTeX converters? bibutils uses MODS as its native intermediate format and converts from and to BibTeX (not always 100% correct, though).

Summary
-------

So, at present we already have:

(1) MODS <-(bibutils)-> BibTeX -(bibmod)-> ConTeXt

For an XML-based format in a ConTeXt context we would like to have:

(2) BibTeX <-(a)-> XML -(b)-> ConTeXt

using the rather nice XML processing capabilities of ConTeXt for step (b). Now, there is an XML markup for BibTeX, BibTeXML: http://bibtexml.sourceforge.net/

This isn't too bad, in my experience (it is, at least, lossless, contrary to bibutils). Thus

(3) BibTeX <-(bibtexml)-> BibTeXML -(b')-> ConTeXt

would be an instance of (2). CSL could use an XSL transformer:

(4) BibTeXML <-(XSLT)-> CSL -(b")-> ConTeXt

Bye
Ulf
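Step (a) in pipeline (2), turning a BibTeX entry into XML, can be sketched roughly as follows in Python. This is a deliberately naive illustration, not the bibtexml tool or the BibTeXML schema: the element names (`entry`, one child per field) are made up, and the regular expressions only handle flat `field = {value}` or `field = "value"` pairs without nested braces, string macros, or concatenation.

```python
import re
import xml.etree.ElementTree as ET

# Matches "@type{key, ...}" for a single entry (naive: no @string, no comments).
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*)\}\s*$", re.DOTALL)
# Matches 'name = {value}' or 'name = "value"' without nested braces.
FIELD_RE = re.compile(r"(\w+)\s*=\s*(?:\{([^{}]*)\}|\"([^\"]*)\")")


def bibtex_to_xml(entry_text):
    """Convert one simple BibTeX entry into a hypothetical XML element."""
    match = ENTRY_RE.match(entry_text.strip())
    if match is None:
        raise ValueError("not a simple BibTeX entry")
    entry_type, key, body = match.groups()
    entry = ET.Element("entry", {"id": key, "type": entry_type.lower()})
    for name, braced, quoted in FIELD_RE.findall(body):
        field = ET.SubElement(entry, name.lower())
        # Exactly one of the two value groups is non-empty for a match.
        field.text = braced if braced else quoted
    return entry


SAMPLE = """@article{smith2004,
  author = {Smith, John},
  title = {Paper},
  year = "2004"
}"""

if __name__ == "__main__":
    print(ET.tostring(bibtex_to_xml(SAMPLE), encoding="unicode"))
```

A real converter would need a proper BibTeX parser; the point here is only that the XML side of step (a) is a thin layer once the entry has been split into fields.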
Ulf Martin wrote:
I think the crucial point for any TeX community is the ability to use the rather huge amount of BibTeX legacy DBs.
How about the state of CSL (or RDF) to BibTeX converters?
I don't care about BibTeX myself, so such things aren't my focus. However, I think a good XML/RDF data format makes it pretty easy to downconvert to formats like BibTeX. Indeed, it took me 30 minutes or so to write a decent XSLT to convert MODS to the RDF/XML I'm using. That was only targeted at book descriptions, so it would take more time for a comprehensive version, but it shows it's not hard. The hard part, in fact, is the logic for conversion, and most of that is clearly documented in the bibutils source code.
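As a rough illustration of what "downconverting" a structured description to BibTeX might involve, the Python sketch below maps a `Paper` description like the one shown earlier in the thread to an `@inproceedings` entry. Bruce's own conversion was written in XSLT and went from MODS to RDF/XML; this sketch goes the other way and is only an assumption about how such a mapping could look. The field mapping, the function name `paper_to_bibtex`, and the citation key are hypothetical.

```python
import xml.etree.ElementTree as ET

# A namespace-free variant of the thread's example description (assumption:
# the rdf:about attribute is dropped to keep the sketch short).
SAMPLE = """
<Paper>
  <title>Paper</title>
  <author>
    <Person>
      <givenName>John</givenName>
      <familyName>Smith</familyName>
    </Person>
  </author>
  <presentedAt>
    <Conference>
      <title>ABC Conference</title>
    </Conference>
  </presentedAt>
</Paper>
"""


def paper_to_bibtex(xml_text, key):
    """Flatten a nested Paper description into a BibTeX entry string."""
    paper = ET.fromstring(xml_text.strip())
    # BibTeX joins multiple authors with " and ".
    authors = " and ".join(
        f"{person.findtext('familyName')}, {person.findtext('givenName')}"
        for person in paper.findall("author/Person")
    )
    fields = {
        "author": authors,
        "title": paper.findtext("title"),
        "booktitle": paper.findtext("presentedAt/Conference/title"),
    }
    # Skip fields that are missing in the source description.
    lines = [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    return "@inproceedings{" + key + ",\n" + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    print(paper_to_bibtex(SAMPLE, "smith2004"))
```

The conversion loses structure (the conference becomes a plain `booktitle` string), which is the sense in which going from a richer format to BibTeX is the easy direction.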
bibutils uses MODS as its native intermediate format and converts from and to BibTeX (not always 100% correct, though).
Correct, though it's actually more complicated than that. It uses a C-based internal format that is based on lessons from MODS and from converting the other legacy formats.
Summary -------
So, at present we already have:
(1) MODS <-(bibutils)-> BibTeX -(bibmod)-> ConTeXt
For an XML-based format in a ConTeXt context we would like to have:
(2) BibTeX <-(a)-> XML -(b)-> ConTeXt
*We* wouldn't include me. I deal much more with RIS or Endnote formats than I do with BibTeX. But I don't use ConTeXt for authoring either ;-)
using the rather nice XML processing capabilities of ConTeXt for step (b).
Now, there is an XML markup for BibTeX: BibTeXML http://bibtexml.sourceforge.net/ This isn't too bad, in my experience (it is, at least, lossless, contrary to bibutils). Thus
(3) BibTeX <-(bibtexml)-> BibTeXML -(b')-> ConTeXt
would be an instance of (2).
Yes, but BibTeXML still has all the problems of the BibTeX model.
CSL could use XSL transformer:
(4) BibTeXML <-(XSLT)-> CSL -(b")-> ConTeXt
All CSL is is a language-agnostic XML config language. You could write a CSL engine in whatever language you want: TeX, Lua, Perl, Ruby, C. *I* wrote mine in XSLT 2.0, but that's mostly because of limited skills with other languages.

I also designed citeproc, BTW, to have both an input and output driver system. So while I use an RDF/XML representation internally, it wouldn't be too hard to write other input drivers. A next-generation mbib module probably ought to do the same, so that while it might have a richer core format, it could still be fed BibTeX, or even MODS.

Bruce
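The input/output driver idea can be sketched in Python along these lines. This is not how citeproc is implemented; it only shows the general pattern of a core record format with pluggable readers and writers. The registry names, the tab-separated toy input format, and the two output styles are all assumptions made for the example.

```python
import html
from typing import Callable, Dict, List

Record = Dict[str, str]  # hypothetical core format: author, title, year

INPUT_DRIVERS: Dict[str, Callable[[str], List[Record]]] = {}
OUTPUT_DRIVERS: Dict[str, Callable[[List[Record]], str]] = {}


def input_driver(name):
    """Decorator that registers a reader under the given format name."""
    def register(func):
        INPUT_DRIVERS[name] = func
        return func
    return register


def output_driver(name):
    """Decorator that registers a writer under the given format name."""
    def register(func):
        OUTPUT_DRIVERS[name] = func
        return func
    return register


@input_driver("tsv")
def read_tsv(text):
    # Toy input format: author<TAB>title<TAB>year, one record per line.
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        author, title, year = line.split("\t")
        records.append({"author": author, "title": title, "year": year})
    return records


@output_driver("plain")
def write_plain(records):
    return "\n".join(f"{r['author']} ({r['year']}). {r['title']}." for r in records)


@output_driver("html")
def write_html(records):
    return "\n".join(
        f"<p>{html.escape(r['author'])} ({html.escape(r['year'])}). "
        f"<i>{html.escape(r['title'])}</i>.</p>"
        for r in records
    )


def convert(text, source, target):
    """Read with the named input driver, write with the named output driver."""
    return OUTPUT_DRIVERS[target](INPUT_DRIVERS[source](text))


if __name__ == "__main__":
    print(convert("Smith, John\tPaper\t2004", "tsv", "plain"))
```

Adding BibTeX or MODS support would then mean registering one more input driver, with no change to the formatting side.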
Participants: Bruce D'Arcus, Hans Hagen, Ulf Martin