Hi all,
xml processing is about the last part of my ConTeXt stuff where I haven't been able to switch to mkiv; I just can't get my head around it... 2 questions:
1. When I try to process an xml-file with my old (mkii) environments, the output looks OK, but I always get a first page with the xml version declaration
<?xml version="1.0" encoding="utf-8"?>
Is this a bug or a feature? Is there anything I can do to prevent it?
2. I'm really lost with the new xml mechanism. My first problem: In a message from September last year, Hans explained that the command to process xml files is:
The regular definitions still work but processing a file is done differently:
\xmlprocess{id}{filename}{optional initialization setup}
I used to have environments with which to typeset a bunch of files. How can this be ported to the new mechanism, which appears to expect a filename?
Sorry if these are very basic problems, but I'm probably a bit obtuse here.
All best
Thomas
On Sun, 16 Mar 2008 11:29:49 +0100 "Thomas A. Schmitz" wrote:
Hi all,
xml processing is about the last part of my ConTeXt stuff where I haven't been able to switch to mkiv; I just can't get my head around it... 2 questions:
1. When I try to process an xml-file with my old (mkii) environments, the output looks OK, but I always get a first page with the xml version declaration
<?xml version="1.0" encoding="utf-8"?>
Is this a bug or a feature? Is there anything I can do to prevent it?
Can you post an example? I never saw such an effect in my test files.
2. I'm really lost with the new xml mechanism. My first problem: In a message from September last year, Hans explained that the command to process xml files is:
The regular definitions still work but processing a file is done differently:
\xmlprocess{id}{filename}{optional initialization setup}
I used to have environments with which to typeset a bunch of files. How can this be ported to the new mechanism, which appears to expect a filename?
\xmlprocess{main}{filename.xml}{} works for me.
Sorry if these are very basic problems, but I'm probably a bit obtuse here.
You don't have to use the new xml mechanism from MkIV; the old code can still be used without problems. The advantage of the new code is direct access to elements in the tree (you could use xml files as a database) and the option to read from zip files.
Wolfgang
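To illustrate the "direct access to elements in the tree" point, here is a minimal sketch. It assumes the command names of later MkIV releases (\xmlloadfile, \xmlcount, \xmlattribute, \xmltext), which may not match the 2008 beta discussed in this thread, and it reuses the test.xml that Thomas posts further down.

```tex
% Sketch only: command names follow later MkIV releases and are an
% assumption with respect to the 2008 beta discussed in this thread.
\starttext
  % load the tree under the id "main" without typesetting it
  \xmlloadfile{main}{test.xml}{}
  % query the loaded tree with path expressions
  Sections: \xmlcount{main}{/document/section}\par
  First title: \xmlattribute{main}{/document/section}{title}\par
  First paragraph: \xmltext{main}{/document/section/p}\par
\stoptext
```

The point of the sketch is that the document is loaded once and then queried by path, instead of being typeset strictly in reading order as with the MkII \defineXMLenvironment approach.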
On Mar 16, 2008, at 12:04 PM, Wolfgang Schuster wrote:
On Sun, 16 Mar 2008 11:29:49 +0100 "Thomas A. Schmitz" wrote:
Hi all,
xml processing is about the last part of my ConTeXt stuff where I haven't been able to switch to mkiv; I just can't get my head around it... 2 questions:
1. When I try to process an xml-file with my old (mkii) environments, the output looks OK, but I always get a first page with the xml version declaration
<?xml version="1.0" encoding="utf-8"?>
Is this a bug or a feature? Is there anything I can do to prevent it?
Can you post an example? I never saw such an effect in my test files.
Hi Wolfgang,
OK, here is a minimal example:
file test.xml:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test [
<!ELEMENT document (section)>
<!ELEMENT section (#PCDATA)>
]>
<document>
<section title="First">
<p>This is <quotation>an</quotation> xml file.</p>
</section>
</document>
file testenvironment.tex:
\usemodule[xtag-ent]
\defineXMLenvironment[document] {\starttext} {\stoptext}
\defineXMLenvironment[section] {\section{\XMLpar{section}{title}{}}} {}
\defineXMLenvironment[quotation] {\quotation\bgroup} {\egroup}
When I process with mkii, I get the expected output. Processing with
texexec --lua --env=testenvironment test.xml
gives me the first line (here it's not an entire page) I described!
2. I'm really lost with the new xml mechanism. My first problem: In a message from September last year, Hans explained that the command to process xml files is:
The regular definitions still work but processing a file is done differently:
\xmlprocess{id}{filename}{optional initialization setup}
I used to have environments with which to typeset a bunch of files. How can this be ported to the new mechanism, which appears to expect a filename?
\xmlprocess{main}{filename.xml}{} works for me.
I tried to translate this into the "new" mechanism and thought it should read like so:
\startxmlsetups xml:mysetups
\xmlsetsetup{\xmldocument}{text:p|section|quotation}{xml:*}
\stopxmlsetups
\xmlregistersetup{xml:mysetups}
\startxmlsetups xml:p
\xmlflush{#1}\endgraf
\stopxmlsetups
\startxmlsetups xml:quotation
\quotation{\xmlflush{#1}}
\stopxmlsetups
\startxmlsetups xml:section
\section{\xmlatt{#1}{section}{title}}
\stopxmlsetups
\starttext
\xmlprocess{main}{test.xml}{}
\stoptext
But then, I only get "invalid xml file" in the output.
Sorry if these are very basic problems, but I'm probably a bit obtuse here.
You don't have to use the new xml mechanism from MkIV; the old code can still be used without problems. The advantage of the new code is direct access to elements in the tree (you could use xml files as a database) and the option to read from zip files.
Wolfgang
OK, I'll keep that in mind. Thanks for your help, Wolfgang! Thomas
OK, here is a minimal example:
file test.xml:
<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE test [ <!ELEMENT document (section)> <!ELEMENT section (#PCDATA)> ]>
<document> <section title="First"> <p>This is <quotation>an</quotation> xml file.</p> </section> </document>
file testenvironment.tex
\usemodule[xtag-ent]
\defineXMLenvironment[document] {\starttext} {\stoptext}
\defineXMLenvironment[section] {\section{\XMLpar{section}{title}{}}} {}
\defineXMLenvironment[quotation] {\quotation\bgroup} {\egroup}
When I process with mkii, I get the expected output. Processing with
texexec --lua --env=testenvironment test.xml
gives me the first line (here it's not an entire page) I described!
This seems like a bug to me. It only affects the content of the first line: when I inserted an empty line at the beginning of the file, the xml header disappeared from the pdf. It could be related to a wrong catcode for the "<" at the beginning of the line.
2. I'm really lost with the new xml mechanism. My first problem: In a message from September last year, Hans explained that the command to process xml files is:
The regular definitions still work but processing a file is done differently:
\xmlprocess{id}{filename}{optional initialization setup}
I used to have environments with which to typeset a bunch of files. How can this be ported to the new mechanism, which appears to expect a filename?
\xmlprocess{main}{filename.xml}{} works for me.
I tried to translate this into the "new" mechanism and thought it should read like so:
\startxmlsetups xml:mysetups \xmlsetsetup{\xmldocument}{text:p|section|quotation}{xml:*} \stopxmlsetups
\xmlregistersetup{xml:mysetups}
\startxmlsetups xml:p \xmlflush{#1}\endgraf \stopxmlsetups
\startxmlsetups xml:quotation \quotation{\xmlflush{#1}} \stopxmlsetups
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \stopxmlsetups
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \xmlflush{#1} \stopxmlsetups
\starttext \xmlprocess{main}{test.xml}{} \stoptext
But then, I only get "invalid xml file" in the output.
Remove the DOCTYPE definition from your xml file; it seems the parser has problems with "<>" pairs inside the DOCTYPE definition.
The following line gives me a pdf file
<!DOCTYPE document [ <!ELEMENT section (p) ]>
but the next one
<!DOCTYPE document [ <!ELEMENT section (p)> ]>
gives only "invalid xml file".
Wolfgang
On Mar 16, 2008, at 1:51 PM, Wolfgang Schuster wrote:
This seems like a bug to me. It only affects the content of the first line: when I inserted an empty line at the beginning of the file, the xml header disappeared from the pdf. It could be related to a wrong catcode for the "<" at the beginning of the line.
OK, then this is a bug. The declaration has to be on the first line, my editor (emacs in nxml mode) doesn't even let me save the file when I introduce a first blank line before it.
\xmlprocess{main}{filename.xml}{} works for me.
Yes, but that would mean you need an environment for every xml file you want to process. I have now tried \xmlprocess{main}{\inputfilename}{} and this seems to work.
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \stopxmlsetups
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \xmlflush{#1} \stopxmlsetups
Thanks! I experimented a bit more; I think it has to be \startxmlsetups xml:section \section{\xmlatt{#1}{title}} \xmlflush{#1} \stopxmlsetups (at least, this seems to work for me...)
\starttext \xmlprocess{main}{test.xml}{} \stoptext
But then, I only get "invalid xml file" in the output.
Remove the DOCTYPE definition from your xml file; it seems the parser has problems with "<>" pairs inside the DOCTYPE definition.
The following line gives me a pdf file
<!DOCTYPE document [ <!ELEMENT section (p) ]>
Hmm, but this isn't valid xml?
but the next one
<!DOCTYPE document [ <!ELEMENT section (p)> ]>
Whereas this is valid and processed without problems by mkii? Hmm, either mkiv xml handling is still a bit immature, or I'm not mature enough to use it yet :-) Thanks a lot, Wolfgang! Best Thomas
This seems like a bug to me. It only affects the content of the first line: when I inserted an empty line at the beginning of the file, the xml header disappeared from the pdf. It could be related to a wrong catcode for the "<" at the beginning of the line.
OK, then this is a bug. The declaration has to be on the first line, my editor (emacs in nxml mode) doesn't even let me save the file when I introduce a first blank line before it.
No problems with Scite or EmEditor.
\xmlprocess{main}{filename.xml}{} works for me.
Yes, but that would mean you need an environment for every xml file you want to process. I have now tried \xmlprocess{main}{\inputfilename}{}
and this seems to work.
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \stopxmlsetups
\startxmlsetups xml:section \section{\xmlatt{#1}{section}{title}} \xmlflush{#1} \stopxmlsetups
Thanks! I experimented a bit more; I think it has to be
\startxmlsetups xml:section \section{\xmlatt{#1}{title}} \xmlflush{#1} \stopxmlsetups
(at least, this seems to work for me...)
I did the same thing in my example but forgot it in my last mail.
\starttext \xmlprocess{main}{test.xml}{} \stoptext
But then, I only get "invalid xml file" in the output.
Remove the DOCTYPE definition from your xml file; it seems the parser has problems with "<>" pairs inside the DOCTYPE definition.
The following line gives me a pdf file
<!DOCTYPE document [ <!ELEMENT section (p) ]>
Hmm, but this isn't valid xml?
I know, but it could help to find the wrong definition in the xml parser.
but the next one
<!DOCTYPE document [ <!ELEMENT section (p)> ]>
Whereas this is valid and processed without problems by mkii?
You should know that MkII reads the xml code with TeX macros, whereas MkIV uses lpeg to read the xml code.
Hmm, either mkiv xml handling is still a bit immature, or I'm not mature enough to use it yet :-)
MkIV is new and still under development; tests like yours help to find errors and to fix them.
Wolfgang
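Pulling the corrections from this exchange together, a complete MkIV environment might look like the sketch below. It is an illustration under stated assumptions, not a tested recipe for the 2008 beta: it follows the conventions of later MkIV releases, where #1 stands in for \xmldocument and the file command is \xmlprocessfile rather than \xmlprocess. The pattern lists plain p because test.xml declares no text: namespace, and the xml:document setup is an addition that simply passes the root element through.

```tex
% Sketch only. Assumptions: later MkIV conventions (#1 instead of
% \xmldocument, \xmlprocessfile instead of \xmlprocess); the setup
% names are the ones used in this thread.
\startxmlsetups xml:mysetups
  % map each listed element to the setup named xml:<element>
  \xmlsetsetup{#1}{document|section|p|quotation}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetups}

\startxmlsetups xml:document
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:section
  % \xmlatt takes the node and the attribute name (two arguments),
  % then the content has to be flushed explicitly
  \section{\xmlatt{#1}{title}}
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:p
  \xmlflush{#1}\endgraf
\stopxmlsetups

\startxmlsetups xml:quotation
  \quotation{\xmlflush{#1}}
\stopxmlsetups

\starttext
  \xmlprocessfile{main}{test.xml}{}
\stoptext
```

To reuse one environment for many files, as Thomas does above, the literal test.xml can be replaced by \inputfilename.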
Thomas A. Schmitz wrote:
On Mar 16, 2008, at 1:51 PM, Wolfgang Schuster wrote:
This seems like a bug to me. It only affects the content of the first line: when I inserted an empty line at the beginning of the file, the xml header disappeared from the pdf. It could be related to a wrong catcode for the "<" at the beginning of the line.
OK, then this is a bug. The declaration has to be on the first line, my editor (emacs in nxml mode) doesn't even let me save the file when I introduce a first blank line before it.
indeed there is something weird, but it may as well be something in luatex itself, so taco has to look into it too
what happens is:
\def\processXMLfilegrouped#1{{\enableXML\processfile{#1}\relax\ifmmode\else\par\fi}}
it looks like the new catcode regime lags one line behind here
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74
www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On Mar 16, 2008, at 6:16 PM, Hans Hagen wrote:
indeed there is something weird, but it may as well be something in luatex itself, so taco has to look into it too
what happens is:
\def\processXMLfilegrouped#1{{\enableXML\processfile{#1}\relax \ifmmode\else\par\fi}}
it looks like the new catcode regime lags one line behind here
Glad to know it wasn't just me being stupid... So I will continue to try my hand at the new mkiv mechanism. All best Thomas
Thomas A. Schmitz wrote:
On Mar 16, 2008, at 6:16 PM, Hans Hagen wrote:
indeed there is something weird, but it may as well be something in luatex itself, so taco has to look into it too
what happens is:
\def\processXMLfilegrouped#1{{\enableXML\processfile{#1}\relax \ifmmode\else\par\fi}}
it looks like the new catcode regime lags one line behind here
Glad to know it wasn't just me being stupid... So I will continue to try my hand at the new mkiv mechanism.
It is a bug in luatex, but not an easy one to fix. The simplest workaround (for now) is to patch core-job.lua.
Best wishes, Taco
--- core-job.lua~ 2008-02-13 12:01:06.000000000 +0100
+++ core-job.lua 2008-03-17 14:02:12.000000000 +0100
@@ -64,7 +64,7 @@
 function commands.processfile(name,maxreadlevel)
     name = find_file(name,maxreadlevel)
     if name ~= "" then
-        tex.sprint(tex.ctxcatcodes,string.format("\\input %s\\relax",name))
+        tex.print(tex.ctxcatcodes,string.format("\\input %s\\relax",name))
     end
 end
On Mar 17, 2008, at 2:04 PM, Taco Hoekwater wrote:
It is a bug in luatex, but not an easy one to fix. The simplest workaround (for now) is to patch core-job.lua.
Best wishes, Taco
--- core-job.lua~ 2008-02-13 12:01:06.000000000 +0100
+++ core-job.lua 2008-03-17 14:02:12.000000000 +0100
@@ -64,7 +64,7 @@
 function commands.processfile(name,maxreadlevel)
     name = find_file(name,maxreadlevel)
     if name ~= "" then
-        tex.sprint(tex.ctxcatcodes,string.format("\\input %s\\relax",name))
+        tex.print(tex.ctxcatcodes,string.format("\\input %s\\relax",name))
     end
 end
Thanks Taco! I was away from my computer yesterday, but will try that today. All best Thomas
participants (4): Hans Hagen, Taco Hoekwater, Thomas A. Schmitz, Wolfgang Schuster