ConvertToConteXt 0.2 - convert special ConTeXt-characters (PHP)
First of all: I am a ConTeXt newbie, but I know PHP very well.

@Peter Münster/Aditya Mahajan: I don't think \startasci and \stopasci are the solution when you generate ConTeXt code with PHP that is full of ConTeXt macro calls, because sometimes the special characters really are ConTeXt special characters and sometimes they are purely the wanted text.

@Philipp Gesang: I think LuaTeX could do the same job for me as PHP; however, I am familiar with PHP.

@all: Of course, not every character I am converting is a ConTeXt special character. Since I don't know all the important characters, I took every one I could think of. Surely I convert too many, but that is no problem: I have tested my function "ConvertToConteXt" on 400 pages full of text with lots of ConTeXt special characters and ConTeXt macro calls, and compiled a nice book with ConTeXt.

Which characters must not be converted?

I had a mistake in the function; below is the next version:

Regards, Janis

function ConvertToConteXt ( $xstring )
{
    /* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
     * ConvertToConteXt()
     * Version 0.2
     * 01.11.2011
     *
     * author: Jörg Kopp
     * www.dr-kopp.com
     *
     * Convert special ConTeXt characters with PHP
     * Works with PHP5
     *
     * Call it with the string you want to convert ...
     *     ConvertToConteXt ($xstring);
     *
     * ... and you get back the converted string
     *
     * e.g.:
     * Input:
     *     $string = "My root-Directory: /home/hans";
     *     $string = ConvertToConteXt ( $string );
     *
     * Output/Return:
     *     $string = "My root{\\char45}Directory{\\char58} {\\char47}home{\\char47}hans";
     *
     * When you write this into a file ...
     *     file_put_contents ( "example.tex", "My root{\\char45}Directory{\\char58} {\\char47}home{\\char47}hans", FILE_APPEND );
     *
     * ... you will find the following in example.tex:
     *     My root{\char45}Directory{\char58} {\char47}home{\char47}hans
     *
     * And when you compile example.tex with ConTeXt
     *     context example.tex
     *
     * you can read the following in the resulting example.pdf:
     *     My root-Directory: /home/hans
     * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

    $xstring = html_entity_decode ( $xstring );                       // convert HTML entities into normal characters

    // Braces and backslash need to be handled first, otherwise trash will be produced:
    // the braces are parked in placeholders so that the braces added around \char92
    // are not converted again.
    $xstring = str_replace ( "{", "*$##char123##$*", $xstring );      // left curly brace (placeholder)
    $xstring = str_replace ( "}", "*$##char125##$*", $xstring );      // right curly brace (placeholder)
    $xstring = str_replace ( "\\", "{\\char92}", $xstring );          // backslash
    $xstring = str_replace ( "*$##char123##$*", "{\\char123}", $xstring ); // this trick is necessary ...
    $xstring = str_replace ( "*$##char125##$*", "{\\char125}", $xstring ); // ... !!!

    $xstring = str_replace ( "!", "{\\char33}", $xstring );           // exclamation mark
    $xstring = str_replace ( "\"", "{\\char34}", $xstring );          // quotation mark
    $xstring = str_replace ( "#", "{\\char35}", $xstring );           // number sign
    $xstring = str_replace ( "$", "{\\char36}", $xstring );           // dollar sign
    $xstring = str_replace ( "%", "{\\char37}", $xstring );           // percent sign
    $xstring = str_replace ( "&", "{\\char38}", $xstring );           // ampersand
    $xstring = str_replace ( "'", "{\\char39}", $xstring );           // apostrophe
    $xstring = str_replace ( "(", "{\\char40}", $xstring );           // left parenthesis
    $xstring = str_replace ( ")", "{\\char41}", $xstring );           // right parenthesis
    $xstring = str_replace ( "*", "{\\char42}", $xstring );           // asterisk
    $xstring = str_replace ( "+", "{\\char43}", $xstring );           // plus sign
    $xstring = str_replace ( ",", "{\\char44}", $xstring );           // comma
    $xstring = str_replace ( "-", "{\\char45}", $xstring );           // hyphen
    $xstring = str_replace ( ".", "{\\char46}", $xstring );           // period
    $xstring = str_replace ( "/", "{\\char47}", $xstring );           // slash
    $xstring = str_replace ( ":", "{\\char58}", $xstring );           // colon
    $xstring = str_replace ( ";", "{\\char59}", $xstring );           // semicolon
    $xstring = str_replace ( "<", "{\\char60}", $xstring );           // less-than sign
    $xstring = str_replace ( "=", "{\\char61}", $xstring );           // equals sign
    $xstring = str_replace ( ">", "{\\char62}", $xstring );           // greater-than sign
    $xstring = str_replace ( "?", "{\\char63}", $xstring );           // question mark
    $xstring = str_replace ( "@", "{\\char64}", $xstring );           // at sign
    $xstring = str_replace ( "[", "{\\char91}", $xstring );           // left square bracket
    $xstring = str_replace ( "]", "{\\char93}", $xstring );           // right square bracket
    $xstring = str_replace ( "^", "{\\char94}", $xstring );           // caret
    $xstring = str_replace ( "_", "{\\char95}", $xstring );           // underscore
    //$xstring = str_replace ( "°", "{\\char??}", $xstring );         // degree sign  <------ missing
    $xstring = str_replace ( "`", "{\\char96}", $xstring );           // grave accent (backtick)
    $xstring = str_replace ( "|", "{\\char124}", $xstring );          // vertical bar
    $xstring = str_replace ( "~", "{\\char126}", $xstring );          // tilde
    //$xstring = str_replace ( "•", "{\\char??}", $xstring );         // ?  <------ missing
    //$xstring = str_replace ( "º", "{\\char??}", $xstring );         // ?  <------ missing

    return $xstring;
}
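The placeholder trick for braces and backslash is needed because every str_replace call rescans the output of the previous call. One possible alternative, sketched here and not part of the posted function, is PHP's strtr() with an array argument, which walks the input once and never touches text it has already inserted. The function names escapeOnePass and escapeNaive are made up for this illustration, and only three characters are mapped to keep it short.

```php
<?php
// Hypothetical helper (not from the original post): one-pass escaping with strtr().
function escapeOnePass($text)
{
    // strtr() with an array does not rescan its own replacements, so the
    // backslash and braces inside "{\char92}" are left alone.
    return strtr($text, array(
        '\\' => '{\\char92}',
        '{'  => '{\\char123}',
        '}'  => '{\\char125}',
    ));
}

// Naive chained str_replace for comparison: the second call rewrites the
// opening brace that the first call has just inserted.
function escapeNaive($text)
{
    $text = str_replace('\\', '{\\char92}', $text);
    $text = str_replace('{', '{\\char123}', $text);
    return $text;
}

echo escapeOnePass('a\\b{c}'), "\n"; // a{\char92}b{\char123}c{\char125}
echo escapeNaive('a\\b'), "\n";      // a{\char123}\char92}b  (the "trash")
```

With the one-pass form, the order of the entries does not matter and no placeholder strings are needed.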
On 2011-11-01 20:16, Jan Heinen wrote:
@all: Of course, not every character I am converting is a ConTeXt special character. Since I don't know all the important characters, I took every one I could think of. Surely I convert too many, but that is no problem:
Which characters must not be converted?
··· from catc-ctx.mkiv ··········································
\startcatcodetable \ctxcatcodes
    \catcode\tabasciicode       \spacecatcode
    \catcode\endoflineasciicode \endoflinecatcode
    \catcode\formfeedasciicode  \endoflinecatcode
    \catcode\spaceasciicode     \spacecatcode
    \catcode\endoffileasciicode \ignorecatcode
  % \catcode\circumflexasciicode\superscriptcatcode
  % \catcode\underscoreasciicode\subscriptcatcode
  % \catcode\ampersandasciicode \alignmentcatcode
    \catcode\underscoreasciicode\othercatcode
    \catcode\circumflexasciicode\othercatcode
    \catcode\ampersandasciicode \othercatcode
    \catcode\backslashasciicode \escapecatcode
    \catcode\leftbraceasciicode \begingroupcatcode
    \catcode\rightbraceasciicode\endgroupcatcode
    \catcode\dollarasciicode    \mathshiftcatcode
    \catcode\hashasciicode      \parametercatcode
    \catcode\commentasciicode   \commentcatcode
    \catcode\tildeasciicode     \activecatcode
    \catcode\barasciicode       \activecatcode
\stopcatcodetable
·································································

So, afaict, assuming standard catcodes you should be safe with escaping »~|\{}$%#« (of which the bar was missing in the snippet I posted).

Good luck,
Philipp
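As a rough illustration of that advice, the following PHP sketch escapes only the eight characters »~|\{}$%#« and leaves everything else alone. It assumes the standard \ctxcatcodes regime shown in the table; the function name escapeContextMinimal and the sample title are invented for this example and do not come from the thread.

```php
<?php
// Hypothetical helper: escape only the characters listed as special under
// the standard catcodes, namely ~ | \ { } $ % #.
function escapeContextMinimal($text)
{
    static $map = null;
    if ($map === null) {
        $map = array();
        // Build "{\charNN}" for each special character from its ASCII code.
        foreach (str_split('~|\\{}$%#') as $char) {
            $map[$char] = '{\\char' . ord($char) . '}';
        }
    }
    // strtr() with an array replaces in a single pass, so the braces and
    // backslashes it inserts are not escaped a second time.
    return strtr($text, $map);
}

// Escape only the user-supplied text; the macro call around it stays raw.
$title = '50% off {today} for $5';
echo '\\section{' . escapeContextMinimal($title) . "}\n";
// \section{50{\char37} off {\char123}today{\char125} for {\char36}5}
```

Characters such as the underscore, caret and ampersand pass through unchanged here, because the table gives them \othercatcode.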
participants (2): Jan Heinen, Philipp Gesang