I am using MacOSX and up till now my files have a Western (Mac OS Roman) encoding, not UTF8. That works fine in ConTeXt-mkii by using \enableregime[mac]. But in ConTeXt-mkiv through luatex I get an error despite that same \enableregime[mac]: for an accented character like é the error is "! Text line contains an invalid utf-8 sequence." How can I solve that other than changing all my files to UTF8 encoding? Hans van der Meer
Perhaps with a preprocessing (ctx) file that runs iconv, but luatex itself wants UTF-8, full stop. Best wishes, Taco
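The recoding that such a preprocessing step would perform can also be done once, outside TeX. The following is only a minimal sketch in Python, assuming the input really is Mac OS Roman (Python's `mac_roman` codec); the names `mac_roman_to_utf8` and `recode_file` are made up for this illustration and are not part of ConTeXt or iconv.

```python
from pathlib import Path


def mac_roman_to_utf8(data: bytes) -> bytes:
    """Reinterpret Mac OS Roman bytes as text and encode that text as UTF-8."""
    return data.decode("mac_roman").encode("utf-8")


def recode_file(src: str, dst: str) -> None:
    """Write a UTF-8 copy (without BOM) of the Mac OS Roman file src to dst."""
    Path(dst).write_bytes(mac_roman_to_utf8(Path(src).read_bytes()))


# In Mac OS Roman, "é" is the single byte 0x8E. On its own that byte is not
# a valid UTF-8 sequence, which is what the LuaTeX error is about.
mac_bytes = "é".encode("mac_roman")
utf8_bytes = mac_roman_to_utf8(mac_bytes)
```

A call such as `recode_file("your-file.tex", "your-file-utf8.tex")` (hypothetical file names) writes the converted copy next to the original, so the original can be kept until the result has been checked.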
regimes should work but maybe there is no mac regime; anyway ... moving to utf is the best option
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On 12 May 2008, at 10:16, Hans Hagen wrote:
regimes should work but maybe there is no mac regime; anyway ... moving to utf is the best option
What is to be preferred: with or without BOM? Hans van der Meer
In mkiv both should work ok, but some programs don't like a BOM, so I try to avoid BOMs.
Hans
"Text line contains an invalid utf-8 sequence."
Ah yes, I remember: there are 2~3 tex files in base/ with this problem. I have a list, but it has not been updated to the latest context distro (I will do that soon). I believe that one can find them with the file command (under linux).
Perhaps with a preprocessing (ctx) file that runs iconv, but luatex itself wants UTF-8, full stop.
Just to be sure: no BOM problem? I remember something about this in mkii: http://unicode.org/faq/utf_bom.html#BOM
-- luigi it's new . it's powerful . it's luatex . http://www.luatex.org
On Mon, May 12, 2008 at 10:03 AM, Taco Hoekwater wrote:
Perhaps with a preprocessing (ctx) file that runs iconv, but luatex itself wants UTF-8, full stop.
Well, even though - please do yourself a favor and do not use them - regimes are still supported in mkiv (at least cp12xx and iso-8859-x), the "mac" regime has not been added to the lua tables (just like many other older regimes).

Two simple options that I always use for recoding files:
a) recode macintosh..UTF-8 your-file.tex
b) (if a fails for some reason)
   vim your-file.tex
   :e ++enc=macroman
   :set fileencoding=utf-8
   :wq

Recode needs to be installed from somewhere (with fink for example).

If you really really cannot live without mac encoding, support can still be added to mkiv. But in either case: do yourself a favor and start using UTF-8.

On Mon, May 12, 2008 at 10:28 AM, luigi scarso wrote:
"Text line contains an invalid utf-8 sequence."
Ah yes, I remember: there are 2~3 tex files in base/ with this problem. I have a list, but it has not been updated to the latest context distro (I will do that soon). I believe that one can find them with the file command (under linux).
supp-tpi.tex: Non-ISO extended-ASCII English text
supp-mps.tex: ISO-8859 English text
sort-lan.mkii: Non-ISO extended-ASCII English text
regi-ibm.tex: Non-ISO extended-ASCII English text, with LF, NEL line terminators
ppchtex.tex: Non-ISO extended-ASCII English text
m-pdfsnc.tex: ISO-8859 English text
font-chi.tex: ISO-8859 English text
enco-ini.tex: Non-ISO extended-ASCII English text
enco-cyr.tex: ISO-8859 English text
core-reg.tex: ISO-8859 English text
core-lst.tex: Non-ISO extended-ASCII English text
cont-new.tex: ISO-8859 English text
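Where the file command is not available, a similar check can be approximated by trying to decode each file as UTF-8. This is only a sketch under that assumption: the helper names `is_valid_utf8` and `find_non_utf8` are hypothetical, and unlike file it reports only that a file is not valid UTF-8, not which encoding it might be.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (pure ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(directory: str, patterns=("*.tex", "*.mkii")):
    """Return the paths under directory whose contents are not valid UTF-8."""
    bad = []
    for pattern in patterns:
        # rglob also descends into subdirectories
        for path in sorted(Path(directory).rglob(pattern)):
            if not is_valid_utf8(path.read_bytes()):
                bad.append(str(path))
    return bad
```

Calling `find_non_utf8("base")` on a directory of source files (the directory name here is just an example) would return the candidates that still need recoding.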
Just to be sure: no BOM problem? I remember something about this in mkii: http://unicode.org/faq/utf_bom.html#BOM
The BOM is usually removed automatically. The only problem I discovered last time was when a BOM was present in an environment file in a project. But:
- you can write files without BOM
- you can pass --utf option to texexec
Mojca
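For files that already carry a BOM, removing it by hand is straightforward. A minimal sketch in Python, assuming a UTF-8 BOM (the three bytes EF BB BF); the function name `strip_utf8_bom` is invented for this example:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

When the file is read as text rather than as bytes, decoding with Python's `utf-8-sig` codec has the same effect: it drops a leading BOM if there is one and otherwise behaves like plain UTF-8.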
participants (5)
- Hans Hagen
- Hans van der Meer
- luigi scarso
- Mojca Miklavec
- Taco Hoekwater