Hi,
I just tried doing
luatex -ini latex.ltx
with a freshly checked out LuaTeX. The result is:
This is luaTeX, Version 3.141592-snapshot-2007032611 (Web2C 7.5.6) (INITEX)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/latex.ltx
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/texsys.cfg)
./texsys.aux found
\@currdir set to: ./.
Assuming \openin and \input
have the same search path.
Defining UNIX/DOS style filename parser.
catcodes, registers, compatibility for TeX 2, parameters,
LaTeX2e <2005/12/01>
hacks, control, par, spacing, files, font encodings, lengths,
====================================
Local config file fonttext.cfg used
====================================
(/usr/local/texlive/2007/texmf-dist/tex/cslatex/base/fonttext.cfg
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/omlenc.def)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/t1enc.def)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/ot1enc.def)
(/usr/local/texlive/2007/texmf-dist/tex/latex/cslatex/il2enc.def)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/omsenc.def)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/t1cmr.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/ot1cmr.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/cslatex/il2cmr.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/ot1cmss.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/ot1cmtt.fd))
====================================
Local config file fontmath.cfg used
====================================
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/fontmath.cfg
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/fontmath.ltx
=== Don't modify this file, use a .cfg file instead ===
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/omlcmm.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/omscmsy.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/omxcmex.fd)
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/ucmr.fd)))
====================================
Local config file preload.cfg used
=====================================
(/usr/local/texlive/2007/texmf/tex/generic/config/preload.cfg
(/usr/local/texlive/2007/texmf-dist/tex/latex/base/preload.ltx)) page nos.,
x-ref, environments, center, verbatim, math definitions, boxes, title,
sectioning, contents, floats, footnotes, index, bibliography, output,
! Buffer contains an invalid utf-8 sequence.
l.7804 \lccode`\
�=`\i % dotted I
?
! Pool contains an invalid utf-8 sequence
.
l.7804 \lccode`\�
=`\i % dotted I
?
! Buffer contains an invalid utf-8 sequence.
l.7805 \uccode`\
�=`\^^9d % dotted I
?
! Pool contains an invalid utf-8 sequence
.
l.7805 \uccode`\�
=`\^^9d % dotted I
?
! Buffer contains an invalid utf-8 sequence.
l.7805 \uccode`\�=`\
� % dotted I
?
! Pool contains an invalid utf-8 sequence
[...]
Now the sequences in question are:
\ifnum\inputlineno=\m@ne\else
\lccode`\^^9d=`\i % dotted I
\uccode`\^^9d=`\^^9d % dotted I
\lccode`\^^9e=`\^^9e % d-bar
\uccode`\^^9e=`\^^d0 % d-bar
\fi
In short: the buffer does not contain any illegal utf-8 sequence at
all! latex.ltx consists _solely_ of ASCII characters in the range
0-127. Instead, LuaTeX barfs on "\^^9d" and similar ASCII
_transliterations_ of characters which happen to be legal _characters_
in Unicode (though the corresponding single _bytes_, such as 0x9d, are
not legal utf-8 on their own).
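To make the character/byte distinction concrete, here is a small Python sketch. It is only an illustration and has nothing to do with LuaTeX's own code; the names source_line and is_valid_utf8 are made up for this example. It checks that the line from latex.ltx is pure ASCII, that U+009D is a legal code point with a two-byte utf-8 encoding, and that the lone byte 0x9d cannot be decoded as utf-8.

```python
# Illustration only: none of this is LuaTeX code.

# The line as it appears in latex.ltx: every character is below 128.
source_line = r"\lccode`\^^9d=`\i % dotted I"
assert all(ord(ch) < 128 for ch in source_line)

# U+009D is a legal Unicode code point; its utf-8 encoding takes two bytes.
encoded = chr(0x9D).encode("utf-8")
print(encoded)  # b'\xc2\x9d'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The single byte 0x9d is a continuation byte and cannot stand alone.
print(is_valid_utf8(b"\x9d"))   # False
print(is_valid_utf8(encoded))   # True
```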
The same kind of error turns up for another file:
(/usr/local/texlive/2007/texmf-dist/tex/generic/xu-hyphen/xu-bahyph.tex
! Text line contains an invalid utf-8 sequence.
l.17 \lccode`\
�=0
?
! Text line contains an invalid utf-8 sequence.
l.20 \ifnum\lccode`\
�=0 % if bahyph.tex didn't change this,
?
Again, the input file is purely ASCII, in this case:
\begingroup
\expandafter\ifx\csname XeTeXrevision\endcsname\relax
\else
% The standard bahyph.tex is plain ASCII, so directly readable;
% but we want to add patterns for n-tilde (^^f1), as generated by
% bahyph.sh if the "latin1" option is given.
% However, if a "latin1" version of bahyph was already present,
% these would be duplicate patterns.
% We'll watch the \lccode of ^^f1 so as to detect this.
\lccode`\^^f1=0
\let\PATTERNS=\patterns
\def\patterns{%
\ifnum\lccode`\^^f1=0 % if bahyph.tex didn't change this,
\lccode`\^^f1=`\^^f1 % then we can load the extra patterns here
\PATTERNS{1^^f1a 1^^f1e 1^^f1o 1^^f1i 1^^f1u}%
\fi
\PATTERNS
}
\fi
So we have error messages about "pool", "buffer" and "text line"
containing invalid utf-8 sequences, when the input actually is just
ASCII.
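One way such messages could arise from ASCII input is sketched below in Python. This is a guess at the mechanism, not something taken from the LuaTeX sources, and the function names are hypothetical. If "^^f1" is reduced to the single byte 0xf1, as an 8-bit TeX would do, and the result is then checked as utf-8, the check fails. If it is reduced to the utf-8 encoding of U+00F1 instead, the check passes.

```python
import re

# ^^ followed by two lowercase hex digits, the form used in the files above.
# (The other ^^ form, a single character offset by 64, is ignored here.)
CARET_HEX = re.compile(rb"\^\^([0-9a-f]{2})")


def expand_as_raw_byte(line: bytes) -> bytes:
    """Replace ^^xx by the single byte xx (hypothetical 8-bit behaviour)."""
    return CARET_HEX.sub(lambda m: bytes([int(m.group(1), 16)]), line)


def expand_as_code_point(line: bytes) -> bytes:
    """Replace ^^xx by the utf-8 encoding of the code point U+00xx."""
    return CARET_HEX.sub(
        lambda m: chr(int(m.group(1), 16)).encode("utf-8"), line
    )


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


line = rb"\lccode`\^^f1=0"           # pure ASCII, as in xu-bahyph.tex
raw = expand_as_raw_byte(line)       # contains the lone byte 0xf1
utf = expand_as_code_point(line)     # contains the pair 0xc3 0xb1

print(is_valid_utf8(line))  # True
print(is_valid_utf8(raw))   # False
print(is_valid_utf8(utf))   # True
```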
--
David Kastrup