Fix for pdf_literal Lua string / token list conversions
Hello,
take this plain LuaTeX example:
\setbox0=\hbox{\pdfextension literal{0 g}}
% 1)
\directlua{
local literal = tex.getbox(0).head
texio.write_nl("log", "literal.data="..literal.data)
}
\showbox0
% 2)
\directlua{
local literal = tex.getbox(0).head
literal.data = "test"
literal.token = "toks"
texio.write_nl("log", "literal.token="..(literal.token or "nil"))
}
\showbox0
% 3) patch test
\directlua{
tex.set("everyjob", "asd")
texio.write_nl("log", tex.get("everyjob"))
}
\bye
Expected log output (abridged):
1)
literal.data=0 g
.\pdfliteral origin{0 g}
2)
literal.token=toks
.\pdfliteral origin{toks}
Actual output:
1)
literal.data=data
.\pdfliteral origin{0 g}
2)
literal.token=characters
.\pdfliteral origin
On Sat Aug 28, 2021 at 9:14 PM CEST, Michal Vlasák wrote:
In the first case the Lua accessor returns a value which happens to be on top of the stack (the key "data" itself). In the second case an index into TeX memory is misinterpreted as a Lua registry index, so the returned data is essentially garbage.
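The first failure mode can be illustrated without LuaTeX. What follows is only a toy model in plain C, not the actual `lnodelib.c` code: the `toy_*` names, the fixed-size string stack and the two literal kinds are all assumptions made for illustration. It shows how a getter that pushes nothing but still reports one result ends up handing back whatever was already on top of the stack, which here is the key `"data"`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a Lua stack: a fixed array of string slots (hypothetical). */
#define TOY_STACK_MAX 8

typedef struct {
    const char *slot[TOY_STACK_MAX];
    int top; /* number of values currently on the stack */
} toy_stack;

static void toy_push(toy_stack *s, const char *v)
{
    assert(s->top < TOY_STACK_MAX);
    s->slot[s->top++] = v;
}

/* Hypothetical literal kinds; they only loosely echo the thread. */
enum toy_literal_kind { TOY_NORMAL, TOY_LUA_REF };

typedef int (*toy_getter)(toy_stack *, enum toy_literal_kind, const char *);

/* Buggy getter: handles one kind only, yet always claims one result. */
static int toy_get_buggy(toy_stack *s, enum toy_literal_kind kind,
                         const char *payload)
{
    if (kind == TOY_LUA_REF)
        toy_push(s, payload);
    /* TOY_NORMAL falls through here and pushes nothing. */
    return 1; /* the caller will take the top slot as the result */
}

/* Fixed getter: every handled kind pushes a value; otherwise no result. */
static int toy_get_fixed(toy_stack *s, enum toy_literal_kind kind,
                         const char *payload)
{
    if (kind == TOY_LUA_REF || kind == TOY_NORMAL) {
        toy_push(s, payload);
        return 1;
    }
    return 0;
}

/* Models `node.data`: the key is on the stack before the getter runs. */
static const char *toy_index(toy_getter getter, enum toy_literal_kind kind,
                             const char *payload)
{
    toy_stack s = { {0}, 0 };
    toy_push(&s, "data"); /* the key being looked up */
    int nresults = getter(&s, kind, payload);
    return nresults ? s.slot[s.top - 1] : NULL;
}

/* Returns 1 if the buggy getter yields the key and the fixed one the payload. */
static int toy_demo(void)
{
    const char *bad = toy_index(toy_get_buggy, TOY_NORMAL, "0 g");
    const char *good = toy_index(toy_get_fixed, TOY_NORMAL, "0 g");
    return strcmp(bad, "data") == 0 && strcmp(good, "0 g") == 0;
}
```

Calling `toy_demo()` from a `main` function exercises both getters on the same input. In this toy model the buggy getter returns the key, which mirrors the `literal.data=data` line in the actual output above.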
A patch for both issues is attached. I also extended `tokenlist_from_lua` (which is what `nodelib_gettoks` is defined as) to accept an index argument. The previous version used the value on top of the stack, which probably worked for every current use in LuaTeX, but seemed rather dangerous and subtle.
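To make the "dangerous and subtle" part concrete, here is a minimal sketch in plain C of why an explicit index is more robust than an implicit read from the top of the stack. It is a toy model under stated assumptions, not the real `tokenlist_from_lua`: the `toy_*` names and the string stack are hypothetical, and only the index convention (positive from the bottom, negative from the top) follows Lua's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a Lua stack: a fixed array of string slots (hypothetical). */
#define TOY_STACK_MAX 8

typedef struct {
    const char *slot[TOY_STACK_MAX];
    int top; /* number of values currently on the stack */
} toy_stack;

static void toy_push(toy_stack *s, const char *v)
{
    assert(s->top < TOY_STACK_MAX);
    s->slot[s->top++] = v;
}

/* Lua-style index: positive counts from the bottom (1-based),
 * negative counts from the top (-1 is the top slot). */
static const char *toy_at(const toy_stack *s, int idx)
{
    int pos = idx > 0 ? idx - 1 : s->top + idx;
    assert(pos >= 0 && pos < s->top);
    return s->slot[pos];
}

/* Old style: implicitly converts whatever is on top of the stack. */
static const char *toy_convert_top(const toy_stack *s)
{
    return toy_at(s, -1);
}

/* New style: the caller states which slot holds the value. */
static const char *toy_convert_at(const toy_stack *s, int idx)
{
    return toy_at(s, idx);
}

/* Returns 1 if the explicit index finds the value while the
 * top-based read picks up an unrelated scratch value. */
static int toy_demo(void)
{
    toy_stack s = { {0}, 0 };
    toy_push(&s, "everyjob"); /* argument 1: parameter name */
    toy_push(&s, "asd");      /* argument 2: the value to convert */
    toy_push(&s, "scratch");  /* something pushed later by other code */
    return strcmp(toy_convert_at(&s, 2), "asd") == 0
        && strcmp(toy_convert_top(&s), "scratch") == 0;
}
```

As long as the value happens to be the last thing pushed, both variants agree, which is consistent with the old version having worked so far. The explicit index keeps working once anything else lands on the stack in between.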
The patch is from git, though it can be applied normally with
patch -Np1 < pdf_literal.patch
Kind regards, Michal Vlasák
Is there any news on this? I must admit I don't remember the details very well, but I have used the patch without issues. I would still appreciate a review from someone, because I could have botched the patch in some subtle way.
Thanks in advance!
Michal
Hello,
it seems the issue was rediscovered and partly fixed by Phelype and Hans:
https://gitlab.lisn.upsaclay.fr/texlive/luatex/-/commit/c1909e4a4c5311197a25...
However, the fix omits one important line that I had in my patch (first change below). IMO the second change below is also worthwhile, as it makes the Lua stack use more obvious.
--- a/source/texk/web2c/luatexdir/lua/lnodelib.c
+++ b/source/texk/web2c/luatexdir/lua/lnodelib.c
@@ -1566,7 +1566,7 @@ static int lua_nodelib_direct_setleader(lua_State * L)
 #define get_pdf_literal_direct_value(L,n) do { \
     if (pdf_literal_type(n) == lua_refid_literal) { \
         lua_rawgeti(L, LUA_REGISTRYINDEX, pdf_literal_data(n)); \
-    } else if (pdf_literal_type(n) == lua_refid_literal) { \
+    } else if (pdf_literal_type(n) == normal) { \
         tokenlist_to_luastring(L, pdf_literal_data(n)); \
     } \
 } while (0)
--- a/source/texk/web2c/luatexdir/lua/ltexlib.c
+++ b/source/texk/web2c/luatexdir/lua/ltexlib.c
@@ -1790,7 +1790,7 @@ static int settex(lua_State * L)
             }
         } else if (is_toks_assign(cur_cmd1)) {
             if (lua_type(L,i) == LUA_TSTRING) {
-                j = tokenlist_from_lua(L, -1); /* uses stack -1 */
+                j = tokenlist_from_lua(L, i);
                 assign_internal_value((isglobal ? 4 : 0), equiv(cur_cs1), j);
             } else {
New patch attached. The same test case and patch instructions apply.
(For reference, the message I am replying to: https://mailman.ntg.nl/pipermail/dev-luatex/2021-August/006542.html)
Best,
Michal
On Wed, 3 May 2023 at 19:29, Michal Vlasák