[Dev-luatex] Unicode in \pdfoutlines

geradts at tatzetwerk.nl geradts at tatzetwerk.nl
Wed May 29 12:30:15 CEST 2013

Dear LuaTeXies,

I have been experimenting with Unicode characters (> 0x00FF) in \pdfoutlines (bookmarks) in (plain)LuaTeX. This post


helped me to find out that the contents of a \pdfoutline are interpreted as UTF-16 if they are preceded by the BOM 0xFE 0xFF (which in LuaTeX can be output using the 0x1100FE 0x1100FF notation). After this BOM, the Greek letter alpha (Unicode 0x03B1) can be represented as 0x110003 0x1100B1. The (very minimal) convertPDFstring function in the sample below takes care of this. However, when trying to represent code points < 0x0100, I ran into a problem. It turned out that LuaTeX does not output 0x110000 to the PDF, and this is needed for these code points; for example, the letter a (0x0061) would be output as 0x110000 0x110061. Is this intended behaviour? The LuaTeX manual seems to present 0x110000 as a valid way to output a byte-sized chunk.

> Output in byte-sized chunks can be achieved by using characters just outside of the valid Unicode range, starting at the value 1,114,112 (0x110000). When the time comes to print a character c >= 1,114,112, LuaTeX will actually print the single byte corresponding to c minus 1,114,112.
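
To make the arithmetic in that passage concrete, here is a small sketch in plain Lua (no LuaTeX needed). The names BYTE_BASE, to_byte_char and from_byte_char are made up for illustration, and the upper bound of 0xFF is my own assumption; the manual only states the lower bound.

```lua
-- 1,114,112: the first value just past the valid Unicode range
local BYTE_BASE = 0x110000

-- Hypothetical helper: the "character" that should make LuaTeX print byte b
local function to_byte_char(b)
  assert(b >= 0 and b <= 0xFF, "expected a single byte")
  return BYTE_BASE + b
end

-- Hypothetical helper: the byte that character c stands for (c minus 1,114,112)
local function from_byte_char(c)
  -- the upper bound is an assumption, see above
  assert(c >= BYTE_BASE and c <= BYTE_BASE + 0xFF, "not a byte-output character")
  return c - BYTE_BASE
end

print(string.format("0x%06X", to_byte_char(0xFE)))  -- 0x1100FE
print(from_byte_char(0x110000))                      -- 0
```

Read this way, 0x110000 itself should correspond to the byte 0x00, which is exactly the case that does not seem to reach the PDF.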

Any suggestions?

Best, Ivo.


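  function convertPDFstring(s)
    -- BOM 0xFE 0xFF marks the string as UTF-16 (big-endian)
    tex.write(unicode.utf8.char(0x1100FE, 0x1100FF))
    for c in string.utfvalues(s) do
      -- high byte, then low byte, each offset by 0x110000 for byte output
      tex.write(unicode.utf8.char(0x110000 + math.floor(c / 256), 0x110000 + c % 256))
    end
  end
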
line 1\pdfdest num 1 xyz\pdfoutline goto num 1 count 0{\directlua{convertPDFstring('αβγāĉđ')}} % works fine

line 2\pdfdest num 2 xyz\pdfoutline goto num 2 count 0{\directlua{convertPDFstring('abcdef')}} % not working
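
To show why only the second line is affected, here is a rough sketch in plain Lua that computes the UTF-16BE bytes both strings need and counts the zero bytes among them. The helper names utf16be_bytes and count_zero_bytes are hypothetical, the code points are written out by hand, and only code points up to 0xFFFF are handled (no surrogate pairs).

```lua
-- Hypothetical helper: BOM plus high/low byte for each code point
local function utf16be_bytes(codepoints)
  local bytes = { 0xFE, 0xFF }  -- byte order mark
  for _, c in ipairs(codepoints) do
    assert(c >= 0 and c <= 0xFFFF, "only code points up to 0xFFFF are handled")
    bytes[#bytes + 1] = math.floor(c / 256)  -- high byte
    bytes[#bytes + 1] = c % 256              -- low byte
  end
  return bytes
end

-- Hypothetical helper: how many 0x00 bytes (i.e. 0x110000 characters) are needed
local function count_zero_bytes(bytes)
  local n = 0
  for _, b in ipairs(bytes) do
    if b == 0 then n = n + 1 end
  end
  return n
end

local line1 = { 0x03B1, 0x03B2, 0x03B3, 0x0101, 0x0109, 0x0111 }  -- αβγāĉđ
local line2 = { 0x61, 0x62, 0x63, 0x64, 0x65, 0x66 }              -- abcdef

print(count_zero_bytes(utf16be_bytes(line1)))  -- 0
print(count_zero_bytes(utf16be_bytes(line2)))  -- 6
```

So line 1 never needs 0x110000 at all, while line 2 needs it once for every character.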


TAT Zetwerk
Ondiep Zuidzijde 6
3551 BW Utrecht
t 030 2 456 056
f 030 2 456 336
i www.tatzetwerk.nl
e geradts at tatzetwerk.nl
