[Dev-luatex] UTF-16 in \pdfoutline

Jonathan Sauer Jonathan.Sauer at silverstroke.com
Tue Dec 11 09:07:31 CET 2007


% The following code converts a string to UTF-16 big endian with BOM
% and outputs it using \message:

% We change the catcode of '%' so we can use it for modulo calculations:
\begingroup
\catcode`\%=12
\directlua0{\unexpanded{
	function convertToUTF16(str)
		local result = string.char(0xFE) .. string.char(0xFF)
		for c in string.utfvalues(str) do
			if c < 0x10000 then
				result = result ..
						 string.char(math.floor(c / 256)) ..
						 string.char(c % 256)
			else
				c = c - 0x10000
				local c1 = math.floor(c / 1024) + 0xD800
				local c2 = c % 1024 + 0xDC00
				result = result ..
						 string.char(math.floor(c1 / 256)) ..
						 string.char(c1 % 256) ..
						 string.char(math.floor(c2 / 256)) ..
						 string.char(c2 % 256)
			end
		end
		tex.print('\\message{' .. result .. '}')
	end
	
	
	convertToUTF16('AäöüB!')
}}
\endgroup



\bye

This fails with 'Text line contains an invalid utf-8 sequence.' (not
surprising, since the text is UTF-16 big endian). If I want to pass the
UTF-16-encoded string to, e.g., \pdfoutline (since PDF bookmarks can be
encoded in UTF-16), how do I do this?
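
One possible direction, offered as a sketch rather than a tested answer:
PDF literal strings allow \ddd octal escapes, so the UTF-16 bytes could
be written using ASCII characters only. Whether \pdfoutline passes such
escapes through unchanged is an assumption here, and the backslashes
would still have to reach TeX as ordinary characters (catcode 12), which
this plain-Lua sketch does not handle. octalEscape is a hypothetical
helper name, not part of LuaTeX.

```lua
-- Sketch: turn a byte string (e.g. UTF-16BE with BOM) into ASCII-only
-- PDF octal escapes of the form \ddd. Hypothetical helper, plain Lua.
function octalEscape(bytes)
	local parts = {}
	for i = 1, #bytes do
		-- Each byte becomes a backslash followed by three octal digits.
		parts[#parts + 1] = string.format("\\%03o", bytes:byte(i))
	end
	return table.concat(parts)
end

-- BOM (FE FF), then "A" (U+0041) and "ä" (U+00E4) as UTF-16BE:
local utf16 = string.char(0xFE, 0xFF, 0x00, 0x41, 0x00, 0xE4)
print(octalEscape(utf16))  -- prints \376\377\000\101\000\344
```

The result contains only ASCII characters, so it cannot trigger the
invalid UTF-8 error by itself.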

(Maybe a callback would be useful, e.g. `convert_pdf_text'.)


Thanks in advance,
Jonathan


