On Fri, 3 Apr 2020 02:08:25 +0200, Marcel Fabian Krüger wrote:
Hi,
I recently tried to do something with the embedded pdfe library and noticed that accessing strings comes with certain problems. PDF strings are always returned in raw form without the surrounding <> or (), so any script using them will need to know if it is a hex string or a "normal" () delimited string in order to treat it correctly. So pdfe.getstring is a bit weird: It gives a Lua string but no indication which type of string is returned.
I just ran into the same problem and used the detail field from getfromdictionary/getfromarray to get at the string type. But I agree that it would be nice if getstring returned this directly.
\documentclass{article}
\begin{document}
\directlua{
  local doc = pdfe.open(kpse.find_file("example-image.pdf"))
  local trailerid = pdfe.getarray(pdfe.gettrailer(doc), "ID")
  local objtype, value, detail = pdfe.getfromarray(trailerid, 1)
  if detail then
    print("HEXSTRING", value)
  else
    print("LITERALSTRING", value)
  end
  objtype, value, detail = pdfe.getfromdictionary(pdfe.getinfo(doc), "Creator")
  if detail then
    print("HEXSTRING", value)
  else
    print("LITERALSTRING", value)
  end
}
blub
\end{document}
On 4/3/2020 8:33 AM, Ulrike Fischer wrote:

the problem with extra return values for these basic types (string, number, boolean) is that when they are used in arguments (and such) one then needs to encapsulate them in () to make sure that only the first value (the string value) is used ... in the end these are strings (no matter if they are hex encoded or not)

print("STRING", pdfe.getstring(trailerid,1))

adding an extra return value is no big deal but we can't predict incompatibilities (and we're not supposed to introduce these)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
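
To illustrate the point about () in argument lists, here is a small plain-Lua sketch. It does not use pdfe at all: getstring_with_flag is a hypothetical stand-in that mimics a getstring returning a second value, and collect is a made-up helper that only counts the arguments it receives.

```lua
-- Hypothetical stand-in, NOT the pdfe API: mimics a getstring that
-- returns the string plus an extra flag (assumed to mean "is hex").
local function getstring_with_flag()
  return "Hello", true
end

-- Hypothetical helper: reports how many arguments it was called with.
local function collect(...)
  return select("#", ...), { ... }
end

-- A call in last argument position passes on all of its return values.
local n_all = collect("STRING", getstring_with_flag())

-- Wrapping the call in () truncates it to the first return value.
local n_first = collect("STRING", (getstring_with_flag()))

print(n_all, n_first) -- prints 3 and 2
```

So an existing call like the print above would silently receive one more argument if getstring grew a second return value, unless the call is wrapped in ().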