On 1/26/2019 12:28 AM, Ross Moore wrote:
Hi Karl,
On 26 Jan 2019, at 10:01 am, Karl Berry <karl@freefriends.org> wrote:
If the FontDescriptor dictionary of an embedded Type 1 font contains a CharSet string, then
I see nothing in that wording that implies CharSet is anything but entirely optional.
That wording is for PDF/A-2, not for PDF/A-1.
The PDF doc from MikTeX, which alerted me to this, does *not* show the CharSet error when validated for PDF/A-2 or PDF/A-3. It *does* show the error for PDF/A-1 validation. (I’ll copy you my response to the author, in a separate email.)
There are many ways in which PDF/A-1 is stricter than later versions. See here: (page 3) https://www.pdfa.org/wp-content/until2016_uploads/2011/06/19005-1_FAQ.pdf
PDF/A-1 files must include:
• Embedded fonts
• Device-independent color
• XMP metadata
PDF/A-1 files may not include:
• Encryption
• LZW Compression
• Embedded files
• External content references
• PDF Transparency
• Multi-media
• JavaScript
PDF/A-2 and PDF/A-3 relax many of those 'may not include' items, which are mostly things that TeX does support. The optionality of /CharSet is just another such relaxation.
just wondering: do you see any technical advantage in this CharSet bit array, other than it being an option to predict maybe font memory allocation demands or so (which then in turn is useless as the pdf format has many aspects that will bloat memory usage anyway)
Anyway, right now the choices are a) omit /CharSet or b) output a possibly-incorrect CharSet.
If there was a primitive that can control this, then that would potentially be enough, at least for the present. It would allow the CharSet to be omitted with PDF/A-2,3 but included with PDF/A-1.
in luatex it's an option
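To make the choice above concrete, here is a minimal sketch in Python of the decision being discussed: always write /CharSet for PDF/A-1, and allow it to be omitted for PDF/A-2 and PDF/A-3. It is not pdfTeX or LuaTeX code; the function names and the `omit_when_optional` switch are hypothetical, standing in for whatever primitive or option would control this. The only format detail assumed is that a CharSet value is a PDF string of glyph names, each preceded by a slash, with .notdef left out.

```python
def charset_string(glyph_names):
    """Build a CharSet string from the glyph names kept in a font subset.

    Each name is written with a leading slash; .notdef is left out.
    Names are sorted only to make the output reproducible.
    """
    names = sorted(n for n in set(glyph_names) if n != ".notdef")
    return "(" + "".join("/" + n for n in names) + ")"


def font_descriptor_charset(glyph_names, pdfa_part, omit_when_optional=True):
    """Return the /CharSet entry text, or None when the entry is left out.

    pdfa_part is 1, 2 or 3 (hypothetical parameter for this sketch).
    For part 1 the entry is always written; for parts 2 and 3 it is
    written only if omit_when_optional is False.
    """
    if pdfa_part == 1 or not omit_when_optional:
        return "/CharSet " + charset_string(glyph_names)
    return None


if __name__ == "__main__":
    used = ["A", "B", "space", ".notdef"]
    print(font_descriptor_charset(used, pdfa_part=1))
    print(font_descriptor_charset(used, pdfa_part=2))
```

The sketch only helps if the list of glyph names really matches the subset that gets embedded, which is the timing problem raised elsewhere in this thread.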
This distinction would need to be documented (in pdfx.pdf, say) so that authors can understand the issue and choose the appropriate package-loading option for their own circumstances. I’m happy to do this.
If you want to have a third option c) <something else>, you (or someone) will need to send me a patch.
I’ve looked at the coding in writefont.c for how gl_tree is set and used. But I’ve not yet looked at how the subsetted font is constructed. My thought is that the latter needs to adjust the gl_tree before it is used. As I said previously, this will be a timing issue; so I’m not confident that I could correctly write the necessary coding, using programming structures that I don’t fully understand.
i don't know about pdftex but it is something delayed to the last when the 'combined' font resource is added as different tex fonts using the same resource can get different entries (and width arrays) but share the blobs
(I highly doubt that Thanh has time to look into this.) Sorry, but that's the reality. -k
it's probably not that complex; i also doubt if the quality of that vector should be perfect as probably only its presence is checked, not its internal validity (which then would also demand checking fonts which afaik doesn't happen in detail); and i bet that viewers ignore its content anyway
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------