Optimizing the generated pdf (was: [NTG-pdftex] Re: [dev-context] Re: \showskips bug)

Martin Schröder martin at oneiros.de
Thu Nov 17 10:46:51 CET 2005


On 2005-11-17 09:31:19 +0100, Taco Hoekwater wrote:
> This is mostly the required page objects. 3 objects are used per actual
> (totally empty) page:
> 
>   13 0 obj << /Length 0 >> stream endstream endobj

This one is even longer with compresslevel > 0. It would be nice
to not compress empty streams, but I think that would be too
difficult to implement and isn't needed very often.

>   12 0 obj << /Type /Page /Contents 13 0 R /Resources 11 0 R
>               /MediaBox [0 0 595.2756 841.8898] /Parent 7 0 R >> endobj
>   11 0 obj << /ProcSet [ /PDF ] >> endobj

And 11 and 13 are created for every page. :-(

11 could simply be empty (or null) for empty pages. Looking at
<Write out page object@>, it doesn't seem too difficult to
optimize for empty Resources. Of course, the question is: How
often do we have empty /Resources? Normally they at least have a
/Font entry.
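
A minimal sketch of that idea, in Python rather than in pdfTeX's own
source: when a page has no resources, write an empty dictionary inline
instead of allocating a separate indirect object. The function name
build_page and its parameters are hypothetical and only illustrate the
decision; they do not correspond to anything in <Write out page object@>.
The sketch picks the empty-dictionary variant, which the PDF reference
allows for pages that need no resources.

```python
def build_page(parent_num, contents_num, resources, resources_num):
    """Return (page_body, extra_object).

    resources is a dict such as {"ProcSet": "[ /PDF ]"}; resources_num is
    the object number to use if an indirect /Resources object is needed.
    extra_object is (number, body) or None when nothing extra is written.
    """
    if resources:
        # Non-empty: keep the resources in their own indirect object.
        entries = " ".join(f"/{key} {value}" for key, value in resources.items())
        res_value = f"{resources_num} 0 R"
        extra = (resources_num, f"<< {entries} >>")
    else:
        # Empty: an inline empty dictionary, no extra object at all.
        res_value = "<< >>"
        extra = None
    page = (
        f"<< /Type /Page /Contents {contents_num} 0 R "
        f"/Resources {res_value} /Parent {parent_num} 0 R >>"
    )
    return page, extra
```

For a totally empty page this saves one object per page; for a page
with a /Font entry nothing changes.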

If we start optimizations like these, it would be nice to move
the /MediaBox to the root object (or pages) and write it only for
different-sized pages. (Hans: ConTeXt writes /TrimBox and /CropBox
on every page, even if they are always the same; adding them to
pdfpagesattr instead would save quite some space -- look at
pdftex-a.pdf.)
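
The /MediaBox part can be sketched as follows, again in Python and only
as an illustration: pick the most common page size as the value written
once on the pages node, and emit a per-page /MediaBox only where a page
differs. The names plan_mediaboxes and format_box are made up for this
example. It covers /MediaBox only, which the PDF reference lists as an
inheritable page attribute; whether any other box may be inherited
should be checked against the reference before relying on it.

```python
from collections import Counter


def plan_mediaboxes(page_boxes):
    """Split page sizes into one inherited box plus per-page overrides.

    page_boxes is a list of 4-tuples, one per page, in page order.
    Returns (inherited_box, overrides) where overrides maps a zero-based
    page index to the box that page must write itself.
    """
    if not page_boxes:
        return None, {}
    # The most frequent box goes on the parent node.
    inherited, _ = Counter(page_boxes).most_common(1)[0]
    overrides = {
        index: box for index, box in enumerate(page_boxes) if box != inherited
    }
    return inherited, overrides


def format_box(box):
    """Format a box tuple as a PDF /MediaBox entry."""
    return "/MediaBox [" + " ".join(str(value) for value in box) + "]"
```

With three A4 portrait pages and one landscape page, only the landscape
page would carry its own /MediaBox.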

> It is a bit wasteful to keep those in the indirects objects table
> for ever and onwards, but I am not sure if it is doable to flush
> them right away. (CC ntg-pdftex)

I don't think that optimizations like these are generally useful,
as they are seldom needed and make the code more complex. When I
look at a typical result of ConTeXt or hyperref, they seem
unneeded.

Btw: Is there a tool that compresses a pdf by replacing identical
objects with references?
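
What such a tool would have to do can be sketched in a few lines. This
is a toy under strong assumptions, not a description of any existing
program: objects are given as a dict from object number to body text,
all generation numbers are 0, and references are found with a simple
regular expression instead of a real PDF parser. The name dedupe is
hypothetical. Page and pages nodes are never merged, since two
identical-looking pages must stay distinct objects in the page tree.

```python
import re

# Matches indirect references with generation 0, e.g. "11 0 R".
REF = re.compile(r"\b(\d+) 0 R\b")


def dedupe(objects):
    """Merge objects with identical bodies; return (kept_objects, remap).

    remap maps each dropped object number to the number that replaces it.
    Repeats until no more duplicates appear, because rewriting references
    can make further objects identical.
    """
    objects = dict(objects)
    remap = {}
    while True:
        seen = {}    # body text -> lowest object number with that body
        merged = {}  # duplicate number -> surviving number (this pass)
        for num in sorted(objects):
            body = objects[num]
            if "/Type /Page" in body:
                continue  # crude check: leaves /Page and /Pages alone
            if body in seen:
                merged[num] = seen[body]
            else:
                seen[body] = num
        if not merged:
            return objects, remap
        for old, target in remap.items():
            remap[old] = merged.get(target, target)
        remap.update(merged)

        def fix(match):
            num = int(match.group(1))
            return f"{merged.get(num, num)} 0 R"

        objects = {
            num: REF.sub(fix, body)
            for num, body in objects.items()
            if num not in merged
        }
```

Applied to two empty pages like the ones quoted above, the second
page's /Contents and /Resources end up pointing at the first page's
objects 13 and 11.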

Best
    Martin
-- 
                    http://www.tm.oneiros.de

