Bugs item #1751, was opened at 2008-06-23 09:28
Status: Closed
Priority: 3
Submitted By: Stefan Becuwe (sbecuwe)
Assigned to: The Thanh Han (hanthethanh)
Summary: bounding box ignored?
Resolution: Accepted
Group: v1.40.3
Category: PDF inclusion
Initial Comment:
Hello,
I've got a problem with a pdf inclusion using includegraphics.
I've made a screendump with Firefox using the "Print to File" option and selected pdf as format. Using Acrobat, I've cropped the figure. In kpdf or acroread the result (attached) looks ok. However, when I create a LaTeX document containing this figure, the output depends on the viewer: kpdf shows the correct output, while acroread shows incorrect output because the cropping is ignored. I also get the following message:
PDF inclusion: Page Group detected which pdfTeX can't handle. Ignoring it.
I don't know whether it's related to the above problem.
Best regards
Stefan
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2008-06-28 13:58
Message:
fixed by Heiko, patch is in 1.40.8-rc2
----------------------------------------------------------------------
Comment By: Adrian Johnson (ajohnson)
Date: 2008-06-25 11:57
Message:
The non-standard page content stream dictionary produced by cairo (and hence Firefox 3) is my fault. The reason I did that is the way cairo handles fallback images for Porter-Duff operations not supported by PDF (i.e. all of them except Source OVER Destination). When one or more fallback images are required, cairo puts a knockout group in the page content. The first thing in the knockout group is another group containing the natively supported content. The rest of the knockout group contains the fallback images. The knockout group is required to ensure the fallback images do not composite with any native content underneath the image.
The problem is that, at the time the page content stream is streamed out to the file, it is not known whether any fallback images are required. I did not want to include a knockout group unless it is actually required. The content stream has an XObject dictionary because, at the time the content is written, it is not known whether the content stream will belong to the page or be an XObject in a knockout group.
As far as I can tell from reading the PDF Reference this is not incorrect but I agree that this is not standard practice in PDF files. As it is causing problems I will work on a fix for cairo to ensure that the page content stream does not include an XObject dictionary.
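The deferred decision described above can be modelled roughly as follows. This is only an illustrative Python sketch, not cairo's code: `ContentStream`, `finish_page`, and the returned layout are hypothetical names invented for the example, and the key list simply reuses the keys mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class ContentStream:
    """Drawing operators already written to the file (hypothetical model)."""
    obj_num: int
    # Written with XObject-style keys up front, because the final role of
    # the stream (page content or XObject in a group) is not yet known.
    dict_keys: tuple = ("Length", "Filter", "BBox", "Group", "Resources")


def finish_page(native: ContentStream, fallback_images: list) -> dict:
    """Decide the page structure once all fallback images are known."""
    if not fallback_images:
        # No fallbacks: the stream is used as the page /Contents directly,
        # so the XObject keys end up in a content stream dictionary.
        return {"Contents": native.obj_num, "knockout_group": None}
    # Fallbacks needed: a knockout group whose first child is the native
    # content, followed by the fallback images.
    return {
        "Contents": "stream invoking the knockout group",
        "knockout_group": [native.obj_num, *fallback_images],
    }
```

In this model the file from the report corresponds to the first branch: no fallback image was needed, so the stream written with XObject keys became the page content.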
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2008-06-25 09:52
Message:
the pdf looks a bit weird, but after a few tests using all applications I have access to, I could not find any application that complains that this pdf is broken or invalid. This includes acrobat 8 preflight/pdf analysis/Report PDF syntax issues and a few more preflight tests.
The only thing that failed is that when I converted the pdf to ps using acrobat reader (4.05 on linux), gs failed to display that ps. But this happens very often (i.e. when printing from pdf to ps using acrobat reader, then checking the ps with gs). On the other hand, converting to ps using pdftops and checking the ps with gs went ok, too.
----------------------------------------------------------------------
Comment By: Heiko Oberdiek (oberdiek)
Date: 2008-06-25 09:05
Message:
Martin wrote:
This file is IMHO broken: Object 3 is an XObject dict,
but is used for the Contents of page 1 (object 23).
In this case the application is a little too sloppy
and can be fixed.
But there is a counterexample:
the same "contents" stream could be shared between
a Page and an XObject. This decreases the file size,
because it avoids duplication of streams.
Yours sincerely
Heiko
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2008-06-25 08:33
Message:
I am of the same opinion as Heiko. The reason why content streams are copied entirely is that I didn't expect those streams to contain anything apart from the entries common to all streams (e.g. those listed in table 3.4 in the pdf spec). However, I also could not find anything explicitly saying that additional entries are not allowed.
Heiko's patch looks good to me. So, here is my vote for that patch. Better to copy only what is required, than to copy everything including what is undefined.
Variant A looks simple and clean, but it will slow down pdf inclusion, since the streams must be uncompressed and re-compressed.
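The cost difference can be illustrated with a small Python sketch. It assumes a Flate-compressed stream and uses `zlib` as a stand-in; the function names are hypothetical and `level` only plays the role of \pdfcompresslevel. It is not the code from the patch.

```python
import zlib


def copy_variant_a(compressed: bytes, level: int = 9) -> bytes:
    """Variant A style: inflate the stream, then deflate it again."""
    raw = zlib.decompress(compressed)
    return zlib.compress(raw, level)


def copy_variant_b(compressed: bytes) -> bytes:
    """Variant B style: pass the already-compressed bytes through untouched."""
    return compressed


# Made-up page content, compressed once as it would be in the input pdf.
data = zlib.compress(b"0 0 m 100 100 l S\n" * 1000)

# Both variants decode to the same content; only A pays for the round trip.
assert zlib.decompress(copy_variant_a(data)) == zlib.decompress(copy_variant_b(data))
```

The extra work in the first function is done once per included stream, which is where the slowdown would come from.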
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-25 08:28
Message:
Thanks. This file is IMHO broken: Object 3 is an XObject dict, but is used for the Contents of page 1 (object 23). Can you please file a bug report against Firefox at https://bugzilla.mozilla.org/ and report the bug url here, so that I can chime in there?
----------------------------------------------------------------------
Comment By: Stefan Becuwe (sbecuwe)
Date: 2008-06-25 08:05
Message:
I've attached the original output from Firefox 3.
Regards
Stefan
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-25 07:02
Message:
Of course the stream dict may contain arbitrary keys; but the input PDF here is IMHO broken; at least it's a borderline case (who needs a /BBox or /Group in a stream dict?).
----------------------------------------------------------------------
Comment By: Heiko Oberdiek (oberdiek)
Date: 2008-06-24 22:34
Message:
The /Contents stream object may contain additional
entries; I haven't found anything in the PDF specification
that would forbid this.
Therefore the PDF inclusion code needs fixing. Instead
of copying the whole dictionary, this should be limited
to /Length, /Filter, and /DecodeParms. (I assume
external streams aren't supported.)
Patch 1812 contains a patch for pdftoepdf.cc.
The commented variant A recompresses the stream,
controlled by \pdfcompresslevel. Variant B is more
complicated: it copies the needed entries manually
and copies the stream without recompressing.
Yours sincerely
Heiko
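The idea of copying only the needed entries can be sketched in a few lines of Python. This is an illustration, not the patch to pdftoepdf.cc; `copy_stream_dict` is a hypothetical helper and the dictionary values are made up.

```python
# Keys proposed above to carry over; everything else is dropped.
COPIED_KEYS = ("Length", "Filter", "DecodeParms")


def copy_stream_dict(src: dict) -> dict:
    """Return a new stream dictionary holding only the whitelisted keys."""
    return {key: src[key] for key in COPIED_KEYS if key in src}


# A content stream dictionary resembling the problematic one (values made up).
contents = {
    "Length": 1234,
    "Filter": "FlateDecode",
    "Type": "XObject",
    "BBox": [0, 0, 612, 792],
    "Group": {"S": "Transparency"},
    "Resources": {},
}

print(copy_stream_dict(contents))
# {'Length': 1234, 'Filter': 'FlateDecode'}
```

With the stray /BBox no longer copied, the included page cannot carry a second bounding box from the source stream.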
----------------------------------------------------------------------
Comment By: Hartmut Henkel (hhenkel)
Date: 2008-06-24 20:21
Message:
Right, Martin. Now I see: the page /Contents stream object
no. 3 in crop_problem.pdf looks like an /XObject, but only
/Filter and /Length should be there (or the ones in Table
3.4 of the PDFRef.)
Regards, Hartmut
----------------------------------------------------------------------
Comment By: Hartmut Henkel (hhenkel)
Date: 2008-06-24 18:32
Message:
hmm, there are /BBox entries and /Resources in the many
/Pattern objects, both of which are ok and even required by
the PDF Spec. Also, having the same /Resources in the /Page
and the /XObject dict. should be ok. The crop_problem.pdf
just looks a little weird, but not wrong.
Regards, Hartmut
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-24 14:20
Message:
Can you attach the original PDF from FF?
The content stream dict of crop_problem.pdf has some keys that don't belong there (BBox, Resources)...
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-23 22:54
Message:
Confirmed. We get two BBoxes... :-(
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-23 13:37
Message:
Confirmed. I'm tempted to call it a bug in AR; I'll talk to Adobe.
----------------------------------------------------------------------
Comment By: Stefan Becuwe (sbecuwe)
Date: 2008-06-23 10:58
Message:
I have attached test2-sb.pdf and test2-sb.log.
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-23 10:20
Message:
The page group bug is fixed in 1.40.6, btw.
----------------------------------------------------------------------
Comment By: Martin Schröder (oneiros)
Date: 2008-06-23 10:19
Message:
It works for me. :-)
Can you try the attached test2.tex and attach your pdf and log, please?
----------------------------------------------------------------------
You can respond by visiting:
http://sarovar.org/tracker/?func=detail&atid=493&aid=1751&group_id=106