[NTG-pdftex] [pdftex-Bugs][4321] Illegal entry in bfrange block in ToUnicode CMap

pdftex-bugs at sarovar.org pdftex-bugs at sarovar.org
Fri Nov 26 08:19:45 CET 2010

Bugs item #4321, was opened at 2010-11-25 15:21
Status: Open
Priority: 3
Submitted By: Heiko Oberdiek (oberdiek)
Assigned to: Nobody (None)
Summary: Illegal entry in bfrange block in ToUnicode CMap 
Category: None
Group: None
Resolution: Accepted

Initial Comment:

pdfTeX complains
  Error: Illegal entry in bfrange block in ToUnicode CMap
for valid cmap entries, when a PDF file is included.
The CMap entries are, for example:

1 beginbfrange

The error disappears in case of

1 beginbfrange

The error is in function CharCodeToUnicode::parseCMap1 in file

In case of poppler the problem is already reported with patch:


The appended test file can be processed by "pdftex --ini", "pdftex" or

Yours sincerely


>Comment By: Taco Hoekwater (taco)
Date: 2010-11-26 08:19

ToUnicode is a little odd because it uses CMap syntax with a
few extra limitations that appear only in the PDF Reference,
and these seem to come from a really weird bit of Acroread
implementation code.

I have not looked at the input closely, so I could be
missing the point a little, but this could be the problem:

The hex number scanning in AR (Adobe Reader) is closely tied to the
begincodespacerange ... endcodespacerange block. If the code
space range is one byte, then all hex numbers have to be
specified in two digits, and if the code space range is two
bytes, then all further hex numbers have to be given in four digits.
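
The rule described above can be sketched as a small check. This is only an
illustration of the hypothesis, not code from pdfTeX, xpdf or poppler: the
function names are made up, and it assumes the rule applies to the source
codes of each bfrange line (the first two hex strings), since the
destination values are Unicode and are normally written with four digits.

```python
import re

# Matches one hex string such as <41> or <0041> and captures the digits.
HEX_TOKEN = re.compile(r"<([0-9A-Fa-f]+)>")


def codespace_width(cmap_text):
    """Return the byte width of the first codespace range, or None."""
    m = re.search(r"begincodespacerange(.*?)endcodespacerange", cmap_text, re.S)
    if not m:
        return None
    tokens = HEX_TOKEN.findall(m.group(1))
    if not tokens:
        return None
    # Two hex digits per byte.
    return len(tokens[0]) // 2


def mismatched_source_codes(cmap_text):
    """List bfrange source codes whose digit count differs from the
    width implied by the codespace range (hypothetical helper)."""
    width = codespace_width(cmap_text)
    if width is None:
        return []
    bad = []
    for block in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        for line in block.strip().splitlines():
            tokens = HEX_TOKEN.findall(line)
            # Assumption: only the low and high source codes are checked.
            for tok in tokens[:2]:
                if len(tok) != 2 * width:
                    bad.append(tok)
    return bad
```

Given the text of a ToUnicode CMap, mismatched_source_codes returns the
source codes that a strict reader following this rule might reject.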


Comment By: The Thanh Han (hanthethanh)
Date: 2010-11-26 03:32

We would apply the patch from poppler mentioned above.

Regarding the case


it works fine with Preview (OS X) and Acrobat 9, so I think it's a browser



Comment By: Heiko Oberdiek (oberdiek)
Date: 2010-11-25 15:48


I have made further experiments by
replacing the last <0041> by <0042>.
The "A" of the input file should then get
converted to "B" by copy&paste.
This works for the line
with AR7/Linux,
however it fails ("A" instead of "B") in case
The PDF specification shows in section
"5.9 Extraction of Text Content" entries
with four hexadecimal digits.
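
As a rough sketch of what the experiment tests: a bfrange entry gives a low
and a high source code plus a starting destination value, and a code inside
the range maps to the destination plus its offset from the low code. The
function name and the entries used with it are hypothetical and are not the
ones from the test file.

```python
def bfrange_lookup(entry, code):
    """Map a character code through one bfrange entry.

    entry is (src_low, src_high, dst_start) as hex digit strings.
    Returns the mapped character, or None if code is outside the range.
    """
    lo, hi, dst = (int(h, 16) for h in entry)
    if not lo <= code <= hi:
        return None
    # Codes in the range map to consecutive destination values.
    return chr(dst + (code - lo))
```

With a hypothetical entry ("41", "41", "0042"), code 0x41 maps to "B",
which is the result the copy&paste test looks for.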

Can someone shed some light on this obscurity?

Yours sincerely

