[NTG-pdftex] [ pdftex-Feature Requests-429 ] Generate Tagged PDF

noreply at sarovar.org noreply at sarovar.org
Sat May 6 17:20:42 CEST 2006


Feature Requests item #429, was opened at 2005-09-11 22:37
You can respond by visiting: 
http://sarovar.org/tracker/?func=detail&atid=496&aid=429&group_id=106

Category: None
Group: None
Status: Open
Resolution: None
Priority: 4
Submitted By: Timothy O'Brien (oberon101)
Assigned to: Martin Schröder (oneiros)
Summary: Generate Tagged PDF

Initial Comment:
Adobe Reader has a 'reflow' feature that allows 
visually impaired users to zoom into properly 
formatted documents and read them without having to 
move the viewable area back and forth across the 
page.  PDFs made in MiKTeX with pdftex are not properly 
formatted and reflow without interword spacing, 
rendering them unreadable. I contacted the MiKTeX 
people and they referred me here. I believe Scientific 
Word also uses pdftex with the same result.

Any chance this could be fixed?

----------------------------------------------------------------------

Comment By: Robert (schlcht)
Date: 2006-05-06 15:20

Message:
Logged In: YES 
user_id=2217

The main problem with reflowing is not the missing tags but
that pdftex writes interword spaces as a kern (since there
is no "space" in TeX, of course). A rather simple but
effective way would be to write the interword spaces in a
different font (e.g. non-embedded Times-Roman), and then
compensate for the difference between Times's width of space
and the width of the glue calculated by TeX. (At least, this
is what Distiller does, if you select "Advanced ->
Accessibility -> Add Tags to Document".)

 So that

 [(This)-419(is)-420(an)-419(example)]TJ

 will be turned into:

 /T1_0 1 Tf
 (This)Tj
 /T1_1 1 Tf
 ( )Tj
 /T1_0 1 Tf
 2.369 0 Td
 (is)Tj
 /T1_1 1 Tf
 ( )Tj
 /T1_0 1 Tf
 1.092 0 Td
 (an)Tj
 /T1_1 1 Tf
 ( )Tj
 /T1_0 1 Tf
 1.475 0 Td
 (example)Tj

where T1_0 is cmr10 and T1_1 is Times-Roman.

This would already be a major enhancement with respect to
accessibility without any packages being required.
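
To make the arithmetic in the example explicit, here is a minimal Python
sketch of the rewrite. It is not pdftex code and not what Distiller
actually runs; the function names and the glyph-width table are assumptions
made for illustration (the widths are approximate cmr10 advance widths in
1/1000 text-space units). Each Td offset is measured from the start of the
previous word, so it equals that word's width plus the kern, and the width
of the Times space glyph never enters the calculation.

```python
# Approximate cmr10 advance widths in 1/1000 text-space units.
# Assumed values, only covering the letters needed for the example.
CMR10_WIDTHS = {"T": 722.2, "h": 555.6, "i": 277.8,
                "s": 394.4, "a": 500.0, "n": 555.6}


def word_width(word, widths=CMR10_WIDTHS):
    """Sum the advance widths of all characters in a word."""
    return sum(widths[ch] for ch in word)


def spaces_as_glyphs(items, text_font="/T1_0", space_font="/T1_1"):
    """Turn TJ-array items (strings and kerns) into Tj/Td operators.

    Every number is treated as an interword space, which is a
    simplification: a real implementation would have to tell glue
    apart from genuine kerns. Parentheses in strings are not escaped.
    """
    ops = []
    last_word = None
    pending_offset = None  # Td offset to emit before the next word
    for item in items:
        if isinstance(item, str):
            ops.append(f"{text_font} 1 Tf")
            if pending_offset is not None:
                ops.append(f"{pending_offset:.3f} 0 Td")
                pending_offset = None
            ops.append(f"({item})Tj")
            last_word = item
        else:
            # A TJ number is subtracted from the position, so a negative
            # value moves right. Offset from the previous word's start:
            pending_offset = (word_width(last_word) - item) / 1000.0
            ops.append(f"{space_font} 1 Tf")
            ops.append("( )Tj")
    return ops


for op in spaces_as_glyphs(["This", -419, "is", -420, "an", -419, "example"]):
    print(op)
```

With the widths assumed above, this prints the same operator sequence as
the hand-written example.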


----------------------------------------------------------------------

Comment By: Nobody (None)
Date: 2006-04-14 11:33

Message:
Logged In: NO 

Maybe a first version can use a very low-level solution,
just with a single tagging primitive; there is such a command
already (I think it is called \pdfliteral). And the tree can come
later. So one could start with little work, assuming one knows
tagging.
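
As a rough illustration of what such a low-level approach might emit, the
following Python sketch builds the pair of \pdfliteral calls that would
bracket a piece of content with PDF marked-content operators (BDC ... EMC).
The helper name is hypothetical, and this covers only the content marking:
the structure tree that refers to each MCID, which is the part deferred
above, is not addressed, and neither is balancing the operators across
page breaks.

```python
def tagging_literals(tag, mcid):
    """Return (begin, end) \\pdfliteral calls for one marked-content span.

    tag  -- structure type name without the slash, e.g. "P" for a paragraph
    mcid -- marked-content identifier, unique within the page
    Hypothetical helper for illustration; not an existing pdfTeX interface.
    """
    begin = f"\\pdfliteral{{/{tag} <</MCID {mcid}>> BDC}}"
    end = "\\pdfliteral{EMC}"
    return begin, end


begin, end = tagging_literals("P", 0)
print(begin)
print("Some paragraph text.")
print(end)
```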

CS

----------------------------------------------------------------------

Comment By: Martin Schröder (oneiros)
Date: 2006-04-14 11:12

Message:
Logged In: YES 
user_id=421

I'm changing the summary. Yes, we are aware that tagged pdf
is an often requested feature, but implementing it would be
non-trivial:
- first pdfTeX would have to be extended with primitives for a
structure tree (and classes and packages would have to use
these primitives)
- then primitives for tagging the content are needed and
must be used

----------------------------------------------------------------------

Comment By: Nobody (None)
Date: 2005-11-04 19:58

Message:
Logged In: NO 


I would volunteer to test the feature. I am writing
a rather long pdf produced with pdftex that is 
downloadable for free ( http://www.motionmountain.net )
and readers regularly ask why it cannot be read aloud.
Pdftex probably would only need to be extended
with a single command - something like 
\writetaghere{tagtype} - and all the rest could 
be done by extensions to the latex cls and sty files.

CS

----------------------------------------------------------------------
