[NTG-context] Accessibility and Tagged PDFs: Bugs and Feature Requests

luigi scarso luigi.scarso at gmail.com
Tue Jun 30 10:32:29 CEST 2015

On Sun, Jun 28, 2015 at 12:59 PM, Dr. Dominik Klein <Dominik.Klein at outlook.com> wrote:

> Context is the only Tex-based system that allows one to properly tag a PDF.
> Tagged PDFs are one major requirement for accessibility.
> Indeed, in several large organizations/universities, accessibility is
> mandated by law, and this is a major obstacle for using Tex. In practice
> compliance is often assessed with Acrobat Pro's
> accessibility checker.
> Context produces a nice tag structure, but there are some minor issues
> that prevent compliance with [1], and hence Acrobat Pro complains during the
> check. The main issues are:
> 1.) Elements that are not contained in the structure tree are not marked
> as an artifact. Consider this example:
> -------------------------------
> \setuptagging[state=start]
> \setuppagenumbering
> [location=,
>  alternative=doublesided]
> \setupheadertexts
>   [{Chapter~\getmarking[chapternumber]\hskip1em\getmarking[chapter]}]
>   [{Header Right}]
>   [{Header Left}]
>   [{Chapter~\getmarking[chapternumber]\hskip1em\getmarking[chapter]}]
> \setupfootertexts
>   [Organization Name]
>   [pagenumber]
>   [pagenumber]
>   [Organization Name]
> \starttext
> \startfrontmatter
> something
> \stopfrontmatter
> \startbodymatter
> some more text here
> \stopbodymatter
> \stoptext
> -------------------------------
> Header, footer, page number etc. will not be included in the tag structure.
> Of course this makes perfect sense and is correct; however, according to
> Section of [1], content that is not in the structure
> tree should be marked as an artifact, i.e.
> /Artifact
>   BMC
>   ..
>   EMC
> or in an advanced way with /Artifact PropertyList where the type of
> Artifact can be defined. It would be nice if those elements that are not
> included in the tag tree would be marked as artifacts by default. The same
> holds for \startelement[ignore] when one wants to explicitly remove
> something from the structure tree.
> 2.) Images without alternate text:
> According to Section 14.9.3 of [1], alternate descriptions in human
> readable text should be provided for images. It would be really helpful,
> if these could be defined in the source tex file, and then automatically
> added when creating the object in the structure tree. I.e. it would be
> nice to have something like:
> \placefigure[top][Image Reference]{Caption}{
> \externalfigure[cow.pdf][width=10cm][alternate text = "This image shows a
> beautiful cow."]
> }
> The same holds for formulas: although the MathML-like tagging of Context is
> very advanced, sometimes it might still be helpful to supply a textual
> description (alt-text ="The definition of the Pythagorean theorem: a^2 +
> b^2 = c^2")
> 3.) Tag names of the resulting tag structure:
> Section 14.8.4 of [1] defines standard structure types, such as <H>, <P>,
> <Sect> etc. Context creates a tag-tree that uses names directly
> representing the structure names of the context language, such as
> <sectiontitle>. This should however be mapped to something standard, such
> as <H>. Interestingly these mappings seem to have been considered in
> strc-tag.mkiv but I was unable to generate such a tagged pdf.
> Editing/commenting out things in strc-tag.mkiv didn't work for me. It would
> be nice if there was a switch somewhere, i.e.
> \setuptagging[state=start,tagnames=pdf17] - or maybe I overlooked something?
> 4.) Acrobat Pro always complains that the language for the whole document
> is not set.
> 5.) Tables
> The generated structure looks something like this:
> <table>
>  <tablerow>
>    <tablecell>
>    ...
>  <tablerow>
>    <tablecell>
>  ...
> Here, not only are the tag names non-compliant, but the tag structure
> should also distinguish between the table header (THead) and the table rows
> (TBody), cf. Section of [1]. A simple heuristic would be
> to always put the first row into THead tags, and the rest of the table
> into TBody.
> 6.) It would be nice if a flat tag structure could be created optionally.
> This is not a required feature according to [1], and in fact a properly
> nested structure is surely preferable for the final output; for debugging
> or checking during document creation however, a flat structure tree
> sometimes is easier to browse through.
> All in all, these seem to be the only issues that prevent accessible PDF
> documents with context. For those within an organization where
> accessibility is required legally for all publications, compliance with at
> least Acrobat Pro's checks is a huge issue. I do not know how difficult
> these things are to implement in Context (personally I am just lost in the
> code), but looking at e.g. tex.stackexchange
> for questions related to accessibility, this is indeed a major obstacle for
> several people.
> cheers
> - Dominik
> [1] ISO 32000-1:2008, available at
> http://www.adobe.com/devnet/pdf/pdf_reference.html
> ___________________________________________________________________________________

Thank you for the report.
It would be nice to have a PDF made by ConTeXt using \nopdfcompression
that shows all these issues, together with the report emitted by Acrobat.
The last time I checked a PDF/A-1a made by ConTeXt it was all OK, but that
was some time ago, and maybe not all the features were tested thoroughly.
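
A minimal sketch of such a test file could look like the following. It
reuses the tagging setup and cow.pdf from the report; the chapter title,
the figure reference, the caption and the table cells are placeholders,
and \mainlanguage[en] is included only on the assumption that it is
relevant to issue 4.
-------------------------------
% keep the PDF streams uncompressed so the tags can be read in a text editor
\nopdfcompression
\setuptagging[state=start]
% assumption: may or may not influence the document language entry (issue 4)
\mainlanguage[en]
% header and footer: not in the structure tree (issue 1)
\setupheadertexts[{Header Left}][{Header Right}]
\setupfootertexts[Organization Name][pagenumber]
\starttext
% placeholder title, to inspect the tag names of headings (issue 3)
\startchapter[title={Test chapter}]
some text here
% figure without alternate text (issue 2)
\placefigure[here][fig:cow]{A cow}{\externalfigure[cow.pdf][width=4cm]}
% small table, to inspect the row and cell tags (issue 5)
\bTABLE
  \bTR \bTD Name \eTD \bTD Value \eTD \eTR
  \bTR \bTD a    \eTD \bTD 1     \eTD \eTR
\eTABLE
\stopchapter
\stoptext
-------------------------------
Running Acrobat Pro's accessibility check on the resulting PDF and
sending both the PDF and the report would show which of the issues
can be reproduced.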
