Re: [NTG-context] Accessibility and Tagged PDFs: Bugs and Feature Requests
On Wed Jul 1, 18:47:57 CEST 2015, Hans Hagen wrote:
/Artifact BMC .. EMC
i'll add the simple variant (i see no need to add properties to something that is supposed to be ignored anyway)
thanks!
2.) Images without alternate text:
i'll pass the label to the tag as alt text
\externalfigure[t:/sources/cow.pdf][label=whatever]
Again, thanks!
3.) Tag names of the resulting tag structure: Section 14.8.4 of [1] defines standard structure types,
The set of those standard tags is rather limited and imo one of the craziest things in pdf as we then end up with abuse of those html tags (and probably endless discussions on what to map onto what). I don't even have a clue what it would add to the concept either. Reflow is a braindead thing anyway.
Indeed, the set of those tags is very limited. Unfortunately, as far as I know, some screen readers (for the visually impaired) use these as navigation aids, i.e. press button "jump to next section", and the reader will look for the next section marked as <Sect> or something. Is it difficult to make the mapping user-definable in the source tex-file? Say, with a command like this:
\definemapping[ section=Sect, sectiontitle=H, sectionnumber=H, ... tablerow=TR ... ]
It would then give users control over what to map onto what, depending on what kind of documents they create.
All in all, these seem to be the only issues that prevent accessible PDF documents with context. For those within an organization where accessibility is required legally for all publications, compliance with at least Acrobat Pro's checks is a huge issue. I do not know how difficult these things are to implement in Context (personally I am just lost in the code), but looking at e.g. tex.stackexchange for questions related to accessibility, this is indeed a major obstacle for several people.
In fact adding pdf tagging to context was rather easy. Some time was So, it's not that difficult to add features, more a matter of priorities and motivation (apart from the fact that my acrobat is a bit old by now so I cannot really test).
I can fully understand that such things are not of the highest priority. Nevertheless accessibility plays more and more a role, e.g. lately, even conferences like http://chi2015.acm.org/authors/guide-to-an-accessible-submission/ require accessible pdfs (the workflow they suggest, i.e. tagging a pdf by acrobat pro after compiling of course doesn't work at all - the generated structure is useless). Hence, for some users, it makes all the difference. For example for me and some other friends, it would allow to change from using Microsoft Word to a ConTeXt based workflow.
cheers - Dominik
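The user-defined mapping suggested in this message can be pictured as a simple lookup from document-level tags to standard structure types. The following is only a sketch of how a consumer might resolve such a mapping; the function name `resolve`, the dictionary `ROLE_MAP`, and the small `STANDARD_TYPES` subset are hypothetical and are not part of ConTeXt or of any PDF library. Following chains of mappings is an assumption made here for illustration.

```python
# Hypothetical illustration: none of these names are ConTeXt or PDF-library APIs.

# A small subset of standard structure types, for illustration only.
STANDARD_TYPES = {"Document", "Part", "Div", "Sect", "H", "H1", "P",
                  "Table", "TR", "TD", "Figure"}

# The mapping proposed in the message, written as a plain dictionary.
ROLE_MAP = {
    "section": "Sect",
    "sectiontitle": "H",
    "sectionnumber": "H",
    "tablerow": "TR",
}


def resolve(tag, role_map, standard=STANDARD_TYPES):
    """Follow role_map from tag until a standard type is reached.

    Returns None when the tag is unmapped or the mapping is cyclic.
    """
    seen = set()
    while tag not in standard:
        if tag in seen or tag not in role_map:
            return None
        seen.add(tag)
        tag = role_map[tag]
    return tag


print(resolve("sectiontitle", ROLE_MAP))  # H
print(resolve("tablerow", ROLE_MAP))      # TR
print(resolve("unknown", ROLE_MAP))       # None
```

A tag that has no entry simply stays unresolved, which is the case a user-definable mapping would let authors fix per document.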
On 7/3/2015 10:12 AM, Dominik Klein wrote:
On Wed Jul 1, 18:47:57 CEST 2015, Hans Hagen wrote:
/Artifact BMC .. EMC
i'll add the simple variant (i see no need to add properties to something that is supposed to be ignored anyway)
thanks!
2.) Images without alternate text:
i'll pass the label to the tag as alt text
\externalfigure[t:/sources/cow.pdf][label=whatever]
Again, thanks!
3.) Tag names of the resulting tag structure: Section 14.8.4 of [1] defines standard structure types,
The set of those standard tags is rather limited and imo one of the craziest things in pdf as we then end up with abuse of those html tags (and probably endless discussions on what to map onto what). I don't even have a clue what it would add to the concept either. Reflow is a braindead thing anyway.
Indeed, the set of those tags is very limited. Unfortunately, as far as I know, some screen readers (for the visually impaired) use these as navigation aids, i.e. press button "jump to next section", and the reader will look for the next section marked as <Sect> or something.
Is it difficult to make the mapping user-definable in the source tex-file? Say, with a command like this:
\definemapping[ section=Sect, sectiontitle=H, sectionnumber=H, ... tablerow=TR ... ]
It would then give users control over what to map onto what, depending on what kind of documents they create.
All in all, these seem to be the only issues that prevent accessible PDF documents with context. For those within an organization where accessibility is required legally for all publications, compliance to at least Acrobat Pro's checks is a huge issue. I do not know how difficult these things are to implement in Context (personally I am just lost in the code), but looking at e.g. tex.stackexchange for question related to accessibility, this is indeed a major obstacle for several people.
In fact adding pdf tagging to context was rather easy. Some time was So, it's not that difficult to add features, more a matter of priorities and motivation (apart from the fact that my acrobat is a bit old by now so I cannot really test).
I can fully understand that such things are not of the highest priority. Nevertheless accessibility plays more and more a role, e.g. lately, even conferences like http://chi2015.acm.org/authors/guide-to-an-accessible-submission/ require accessible pdfs (the workflow they suggest, i.e. tagging a pdf by acrobat pro after compiling of course doesn't work at all - the generated structure is useless).
Hence, for some users, it makes all the difference. For example for me and some other friends, it would allow to change from using Microsoft Word to a ConTeXt based workflow.
\nopdfcompression
\setuptagging[state=start]
\starttext
\chapter{whatever}
\stoptext

gives a pdf with a rolemap like:

11 0 obj
<<
  /ParentTree 12 0 R
  /K 29 0 R
  /RoleMap << /sectiontitle /H /section /Sect /sectionnumber /H /document /Div >>
  /Type /StructTreeRoot
>>
endobj

but as usual it's hard to check what gets done with such things

(i'm pretty sure that context was one of the first to support for instance field (widget) trees but support for that in viewers changes each version so one never knows what is the right way as specs predate support in viewers)

(in the same fashion tagging and layers is/are useless till it gets supported in other viewers than acrobat)

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
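Since the example uses \nopdfcompression, the structure tree root can be read as plain text, so one way to confirm what was written to the file (though not what a viewer does with it) is to pull the /RoleMap entries out with a small script. This is a minimal sketch, assuming the role map is written as an inline dictionary of names, as in the object shown above; the function name `extract_rolemap` is hypothetical.

```python
import re


def extract_rolemap(pdf_text):
    """Return the /RoleMap entries found in uncompressed PDF text, or {}.

    Assumes an inline dictionary containing only name pairs (no nested
    dictionaries and no indirect reference), as in the object above.
    """
    match = re.search(r"/RoleMap\s*<<(.*?)>>", pdf_text, re.DOTALL)
    if match is None:
        return {}
    # Collect every PDF name in the dictionary body, in order.
    names = re.findall(r"/([^\s/<>\[\]()]+)", match.group(1))
    # Names alternate key, value, key, value ...
    return dict(zip(names[0::2], names[1::2]))


sample = (
    "11 0 obj << /ParentTree 12 0 R /K 29 0 R "
    "/RoleMap << /sectiontitle /H /section /Sect "
    "/sectionnumber /H /document /Div >> "
    "/Type /StructTreeRoot >> endobj"
)

for custom_tag, standard_tag in extract_rolemap(sample).items():
    print(custom_tag, "->", standard_tag)
```

For a real file, the text could be obtained by reading the PDF in binary mode and decoding it as latin-1 before passing it to the function.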
On Fri, 3 Jul 2015 19:19:58 +0200, Hans Hagen wrote:
(i'm pretty sure that context was one of the first to support for instance field (widget) trees but support for that in viewers changes each version so one never knows what is the right way as specs predate support in viewers)
(in the same fashion tagging and layers is/are useless till it gets supported in other viewers than acrobat)
Maybe a subject to discuss at the next ConTeXt meeting.
Alan
On 7/4/2015 6:45 PM, Alan BRASLAU wrote:
On Fri, 3 Jul 2015 19:19:58 +0200 Hans Hagen wrote:
(i'm pretty sure that context was one of the first to support for instance field (widget) trees but support for that in viewers changes each version so one never knows what is the right way as specs predate support in viewers)
(in the same fashion tagging and layers is/are useless till it gets supported in other viewers than acrobat)
Maybe a subject to discuss at the next ConTeXt meeting.
sure but in the meantime we need to find a way to determine what works and what not, for instance, as i mentioned that context already adds a rolemap

11 0 obj
<<
  /ParentTree 12 0 R
  /K 29 0 R
  /RoleMap << /sectiontitle /H /section /Sect /sectionnumber /H /document /Div >>
  /Type /StructTreeRoot
>>
endobj

we have no way to check if that works (so maybe we need to have a page on the wiki with a viewer/functionality matrix)

(ok, we could peek into files produced by word and see what gets added there but even then)

Hans
Am 05.07.15 um 13:11 schrieb Hans Hagen:
On 7/4/2015 6:45 PM, Alan BRASLAU wrote:
On Fri, 3 Jul 2015 19:19:58 +0200 Hans Hagen wrote:
sure but in the meantime we need to find a way to determine what works and what not, for instance, as i mentioned that context already adds a rolemap

11 0 obj
<<
  /ParentTree 12 0 R
  /K 29 0 R
  /RoleMap << /sectiontitle /H /section /Sect /sectionnumber /H /document /Div >>
  /Type /StructTreeRoot
>>
endobj

we have no way to check if that works (so maybe we need to have a page on the wiki with a viewer/functionality matrix)
The whole rolemap thing and how Acrobat Pro handles it leaves me somewhat puzzled.

Taking https://github.com/asdfjkl/tex-access/blob/master/rolemap.tex and compiling will give the rolemap as Hans described above. Looking at the Tag structure, this seems to be ignored by acrobat (but why?), see https://github.com/asdfjkl/tex-access/blob/master/rolemap.PNG

What would be expected is this, right? After all, the rolemap should be interpreted, shouldn't it (mapping /H to /H1 was a mistake of mine, but it doesn't change the fact). https://github.com/asdfjkl/tex-access/blob/master/rolemap2.PNG

After changing things manually in the tag editor in acrobat, and saving the pdf again, this is obtained: https://github.com/asdfjkl/tex-access/blob/master/rolemap_edited.pdf

Note this:

<<
  /RoleMap << /document /Div /sectionnumber /H /sectiontitle /H /section /Sect >>
  /Type /StructTreeRoot
  /ParentTree 12 0 R
  /K 29 0 R
>>

and also the different structure elements at the start of the pdf...

I am lost here...

cheers - Dominik
On 7/5/2015 10:04 PM, Dr. Dominik Klein wrote:
Am 05.07.15 um 13:11 schrieb Hans Hagen:
On 7/4/2015 6:45 PM, Alan BRASLAU wrote:
On Fri, 3 Jul 2015 19:19:58 +0200 Hans Hagen wrote:
sure but in the meantime we need to find a way to determine what works and what not, for instance, as i mentioned that context already adds a rolemap

11 0 obj
<<
  /ParentTree 12 0 R
  /K 29 0 R
  /RoleMap << /sectiontitle /H /section /Sect /sectionnumber /H /document /Div >>
  /Type /StructTreeRoot
>>
endobj

we have no way to check if that works (so maybe we need to have a page on the wiki with a viewer/functionality matrix)
The whole rolemap thing and how Acrobat Pro handles it leaves me somewhat puzzled.
Taking https://github.com/asdfjkl/tex-access/blob/master/rolemap.tex and compiling will give the rolemap as Hans described above. Looking at the Tag structure, this seems to be ignored by acrobat (but why?), see https://github.com/asdfjkl/tex-access/blob/master/rolemap.PNG
i always suspect a chicken-egg issue there: someone wants a feature, it gets added to pdf, then there is waiting for some typesetting engine to support it, and then acrobat might do something with it and afterwards the spec gets adapted (or interpretation is adapted) .. it happened with widgets and such

the interesting thing about tex is that we can easily adapt to such new features but have no way of testing it (some relates to the fact that pdf is both a document format and a storage format for e.g. illustrator so it's some hybrid)

maybe it's just: if we have tags it is accessible by definition, no matter if it can be used or not
What would be expected is this, right? After all, the rolemap should be interpreted, shouldn't it (mapping /H to /H1 was a mistake of mine, but it doesn't change the fact). https://github.com/asdfjkl/tex-access/blob/master/rolemap2.PNG After changing things manually in the tag editor in acrobat, and saving the pdf again, this is obtained: https://github.com/asdfjkl/tex-access/blob/master/rolemap_edited.pdf
maybe the H etc is only used with reflow ... and reflow is weird in itself as one can then better provide an html file alongside the pdf

it makes me wonder how a complex doc with mostly H's would look / be interpreted as that is then the dominant structure thing
Note this: << /RoleMap << /document /Div /sectionnumber /H /sectiontitle /H /section /Sect >> /Type /StructTreeRoot /ParentTree 12 0 R /K 29 0 R >>
and also the different structure elements at the start of the pdf...
the order of /Key values in the dicts is not important and hashes are often unordered (different per application or even per run for some applications for security reasons); when you play with widgets you will also observe that acrobat adds rendered content to the file as addition to the key/values

Hans
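The point about key order can be checked directly on the two role maps quoted in this thread. The sketch below parses both into dictionaries and compares them; the helper name `parse_name_dict` is hypothetical, and it assumes the dictionary body contains only simple name pairs.

```python
import re


def parse_name_dict(text):
    """Parse a flat run of PDF names (/key /value ...) into a dict."""
    names = re.findall(r"/([A-Za-z0-9]+)", text)
    # Names alternate key, value, key, value ...
    return dict(zip(names[0::2], names[1::2]))


# Role map as written by ConTeXt (from the thread).
context_output = "/sectiontitle /H /section /Sect /sectionnumber /H /document /Div"
# Role map after saving from the Acrobat tag editor (from the thread).
acrobat_output = "/document /Div /sectionnumber /H /sectiontitle /H /section /Sect"

# Dictionary comparison ignores the order in which the keys were written.
print(parse_name_dict(context_output) == parse_name_dict(acrobat_output))  # True
```

So the reordered dictionary in the edited file carries the same mapping; any difference in how Acrobat displays the tags would have to come from elsewhere in the file.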