[NTG-context] Bad PDF to text crawlers
Kip Warner
kip at thevertigo.com
Wed Aug 19 23:05:51 CEST 2015
Hey list,
I have an important document online that I would prefer to keep as a PDF
rather than convert to another format. Unfortunately, bots outside my
control frequently extract a text version from it and serve that to
people searching for the document. The extracted text looks absolutely
awful, and it has been a major pain because it leaves readers with a
badly distorted understanding of the document's contents.
I was thinking there must be some way of tricking these bots, depending
on how they are implemented. Assuming they will always find the PDF, I
would like them to extract only a small invisible layer containing
hidden text that directs the reader to where they can download the
original high-quality ConTeXt PDF.
Any suggestions?
--
Kip Warner -- Senior Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com