[NTG-context] Bad PDF to text crawlers

Peter Münster pmlists at free.fr
Wed Aug 19 23:35:05 CEST 2015


On Wed, Aug 19 2015, Kip Warner wrote:

> I was thinking that there must be some way of tricking these bots, 
> depending on how they are implemented, and let's assume they will always 
> find the PDF, to get them to extract only a small invisible layer that 
> just contains some hidden text directing a user to the location to 
> download the original high quality ConTeXt PDF.

Even if you found a way today, tomorrow there would be other bots
that see the same text as humans do.


> Any suggestions?

Get the value of HTTP_USER_AGENT and send the replacement text if the
agent identifies itself as a bot. Or use robots.txt to ask crawlers
not to fetch the PDF at all.
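
A minimal sketch of the user-agent check, assuming the PDF is served by
a Python CGI or WSGI handler where HTTP_USER_AGENT is available in the
environment dictionary. The pattern list, the replacement text and the
function names are made up for illustration; real crawlers vary, and
some do not identify themselves at all.

```python
import re

# Hypothetical pattern: substrings that many crawlers put in their
# User-Agent header. This is an assumption, not a complete list.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

# Hypothetical replacement text sent to crawlers instead of the PDF.
REPLACEMENT_TEXT = "The original PDF can be downloaded from the author's site."


def is_probable_bot(user_agent):
    """Return True if the User-Agent string looks like a crawler."""
    return bool(user_agent) and BOT_PATTERN.search(user_agent) is not None


def choose_body(environ, pdf_bytes):
    """Pick the content type and body based on HTTP_USER_AGENT.

    `environ` is a CGI/WSGI-style dict; `pdf_bytes` is the real PDF.
    """
    agent = environ.get("HTTP_USER_AGENT", "")
    if is_probable_bot(agent):
        return "text/plain", REPLACEMENT_TEXT.encode("utf-8")
    return "application/pdf", pdf_bytes
```

This only catches bots that announce themselves; a crawler sending a
browser-like User-Agent gets the PDF like everyone else. The robots.txt
route has the same limit: it works only for crawlers that honor it.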

-- 
           Peter

