Hi Luigi,
PDF spec: www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf, plus the xpdf sources
That sounds like a definitive reference ... probably more than I can digest for a start :-( Perhaps there's a simple, example-driven guide for dummies somewhere? Something like writing a simple PDF document by hand, with more than one page, non-contiguous text blocks and perhaps a hyperlink ...
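For orientation, here is a rough sketch of what "by hand" can look like, using only the Python standard library. It is not taken from any guide. The function name build_pdf, the US Letter page size and the choice of Helvetica are assumptions made for illustration, and it covers only multiple pages with one text block each (no hyperlink).

```python
def build_pdf(page_texts):
    """Return the bytes of a minimal PDF with one page per string in page_texts.

    Illustrative sketch only: one Helvetica text line per page, no links.
    """
    # Object numbering: 1 = catalog, 2 = page tree, 3 = font,
    # then for page i: 4+2i = page object, 5+2i = its content stream.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        None,  # page tree, filled in once the page objects are known
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    kids = []
    for i, text in enumerate(page_texts):
        page_num = 4 + 2 * i
        kids.append(f"{page_num} 0 R")
        # Backslashes and parentheses must be escaped inside a PDF string.
        escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        # BT/ET bracket a text object; Tf sets the font, Td moves, Tj draws.
        stream = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")
        objects.append(
            (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
             f"/Resources << /Font << /F1 3 0 R >> >> "
             f"/Contents {page_num + 1} 0 R >>").encode("ascii"))
        objects.append(
            b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream")
    objects[1] = (f"<< /Type /Pages /Kids [{' '.join(kids)}] "
                  f"/Count {len(page_texts)} >>").encode("ascii")

    out = bytearray(b"%PDF-1.4\n")
    offsets = []  # byte offset of each object, needed for the xref table
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the result of build_pdf(["Page one", "Page two"]) to a file in binary mode should give a two-page document. Opening that file in a text editor shows the objects, the cross-reference table and the trailer in a form that can be compared against the spec.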
Under Linux there is pdfedit, which is experimental: http://pdfedit.petricek.net/en/index.html
Installed :-) OK, I've just managed to remove a page from a given PDF file. Beyond that one will probably have to know more about PDF ... for example, do you know how pdfedit can help me identify which object in the raw PDF corresponds to a given blob of text on the page? After selecting the blob I get some info on my selection that eludes me :-(
pyPdf is a low-level Python module: http://pybrary.net/pyPdf/
This looks very interesting!
As an exercise, you can try to mimic pdffonts in Python with pyPdf (on PDFs with TTF, OTF, Type 1 fonts, etc.)
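To make the exercise a little more concrete: pdffonts is the xpdf command-line tool that lists the fonts used in a PDF. A rough sketch of the same idea might look like the code below. It assumes the current pypdf package (the maintained successor to pyPdf) and its PdfReader class. The helper names list_fonts and list_fonts_in_file are made up for illustration. It only looks at each page's own /Resources entry, so fonts inherited from a parent node or used inside embedded forms would be missed.

```python
def _resolve(obj):
    """Follow an indirect reference if the object offers get_object().

    pypdf objects provide this method; plain Python values are returned as-is.
    """
    return obj.get_object() if hasattr(obj, "get_object") else obj


def list_fonts(pages):
    """Return sorted (base font name, font subtype) pairs found on the pages.

    `pages` is any iterable of dictionary-like page objects.
    """
    found = set()
    for page in pages:
        page = _resolve(page)
        if "/Resources" not in page:
            continue  # simplification: inherited resources are ignored
        resources = _resolve(page["/Resources"])
        if "/Font" not in resources:
            continue
        fonts = _resolve(resources["/Font"])
        for key in fonts:
            font = _resolve(fonts[key])
            name = str(_resolve(font.get("/BaseFont", "[none]")))
            subtype = str(_resolve(font.get("/Subtype", "[unknown]")))
            found.add((name, subtype))
    return sorted(found)


def list_fonts_in_file(path):
    """Hypothetical convenience wrapper; needs the third-party pypdf package."""
    from pypdf import PdfReader  # imported here so list_fonts works without it
    return list_fonts(PdfReader(path).pages)
```

Calling list_fonts_in_file on a PDF path should then return pairs such as a /BaseFont name together with /Type1 or /TrueType, which is roughly the name and type columns that pdffonts prints.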
I'm afraid I don't understand :-(

Best,
Oliver