However, it turns out that pdftotext converts to
fi ff ffi ffl 1234567890,
splitting the fi ligature while leaving ff, ffi and ffl intact, which is strange.
I did not try Adobe Reader, but the PDF is searchable with Apple Preview and the pasted copy is still intact:
fi ff ffi ffl 1234567890
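Since split and intact ligatures can look alike on screen, one way to tell them apart is to inspect the code points of the extracted text. The following is only a minimal sketch: it assumes the extracted text encodes ligatures as the Unicode presentation forms U+FB00 to U+FB04, and the sample string is hypothetical, not actual pdftotext output. The helper names `find_ligatures` and `expand_ligatures` are made up for illustration.

```python
import unicodedata

# Latin ligatures from the Unicode "Alphabetic Presentation Forms" block.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def find_ligatures(text):
    """Return the sorted expansions of all ligature code points found in text."""
    return sorted({LIGATURES[ch] for ch in text if ch in LIGATURES})


def expand_ligatures(text):
    """Expand ligatures into plain letters via NFKC normalization.

    Note: NFKC also rewrites other compatibility characters
    (for example superscript digits), not only ligatures.
    """
    return unicodedata.normalize("NFKC", text)


# Hypothetical sample: four ligature code points followed by digits.
sample = "\ufb01 \ufb00 \ufb03 \ufb04 1234567890"
print(find_ligatures(sample))
print(expand_ligatures(sample))
```

Running `find_ligatures` on text saved from pdftotext or pasted from a viewer would show which ligatures were kept as single code points; an empty list would mean everything was already expanded into plain letters.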
For me, it still doesn't work. I get oldstyle numbers in the text, and the numbers are not searchable in Adobe Reader, Okular, Evince or xpdf. However, I figured out that my version of the font is causing the wrong result.
You are right! I had not considered that. Depending on the font used, pdftotext expands (some of) the ligatures or not. With TeX Gyre Pagella, for instance, there is no ligature expansion at all:
fi ff ffi ffl 1234567890
and with Cambria I get a PDF which is not searchable with Preview:
i ff fi fl 1234567890
Florian