Hi,
The URL I index is a normal Apache file listing (1 level deep),
so all the PDFs are linked from it.
You're totally right about the bug in pdftotext: when I ran it on
the same PDF file from the command line, I got the same junk text.
Still, why does it stop indexing after this file?
/Aryan