I would be interested in ways to do this too. I currently need to take a PDF file as input (the conversion itself can already be done with pdftotext, pdftohtml, or even pdf2xml) and parse the resulting XML so that a search for specific text returns the document title, article title, page number, and bibliographic extract from the matching section.
The way the current keywords table is set up, there's no way to correlate keywords at that level of granularity.
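
To make the idea concrete, here is a rough Python sketch of a page-level search over the converted XML. It assumes a layout similar to what `pdftohtml -xml` produces (`<page number="...">` elements containing `<text>` elements); the sample data and the `search_pages` name are made up for illustration, and real output will likely carry more attributes and need more cleanup. It only recovers the page number and matching line, not the document or article title.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of `pdftohtml -xml` output (an assumption);
# real converted files will differ in attributes and structure.
SAMPLE_XML = """<pdf2xml>
<page number="1">
<text top="10" left="10">Journal of Examples</text>
<text top="40" left="10">An article about parsing</text>
</page>
<page number="2">
<text top="10" left="10">Parsing PDF text is harder than it looks.</text>
</page>
</pdf2xml>"""


def search_pages(xml_text, term):
    """Return (page_number, line_text) pairs whose text contains term.

    The match is a case-insensitive substring test on each <text> element.
    """
    root = ET.fromstring(xml_text)
    needle = term.lower()
    hits = []
    for page in root.iter("page"):
        page_no = int(page.get("number"))
        for node in page.iter("text"):
            # itertext() also picks up text nested in inline tags such as <b>
            line = "".join(node.itertext()).strip()
            if needle in line.lower():
                hits.append((page_no, line))
    return hits


if __name__ == "__main__":
    for page_no, line in search_pages(SAMPLE_XML, "parsing"):
        print(page_no, line)
```

Storing the page number alongside each hit is the piece the keywords table would need in order to support this kind of lookup; the titles and bibliographic extract would have to come from additional structure in the XML.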
Leland