How do I get pdftotext for use with PhpDig?
The FooLabs site links to a mirror at PlanetMirror where you can find precompiled versions of pdftotext for various operating systems.
Go to PlanetMirror and download xpdf-3.00-linux.tar.gz or a later version (this assumes Linux is your operating system). Unpack xpdf-3.00-linux.tar.gz and extract only the pdftotext file (it is an already-compiled binary). FTP just the pdftotext file in binary mode to your account (your cgi-bin directory should allow this file to run). Once the file has been uploaded, change its permissions to rwxr-xr-x (755) if applicable for your operating system.

Now in the PhpDig config file, set the following:

    define('PHPDIG_INDEX_PDF',true); // set to true
    define('PHPDIG_PARSE_PDF','/full/path/to/cgi-bin/pdftotext'); // assuming Linux
    define('PHPDIG_OPTION_PDF',''); // two single quotes, no space in between

Also be sure to set the following in the PhpDig config file:

    define('PHPDIG_PDF_EXTENSION','.txt'); // don't forget the period in .txt

Give PhpDig a whirl and see if it indexes PDF files. Run into a problem? Check this thread.
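If you have shell access on the server (an assumption; the steps above use FTP only), the extract-and-chmod part can be sketched as a small shell function. The function name extract_pdftotext and its two arguments are made up for illustration. Since the directory layout inside the archive is not stated here, the sketch looks up the pdftotext member by name rather than hard-coding its path.

```shell
#!/bin/sh
# Sketch only: pull just the pdftotext binary out of an xpdf archive
# and make it executable. extract_pdftotext is a hypothetical helper name.

extract_pdftotext() {
    archive="$1"   # e.g. xpdf-3.00-linux.tar.gz
    dest="$2"      # e.g. /full/path/to/cgi-bin

    # Find the archive member called pdftotext, whatever directory it sits in.
    member=$(tar -tzf "$archive" | grep -E '(^|/)pdftotext$' | head -n 1)
    if [ -z "$member" ]; then
        echo "pdftotext not found in $archive" >&2
        return 1
    fi

    # Extract only that member into a scratch directory.
    tmpdir=$(mktemp -d) || return 1

    status=0
    tar -xzf "$archive" -C "$tmpdir" "$member" &&
        mkdir -p "$dest" &&
        cp "$tmpdir/$member" "$dest/pdftotext" &&
        chmod 755 "$dest/pdftotext" || status=1   # 755 = rwxr-xr-x

    rm -rf "$tmpdir"
    return "$status"
}
```

It would be called as extract_pdftotext xpdf-3.00-linux.tar.gz /full/path/to/cgi-bin. If you extract on your own machine and upload by FTP instead, the permission change still has to be made on the server after the upload.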