01-06-2004, 05:39 AM   #1
zevince
pdf indexing with pstotext

Hi,

I'm running Apache 1.3.28 with PHP 4.3.4RC1 and phpdig 1.6.4 (hmm, I should upgrade...).
But here is my problem:
I've got a lot of PDF files, and I want them to be indexed.

I've installed pstotext, which works correctly (pstotext "nameofthefile.pdf" prints the contents of the PDF file to STDOUT).
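
For reference, the command-line check I mean can be sketched roughly like this. It is only a minimal sketch: check_extractor is a hypothetical helper name I made up, not part of pstotext or phpdig, and it only tests that the binary is executable and prints some text to STDOUT for a given file.

```shell
#!/bin/sh
# check_extractor BIN FILE
# Hypothetical helper (not part of pstotext or phpdig): verifies that BIN is
# executable and that running "BIN FILE" prints non-empty text to STDOUT.
check_extractor() {
    bin=$1
    file=$2

    # Same idea as an is_executable test: the path must point to a runnable file.
    if [ ! -x "$bin" ]; then
        echo "not executable: $bin"
        return 1
    fi

    # Capture STDOUT only; error messages are discarded.
    out=$("$bin" "$file" 2>/dev/null) || {
        echo "command exited with an error: $bin $file"
        return 3
    }

    # Empty output means there would be nothing to index.
    if [ -z "$out" ]; then
        echo "no text on STDOUT from: $bin $file"
        return 2
    fi

    echo "ok: ${#out} characters of text on STDOUT"
}

# Example call (path taken from my config, file name is a placeholder):
#   check_extractor /usr/local/bin/pstotext "nameofthefile.pdf"
```

Running it under the same account the web server uses might be closer to what phpdig sees, but that is an assumption about the setup on my part.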

I've changed the phpdig config file to use it:

Quote:
define('USE_IS_EXECUTABLE_COMMAND','1'); //use is_executable for external binaries

// if set to true, full path to external binary required
define('PHPDIG_INDEX_MSWORD',false);
define('PHPDIG_PARSE_MSWORD','/usr/local/bin/catdoc');
define('PHPDIG_OPTION_MSWORD','-s 8859-1');

define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','/usr/local/bin/pstotext');
define('PHPDIG_OPTION_PDF','');

define('PHPDIG_INDEX_MSEXCEL',false);
define('PHPDIG_PARSE_MSEXCEL','/usr/local/bin/xls2csv');
define('PHPDIG_OPTION_MSEXCEL','');

//---------EXTERNAL TOOLS EXTENSIONS
// if external binary is not STDOUT or different extension needed
// for example, use .txt if external binary writes to filename.txt
define('PHPDIG_MSWORD_EXTENSION','');
define('PHPDIG_PDF_EXTENSION','');
define('PHPDIG_MSEXCEL_EXTENSION','');

Does that look OK?

When I refresh my site in the phpdig admin, the PDF files are found and seem to be indexed, but when I search for a name that appears in the PDF text, I get no results.

So where could the problem be?