|
12-01-2004, 02:29 AM | #1 |
Green Mole
Join Date: Dec 2004
Posts: 5
|
Win 98 + Easyphp & binary problem
Hello,
Has anyone managed to set up PDF indexing with Win 98 + EasyPHP (PHP Version 4.3.3) + the latest phpdig + the "pdftotext" binary? It is difficult to work through the advised "checklist", because the PHP function "is_executable" does not seem to work on PHP 4.3.3 (?). I get this:

Is result test http an array: 1
What is result test http status: HTML
Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1
Fatal error: Call to undefined function: is_executable() in c:\web\www\phpdig\admin\robot_functions.php on line 963

And if I cut this line (to see what happens next), it goes on to:

Is result test http an array: 1
What is result test http status: HTML
Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1
Doublon avec un document existant
1:http://localhost/Documentation/Revues/ (temps : 00:00:06)
+ niveau 1...
Is result test http an array: 1
What is result test http status: PDF
Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1
Command is: C:\Web\cgi-bin\pdftotext.exe -cork ../admin/temp/62981782.tmp2>&1
Result contains: Array ( )
Return value is: 0
2:http://localhost/Documentation/Revues/Lisezmoi.pdf (temps : 00:00:18)
Pas de liens dans la table temporaire

Thanks for your help .... |
12-03-2004, 12:12 AM | #2 |
Green Mole
Join Date: Dec 2004
Posts: 5
|
OK, after several readings, I have solved the problem for PDF (I wasn't the only one, apparently, and it was just a problem with the option line given to the binary).
I am now stuck on a similar problem with "Word" documents (I use Doc2txt for the text conversion). Whenever I try to index those documents, the indexing is not done, but two files are left behind: a 98183891.tmp file in the admin/temp directory and a 98183892.txt file in the admin directory. By the way, the latter is the correct text conversion of the original Word document. The one in the temp directory contains only one line, saying something like:
C:Web\www\phpdig\admin\temp\98183891.tmp --> "" "" \admin\98183892.txt
Has anyone experienced this? Thanks
Last edited by sofos; 12-03-2004 at 12:19 AM. |
12-03-2004, 04:43 AM | #3 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Try setting define('PHPDIG_MSWORD_EXTENSION',''); to define('PHPDIG_MSWORD_EXTENSION','.txt'); in the config file, making sure to have the period on the .txt extension.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
12-03-2004, 06:34 AM | #4 |
Green Mole
Join Date: Dec 2004
Posts: 5
|
Hi Charter, That was already set like this (actually, I have duplicated the 'pdf' settings, except the name of the binary, of course).
|
12-03-2004, 06:46 AM | #5 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Try running Doc2Txt from shell and verify that filename.doc is output to filename.txt - maybe Doc2Txt outputs to filename.doc.txt or something else?
|
12-03-2004, 07:10 AM | #6 |
Green Mole
Join Date: Dec 2004
Posts: 5
|
I will try this.
Just to make sure I have been clear enough: the text conversion seems to work just fine, since the 98183892.txt file in admin (which, I guess, was previously in admin/temp) is the correct text extraction of the right Word doc. Anyway, I will try your advice and get back to you on Monday. Have a good weekend |
12-03-2004, 10:18 PM | #7 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
OIC, so try changing:
PHP Code:
PHP Code:
|
12-06-2004, 01:33 AM | #8 |
Green Mole
Join Date: Dec 2004
Posts: 5
|
Hi, it works now. I will explain the fix in case someone else uses Doc2txt on a Windows + EasyPHP configuration:
When using Doc2txt, it is necessary to specify three flags in the options ("Doc2txt -q -o /admin/temp -E txt filename.doc"):
"-q" to make it quiet,
"-o /admin/temp" to force the generated text file into the right directory,
"-E txt" to force the right extension.
The problem was mainly here, because Phpdig expects that "filename.tmp" (the local copy of the given "filename.doc") will be converted by Doc2txt into "filename.tmp.txt". But if the -E flag is omitted, Doc2txt generates "filename.txt" instead of "filename.tmp.txt".
So it's working, and I really appreciate using phpdig! I am now trying to set up crons to schedule the indexing process: I hope Windows will let me do that....
Thanks for your help, |
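For anyone who finds this thread later, the naming mismatch described in the post above can be sketched as follows. This is only an illustration in Python, not PhpDig's PHP code: both function names are hypothetical, and the Doc2txt behaviour is modelled purely on the description in this thread, not on Doc2txt's documentation.

```python
import ntpath  # Windows-style path handling, usable on any platform


def expected_by_phpdig(temp_copy: str, extension: str = ".txt") -> str:
    """File name the indexer looks for, per the post above: the temp copy's
    full name with the configured extension appended (hypothetical helper)."""
    return temp_copy + extension


def written_by_doc2txt(temp_copy: str, with_e_flag: bool) -> str:
    """File name the converter writes, as described in the post above
    (assumed behaviour, hypothetical helper)."""
    if with_e_flag:
        # With "-E txt": the extension is appended to the whole name.
        return temp_copy + ".txt"
    # Without "-E": the original extension is replaced instead.
    stem, _old_extension = ntpath.splitext(temp_copy)
    return stem + ".txt"


if __name__ == "__main__":
    temp_copy = "98183891.tmp"
    for flag in (False, True):
        produced = written_by_doc2txt(temp_copy, flag)
        wanted = expected_by_phpdig(temp_copy)
        print(f"-E given: {flag}  produced: {produced}  match: {produced == wanted}")
```

The two names only agree when the extension is appended to the temp file name, which is why omitting the flag leaves an unindexed text file behind.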