PhpDig.net

Old 11-28-2006, 02:24 AM   #1
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
spider.php blocked when indexing

Hello,
I've installed PhpDig v.1.8.8 with EasyPHP on Windows.

I would like to index PDF files.
I've added the three pieces of code described in "read me before..."

When I try to index PDF files, spider.php stops at this point:

SITE : http://192.168.1.28/
Excluded paths:
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: //.../phpdig/xpdf/pdftotext.exe
Does parse pdf exist: 1

Thanks for your help
Old 11-30-2006, 02:12 AM   #2
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
When I comment out this line: //echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";

I get this result:
SITE : http://192.168.1.28/
Excluded paths:
- @NONE@


Is result test http an array: 1
What is result test http status: PDF

Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: //.../phpdig/xpdf/pdftotext.exe
Does parse pdf exist: 1

Command is: //.../phpdig/xpdf/pdftotext.exe ../admin/temp/69288482.tmp 2>&1
Result contains: Array ( [0] => Error: Couldn't open file '../admin/temp/69288482.tmp' )
Return value is: 1

1:http://192.168.1.28/espace-dpi/directives/dir117.pdf
(time: 00:00:01)
No links in the temporary table

The temp directory does contain a tmp file, but it is named 69288481.tmp (1 KB), not 69288482.tmp.
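
The error above says pdftotext could not open '../admin/temp/69288482.tmp', a path relative to the directory the spider runs from. A minimal sketch for checking what PHP itself sees at such a path follows. The helper name describe_temp_file is hypothetical and not part of PhpDig; the path is the one from the output above.

```php
<?php
// Hypothetical diagnostic helper, not part of PhpDig.
// Reports how PHP resolves a (possibly relative) temp file path.
function describe_temp_file($path)
{
    return array(
        'cwd'      => getcwd(),           // directory relative paths are resolved from
        'given'    => $path,
        'absolute' => realpath($path),    // false when the path does not resolve
        'exists'   => file_exists($path),
        'readable' => is_readable($path),
    );
}

// Path taken from the spider output; adjust to the file actually present.
print_r(describe_temp_file('../admin/temp/69288482.tmp'));
```

If 'absolute' comes back false, the relative path does not resolve from the current working directory, which would be consistent with the "Couldn't open file" message.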
Old 12-02-2006, 06:20 AM   #3
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
What did you set in the config file for the following?
  • PHPDIG_INDEX_PDF
  • PHPDIG_PARSE_PDF
  • PHPDIG_OPTION_PDF
  • PHPDIG_PDF_EXTENSION
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 12-04-2006, 05:17 AM   #4
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','\\\\..\\..\\phpdig\\xpdf\\pdftotext.exe');
define('PHPDIG_OPTION_PDF','');
define('PHPDIG_PDF_EXTENSION','.txt');
Old 12-04-2006, 06:05 AM   #5
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Did "Is parse pdf executable" come out as zero or blank or one? If it was zero or blank, try setting the PHPDIG_PARSE_PDF constant in the config file to the full server path instead of using a relative path. Also if you are not using PHP5, set the USE_IS_EXECUTABLE_COMMAND constant in the config file to the number zero.
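
One possible reason the script stops at the "Is parse pdf executable" line: according to the PHP manual, is_executable() only became available on Windows with PHP 5, so calling it on a PHP 4 Windows install could end the script with a fatal error. This is an assumption about the cause, not something the thread confirms. A hedged sketch of a check that guards the call follows; parser_status is a hypothetical helper and the path is only a placeholder.

```php
<?php
// Hypothetical helper, not part of PhpDig.
function parser_status($path)
{
    $status = array(
        'exists'     => file_exists($path),
        'is_file'    => is_file($path),
        'executable' => null, // null means "could not be checked"
    );
    // Guard the call so a build without is_executable() does not abort.
    if (function_exists('is_executable')) {
        $status['executable'] = is_executable($path);
    }
    return $status;
}

// Placeholder path: substitute the full server path to pdftotext.exe.
var_dump(parser_status('C:/path/to/pdftotext.exe'));
```

A null in 'executable' would mean the function is missing in that PHP build, in which case setting USE_IS_EXECUTABLE_COMMAND to zero, as suggested above, avoids the call.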
Old 12-04-2006, 07:59 AM   #6
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
I'm not using PHP5, so I set define('USE_IS_EXECUTABLE_COMMAND','0'); and it comes out blank.
I get the output shown in my second post when I comment out this line: //echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
PHPDIG_PARSE_PDF is already a full server path (I'm not working on the server machine).
Old 12-04-2006, 04:39 PM   #7
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Try running pdftotext.exe dir117.pdf from the command prompt. Does it work? Are you able to index non-PDF files and HTML pages?
Old 12-06-2006, 12:05 AM   #8
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
pdftotext.exe runs fine from the DOS command line; it only fails when called from PHP. When I run a .bat file from PHP, a DOS window opens and closes, but no txt file is created...
Old 12-07-2006, 12:04 AM   #9
acti_dev
Awaiting Email
 
Join Date: Nov 2006
Posts: 6
It also fails for DOC and XLS files with catdoc or antiword. Only indexing of HTML pages works fine...
Old 12-09-2006, 06:29 AM   #10
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
If HTML pages are indexed, but not DOC, PDF, PPT, or XLS files, then it seems that EasyPHP might not be allowing the PHP exec function:
Code:
exec($command,$result,$retval);
I'm not familiar with EasyPHP, but perhaps the user comments on this page might help. Also, try to get the following script to run in EasyPHP:
Code:
<?php
echo exec('whoami');
?>
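
Alongside the whoami test, it may help to see whether PHP itself reports exec as unavailable. A minimal sketch, assuming the php.ini settings disable_functions and safe_mode are what would block it; exec_diagnostics and run_probe are hypothetical helpers, not part of PhpDig or EasyPHP, and 'echo exec-ok' is just a harmless probe command.

```php
<?php
// Hypothetical diagnostics, not part of PhpDig or EasyPHP.
function exec_diagnostics()
{
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    return array(
        'exec_defined'  => function_exists('exec'),
        'exec_disabled' => in_array('exec', $disabled),
        'safe_mode'     => (bool) ini_get('safe_mode'), // false where the setting no longer exists
    );
}

// Runs a command through exec only when the checks above allow it.
function run_probe($command)
{
    $output = array();
    $retval = -1; // stays -1 if exec is never called
    $diag = exec_diagnostics();
    if ($diag['exec_defined'] && !$diag['exec_disabled']) {
        exec($command . ' 2>&1', $output, $retval);
    }
    return array('output' => $output, 'retval' => $retval);
}

print_r(exec_diagnostics());
print_r(run_probe('echo exec-ok'));
```

A retval of 0 with exec-ok in the output would suggest exec itself works and the problem lies with the command or its paths instead.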