10-01-2004, 03:30 AM | #1 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
problem with .pdf and .doc files
Hi,
As my English is not very good, I'm a little lost in this forum. I've seen many topics about problems with indexing PDFs, but I can't find a solution, though I'm sure it is somewhere on the forum.
My problem is that my PDF files seem to be indexed, but when I search for a keyword or the filename of one of them, I can't find it. I've looked in the database and never seen any PDF file (nor any .doc file, but .xls files seem to be OK).
I use PHP 4.3.3 and MySQL 4.0.15 on Windows XP. The PHPDig version is 1.8.3.
The site I'm trying to index is an intranet site, so I can't give you a link to look at.
PHP Code:
niveau 2...
4:http://10.37.1.240/dossier_presse/dp_2004_a.pdf (not checked) (temps : 00:01:22)
5:http://10.37.1.240/arrete_100903.pdf (not checked) (temps : 00:01:30)
6:http://10.37.1.240/Ressources-Humain...lephonique.htm (checked) (temps : 00:01:51)
+ + + + + +
And in the summary:
http://10.37.1.240/dossier_presse/dp_2004_a.pdf
10-01-2004, 06:05 AM | #2 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
I tried what is described in the readme topic, and this is what I get:
Is result test http an array: 1
What is result test http status: HTML
Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: C:/Stage_Manuella/moteur/PHPDIG_DIR/Ghostgum/pstotext
Does parse pdf exist: 1
Fatal error: Call to undefined function: is_executable() in c:\stage_manuella\moteur\phpdig_dir\phpdig-1.8.3\admin\robot_functions.php on line 963
10-01-2004, 06:55 AM | #3 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Set USE_IS_EXECUTABLE_COMMAND to zero in the config file.
PHP Code:
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
10-01-2004, 07:23 AM | #4 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
I've done it.
Quote:
Should the path to the executable include the file name (pstotxt3.exe) or not? Like this:
PHP Code:
PHP Code:
or something else? Should I use a relative or an absolute path?
Last edited by mleray; 10-01-2004 at 07:35 AM.
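On the relative-versus-absolute question: a relative path is resolved against the working directory of the process that launches the tool, so an absolute path ending in the executable's file name is usually the less ambiguous choice. Below is a small Python sketch, separate from PhpDig, of how a configured value could be sanity-checked. The name `describe_tool_path` and its arguments are hypothetical and are not part of PhpDig.

```python
import os


def describe_tool_path(configured, working_dir):
    """Report how a configured tool path would resolve (hypothetical helper).

    configured  -- the path as written in a config file
    working_dir -- the directory the calling process runs from (assumed known)
    """
    if os.path.isabs(configured):
        resolved = configured
    else:
        # Relative paths depend on where the calling process is started.
        resolved = os.path.join(working_dir, configured)
    resolved = os.path.normpath(resolved)
    return {
        "absolute": os.path.isabs(configured),
        "resolved": resolved,
        "exists": os.path.exists(resolved),
        # A directory exists but cannot be run; only a file can.
        "is_file": os.path.isfile(resolved),
    }
```

Note that a directory passes a plain existence check while `is_file` stays false, so a value that names only the folder holding the executable can look valid without being runnable.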
10-01-2004, 08:19 AM | #5 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
I tried pdftotext; it seems better, but still not perfect...
Is result test http an array: 1
What is result test http status: PDF
Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: C:\Stage_Manuella\moteur\PHPDIG_DIR\xpdf-3.00-win32\pdftotext.exe
Does parse pdf exist: 1
Command is: C:\Stage_Manuella\moteur\PHPDIG_DIR\xpdf-3.00-win32\pdftotext.exe ../admin/temp/95662532.tmp 2>&1
Result contains: Array ( [0] => Error: Copying of text from this document is not allowed. )
Return value is: 3
What does this error mean?
10-05-2004, 12:22 AM | #6 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
No more help?
Are there any French speakers here?
10-06-2004, 04:29 AM | #7 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
>> Result contains: Array ( [0] => Error: Copying of text from this document is not allowed. )
The issue is with the PDF, not PhpDig. The PDF permissions are set such that "copying of text from this document is not allowed."
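The "Return value is: 3" line in the output fits this diagnosis: the Xpdf pdftotext man page lists exit code 3 as an error related to PDF permissions. Here is a minimal Python sketch of that table, meant as a standalone aid outside PhpDig. The helper name `explain_pdftotext_exit` is made up for illustration.

```python
# Exit codes as listed in the Xpdf pdftotext man page.
PDFTOTEXT_EXIT_CODES = {
    0: "no error",
    1: "error opening the PDF file",
    2: "error opening the output file",
    3: "error related to PDF permissions",
    99: "other error",
}


def explain_pdftotext_exit(code):
    """Return a short description for a pdftotext exit code (hypothetical helper)."""
    return PDFTOTEXT_EXIT_CODES.get(code, "unknown exit code")
```

Calling `explain_pdftotext_exit(3)` gives the permissions message, which matches the "Copying of text from this document is not allowed" error text.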
10-08-2004, 12:59 AM | #8 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
It seems to be OK now. Thanks.
But now I have a new problem with catdoc and xls2csv:
Quote:
It's the same with catdoc.exe. If I try to launch the program from MS-DOS like this:
Quote:
10-12-2004, 07:57 AM | #9 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
Very important for catdoc & xls2csv!
I've found a solution to my problem with these external binaries.
I had installed PHP as a module with EasyPHP, but it needs to be installed in CGI mode, because otherwise the exec() function did not work (error: "Le système ne peut exécuter le programme demandé", that is, the system cannot execute the requested program). So I renamed robot_functions.php to robot_functions.cgi and spider.php to spider.cgi, and updated the links to these files wherever necessary, as you may have guessed.
And it works! Now I just have a small problem with the conversion of accented characters, as I'm French, but that's all.
Hope that helps. (You can also leave a post on developpez.com just in case; I'm often there.)
Manuella
10-13-2004, 02:14 AM | #10 |
Orange Mole
Join Date: Sep 2004
Location: Nantes (44) FRANCE
Posts: 31
|
To be precise:
I use PHP 4.3.3, MySQL 4.0.15 and Apache 1.3.27 on Windows XP, all installed with EasyPHP 1.7. My PHPDig version is 1.8.3.
12-09-2004, 03:27 AM | #11 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
bump for xperienss...
12-09-2004, 11:26 PM | #12 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
Ohhh, thanks a lot, Charter, for bumping this post.
----
This message is for Mleray. Apparently we have the same configuration (WinXP, EasyPHP 1.7, ...). So far I have managed to get PDF indexing working with Xpdf/pdftotext.exe v3. But as for catdoc and xls2csv, I still can't index the files. You said you had found the solution... so please help me if you can, because I've been struggling for a week, trying every possible configuration. Thanks in advance (if you get this message).
----
Well, as soon as I've got everything working, I'll post a topic with full instructions for installing phpdig/catdoc/xpdf-pdftotext on WinXP/EasyPHP 1.7... I'm sure this would help lots of people.
Xperienss