12-08-2004, 04:12 AM | #1 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
catdoc problem with WinXP
Hi all
I am using phpdig 1.8.4 on WinXP (Windows NT SERVER 5.1 build 2600) with EasyPHP 1.7 (PHP Version 4.3.3). I am trying to index .doc files (to start with) with the spider, but so far no luck...
When I run catdoc on the command line, I get this:
---
catdoc ./test.doc
Banane Fruit Abricot
---
Those are the words in my .doc file, so I guess catdoc.exe is working. But when I try to index the file using phpdig, here is what I get:
---
SITE : http://server/
Excluded paths : - @NONE@
1:http://server/moteur/catdoc/test.doc (time : 00:00:07)
No links in the temporary table
links found : 1
http://server/moteur/catdoc/test.doc
Optimizing tables...
Indexing complete!
---
It looks like it is not indexing that file. Here is my config file:
PHP Code:
PHP INFO: Safe_mode OFF, allow_url_fopen ON
---
robot_functions.php:
PHP Code:
Thanks for your help... |
12-08-2004, 11:58 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Post the info that gets printed from this thread.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
12-08-2004, 12:17 PM | #3 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
Thanks for replying, Charter...
I have already tried all the code changes, and here is what I get now when trying to index a PDF file:
---
SITE : http://10.1.0.181/
Excluded paths : - @NONE@
Is result test http an array: 1
What is result test http status: PDF
Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: d:\serveur\www\moteur\xpdf\pdftotext.exe
Does parse pdf exist: 1
---
And it stops there; nothing happens after that line. But when I run pdftotext from the command line, it works and I get the correct .txt file.
Last edited by xperienss; 12-08-2004 at 12:25 PM. |
12-08-2004, 12:27 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Maybe one of the following links might help?
http://www.phpdig.net/forum/showthread.php?t=1407
http://www.phpdig.net/forum/showthread.php?t=534
|
12-08-2004, 11:15 PM | #5 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
Hi again,
I have been to http://www.phpdig.net/forum/showthread.php?t=1407 and made the same change, but still no luck... @Charter: is there any way for me to contact mleray via the forum? She has exactly the same config as mine (EasyPHP 1.7, WinXP), and it looks like she found the solution. I could reply to her post, but her last visit was in October 2004 (two months ago)... |
12-09-2004, 03:35 AM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
When you use the following, what does it print out?
PHP Code:
* Just a general comment, not directed to anyone in particular: This bump is the exception, not the rule, so don't expect me to bump old threads even if asked. Thanks.
|
12-10-2004, 03:58 AM | #7 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
Here is where I stand for now:
PDF files are being indexed, but Word and Excel (.doc and .xls) files still are not.
For those waiting for an answer, my config is:
WinXP SP2, EasyPHP 1.7 (PHP 4.3.3)
EasyPHP is installed in 'd:\serveur'
Phpdig is installed in 'd:\serveur\www\moteur'
My config file for phpdig:
PHP Code:
I am using Xpdf/pdftotext, available here: ftp://ftp.foolabs.com/pub/xpdf/ -- (http://www.foolabs.com/xpdf/download.html); get 'xpdf-3.00-win32.zip' (1.08 MB).
Warning: it cannot index PDF files that are password protected!
AND: shut down ALL firewalls on your machine before indexing.
As soon as I have the answer for .doc and .xls files, I will post it. I hope this helps.
Xperienss
Last edited by xperienss; 12-10-2004 at 04:32 AM. |
12-10-2004, 01:29 PM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Change:
PHP Code:
PHP Code:
|
12-11-2004, 05:35 AM | #9 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
I tried that already, with no change:
(checked) 431:http://xxx/budgetTresorerie.pdf (time : 00:58:01)
(not checked) 432:http://xxx/budgetTreso.doc (time : 00:58:07)
It is still not indexing .doc and .xls files, but I am not giving up and I will find the solution sooner or later. |
12-12-2004, 02:08 AM | #10 |
Green Mole
Join Date: Dec 2004
Location: Geneva Switzerland
Posts: 8
|
Okay, I see what's wrong now.
When I try to index a .doc file, catdoc.exe seems to see the file but doesn't create the output file and store it in the right directory. The same happens when I run catdoc in an MS-DOS window: catdoc reads the text from the .doc file and I can see it in the window, but no file is created. Has anyone got any idea which command we need to use?
catdoc manual: http://www.45.free.net/~vitus/ice/ca...atdoc.man.html
I tried:
-------
catdoc -s 8859-1 -f ascii ../../test/test.doc
Test Fichier Word
--------
It reads the text from the .doc file but doesn't create any file.
Last edited by xperienss; 12-12-2004 at 02:37 AM. |
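A note on the behaviour described in this last post: catdoc prints the extracted text to standard output instead of creating a file, so seeing the words in the console with no file on disk is what it normally does. Whoever calls it has to capture that output, for example with a shell redirect such as `catdoc test.doc > test.txt`. Below is a minimal sketch of the same idea in Python. It is not phpdig's code: the helper name `capture_to_file` is hypothetical, and a child Python process stands in for catdoc so the sketch runs on a machine where catdoc is not installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def capture_to_file(command, out_path):
    """Run `command` and save whatever it prints on stdout to `out_path`.

    Hypothetical helper for illustration; returns the number of bytes saved.
    """
    result = subprocess.run(command, capture_output=True, check=True)
    Path(out_path).write_bytes(result.stdout)
    return len(result.stdout)


# Stand-in for catdoc (assumption): a child process that prints text to
# stdout, the way catdoc prints the words it extracts from a .doc file.
# With catdoc on the PATH, the command from the post would be roughly
# ["catdoc", "-s", "8859-1", "-f", "ascii", "../../test/test.doc"].
stand_in = [sys.executable, "-c", "print('Test Fichier Word')"]

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "test.txt"
    written = capture_to_file(stand_in, out_file)
    text = out_file.read_text().strip()

print(text)
```

If the indexer is expected to read the converter's output directly rather than look for a file, the same capture step applies; only the place where the text ends up changes.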