PhpDig.net

Old 07-09-2004, 11:22 AM   #1
lolodev
Orange Mole
 
Join Date: Apr 2004
Location: Nancy (54)
Posts: 38
no msword to txt parsing

Hello,

(I have versions 1.8.1 and 1.8.0 on my site.)

I made a simple test page containing:

<a href="http://quito.citipo.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br>

When I index it, a temporary file is created in admin/temp/xxxx.tmp for this .doc, but it seems that PhpDig does not parse this file into text.

I don't know why.

Thanks
Old 07-09-2004, 12:15 PM   #2
lolodev
no msword indexing

Hello,

I continued my tests.

I put an echo at line 461 of the spider.php script.

The page I am indexing is test.php:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Sans titre</title>
</head>
<body>
<a href="http://quito.citipro.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br>
</body>
</html>


The result is:

SITE : http://quito.citipro.fr/
Exclude paths :
- @NONE@
Resource id #5**../admin/temp/81475511.tmp**245**15********
test.php**HTML**20040709211142**20040709211125**Array**
1:http://quito.citipro.fr/test.php
(time : 00:00:22)
+
level 1...
Resource id #5**0**0**15******modules/documents/rep2/**
DocUtil.doc**MSWORD**20040709211152**20040708082318****
2:http://quito.citipro.fr/modules/docu...p2/DocUtil.doc
(time : 00:00:32)

No link in temporary table

No temporary file is created for the MS Word document.

Thanks
Old 07-09-2004, 12:20 PM   #3
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. There is a checklist here to help with troubleshooting.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 07-10-2004, 02:23 PM   #4
lolodev
always catdoc

Hello,

Thank you for pointing me to that thread. I went through your checklist and everything on it checks out, but...

When I index my .doc, the response is:

Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/44148632.tmp
Result contains: Array ( )
Return value is: 127

but nothing is recorded in the database.

When I run catdoc from the command line on my Linux system, it handles my MS Word file fine.

What is happening?

Are there any French users on this forum?
Old 07-10-2004, 02:33 PM   #5
Charter
Hi. In robot_functions.php find:
PHP Code:
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2;
and replace with:
PHP Code:
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';
to see what issue occurs.
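
For anyone following along, here is a minimal shell sketch of why the `2>&1` suffix helps. When a command is captured, only its standard output is collected unless standard error is redirected too, so a failure message can vanish and leave an empty result with just an exit status. The path below is a made-up placeholder that should not exist on your machine, and `sample.tmp` is likewise only an illustration.

```shell
#!/bin/sh
# Hypothetical path used only for illustration; it should not exist.
missing=/nonexistent/path/to/catdoc

# Capture stdout only. The error message goes to stderr, which is
# discarded here to keep the demo quiet, so nothing is captured.
status_plain=0
out_plain=$("$missing" -s 8859-1 sample.tmp 2>/dev/null) || status_plain=$?

# Same command with 2>&1: stderr is merged into the captured output.
status_merged=0
out_merged=$("$missing" -s 8859-1 sample.tmp 2>&1) || status_merged=$?

echo "without 2>&1: status=$status_plain output='$out_plain'"
echo "with 2>&1:    status=$status_merged output='$out_merged'"
```

Both runs fail with the same status, but only the second one tells you why. The exact wording of the message depends on which shell is installed as `sh`.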
Old 07-10-2004, 02:44 PM   #6
lolodev
Hi (it is 23:44 in France).

Here is the response with the code modification:

Command is: /home/mutualiseweb/catdoc-0.93.3 -s 8859-1 ../admin/temp/38346732.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3: is a directory )
Return value is: 126

Strange: when I run /home/mutualiseweb/catdoc -s 8859-1 mymsword.doc from the command line, catdoc runs. But when I change

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3');

to

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc');

PhpDig does not recognize my MS Word file.
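
A side note on the two return values seen in this thread: the shell uses 126 when the path exists but cannot be executed (a directory, for instance) and 127 when nothing is found at that path. The sketch below reproduces both in a scratch directory. The directory layout and the helper name `exit_status_of` are invented for the demo and are not part of PhpDig or catdoc.

```shell
#!/bin/sh
# Print the exit status of running the given command, hiding its output.
exit_status_of() {
    "$@" >/dev/null 2>&1 && echo 0 || echo $?
}

# Scratch directory standing in for an unpacked source tree (made-up layout).
work=$(mktemp -d)
mkdir "$work/catdoc-dir"

# The path exists but is a directory: found, yet not executable.
status_dir=$(exit_status_of "$work/catdoc-dir")

# Nothing exists at this path at all.
status_missing=$(exit_status_of "$work/catdoc-dir/catdoc")

echo "directory: $status_dir, missing file: $status_missing"
rm -rf "$work"
```

So 126 suggests the configured path points at the wrong kind of thing or lacks execute permission, while 127 suggests the path itself is wrong.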
Old 07-10-2004, 02:47 PM   #7
Charter
Hi. Does this work?
PHP Code:
define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3/catdoc'); 
Old 07-10-2004, 02:49 PM   #8
lolodev
LOL, I tried this before your post.

No, it doesn't work.
Old 07-10-2004, 02:51 PM   #9
Charter
Hi. What does
PHP Code:
define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3/catdoc'); 
give you when you use
PHP Code:
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';
Old 07-10-2004, 02:53 PM   #10
lolodev
Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/39511712.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3/catdoc: No such file or directory )
Return value is: 127
Old 07-10-2004, 02:56 PM   #11
Charter
Hi. What does
PHP Code:
define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc'); 
give you when you use
PHP Code:
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';
Also, does catdoc have 755 permissions?
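
One way to check the path and the permissions in a single pass is sketched below. The function name `check_binary` and the scratch files are made up for the demo. The `-x` test answers for whichever user runs the script, so ideally it would be run as the user the web server runs as, which is an assumption about your setup.

```shell
#!/bin/sh
# Classify a path from the point of view of the current user.
check_binary() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ ! -x "$1" ]; then
        echo "not executable"
    else
        echo "ok"
    fi
}

# Demo on scratch files; these names are placeholders, not real PhpDig paths.
work=$(mktemp -d)
mkdir "$work/src"
printf '#!/bin/sh\necho hello\n' > "$work/src/catdoc"
chmod 644 "$work/src/catdoc"

check_binary "$work/catdoc"        # prints: missing
check_binary "$work/src"           # prints: directory
check_binary "$work/src/catdoc"    # prints: not executable
chmod 755 "$work/src/catdoc"
check_binary "$work/src/catdoc"    # prints: ok
rm -rf "$work"
```

Pointing `check_binary` at the value configured in PHPDIG_PARSE_MSWORD should give "ok" before PhpDig can be expected to run it.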
Old 07-10-2004, 03:02 PM   #12
lolodev
OK!!
It is all my fault.

My catdoc is under /home/mutualiseweb/catdoc-0.93.3/src/. MY GOD.

A small question about .pdf files: is it necessary to install GHOST?

Sorry :)
Old 07-10-2004, 03:03 PM   #13
lolodev
THANKS A LOT
Old 07-10-2004, 03:11 PM   #14
Charter
LOL, paths and permissions.

For PDFs perhaps try getting pdftotext already compiled. Directions are in this post.