PhpDig.net

Old 08-20-2004, 10:02 AM   #1
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Index MSWORD But No search result

Hi all,
I'm trying to index MS Word files, but when I search for the content of such a file I get no results.
My config file looks like this:
define('PHPDIG_INDEX_MSWORD',true);
define('PHPDIG_PARSE_MSWORD','c:\appserv\www\catdoc\catdoc');
define('PHPDIG_OPTION_MSWORD','');

define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','/usr/local/bin/pstotext');
define('PHPDIG_OPTION_PDF','-cork');

define('PHPDIG_INDEX_MSEXCEL',true);
define('PHPDIG_PARSE_MSEXCEL','c:\appserv\www\catdoc\xls2csv');
define('PHPDIG_OPTION_MSEXCEL','-s 8859-1');



//---------EXTERNAL TOOLS EXTENSIONS
// if external binary is not STDOUT or different extension is needed
// for example, use '.txt' if external binary writes to filename.txt
define('PHPDIG_MSWORD_EXTENSION','');
define('PHPDIG_PDF_EXTENSION','');
define('PHPDIG_MSEXCEL_EXTENSION','');
define('PHPDIG_MSPOWERPOINT_EXTENSION','');

I also added this line of code to robot_functions.php:
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';


When I run catdoc from the command line, it works and prints the text of my Word file:
c:\Appserv\www\catdoc\catdoc w.doc

I have also checked this information, but I still can't search my Word document files.
Any help, please?
Old 08-20-2004, 10:26 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Did you try it with .exe added on to catdoc?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 08-20-2004, 02:02 PM   #3
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Yes, and I got the same result.
Old 08-20-2004, 02:09 PM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Like this?
PHP Code:
define('PHPDIG_PARSE_MSWORD','C:\\appserv\\www\\catdoc\\catdoc.exe');
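
A note on the backslashes in these path constants: in a single-quoted PHP string only `\\` and `\'` are escape sequences, so a Windows path can be written with single or doubled backslashes and the resulting string is the same. Double-quoted strings are different, because sequences such as `\t` and `\n` are interpreted. The sketch below only illustrates that language rule; the `c:\temp\new` path is a made-up example and does not come from this thread.

```php
<?php
// Single-quoted: only \\ and \' are escapes, so both spellings give one string.
$single  = 'c:\appserv\www\catdoc\catdoc';
$escaped = 'c:\\appserv\\www\\catdoc\\catdoc';
var_dump($single === $escaped); // bool(true)

// Double-quoted: \t becomes a tab and \n a newline, which corrupts the path.
// 'c:\temp\new' is a hypothetical path used only to show the effect.
$literal = 'c:\temp\new';
$mangled = "c:\temp\new";
var_dump($literal === $mangled); // bool(false)
```

So the spelling of the backslashes is unlikely to be the problem here as long as the constants stay in single quotes.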
Old 08-20-2004, 02:12 PM   #5
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Thanks for your fast answers.

Yes, I tried this one and also 'c:\appserv\........'
Old 08-20-2004, 02:22 PM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Go back to this thread and add the code, and then reindex, and let me know what it says when it encounters the Word document.
Old 08-20-2004, 02:32 PM   #7
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Hi,
this is the output:
--------------------------------------------------------------------------------
SITE : http://localhost/
Exclude paths :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
1:http://localhost/test/
(time : 00:00:05)
+
level 1...


Is result test http an array: 1
What is result test http status: MSWORD

Is result test an array: 1
What is result test status: MSWORD
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
2:http://localhost/test/w.doc
(time : 00:00:15)

No link in temporary table

--------------------------------------------------------------------------------

links found : 2
http://localhost:10/test/
http://localhost:10/test/w.doc
Optimizing tables...
Indexing complete !
--------------------------------------------------------------------------------
[Back] to admin interface.
Old 08-20-2004, 02:42 PM   #8
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Set the following and do another reindex:
PHP Code:
define('PHPDIG_INDEX_PDF',false); 
Old 08-20-2004, 02:52 PM   #9
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Hi, I did, but I still can't search my Word document:
SITE : http://localhost/
Exclude paths :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
1:http://localhost/test/
(time : 00:00:05)
+
level 1...


Is result test http an array: 1
What is result test http status: MSWORD

Is result test an array: 1
What is result test status: MSWORD
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
2:http://localhost/test/w.doc
(time : 00:00:15)

No link in temporary table

--------------------------------------------------------------------------------

links found : 2
http://localhost:10/test/
http://localhost:10/test/w.doc
Optimizing tables...
Indexing complete !
Old 08-20-2004, 02:57 PM   #10
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Oh, you need to edit the code you added so that it is for Word documents, not for PDFs. For example...
PHP Code:
// it can have _PDF or _MSWORD or _MSEXCEL depending on binary
$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';
Old 08-20-2004, 03:03 PM   #11
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
I'm sorry to bother you.
I did, but nothing new :(
SITE : http://localhost/
Exclude paths :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
1:http://localhost/test/
(time : 00:00:05)
+
level 1...


Is result test http an array: 1
What is result test http status: MSWORD

Is result test an array: 1
What is result test status: MSWORD
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
2:http://localhost/test/w.doc
(time : 00:00:15)

No link in temporary table
Old 08-20-2004, 03:18 PM   #12
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
I mean throughout, including for these things...
PHP Code:
// in the next four lines change _PDF to either _MSWORD or _MSEXCEL for those binaries
echo "Index the pdf is set to: " . PHPDIG_INDEX_PDF . "<br>";
echo "Parse the pdf is set to: " . PHPDIG_PARSE_PDF . "<br>";
echo "Does parse pdf exist: " . file_exists(PHPDIG_PARSE_PDF) . "<br>";
echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
It's still using _PDF because "/usr/local/bin/pstotext" is getting printed.
Old 08-20-2004, 03:19 PM   #13
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
Hi, this is what I get now:

SITE : http://localhost/
Exclude paths :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
1:http://localhost/test/
(time : 00:00:05)
+
level 1...


Is result test http an array: 1
What is result test http status: MSWORD

Is result test an array: 1
What is result test status: MSWORD
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:

Command is: c:\appserv\www\catdoc\catdoc.exe -s 8859-1 ../admin/temp/75689462.tmp 2>&1
Result contains: Array ( [0] => The system cannot execute the specified program. )
Return value is: 1

2:http://localhost/test/w.doc
(time : 00:00:16)

No link in temporary table
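
For readers who cannot see the debugging code referred to above, the "Command is / Result contains / Return value is" lines in this output could be produced along the following lines. This is only a sketch built on PHP's `exec()`, not the actual code from the other thread: the `run_and_report()` helper is hypothetical, and the PHP binary's own `--version` call stands in for the catdoc command so that the example runs anywhere.

```php
<?php
// Hypothetical helper: run a command and collect what exec() reports.
function run_and_report(string $command): array
{
    $output = [];   // one array entry per line printed by the command
    $status = -1;   // exit code; by convention non-zero means failure
    exec($command, $output, $status);
    return ['command' => $command, 'output' => $output, 'status' => $status];
}

// Harmless stand-in command: ask the current PHP binary for its version.
// "2>&1" merges error output into the captured lines, as in the thread.
$report = run_and_report(escapeshellarg(PHP_BINARY) . ' --version 2>&1');

echo 'Command is: ' . $report['command'] . "\n";
echo 'Result contains: ' . print_r($report['output'], true) . "\n";
echo 'Return value is: ' . $report['status'] . "\n";
```

A return value of 1 together with a message such as "The system cannot execute the specified program." means that the binary itself failed to start, so the indexer received no text from the document.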
Old 08-20-2004, 03:25 PM   #14
wessam
Orange Mole
 
Join Date: Jul 2004
Posts: 30
After that I removed the .exe from the path and got:
SITE : http://localhost/
Exclude paths :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
1:http://localhost/test/
(time : 00:00:05)
+
level 1...


Is result test http an array: 1
What is result test http status: MSWORD

Is result test an array: 1
What is result test status: MSWORD
Use is executable is set to: 0
Index the pdf is set to:
Parse the pdf is set to: /usr/local/bin/pstotext
Does parse pdf exist:
Is parse pdf executable:
2:http://localhost/test/w.doc
(time : 00:00:15)

No link in temporary table

--------------------------------------------------------------------------------

links found : 2
http://localhost:10/test/
http://localhost:10/test/w.doc
Optimizing tables...
Indexing complete !
Old 08-20-2004, 03:31 PM   #15
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Why is "Parse the pdf is set to: /usr/local/bin/pstotext" still printing?

It should be the following code...
PHP Code:
// these four lines use _MSWORD; change it to _PDF or _MSEXCEL for those binaries
echo "Index the doc is set to: " . PHPDIG_INDEX_MSWORD . "<br>";
echo "Parse the doc is set to: " . PHPDIG_PARSE_MSWORD . "<br>";
echo "Does parse doc exist: " . file_exists(PHPDIG_PARSE_MSWORD) . "<br>";
echo "Is parse doc executable: " . is_executable(PHPDIG_PARSE_MSWORD) . "<br>";
Try that and also keep the following:
PHP Code:
define('PHPDIG_OPTION_MSWORD',''); // two single quotes, no space between 
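
The same four checks can be written once for any document type, which avoids leaving _PDF behind when copying the lines. The following is a minimal sketch, assuming PHP 7 or later; `phpdig_debug_binary()` is a hypothetical helper and not part of PhpDig. It uses `var_export()` because echoing a boolean `false` prints an empty string in PHP, which is why "Does parse pdf exist:" and "Is parse pdf executable:" appear blank in the output posted above.

```php
<?php
// Hypothetical helper, not part of PhpDig: builds the four checks for one type.
function phpdig_debug_binary(string $type): string
{
    $indexConst = 'PHPDIG_INDEX_' . $type;
    $parseConst = 'PHPDIG_PARSE_' . $type;
    if (!defined($indexConst) || !defined($parseConst)) {
        return "No constants defined for $type<br>\n";
    }
    $path = constant($parseConst);
    // var_export() renders booleans as true/false instead of "1" or "".
    return "Index $type is set to: " . var_export(constant($indexConst), true) . "<br>\n"
         . "Parse $type is set to: $path<br>\n"
         . "Does parse $type exist: " . var_export(file_exists($path), true) . "<br>\n"
         . "Is parse $type executable: " . var_export(is_executable($path), true) . "<br>\n";
}

// Example values copied from the config quoted in the first post.
define('PHPDIG_INDEX_MSWORD', true);
define('PHPDIG_PARSE_MSWORD', 'c:\appserv\www\catdoc\catdoc');

echo phpdig_debug_binary('MSWORD');
```

Calling it with 'MSEXCEL' or 'PDF' would report on those binaries instead, provided the matching constants are defined.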