PhpDig.net


N100101 06-18-2004 09:40 AM

Indexing PDFs doesn't really work
 
OS: Linux
PHP Version 4.3.2


**********************************
Spidering in progress...

SITE : http://localhost/
Exclude paths :
- @NONE@

Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: /usr/local/bin/pdftotext
Does parse pdf exist: 1
Is parse pdf executable: 1

Command is: /usr/local/bin/pdftotext ../admin/temp/7672.tmp
Result contains: Array ( )
Return value is: 3

1:http://localhost/pub/info/info_st.pdf
(time : 00:00:05)
No link in temporary table

links found : 1
http://localhost/pub/info/info_st.pdf
Optimizing tables...
Indexing complete
**********************************

Indexing via terminal works without any problems.

Any hints?

Thanks in advance.

Charter 06-18-2004 12:46 PM

Hi. In robot_functions.php try changing:
PHP Code:

$command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2;

to the following:
PHP Code:

$command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2.' 2>&1';

And see if it will echo the problem.
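
As a rough illustration of why the change helps: PHP's exec() only collects what a command writes to stdout, so an error printed on stderr leaves the result array empty. Appending 2>&1 redirects stderr to stdout so the message shows up in the result. The sketch below is not PhpDig code; run_and_capture() is a hypothetical helper, and the sh -c command is only a stand-in for a failing pdftotext call on a Unix-like system.

```php
<?php
// Hypothetical helper (not part of PhpDig): run a shell command and
// return its exit status together with every line it printed.
function run_and_capture($command)
{
    $output = array();
    $status = 0;
    // Appending 2>&1 redirects stderr to stdout so exec() can collect it.
    exec($command . ' 2>&1', $output, $status);
    return array('status' => $status, 'output' => $output);
}

// Stand-in for a failing pdftotext call: write to stderr, exit with 3.
$result = run_and_capture("sh -c 'echo \"Error: something failed\" >&2; exit 3'");
print_r($result);
```

Without the redirection, the same command would still return status 3 but the output array would be empty, which matches the "Result contains: Array ( )" line in the first post.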

N100101 06-18-2004 02:32 PM

Here is the result:

Command is: /usr/local/bin/pdftotext ../admin/temp/5952.tmp 2>&1
Result contains: Array ( [0] => Error: Bad annotation action [1] => Error: Copying of text from this document is not allowed. )
Return value is: 3

Hm, what does this mean?

N100101 06-18-2004 03:59 PM

Argh, of course: that PDF is copy-protected, so its text cannot be extracted.

I have tested it with another PDF and it works!

Thanks a lot.
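
For reference, the return value of 3 in the log fits this diagnosis. To the best of my knowledge, the xpdf pdftotext man page lists exit codes 0, 1, 2, 3, and 99, with 3 meaning an error related to PDF permissions; check the man page of your installed version to be sure. A minimal sketch of turning the status into a readable message follows. describe_pdftotext_status() is a hypothetical helper, not part of PhpDig.

```php
<?php
// Hypothetical helper (not part of PhpDig): translate a pdftotext exit
// status into a readable message. The code table is taken from the xpdf
// pdftotext man page as I recall it; verify against your installed version.
function describe_pdftotext_status($status)
{
    $messages = array(
        0  => 'No error',
        1  => 'Error opening the PDF file',
        2  => 'Error opening the output file',
        3  => 'Error related to PDF permissions',
        99 => 'Other error',
    );
    // Fall back to a generic message for codes not in the table.
    return isset($messages[$status])
        ? $messages[$status]
        : 'Unknown exit status ' . $status;
}

echo describe_pdftotext_status(3), "\n";
```

Logging such a message next to "Return value is:" would point at a protected PDF even when stderr is not captured.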

