PhpDig.net

04-09-2004, 10:24 AM   #1
Charter
Head Mole
Join Date: May 2003
Posts: 2,539
README before posting

External Binaries Problem Checklist

This checklist covers most external-binary issues in PhpDig version 1.6.4+, but it is not exhaustive. If you are experiencing an external-binary problem, read through this checklist before posting.
  • If you receive a "call to undefined function: is_executable" error, or are using PHP < 5.0.0 on a Windows system, set define('USE_IS_EXECUTABLE_COMMAND','0'); in the config file.
  • Check that the directories leading to the external binary, and the external binary itself, are set to 755 permissions if applicable.
  • Check that the following directories are set to 777 permissions if applicable:
    - [PHPDIG_DIR]/text_content
    - [PHPDIG_DIR]/includes (can be set to 755 after connect.php is configured)
    - [PHPDIG_DIR]/admin/temp
  • If using PHP version 4.2.2/3, check this thread or upgrade your PHP.
  • If using pdftotext, for example, make sure define('PHPDIG_PDF_EXTENSION','.txt'); includes the period in the .txt extension.
  • If using pstotext, for example, make sure Ghostscript is installed correctly: version 3.33+ for PS files or version 3.51+ for PDF files.
  • Set the correct path, for example define('PHPDIG_PARSE_PDF','/path/to/pdftotext'); on *nix or define('PHPDIG_PARSE_PDF','C:\\path\\to\\pdftotext'); on Windows (the .exe extension may be needed on Windows).
  • If you are not sure of the path, run the external binary from the command line first and use the path that works there.
  • Use a path that does not include spaces, periods, or other 'special' characters.
  • Check that safe_mode is set to off and allow_url_fopen is set to on.
  • If an open_basedir restriction is in place, make sure the files are placed in a directory that the restriction allows.
  • If indexing from the command line, make sure register_argc_argv is on or check this thread.
  • If not sure about safe_mode, allow_url_fopen, open_basedir, or register_argc_argv, check your phpinfo page.
  • Set define('LIMIT_DAYS',0); to allow for immediate reindex or check this thread.
  • Contact the authors of the external binaries if you have trouble compiling and/or installing those programs.
  • Still having problems...

    Try the code below, modifying it for other binaries if necessary, then reindex and post the results in your own thread:

    First try the following and then reindex.

    In robot_functions.php, find the appropriate $command variable:
    PHP Code:
    // it can have _PDF or _MSWORD or _MSEXCEL depending on binary
    $command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2;
    Then change it to the following, which redirects the binary's error output to standard output so that any error is displayed upon reindex:
    PHP Code:
    // it can have _PDF or _MSWORD or _MSEXCEL depending on binary
    $command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2.' 2>&1';
    If that didn't help, then try the following and reindex.

    In spider.php, add the following echo statements:
    PHP Code:
    // sets $tempfile and $tempfilesize

    /*****/
    echo "<br><br>Is result test http an array: " . is_array($result_test_http) . "<br>";
    echo "What is result test http status: " . $result_test_http['status'] . "<br>";
    /*****/

    extract(phpdigTempFile($url_indexing,$result_test_http,$relative_script_path.'/admin/temp/'));
    In robot_functions.php, add the following echo statements:
    PHP Code:
    function phpdigTempFile($uri,$result_test,$prefix='temp/',$suffix1='1.tmp',$suffix2='2.tmp') {

    /*****/
    echo "<br>Is result test an array: " . is_array($result_test) . "<br>";
    echo "What is result test status: " . $result_test['status'] . "<br>";
    echo "Use is executable is set to: " . USE_IS_EXECUTABLE_COMMAND . "<br>";
    // in the next four lines change _PDF to either _MSWORD or _MSEXCEL for those binaries
    echo "Index the pdf is set to: " . PHPDIG_INDEX_PDF . "<br>";
    echo "Parse the pdf is set to: " . PHPDIG_PARSE_PDF . "<br>";
    echo "Does parse pdf exist: " . file_exists(PHPDIG_PARSE_PDF) . "<br>";
    echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
    /*****/

    // $temp_filename = md5(time()+getmypid()).$suffix;
    Also in robot_functions.php, add the following echo/print statements:
    PHP Code:
    exec($command,$result,$retval);

    /*****/
    echo "<br>Command is: " . $command . "<br>";
    echo "Result contains: ";
    print_r($result);
    echo "<br>Return value is: " . $retval . "<br><br>";
    /*****/

    unlink($tempfile2);
    Remember to remove any "word" wrapping in the above code.
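
For the settings and binary checks in the list above, a small standalone script can collect everything in one pass. The following is only a sketch and is not part of PhpDig: the function names (checklist_ini_report, checklist_binary_report, checklist_run) are made up for illustration, and '/path/to/pdftotext' is the placeholder path from the checklist, used when PHPDIG_PARSE_PDF is not defined. It relies only on standard PHP functions (ini_get, file_exists, is_executable, exec).

```php
<?php
// Hypothetical helpers for illustration only; none of these names exist in PhpDig.

// Collect the raw values of the PHP settings named in the checklist.
// ini_get() returns false when a setting does not exist in this PHP version.
function checklist_ini_report()
{
    $names = array('safe_mode', 'allow_url_fopen', 'open_basedir', 'register_argc_argv');
    $report = array();
    foreach ($names as $name) {
        $report[$name] = ini_get($name);
    }
    return $report;
}

// Report whether the given binary path exists and is executable.
function checklist_binary_report($path)
{
    return array(
        'path'       => $path,
        'exists'     => file_exists($path),
        'executable' => is_executable($path),
    );
}

// Run a command with error output redirected to standard output (2>&1),
// so that error messages from the binary end up in the captured lines.
function checklist_run($command)
{
    $output = array();
    $retval = null;
    exec($command . ' 2>&1', $output, $retval);
    return array('output' => $output, 'retval' => $retval);
}

// Assumption: fall back to the checklist's placeholder path when the
// PhpDig config (which defines PHPDIG_PARSE_PDF) has not been loaded.
$binary = defined('PHPDIG_PARSE_PDF') ? PHPDIG_PARSE_PDF : '/path/to/pdftotext';

echo "PHP settings:\n";
var_export(checklist_ini_report());
echo "\nBinary check:\n";
var_export(checklist_binary_report($binary));
echo "\n";
```

var_export is used here instead of plain echo because echoing a boolean false prints an empty string, which is easy to misread in the output. checklist_run() is not called automatically; pass it the same command string that robot_functions.php builds if you want to see the binary's output and return value outside of a reindex.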
Copyright © 2001 - 2005, ThinkDing LLC. All Rights Reserved.