PhpDig.net

Old 09-18-2003, 12:59 AM   #1
Skop
Green Mole
 
Join Date: Sep 2003
Posts: 5
Indexing by command line interface

Hi,

I installed PhpDig 1.6.2 on a Linux machine and am now trying to index from the command line.

PHP Code:
/usr/bin/php4 -[path]/search/admin/spider.php forceall >> /tmp/phpdig.log 
Nothing happened! The phpdig.log file contains something like:

PHP Code:
848: old priority 0, new priority 18
and the indexing (reindexing of existing hosts) doesn't work.

Any ideas?

Thanks a lot.
JS
Old 09-18-2003, 08:19 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Here are some suggestions.


If PHP runs in CGI mode, perhaps try the following:
Code:
#!/usr/bin/php4 -f [path]/search/admin/spider.php forceall >> /tmp/phpdig.log
If not in CGI mode, and PHP can be run from anywhere, cd to the search dir and try the following:
Code:
php -f admin/spider.php forceall > phpdig.log
If this is the first time indexing, change forceall to http://www.domain.com
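
A side note on the redirections used above: `>` and `>>` capture only standard output, so anything written to standard error (for example a shell message when the php binary is not found) never reaches the log, and the exit status is lost too. A minimal sketch of a wrapper that records all three, assuming a POSIX shell; `run_logged` is a hypothetical helper name and not part of PhpDig:

```shell
# Hypothetical helper: run a command, append stdout AND stderr to a log file,
# then append the command's exit status and return it to the caller.
run_logged() {
    log="$1"
    shift
    "$@" >>"$log" 2>&1
    status=$?
    echo "exit status: $status" >>"$log"
    return "$status"
}
```

From the search dir, it might be called as `run_logged phpdig.log php -f admin/spider.php forceall`. An empty log with a non-zero exit status would point at PHP itself rather than at the spider.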


In the config file, change the following to one if updating before seven days have passed:
PHP Code:
define('LIMIT_DAYS',7); //default days before reindex a page 
To start over and index from scratch, do the following:
  1. empty all the PhpDig database tables
  2. delete all files that may be in the temp dir
  3. delete all files in the text_content dir except keepalive.txt
  4. run spider.php from a browser or command prompt
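
Step 3 above could be scripted roughly as follows. This is only a sketch: it assumes a `find` that supports `-maxdepth` (GNU and BSD find do), and `clean_dir_except_keepalive` is a hypothetical helper name, not something shipped with PhpDig. The directory to pass depends on your installation.

```shell
# Hypothetical helper: delete every regular file directly inside a directory,
# leaving keepalive.txt (and any subdirectories) untouched.
clean_dir_except_keepalive() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    find "$dir" -maxdepth 1 -type f ! -name 'keepalive.txt' -exec rm -f {} +
}
```

Run from the search dir, `clean_dir_except_keepalive text_content` would cover step 3. The same call on the temp dir covers step 2, since the keepalive.txt exclusion simply matches nothing there.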
Before running spider.php from the command prompt, in the config file, change the following to one if only one level is wanted:
PHP Code:
define('SPIDER_MAX_LIMIT',20); //max recurse levels in sipder
define('SPIDER_DEFAULT_LIMIT',3); //default value
define('RESPIDER_LIMIT',4); //recurse limit for update 
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 09-19-2003, 02:17 AM   #3
Skop
Green Mole
 
Join Date: Sep 2003
Posts: 5
Quote:
Originally posted by Charter

Code:
php -f admin/spider.php forceall > phpdig.log
If this is the first time indexing, change forceall to http://www.domain.com
Nothing, nothing happened. I took a look at the spider.php source, and I think the program hangs on line 80:

PHP Code:
print @exec('renice 18 '.getmypid()).$br;
I also tried emptying the tables etc. as you wrote, but the DB stays empty and spider.php doesn't work.

Thanks a lot.
Old 09-19-2003, 04:29 AM   #4
Rolandks
Purple Mole
 
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
Hmm,
the command line is somewhat difficult. It took me many attempts too before it worked.
I think this should be changed in one of the next versions to work better with all operating systems, because it is important that it works well when you want to index content sites frequently, e.g. daily, with cron jobs or Windows scheduled tasks.

Read this:
http://www.phpdig.net/showthread.php?s=&threadid=56

-Roland-
Old 09-19-2003, 06:20 AM   #5
Skop
Green Mole
 
Join Date: Sep 2003
Posts: 5
Quote:
[...]
Read this:
http://www.phpdig.net/showthread.php?s=&threadid=56

-Roland-
I read this, but unfortunately it doesn't help me. Now I'll try to hack the code a little... If you have other ideas, I'm here!

Thanks a lot
JS
Old 09-19-2003, 08:06 AM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. It looks like the renice command is working as 848: old priority 0, new priority 18 appears in the log file, but you could try commenting that line out. The renice command is for setting the priority of the spidering process.

Are there any files besides keepalive.txt in the text_content dir?
Old 09-19-2003, 08:37 AM   #7
Skop
Green Mole
 
Join Date: Sep 2003
Posts: 5
Quote:
Originally posted by Charter
[...]
The renice command is for setting the priority of the spidering process.

Are there any files besides keepalive.txt in the text_content dir?
I commented out this line but, as I wrote, nothing happened.

The text_content dir is empty (except for keepalive.txt [2 bytes]).

For now I have this workaround: I use lynx to call the function:


PHP Code:
lynx -dump -auth=yourlogin:yourpwd '[url]/pathtosearch/admin/update.php?site_id=XXX&exp=1' >/tmp/uotput 2>/tmp/erroroutput 
This works.

JS
Old 09-19-2003, 08:50 AM   #8
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Quote:
Originally posted by Skop
PHP Code:
lynx -dump -auth=yourlogin:yourpwd '[url]/pathtosearch/admin/update.php?site_id=XXX&exp=1' >/tmp/uotput 2>/tmp/erroroutput 
Great! Glad it's working. Interesting that lynx will work but php won't. Are you able to do the following from the command prompt?
Code:
php -f test.php
where test.php is the below:
PHP Code:
<?php
echo "test";
?>
Old 10-14-2003, 03:23 AM   #9
Skop
Green Mole
 
Join Date: Sep 2003
Posts: 5
Quote:
Originally posted by Charter
Code:
php -f test.php
Hi, sorry for the late answer. I tried what you suggested, and it works. I think the problem is in the spider.php file and how it gets its input from STDIN.
All times are GMT -8. The time now is 12:15 PM.