01-22-2004, 09:54 PM | #1
Orange Mole
Join Date: Dec 2003
Posts: 32
Can I close PuTTY during command-line indexing?
Can I close PuTTY during command-line indexing, or will it stop indexing? This is what I do with Perl scripts:
nohup perl nph-build.cgi --all > log.txt &
[the nohup command means it keeps running even if you hang up and come back later; '> log.txt' is where the output goes, and the trailing '&' puts the job in the background]

tail -n50 -f log.txt
[this line gets you back into the log to see what is going on in realtime]

Can I use the same commands in PHP? Do I need to? How does it work?
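(For comparison, here is the same pattern aimed at PhpDig's spider script, which the replies below get to; the admin directory location and the [option] placeholder follow those later posts, and log.txt is just an example filename:)
Code:
# Run the PhpDig indexer under nohup so it survives the SSH session:
cd admin
nohup php -f spider.php [option] > log.txt 2>&1 &
# After logging back in, tail the log to watch progress in real time:
tail -n50 -f log.txt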
__________________
Nosmada

01-22-2004, 10:31 PM | #2
Orange Mole
Join Date: Dec 2003
Posts: 32
I just closed PuTTY and am not sure if it is still running. I have 45,000 pages. If I run it continuously it will take almost 4 days.
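(For reference, one way to check whether that run is still alive, using standard Unix tools rather than anything PhpDig-specific; this assumes the indexer was started with something like 'php -f spider.php', as described later in this thread:)
Code:
# Look for a PHP process still running the spider; the [s] in the
# pattern keeps grep from matching its own command line:
ps aux | grep "[s]pider.php"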
What should I do? I guess I should adjust some of the settings below, but I don't quite understand what they do or what impact they will have on the search results:
PHP Code:
define('SPIDER_MAX_LIMIT',20);    // max recurse levels in spider
define('SPIDER_DEFAULT_LIMIT',3); // default value
define('RESPIDER_LIMIT',4);       // recurse limit for update
define('LIMIT_DAYS',7);           // default days before reindexing a page
__________________
Nosmada

01-23-2004, 12:05 PM | #3
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. When indexing via the admin panel, the $limit variable is set to the search depth selected in the drop-down box; the numbers in that box run from zero to SPIDER_MAX_LIMIT. When running from the shell, or when updating a site via the admin panel, $limit is set according to the following code:
PHP Code:
You can use shell commands to run PhpDig directly, but if you wish to issue shell commands from within PHP, check out this page for various methods.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.

01-23-2004, 12:18 PM | #4
Orange Mole
Join Date: Dec 2003
Posts: 32
Okay, I looked through that whole page and found the following shell command to make it run in the background:
Code:
php -q foobar.php > /dev/null 2>&1
When I log back in, where do I go to get the realtime output log, or do I have to add more code onto the end? What are those parameters at the end, like /dev/null and 2>&1?
__________________
Nosmada

01-23-2004, 12:46 PM | #5
Head Mole
Join Date: May 2003
Posts: 2,539
The '> /dev/null 2>&1' redirects STDOUT and STDERR to /dev/null, which simply discards them, so use '> /dev/null 2>&1 &' only if you want to background something and do not care about the output; nothing written to /dev/null is kept, so there are no results to look at afterwards. If you want to keep the output, you can try using 'php -f spider.php [option] > phpdiglog.txt &' from the admin directory and then check the phpdiglog.txt file in the admin directory. Available options are listed here.
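(To spell out those redirections, since they were asked about above: a minimal sketch, with phpdiglog.txt as an example filename:)
Code:
# '>' sends STDOUT (file descriptor 1) to a file; '2>&1' then points
# STDERR (descriptor 2) at the same place. Anything written to
# /dev/null is discarded, so this variant keeps no output at all:
php -f spider.php [option] > /dev/null 2>&1 &

# This variant keeps a log to read after logging back in; the added
# '2>&1' captures error messages in the same file:
php -f spider.php [option] > phpdiglog.txt 2>&1 &
tail -f phpdiglog.txt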
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.

01-23-2004, 12:56 PM | #6
Orange Mole
Join Date: Dec 2003
Posts: 32
...and then check the phpdiglog.txt file in the admin directory.
How do I call this from the command line when I log back in? Would it be something like 'tail -n50 -f phpdiglog.txt'? And thanks for explaining the depth parameters, but just one more suggestion from you would help. With 45,000 pages, how would you deal with this? Probably comment out some or all of the shared borders: header, footer, left side, right side? And maybe limit the depth? I know where to set the limit now from your explanation; I just don't know what value to use and what exactly that means in terms of what is gained and what is lost.
__________________
Nosmada

01-23-2004, 03:23 PM | #7
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Perhaps try using 'php -f spider.php [option] > phpdiglog.txt &' from the admin directory and then 'tail -f phpdiglog.txt' from the admin directory for output.
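(One caveat: depending on the shell, a job started with a plain '&' can still be killed by the hangup signal when the PuTTY session closes, so prefacing the command with nohup, as with the Perl script in the first post, is the safer pattern. A sketch of the whole session, with phpdiglog.txt as an example filename:)
Code:
cd admin
nohup php -f spider.php [option] > phpdiglog.txt 2>&1 &
# ...close PuTTY, log back in later, then:
cd admin
tail -f phpdiglog.txt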
As for what to set in the config file, that is up to you, but below is an example of search depth, assuming a simple five-page site with the following link structure:
Code:
pageA.html
-- pageA1.html
---- pageA11.html
-- pageA2.html
---- pageA21.html
With a search depth of zero, only pageA.html is indexed; a depth of one also follows the links to pageA1.html and pageA2.html; a depth of two reaches all five pages. A lower limit means a shorter crawl but fewer pages in the search results.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.

01-23-2004, 04:46 PM | #8
Orange Mole
Join Date: Dec 2003
Posts: 32
Thanks, Charter. I will give your command syntax a go and play around with the depth.
__________________
Nosmada