|
09-30-2004, 09:41 PM | #1 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Shell Spidering Quits After Indexing A Few Pages
I'm spidering from a shell command for the first time and having some problems. The process ran for around 5 minutes yesterday, and indexed about 50 pages, then said it was complete. I launched the shell command again, phpdig indexed a few more pages, then quit again. Same thing on a third try.
I have 1,500+ pages on my site, which indexes just fine when I run it from the secure web page, so why won't it do the same when I run it from a shell command? |
09-30-2004, 10:58 PM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
What are these set to in the config file?
PHP Code:
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
10-01-2004, 05:25 AM | #3 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Here's what I have:
Code:
define('SPIDER_MAX_LIMIT',40); //max recurse levels in spider
define('RESPIDER_LIMIT',40); //recurse respider limit for update
define('LINKS_MAX_LIMIT',30); //max links per each level
define('RELINKS_LIMIT',40); //recurse links limit for an update |
10-02-2004, 12:04 AM | #4 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Here's a little more information, for whatever it's worth. I've just moved my website to a new host, and am trying to rebuild my phpdig search engine from scratch. The same performance issues happen when I run phpdig from a secured web page as when I run it via shell. Any idea what the problem could be?
|
10-02-2004, 12:36 AM | #5 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Does it actually complete the index, or does it just stop after five minutes?
|
10-02-2004, 06:25 AM | #6 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
It says indexing is complete, then stops.
I noticed when I ran phpdig one more time last night, the process ran to completion. Very strange. I sense this has been some sort of timeout issue. I wrote my web host to confirm, and will let you know if that was the problem after all. |
10-02-2004, 06:50 PM | #7 | |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Well, here is my web host's reply:
Quote:
Any suggestions as to where I should go from here with this? |
|
10-03-2004, 08:06 AM | #8 |
Orange Mole
Join Date: Oct 2003
Location: NC, USA
Posts: 34
|
I don’t know if this will help but:
I have had the spidering stop when testing from my test server but never from my production server. The 2 servers are almost identical except the test server is in my house on a DSL line and the production server is co-located about 100 feet from where Level 3 comes into Charlotte, NC. The test server has at most 2 websites that have VERY little traffic and no e-mail running. The production server has over 100 websites with lots of e-mail and traffic (but the server load is light). It looks to me like my problem is related to the slower internet connection, not the server. I would expect a server that is overloaded (or at least has a heavy load) could have the same problem.
__________________
Wayne Mcbryde http://LakeNormansWeb.com We search all of Lake Norman! |
10-03-2004, 09:18 AM | #9 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
I have no idea what kind of server load there might be, don't know how one measures that. I've just moved my site to a new hosting company, and server response time in terms of page loads seems to be pretty quick. Don't know if that would be an indicator of server load necessarily.
I did discover one thing last night which might possibly be related to this issue. I hadn't been able to get my phpdig search page to work properly since the move. I'd enter a search term and click Go, but then I'd get the same page back. I went through my site to make sure there weren't any missing files of any kind, and after doing that, my search page worked properly. All this, too, when none of the files that I had to upload (or re-upload) had anything to do with phpdig searches. Fixing all the little stuff that was wrong seems to also have fixed spidering from a secured web page, but I still had the spidering process just hang 9:53 into spidering, around midnight last night. Go figure. I guess I'll work with it a little more and see how it goes. |
10-04-2004, 11:01 AM | #10 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
>> I have no idea what kind of server load there might be, don't know how one measures that.
From the shell prompt, type top or uptime and hit return. You should see the load average with three numbers showing the average load over the last 1, 5, and 15 minutes.
>> It says indexing is complete, then stops.
>> I still had the spidering process just hang...
Sometimes it hangs and sometimes it completes? Does anything unusual show in your raw access or error logs?
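For anyone who wants to record the load figures while a spider run is in progress, here is a minimal sketch that pulls the three numbers out of a line of uptime output. The function name load_averages is made up for illustration, and the sample line is only an example of the usual uptime format; the exact wording varies a little between systems.

```shell
#!/bin/sh
# load_averages is a hypothetical helper name, not part of any tool.
# It takes one line of `uptime` output and prints the 1, 5 and 15 minute
# load averages separated by spaces.
load_averages() {
    # Drop everything up to "load average:" (or "load averages:" on BSD/macOS),
    # then strip the commas between the numbers.
    printf '%s\n' "$1" | sed -e 's/.*load averages\{0,1\}: *//' -e 's/,//g'
}

# Example line in the usual uptime format (illustrative values).
sample='up 18 days, 38 minutes, 2 users, load average: 0.11, 0.31, 0.33'
load_averages "$sample"    # prints: 0.11 0.31 0.33
```

To check the live values, pass the real output instead: load_averages "$(uptime)". As a rough rule, numbers well below the machine's CPU count suggest the server is not overloaded.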
|
10-04-2004, 08:48 PM | #11 | ||
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Quote:
Quote:
This whole thing really bugs me, because I would eventually like to have this run as a cron job and dispense with the other two methods. However, I don't have a lot of confidence at this point that a cron job would do any different than shell. This is all so strange. Nothing unusual at all in the server log. My provider is at a loss to explain why it would just hang, too. I've been chewing up lots of bandwidth messing with this. Still have plenty to play with for now, but need to watch it. |
||
10-09-2004, 07:08 PM | #12 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Just a follow-up on this thread. I created a test phpdig database so I could mess with this a little more without clobbering production. Tried to populate the test database initially from shell, and phpdig wouldn't index any pages at all. My saved spider log was totally empty, too. I checked my server log, and nothing shows up for phpdig at all there.
I went ahead and populated my test database using the secure web page, and although it took about 3.5 hours to spider, it still indexed everything I would have expected. So why doesn't it work from shell?
Just now, I tried to update the index via shell, and the same thing happened that did initially - nothing indexed, empty spider log, nothing in the server log.
When I type uptime from the shell, here's what I get:
up 18 days, 38 minutes, 2 users, load average: 0.11, 0.31, 0.33
Any suggestions as to where I go from here? I've hit a brick wall... |
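When a shell run leaves an empty spider log and nothing in the server log, one generic way to see whether the command printed an error or exited early is to capture both output streams and the exit status yourself. The sketch below is an assumption-laden illustration, not phpdig's own tooling: run_logged is a made-up helper, and the sh -c line is a stand-in for whatever spider command is actually being run.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND, appends its stdout and stderr to LOGFILE,
# then appends the exit status so an early or silent failure is visible.
run_logged() {
    log=$1
    shift
    "$@" >>"$log" 2>&1
    status=$?
    printf 'exit status: %s\n' "$status" >>"$log"
    return "$status"
}

# Stand-in command that writes to both streams and fails on purpose.
# Replace it with the real spider command line.
logfile=$(mktemp)
run_logged "$logfile" sh -c 'echo "to stdout"; echo "to stderr" >&2; exit 3' \
    || echo "command failed, see $logfile"
cat "$logfile"
```

A non-zero exit status or an error message in the captured file narrows things down faster than the web server log, which a command-line run normally does not write to.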