01-30-2004, 03:55 AM | #1 |
Green Mole
Join Date: Jan 2004
Posts: 10
Problem spidering sites from a .txt file with more than 20 addresses
I have trouble spidering sites from a text file once it contains more than a certain number of addresses.
When I run php -f spider.php sites.txt with 300 sites in sites.txt, I get "spidering in progress...". When I check, only 50 websites have been added as hosts and to the temp tables, but not as entries. When I spider with only 20 sites in sites.txt, it works fine. How can I put a large number of sites in a .txt file and have them all spidered?
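For reference, PhpDig's command-line spider expects the file to list one URL per line, each starting with http (the reply below checks exactly this). A minimal sketch of sites.txt, using placeholder addresses, followed by the command from this post:

http://www.example.com/
http://www.example.org/
http://www.example.net/

php -f spider.php sites.txt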
__________________
Best Regards, Joshua
01-30-2004, 11:05 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Do the URLs start with http and are they listed one per line in the text file? If so, how long do you wait before you check the tables?
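One quick way to verify both points is a minimal PHP sketch like the one below. The filename sites.txt comes from the post above; everything else is just illustration, so adjust as needed.

<?php
// Sketch: flag lines in sites.txt that do not start with "http".
// Assumes sites.txt is in the current directory, one URL per line.
$lines = file('sites.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $num => $url) {
    if (strpos(trim($url), 'http') !== 0) {
        echo 'Line ' . ($num + 1) . ' does not start with http: ' . $url . "\n";
    }
}
echo count($lines) . " non-empty lines checked.\n";
?>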
__________________
Responses are offered on a voluntary, as-time-permits basis, with no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email. Thank you for your understanding.
01-30-2004, 08:13 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Not sure of the exact time frame with 300 sites, but try giving it a couple of hours and check the tables intermittently; just don't stop PhpDig while checking.
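If you want to watch progress without stopping PhpDig, you can count rows in its tables from a separate script or a phpMyAdmin session. A rough sketch follows, assuming the default phpdig_ table prefix and the sites / tempspider / spider table names; take the real prefix, credentials, and database name from your config.php.

<?php
// Sketch: count rows in the PhpDig tables while the spider keeps running.
// Host, credentials, database name, and the phpdig_ prefix are assumptions;
// replace them with the values from your config.php.
$db = new mysqli('localhost', 'phpdig_user', 'phpdig_pass', 'phpdig');
if ($db->connect_error) {
    die('Connection failed: ' . $db->connect_error . "\n");
}
foreach (array('phpdig_sites', 'phpdig_tempspider', 'phpdig_spider') as $table) {
    $result = $db->query('SELECT COUNT(*) FROM ' . $table);
    if ($result) {
        $row = $result->fetch_row();
        echo $table . ': ' . $row[0] . " rows\n";
    }
}
$db->close();
?>

If the counts keep climbing between checks, the spider is still working through the list.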
__________________
Responses are offered on a voluntary, as-time-permits basis, with no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email. Thank you for your understanding.