03-13-2004, 01:01 PM | #1 |
Green Mole
Join Date: Mar 2004
Posts: 5
|
Searching external domains/links
Hi! I'm brand shiny new to search engines and am not clear on how this engine searches external domains relative to 'my' domain.
Does it follow links from 'my' domain (say, www.mine.com) to external domains (say, www.outside.com)? That is, if I have a link to an external domain, will it follow that link and index those pages too? If so, can I set the depth it searches on those domains? Related to this, can I simply set it to index only www.mine.com and not follow external links? Can I give it a list of 20 domains and have it index only those domains? Phew... hope that is clear! Thanks. |
03-13-2004, 02:25 PM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
By default, PhpDig crawls links within a single site when indexing from the admin panel. When indexing from the shell, a list of URLs can be specified instead, one per line in a text file. To make the crawler follow links from a site to an external site, set PHPDIG_IN_DOMAIN to true in the config file and apply the code change in this thread.
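A minimal sketch of the shell-indexing route mentioned above. The file name urls.txt and the spider.php invocation are assumptions based on this thread, not verified against any particular release, so check your own distribution before running it:

```shell
# Build a plain-text URL list, one URL per line, for the shell indexer.
# (urls.txt is an arbitrary name chosen for this example.)
cat > urls.txt <<'EOF'
http://www.mine.com/
http://www.outside.com/
EOF

# From the PhpDig admin directory, the shell indexer can then be pointed
# at the list -- invocation assumed, adjust to your install:
# php -f spider.php urls.txt

# Show what the indexer would read:
cat urls.txt
```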
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
03-13-2004, 02:37 PM | #3 |
Green Mole
Join Date: Mar 2004
Posts: 5
|
Thanks for the quick reply
So if I set PHPDIG_IN_DOMAIN to true, can I then specify the depth it will dig those external links to? (If not, it would obviously get out of control!) Is that depth counted as part of the depth set in the main search, or does it start from scratch when it hits a new domain? |
03-14-2004, 03:55 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. If you (a) set PHPDIG_IN_DOMAIN to true in the config.php file and (b) set the else part of the phpdigCompareDomains function to true in the robot_functions.php file, then it is possible to wind up in a loop. To avoid this loop, use the files in the attached ZIP file below; they apply point (b) above and are for use with version 1.8.0.
As for search depth: with the attached files in place, the depth limit gets applied to each different (sub)domain found. In theory, you could therefore index site to linked site to linked site, and so on, with the specified search depth applied afresh to each different site.
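To make the per-domain depth behavior described above concrete, here is an illustrative sketch in Python (this is not PhpDig code; the function and parameter names are invented for the example). The depth counter restarts whenever the crawler crosses into a different (sub)domain, and an optional allow-list covers the "index only these 20 domains" case from the first post:

```python
# Illustrative model of per-domain depth limiting -- not PhpDig source.
from urllib.parse import urlparse

def crawl(start_url, links, depth_limit, allowed_domains=None):
    """Return the set of URLs visited.

    `links` maps a URL to the list of URLs it links to (a toy link graph).
    `allowed_domains`, if given, restricts crawling to those hosts.
    """
    visited = set()
    # Queue entries: (url, depth remaining *within the current domain*).
    queue = [(start_url, depth_limit)]
    while queue:
        url, depth = queue.pop(0)
        host = urlparse(url).netloc
        if url in visited or depth < 0:
            continue
        if allowed_domains is not None and host not in allowed_domains:
            continue
        visited.add(url)
        for nxt in links.get(url, []):
            nxt_host = urlparse(nxt).netloc
            # The depth counter restarts on a hop to a different host,
            # so each (sub)domain gets the full depth_limit to itself.
            nxt_depth = depth_limit if nxt_host != host else depth - 1
            queue.append((nxt, nxt_depth))
    return visited
```

With a depth limit of 1, a crawl starting at www.mine.com that links to www.outside.com indexes one level into each site; passing `allowed_domains={"www.mine.com"}` keeps the crawl from leaving the home domain entirely.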
Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post |
Limit of spidering external domains | Vadim | How-to Forum | 0 | 11-17-2006 10:53 AM |
spidering external links | websearch | How-to Forum | 1 | 01-11-2005 09:39 AM |
Wildcard for banned external links? | Slider | How-to Forum | 5 | 12-19-2004 09:07 AM |
Spider External links to a depth of 1 (1.8.3) | kenazo | How-to Forum | 0 | 10-20-2004 07:28 AM |
redirect to external domains | sf44 | Troubleshooting | 4 | 07-04-2004 12:56 AM |