PhpDig.net

02-06-2005, 02:25 AM   #1
WebSpider
Green Mole
 
Join Date: Feb 2005
Posts: 16
How to index other pages but not farther from them?

I'll try to explain this as clearly as possible:

I am spidering www.domainA.com, which has links to www.domainX.com, www.domainY.com and www.domainZ.com.

How do I set up the digger to spider ALL links on domainA.com (domainX, domainY and domainZ), PLUS enter and spider each of those sites, but go no farther than the links found on them?

So:

www.domainA.com
|
|\- www.domainX.com: grab links to domain1, domain2 and domain3.com
|
|\- www.domainY.com: grab links to domain4.com
|
\-- www.domainZ.com: grab links to domain5 and domain6.com

In this figure, my DB would contain:

domainA, domainX, domainY, domainZ and domain1 through domain6, but nothing beyond domain1 through domain6.

Is it clear enough?
02-07-2005, 01:23 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Set PHPDIG_IN_DOMAIN to true in config.php, find the phpdigCompareDomains function in robot_functions.php and set the else part to true, and set a counter in spider.php so that the phpdigSpiderAddSite function, and related code, is executed a max of X times. Note that phpdigSpiderAddSite appears twice in the spider.php file.
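
The counter idea can be pictured outside PhpDig as a hop limit on a breadth-first crawl. What follows is only a minimal Python sketch, not PhpDig code: the `LINKS` table restates the figure from post #1 (plus one invented link, domain1 to domain7, to show where the crawl stops), and `crawl` and `max_hops` are hypothetical names.

```python
from collections import deque

# Hypothetical link graph restating the figure in post #1.
# The domain1 -> domain7 link is invented to show the cut-off.
LINKS = {
    "domainA": ["domainX", "domainY", "domainZ"],
    "domainX": ["domain1", "domain2", "domain3"],
    "domainY": ["domain4"],
    "domainZ": ["domain5", "domain6"],
    "domain1": ["domain7"],
}

def crawl(start, max_hops):
    """Return every site reachable within max_hops links of start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        site, hops = queue.popleft()
        if hops == max_hops:
            continue  # site is indexed, but its links are not followed
        for target in LINKS.get(site, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return seen

indexed = crawl("domainA", max_hops=2)
print(sorted(indexed))
```

With a limit of two hops the result holds domainA, domainX to domainZ and domain1 to domain6, while domain7 is left out. That is the "X" the counter would have to enforce in this example.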
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
02-07-2005, 01:57 AM   #3
WebSpider
Green Mole
 
Join Date: Feb 2005
Posts: 16
Thanks for the answer.

Do you have the line numbers for the two changes: "find the phpdigCompareDomains function in robot_functions.php and set the else part to true" and "set a counter in spider.php so that the phpdigSpiderAddSite function, and related code, is executed a max of X times"?

I'm still new to the script and I'm not a programmer; I'm just used to applying hacks to VB.

Also, "a max of X times": in my case, what would X be?

Thanks for your help.
02-07-2005, 03:39 AM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Setting a counter would require a modification to the code. If you set PHPDIG_IN_DOMAIN to true and set the else part of the phpdigCompareDomains function to true, PhpDig should crawl on/off-site links, assuming your resources can handle it. Maybe you would rather just list the URIs in the textbox in the admin panel for those sites you'd like to index?
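
If listing the URIs is the route taken, the list does not have to be typed by hand. As a rough sketch, assuming the directory pages are available as HTML, the off-site hosts could be collected with Python's standard library. `LinkCollector`, `external_sites` and the `SAMPLE` page are made-up names and data for illustration, and whether the admin textbox expects exactly one URI per line is an assumption to check against your PhpDig version.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

def external_sites(html, base_url):
    """Return the sorted root URLs of off-site hosts linked from html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    home = urlparse(base_url).netloc
    roots = set()
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme in ("http", "https") and parts.netloc != home:
            roots.add(f"{parts.scheme}://{parts.netloc}/")
    return sorted(roots)

# Made-up sample standing in for one directory page.
SAMPLE = """
<a href="/page2.html">next</a>
<a href="http://www.domainX.com/">X</a>
<a href="http://www.domainY.com/about.html">Y</a>
<a href="http://www.domainX.com/contact.html">X again</a>
"""

uris = external_sites(SAMPLE, "http://www.domainA.com/index.html")
print("\n".join(uris))
```

The on-site link is skipped and the two links to domainX collapse into a single entry, so each listed site appears once.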
02-07-2005, 03:43 AM   #5
WebSpider
Green Mole
 
Join Date: Feb 2005
Posts: 16
The thing is:

I own a directory: many pages, each full of links to other sites.

Since it's a thematic directory, I just can't add 4 thousand links to spider manually.

Instead, I'd prefer the spider to index my own directory PLUS each of the 4 thousand linked pages, but without following ANY of the links on those 4 thousand pages.

Could you provide (paid, if you want) step-by-step instructions to achieve that?
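
The rule being asked for can be written down as a small decision function, which may help whoever picks up the mod request. This is a hedged Python sketch of the policy only, not a PhpDig patch; `HOME_HOST` and `plan` are hypothetical names, and hosts are compared by exact match.

```python
from urllib.parse import urlparse

HOME_HOST = "www.domainA.com"  # hypothetical: the directory's own host

def plan(url, found_on):
    """Decide what to do with a link found on the page found_on.

    Returns a pair (index_it, follow_its_links).
    """
    on_home = urlparse(url).netloc == HOME_HOST
    found_on_home = urlparse(found_on).netloc == HOME_HOST
    if on_home:
        return (True, True)    # directory page: index it and keep crawling
    if found_on_home:
        return (True, False)   # listed site: index it, do not follow its links
    return (False, False)      # link seen on a listed site: ignore it

print(plan("http://www.domainX.com/", "http://www.domainA.com/cat/2.html"))
```

Directory pages are crawled in full, each listed page is indexed once, and anything reached only through a listed page is dropped.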
02-07-2005, 07:18 PM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Moved to Mod Requests...