|
07-08-2004, 06:19 PM | #1 |
Purple Mole
Join Date: Dec 2003
Posts: 106
|
Spidering sub-directories as the root
I'm interested in getting the spider function, not just the search function, to treat subdirectories of URLs as the root.
For example, someone might want to spider http://www.geocities.com/website as its own site, without scanning the true root (www.geocities.com). So far I have changed this bit of code in robot_functions.php: PHP Code:
__________________
Foundmyself.com artist community, art galleries |
07-10-2004, 01:01 PM | #2 |
Green Mole
Join Date: Jul 2004
Location: Illnau, Switzerland, Europe
Posts: 9
|
Hello bloodjelly,
I had the same problem, and I solved it by adding this code: PHP Code:
Last edited by caco3; 07-10-2004 at 01:14 PM. |
07-12-2004, 12:39 PM | #3 |
Purple Mole
Join Date: Dec 2003
Posts: 106
|
Thanks for the help, caco, but what I need is a mod that adds links to the database exactly as entered, either with a subdirectory or not. In other words, if I wanted to spider "http://www.mysite.com/directory" as a root, I could do it, and if I wanted to spider "http://www.mysite.com" as a root I could do that too.
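To make the request concrete, here is a rough sketch of the scope check being asked for, written in Python rather than phpDig's PHP. It is not phpDig code: `base_dir` and `in_scope` are made-up names, and guessing that a last path segment containing a dot is a file name is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def base_dir(path: str) -> str:
    """Reduce a URL path to the directory it names, always ending in '/'."""
    if path.endswith("/"):
        return path
    last = path.rsplit("/", 1)[-1]
    if "." in last:
        # Assumption: a dotted last segment (main.html) is a file, so drop it.
        return path[: len(path) - len(last)] or "/"
    return path + "/"


def in_scope(start_url: str, candidate_url: str) -> bool:
    """True if candidate_url is on the same host and under start_url's directory."""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    if start.netloc.lower() != cand.netloc.lower():
        return False
    base = base_dir(start.path)
    cand_path = cand.path or "/"
    # Accept the directory itself whether or not it has a trailing slash.
    return cand_path.startswith(base) or cand_path + "/" == base
```

With "http://www.mysite.com/directory" as the start URL, only links under /directory/ pass the check; with "http://www.mysite.com" as the start URL, every link on that host passes.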
07-12-2004, 05:27 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Perhaps upgrade to PhpDig version 1.8.2...
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
07-12-2004, 05:39 PM | #5 |
Purple Mole
Join Date: Dec 2003
Posts: 106
|
You are awesome.
07-14-2004, 08:04 PM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
FYI: version 1.8.3 has been released. Among other changes, it makes the 'limit to directory' option consistent across the other control panel options.
07-15-2004, 06:54 PM | #7 |
Purple Mole
Join Date: Dec 2003
Posts: 106
|
Hi charter -
I'm not sure if I'm using the limit to directory feature correctly (I have it set to "true"). When I enter a website (www.geocities.com/psychology_x/main.html, for example) it spiders correctly, but the listing in the "sites" table is only for geocities. Is there a way to have each separate directory treated as its own site? Or am I missing something? Thanks.
07-15-2004, 08:05 PM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. The issue is that foo.com/bar/ is not a separate domain from foo.com/ but rather a subdirectory of that domain. Spidering can now be limited to subdirectories, but the domain is still the domain. The bar.foo.com/ subdomain, on the other hand, can point to the foo.com/bar/ subdirectory, but it is a third-level domain and can also be treated as a separate site on a separate server. The database storage scheme is domain based, which is why subdomains are stored separately but subdirectories are not.
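A minimal sketch of what a domain-based scheme implies, in Python rather than phpDig's PHP. It only illustrates the idea and is not phpDig's actual storage code; `site_key` and `group_by_site` are hypothetical names.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Return the host part of a URL, which acts as the 'site' identity."""
    if "://" not in url:
        # Assumption: bare entries like foo.com/bar/ are meant as http URLs.
        url = "http://" + url
    return urlsplit(url).netloc.lower()


def group_by_site(urls):
    """Group URLs under their host, the way a domain-keyed table would."""
    sites = defaultdict(list)
    for url in urls:
        sites[site_key(url)].append(url)
    return dict(sites)
```

Here foo.com/ and foo.com/bar/ end up under the same key, while bar.foo.com/ gets a key of its own, which matches the subdirectory versus subdomain behavior described above.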
07-15-2004, 08:12 PM | #9 |
Purple Mole
Join Date: Dec 2003
Posts: 106
|
Got it. Thanks for the explanation.