|
03-10-2004, 09:42 AM | #1 | |
Green Mole
Join Date: Mar 2004
Location: San Francisco
Posts: 4
|
Exclude list?
First of all, PhpDig looks like an awesome product. I've been looking for a new search engine for ages, and I think I've found it!!
One question: in the docs for PhpDig 1.8.0, it says: Quote:
PHP Code:
/developers/community/forums/... -Antun |
|
03-10-2004, 10:29 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi antun, and welcome to PhpDig.net!
The BANNED constant is meant to stop the spider from following certain links in the pages it crawls. To prevent certain directories from being crawled altogether, place a robots.txt file in your web root. If a directory has already been crawled and you want to exclude it, just click the red circle "no way" symbol in the admin panel.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
03-10-2004, 11:11 AM | #3 |
Green Mole
Join Date: Mar 2004
Location: San Francisco
Posts: 4
|
Thanks, but if I use a robots.txt file to exclude certain directories, won't that prevent those dirs from being indexed by public search engines too (e.g. Google)?
I'm only trying to fine-tune our search. For example, I'd like to exclude our forums from all searches, and I'd like to remove our Developers area from all non-tech related searches. Should I be excluding different directories and running separate indexes, or should I be running one large index and (if possible?) excluding parts of the site at search time? -Antun |
03-10-2004, 11:32 AM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. A robots.txt file with the following should exclude the directories from PhpDig before indexing, and only from PhpDig, because the rules are addressed to its user-agent:
Code:
User-agent: PhpDig
Disallow: /developers/
Disallow: /developers/community/forums/
Disallow: /lps/
Disallow: /lps-2.0/docs/lzx-developers-guide/
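If you want to convince yourself that rules scoped to the PhpDig user-agent leave other crawlers alone, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only a sanity check with Python's parser; PhpDig's own robots.txt handling may match paths differently. The helper name `allowed`, the `Googlebot` agent string, and the sample paths are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, one directive per line.
ROBOTS_TXT = """\
User-agent: PhpDig
Disallow: /developers/
Disallow: /developers/community/forums/
Disallow: /lps/
Disallow: /lps-2.0/docs/lzx-developers-guide/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Hypothetical helper: may `agent` fetch `path` under these rules?"""
    return parser.can_fetch(agent, path)


# Sample paths are illustrative, not taken from the real site layout.
for agent in ("PhpDig", "Googlebot"):
    for path in ("/developers/index.html", "/products/"):
        print(agent, path, allowed(agent, path))
```

Because the file has no `User-agent: *` section, a crawler that does not match `PhpDig` falls through to the default, which is to allow everything.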
03-10-2004, 12:24 PM | #5 |
Green Mole
Join Date: Mar 2004
Location: San Francisco
Posts: 4
|
Got it! That will work for the "/developers/community/forums/", which I never want indexed.
However, in my case, I'd like to have separate configurations:
- The entire website (excluding /developers/).
- All of /developers/, but nothing in the rest of the site.
- Just /lps-2.0/docs/lzx-reference/, but nothing else.
I presume the best way would be to have each one as a separate website, right? You see, I want to give people an option as to what to search, most likely using a pull-down. (You can see what I mean here: http://www.laszlosystems.com/developers/). -Antun |
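To make the three configurations concrete, here is a rough sketch of how they partition the site by path prefix. This is plain Python for illustration only, not PhpDig configuration: the scope names, the `SCOPES` table, and the `scopes_for` helper are all hypothetical, and the forums exclusion under "developers" reflects the wish above that the forums never be indexed.

```python
# Hypothetical scope table: names and structure are illustrative,
# not PhpDig settings. Each scope lists path prefixes to include/exclude.
SCOPES = {
    "entire site": {
        "include": ("/",),
        "exclude": ("/developers/",),
    },
    "developers": {
        "include": ("/developers/",),
        "exclude": ("/developers/community/forums/",),
    },
    "lzx reference": {
        "include": ("/lps-2.0/docs/lzx-reference/",),
        "exclude": (),
    },
}


def scopes_for(path):
    """Return the names of the scopes whose index would contain `path`."""
    return [
        name
        for name, rule in SCOPES.items()
        # str.startswith accepts a tuple of prefixes; an empty tuple never matches.
        if path.startswith(rule["include"]) and not path.startswith(rule["exclude"])
    ]


# Sample paths below are made up for the example.
print(scopes_for("/developers/tutorials/"))
print(scopes_for("/developers/community/forums/index.php"))
```

Each scope would then correspond to one entry in the pull-down, whether it ends up as a separate indexed site or as a filter over one index.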
03-10-2004, 12:38 PM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Perhaps this thread might help.