PhpDig.net

02-07-2005, 08:01 AM   #1
WebSpider
Spider indexes cgi pages but not its links!?

Hi!

When I run the spider on a site, www.domain.com, that hosts several pages of the form www.domain.com/cgi-bin/whatever... or cgi-bin.domain.com/whatever..., I can't find those links in the database or in the search results. Yet when I check the most common keywords, cgi-bin.domain.com comes up in first place.

What's the deal?

How do I make it add the cgi-*.* links to the database for that particular domain?

Also, is there any difference between indexing http://www.domain.com and http://domain.com? Will I get duplicate pages in the db?

Thanks!
02-07-2005, 10:59 AM   #2
Charter
The links domain.com, www.domain.com, and sub.domain.com are considered different. Try setting PHPDIG_IN_DOMAIN to true in the config file.
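
To illustrate why these are treated as different: compared as plain hostnames, they are three distinct strings. The Python sketch below is not PhpDig code. The helper names `host_of`, `same_host`, and `in_domain` are hypothetical, and `in_domain` is only an assumption about the kind of suffix check a "same domain" option might perform.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def same_host(url_a, url_b):
    """Strict comparison: the hostnames must match exactly."""
    return host_of(url_a) == host_of(url_b)


def in_domain(url, domain):
    """Looser comparison (assumed): host is the domain or a subdomain of it."""
    host = host_of(url)
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


# Strict comparison treats these as different sites.
print(same_host("http://domain.com/", "http://www.domain.com/"))          # False
# The looser check accepts any host under domain.com.
print(in_domain("http://cgi-bin.domain.com/whatever", "domain.com"))      # True
```

Under the strict comparison, a spider started at www.domain.com would not follow links to cgi-bin.domain.com; under the looser one it would.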
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
02-08-2005, 03:18 PM   #3
WebSpider
But then, why do the search results not include any cgi.domain.com result, even though it gets indexed as a keyword? How do I remove those false keywords and rerun the spider so that it picks up cgi. as a URL and not as a keyword?
02-08-2005, 07:04 PM   #4
Charter
Stored keywords and indexed links are two different things. If the text cgi.domain.com appears in a page, it is stored as a keyword, regardless of whether the link cgi.domain.com is actually indexed. If you are using PhpDig v.1.8.7, and don't want the text cgi.domain.com stored as a keyword, then edit BANNED in the config file.
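
The distinction can be sketched in a few lines of Python. This is an illustration only, not how PhpDig is implemented; the `PageScan` class and `scan` function are hypothetical names. It collects `href` values and visible words separately, so a hostname that appears only as page text ends up among the words without ever becoming a link.

```python
from html.parser import HTMLParser


class PageScan(HTMLParser):
    """Collect link targets and visible words as two separate lists."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Only <a href="..."> contributes to the list of links.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Every piece of visible text contributes words, hostnames included.
        self.words.extend(data.lower().split())


def scan(html):
    parser = PageScan()
    parser.feed(html)
    parser.close()
    return parser.links, parser.words


links, words = scan(
    '<p>Visit cgi-bin.domain.com today</p><a href="/about.html">About</a>'
)
print(links)  # ['/about.html']
print(words)  # ['visit', 'cgi-bin.domain.com', 'today', 'about']
```

Here cgi-bin.domain.com is a word on the page but not a link target, which matches the symptom described in the first post.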
Copyright © 2001 - 2005, ThinkDing LLC. All Rights Reserved.