|
01-08-2004, 08:11 AM | #1 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Does PHPDig ignore <base href...?
I have used the BASE HREF directive in my dynamic site so that pages that appear to be in subfolders (but actually aren't) can point to external images and CSS files correctly.
This is valid HTML and gives no trouble in any browser I have tested. However, PHPDig seems to ignore this setting. If a page that appears to be in a folder called 'news' links to index.html in the root, the link will read href="index.html" instead of href="../index.html". The base href tag tells the browser to calculate any relative URLs from the root rather than from the current folder (which in my case doesn't exist). As a result, PHPDig finds multiple copies of each page: it thinks that index.html lives under the 'news' folder and so spiders a complete duplicate of the whole site. Up till now I have been using exclusions to get round this, but that requires a lot of manual fiddling every time the site is changed. Is there a solution, or is it a bug in PHPDig? |
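To make the duplication concrete, here is a minimal sketch in Python (not PhpDig's PHP) using the standard library's urljoin. The example.com URLs are hypothetical stand-ins, since the thread does not give the real site.

```python
from urllib.parse import urljoin

# Hypothetical URLs for illustration only; the real site is not given in the thread.
page_url = "http://www.example.com/news/story.html"  # page that appears to be in 'news'
base_href = "http://www.example.com/"                # value of the page's <base href>
link = "index.html"                                  # relative href found on the page

# What a browser does: resolve the link against the declared base.
with_base = urljoin(base_href, link)

# What a spider that ignores <base> does: resolve against the page's own URL.
without_base = urljoin(page_url, link)

print(with_base)     # http://www.example.com/index.html
print(without_base)  # http://www.example.com/news/index.html
```

The second URL is the phantom copy: the dynamic site serves it, and every relative link on it is then resolved under 'news' again, so the spider walks a duplicate of the whole site.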
01-08-2004, 11:01 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. PhpDig looks for links that match the following regex and then processes those links via the phpdigRewriteUrl function.
PHP Code:
Code:
<HTML>
<HEAD>
<BASE HREF="http://www.domain.com/dir2/index1.html">
</HEAD>
<BODY>
<A HREF="index2.html">test</A>
</BODY>
</HTML>
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
01-08-2004, 11:06 AM | #3 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Wouldn't it be fairly simple to check the <head> for the existence of a BASE tag and prefix any relative URLs with that instead of the current path?
Is it worth posting this to the suggestions forum? |
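The idea in this post can be sketched as follows. This is Python rather than PhpDig's PHP, and it is not PhpDig's code: the class name BaseAwareLinkParser and the helper extract_links are made up for illustration. It assumes only the first BASE tag counts and that a page without one falls back to its own URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAwareLinkParser(HTMLParser):
    """Collect absolute link targets, honouring the first <base href> (illustrative only)."""

    def __init__(self, page_url):
        super().__init__()
        self.base = page_url   # fall back to the page's own URL
        self.base_seen = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <BASE HREF> matches too.
        href = dict(attrs).get("href")
        if not href:
            return
        if tag == "base" and not self.base_seen:
            # A relative base is itself resolved against the page URL.
            self.base = urljoin(self.base, href)
            self.base_seen = True
        elif tag == "a":
            self.links.append(urljoin(self.base, href))


def extract_links(page_url, html):
    """Return every <a href> in html as an absolute URL."""
    parser = BaseAwareLinkParser(page_url)
    parser.feed(html)
    return parser.links


# Hypothetical page URL; the markup mirrors the test page posted earlier in the thread.
html = ('<HTML><HEAD><BASE HREF="http://www.domain.com/dir2/index1.html"></HEAD>'
        '<BODY><A HREF="index2.html">test</A></BODY></HTML>')
print(extract_links("http://www.domain.com/dir1/page.html", html))
```

With the BASE tag present the link resolves under dir2; without it, the same link would resolve under dir1, the folder of the page itself.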
01-08-2004, 04:08 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. In robot_functions.php is a function called phpdigExplore.
In this function, replace the following: PHP Code:
PHP Code:
Code:
<HTML>
<HEAD>
<BASE HREF="http://www.domain.com/dir2/file.html">
</HEAD>
<BODY>
<A HREF="index2.html">test</A>
</BODY>
</HTML>
Code:
<BASE HREF="http://www.domain.com/file.html">
<BASE HREF="http://www.domain.com/dir2/dir3/file.html">
<A HREF="index2.html">test</A>
<!--- or the following tags --->
<BASE HREF="http://www.domain.com/dir2/file.html">
<A HREF="/index2.html">test</A>
01-08-2004, 08:01 PM | #5 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Fantastic! Thanks...
|
04-19-2004, 12:43 PM | #6 |
Green Mole
Join Date: Apr 2004
Location: Texas
Posts: 1
|
PHPDig rocks
I looked forever for an open source site search done in php and tried several without much success. PHPDig has worked well, but I was having similar problems to the one mentioned above. With Charter's modified statement, I seem to be complaint free.
Nice program and stellar support. Thanks to all! |