|
10-09-2003, 02:03 AM | #1 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Indexing META-Tags
Version 1.6.2 with PHP 4.3.2 is indexing the META tags "description", "DC.subject" and "keywords". Is it the same reason as the indexing of HTML comments?
I think this part doesn't work with PHP > 4.3.2 (robot_functions.php) Code:
//delete content of head, script, and style tags
$text = eregi_replace("<head[^<>]*>.*</head>"," ",$text);
$text = eregi_replace("<script[^>]*>.*</script>"," ",$text);
$text = eregi_replace("<style[^>]*>.*</style>"," ",$text);
// clean tags
$text = eregi_replace("(</?[a-z0-9 ]+>)",'\1 ',$text);
Code:
<head>
<!-- ID 566789 - generated by CMS -->
<!-- Global Meta Beginn, Template: meta.tpl -->
<META http-equiv="content-type" content="text/html;charset=ISO-8859-1">
<META HTTP-EQUIV="Content-Language" CONTENT="de">
<META NAME="description" CONTENT="Informationen, and your description here is indexing">
<META NAME="keywords" CONTENT="This, keywords, here, are, indexing, allin, PHPDIG ">
<META NAME="publisher" CONTENT="This is also indexing">
<META NAME="copyright" CONTENT="This is also indexing">
<META NAME="creation_Date" CONTENT="11/04/2003">
<META NAME="expires" CONTENT="never">
<META HTTP-EQUIV="Pragma" content="no-cache">
<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
<META NAME="revisit-after" CONTENT="7 days">
<META NAME="DC.format" content="text/html">
<META NAME="DC.Date" content="2003-11-04">
<META NAME="DC.contributor" CONTENT="Your Name">
<META NAME="DC.subject" CONTENT="This, keywords, here, are, indexing, allin, PHPDIG">
<META NAME="DC.description" CONTENT="This, keywords, here, are, indexing, in, PHPDIG">
<META NAME="DC.title" CONTENT="This, keywords, here, are, indexing, allin, PHPDIG">
<META NAME="DC.language" CONTENT="de">
<META NAME="DC.type" CONTENT="Information">
<!-- Global Meta End -->
<title>Title of Homepage</title>
<link href="stylesheet.css" rel="stylesheet" media="screen">
</head>
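One way to check the head filter in isolation is a small test script. This is only a minimal sketch: it assumes preg_replace with the i and s modifiers as a stand-in for eregi_replace (the ereg functions were removed in PHP 7), and the function name strip_head and the shortened sample page are hypothetical, not PhpDig code.

```php
<?php
// Hypothetical stand-in for the head filter quoted above; not PhpDig's own code.
// preg_replace with /i (case-insensitive) and /s (dot matches newlines)
// is assumed to behave like the eregi_replace call for this input.
function strip_head(string $text): string
{
    return preg_replace('#<head[^<>]*>.*</head>#is', ' ', $text);
}

// Shortened version of the sample page from this post.
$page = <<<HTML
<html>
<head>
<META NAME="description" CONTENT="Informationen, and your description here is indexing">
<META NAME="keywords" CONTENT="This, keywords, here, are, indexing, allin, PHPDIG ">
<title>Title of Homepage</title>
</head>
<body>Visible body text</body>
</html>
HTML;

$clean = strip_head($page);

// If the filter works, the meta content is gone and the body text is kept.
var_dump(strpos($clean, 'PHPDIG') === false);
var_dump(strpos($clean, 'Visible body text') !== false);
```

If both lines print bool(true), the regex itself removes the whole head, which would suggest that the meta text reaches the index through a different piece of code.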
__________________
-Roland- :: Test PhpDig 1.6.2 here :: - :: Test-Search for (little) Intelligent Php-Dig Fuzzy :: |
10-09-2003, 06:41 PM | #2 | |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Quote:
PHP Code:
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
|
10-10-2003, 01:00 AM | #3 |
Green Mole
Join Date: Oct 2003
Location: Püttlingen (Saar) - Germany
Posts: 8
|
Hi,
The "head" regex SHOULD filter everything between <head> and </head>. This includes every meta tag! But there is a problem if someone does not close </head> correctly. The problem with most of the available HTTP indexing search engines is that they assume every site uses perfect HTML markup, but this is not realistic :-(
Bernhard |
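To illustrate the missing </head> case: the sketch below shows the strict pattern leaving the text untouched, plus one possible fallback. The helper strip_head_lenient and the fallback to the <body> tag are assumptions for illustration, not something PhpDig does, and preg_replace is used in place of eregi_replace.

```php
<?php
// Hypothetical helper, not part of PhpDig: strip the head even when
// the closing </head> tag is missing by falling back to the <body> tag.
function strip_head_lenient(string $text): string
{
    // Normal case: a properly closed head element.
    $out = preg_replace('#<head[^<>]*>.*</head>#is', ' ', $text, -1, $count);
    if ($count > 0) {
        return $out;
    }
    // Fallback (an assumption): drop everything from <head> up to,
    // but not including, the opening <body> tag.
    return preg_replace('#<head[^<>]*>.*?(?=<body)#is', ' ', $text);
}

$broken = '<html><head><META NAME="keywords" CONTENT="secret, words">'
        . '<title>T</title><body>Hello</body></html>';

// The strict pattern finds no </head>, so it leaves the text unchanged.
$strict = preg_replace('#<head[^<>]*>.*</head>#is', ' ', $broken);
var_dump($strict === $broken);

// The lenient helper removes the meta content and keeps the body.
$lenient = strip_head_lenient($broken);
var_dump(strpos($lenient, 'secret') === false);
var_dump(strpos($lenient, 'Hello') !== false);
```

A page with neither </head> nor <body> would still pass through unchanged, so this only narrows the problem.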
10-10-2003, 04:24 AM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Thanks, missed that line, and I even quoted it.
10-10-2003, 07:21 AM | #5 | |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Quote:
The site with this HEAD was checked as valid HTML 4.01 at www.W3C.org? -Roland- |
|
10-10-2003, 08:05 AM | #6 |
Green Mole
Join Date: Oct 2003
Location: Püttlingen (Saar) - Germany
Posts: 8
|
Oh yes, your markup is correct. My eyes were shut when I wrote my reply, sorry!
But wrong markup is a general problem: every parser has this problem :-(
By the way: shouldn't the line
$text = eregi_replace("<head[^<>]*>.*</head>"," ",$text);
look like this (like every other line in your quote):
$text = eregi_replace("<head[^>]*>.*</head>"," ",$text);
Or why do we need this additional '<'?
Bernhard |
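A quick way to see where the two character classes differ is to run both against the same input. This is a rough sketch that again assumes preg_replace in place of eregi_replace; the variable names and sample strings are made up for the comparison.

```php
<?php
// Compare the two character classes discussed above.
// preg_replace stands in for eregi_replace here (an assumption).
$with_lt    = '#<head[^<>]*>.*</head>#is';  // as in the quoted code
$without_lt = '#<head[^>]*>.*</head>#is';   // as suggested in this post

// Well-formed markup: both patterns remove the same text.
$good = '<head profile="x"><title>T</title></head><body>B</body>';
var_dump(preg_replace($with_lt, ' ', $good) === preg_replace($without_lt, ' ', $good));

// Malformed markup: the <head tag is not closed before the next tag starts.
$bad = '<head<meta name="keywords" content="k"><title>T</title></head><body>B</body>';
var_dump(preg_replace($with_lt, ' ', $bad));     // no match, text unchanged
var_dump(preg_replace($without_lt, ' ', $bad));  // head removed
```

So the extra '<' only seems to matter for a malformed opening tag, where it stops the tag from swallowing the start of the next one. Whether that was the intent of the original code is a guess.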
10-23-2003, 01:54 AM | #7 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Okay, I have spent a little more time on this problem.
PhpDig is only indexing these two META tags:
<META NAME="description" CONTENT="Informationen, and your description here is indexing">
<META NAME="keywords" CONTENT="This, keywords, here, are, indexing, allin, PHPDIG ">
admin\robot_functions.php (755): PHP Code:
Last edited by Rolandks; 10-23-2003 at 01:57 AM. |
10-25-2003, 07:46 AM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. You should just be able to comment that piece of code out to remove the indexing of the two meta tags mentioned. As always, try it on a demo page first.
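As a rough illustration of what switching that step off amounts to, here is a sketch with a flag instead of commented-out code. It is not PhpDig's code: the function extract_meta_text, its $index_meta parameter, and the assumption that name comes before content with double-quoted values are all hypothetical.

```php
<?php
// Hypothetical sketch, not PhpDig's code: collect the content of the
// "description" and "keywords" meta tags only when the flag allows it.
function extract_meta_text(string $html, bool $index_meta = false): string
{
    if (!$index_meta) {
        return ''; // same effect as commenting the extraction out
    }
    // Assumes name comes before content and both use double quotes.
    preg_match_all(
        '#<meta\s+name="(?:description|keywords)"\s+content="([^"]*)"#i',
        $html,
        $matches
    );
    return implode(' ', $matches[1]);
}

$head = '<META NAME="description" CONTENT="Informationen">'
      . '<META NAME="keywords" CONTENT="This, keywords">';

var_dump(extract_meta_text($head));        // flag off: nothing to index
var_dump(extract_meta_text($head, true));  // flag on: both contents joined
```

A flag like this is easier to undo than a commented-out block, but either way the test on a demo page still applies.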