PhpDig.net


mdavila 09-13-2005 04:16 PM

Extracting H2 tag
 
Hi,

I added the code you suggested to robot_functions.php to pull the h2 tag instead of the title tag. It works, but it pulls both the first and second h2 tags.

This is the code I pasted in:

//extracts title
if (preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is',$text,$regs,PREG_SET_ORDER)) {
        // assumes there are at least three h2 tags
        $title = trim($regs[0][1]." ".$regs[1][1]." ".$regs[2][1]);
}
else {
        $title = "";
}

The result shows " Contact UsContact Us".

The page http://dobleweb1.doble.com/contactus/ has two h2 tags, but I only want to show the second one.

Any suggestions?

Thanks,

-Marc

Charter 09-14-2005 03:59 AM

If you only want the second H2 tag try:
Code:

$title = trim($regs[1][1]);
Instead of the following:
Code:

$title = trim($regs[0][1]." ".$regs[1][1]." ".$regs[2][1]);
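
For reference, with PREG_SET_ORDER each element of $regs is one complete match, so $regs[1][1] is the captured text of the second h2. A minimal sketch, assuming a made-up $text sample rather than a real crawled page:

```php
<?php
// Hypothetical sample input; a real page would come from the spider.
$text = '<h2>Site Name</h2><p>Intro</p><h2>Contact Us</h2>';

preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is', $text, $regs, PREG_SET_ORDER);

// $regs[0] is the first match, $regs[1] is the second match.
// Within each match, index 0 is the whole element and index 1 is the captured text.
$first  = trim($regs[0][1]);  // text of the first h2
$second = trim($regs[1][1]);  // text of the second h2

echo $second, "\n";  // prints "Contact Us"
```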

mdavila 09-14-2005 08:02 AM

When I try that, it brings up "Untitled" and "search.php" for most of them:

http://doble.phpslave.com/search.php

-Marc

Charter 09-14-2005 08:57 AM

Are you using the following?
Code:

if (preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is',$text,$regs,PREG_SET_ORDER)) {
        // assumes at least two h2 tags were matched in $text
        $title = trim($regs[1][1]);
}
else {
        $title = "";
}
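
Note that $regs[1][1] only exists when at least two h2 tags are matched in $text. A guarded variant could look like the sketch below; pick_h2_title is a hypothetical helper written for illustration, not a PhpDig function:

```php
<?php
// Hypothetical helper, not part of PhpDig: returns the text of the h2 at
// position $index (0-based), or an empty string when that h2 is absent.
function pick_h2_title($text, $index = 1) {
    $found = preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is', $text, $regs, PREG_SET_ORDER);
    if ($found && isset($regs[$index][1])) {
        return trim($regs[$index][1]);
    }
    return "";
}

echo pick_h2_title('<h2>Site Name</h2><h2>Contact Us</h2>'), "\n";  // prints "Contact Us"
```

With only one h2 in the input, the helper returns an empty string instead of reading an index that does not exist.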


mdavila 09-14-2005 09:38 AM

Here is the code:

//extracts title
if (preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is',$text,$regs,PREG_SET_ORDER)) {
        $title = trim($regs[1][1]);
}
else {
        $title = "";
}

Charter 09-14-2005 10:26 AM

Keep that code and increase CHUNK_SIZE in the config file; 4096 may be enough. If not, increase it further so that both H2 tags fall within the same chunk.
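
To illustrate why the chunk size matters, here is a rough sketch. It assumes the title regex only sees one chunk at a time; str_split() is just a stand-in for however the spider really reads the page, and the sample page and the 1024 starting size are made up for the example:

```php
<?php
// Hypothetical page: padding pushes the second h2 past the first 1024 bytes.
$page = '<h2>Site Name</h2>' . str_repeat(' ', 1500) . '<h2>Contact Us</h2>';

// Count the complete h2 elements found in the first chunk of a given size.
// str_split() is only a stand-in for the spider's real chunked read.
function h2_count_in_first_chunk($page, $chunkSize) {
    $chunks = str_split($page, $chunkSize);
    return preg_match_all('/< *h2 *>(.*?)< *\/ *h2 *>/is', $chunks[0], $regs, PREG_SET_ORDER);
}

echo h2_count_in_first_chunk($page, 1024), "\n";  // prints 1: the second h2 is cut off
echo h2_count_in_first_chunk($page, 4096), "\n";  // prints 2: both h2 tags fit
```

When only one h2 lands in the chunk, $regs[1][1] is not set, so the title comes out empty.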

mdavila 09-14-2005 01:50 PM

That seems to have done the trick!

Thanks,
-Marc :o

