So you have the following?
PHP Code:
$text = eregi_replace("<td[^>]*>.*</td>"," ",$text);
$text = preg_replace("/<[\/\!]*?[^<>]*?>/is"," ",$text);
The first removes everything between <td...> and </td> (according to CHUNK_SIZE), and the second removes any other tag-like text, so you don't really need the first one. If you want to exclude part of a page, look at this thread, or look at how $title is set in the phpdigCleanHtml function in the robot_functions.php file.
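If it helps to see what each pattern does on its own, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not from PhpDig. The eregi_replace call is swapped for a rough preg_replace stand-in, since the ereg functions were removed in PHP 7; I am assuming the /is flags are close enough to the original behaviour.

```php
<?php
// Hypothetical sample input, not taken from PhpDig.
$text = '<p>Intro</p><table><tr><td class="nav">Menu</td><td>Body</td></tr></table><!-- note -->';

// Rough PCRE stand-in for the eregi_replace() line (assumption: /is approximates it).
// The greedy .* runs from the first <td ...> to the last </td>, so the cell text goes too.
$withoutCells = preg_replace('/<td[^>]*>.*<\/td>/is', ' ', $text);

// The second pattern from the post: each tag-like token becomes a space,
// while the text between tags is kept.
$tagsOnly = preg_replace('/<[\/\!]*?[^<>]*?>/is', ' ', $text);

// Collapse runs of whitespace so the results are easier to compare.
$tagsOnlyClean = trim(preg_replace('/\s+/', ' ', $tagsOnly));
$bothClean = trim(preg_replace('/\s+/', ' ', preg_replace('/<[\/\!]*?[^<>]*?>/is', ' ', $withoutCells)));

echo $tagsOnlyClean, "\n"; // tags stripped, cell text kept
echo $bothClean, "\n";     // cell text dropped as well
```

So on this sample the second pattern alone strips the tags but keeps the cell text, while running both drops whatever sat inside the <td> cells too.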