I added this line to the config:
Code:
define('FORBIDDEN_PATH','(guestbook|forum|cgi-bin|webring|affiliates|links|webrings|banners)');
I added this code to spider.php (the addition is the final condition, && !eregi(FORBIDDEN_PATH,$temp_path)):
Code:
//test content-type of this page if not excluded
$result_test_http = '';
if (!phpdigReadRobots($exclude,$temp_path) && !eregi(FORBIDDEN_EXTENSIONS,$temp_file) && !eregi(FORBIDDEN_PATH,$temp_path)) {
    $result_test_http = phpdigTestUrl($url_indexing,'date',$cookies);
}
I tried the code you gave, and even variations of it, but was never able to get it to ignore a path or directory. This code should be added to the next phpdig version. It's a necessity if you want a little more control over the content being indexed and want to reduce the size of the MySQL database.
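For anyone who wants to check which paths the pattern would actually exclude before running a spider, here is a minimal standalone sketch. It is not phpdig code: isForbiddenPath() is a hypothetical helper name, and it uses preg_match() with the i flag as a stand-in for eregi(), on the assumption that you are on a PHP version where eregi() is no longer available. The sample paths are made up for illustration.

```php
<?php
// Same pattern as the config line above.
define('FORBIDDEN_PATH', '(guestbook|forum|cgi-bin|webring|affiliates|links|webrings|banners)');

// Hypothetical helper (not part of phpdig): returns true when $path
// contains one of the forbidden names, ignoring case as eregi() did.
function isForbiddenPath($path)
{
    // '#' is used as the delimiter so slashes in a path need no escaping.
    return preg_match('#' . FORBIDDEN_PATH . '#i', $path) === 1;
}

var_dump(isForbiddenPath('forum/topics/'));  // bool(true)
var_dump(isForbiddenPath('CGI-BIN/'));       // bool(true)
var_dump(isForbiddenPath('products/'));      // bool(false)
var_dump(isForbiddenPath('mylinks/'));       // bool(true), substring match
```

Note that the match is a substring match, so "links" also excludes a path like mylinks/, and "webring" already covers "webrings". If that is too broad, the pattern would need anchoring on the slashes.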