My spidering problem seems to mostly involve pages that redirect, such as:
http://hawkeyesports.cstv.com/
http://arizona.diamondbacks.mlb.com
Here is a troubleshooting example I found on the forums, posted by Charter.
Charter's sample forum entry:
$url = "http://somewhere.com/path1/path1/file1.php?someid,1,1,1";
print_r(parse_url($url));
Charter's output:
Array ( [scheme] => http [host] => somewhere.com [path] => /path1/path1/file1.php [query] => someid,1,1,1 )
My attempt:
$url = "http://houston.astros.mlb.com";
print_r(parse_url($url));
My output:
Array ( [scheme] => http [host] => houston.astros.mlb.com )
Another sample:
Array ( [scheme] => http [host] => hawkeyesports.cstv.com )
In my output the path and query are missing, so I am not seeing the pages these URLs should redirect to. Any suggestions for troubleshooting failed redirects?
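One thing I am unsure about: as far as I know, parse_url() only splits the string it is given and never contacts the server, so it may not be able to show a redirect target at all. Below is a rough sketch of what I could try instead. The helper name last_redirect_target is my own invention, and the live check assumes allow_url_fopen is enabled and that the server redirects with a Location header (it would not catch a meta-refresh or JavaScript redirect).

```php
<?php
// Hypothetical helper (not part of any library): return the last Location
// header found in a list of raw header lines, or null if there is none.
function last_redirect_target(array $headerLines): ?string
{
    $target = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so match "location:" too.
        if (stripos($line, 'Location:') === 0) {
            $target = trim(substr($line, strlen('Location:')));
        }
    }
    return $target;
}

// parse_url() only splits the string; no request is made, so there is
// no path unless one was typed into the URL.
$parts = parse_url('http://houston.astros.mlb.com');
var_dump(isset($parts['path'])); // bool(false)

// Live check (needs network access, so left commented out here):
// $headers = get_headers('http://houston.astros.mlb.com');
// if ($headers !== false) {
//     var_dump(last_redirect_target($headers));
// }
```

If the live check prints a URL, the server is redirecting over HTTP and the spider would need to follow that Location. If it prints NULL, the redirect is probably done some other way.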
thx