Old 05-24-2004, 07:28 AM   #5
ciaran@clissman
Green Mole
 
Join Date: May 2004
Posts: 10
Hmm, we're not there yet.

The sites I'm crawling aren't mine, so I can't put robots.txt files on them.

Isn't there a function somewhere that says:
'If the directory of the page you are about to index is a parent directory of the page you started from, leave it alone (or not, depending on a config variable)'?
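
To make the rule concrete, here is a minimal sketch of the kind of check being asked about, written in Python for illustration only. It is not the crawler's actual code: the names `start_directory`, `should_index` and the `stay_below_start` flag are hypothetical, and it assumes a start URL whose last path segment has no trailing slash is a file rather than a directory.

```python
from urllib.parse import urlsplit
import posixpath


def start_directory(start_url):
    """Return the directory part of the start URL's path, ending in '/'."""
    path = urlsplit(start_url).path or "/"
    if path.endswith("/"):
        return path
    # Assumption: a last segment without a trailing slash is a file name.
    return posixpath.dirname(path).rstrip("/") + "/"


def should_index(start_url, candidate_url, stay_below_start=True):
    """Decide whether candidate_url may be indexed.

    stay_below_start plays the role of the config variable: when True,
    pages in a parent (or sibling) directory of the start page are skipped.
    """
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)

    # Never leave the host the crawl was started on.
    if candidate.netloc.lower() != start.netloc.lower():
        return False

    if not stay_below_start:
        return True

    # Collapse '.' and '..' segments so '../' links cannot escape the check.
    candidate_path = posixpath.normpath(candidate.path or "/")
    candidate_path = candidate_path.rstrip("/") + "/"

    # Index only pages at or below the start directory.
    return candidate_path.startswith(start_directory(start_url))


if __name__ == "__main__":
    start = "http://example.com/docs/guide/index.html"
    for url in (
        "http://example.com/docs/guide/page2.html",
        "http://example.com/docs/index.html",
        "http://example.com/docs/guide/../other.html",
    ):
        print(url, should_index(start, url))
```

Calling `should_index` with `stay_below_start=False` would let the crawl wander up into parent directories on the same host, which is the "or not" case in the question.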

Thanks again,

Ciaran