Newbie on Domains: Yes or No Answer Please :)
Hi!
Would it be possible to force Dig to crawl many domains, but start only in a sublevel of each domain and never "crawl" out of that domain to offsite links? For example, fetch URLs to pages in:
domain1.com/subdir
domain1.com/subdir/subdir...
domain2.com/subdir/subdir/...
but never return results like:
domain1.com
domain1.com/doc.html
domain1.com/unspecified-dir/
domain1.com/unspecified-dir/doc.html
I hope this makes sense. I'm hoping to use Dig to spider specific content on many domains without crawling around too much :) or jumping onto "offsite" undefined domains. A yes or no would be appreciated!
Assuming you are using PhpDig v.1.8.7, apply the code change in this post, then set LIMIT_TO_DIRECTORY to true and PHPDIG_IN_DOMAIN to false, both in the config file. After that, index http://www.domain.com/subdir/ (with the trailing slash) from the text box in the PhpDig admin panel, using whatever "Search depth" and "Links per" values you prefer. Note that the values already stored under "Update sites" are used by default, so just choose "no" or edit "Update sites" if you wish to reset them.
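For reference, the two settings would look roughly like this in the config file. This is only a sketch: the constant names and values are the ones given above, but the define() form and the file location (includes/config.php) are assumptions about a stock install, so check them against your own copy.

```php
<?php
// Sketch only: assumes the settings are declared with define() in
// includes/config.php, as in a stock PhpDig install (an assumption).

// Keep the spider inside the directory of the starting URL, so that
// http://www.domain.com/subdir/ does not lead up to http://www.domain.com/
define('LIMIT_TO_DIRECTORY', true);

// Set to false, as recommended above
define('PHPDIG_IN_DOMAIN', false);
```

If these constants are already defined in your config file, change the existing lines rather than adding new ones, since PHP will not redefine a constant.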