Spidering issue with my site
Hello, I'm trying to set up phpdig for a web site. I can get it to spider other web sites, but not mine.
I have tried both locally from the command line and remotely from another server. Any time I try to spider it, the web page freezes for about 30 seconds after I click on the "Dig This!" button and then goes to the result page with:

Spidering in progress...
SITE : http://dev.videx.com/
Exclude paths : - @NONE@
No link in temporary table
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
[Back] to admin interface.

The site is, if you didn't notice ;) , dev.videx.com, and I have managed to spider other servers in our domain (like www.videx.com). I have removed the robots.txt file from the site but still have a .htaccess restricting use of the /search folder. Otherwise the site is a basic CSS / PHP based one on a Mac OS X 10.3 server, and I am using phpdig version 1.6.2. I have modified my config file to not search through .css files, but still no luck. Any suggestions?
follow up info
Anyone? Anyone? Bueller?
Well, I've done some more searching and it turns out that the spidering will hang on any Mac OS X 10.3 site that I configure (including a default site with one web page!). It works fine spidering Mac OS X 10.2 servers, however, so I think it has something to do with the Apache config on the server. The site that I can't get phpdig to spider is http://dev.videx.com/ and it is running with the following config:

OS: Mac OS X 10.3
Apache: 1.3.28
PHP: 4.3.2
phpdig: 1.6.2

I have tried turning on error logging for php, but it never creates the file. My php.ini file is:

include_path=".:/Library/WebServer/php"
log_errors = On
error_log = ".:/Library/WebServer/log.txt"
error_reporting = E_ALL

Feel free to attempt to spider http://dev.videx.com/ and let me know if it works :)
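One thing that may explain the missing log file: the `.:` prefix is include_path syntax (a colon-separated list of directories), while error_log expects a single file path. PHP would most likely read ".:/Library/WebServer/log.txt" as a relative path under a directory literally named ".:", which does not exist. Below is a minimal Python sketch of that check. The helper name `log_path_is_usable` is made up for illustration, and it only tests that the parent directory exists and is writable by the current user, which is an approximation of what PHP needs.

```python
import os
import tempfile


def log_path_is_usable(path):
    """Return True if the parent directory of `path` exists and is writable.

    Hypothetical helper: an approximation of what a log writer needs,
    not a reproduction of PHP's own logic.
    """
    parent = os.path.dirname(path) or "."
    return os.path.isdir(parent) and os.access(parent, os.W_OK)


# The error_log value from the php.ini above; its parent directory is
# ".:/Library/WebServer", relative to the current working directory.
print(log_path_is_usable(".:/Library/WebServer/log.txt"))

# For comparison: a file inside the system temp directory.
print(log_path_is_usable(os.path.join(tempfile.gettempdir(), "log.txt")))
```

If the first check fails, an absolute path such as /Library/WebServer/log.txt, in a directory the web server user can write to, would be the thing to try.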
Hi. Below are the results at search depth one for http://dev.videx.com/. When you try to crawl this site, what shows up in your Apache log files?
links found : 17
http://dev.videx.com/
http://dev.videx.com/favicon.ico
http://dev.videx.com/index.html
http://dev.videx.com/products/index.html
http://dev.videx.com/News/index.html
http://dev.videx.com/about/index.html
http://dev.videx.com/products/downloads/manuals/accesscontrol/cyberaudit_manual.pdf
http://dev.videx.com/products/support.html
http://dev.videx.com/products/download.html
http://dev.videx.com/products/listing.html
http://dev.videx.com/news/tradeshows.html
http://dev.videx.com/news/careers.html
http://dev.videx.com/map.html
http://dev.videx.com/news/press.html
http://dev.videx.com/news/studies.html
http://dev.videx.com/about/privacy.html
http://dev.videx.com/about/contact.html
Optimizing tables...
Indexing complete !
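To compare what a crawler can see from a given machine, it can help to pull the links out of a page independently of phpdig. Here is a rough Python sketch using only the standard library. It is not phpdig's own link-extraction logic; `LinkCollector` and `extract_links` are names made up for the example, and the sample markup is invented (it only borrows a few paths from the list above).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from href attributes of <a> and <link> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)  # resolve relative links
                if url not in self.links:            # skip duplicates
                    self.links.append(url)


def extract_links(html, base_url):
    """Return the unique links found in `html`, in order of appearance."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Invented sample markup, for illustration only.
sample = (
    '<link rel="icon" href="/favicon.ico">'
    '<a href="index.html">Home</a>'
    '<a href="/about/contact.html">Contact</a>'
)
for url in extract_links(sample, "http://dev.videx.com/"):
    print(url)
```

To run it against a live page, the HTML could be fetched with `urllib.request.urlopen(url, timeout=30)` and decoded before being passed to `extract_links`. If that returns links from one machine and none from another, the difference is more likely in what the server sends than in the indexing step.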
log results
I cleared my apache logs, restarted it, and ran an index. Here are the results in the log files:
access log:
12.17.172.219 - - [19/Jan/2004:09:12:30 -0800] "GET / HTTP/1.1" 200 7404

error log:
Processing config directory: /etc/httpd/sites/*.conf
Processing config file: /etc/httpd/sites/0000_any_80_.conf
Processing config file: /etc/httpd/sites/virtual_host_global.conf
[Mon Jan 19 09:11:22 2004] [notice] Apache/1.3.28 (Darwin) PHP/4.3.2 configured -- resuming normal operations
[Mon Jan 19 09:11:22 2004] [notice] Accept mutex: flock (Default: flock)

It doesn't look very helpful to me. I still can't index the site from other Mac 10.3 servers. I timed the delay between when I click on the "Dig this!" button and when the spider page comes up with 0 results, and it is about 3 minutes and 20 seconds.
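For what it's worth, the access log does say something: there is a single request, the home page was served with status 200 and 7404 bytes, and nothing was requested after it. That matches the "links found : 0" result, in that the spider reached the server but did not go on to fetch any other page. Here is a small Python sketch for pulling those fields out of a Common Log Format line. The regular expression is a simplification, and `parse_access_line` is a hypothetical helper, not part of phpdig or Apache.

```python
import re

# Simplified pattern for Apache's Common Log Format.
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of fields from one access log line, or None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    # Apache logs "-" when no body was sent.
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


line = '12.17.172.219 - - [19/Jan/2004:09:12:30 -0800] "GET / HTTP/1.1" 200 7404'
rec = parse_access_line(line)
print(rec["method"], rec["path"], rec["status"], rec["size"])
```

Counting the parsed requests per client address during a crawl would show quickly whether the spider stops after the first page or keeps going.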
progress
Well, I just updated my phpdig to 1.6.5 and tried out indexing the site.
It works up to a point with the web interface and then gives me the following message from the web browser:

Could not open the page “http://12.17.172.219/phpdig1/admin/spider.php” after trying for 60 seconds.

All the pages that it indexes up to that point are fine. I am going to try it from the command line, where the timeout should not apply.
It's alive!
Everything is working fine now with phpdig 1.6.5 - apparently there was something in the php code in 1.6.2 that was causing a problem.
So, in case anyone wants to know, phpdig 1.6.5 works on Mac OS 10.3.