|
10-05-2003, 03:25 PM | #1 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
Some sites won't index
Hi All,
I have installed PHPDig-1.6.2 on a Redhat Linux 8.1 server running Apache 2.0 and MySQL version 3.23.56 with PHP 4.2.2. I am having problems with some sites not indexing and just giving me the following message.
SITE : http://www.somedomain.com/
Exclude paths :
- @NONE@
No link in temporary table
--------------------------------------------------------------------------------
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
I am sure that there are more than 10 links on the index.html page of this site, but still nothing. On other domains on this server PHPDig works correctly. Can anyone give me any idea as to what is happening? Thanks in advance. Jeff |
10-05-2003, 03:41 PM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Did you previously index the sites recently, or are the sites like http://www.domain.com/dirone/index.php and http://www.domain.com/dirtwo/index.php? You can change the reindex timeframe with define('LIMIT_DAYS',7); in the config file.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
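The "...Was recently indexed" message is consistent with a reindex window such as LIMIT_DAYS being applied. The following is only a conceptual sketch in Python of how such a window check could behave; it is not PHPDig's actual code, and the function name `was_recently_indexed` and the strict comparison are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Mirrors the define('LIMIT_DAYS',7); setting mentioned in the post above.
LIMIT_DAYS = 7


def was_recently_indexed(last_indexed, now, limit_days=LIMIT_DAYS):
    """Return True if last_indexed falls inside the reindex window ending at now.

    Hypothetical helper: a page that has never been indexed (None) is
    never treated as recent.
    """
    if last_indexed is None:
        return False
    return now - last_indexed < timedelta(days=limit_days)
```

Under this assumption, a page indexed two days ago would be skipped with a 7-day window, while setting the window to 0 would let it be spidered again.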
10-05-2003, 06:36 PM | #3 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
Thanks for the reply.
I have been trying to get it to work with that specific domain and have read other posts here about problems with reindexing a recently indexed site. So, I have repeatedly deleted the MySQL database and re-installed it using the install.php script. I am only indexing from the top level directory using "www.domainname1.com" and "www.domainname2.com". I have also tried "www.domainname.com/index.html" without any success. I have tried indexing 3 domains on the same server. Only one indexed. The other 2, including the domain that I really want to index, did not. Both domains gave the same message listed in the post above. Jeff |
10-05-2003, 06:49 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. To start over and index from scratch, do the following:
PHP Code:
PHP Code:
|
10-05-2003, 08:04 PM | #5 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
I followed your instructions but still nothing.
The message this time was:
2935: old priority 0, new priority 18
Spidering in progress...
-----------------------------
SITE : http://www.somedomain.com/
Exclude paths :
- @NONE@
No link in temporary table
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
Just to recap the installation instructions so I am sure that I got everything right:
I unTARed the phpdig files into a temp directory and then copied all the files into the www.somedomain.com/search directory. I changed the permissions on the admin/temp, includes and text_content directories to 777 to allow write access to everyone (a security issue that I will worry about once I get PHPDig running). I copied the _connect.php file to connect.php and edited it to add the MySQL hostname, username, password and database name. I cleared the PHPDIG_DB_PREFIX field. I then ran the install.php file from a web browser (although at first it complained about not finding the init_db.sql file, which I then copied to the admin directory). Once the database was created and the tables were installed, I tried to index www.somedomain.com with no success.
Was there anything else that I was supposed to do? Am I missing any permissions or something? Any other suggestions? Thanks for the help. Jeff |
10-06-2003, 02:53 PM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. That sounds correct. What type of files are you trying to index: *.asp, *.shtml, etcetera? Do you notice if indexing works on some file types but not others?
|
10-06-2003, 03:58 PM | #7 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
I am trying to index plain .html files.
I have done some more tests and I have tried to index 10 different virtual domain sites that reside on my server. I have discovered that of the 10 sites I tried to index only 1 site worked. 9 sites would not index. Looking further, I discovered that the only site that would index was a site that had moved to another provider. The directory structure and files for the web site still resided on my server but the DNS now points to another server. All the other virtual domains that I tried to index had DNS entries that pointed to my server IP address. Does this tell you anything? Jeff |
10-06-2003, 04:36 PM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Can you try lynx from command line instead? An example is in this thread.
|
10-06-2003, 06:41 PM | #9 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
I tried using Lynx, with no success.
Lynx would just sit there saying "Making HTTP connection to www.somedomain.com". I was wondering if the issue in this case could be that the web server is behind a NAT'ed firewall? Also, the web sites are on the same machine as the DNS service. So, on the internal network the server has an IP address, for example, of 10.1.1.100. However, in the DNS the domain has an IP address of 123.123.123.1. In this case, Lynx is trying to open the web site that DNS says is at 123.123.123.1, while the server that the web site is really on is at 10.1.1.100. So no connection can be established. Is this a possible explanation for the problem? Has anyone run into this problem before? Any and all help is greatly appreciated. Thanks, Jeff |
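One way to check the NAT theory from the server itself is to see which address the hostname resolves to and whether a plain TCP connection to port 80 on that address succeeds. Below is a minimal Python sketch of that check; the helper names `resolve`, `can_connect` and `diagnose` are made up for this example, and it only tests TCP reachability, not what the web server actually returns.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(address, port=80, timeout=5.0):
    """Try a plain TCP connection; True on success, False on timeout or refusal."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


def diagnose(hostname, port=80, timeout=5.0):
    """Map each resolved address to whether a TCP connection to it succeeds."""
    return {addr: can_connect(addr, port, timeout) for addr in resolve(hostname)}
```

Running `diagnose("www.somedomain.com")` on the web server would show the address the spider is really trying to reach. If it lists the external address with `False` while `can_connect("10.1.1.100")` returns `True`, that would point to the NAT setup rather than to PHPDig.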
10-08-2003, 10:04 AM | #10 |
Green Mole
Join Date: Oct 2003
Location: Mesa, AZ
Posts: 15
|
This is definitely a NAT problem. I am experiencing the same thing and am trying to figure out a rule to get around it. What I'm trying to figure out is how to get the webserver to reply on the same interface the request came in on, instead of doing NAT on the packet.
If your setup isn't too complex, you may just be able to set up a rule specifying that outbound packets to a given IP should not be NAT'd, or should be NAT'd only in some specific way. I am hoping to find a way to tell the system not to do NAT on packets with a certain flag marked ... I'm using ipf on FreeBSD, but I would guess iptables would have this functionality as well... |
10-08-2003, 10:31 AM | #11 |
Green Mole
Join Date: Oct 2003
Location: Mesa, AZ
Posts: 15
|
Well, fixed my problem by adjusting the routing table on the machine with the webserver.
In your case, why not add an explicit entry to your /etc/hosts file pointing to the internal address instead of the external one? |
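To illustrate the /etc/hosts suggestion: a hosts entry is an address followed by one or more names, and most resolvers consult that file before DNS (this depends on the system's resolver configuration). The Python sketch below builds such an entry and checks what a name maps to in hosts-file text. The address 10.1.1.100 and the name www.somedomain.com are the example values from earlier in the thread; `hosts_entry` and `hosts_lookup` are hypothetical helper names.

```python
def hosts_entry(address, *names):
    """Format one hosts-file line: an address followed by its names."""
    return address + "\t" + " ".join(names)


def hosts_lookup(hosts_text, hostname):
    """Return the first address mapped to hostname in hosts-file text, or None."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace before parsing.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if wanted in (name.lower() for name in names):
            return address
    return None
```

For example, `hosts_entry("10.1.1.100", "www.somedomain.com")` gives the line to add, and `hosts_lookup(open("/etc/hosts").read(), "www.somedomain.com")` would confirm the entry is in place before retrying the spider.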
10-08-2003, 10:46 AM | #12 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
Rayvd,
Yep, that worked. Thanks for the help. I hope PHPDig will eventually have the ability to directly index a site based on the location of files in the file system instead of only by FQDN/IP address. Again, thanks for the help. Jeff |
10-12-2003, 08:29 PM | #13 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
I have the same problem:
SITE : http://www.blah-blah-blah.com/
Exclude paths :
- @NONE@
No link in temporary table
>Well, fixed my problem by adjusting the routing table on the machine with the webserver.
I can't do that because I have a simple hosting account. Any suggestions? And thanks in advance for any help. |
10-13-2003, 05:01 PM | #14 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Perhaps in config.php change PHPDIG_DEFAULT_INDEX to false?
|
10-13-2003, 05:17 PM | #15 |
Green Mole
Join Date: Oct 2003
Posts: 6
|
Thanks Charter, but still the same:
--------------------------------------------------------------------------------
SITE : http://www.somesite.com/
Exclude paths :
- @NONE@
No link in temporary table
--------------------------------------------------------------------------------
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
--------------------------------------------------------------------------------
Any other ideas? I much appreciate the help. |
|
|