|
09-19-2003, 03:44 AM | #1 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
No indexing IIS 6 Win 2003 Server
I have spent a lot of time trying to find out what the problem is with the new IIS 6 on Windows 2003 Server.
PhpDig does not index IIS 6 websites at the moment. I also tried to index an IIS 6 site from a Linux system, with the same result. (Email me and I will send you the web page to test with.)
Results of indexing:
### IIS 6 - Log file ####
#Fields: date time c-ip c-session cs(Referer) sc-Protocol sc-uri sc-status
2003-09-18 19:41:27 62.142.48.115 1033 217.160.xx.xx 80 HTTP/1.1 HEAD /robots.txt 400 - BadRequest
2003-09-18 19:41:27 62.141.48.115 1034 217.160.xx.xx 80 HTTP/1.1 HEAD // 400 - BadRequest
2003-09-18 19:41:27 62.141.48.115 1035 217.160.xx.xx 80 HTTP/1.1 HEAD / 400 - BadRequest
2003-09-18 19:41:27 62.141.48.115 1036 1217.160.xx.xx 80 HTTP/1.1 HEAD /robots.txt 400 - BadRequest
op=HEAD arg=http://www.my-domain.de/ result="400 Bad Request"
## Windows 2003 Monitoring ###
<-> Filter: http
----------------------------------
HTTP: HEAD Request from Client
HTTP: Request Method =HEAD
HTTP: Uniform Resource Identifier =//
HTTP: Protocol Version =HTTP/1.1
HTTP: Host =www.my-domain.de
HTTP: Accept = */*
HTTP: Accept-Charset = iso-8859-1
HTTP: Accept-Encoding =identity
HTTP: User-Agent =PhpDig/1.6.2 (PHP; MySql)
------
HTTP: Response to Client; HTTP/1.1; Status Code = 400 - Bad Request
HTTP: Protocol Version =HTTP/1.1
HTTP: Status Code = Bad Request
HTTP: Reason =Bad Request
HTTP: Content-Length =20
HTTP: Content-Type =text/html
HTTP: Connection =close
I will also ask in a Windows newsgroup to find the reason for this. I have read about other problems with error 400: does PhpDig use the HTTP commands allowed by the RFC? See: RFC 2616
-Roland- |
09-19-2003, 09:47 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. With HEAD [your_site]/robots.txt HTTP/1.1 it produces the following:
Content-Length: 24
The robots.txt file contains the following:
Code:
User-agent: *
Disallow:
What do you get?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
09-19-2003, 10:13 AM | #3 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
OK, it is deleted. You can try again. It's just the same in my tests.
-Roland- |
09-19-2003, 10:22 AM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Please can you post the results like you did above? Maybe there will be something in there, or are the results just like those above?
09-19-2003, 12:24 PM | #5 | |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Hmm, a monitor log is only possible if I start it 2 seconds before I dig.
This is wrong, IMHO! robot_functions.php, line 286:
Code:
$request =
    "HEAD $path HTTP/1.1\n"
    ."Host: $host$sport\n"
    .$cookiesSendString
    .$auth_string
    ."Accept: */*\n"
    ."Accept-Charset: ".PHPDIG_ENCODING."\n"
    ."Accept-Encoding: identity\n"
    ."User-Agent: PhpDig/".PHPDIG_VERSION." (PHP; MySql)\n\n";
With LF ('\n')? A bare LF does not conform to the RFC: each header line ends with a CRLF! See: http://www.w3.org/Protocols/rfc2616/...c2.html#sec2.2
Quote:
Last edited by Rolandks; 09-19-2003 at 12:58 PM. |
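For anyone who wants to check a request string for this problem before sending it, here is a minimal sketch. The function name uses_crlf_only is made up for illustration and is not part of PhpDig; it only tests line endings and does not validate the request in any other way.

```php
<?php
// Hypothetical helper (not part of PhpDig): returns true when every line
// break in a raw HTTP request is CRLF and the header block ends with an
// empty line (CRLF CRLF).
function uses_crlf_only($request)
{
    // A bare LF is "\n" not preceded by "\r"; a bare CR is "\r" not followed by "\n".
    if (preg_match('/(?<!\r)\n|\r(?!\n)/', $request) === 1) {
        return false;
    }
    return substr($request, -4) === "\r\n\r\n";
}

$lf_request   = "HEAD /robots.txt HTTP/1.1\nHost: www.my-domain.de\n\n";
$crlf_request = "HEAD /robots.txt HTTP/1.1\r\nHost: www.my-domain.de\r\n\r\n";

var_dump(uses_crlf_only($lf_request));   // bool(false)
var_dump(uses_crlf_only($crlf_request)); // bool(true)
```

The first string has the same shape as the request built at line 286, which would fit with IIS 6 answering 400 while more tolerant servers accept it.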
|
09-19-2003, 12:43 PM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. I believe the problem is that the script uses \n and your machine needs \r\n.
Please try this to fix the problem: First make a backup of the robot_functions.php file. Then in robot_functions.php, do the following:
I think that's all of them that absolutely need to be changed. I also think you could just do a search and replace, changing all \n to \r\n in the files. As a general rule of thumb, I believe it's like this for different operating systems:
Windows uses \r\n
Macintosh uses \r
*nix uses \n
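As a rough sketch of what the corrected request could look like, the example below builds a HEAD request with \r\n after every header line. The function build_head_request and its parameters are assumptions made for illustration; PhpDig builds the string inline in robot_functions.php, and the cookie and auth strings from the original are left out here. Note that HTTP itself requires CRLF as the line terminator, whatever operating system the spider or the server runs on.

```php
<?php
// Hypothetical helper for illustration only; not PhpDig's actual code.
function build_head_request($host, $path, $user_agent, $charset)
{
    $eol = "\r\n"; // HTTP line terminator, independent of the OS
    $lines = array(
        "HEAD $path HTTP/1.1",
        "Host: $host",
        "Accept: */*",
        "Accept-Charset: $charset",
        "Accept-Encoding: identity",
        "User-Agent: $user_agent",
    );
    // Join the header lines with CRLF and finish with an empty line.
    return implode($eol, $lines) . $eol . $eol;
}

echo build_head_request(
    'www.my-domain.de',
    '/robots.txt',
    'PhpDig/1.6.2 (PHP; MySql)',
    'iso-8859-1'
);
```

Keeping the terminator in one variable means only the HTTP request strings change. A blind search and replace of every \n in the files would probably also touch strings that have nothing to do with HTTP, so it is worth checking each match.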
09-19-2003, 12:47 PM | #7 | |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Quote:
|
09-19-2003, 01:40 PM | #8 | |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Thanks.
I think this should be changed in the next version so that it conforms to the RFC; otherwise users who update will have to fix this again. As I wrote above, see: http://www.w3.org/Protocols/rfc2616/...c2.html#sec2.2 Quote:
-Roland- |