PhpDig.net

10-10-2003, 01:10 AM   #1
BernhardG
Green Mole
 
Join Date: Oct 2003
Location: Püttlingen (Saar) - Germany
Posts: 8
Not able to index [some site]...

Hi,

I think I recently found a bug in the indexer. Yesterday I tried to index the site http://www.rover-club-berlin.com/ . It is not possible to index this site completely (about 550 pages); only the first page gets indexed. I suspect the cause is that the website author does not use correct markup. Other indexers (the phpCMS indexer, isearch) can spider this site correctly, so I conclude that there is a bug in phpDig.
I also have a problem with the site http://www.rover-club-hessen.de/ . The first level can be indexed, but the pages below "Mitglieder" (Members), for example, are not indexed. At the moment I have no idea where the problem is; I don't know whether it is a bug in phpDig or bad markup.
I was able to index this site completely with the phpCMS indexer and with isearch.

Bernhard
__________________
phpCMS - Content Management System
http://www.phpcms.de/
10-10-2003, 11:07 AM   #2
Rolandks
Purple Mole
 
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
The first site has 34 serious errors in the W3C validator and is full of Java. For example:

Line 25, column 17: "FRAMESET" not finished but containing element ended

Line 16, column 15: end tag for "HEAD" which is not finished

That is a bit too much for phpDig.

The second site works fine:

Code:
links found : 40
http://www.rover-club-hessen.de/
http://www.rover-club-hessen.de/NOMATCH
http://www.rover-club-hessen.de/burningbook/guestbook.php
http://www.rover-club-hessen.de/rchevo_2/html/about.htm
http://www.rover-club-hessen.de/burningbook/?page=2
http://www.rover-club-hessen.de/burningbook/?page=3
http://www.rover-club-hessen.de/burningbook/gbae.php
http://www.rover-club-hessen.de/rchevo_2/html/members.htm
http://www.rover-club-hessen.de/rchevo_2/html/meetings.htm
http://www.rover-club-hessen.de/rchevo_2/html/forum.htm
http://www.rover-club-hessen.de/rchevo_2/html/tutorials.htm
http://www.rover-club-hessen.de/rchevo_2/html/links.htm
http://www.rover-club-hessen.de/guestbook.php
http://www.rover-club-hessen.de/rchevo_2/
http://www.rover-club-hessen.de/burningbook/guestbook.php?a20198
http://www.rover-club-hessen.de/burningbook/?page=1
http://www.rover-club-hessen.de/burningbook/
http://www.rover-club-hessen.de/burningbook/help.php
http://www.rover-club-hessen.de/rchevo_2/html/spacecake.htm
http://www.rover-club-hessen.de/rchevo_2/html/geisterfahrer.htm
http://www.rover-club-hessen.de/rchevo_2/html/dermeister.htm
http://www.rover-club-hessen.de/rchevo_2/html/hometown.htm
http://www.rover-club-hessen.de/rchevo_2/html/disasterman.htm
http://www.rover-club-hessen.de/rchevo_2/html/dirty-t.htm
http://www.rover-club-hessen.de/rchevo_2/html/thunderdome.htm
http://www.rover-club-hessen.de/rchevo_2/html/thunderdine.htm
http://www.rover-club-hessen.de/rchevo_2/html/englischepatient.htm
http://www.rover-club-hessen.de/rchevo_2/html/fastrabbit.htm
http://www.rover-club-hessen.de/rchevo_2/html/butterflyel.htm
http://www.rover-club-hessen.de/rchevo_2/html/frosty.htm
http://www.rover-club-hessen.de/rchevo_2/html/joker.htm
http://www.rover-club-hessen.de/rchevo_2/html/dragon.htm
http://www.rover-club-hessen.de/rchevo_2/html/treffen011003.htm
http://www.rover-club-hessen.de/rchevo_2/html/oldtimershow.htm
http://www.rover-club-hessen.de/rchevo_2/html/treffen053103.htm
http://www.rover-club-hessen.de/rchevo_2/html/tutorial_1.htm
http://www.rover-club-hessen.de/rchevo_2/html/tutorial_2.htm
http://www.rover-club-hessen.de/rchevo_2/html/roverlinks.htm
http://www.rover-club-hessen.de/rchevo_2/html/clublinks.htm
http://www.rover-club-hessen.de/rchevo_2/html/tuninglinks.htm
Optimizing tables...
Indexing complete !

Last edited by Rolandks; 10-10-2003 at 11:13 AM.
10-10-2003, 03:40 PM   #3
BernhardG
Hi Roland,

I know that the first page has many errors, and I swear I did not write it, but the problem is that other indexers can fetch the site and phpDig cannot. I'll try to find the bug myself now.
By the way, I think it would be best if an indexer simply searched for anything that contains src="URL" or href="URL" surrounded by < and >. With this, every problem with wrong markup should go away. The downside is that there could be false positives in the generated URL table.
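
To make the idea concrete, here is a minimal sketch of such a lenient link scan. It is written in Python only for illustration; it is not phpDig's code, and the function name extract_links and both regular expressions are assumptions made for this example. It looks at every <...> span on its own, never builds a document tree, and therefore does not care about unclosed FRAMESET or HEAD elements. As noted above, it can produce false positives, for example from tags inside comments or scripts.

```python
import re
from urllib.parse import urljoin

# Any <...> span. No tree is built, so unclosed or misnested elements do not matter.
TAG_RE = re.compile(r"<[^<>]*>")

# href= or src= followed by a double-quoted, single-quoted, or unquoted value.
ATTR_RE = re.compile(
    r"""\b(?:href|src)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""",
    re.IGNORECASE,
)


def extract_links(html, base_url):
    """Return absolute http(s) URLs from href/src attributes, in order, without duplicates."""
    links = []
    for tag in TAG_RE.findall(html):
        for double_quoted, single_quoted, bare in ATTR_RE.findall(tag):
            raw = (double_quoted or single_quoted or bare).strip()
            # Skip empty values and same-page anchors.
            if not raw or raw.startswith("#"):
                continue
            url = urljoin(base_url, raw)
            # Drop mailto:, javascript: and other non-web schemes.
            if not url.startswith(("http://", "https://")):
                continue
            if url not in links:
                links.append(url)
    return links
```

Called as extract_links(page_source, page_url), it returns the list of candidate URLs for the spider queue. Filtering by host or file type would still have to happen afterwards.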

The second site was updated recently so it is possible that some errors are corrected now.

Anyway phpDig is a great project!

Bernhard
10-11-2003, 04:18 AM   #4
BernhardG
Hi!

After some testing with various settings, I managed to index http://www.rover-club-berlin.com/ . The only change I needed was to set PHPDIG_DEFAULT_INDEX to false.

Bernhard