PhpDig.net

02-12-2004, 03:58 AM   #1
renehaentjens
Index/Spider looping

I managed to get indexing/spidering into a loop. In a very small branch of the site, with no links to elsewhere, it kept repeating the same URL over and over again, with a red cross in front and "Was recently indexed" at the end.

When I stopped the process, the site was locked (of course), and MySQL gave error 145 on table tempspider: can't open file tempspider.MYI.

I had to delete the table and recreate it with phpMyAdmin.

As I have long URLs in the site, I first thought that this might have caused the problem. Why, by the way, are fields 'file' and 'path' limited to 127 chars in the spider table? That is not going to be enough for my site! Can't they be TEXT fields like in tempspider?

Anyway, my long URLs are not yet long enough to hit that limit: I currently have around 50-60 characters in path and file.
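To see how close a site's URLs come to the 127-character limit mentioned above, the path and file parts can be measured separately. The following is a minimal Python sketch, not PhpDig code. It assumes that "path" is the directory part of the URL up to the last slash and that "file" is the remainder including any query string; PhpDig's actual splitting rules may differ. The function names are hypothetical.

```python
from urllib.parse import urlsplit

# Column width reported in this thread for spider.path and spider.file.
SPIDER_FIELD_LIMIT = 127


def split_path_and_file(url):
    """Split a URL into (path, file) at the last slash of its path.

    Assumption: this only approximates how a path/file pair might be
    stored; the real spider may split URLs differently.
    """
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    path = directory.lstrip("/")
    if path:
        path += "/"
    if parts.query:
        filename += "?" + parts.query
    return path, filename


def fields_over_limit(url, limit=SPIDER_FIELD_LIMIT):
    """Return the names of the fields that would not fit in `limit` characters."""
    path, filename = split_path_and_file(url)
    over = []
    if len(path) > limit:
        over.append("path")
    if len(filename) > limit:
        over.append("file")
    return over
```

Running fields_over_limit over a list of the site's URLs would show which ones, if any, risk truncation in the spider table.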

So, something else must have been the cause of this looping.
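The cause of the loop is not established in this thread. For illustration only, here is how a crawl queue normally avoids revisiting a URL in a closed branch: every URL is recorded in a "seen" set before it is queued, so links that point back to already known pages are dropped. This is a generic Python sketch, not PhpDig's implementation; crawl and get_links are hypothetical names.

```python
from collections import deque


def crawl(start, get_links):
    """Visit every URL reachable from `start` exactly once.

    `get_links` is a caller-supplied function (hypothetical here) that
    returns the list of URLs linked from a given page.
    """
    seen = {start}          # URLs already queued or visited
    queue = deque([start])  # URLs still waiting to be visited
    order = []              # visit order, for inspection
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:  # skip anything already known
                seen.add(link)
                queue.append(link)
    return order
```

On a small branch whose pages only link to each other, the queue empties after each page has been visited once, so the crawl terminates. A spider that keeps reporting the same URL suggests that this bookkeeping, whatever form it takes, was lost or could not be updated.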
__________________
René Haentjens, Ghent University
02-13-2004, 11:19 AM   #2
Charter
Hi. I have not run a test index on the small branch of your site, but you may find the following article useful.

http://www.databasejournal.com/featu...le.php/3300511

You may also find the links below useful.

http://www.faqts.com/knowledge_base/view.phtml/aid/329
http://www.mysql.com/doc/en/Choosing_types.html
02-16-2004, 06:10 AM   #3
renehaentjens
Thanks, Charter, for not giving up on educating me. The articles that you refer to are always interesting and to the point.

My summary on VARCHAR vs. TEXT:
VARCHAR: at most 255 characters, only the space actually used is stored, trailing spaces are removed;
TEXT: in most respects like an unlimited VARCHAR, but with no DEFAULT value, and sorting uses only the first 1024 characters.

My summary on URL length: no limit is specified in RFC 1738; Windows products accept up to 2083 characters (max32://max2048).

As there is no loss of space anyway, may I suggest changing the VARCHAR columns to 255 characters in the next release, except where for some reason you count on truncation to a smaller size?

I understand that checking all possible side-effects of DB corruptions in the PHP code would be a major coding effort.