|
04-24-2004, 08:52 PM | #1 |
Green Mole
Join Date: Apr 2004
Posts: 2
|
Yet another indexing problem
I have searched and searched and can't find an answer.
I have lots of pages that reuse the same script, such as 'index.php', with different query strings added to it. phpdig seems unable to tell one query string from another. For example, I have over 200 items in a database, so my pages are 'index.php?id=1', 'index.php?id=2', etc. But phpdig has only picked up 'index.php?display_table', 'index.php?display_list', 'index.php?display_thumbnails' and 'index.php?id=86'. I can't get it to see that id=1, id=2, id=3... are different pages. It's as if it can only tell query strings apart when they differ in letters, not in numbers. What can I do? Oh, and there are no 404 problems, redirects or any of the other things from the other posts I've looked into. The links to all the ?id=n pages are listed on the first page. |
04-25-2004, 03:03 PM | #2 |
Green Mole
Join Date: Apr 2004
Posts: 2
|
I tried some more things, but no matter what I do, I cannot index any pages with similar query strings beyond ?display_all, ?display_table, ?display_list and a single one with a longer query string, whichever one it gets to first.
Odd thing is that if the page is a .shtml page and not a .php page I can index everything. Why is that? Is there anything I can do about that? |
04-25-2004, 03:55 PM | #3 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
One possible way around this, assuming your server runs Apache with mod_rewrite (typical on Linux), is to rewrite your URLs so that your dynamic content appears to be static.
I have a whole lot of dynamic content on my own website like this. For example, instead of displaying album content like this:
Code:
www.napathon.net/TrackList.php?AlbumID=1530
it is displayed like this:
Code:
www.napathon.net/AlbumID1530.php
with these Apache rewrite rules doing the mapping:
Code:
RewriteEngine On
RewriteRule ^AlbumID([0-9]+)\.php$ TrackList.php?AlbumID=$1 [L] |
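To make the mapping concrete, here is a minimal Python sketch of what the rewrite rule does to an incoming path. It is only an illustration of the pattern, not how Apache is implemented, and the helper names `to_dynamic` and `to_static` are made up for this example.

```python
import re

# Mirrors the RewriteRule pattern; the dot is escaped and the end is anchored
# so that only names like "AlbumID1530.php" match.
STATIC_URL = re.compile(r"^AlbumID([0-9]+)\.php$")


def to_dynamic(path):
    """Map a static-looking path to the real script, or None if no match."""
    match = STATIC_URL.match(path)
    if match is None:
        return None
    return "TrackList.php?AlbumID=" + match.group(1)


def to_static(album_id):
    """Build the static-looking link to put in pages for the spider to follow."""
    return "AlbumID%d.php" % album_id


if __name__ == "__main__":
    for path in ("AlbumID1530.php", "AlbumID1530xphp", "TrackList.php"):
        print(path, "->", to_dynamic(path))
```

The links in your own pages have to use the static-looking form (what `to_static` builds) for the spider to see them as separate pages; the server then maps each one back to the real script.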
05-17-2004, 10:06 AM | #4 |
Green Mole
Join Date: May 2004
Posts: 25
|
I'm coming across the same problem as sbhikes -- it grabs page.php?id=1 but doesn't grab 2, 3, etc...
Vinyl-junkie's workaround sounds like it should work, but I'd prefer not to have to go in and find every GET reference like that and rewrite it into the phpdig-friendly version (only to have it get rewritten back again with Apache behind the scenes). Seems like this is a genuine bug in phpdig's spidering process (that happens to have an Apache workaround). I don't suppose some kind soul familiar enough with the phpdig spidering code could try to fix this for real? |
05-17-2004, 10:20 AM | #5 |
Green Mole
Join Date: May 2004
Posts: 25
|
I'd like to expand on this problem a little bit, in case anyone feels like tackling it. I'm indexing a reasonably complicated site and I've noticed that in some cases it's managing to index dynamic pages with different numbers in their GET string, but not others.
I'm not sure about this, but it appears to be able to grab only one such link per page. For example, on http://www.freepress.net/news/releases.php, it will only spider the first release on the list (ID 17). However, it appears to be spidering several news article pages (which have URLs of the form news/article.php?id=XXXX), because it's finding them via separate pages, rather than on a single page as with the press releases. Or maybe it's dying simply because it stops looking at the releases once it hits the Word doc? Not sure... but it's fishy, and frustrating. |
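One way to narrow this down is to list the links a page really contains and compare that list with what the spider picked up. The Python sketch below does the listing with the standard library only; it is a diagnostic aid and says nothing about how phpdig parses links internally. The sample markup and the example.com address are hypothetical, not the real releases.php page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def distinct_links(html, base_url):
    """Return unique absolute URLs in page order; query strings are kept."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    seen = []
    for url in collector.links:
        if url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    # Hypothetical page fragment with two distinct ids and one repeat.
    sample = (
        '<a href="release.php?id=17">A</a>'
        '<a href="release.php?id=18">B</a>'
        '<a href="release.php?id=17">A again</a>'
    )
    for url in distinct_links(sample, "http://example.com/news/releases.php"):
        print(url)
```

If every ?id=N link shows up here as its own URL but the spider indexes only the first one, the links themselves are fine and the problem lies in how the spider treats them.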