01-13-2004, 01:08 AM | #1 |
Green Mole
Join Date: Dec 2003
Posts: 11
|
'Duplicate' Search Results
Hi,
I've noticed that PHPDig doesn't seem able to differentiate between nearly identical documents on a website (I say "nearly" because they appear identical to my human eyes). If one document is located in, say, /worldwide/ and another in /about_us/, both come up in the search results with identical percentages. Additionally, documents that are generated dynamically but are otherwise identical also produce multiple duplicate results. For example:

http://www.issa.com/worldwide/index....pe=news&id=153
and
http://www.issa.com/worldwide/index....pe=news&id=153

Both are listed as results (they differ by the region variable in the URL). This behavior is understandable, since the pages are slightly different from a machine's perspective. However, is there a way to tighten the criteria used to judge duplicate documents so that highly similar documents are filtered out as well? Say, if they share 90% of the same content?

Thanks in advance,
-Paul

For reference, you can see this behavior for yourself at: http://search.custodialadvisorsnetwork.org
Search for "cleaning standards" as a good example. Several pages into the results, you'll see some pseudo-duplicates.

Last edited by siliconkibou; 01-13-2004 at 01:10 AM.
01-13-2004, 09:00 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. You might try modifying the $md5 variable talked about in this thread.
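To illustrate why an exact md5 comparison lets near-duplicates through, and what a "90% of the same content" test could look like, here is a minimal sketch in Python. It is not PHPDig code (PHPDig is written in PHP) and does not show how PHPDig's $md5 variable is computed. All function names are hypothetical, and the similarity measure (Jaccard overlap of three-word shingles) is just one assumed way to approximate "shares 90% of the same content".

```python
import hashlib
import re


def normalize(text):
    """Lowercase the text and split it into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def md5_fingerprint(text):
    """Exact-match fingerprint: a single differing word changes the digest."""
    joined = " ".join(normalize(text))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = normalize(text)
    if len(words) < size:
        # Short documents are treated as one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty documents are considered identical
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a, b, threshold=0.9):
    """True when the documents overlap at or above the assumed threshold."""
    return similarity(a, b) >= threshold
```

With something like this, two pages that differ only in a region name or a navigation link would get different md5 fingerprints but would still be flagged by `is_near_duplicate`, so only one of them needs to be kept in the results. The 0.9 threshold and the shingle size are assumptions that would need tuning on real pages.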