PhpDig.net

Go Back   PhpDig.net > PhpDig Forums > Mod Requests

Reply
 
Thread Tools
Old 06-03-2004, 12:36 AM   #1
synnalagma
Green Mole
 
Join Date: Mar 2004
Posts: 22
Optimization

Hi,

This isn't really a mod request but more like a proposition because a really big part of it is already done.

For my work I needed a search engine and I can't use PHPDig for several reason but the most important one was that I don't have the right to use exec family function and settimelimit.

So I decided to do my own search engine wich can progressivly index content (the website is based on a portal called e-xoops) so there's an admin panel and I can index content when I know there's new content (and not reindexing whole site).

Anyway, my code is based on PHPDig one but there're some big differences.

I found some good optimization, I think, and I find quite normal to give something back. I explain it here so if you want to integrate it, do it. If you want me to do it, there's no problem, I'll do it (but I won't if you don't care so that's why there's this post).

So here's the optimization, the idea is to use only one SQL query to do all the search job and also the ordering job.

First you look for your words in the database (I take exemple for exact search but it's also possible for normal/exact/fuzzy):
you search for words exact :

PHP Code:
"SELECT id,word FROM keyword WHERE word='$word'" 
You do this for each word separatly and put each results in an array then you do :


PHP Code:
$sql="SELECT SQL_BIG_RESULT $tengine.site_id, SUM((CASE ";
foreach(
$words as $w){
$sql.="WHEN word_id IN(".implode(',',$word_id_array[$w]).") THEN " 
if(is_excluded($w)){
$sql.="-500000";
}
else{
$sql.="weight";
}
}
$sql.=")) AS sw" 
if you have an "and condition" you add this

PHP Code:
$sql.=",BIT_OR( CASE ";
$andnumber=1;
$andtotal=0;
foreach(
$words as $w){
if(!
is_excluded($w)){
$andnumber<<=1;
$total|=$andnumber;
$sql.="WHEN word_id IN(".implode(',',$word_id_array[$w]).") THEN ".$andnumber;
}
}
$sql.=") AS bo FROM engine WHERE word_id IN(".implode(',',all_word_id).") GROUP BY site_id HAVING bo=$total AND sw>0 ORDER BY sw DESC"
The idea is if all words are present then bo must be equal to total. With this you can also have more complex conditions like (word1 AND word2) OR word3 if you generate all valid bo that can satisfy your condition (I'm actually thinking on how to do it).

With this query you will have all your results (valid one) already ordered. This speed up quite well the search, since MySQL is faster on this than php.
On a slow computer difference where 4 seconds (1.5 for mysql and 2.5 for php) to search for 3000 results (on 4000 documents) with old method (my method's not PHPDig's one) and 0.6 seconds with this one. So there's quite a difference


I hope you will understand it, I have a poor english, french would be better, feel free to contact me about any questions and so on... You can reach me synnalagma AT ezwww dot ch

Last edited by synnalagma; 06-03-2004 at 12:49 AM.
synnalagma is offline   Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Forum Jump


All times are GMT -8. The time now is 04:48 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright © 2001 - 2005, ThinkDing LLC. All Rights Reserved.