PhpDig.net

Old 10-05-2003, 11:59 AM   #1
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
New Features Inquiry

Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 10-06-2003, 01:41 AM   #2
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Re: New Features Inquiry

Quote:
Originally posted by Charter
Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
There are some very interesting features in your search engine.
I have a list of new features we are working on.
For example, "Ad links".
My idea is to integrate PhpDig with a text banner server. We just need to create a "hook" between the words searched by the user and the keywords of the banners.

To allow easy integration with many ad servers, I propose these steps:
1) Allocate space on the right side of the PhpDig results page. This column should collapse (or be absent) if there are no "sponsored links" or the feature is disabled.
2) Create a "hook function" inside PhpDig, written for a specific AdServer. This function takes the ads from the AdServer and shows them.
Everything else about the ads is then handled by the AdServer: counts, click statistics, and so on.

This way it's easy to integrate PhpDig with any AdServer. I have my own, but it is possible to do it with PhpAdsNew or others.
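
The two steps above could be sketched roughly as follows. This is only an illustration in Python, not PhpDig code: the names `make_keyword_hook` and `render_sponsored_column` are hypothetical, and the "ad server" is just an in-memory dictionary standing in for a real one.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical hook type: takes the searched words, returns ad texts.
AdHook = Callable[[List[str]], List[str]]


def make_keyword_hook(banners: Dict[str, str]) -> AdHook:
    """Build a hook for a toy in-memory ad server keyed by banner keyword."""
    def hook(search_words: List[str]) -> List[str]:
        # Compare searched words and banner keywords case-insensitively.
        wanted = {w.lower() for w in search_words}
        return [text for keyword, text in banners.items()
                if keyword.lower() in wanted]
    return hook


def render_sponsored_column(search_words: List[str],
                            hook: Optional[AdHook] = None) -> str:
    """Return the right-hand column, or '' so that the column collapses."""
    if hook is None:      # feature disabled
        return ""
    ads = hook(search_words)
    if not ads:           # no sponsored links for this query
        return ""
    return "Sponsored links:\n" + "\n".join("- " + ad for ad in ads)
```

Swapping in another ad server would then only mean writing another hook with the same signature; the results page itself would not need to change.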

Old 10-06-2003, 01:48 AM   #3
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Re: New Features Inquiry

Quote:
Originally posted by Charter
Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
A feature that I would improve in PhpDig is the calculation of page relevance.
We are studying the algorithms that do that.
If you developed your own algorithm, you have the right skills to help us.

We can work together to define new, more powerful rules to calculate the "ranking" of a page.
We have to develop extensible code so that we can add new features as we figure out Google's algorithms.

Let me know what you think.
Alivin70
Old 10-06-2003, 05:25 AM   #4
Rolandks
Purple Mole
 
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
Hey,
boolean and phrase searching is a good idea.

But I think the current ranking is OK and not so important, because my site statistics show that users often search for one or two words. And Google's ranking algorithms are not interesting for ONE website; what ranking would you create for ONE search word?
"Hook function" and AdServer: hmm, I don't know who needs this and for what. Does it work internationally (US, Europe, etc.)?

My favorite feature is word suggestions in the case of user errors: "documnetation" must find "documentation", and "Downlaod" must find "Download". It works well; the problem is run-together words: "manageroperating" does not suggest "manager operating" the way Google does.

See my "Test Intelligent Php-Dig Fuzzy" in my signature, or this thread for the full story:
http://www.phpdig.net/showthread.php?s=&threadid=77

I think it would not be difficult to include this as tags in the PhpDig results table.
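
One possible way to produce such suggestions, sketched in Python with the standard library's `difflib`. This is not how the linked test page works (its method is not described here); the function name `suggest`, the sample vocabulary, and the cutoff value are assumptions for illustration.

```python
from difflib import get_close_matches


def suggest(word, indexed_words, max_suggestions=3, cutoff=0.75):
    """Suggest indexed words that look like a misspelled search word."""
    # Compare in lower case, but return words as they appear in the index.
    lowered = {w.lower(): w for w in indexed_words}
    if word.lower() in lowered:
        return []  # the word is indexed, so nothing to correct
    hits = get_close_matches(word.lower(), list(lowered),
                             n=max_suggestions, cutoff=cutoff)
    return [lowered[h] for h in hits]


vocabulary = ["documentation", "Download", "manager", "operating"]
print(suggest("documnetation", vocabulary))
print(suggest("Downlaod", vocabulary))
```

A run-together query such as "manageroperating" would still need a separate word-splitting step; similarity matching on whole words alone is unlikely to cover that case.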

Not important

Last edited by Rolandks; 10-06-2003 at 05:27 AM.
Old 10-06-2003, 05:42 AM   #5
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Quote:
Originally posted by Rolandks
Hey,
boolean and phrase searching is a good idea.

But I think the current ranking is OK and not so important, because my site statistics show that users often search for one or two words. And Google's ranking algorithms are not interesting for ONE website; what ranking would you create for ONE search word?
"Hook function" and AdServer: hmm, I don't know who needs this and for what. Does it work internationally (US, Europe, etc.)?

I'm interested in it, and so is Charter, I guess.

Anyway I'll do that and release it GPL for Phpdig


Quote:
My favorite feature is word suggestions in the case of user errors: "documnetation" must find "documentation", and "Downlaod" must find "Download". It works well; the problem is run-together words: "manageroperating" does not suggest "manager operating" the way Google does.

See my "Test Intelligent Php-Dig Fuzzy" in my signature, or this thread for the full story:
http://www.phpdig.net/showthread.php?s=&threadid=77

I think it would not be difficult to include this as tags in the PhpDig results table.

Not important
That's a great idea and I agree with you.
I've already read the thread and seen your test page.
Thanks for letting me discover the nice SOUNDEX() function and related ones. I will help to develop this feature, if I can.
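
For readers who have not met Soundex, here is a small Python sketch of the classic American Soundex rules. It is an illustration only; a built-in SOUNDEX() in a database or language may differ in details such as code length.

```python
# Map each consonant to its Soundex digit; vowels, h, w and y have no digit.
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character classic Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
            if len(code) == 4:
                break
        if c not in "hw":  # h and w do not separate two equal codes
            prev = digit
    return code.ljust(4, "0")


# The misspelling from the thread gets the same code as the correct word.
print(soundex("Download"), soundex("Downlaod"))
```

Because the code is built mostly from consonants, swapped vowels are tolerated well, while swapped consonants can change the code, so a phonetic match is probably best combined with another similarity test.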

The reason I need certain features is that I'm building not a single-site search engine, but a "few sites search engine",
where "few" means 10-20, depending on PhpDig's capacity and speed.

Alivin70

PS: Please read my post about documentation of the code ASAP, so that no work gets lost.
Old 10-15-2003, 08:05 AM   #6
druesome
Orange Mole
 
Join Date: Oct 2003
Posts: 30
Hi All,

I would gladly help in developing an algorithm for PHPDig. I want to find out first though, where in the scripts is the variable $weight being computed? I'm not that satisfied with the current relevance ranking. I want to give more weight/importance to the titles than the text. Thank you.
Old 10-15-2003, 09:05 AM   #7
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Quote:
Originally posted by druesome
Hi All,

I would gladly help in developing an algorithm for PHPDig. I want to find out first though, where in the scripts is the variable $weight being computed? I'm not that satisfied with the current relevance ranking. I want to give more weight/importance to the titles than the text. Thank you.
Hi drue,
I'm also interested in hacking the page weighting, but I haven't started yet.

Maybe the documentation on my website could help you find
the relevant piece of code.
Look at this thread for more details.

I think it could be useful to add more parameters to adjust the weight of a result.
I'm not completely sure, but at the moment it seems possible to change the relative weight of a page if the keyword is found in the title. Looking at config.php, I found:
define('TITLE_WEIGHT',3); //relative title weight

We can add weight for meta keywords or for other parameters.

The best thing to do is to put the weighting method in a function or class that can be developed separately by a person or a team. That function could also be easily customized for special purposes.
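
A rough sketch of what such a separate weighting function might look like, written in Python for brevity. Only the title weight of 3 comes from config.php as quoted above; the `Weights` class, the `meta_keywords` and `body` values, and the simple word counting are assumptions and not PhpDig's actual scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    title: int = 3          # mirrors TITLE_WEIGHT from config.php
    meta_keywords: int = 2  # hypothetical extra parameter
    body: int = 1           # hypothetical baseline for page text


def page_weight(query_words, title, meta_keywords, body, weights=Weights()):
    """Score one page: weighted count of query words in each field."""
    def count(text, word):
        # Count whole-word, case-insensitive occurrences.
        return text.lower().split().count(word.lower())

    total = 0
    for word in query_words:
        total += weights.title * count(title, word)
        total += weights.meta_keywords * count(meta_keywords, word)
        total += weights.body * count(body, word)
    return total
```

Someone who wants titles to dominate could then pass `Weights(title=10)` without touching the rest of the search code.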

In the future we could think about implementing the simplest of Google's page-ranking ideas, for example the weight associated with links: if page A contains a link named "word" to page B and you search for "word" in Google, you will find page B before A, even if page B doesn't contain the keyword "word".
That's reasonable and is the base of Google power!
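
The link-text idea can be sketched in a few lines of Python. This only illustrates the example with pages A and B; it makes no claim about how Google actually ranks pages, and the function name and the tuple layout of `links` are made up for the sketch.

```python
from collections import defaultdict


def anchor_scores(links, query):
    """Credit each target page once per incoming link whose text has the query.

    links: iterable of (source_page, target_page, link_text) tuples.
    """
    scores = defaultdict(int)
    wanted = query.lower()
    for _source, target, link_text in links:
        if wanted in link_text.lower().split():
            scores[target] += 1
    return dict(scores)


links = [("A", "B", "word"), ("C", "B", "another word"), ("A", "C", "home")]
print(anchor_scores(links, "word"))
```

Page B collects credit from the text of links pointing at it, whether or not its own content contains the query word; that credit could then be added to the ordinary page weight.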



bye for now
Alivin70
Old 10-16-2003, 05:16 AM   #8
druesome
Orange Mole
 
Join Date: Oct 2003
Posts: 30
Hey Alvin,

I think I figured out a hack that gives a higher score to a result if the query terms match the title. I will share it with everyone soon, because it's still kind of sloppy, but it does the job and I'm quite happy with it. What I'll try next is to give each site a pagerank, much like Google's, and to make it have some effect on the search results. Later, and wish me luck.
Old 10-16-2003, 05:55 AM   #9
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Quote:
Originally posted by druesome
Hey Alvin,

I think I figured out a hack that gives a higher score to a result if the query terms match the title. I will share it with everyone soon, because it's still kind of sloppy, but it does the job and I'm quite happy with it. What I'll try next is to give each site a pagerank, much like Google's, and to make it have some effect on the search results. Later, and wish me luck.
I wish you lots of luck!

Anyway, what do you mean by pagerank? A number calculated by the spider (Google style) or assigned by the administrator (dumb but simpler)?
Old 10-16-2003, 10:02 PM   #10
sid
Former Member
 
Join Date: Sep 2003
Posts: 34
Hi, I'd like to see the Boolean capabilities and the "" phrase search, please.

Can't wait to see the next version of PHPDIG!
Old 10-27-2003, 08:23 PM   #11
Wayne McBryde
Orange Mole
 
Join Date: Oct 2003
Location: NC, USA
Posts: 34
I would really like to see an option where you install the software, for those of us who know very little about installing scripts on our servers. Of course this option would not be free, but I would pay a reasonable amount to have you install it.

Thanks
__________________
Wayne Mcbryde
http://LakeNormansWeb.com
We search all of Lake Norman!
Old 11-04-2003, 11:31 AM   #12
pittster
Green Mole
 
Join Date: Sep 2003
Posts: 2
I'm thinking of adding a feature to log commonly searched keywords and provide a report that could be emailed or viewed online.

This is beneficial to site administrators so they can make commonly searched for items more visible on the site.

If it is already in the works please let me know
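
A minimal sketch of the logging idea, in Python. It keeps counts in memory only; the class name `SearchLog` and the report format are assumptions, and a real version would store the counts persistently (for example in the search database) and send the report by email or show it online.

```python
from collections import Counter


class SearchLog:
    """Count searched keywords and produce a plain-text report."""

    def __init__(self):
        self._counts = Counter()

    def record(self, query):
        # Count each word of the query, ignoring case.
        for word in query.lower().split():
            self._counts[word] += 1

    def report(self, top=10):
        # One "keyword: count" line per keyword, most searched first.
        return "\n".join(f"{word}: {n}"
                         for word, n in self._counts.most_common(top))


log = SearchLog()
log.record("php search")
log.record("PHP install")
print(log.report(top=1))
```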
Old 11-05-2003, 02:38 AM   #13
drjoju
Green Mole
 
Join Date: Nov 2003
Posts: 2
Hi all!

I think some people are not focusing on the final objective of PhpDig: a search and index engine!!

If you agree that this is the most important objective, then the new features must be:

1.- The boolean capabilities and the "" exact phrase.
2.- Add new file types, if necessary.
3.- The Rolandks idea of word suggestion. Good idea.
4.- Fix bugs and modify the spider to sniff local directories. (It doesn't work for me, or I don't know how to do it.)
5.- Integrate new external engines, wvware for example.
6.- Add a commit hook system to index new files without reindexing.
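
As an illustration of item 1 in the list above, here is a small Python sketch of boolean and exact-phrase matching over already-fetched page text. It scans every document rather than using an index, and the function names and parameters are invented for the sketch; it is not taken from PhpDig or from the commercial script.

```python
import re


def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_phrase(tokens, phrase):
    """True if the words of the phrase occur consecutively in tokens."""
    words = tokenize(phrase)
    n = len(words)
    return n > 0 and any(tokens[i:i + n] == words
                         for i in range(len(tokens) - n + 1))


def search(docs, must=(), must_not=(), phrases=()):
    """Return ids of docs containing all `must` words, none of the
    `must_not` words, and every exact phrase in `phrases`."""
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        present = set(tokens)
        if (all(w.lower() in present for w in must)
                and not any(w.lower() in present for w in must_not)
                and all(has_phrase(tokens, p) for p in phrases)):
            hits.append(doc_id)
    return hits


docs = {1: "PhpDig is a search engine",
        2: "The engine can search files",
        3: "Search engine spider"}
print(search(docs, phrases=["search engine"]))
print(search(docs, must=["engine"], must_not=["spider"]))
```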

As you can see there is a lot of work.

I know that a registered version exists, but I believe in the GPL and open source.

Best regards!
Old 11-05-2003, 02:58 AM   #14
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
Quote:
Originally posted by drjoju
Hi all!
[...]
1.- the boolean capabilities and the "" exact phrase.
2.- Add new file types. If necessary.
3.- The Rolandks idea of word suggestion. Good Idea.
4.- Fix bugs and modify the spider to sniff local directories. (It doesn't work for me, or I don't know how to do it.)
5.- Integrate new external engines. wvware for example.
6.- Add a commit hook system to index new files without reindex.
[...]
I agree, 1) is the most important.

4 is easy: if your web server is not public, configure Apache to give web access to the files you need and spider them with PhpDig.
Be careful with permissions; use .htaccess if you want to protect your directories.

3 is a great idea, but quite difficult. I hope Rolandks will give us good news soon.

6 I proposed that feature and am thinking about its implementation. I will let you know as soon as I have some news.

2 needs some external parser (like for PDF or Word files); you can propose some if you know of any.

5 I didn't understand what you mean ....
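
For point 6 (indexing new files without a full reindex), one simple approach is to remember a hash of each document and only reindex documents whose hash has changed. The Python sketch below shows just that comparison; it is one possible design, not the implementation being planned in this thread, and the function name and data layout are hypothetical.

```python
import hashlib


def changed_documents(documents, seen_hashes):
    """Return the urls whose content is new or changed since the last run.

    documents:   {url: content} as fetched by the spider.
    seen_hashes: {url: sha1 hex digest}, updated in place.
    """
    to_index = []
    for url, content in documents.items():
        digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
        if seen_hashes.get(url) != digest:
            to_index.append(url)       # new or modified: index it
            seen_hashes[url] = digest  # remember for the next run
    return to_index
```

A commit hook would then call this with only the files touched by the commit and pass the returned urls to the indexer.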
Old 11-05-2003, 12:13 PM   #15
drjoju
Green Mole
 
Join Date: Nov 2003
Posts: 2
Hi Alivin70,

With point 5 I meant that other engines exist to parse files, like wvware.sourceforge.net, which parses doc files.

Best regards.
All times are GMT -8.