PhpDig.net

09-09-2004, 02:05 PM   #1
sasa
Green Mole
 
Join Date: Sep 2004
Posts: 2
Google-like search engine

Hi All,

I've gone up and down these forums, and I am a bit confused (nothing new!)

I want to create a search engine for a niche market. There may be several thousand sites in this niche.

1) I want to start by listing a few big ones...
2) have the ability for the people to come and request to be indexed
(the process would have a screen for them to list their URL, and after approval, they would get indexed based on some sort of schedule)
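
As a rough illustration of the submit, approve, then index-on-a-schedule flow described above, here is a minimal Python sketch. Nothing in it comes from PhpDig: the `Submission` class, the `approve` and `due_for_indexing` helpers, and the seven-day interval are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Submission:
    """A site requested for indexing (hypothetical structure, not PhpDig's)."""
    url: str
    approved: bool = False
    last_indexed: Optional[date] = None


def approve(sub: Submission) -> None:
    """Mark a submitted site as approved by the administrator."""
    sub.approved = True


def due_for_indexing(subs: List[Submission], today: date,
                     interval_days: int = 7) -> List[str]:
    """Return URLs of approved sites that were never indexed,
    or were last indexed at least interval_days ago."""
    cutoff = today - timedelta(days=interval_days)
    return [
        s.url for s in subs
        if s.approved and (s.last_indexed is None or s.last_indexed <= cutoff)
    ]


subs = [
    Submission("http://example.com/"),   # submitted, approved below
    Submission("http://example.org/"),   # submitted, still awaiting approval
    Submission("http://example.net/", approved=True,
               last_indexed=date(2004, 9, 8)),  # indexed two days ago
]
approve(subs[0])
due = due_for_indexing(subs, today=date(2004, 9, 10))
print(due)
```

In this sketch only the newly approved site is picked up on the first run; the recently indexed one waits until its interval has passed, and the unapproved one is never crawled.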

So some questions:

a) How is the data stored?
- Does PhpDig store the URL, Title, Description, Keywords of the "crawled" pages in the MySQL database?
- Where is the actual INDEXED content of the pages stored?

b) How much storage is needed?
- i.e. if we have 1000 sites, with 15 pages each... a total of 15,000 pages, How much storage would be needed?

c) How quick is the code?
- Using the above example (15,000 pages), how long would a 2 word search take?

d) And most importantly, has someone put together a modded version for this kind of application?

thanks,
Sam
09-09-2004, 09:11 PM   #2
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Quote:
Originally Posted by sasa
a) How is the data stored?
- Does PhpDig store the URL, Title, Description, Keywords of the "crawled" pages in the MySQL database?
- Where is the actual INDEXED content of the pages stored?
Just go into phpMyAdmin and look at the data structure there. It will show you all the tables and fields inside each.

Quote:
b) How much storage is needed?
- i.e. if we have 1000 sites, with 15 pages each... a total of 15,000 pages, How much storage would be needed?
You can't just go by number of pages. It also depends on the size of those pages, and how many keywords are contained in them. Probably a few other factors too that don't readily come to mind.
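
To make the "it depends" concrete, here is a back-of-envelope Python sketch of how such an estimate could be put together. Every number in it is a placeholder assumption, not a measurement of PhpDig: the average text size per page, the number of distinct keywords per page, and the per-row byte costs are all made up for illustration, and `estimate_index_bytes` is a hypothetical helper.

```python
def estimate_index_bytes(pages: int,
                         avg_text_bytes: int,
                         avg_unique_words: int,
                         bytes_per_posting: int = 16,
                         bytes_per_page_row: int = 512) -> int:
    """Rough storage estimate for a keyword index.

    Assumes (hypothetically) one row per page holding metadata plus a
    text copy, and one small row per (page, keyword) pair.
    """
    page_rows = pages * (bytes_per_page_row + avg_text_bytes)
    postings = pages * avg_unique_words * bytes_per_posting
    return page_rows + postings


# 1000 sites x 15 pages, with assumed averages of 4 KB of text
# and 300 distinct keywords per page.
total = estimate_index_bytes(pages=15_000,
                             avg_text_bytes=4_000,
                             avg_unique_words=300)
print(f"{total / 1_000_000:.1f} MB")
```

Doubling the assumed page size or keyword count changes the result substantially, which is why the page count alone is not enough to answer the question.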

Quote:
c) How quick is the code?
- Using the above example (15,000 pages), how long would a 2 word search take?
See above. Again, it depends.

Quote:
d) And most importantly, has someone put together a modded version for this kind of application?

thanks,
Sam
Don't know, but maybe you should get together with the person who started this thread.
09-10-2004, 09:26 AM   #3
sasa
Green Mole
 
Join Date: Sep 2004
Posts: 2
Dear junkie,

Thanks for the reply. However, you did not give ANY answers.

I do understand that EVERYTHING depends on something else!

I have not downloaded the code or installed it yet. My host is on an "all Windows" platform. All I wanted was to get some estimates before I went and paid for Linux hosting just to try the code.

You seem to know a lot about this code... so here are a few questions:

a) How is the data stored?
- Does PhpDig store the URL, Title, Description, Keywords of the "crawled" pages in the MySQL database?
- Where is the actual INDEXED (text) content of the pages stored?


b) In your own installation, what is the size of the database and how long would a 2 word search take?
09-10-2004, 10:41 AM   #4
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Quote:
Originally Posted by sasa
I have not downloaded the code or installed it yet. My host is on an "all Windows" platform. All I wanted was to get some estimates before I went and paid for Linux hosting just to try the code.
I didn't know you hadn't downloaded the code yet. Otherwise, you'd be able to look at the database structure yourself.

One thing that you might not be aware of is that phpdig will work on a Windows server. However, my own experience has been that it doesn't work very well there. You might have better luck than me though. I have been told that a Windows server's settings can be tweaked so that phpdig will work pretty well, but I've never pursued that myself, so I can't offer you any insight on that.

Quote:
You seem to know a lot about this code... so here are a few questions:
I've got everyone fooled! Seriously, I really don't know that much about the code. I just know where to look up the answers to a lot of questions that are asked here in the forums.

Quote:
a) How is the data stored?
- Does PhpDig store the URL, Title, Description, Keywords of the "crawled" pages in the MySQL database?
Yes, the database stores all those data elements in its tables.

Quote:
- Where is the actual INDEXED (text) content of the pages stored?
I'm not sure exactly what you need to know here. I can't speak for Charter (the forum owner/administrator), but if you're asking how the search engine works, the answer to that is probably beyond the scope of support offered in this forum.
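
For readers who want the general idea, keyword search engines commonly keep an inverted index: one table of pages, one table of distinct words, and a linking table that records which words occur on which pages. The Python sketch below shows that idea with an in-memory SQLite database. It is an illustration only and is not PhpDig's actual schema or code; the table names (`pages`, `keywords`, `postings`) and the functions `index_page` and `search` are hypothetical.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages    (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
CREATE TABLE keywords (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
CREATE TABLE postings (page_id INTEGER, keyword_id INTEGER, hits INTEGER,
                       PRIMARY KEY (page_id, keyword_id));
""")


def index_page(url, title, text):
    """Store the page row, then one posting per distinct word on the page."""
    page_id = conn.execute(
        "INSERT INTO pages (url, title) VALUES (?, ?)", (url, title)
    ).lastrowid
    counts = {}
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    for word, hits in counts.items():
        conn.execute("INSERT OR IGNORE INTO keywords (word) VALUES (?)", (word,))
        kw_id = conn.execute(
            "SELECT id FROM keywords WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute("INSERT INTO postings VALUES (?, ?, ?)",
                     (page_id, kw_id, hits))


def search(words):
    """Return URLs of pages containing ALL the words, most hits first."""
    words = [w.lower() for w in words]
    marks = ",".join("?" * len(words))
    rows = conn.execute(f"""
        SELECT p.url FROM pages p
        JOIN postings po ON po.page_id = p.id
        JOIN keywords k  ON k.id = po.keyword_id
        WHERE k.word IN ({marks})
        GROUP BY p.id
        HAVING COUNT(DISTINCT k.word) = ?
        ORDER BY SUM(po.hits) DESC, p.url
    """, (*words, len(set(words)))).fetchall()
    return [r[0] for r in rows]


index_page("http://example.com/a", "A", "vinyl records and vinyl sleeves")
index_page("http://example.com/b", "B", "vinyl records")
index_page("http://example.com/c", "C", "record players")

results = search(["vinyl", "records"])
print(results)
```

With this layout a two-word search never scans page text; it only looks up two words and intersects their postings, which is the usual reason such lookups stay fast as the number of pages grows.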

Quote:
b) In your own installation, what is the size of the database and how long would a 2 word search take?
My own database has just over 1,500 pages indexed. When I do searches, it takes less than 1 second to retrieve the results.

Hope this answers your questions.
All times are GMT -8.