PhpDig.net

09-10-2004, 09:36 AM   #1
mdrdlp
Green Mole
 
Join Date: Sep 2004
Posts: 3
suggestions?

OK, here is the situation: I have a site that needs a fairly customized search. It's not a traditional spider job like the rest, because I have content that needs to be excluded, so my spidering ability is extremely limited.

I have excluded the sections I don't want PhpDig to index, and I can run the spider. However, those sections are, for the most part, the only way the spider can travel from link to link. (Please don't ask... it's a loooong explanation.)

What I need is this: I have a list (in a spreadsheet, which can easily be dropped into a db) of the exact files, both dynamic and static, that I need spidered and indexed. That list has 1,598 pages. Entering them one at a time would take me forever. Is there any way to feed in the list I already have and automate this process? And if so, how?
09-10-2004, 10:48 AM   #2
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Welcome to the forum, mdrdlp.

If you have the exact URLs in a spreadsheet, just copy and paste them into a text file, then run the spider from the shell. The PhpDig documentation gives an example of how to index specific pages like that.

Hope this helps.
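
For a list of this size, the copy-and-paste step can also be scripted. The following is only a minimal sketch, assuming the spreadsheet has been exported as CSV with a header row; the column name `url`, the sample rows, and the output file name `urls.txt` are hypothetical and not taken from the thread. It keeps one URL per line and drops blanks and duplicates, which is the shape of text file the reply above describes.

```python
import csv
import io


def extract_urls(csv_text, column="url"):
    """Return the unique, non-empty values of one CSV column, in original order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    urls = []
    for row in reader:
        url = (row.get(column) or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_url_list(urls, out_path):
    """Write one URL per line to a plain text file."""
    with open(out_path, "w", encoding="utf-8", newline="\n") as handle:
        for url in urls:
            handle.write(url + "\n")


# Hypothetical sample standing in for the exported spreadsheet.
SAMPLE_CSV = (
    "url,title\n"
    "http://www.my-site-url.com/index.html,Home\n"
    "http://www.my-site-url.com/page.php?id=7,Item\n"
    ",\n"
    "http://www.my-site-url.com/index.html,Home again\n"
)

urls = extract_urls(SAMPLE_CSV)
for url in urls:
    print(url)

# To produce the file for the spider (file name is an assumption):
# write_url_list(urls, "urls.txt")
```

The resulting text file would then be handed to the command-line spider in whatever way the PhpDig documentation describes for indexing specific pages.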
09-10-2004, 10:59 AM   #3
mdrdlp
Green Mole
 
Join Date: Sep 2004
Posts: 3
Just tried that. From the command line, I put in:
php -f httpdocs/find/admin/spider.php http://www.my-site-url.com

and was met with:
PHP Warning: Function registration failed - duplicate name - imap_open in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_popen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_reopen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_close in Unknown on line 0

That list of warnings went on for pages. At the end it says it completed successfully, but nothing was done.
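
Warnings of the form "Function registration failed - duplicate name" often mean that PHP is loading the same extension twice, for example through a repeated `extension=` line in the php.ini that the command-line binary reads. That is only a likely cause, not something confirmed in this thread. As a rough sketch under that assumption, the following scans php.ini text for extensions listed more than once; the sample ini content is made up for illustration.

```python
import re
from collections import Counter

# Matches active lines such as: extension=imap.so  or  extension = "imap.so"
EXTENSION_LINE = re.compile(r'^\s*extension\s*=\s*"?([^\s";]+)"?', re.IGNORECASE)


def duplicate_extensions(ini_text):
    """Return the sorted names of extensions that appear on more than one active line."""
    counts = Counter()
    for line in ini_text.splitlines():
        if line.lstrip().startswith(";"):
            continue  # commented-out directive
        match = EXTENSION_LINE.match(line)
        if match:
            counts[match.group(1).lower()] += 1
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical php.ini excerpt, not taken from the poster's server.
SAMPLE_INI = """\
; sample only
extension=imap.so
extension=mysql.so
extension = imap.so
;extension=gd.so
"""

duplicates = duplicate_extensions(SAMPLE_INI)
print(duplicates)
```

This check would not catch an extension that is compiled into the binary and also loaded from php.ini, so an empty result does not rule out a double load.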
09-10-2004, 11:49 AM   #4
mdrdlp
Green Mole
 
Join Date: Sep 2004
Posts: 3
There is always an easy way to do things... sometimes you just have to break it to find it.

I did a sample run on a couple of URLs to see what data was being stored in the db, and where. Then I took the 'spider' table, dumped it locally, pasted in my URL list, gave the rows IDs, and pushed it back up. After that I just went to the web admin interface, hit the update site link... and done.

Thank you for trying to help, though. I sincerely appreciate it!
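
The paste-into-the-table step described above could also be generated from the URL list. This is a sketch only: the thread names the 'spider' table but not its layout, so the column names `site_id`, `path`, and `file`, the site ID value, and the way each URL is split are all assumptions that would need checking against a dump of the real table first. It also assumes the ID column fills itself in automatically, whereas the poster assigned IDs by hand.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (host_url, directory_path, file_with_query)."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    path = directory.lstrip("/")
    if path:
        path += "/"
    if parts.query:
        filename += "?" + parts.query
    return f"{parts.scheme}://{parts.netloc}/", path, filename


def sql_quote(value):
    """Quote a string literal for SQL by escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def build_inserts(urls, site_id, table="spider"):
    """Build one INSERT statement per URL (column names are assumed, see above)."""
    statements = []
    for url in urls:
        _host, path, filename = split_url(url)
        statements.append(
            f"INSERT INTO {table} (site_id, path, file) "
            f"VALUES ({int(site_id)}, {sql_quote(path)}, {sql_quote(filename)});"
        )
    return statements


# Hypothetical URLs and site ID for illustration.
URLS = [
    "http://www.my-site-url.com/index.html",
    "http://www.my-site-url.com/docs/page.php?id=7",
]

statements = build_inserts(URLS, site_id=1)
for statement in statements:
    print(statement)
```

As in the post, the rows on their own index nothing; the update from the web admin interface is still what makes PhpDig fetch and index the listed pages.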
All times are GMT -8.
Copyright © 2001 - 2005, ThinkDing LLC. All Rights Reserved.