|
09-10-2004, 09:36 AM | #1 |
Green Mole
Join Date: Sep 2004
Posts: 3
|
suggestions?
OK, here is the situation: I have a site that needs a fairly customized search. It's not a traditional spider job, because some content has to be excluded, so my spidering ability is extremely limited.
So I have excluded the sections I don't want phpdig to index, and that runs fine. However, those sections are, for the most part, the only way the spider can travel from link to link (please don't ask, it's a long explanation). What I need is this: I have a list, in a spreadsheet that can easily be dropped into a db, of the exact files (both dynamic and static) that I need spidered and indexed. That list has 1,598 pages, and entering them one at a time would take me forever. Is there any way to feed in the list I already have and automate this process? And if so, how? |
09-10-2004, 10:48 AM | #2 |
Purple Mole
Join Date: Jan 2004
Posts: 694
|
Welcome to the forum, mdrdlp.
If you have the exact URLs in a spreadsheet, just copy and paste them into a text file, one URL per line, then run the spider from the shell. The phpdig documentation gives an example of how to index specific pages like that. Hope this helps. |
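If copying and pasting 1,598 rows by hand is a pain, the spreadsheet can be exported to CSV and turned into the text file with a few lines of script. The following is only a rough sketch under some assumptions: the file names `pages.csv` and `urls.txt`, the function names, and the idea that the URLs sit in the first column are all made up for illustration and are not part of phpdig.

```python
import csv
import io
import sys


def extract_urls(csv_text, column=0):
    """Return the unique http(s) URLs found in one column of CSV text, in order."""
    seen = set()
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= column:
            continue  # blank or short row
        url = row[column].strip()
        # Keep only absolute URLs; this also drops a header row such as "url".
        if not url.lower().startswith(("http://", "https://")):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_url_list(urls, path):
    """Write one URL per line to a plain text file."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for url in urls:
            fh.write(url + "\n")


# Hypothetical usage: python make_url_list.py pages.csv urls.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8", newline="") as fh:
        found = extract_urls(fh.read())
    write_url_list(found, sys.argv[2])
    print("wrote {} URLs to {}".format(len(found), sys.argv[2]))
```

The resulting text file is what you would hand to the spider from the shell, following the example in the phpdig documentation. Duplicates are dropped so the same page is not queued twice.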
09-10-2004, 10:59 AM | #3 |
Green Mole
Join Date: Sep 2004
Posts: 3
|
Just tried that. From the command line, I ran:
php -f httpdocs/find/admin/spider.php http://www.my-site-url.com
and was met with:
PHP Warning: Function registration failed - duplicate name - imap_open in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_popen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_reopen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_close in Unknown on line 0
That error list went on for pages. It says it completed successfully at the end, but nothing was done. |
09-10-2004, 11:49 AM | #4 |
Green Mole
Join Date: Sep 2004
Posts: 3
|
There is always an easy way to do things; sometimes you just have to break it to find it.
I did a sample run on a couple of URLs to see what data was being stored in the db, and where. Then I took the 'spider' table, dumped it locally, pasted my URL list into it, gave the rows IDs, and pushed it back up. After that I just went to the web admin interface, hit the update site link, and it was done. Thank you for trying to help, though. I sincerely appreciate it! |
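For anyone wanting to script the "paste the list in and give the rows IDs" step rather than doing it by hand, here is a minimal sketch that turns a URL list into INSERT statements. Everything specific in it is an assumption: the column names (`spider_id`, `site_id`, `path`, `file`), the way each URL is split into a directory and a file name, and the starting ID are placeholders, so compare them against the rows from your own sample run and dump before using anything like this.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (directory path, file name plus query string)."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.lstrip("/").rpartition("/")
    if directory:
        directory += "/"
    if parts.query:
        filename += "?" + parts.query
    return directory, filename


def sql_quote(value):
    """Quote a string as a SQL literal (escapes backslashes and single quotes)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def build_inserts(urls, site_id, first_id, table="spider",
                  columns=("spider_id", "site_id", "path", "file")):
    """Build one INSERT per URL. Table and column names are assumed, not verified."""
    statements = []
    for offset, url in enumerate(urls):
        path, filename = split_url(url)
        values = (
            str(first_id + offset),  # sequential IDs starting at first_id
            str(site_id),
            sql_quote(path),
            sql_quote(filename),
        )
        statements.append(
            "INSERT INTO {} ({}) VALUES ({});".format(
                table, ", ".join(columns), ", ".join(values)
            )
        )
    return statements
```

Pick `first_id` above the highest ID already in the dumped table so the new rows do not collide with existing ones, and run the output against a copy of the database first.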