Problems with URL parsing
Hi,
I'm trying out phpDig, and it looks very good. However, when I index my site, the following happens. URLs like:
http://www.webthings.nl/archive/2003...ewoon_een_hype(2)#body
are rewritten to:
http://www.webthings.nl/archive/2003...ewoon_een_hype
and that doesn't work. How can I fix this?
Thanks to the programmer! I searched the web, and phpDig was one of the best I could find.
Greets, Arjan |
Hi. I'm not sure I understand the problem. When I index http://www.webthings.nl/archive/200...gewoon_een_hype(2)#body using one level I get the following results:
--------------------------------------------------------------------------------
SITE : http://www.webthings.nl/
Exclude paths :
- @NONE@
1:http://www.webthings.nl/archive/200...gewoon_een_hype(2)
(time : 00:00:04)
+ level 1...
Duplicate of an existing document
2:http://www.webthings.nl/archive/webthings_stylesheet.css
(time : 00:00:06)
No link in temporary table
--------------------------------------------------------------------------------
links found : 2
http://www.webthings.nl/archive/200...gewoon_een_hype(2)
http://www.webthings.nl/archive/webthings_stylesheet.css
Optimizing tables...
Indexing complete !

Then, when I search on realhosting, I see the following results:
1. [100.00 %] webthings/webdesign/webdesign nieuws
limit to http://www.webthings.nl/, this path : archive/
...2003 - Eduvision BV en Van Duuren Media - Hosting by Realhosting webthings/webdesign/webdesign nieuws webthings/webdesign/webdesign nieuws...

When I click the link, it takes me to http://www.webthings.nl/archive/200...gewoon_een_hype(2) and I see your page. When you do the same, what do you see? |
Hi Ruud,
I get the following:
Warning: is_executable() [function.is-executable]: open_basedir restriction in effect. File(/usr/local/bin/pstotext) is not within the allowed path(s): (/vhost/webthings.nl/home) in /vhost/webthings.nl/home/www/html/zoek/admin/robot_functions.php on line 635
Duplicate of an existing document
6:http://www.webthings.nl/archive/2003...als_paypalmail
(time : 00:00:03)
(see last line)

PhpDig has found the following:
links found : 9
http://www.webthings.nl/
http://www.webthings.nl/archive/2003...ewoon_een_hype
http://www.webthings.nl/pivot/submit...hings&group=k_
http://www.webthings.nl/pivot/submit...hings&group=k_
http://www.webthings.nl/webthings/ar...e_2003-m11.php
http://www.webthings.nl/archive/2003...als_paypalmail
http://www.webthings.nl/pivot/submit...hings&group=k_
http://www.webthings.nl/pivot/submit...hings&group=k_
http://www.webthings.nl/pivot/kortni...p?wtk=selected

As you can see, it will not index the last (5) etc. Strange that it works in your configuration. I use the standard config (with Apache 1.3.27 and PHP 4.3.1). Any ideas? What am I doing wrong?
Greets, Arjan |
Hi. Try installing PhpDig inside the directory that open_basedir allows. You can find this directory by looking at your PHP info ( <?php phpinfo(); ?> ) or by asking your host. Also, try changing the path to pstotext. If you have shell access and can use the locate command, you can find the correct path to pstotext ( locate pstotext ); otherwise, try asking your host. Failing that, grab a copy of pstotext, place it in the open_basedir directory, and use that path. If is_executable continues to give you problems, you can set USE_IS_EXECUTABLE_COMMAND to zero in the config file.
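The open_basedir check described above can be sketched roughly as follows. This is only an approximation for illustration, not PHP's exact matching rules and not PhpDig code; `path_within_basedir` is a hypothetical helper, the first two paths come from the warning quoted earlier in the thread, and the third path is made up.

```php
<?php
// Hypothetical helper: rough test of whether $path lies under one of the
// directories in an open_basedir-style list. Simplified approximation only.
function path_within_basedir($path, $openBasedir)
{
    if ($openBasedir === '' || $openBasedir === false) {
        return true; // no restriction configured
    }
    foreach (explode(PATH_SEPARATOR, $openBasedir) as $dir) {
        if ($dir === '') {
            continue; // ignore empty entries
        }
        $dir = rtrim($dir, '/') . '/';
        if (strpos($path, $dir) === 0) {
            return true; // path starts with an allowed directory
        }
    }
    return false;
}

// Values taken from the warning message in this thread.
$allowed  = '/vhost/webthings.nl/home';
$pstotext = '/usr/local/bin/pstotext';

var_dump(path_within_basedir($pstotext, $allowed)); // bool(false)

// Assumed location for a private copy of pstotext (made-up path).
var_dump(path_within_basedir('/vhost/webthings.nl/home/bin/pstotext', $allowed)); // bool(true)

// The restriction actually in effect on the current server, if any.
echo 'open_basedir = ', var_export(ini_get('open_basedir'), true), "\n";
```

The first result mirrors the warning: /usr/local/bin/pstotext is outside the allowed path, so is_executable() is refused there.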
|
Hi,
Thanks for the answer. However, the problem is not that the executables don't work; I don't want to index PDFs etc. anyway. The problem is that the system indexes URLs like http://www.webthings.nl/archive/2003...als_paypalmail(1) as
6:http://www.webthings.nl/archive/2003...als_paypalmail
(time : 00:00:02)
so the (1) is lost. I saw in your output that it is indexed correctly on your server, but it isn't here, and I have no idea why. Do I need to change something in my config file?
Greets, Arjan |
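To make the expected behaviour concrete: a crawler normally drops only the fragment (the #body part) and keeps the rest of the path, including a trailing "(1)" or "(2)". Here is a minimal sketch in plain PHP. It is not PhpDig's own code; `strip_fragment` is a hypothetical helper and the example.com URL is a made-up stand-in with the same shape as the URLs above.

```php
<?php
// Hypothetical helper: remove only the #fragment from a URL.
function strip_fragment($url)
{
    $pos = strpos($url, '#');
    return $pos === false ? $url : substr($url, 0, $pos);
}

// Made-up URL with parentheses in the path and a fragment.
$url = 'http://www.example.com/archive/some_page(2)#body';

echo strip_fragment($url), "\n"; // http://www.example.com/archive/some_page(2)

// parse_url() also keeps the parentheses as part of the path.
$parts = parse_url($url);
echo $parts['path'], "\n";     // /archive/some_page(2)
echo $parts['fragment'], "\n"; // body
```

If the indexed URL loses the "(2)" as well, something other than fragment removal is rewriting it.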
Hi. Apache 1.3.27 and PHP 4.3.1 under what OS?
What do you see when you run the following: PHP Code:
Code:
Array |
Hi. I am afraid I see the same... It seems that is not the problem. Any other ideas?
Greets, Arjan |