PhpDig.net

Old 04-01-2004, 03:17 AM   #1
bigals
Orange Mole
 
Join Date: Nov 2003
Posts: 41
Converted from HTML pages to PHP pages, now no pages will index. Help!

I have recently converted all my HTML pages into PHP pages, and now PhpDig will not index any of them at all!

The pages are extremely important and need indexing, so how do I sort this out? Each page is only a small piece of code that calls up a template page, which is then populated with data, and PhpDig doesn't seem to be able to spider these pages now. Can anyone explain a way round this?

cheers,

Alex.
Old 04-01-2004, 03:34 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. What is the code from one of these PHP files? Does it have a header redirect? If so, try the ZIP file in this thread.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 04-01-2004, 03:37 AM   #3
bigals
The code in each of the files is as follows:

PHP Code:
<?php
// Strip the file name from the current script location
$path = dirname($_SERVER['PHP_SELF']);

// Explode out the directories from the path
$dirs = explode("/", $path);
$numdirs = count($dirs) - 1;

// Directory closest to the PHP page
$region = $dirs[$numdirs];

// Directory before the region directory
$country = $dirs[$numdirs - 1];

// Fetch the populated template over HTTP and output it
$url = "http://www.mysite.com/templates/region_template.php";
$url .= "?region=" . urlencode($region) . "&country=" . urlencode($country);
$file_output = file_get_contents($url);
echo $file_output;
?>
That's all that is in each index.php file, so how can these be indexed?

I'm not very knowledgeable, so could you possibly explain?

thanks.
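
To make the directory parsing in the index.php code above concrete, here is a minimal sketch that moves the same logic into a function so it can be checked without a web server. The function name `region_and_country` and the sample path are assumptions for illustration; they are not part of the original files.

```php
<?php
// Hypothetical helper (not in the original index.php files): derive the
// region and country from a script path such as $_SERVER['PHP_SELF'].
function region_and_country($scriptPath)
{
    // Drop the file name, leaving only the directory portion.
    $path = dirname($scriptPath);

    // Split the directory portion into its components.
    $dirs = explode("/", $path);
    $numdirs = count($dirs) - 1;

    // The last directory is the region; the one before it is the country.
    $region = $dirs[$numdirs];
    $country = $numdirs > 0 ? $dirs[$numdirs - 1] : "";

    return array($region, $country);
}

list($region, $country) = region_and_country(
    "/database/world/uk/england/west_midlands/index.php"
);
echo "region=" . urlencode($region) . "&country=" . urlencode($country) . "\n";
```

For the sample path this yields the region `west_midlands` and the country `england`, which is the query string the template URL is expected to receive.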
Old 04-01-2004, 04:12 AM   #4
Charter
Hi. Are there links to these PHP files?
Old 04-01-2004, 04:20 AM   #5
bigals
The links to the pages are generated by the pages themselves.

It's a directory of the UK.

For example, an index page placed in a county folder will create the links to all the towns and cities within that county. These links would be something like:

leicestershire/leicester/index.php

So the links only exist after the PHP page has been compiled. I think I read somewhere that PhpDig compiles all the PHP and then spiders it afterwards.

The index.php pages become HTML in content, but only when compiled.

Hope that helps explain it!
Old 04-01-2004, 04:26 AM   #6
Charter
Hi. I mean when you spider, are you spidering a page that has links to these PHP files, like a directory listing?

BTW, PhpDig doesn't compile PHP; it's compiled server-side. PhpDig checks and reads what is output from the server.
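
Since a spider only sees what the server sends back, the links it can follow are whatever appears in that output. The following is a rough sketch of that idea, not PhpDig's actual code: the function name `extract_links` and the sample HTML are assumptions, and a simple regular expression stands in for real HTML parsing.

```php
<?php
// Illustration only: this is NOT how PhpDig is implemented.
// Collect href values from the HTML a server returned for a page.
function extract_links($html)
{
    $links = array();
    // Match href="..." or href='...' inside anchor tags (simplified).
    if (preg_match_all('/<a\s[^>]*href\s*=\s*(["\'])(.*?)\1/i', $html, $matches)) {
        $links = $matches[2];
    }
    return $links;
}

// Hypothetical output of a county index.php after the server has run it.
$html = '<ul><li><a href="leicester/index.php">Leicester</a></li>'
      . '<li><a href="loughborough/index.php">Loughborough</a></li></ul>';

foreach (extract_links($html) as $link) {
    echo $link . "\n";
}
```

If the output of a page contains no anchor tags, a function like this returns an empty list, and a spider would have nothing further to follow from that page.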
Old 04-01-2004, 04:33 AM   #7
bigals
No, I'm just spidering the main folder, i.e.

http://www.mysite.com/database/world/uk/

The first page it will find is an index.php page, which displays the countries within the UK. It's laid out like so:

world/
--------uk/index.php
--------uk/england/index.php
--------uk/england/west_midlands/index.php
Old 04-01-2004, 04:48 AM   #8
Charter
Hi. So does the http://www.mysite.com/database/world/uk/england/west_midlands/index.php page call up the http://www.mysite.com/templates/region_template.php?region=west_midlands&country=england page? If so, what do you get when you uncomment //print $answer."<br>\n"; from the robot_functions.php file and then index?
Old 04-01-2004, 05:03 AM   #9
bigals
Yes, that is exactly what happens.

I tried uncommenting that line, and it indexed just the main HTML pages as before, missing out the entire database folder (as this consists only of these index.php files and folders).

It was returning strange info like this:

Quote:
5:http://www.mysite.com/features/featurepage.html
(time : 00:00:17)
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Date: Thu, 01 Apr 2004 12:56:48 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_jk mod_ssl/2.8.4 OpenSSL/0.9.6 PHP/4.3.0 FrontPage/5.0.2.2510 mod_auth_pam_external/0.1 mod_perl/1.26
Last-Modified: Mon, 29 Mar 2004 11:21:42 GMT
ETag: "180433c-2156-406806c6"
Accept-Ranges: bytes
Content-Length: 8534
Content-Type: text/css
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Date: Thu, 01 Apr 2004 12:56:48 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_jk mod_ssl/2.8.4 OpenSSL/0.9.6 PHP/4.3.0 FrontPage/5.0.2.2510 mod_auth_pam_external/0.1 mod_perl/1.26
Last-Modified: Mon, 29 Mar 2004 11:21:42 GMT
ETag: "180433c-2156-406806c6"
Accept-Ranges: bytes
Content-Length: 8534
Content-Type: text/css
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
Each of these blocks ends with HTTP/1.1 404 Not Found repeated a few times, as you can see above.

I've commented that line out again, so it is back to the way it was.
Old 04-01-2004, 05:16 AM   #10
Charter
Hi. The 404 means PhpDig is not finding the pages. Are you using a base href tag? If so, there is some code in this thread to account for base href tags.
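
When reading debug output like the block quoted earlier in the thread, it can help to pull the status code out of each status line. This is a small sketch under stated assumptions: `status_code` is a hypothetical helper, and the sample lines are copied from the output posted above.

```php
<?php
// Hypothetical helper: return the numeric status code from an HTTP
// status line, or null if the line is not a status line.
function status_code($line)
{
    if (preg_match('#^HTTP/\d+\.\d+\s+(\d{3})\b#', trim($line), $m)) {
        return (int) $m[1];
    }
    return null;
}

// Sample lines taken from the debug output posted in the thread.
$lines = array(
    "HTTP/1.1 404 Not Found",
    "HTTP/1.1 200 OK",
    "Content-Type: text/css",
);

foreach ($lines as $line) {
    $code = status_code($line);
    if ($code === 404) {
        echo "not found: the server has nothing at the requested URL\n";
    } elseif ($code !== null) {
        echo "status " . $code . "\n";
    }
}
```

Header lines such as `Content-Type` are skipped, so only the status lines are reported.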
Old 04-01-2004, 05:24 AM   #11
bigals
No, I don't think I'm using base href. I'm not sure exactly what that means, but if I search for <BASE HREF in my template pages nothing is returned, so it isn't in any of my pages.

Argh, this is getting confusing!
Old 04-01-2004, 05:43 AM   #12
Charter
Hi. It seems that there may be a mislink somewhere in the new PHP code, maybe dealing with the $_SERVER['PHP_SELF'] variable. What do you get onscreen when you try the following?

In robot_functions.php right after:
PHP Code:
//print $answer."<br>\n"; 
stick the following:
PHP Code:
echo "Page: ".$host.$path."<br>\n";
and see what pages are generating the 404s on index.
Old 04-01-2004, 06:18 AM   #13
bigals
I get all of this output. The double // looks a bit suspicious, and then it goes back to a single /:

Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
+ Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//
Old 04-01-2004, 06:40 AM   #14
Charter
Hi. The double slash is okay. It's removed when it needs to be removed. Maybe the thing to notice is that none of the pages have things like uk/england/west_midlands/index.php in them. Without actually seeing/testing your site, I doubt that I can get this narrowed down.
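
As an illustration of how repeated slashes in a path can be collapsed before a request is made, here is a minimal sketch. It is not taken from PhpDig; `collapse_slashes` is a hypothetical name, and the sample values come from the output posted above.

```php
<?php
// Illustration only, not PhpDig's code: collapse runs of "/" in a path
// into a single "/". Apply it to the path only, never to a full URL,
// or the "//" after "http:" would be collapsed as well.
function collapse_slashes($path)
{
    return preg_replace('#/{2,}#', '/', $path);
}

echo collapse_slashes("/database/world//") . "\n";
echo collapse_slashes("//") . "\n";
```

Paths that are already clean pass through unchanged.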
Old 04-01-2004, 06:49 AM   #15
bigals
OK, well here's one of the template pages that an index.php page is replaced with.

This might be more help to you, as you can then see how things are accessed by my pages. All the templates work the same way.

I hope this helps!

See the attachment; it's a PHP file.
Attached Files
File Type: txt country_template.txt (29.6 KB, 21 views)
All times are GMT -8.