PhpDig.net


RedThypon 11-26-2003 02:50 PM

Can't index a table construct
 
Hello,

I'm using:
PhpDig Version 1.6.4
Php Version 4.3.2
Apache Version 2.0.46
Linux


My problem is that PhpDig can't find words that are inside a table construct.

For example, the HTML code looks like this:
<table><tr>
<td>word</td>
<td second word</td>
</tr></table>

If I let PhpDig search for "word" or "second", I get the message "no results found".

I have 3 or 4 pages which have tables inside, how can I get PhpDig to index them correctly and find the words inside the tables?


Thank you for your answers

yours
RedThypoon

Charter 11-26-2003 03:26 PM

Code:

<table><tr>
<td>word</td>
<td second word</td>
</tr></table>

Hi. Maybe just a typo but can you post the HTML here for a look? Also, does 'word' happen to be in the common words file?

RedThypon 11-26-2003 03:43 PM

:)
'word' is just a placeholder for some text.

I can't post the whole code, it is too much,
but the whole code is validated by W3C.
So this is the code showing only the table.

If you would like to see the whole code, visit
http://www.redthypoon.de/walrus
and choose "On Stage" from the main menu and then "Marktplatz" from the menu in the window.

Here's the code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">

<head>
<title>Walrus Kultur e. V.</title>
<meta http-equiv="Content-Style-Type" content="text/css" />
<link rel="stylesheet" type="text/css" href="includes/style.css" />
<link href="http://www.walrus-kultur-ev.de/favicon.ico" rel="SHORTCUT ICON" />

</head>

<body>
<div id="body" style="color:#ff0000; background:url(<?php echo $pfad; ?>images/onstage.jpg);">
<div class="titel">
Marktplatz</div>
<div id="inhalt">
<div style="margin:0px 20px 0px 0px; text-align:right;">Stand: 30.09.2003</div>
<div style="margin:10px 0px 0px 0px; color:white;">
<table border="0" cellspacing="3">
<colgroup>
<col width="90" />
<col width="92" />
<col width="200" />
<col width="295" />
<col width="112" />
</colgroup>
<tr>
<td style="color:#ffff00; background:#ff0000; font-size:1.3em;" colspan="5">gesucht wird</td>
</tr>

<tr style="background:#808080; font-weight:bold">
<td><b>Chiffré-Nr.</b></td>
<td><b>Datum</b></td>
<td><b>Bezeichnung</b></td>
<td><b>Beschreibung</b></td>
<td><b>Kontakt</b></td>
</tr>
<tr style="vertical-align:top;">
<td>2-b-ons</td>
<td></td>
<td>Rivera Gitarrenverstärker</td>
<td>im SKS-Case. Einbau in 19"-Rack möglich. Edler 100 Watt Gitarrenverstärker aus den USA ------- VHB 900,- &euro;</td>
<td><a href="mailto:bla@bla.de">bla@bla.de</a></td>
</tr>
</table>
</div>
</div>
</div>
</body>
</html>


thank you for your help

yours
RedThypoon

Charter 11-26-2003 04:02 PM

Hi. Yes, I understand. ;)

Maybe there is a typo in the HTML that is causing that block to be ignored. Can you post the HTML?

RedThypon 11-26-2003 04:05 PM

Sorry, I forgot.
I have edited my post.

Charter 11-26-2003 04:22 PM

Hi. I just indexed http://www.redthypoon.de/walrus/index.php?mnuid=198 at one level and then searched for the word 'Kontakt' and obtained 14 results. What word(s) do not show up in your search?

RedThypon 11-27-2003 05:28 AM

These words don't show up:
Rivera
Gitarre
Gitarrenverstärker

Another page with this problem is:
http://www.redthypoon.de/walrus/index.php?mnuid=189

There it doesn't find the names of the people or their functions, like:
Sascha Schabacker
Vorsitzender
Kasse


thank you

yours
RedThypoon

Charter 11-27-2003 06:57 AM

Hi. I crawled the link in your last post and can find, for example, Vorsitzender but I cannot find Litfaßsäule when I do a 'words begin' or 'exact words' search. However, when I do an 'any words part' search for Litfaßsäule, I get Litfaßsäule in the results. Please apply the patch in this thread to fix the highlighting issue, but this does seem like a character encoding problem. I'll need to do more checking on this issue. Thanks for bringing it to my attention.

Charter 11-27-2003 07:25 AM

Hi. I figured out the Litfaßsäule issue. The character ß was not allowed in the searches. My bad! As a temporary fix, do the following. I'll come up with something better in the next release.

In search_function.php find:
PHP Code:

if (eregi("[^[:alnum:]^ +^-]+",$query_to_parse)) { $query_to_parse = eregi_replace("[^[:alnum:]^ ]+"," ",$query_to_parse); }

and replace with:
PHP Code:

if (eregi("[^[:alnum:]^ +^-^ß]+",$query_to_parse)) { $query_to_parse = eregi_replace("[^[:alnum:]^ ]+"," ",$query_to_parse); }
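
For readers wondering why the missing ß mattered, here is a minimal sketch in Python rather than PhpDig's actual PHP. It assumes a simplified whitelist (ASCII letters, digits, German umlauts, and space) standing in for the locale-dependent [:alnum:] class. The function name clean_query and its extra_allowed parameter are made up for illustration.

```python
import re

# Assumed stand-in for the search filter's whitelist; not PhpDig's real pattern.
BASE_ALLOWED = "A-Za-z0-9ÄÖÜäöü "

def clean_query(query, extra_allowed=""):
    """Replace each run of characters outside the whitelist with one space."""
    pattern = "[^" + BASE_ALLOWED + re.escape(extra_allowed) + "]+"
    return re.sub(pattern, " ", query)

print(clean_query("Litfaßsäule"))        # ß is not whitelisted, so the word is split
print(clean_query("Litfaßsäule", "ß"))   # ß is whitelisted, so the word survives
```

With ß outside the whitelist the query turns into two fragments, which may explain why an 'exact words' or 'words begin' search missed the stored keyword while an 'any words part' search still matched.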

This still doesn't answer why Vorsitzender shows in searches for me but not for you. Now I'm thinking this is not a character encoding issue, but rather something to do with stored keywords.

When you run the below query what do you get?
Code:

SELECT * FROM keywords WHERE keyword like 'vo%';

RedThypon 11-27-2003 07:36 AM

Hi, thanks for the solution for the ß.

You can find Vorsitzender because it appears on two pages.
The word Vorsitzender is also on this page:
http://www.redthypoon.de/walrus/index.php?mnuid=189

and this is the problem I mentioned first. PhpDig can't find this page; it finds only the second page. I suppose that is because Vorsitzender is inside a table construct on the page it can't find.

When I run the SQL-Code I get this:
key_id  twoletters  keyword
3577    vo          voices
3545    vo          volker
3298    vo          voll
3643    vo          vordergrund
3538    vo          vorerst
3121    vo          vorname
3045    vo          vorsitzender
3037    vo          vorstand
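
Note that this result only shows that the keyword is stored, not which page it was stored for. Here is a rough sketch of that distinction, using Python's sqlite3 with an in-memory database. Only the keywords columns come from the output above. The engine and spider tables, their columns, and the page names are assumptions for illustration and may not match PhpDig's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords (key_id INTEGER PRIMARY KEY, twoletters TEXT, keyword TEXT);
CREATE TABLE spider (spider_id INTEGER PRIMARY KEY, path TEXT);  -- assumed layout
CREATE TABLE engine (spider_id INTEGER, key_id INTEGER);         -- assumed layout
INSERT INTO keywords VALUES (3045, 'vo', 'vorsitzender');
INSERT INTO spider VALUES (1, 'page-a'), (2, 'page-b');          -- placeholder pages
INSERT INTO engine VALUES (2, 3045);  -- only page-b is linked to the keyword
""")

def pages_for(keyword):
    """List the pages that are linked to an exact keyword."""
    rows = conn.execute(
        "SELECT s.path FROM keywords k "
        "JOIN engine e ON e.key_id = k.key_id "
        "JOIN spider s ON s.spider_id = e.spider_id "
        "WHERE k.keyword = ? ORDER BY s.path",
        (keyword,),
    )
    return [path for (path,) in rows]

print(pages_for("vorsitzender"))
```

In this toy data the keyword exists once but is linked to only one of the two pages, which is the situation described here: the word is found, just not on the page in question.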


Thank you for your help

yours
RedThypoon

Charter 11-27-2003 07:42 AM

Hi. I am able to find Schabacker so I don't think it's the table-construct. Hmm, I wonder what's different.

RedThypon 11-27-2003 07:45 AM

Sorry, you are too fast for me, or I don't think before I write :).

Please read my post above your last one again; I edited it.

Never mind the word Schabacker. It is on the same pages as Vorsitzender, so it is the same problem.

thanks

Charter 11-27-2003 07:57 AM

Hi. Can you make a page like so and then crawl it?
Code:

<html>
<body>
Rivera Gitarre Gitarrenverstärker Sascha Schabacker Vorsitzender Kasse
</body>
</html>

Do you get search results with this simple page?

RedThypon 11-27-2003 08:18 AM

Yes, with this simple page it finds the words.

Charter 11-27-2003 08:52 AM

1 Attachment(s)
Hi. Attached is a screenshot of the http://www.redthypoon.de/walrus/index.php?mnuid=189 page. Does the page look the same as it does in your browser?

When you crawl this site, do you get any 'duplicate' page notices?

RedThypon 11-27-2003 08:55 AM

Yes, it looks the same, and yes I get the 'duplicate' notices

Charter 11-27-2003 08:57 AM

Are the duplicate notices for the pages that contain the words you cannot find?

RedThypon 11-27-2003 09:06 AM

No, this was the first thing I had a look at.

Charter 11-27-2003 09:11 AM

What's the link to your PhpDig search page?

RedThypon 11-27-2003 09:41 AM

Do you mean on RedThypoon.de/walrus?

There is no search page yet; I develop the site on my local system. Yesterday I had no time to upload it, but I will do so now.
The only difference between the local and the online version is that PhpDig is installed only on the local one.

I will upload the page now and post a message when it's ready.


thanks

yours redthypoon

Charter 11-27-2003 10:06 AM

Okay, thanks. Maybe I will see something when I search your site.

You probably already checked, but did all of these links get indexed? Also, how many text files show when you do grep Vorsitzender * in the text_content directory?

RedThypon 11-27-2003 10:18 AM

Ohhhhhh aha,

I don't know what to say.

I uploaded the page and let PhpDig crawl online.

Tadaa, it finds every word.

I am very sorry, because I have wasted your time :(

The next thing I will do is find out why it doesn't work on my local server, to avoid such mistakes in the future.

Please excuse me.

You do great work, and I am really glad about your support.

I hope that I can get your help again in the future, after this mistake.

From now on, I will test the homepage online.
Shame on me!

Thank you for everything

yours
RedThypoon

Charter 11-27-2003 10:24 AM

Hi. No problem at all. Besides, your posts led me to the issue with ß. If you find out why it didn't work on your local server, please post your findings. Others might have the same problem, and your findings could help them. :)

RedThypon 11-27-2003 10:45 AM

Hmm,

more shame on me, I found my mistake.

I used the PHPDIG_EXCLUDE_COMMENT and the PHPDIG_INCLUDE_COMMENT, and I was sure that I had set them at the right positions in my code.
I removed them before I uploaded the page.

I used them because I wanted to stop PhpDig from indexing the submenu (the menu in the window). I think it was too late in the day for me yesterday. Next time I had better sleep on it and think about what I am doing.
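
For anyone who runs into the same thing, here is a minimal Python sketch of how such marker comments can hide content. The marker strings and the rule that a marker must sit alone on its line are assumptions here, not taken from PhpDig's source, so check the values in your own config file.

```python
EXCLUDE = "<!-- phpdigExclude -->"  # assumed value of PHPDIG_EXCLUDE_COMMENT
INCLUDE = "<!-- phpdigInclude -->"  # assumed value of PHPDIG_INCLUDE_COMMENT

def indexable_lines(html):
    """Return the lines a crawler would keep, skipping excluded regions."""
    keep = True
    kept = []
    for line in html.splitlines():
        marker = line.strip()
        if marker == EXCLUDE:
            keep = False
        elif marker == INCLUDE:
            keep = True
        elif keep:
            kept.append(line)
    return kept

# Include marker placed after the table by mistake: the table is skipped too.
misplaced = "\n".join([
    EXCLUDE,
    "<ul><li>submenu</li></ul>",
    "<table><tr><td>Vorsitzender</td></tr></table>",
    INCLUDE,
    "<p>footer</p>",
])

# Include marker placed right after the submenu: the table is indexed.
correct = "\n".join([
    EXCLUDE,
    "<ul><li>submenu</li></ul>",
    INCLUDE,
    "<table><tr><td>Vorsitzender</td></tr></table>",
    "<p>footer</p>",
])

print(indexable_lines(misplaced))
print(indexable_lines(correct))
```

In the first page the table sits inside the excluded region, so its words never reach the index even though the HTML itself is valid.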

Rest assured, next time I post a problem, I will have thought about it for a couple of days.

I don't know why I didn't delete these comments earlier.

I hope that people who read this topic can learn something from it. I have.

Thank you again for everything.
You are giving the best support I know.

yours
RedThypoon

Charter 11-27-2003 10:55 AM

Thanks, but don't feel bad. I should have known about ß and other such characters. That was a silly mistake on my part, but we all make mistakes. Anyway, I don't mind at all that people post questions. If you have questions, go ahead and ask. :)


All times are GMT -8.