|
10-08-2003, 06:43 AM | #1 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
iso-8859-7
Hello there!
I would like to know how I can change the encoding to iso-8859-7 (Greek). I read the documentation but could not understand how to set the $phpdig_string_subst['iso-8859-7'] and $phpdig_words_chars['iso-8859-7'] values. Any help, please? |
10-08-2003, 01:51 PM | #2 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
You must define ALL the characters in this string:
$phpdig_string_subst['iso-8859-7'] ='......here is iso-8859-7 chr ...........' (see http://www.softlab.ntua.gr/~sivann/xgrk/iso8859-7.html) and set: define('PHPDIG_ENCODING','iso-8859-7'); Perhaps you can find the code in ONE line with Google? -Roland- |
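A minimal sketch of the two settings mentioned above, assuming the substitution string uses the comma-separated `replacement:variants` layout that appears later in this thread. The byte values are illustrative ISO-8859-7 escapes chosen for this example, not a vetted Greek mapping.

```php
<?php
// Select the encoding, as quoted in the post above.
define('PHPDIG_ENCODING', 'iso-8859-7');

// Assumed layout: comma-separated "replacement:variants" groups.
// Illustrative bytes only (ISO-8859-7):
//   \xC1 capital alpha, \xB6 capital alpha with tonos,
//   \xE1 small alpha,   \xDC small alpha with tonos.
$phpdig_string_subst['iso-8859-7'] = "A:\xC1\xB6,a:\xE1\xDC";
```

Writing the bytes as `\x` escapes keeps the file independent of the editor's own encoding, which is an easy place for a single-byte character list to get silently damaged.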
10-09-2003, 02:19 AM | #3 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
Thanks for your reply Rolandks!
OK, I think I got it. What about $phpdig_words_chars['iso-8859-2'] = '[:alnum:]ðþß';? What is it used for? Will I have to change it? Regards, Mike |
10-09-2003, 05:56 PM | #4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. The $phpdig_words_chars['iso-8859-2'] = '[:alnum:]ðþß'; setting is for non-accented 'lowercase' letters such as the German ß (pronounced 'ess set', if I remember correctly). Think of it this way: anything that doesn't go in $phpdig_string_subst['iso-8859-2'] might go in $phpdig_words_chars['iso-8859-2']. Once you get your 'iso-8859-7' set working, please post it in the Mod Submissions forum in case others might want to use it. Thanks.
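To make the role of this variable more concrete, here is a rough sketch of how a word-character list like this could be used to split text into indexable words. It is an assumption that the value ends up inside a regular-expression character class, and `split_words` is a hypothetical helper written for this example, not a PhpDig function.

```php
<?php
// Hypothetical helper: split a byte string into words, treating every byte
// that is NOT listed in $wordChars as a separator.
function split_words(string $text, string $wordChars): array
{
    // $wordChars is pasted into a negated character class, so the POSIX
    // class [:alnum:] and any extra literal bytes all count as word bytes.
    $pattern = '/[^' . $wordChars . ']+/';
    return preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);
}

// "Stra\xDFe" is "Straße" in ISO-8859-1/-2 bytes (\xDF is ß).
$text = "Stra\xDFe 12";

// Without ß in the list, the word is cut in two at the unknown byte.
$without = split_words($text, '[:alnum:]');    // ["Stra", "e", "12"]

// With ß listed, the word survives intact.
$with = split_words($text, "[:alnum:]\xDF");   // ["Stra\xDFe", "12"]
```

Under this assumption, a Greek letter missing from both variables would act as a separator, which would fit the symptom of only fragments of a keyword being indexed.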
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
11-26-2003, 09:04 AM | #5 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
Unfortunately, I cannot get it to work.
I have used something like: $phpdig_string_subst['iso-8859-7'] = 'Á:¢,Å:¸,Ç:¹,É:ºÚ,Ï:¼,Õ:¾,Ù:¿,Ü:á,å:Ý,ç:Þ,é:ßúÀ,ï :ü,õ:ýû*,ù:þ'; I have changed the encoding to: define ('PHPDIG_ENCODING','iso-8859-7'); I think that the problem is with the $phpdig_words_chars['iso-8859-1']='[:alnum:]ðþß' string. Which letters do I put within the [::] characters, and which letters after them? The script searches some of the English pages that I have on the site, but does not search any Greek pages. The 'keywords' table only contains English words. I would really appreciate some help! P.S. I am using version 1.6.2. Last edited by mkst; 11-26-2003 at 09:15 AM. |
11-26-2003, 11:06 AM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. I found the below ASCII representation of iso-8859-7 at http://www.gar.no/home/mats/8859-7.htm.
Code:
80-9F: unassigned
// note A0 is a space
A0-BF: _¡¢£¤¥¦§¨©ª«¬_®¯°±²³´µ¶·¸¹º»¼½¾¿
C0-DF: ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß
E0-FF: *áâãäåæçèéêëì*îïðñòóôõö÷øùúûüýþÿ
PHP Code:
PHP Code:
PHP Code:
The $phpdig_words_chars['iso-8859-7'] variable is for lowercase non-accented characters (basically those lowercase non-accented characters that copy paste into ASCII as the characters themselves). An example of this would be Greek µ, so it could be added to $phpdig_words_chars['iso-8859-7'] like so: PHP Code:
11-27-2003, 06:21 AM | #7 | |
Green Mole
Join Date: Oct 2003
Posts: 11
|
Thanks for your reply Charter!
...but I am still confused!! Quote:
And what exactly do you mean by '(basically those lowercase non-accented characters that copy paste into ASCII as the characters themselves)' ? I have tried something like this: PHP Code:
PHP Code:
The engine indexes the site all right but only recognizes and prints results for part of the keyword. Also, the 'keywords' table contains words with Latin letters only. That is all right, I guess? Thank you for your time, Charter, and I hope I am not too much trouble. Last edited by mkst; 11-27-2003 at 07:57 AM. |
|
11-27-2003, 08:10 AM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. I'll use a German word as an example of what I mean by the 'is like' phrase. The German word Gästebuch means Guestbook. The ä in Gästebuch 'is like' the Latin a. Characters such as ä are stored as their Latin counterparts in the database for searching. When you copy and paste a character into a text editor, it will show up either as the character itself or as some ASCII equivalent of the character. The characters that show up as the actual character are the ones that go in $phpdig_words_chars['iso-8859-7'], but no accented characters should go in $phpdig_words_chars['iso-8859-7']. All accented or diacritic characters should go in $phpdig_string_subst['iso-8859-7'].
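The Gästebuch example can be sketched in a few lines. This assumes the `replacement:variants` layout seen earlier in the thread and uses PHP's byte-wise `strtr()`. `build_translation` is a hypothetical helper for illustration, and this is not necessarily how PhpDig performs the substitution internally.

```php
<?php
// Expand a "replacement:variants" list into the two equal-length strings
// that the three-argument form of strtr() expects.
// Hypothetical helper, written for this example only.
function build_translation(string $spec): array
{
    $from = '';
    $to = '';
    foreach (explode(',', $spec) as $group) {
        [$replacement, $variants] = explode(':', $group, 2);
        $from .= $variants;
        // Repeat the Latin letter once per variant byte so positions line up.
        $to .= str_repeat($replacement, strlen($variants));
    }
    return [$from, $to];
}

// ISO-8859-1 bytes: \xC4 is Ä, \xE4 is ä.
[$from, $to] = build_translation("A:\xC4,a:\xE4");

// "G\xE4stebuch" is "Gästebuch"; the stored form becomes plain ASCII.
$stored = strtr("G\xE4stebuch", $from, $to);   // "Gastebuch"
```

Because `strtr()` works byte by byte here, the sketch only makes sense for single-byte encodings such as the ISO-8859 family.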
11-28-2003, 06:11 AM | #9 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
Thank you for your reply, Charter. It seems that I managed to create the right $phpdig_string_subst and $phpdig_words_chars.
However, I still have one problem regarding words that start with a capital letter. I can only find words that start with certain capital letters; otherwise I get zero matches. The search works OK for lowercase words. Do you have any idea why this is happening? Regards, Mike |
11-28-2003, 06:17 AM | #10 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. What are $phpdig_string_subst['iso-8859-7'] and $phpdig_words_chars['iso-8859-7'] currently set to? Which capital letters are not working? Maybe there is a mismatched key/value pairing.
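One way to hunt for a mismatched pairing is to check the substitution string mechanically. The sketch below is a hypothetical checker (`check_subst` is not part of PhpDig) that assumes the `replacement:variants` layout. It flags groups whose key is not a single byte and variant bytes that are listed more than once.

```php
<?php
// Hypothetical checker for a "replacement:variants" substitution string.
// Returns a list of human-readable problems; an empty list means none found.
function check_subst(string $spec): array
{
    $problems = [];
    $seen = [];
    foreach (explode(',', $spec) as $group) {
        $parts = explode(':', $group);
        if (count($parts) !== 2 || strlen($parts[0]) !== 1 || $parts[1] === '') {
            $problems[] = "malformed group: '$group'";
            continue;
        }
        // Each variant byte should map to exactly one replacement.
        foreach (str_split($parts[1]) as $byte) {
            if (isset($seen[$byte])) {
                $problems[] = sprintf('byte 0x%02X listed twice', ord($byte));
            }
            $seen[$byte] = true;
        }
    }
    return $problems;
}

// A stray space before the colon makes the key two bytes long.
$bad = check_subst("A:\xC1,a :\xE1");
// The same byte under two different keys is ambiguous.
$dup = check_subst("A:\xC1,a:\xC1");
// A well-formed string produces no complaints.
$ok  = check_subst("A:\xC1\xB6,a:\xE1\xDC");
```

A check like this will not say whether the Greek-to-Latin choices are linguistically sensible, only whether the string is structurally consistent.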
11-28-2003, 06:25 AM | #11 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
PHP Code:
PHP Code:
Words starting with Á, ¶, Ð, Ì have no problem. Last edited by mkst; 11-28-2003 at 06:30 AM. |
11-28-2003, 06:51 AM | #12 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Of áâãäåæçèéêëì*îïðñóôõö÷øù the only ones that should be in the $phpdig_words_chars['iso-8859-7'] variable are æçðø like so:
PHP Code:
11-28-2003, 07:39 AM | #13 |
Green Mole
Join Date: Oct 2003
Posts: 11
|
Thanks, Charter, but there is no improvement.
It is now worse than before. |
11-28-2003, 10:37 AM | #14 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. I am not very familiar with the Greek alphabet beyond mathematical usage. Below is what I came up with assuming that Latin A is like Greek Alpha, Latin a is like Greek alpha, and so forth. I make no claims of correctness.
PHP Code:
I also made the following assumptions: Latin G is like Greek Gamma, Latin g is like Greek gamma, Latin R is like Greek Rho, Latin r is like Greek rho, Latin Y is like Greek Upsilon, Latin y is like Greek upsilon. As I am not very familiar with the Greek language, this is the best that I can offer.
12-24-2003, 02:41 AM | #15 |
Green Mole
Join Date: Dec 2003
Posts: 7
|
Hi.
I am also trying to index Greek pages with the 8859-7 encoding and I have some problems. I think that the origin of the problem is that Greek characters are converted to Latin and then put in the keywords table. Why is it necessary to convert the Greek characters to Latin? I think that the engine would work much better and more accurately without this conversion. Is there a hack that I could apply so that Greek characters won't be converted to Latin? |