PhpDig.net - View Single Post

Edomondo · 01-14-2004, 05:16 AM

I'm now facing another problem in stripping strings.
ex: ¥³¡¼¥Ê¡¼¤â¤¢¤ê¤Þ¤¹ (in EUC-JP)
is made of 9 multi-byte characters:
¥³ ¡¼ ¥Ê ¡¼ ¤â ¤¢ ¤ê ¤Þ ¤¹
But the script I wrote treats ¢¤ as a separator and replaces it with a space, even though here it is the end of ¤¢ and the beginning of ¤ê (2 multi-byte characters).
The script returns: ¥³¡¼¥Ê¡¼¤â¤ ê¤Þ¤¹, which turns the end of the string into nonsense.

The only way to get rid of this bug would be to step through the string 2 bytes at a time, check whether each pair is a multi-byte character, and replace it with a space only if it is a separator. But wouldn't such a script be too time-consuming? Any idea on how to achieve this?
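
For illustration, here is a minimal sketch of that character-by-character idea, written in Python rather than in the PHP of the script mentioned above. The function names `split_euc_jp` and `strip_separators` are hypothetical, and the sketch assumes the input is valid EUC-JP: lead bytes 0xA1-0xFE start a 2-byte character, 0x8E starts a 2-byte half-width kana, 0x8F starts a 3-byte character, and anything else is a single byte. It makes one left-to-right pass, so the cost grows linearly with the length of the string.

```python
def split_euc_jp(data):
    """Split an EUC-JP byte string into a list with one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead == 0x8F:
            width = 3  # SS3 prefix: three-byte character
        elif lead == 0x8E or 0xA1 <= lead <= 0xFE:
            width = 2  # SS2 half-width kana, or a regular two-byte character
        else:
            width = 1  # ASCII or control byte
        chars.append(data[i:i + width])
        i += width
    return chars


def strip_separators(data, separators):
    """Replace whole characters found in `separators` with an ASCII space."""
    return b"".join(b" " if ch in separators else ch
                    for ch in split_euc_jp(data))


# The 18 bytes of the example string from the post.
sample = (b"\xa5\xb3\xa1\xbc\xa5\xca\xa1\xbc\xa4\xe2"
          b"\xa4\xa2\xa4\xea\xa4\xde\xa4\xb9")
separators = {b"\xa2\xa4"}  # the byte pair shown as ¢¤ in the post

# A plain byte search matches across a character boundary and corrupts the text.
broken = sample.replace(b"\xa2\xa4", b" ")

# The boundary-aware version only replaces separators that are whole characters.
safe = strip_separators(sample, separators)
```

With this approach the pair ¢¤ is only replaced when it occupies a character position of its own, so the example string comes back unchanged, while a real separator sitting between two characters still becomes a space.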
