From xemacs-m  Wed Mar 12 21:16:22 1997
Received: from pentagana.sonic.jp (root@tokyo-01-042.gol.com [202.243.51.42])
	by xemacs.org (8.8.5/8.8.5) with ESMTP id VAA15178
	for <xemacs-beta@xemacs.org>; Wed, 12 Mar 1997 21:16:18 -0600 (CST)
Received: (from jhod@localhost) by pentagana.sonic.jp (8.7.1+2.6Wbeta4/3.4W3) id LAA00318; Thu, 13 Mar 1997 11:48:21 +0900
Date: Thu, 13 Mar 1997 11:48:21 +0900
Message-Id: <199703130248.LAA00318@pentagana.sonic.jp>
From: P E Jareth Hein <jhod@po.iijnet.or.jp>
To: Hrvoje Niksic <hniksic@srce.hr>
Cc: xemacs-beta@xemacs.org
Subject: Re: Documentation bug
In-Reply-To: <kig4tegn3g1.fsf@jagor.srce.hr>
References: <199703122109.NAA29413@sandman>
	<m2endk7oix.fsf@altair.xemacs.org>
	<kig4tegn3g1.fsf@jagor.srce.hr>
Mime-Version: 1.0 (generated by tm-edit 7.105)
Content-Type: text/plain; charset=US-ASCII

>>>>> "Hrv" == Hrvoje Niksic <hniksic@srce.hr> writes:

Hrv> Steven L Baur <steve@miranova.com> writes:
>> There are supposed to be a whole set of character specific order
>> comparison functions char<, char>, etc. none of which appear to be
>> implemented.  They look easy enough to add; should they be added?

Hrv> I don't think they're easy to add for anything other than latin1.
Hrv> For example, the `s' in `Niksic' in fact is not `s', but an
Hrv> iso-8859-2 character with the code 185 (it's exponent-1 in
Hrv> iso-8859-1).  That character may be in a buffer.  In that case,
Hrv> the following should all be true (I use ?sh to denote 185 in
Hrv> latin2):

Hrv> (char-equal ?sh ?SH) ; case-insensitive comparison
Hrv> (char> ?sh ?s)
Hrv> (char< ?sh ?t) ; ?sh is between ?s and ?t

Hrv> I don't know how hard this is to add, but it's *definitely*
Hrv> non-trivial, and it *cannot* be achieved with just implicit
Hrv> conversion to integer.  If we don't want a brain-damaged
Hrv> implementation, that is.

Hrv> A side-point: what should XEmacs 20 do if characters from
Hrv> different character sets are compared?  What is "bigger",
Hrv> the letter ?c or a Japanese character?

This is what the XPG4 standard is all about, and why the brain-damage
that is 'locale' exists. At this point, most people have thrown up
their hands and said that multi-lingual programs are non-standard by
definition, and have therefore created tools that work for a single
language set (hence locale). For example, Japanese and Chinese
characters are sortable, but not necessarily by their position in the
codeset. How do you sort that? And if they are intermixed with another
language, then what? About the only workable solution I can see off
the top of my head would be to create a 'locale' extent, and create
these functions (and modify existing ones) to key off that particular
extent. Comments?
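
To make the point about codeset position versus sort order concrete,
here is a minimal sketch in Python (chosen only because it is easy to
run; it is not XEmacs code). The names COLLATIONS, char_less and
char_equal, the "hr" locale tag, and the simplified Croatian alphabet
table (digraphs omitted) are all assumptions invented for this
example. The locale argument stands in for whatever a 'locale' extent
might supply for a given stretch of buffer text.

```python
# Byte 185 is s-with-caron in Latin-2 but superscript-1 in Latin-1.
SH = bytes([185]).decode("iso8859_2")
SUP1 = bytes([185]).decode("iso8859_1")

# Hypothetical per-locale collation tables: character -> sort weight.
# The "hr" alphabet is written with escapes to keep this text ASCII:
# \u010d c-caron, \u0107 c-acute, \u0111 d-stroke,
# \u0161 s-caron, \u017e z-caron.
COLLATIONS = {
    "hr": {
        ch: weight
        for weight, ch in enumerate(
            "abc\u010d\u0107d\u0111efghijklmnoprs\u0161tuvz\u017e"
        )
    },
}


def char_less(a, b, locale):
    """Return True if a sorts before b under the given locale's table."""
    table = COLLATIONS[locale]
    return table[a.lower()] < table[b.lower()]


def char_equal(a, b):
    """Case-insensitive equality, in the spirit of char-equal."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    # Plain integer comparison puts s-caron after t.
    print(ord(SH) < ord("t"))
    # The table-driven comparison puts it between s and t.
    print(char_less("s", SH, "hr"), char_less(SH, "t", "hr"))
```

A lookup table like this only answers the single-locale case. It
says nothing about how to order characters drawn from two different
tables, which is the open question above.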

--Jareth


