From xemacs-m  Wed Feb 12 15:16:32 1997
Received: from venus.Sun.COM (venus.Sun.COM [192.9.25.5])
	by xemacs.org (8.8.5/8.8.5) with SMTP id PAA09961
	for <xemacs-beta@xemacs.org>; Wed, 12 Feb 1997 15:16:31 -0600 (CST)
Received: from Eng.Sun.COM ([129.146.1.25]) by venus.Sun.COM (SMI-8.6/mail.byaddr) with SMTP id NAA15637; Wed, 12 Feb 1997 13:16:00 -0800
Received: from kindra.eng.sun.com by Eng.Sun.COM (SMI-8.6/SMI-5.3)
	id NAA05890; Wed, 12 Feb 1997 13:15:56 -0800
Received: from xemacs.eng.sun.com by kindra.eng.sun.com (SMI-8.6/SMI-SVR4)
	id NAA00087; Wed, 12 Feb 1997 13:15:56 -0800
Received: by xemacs.eng.sun.com (SMI-8.6/SMI-SVR4)
	id NAA04421; Wed, 12 Feb 1997 13:15:55 -0800
Date: Wed, 12 Feb 1997 13:15:55 -0800
Message-Id: <199702122115.NAA04421@xemacs.eng.sun.com>
From: Martin Buchholz <mrb@Eng.Sun.COM>
To: wmperry@aventail.com
Cc: xemacs-beta@xemacs.org
Subject: Re: How robust is the mule font code?
In-Reply-To: <199702102239.OAA01958@newman>
References: <199702101552.HAA32270@newman>
	<199702102204.OAA00033@xemacs.eng.sun.com>
	<199702102239.OAA01958@newman>
Reply-To: Martin Buchholz <mrb@Eng.Sun.COM>
Mime-Version: 1.0 (generated by tm-edit 7.100)
Content-Type: text/plain; charset=US-ASCII

>>>>> "Bill" == William M Perry <wmperry@aventail.com> writes:

>> Why would users get this hp-roman8 font in a W3 buffer, but not in an
>> ordinary buffer?

Bill> Because they specified via Xresources the encoding.  What I need to do is
Bill> deal with the 'charset' parameter in HTTP responses and do a mapping from
Bill> that to registry/encoding pairs for the font model.  Not sure if I grok
Bill> this enough yet.  Oh, to have spare time again.

I don't think you need to think about fonts when you do
encoding/decoding.  The HTML term 'charset' seems to correspond to the
XEmacs term 'coding-system'.  Presumably you get your http input using
no-conversion, check for the HTML charset, translate that to an XEmacs
coding system and then

(decode-coding-region (point-min) (point-max) coding-system w3-buffer)

will convert the characters correctly.

Redisplay will take care of instantiating the correct fonts for the
various characters, using the (charset-registry) of each character's
charset.

Check out how TM converts a MIME charset to a coding system - this
should be a similar operation.  I don't know how 'charset's are
specified in HTML, but the obvious way is to do it exactly the same
way as MIME does, since the problem is identical.  Maybe try to
understand emu-x20.el...

-------------- from emu-x20.el -------------------

(defvar default-mime-charset 'x-ctext)

(defvar mime-charset-coding-system-alist
  '((iso-8859-1		. ctext)
    (x-ctext		. ctext)
    (hz-gb-2312		. hz)
    (cn-gb-2312		. euc-china)
    (gb2312		. euc-china)
    (cn-big5		. big5)
    (koi8-r		. koi8)
    (iso-2022-jp-2	. iso-2022-ss2-7)
    ))
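
A minimal sketch of the lookup-then-decode step described above, assuming
the charset arrives as a string or symbol.  The `my-' names are
hypothetical helpers, not part of W3, TM or emu-x20.el, and the table
simply repeats the excerpt; whether those coding-system names exist
depends on the XEmacs/Mule build.

```elisp
;; Hypothetical table: same entries as the emu-x20.el excerpt.
(defvar my-charset-coding-system-alist
  '((iso-8859-1    . ctext)
    (x-ctext       . ctext)
    (hz-gb-2312    . hz)
    (cn-gb-2312    . euc-china)
    (gb2312        . euc-china)
    (cn-big5       . big5)
    (koi8-r        . koi8)
    (iso-2022-jp-2 . iso-2022-ss2-7))
  "Map charset names (symbols) to coding-system names.")

(defun my-charset-to-coding-system (charset)
  "Return the coding system for CHARSET (a string or symbol), or nil.
String input is lowercased first, since charset names are usually
treated as case-insensitive."
  (let ((key (if (stringp charset)
                 (intern (downcase charset))
               charset)))
    (cdr (assq key my-charset-coding-system-alist))))

(defun my-decode-buffer-for-charset (charset)
  "Decode the current buffer according to CHARSET if it is in the table.
Return the coding system used, or nil if CHARSET is unknown (in which
case the buffer is left untouched)."
  (let ((cs (my-charset-to-coding-system charset)))
    (if cs
        (decode-coding-region (point-min) (point-max) cs))
    cs))
```

With the raw response in the current buffer, something like
(my-decode-buffer-for-charset "ISO-8859-1") would do the conversion; an
unknown charset returns nil, so the caller can fall back to a default.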

Martin

