Message-ID: <3B348EE8.6706BFF9@csi.com>
Date: Sat, 23 Jun 2001 08:43:20 -0400
From: John Colagioia <JColagioia@csi.com>
X-Mailer: Mozilla 4.61 [en] (Win98; I)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: rec.arts.int-fiction
Subject: Re: OT: String Manipulation Hell
References: <g2PV6.221676$eK2.48004027@news4.rdc1.on.home.com> <wkd781ykwo.fsf@turangalila.harmonixmusic.com> <1ev89fd.jjqdib1x2gkfqN%news@davidglasser.net> <3b2f6ce5.707368184@news.worldonline.nl> <GF6uyx.En8@world.std.com> <3b30b1a5.789142630@news.worldonline.nl>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
NNTP-Posting-Host: 128.238.10.127
X-Original-NNTP-Posting-Host: 128.238.10.127
X-Trace: excalibur.gbmtech.net 993300128 128.238.10.127 (23 Jun 2001 08:42:08 EST)
Organization: GBM Technologies Ltd
Lines: 35
X-Authenticated-User: jnc
X-Original-NNTP-Posting-Host: 127.0.0.1
Path: news.duke.edu!newsgate.duke.edu!nntp-out.monmouth.com!newspeer.monmouth.com!news.maxwell.syr.edu!newsfeed1.cidera.com!news-reader.ntrnet.net!uunet!ash.uu.net!excalibur.gbmtech.net
Xref: news.duke.edu rec.arts.int-fiction:89092

Richard Bos wrote:
[...]

> You could, in principle (and curses does exactly that), but your
> language (or library) will then be much less efficient in printing these
> strings than if you use the hardware character. For most languages,
> "string" == "collection of hardware characters".

Actually, just to be annoyingly pedantic for a moment, "for most language
implementations, a string is a collection of hardware characters."  Aside from
C, there are very few languages that tell you about the internals of a string,
and with good reason--it doesn't make any difference to the programmer.


> > (C and C++ weak-typing seem to prejudice
> > people to thinking of memory layouts, instead of abstract data
> > types.)
> Hm, no, efficiency concerns, and legibility by other languages, do. If
> you write a "normal" string to a text file, every other language should
> be able to read it. If you write an enhanced string, most other
> languages will be confused by it.

Efficiency is a nonissue, because--despite what most compiler writers would
have you believe--it really isn't that hard to detect "local idioms" and
optimize them to something significantly more efficient.

As for interprocess concerns, the interacting processes should have a protocol
under which they communicate.  Any process that can't be bothered to convert
its data to the agreed format (i.e., a flat text string, in this case) probably
shouldn't be trusted to deliver data, anyway.  Big example:  Computers
communicate just fine, despite the fact that Intel machines internally order
the bytes of their integers "backwards" (little-endian) from most everyone
else.  This is because each application is responsible for transmitting data
in "network format" (big-endian byte order).
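
To make the "network format" point concrete, here is a minimal sketch in C of
converting a 32-bit integer to and from big-endian byte order.  The function
names put_u32_network and get_u32_network are my own, purely for illustration;
real code would more likely use the platform's htonl()/ntohl().  Because it
works with shifts rather than by peeking at memory, it should behave the same
no matter how the host orders its bytes.

```c
#include <stdint.h>

/* Store a 32-bit value into out[0..3], most significant byte first
 * (network byte order).  Hypothetical helper, not from any library. */
void put_u32_network(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Rebuild the 32-bit value from four bytes in network byte order.
 * Each byte is widened before shifting so no bits are lost. */
uint32_t get_u32_network(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

The sender calls put_u32_network() before writing to the wire and the receiver
calls get_u32_network() after reading; neither side needs to know or care what
the other's hardware does internally.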


