Newsgroups: rec.arts.int-fiction
Path: news.duke.edu!newsgate.duke.edu!nntp-out.monmouth.com!newspeer.monmouth.com!news.maxwell.syr.edu!news.stealth.net!news-east.rr.com!news.rr.com!router1.news.adelphia.net!uunet!ash.uu.net!world!buzzard
From: buzzard@world.std.com (Sean T Barrett)
Subject: Re: OT: String Manipulation Hell
Message-ID: <GF925u.8I4@world.std.com>
Date: Wed, 20 Jun 2001 22:22:42 GMT
References: <g2PV6.221676$eK2.48004027@news4.rdc1.on.home.com> <3b2f6ce5.707368184@news.worldonline.nl> <GF6uyx.En8@world.std.com> <3b30b1a5.789142630@news.worldonline.nl>
Organization: The World Public Access UNIX, Brookline, MA
Lines: 50
Xref: news.duke.edu rec.arts.int-fiction:88964

Richard Bos <info@hoekstra-uitgeverij.nl> wrote:
>buzzard@world.std.com (Sean T Barrett) wrote:
>> Richard Bos <info@hoekstra-uitgeverij.nl> wrote:
>> >I doubt it, because I don't think it can be done.
>>[it can be done]
>For most languages, "string" == "collection of hardware characters".

Followed by a 0. Or with a one-byte length prefix.
Or with a 32-bit length prefix. Or the string is
Unicode in any of various encodings.
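
To make the list above concrete, here is a minimal sketch in C of
three of those layouts holding the same five characters. The names
(c_style, byte_prefixed, counted32 and the len_* helpers) are made
up for illustration and do not come from any particular language
runtime.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout 1: bytes followed by a 0 terminator (the C convention). */
const char c_style[] = "hello";            /* 6 bytes: h e l l o \0 */

/* Layout 2: one-byte length prefix, which caps the length at 255. */
const unsigned char byte_prefixed[] = { 5, 'h', 'e', 'l', 'l', 'o' };

/* Layout 3: 32-bit length stored alongside the bytes. */
struct counted32 {
    uint32_t    len;
    const char *bytes;   /* not required to be 0-terminated */
};
const struct counted32 wide_prefixed = { 5, "hello" };

/* Length of a terminated string: must scan for the 0, so O(n). */
size_t len_c_style(const char *s) { return strlen(s); }

/* Length of a prefixed string: read the stored count, so O(1). */
size_t len_byte_prefixed(const unsigned char *s) { return s[0]; }
size_t len_counted32(const struct counted32 *s) { return s->len; }
```

All three helpers report a length of 5 for the data above, but the
memory behind each one is laid out differently, which is the point:
"string" names an abstraction, not one byte layout.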

Almost no hardware implements strings. Treating "character" = "byte"
is an assumption, not a truth.
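
A small sketch of where "character" and "byte" come apart, assuming
the text is stored as UTF-8 (one of several possible encodings).
The function name utf8_codepoints is hypothetical, and the code
assumes well-formed input rather than validating it.

```c
#include <stddef.h>
#include <string.h>

/* "café" in UTF-8: the final letter takes two bytes (0xC3 0xA9). */
const char cafe[] = "caf\xC3\xA9";

/* Count code points by skipping UTF-8 continuation bytes, which
   always have the bit pattern 10xxxxxx. Assumes valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

Here strlen(cafe) counts 5 bytes while utf8_codepoints(cafe) counts
4 code points, so even "how long is this string" depends on which
notion of character the language has chosen.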

Finally, the original poster asked for, essentially, a language
where "string" != "collection of hardware characters". So that
comment amounts to question begging.

>> (C and C++ weak-typing seem to prejudice
>> people to thinking of memory layouts, instead of abstract data
>> types.)
>
>Hm, no, efficiency concerns, and legibility by other languages, do. If
>you write a "normal" string to a text file, every other language should
>be able to read it. If you write an enhanced string, most other
>languages will be confused by it.

You are posting to a newsgroup where people regularly use languages
that compress their string storage on disk, that do not allow file
interchange between applications at all, and that are grossly
inefficient because they are interpreted. The vast majority of
non-mainstream languages are also interpreted. String efficiency is
just not a concern when the language is interpreted and the string
primitives are themselves coded in a compiled language.

C's string implementation is insanely inefficient
(let's use an O(n) computation to find out the length of a string!),
but people still use it. Also, formatting strings in memory in
different ways doesn't affect file interchange if those strings
get flattened to remove formatting when saved to plain text
files. So the interchange objection is irrelevant anyway.
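
The flattening idea can be sketched as follows. This is only an
illustration under assumptions of my own: an "enhanced" string is
modeled as an array of styled runs, and the names (struct run,
enum style, flatten) are hypothetical, not taken from ELISP or any
other existing system.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical style flags attached to a piece of text. */
enum style { PLAIN = 0, BOLD = 1, ITALIC = 2 };

/* An "enhanced" string: a sequence of runs, each with its own style. */
struct run {
    const char *text;
    unsigned    style;
};

/* Flatten runs into plain text, dropping all style information.
   Writes at most cap-1 bytes plus a 0 terminator into out and
   returns the untruncated length, in the manner of snprintf. */
size_t flatten(const struct run *runs, size_t nruns, char *out, size_t cap)
{
    size_t total = 0;
    for (size_t i = 0; i < nruns; ++i) {
        size_t n = strlen(runs[i].text);
        if (cap > 0 && total < cap - 1) {
            size_t room = cap - 1 - total;
            memcpy(out + total, runs[i].text, n < room ? n : room);
        }
        total += n;
    }
    if (cap > 0)
        out[total < cap ? total : cap - 1] = '\0';
    return total;
}
```

Flattening the runs "Open the " (plain), "red" (bold) and " door."
(plain) yields the ordinary text "Open the red door.", which any
other language can read from a file. The in-memory representation
never reaches the disk.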

So you're wrong: it can be done, and indeed it apparently
has been done (ELISP), where "it" is what the original poster
asked for: a language that makes it easy to do such things.

I can live with people making goofy pedantic points about my posts,
but this goes beyond pedantry.

SeanB
