                                 Muldis::D
                                   TODO
---------------------------------------------------------------------------

Following is a summary of things that still need doing.  It is specific to
the Muldis D specification distribution only, and doesn't talk about things
that would go in other distributions, including implementations.  (But,
look at lib/Muldis/D/SeeAlso.pod for a list of actual or possible
implementations.)

Alternatively, this list deals with possible ideas to explore, which may or
may not be good ideas to pursue.

The following list is loosely ordered by priority, and is organized into
groups by approximate subject area, but list items may
actually be addressed in a different order.  There is no specific
timetable for these items; they are simply to be done "as soon as
possible".

* Generally speaking, make a new release to CPAN once every week, assuming
the progress is non-trivial, so there are regular public snapshots with
nicely rendered documentation.

----------

* Update the development status of the Muldis D language spec to "alpha"
from "pre-alpha" (but don't set the version
to 1.0.0 yet) only when the Muldis Rosetta
Example Engine reference implementation has fully implemented the language
core, or a significant and computationally complete working subset thereof,
and so the language spec is then considered sufficiently complete with
corner cases exposed; Muldis Rosetta would also be updated to "alpha"
status simultaneously.  There are no other preconditions for considering
either project to have "alpha" status.  Current estimate: mid-late 2011.

* Preconditions for considering the Muldis D language spec to be either
"beta" or "released" status or "1.0" include:  a significant,
computationally complete working core or subset thereof as a Parrot hosted
language, a TAP speaking test suite with significant feature coverage, a
serious level of post-alpha-status design input solicited of other
interested parties, and implementations over multiple SQL DBMSs.  Current
estimate: late 2011.

----------

* Add references to or adoptions of ISO/IEC 11404:2007(E) "Information
technology -- General-Purpose Datatypes (GPD)" which could be very useful.

* ZEROTH PRIORITY...
Make the system catalog into something much closer to a *concrete* syntax
tree-like-thing.  See various following TODO items for details.
Mostly do this as its own spec release with minimal dialect/etc/other
changes; that is, dialect changes to fill in new slots are not needed, but
anything the catalog change would break could be altered.

* FIRST PRIORITY...
Reduce the general options for concrete value literals to have just the
simple ones.  For any given "x:y:z", remove all "y" but for Scalar where it
is necessary; people can wrap a literal in an explicit TREAT assertion/etc
otherwise if they really want to.  Also remove all "x" where possible, so
just the plain "z" is the only option, in general.  So then, to write an
integer/rat/text literal, the only option then is to say "42/3.25/'hello'";
you can't say "Int:42/Rat:3.25/Text:'hello'" any more.  So cascade these
simplifications, and we also free up the ":" mostly for other uses.  This
particularly applies to PTMD_STD, but we'll also simplify the Perl-STDs
where possible, which is easier in Perl 6.
Unlike "[$|%|@]:..." for generic value literals, only %|@ without the colon
are in use, I believe, as prefix operators, meaning cast-tuple-as-relation
and vice-versa, but there is no colon-less $ prefix in use nor does it make
sense for any similar purpose.
So, use $ for name literals, that is, "$foo" means "Name:foo"
and then say "$<>foo" means "NameChain:foo".
And then we're a long way towards being able to ditch the postcircumfix
syntaxes for eg projection since, say "r keep {$foo,$bar}" is terse enough.
Then come up with something for rename, maybe a set of name-pair literals.
After this, keep postcircumfixes rare, just for common uses like array
elements.
In fact, we could just say foo[x] and bar{x} are then array/dict lookups.
Or the dotted forms are elem lookups and no dots are for slices, like Perl.
And then "<expr>.attr" is then its own thing rather than being a shorthand
for "<expr>.{attr}" although it may still be a function shorthand.
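For illustration, a hypothetical before/after under this simplification
(exact spellings still to be determined):

    Int:42 + Rat:3.25        `old; type-prefixed literals`
    42 + 3.25                `new; plain literals only`
    Name:foo                 `old name literal`
    $foo                     `new name literal`
    r keep {$foo,$bar}       `projection w/o postcircumfix syntax`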

* ALSO...
Get rid of the Set.new() options in Perl 6 and generally update the
Perl-STDs to use just Array/Seq/arrayref and mostly not set/bag/hash/etc,
partly for code brevity but particularly to preserve the visual order of
elements from source to catalog and back again.  Also add scm_vis_ord to
the catalog or change some catalog types to record order.

* Maybe also, but probably not yet, be concerned with comments in
Perl-STD/etc.

* NEXT PRIORITY...
Reformat all declarations, materials/subdepots particularly, to be of the
format "name ::= kind ..." rather than "kind name ...".  Similarly, we may
be able to just nix the "subdepot" keyword so a subdepot is then declared
as just "foo ::= {...}", and a material as "foo ::= kind ...".
So the name on the left of the ::= is no longer part of the material node
itself but rather is part of the larger thing into which the material node
is composed; the material node is now just eg ['function', <payload>].
Material nodes themselves just declare anonymous entities.
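For example, a hypothetical function declaration would change roughly like
this, from the old "kind name ..." form to the new "name ::= kind ..."
form:

    function cube (Int <-- topic : Int) {...}         `old`
    cube ::= function (Int <-- topic : Int) {...}     `new`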

* Example of function in Perl with context ...
The FunctionSet tuple:
    ['cube','opt comment of FunctionSet',<Function>]
The Function tuple:
    ['opt comment of Function',<heading>,<body>]
... and so on.  The comments are first so they're like leading comments
as is common with whole-routine definitions and such, and they tend to be
terser besides.

* ALSO...
Change function bodies from {...} to (...).  And make "..." expr/stmt kind.

* Likewise make a code comment a stmt/expr kind or otherwise provide for
specifying where it goes visually in code in a statement position.
Consider reworking scm_vis_ord to be external for some things it describes,
eg mapping an order to a declared name.  In fact, if this is done, then it
becomes much easier to add/remove/reorder code pieces because their
sequence numbers are stored separately and so code diffs on the system
catalog itself may not show much in the way of spurious diffs, especially
if the mapping is simply an array_of.Name.
Consider pulling out code comments in a similar fashion, just putting them
as their own named (as stmts/etc are named) code bits, which are then
associated by name with other code bits, and are listed in the vis-ord too.
In fact, then where comments are physically visible and what things they
are semantically connected to are then not joined at the hip.
The comment names are optionally user-specifiable too, like with
statements.
For example:
    cmt_on_x ::= `This roxors!`;
... or:
    comment cmt_on_x ::= `This roxors!`;
... and other details still to fill in like saying what it applies to.
Maybe a new infix-op-bind-like syntax will fit the bill for association.
Also make blank code lines or visual dividing lines recordable in syscat.

* ALSO...
Use colons to separate any kind of heading/body pairs, both materials and
values.  Take Relation now "@:[...]:{...}" as example to follow.
Also, routines now "function (...): (...)" or "updater (...): {...}" or
"procedure (...): [...]"; this for routines is inspired by Python.  This
then opens the door for routine body bounding chars to be opt sometimes,
and makes clearer where a heading ends and a body starts when there are
various extra heading clauses such as is-x or implements x.  Also consider
using ":" in other places where pairs are, maybe freeing up => for
something more specific; eg Python uses ":" in dicts rather than =>; or do
the opposite; keep "=>" for named param/attr/etc lists and use the ":" for
things like Bag literals or generic dicts that are binary relations ... use
one for atvl:atvl (bags/dicts), other for atnm:atvl (tuples, arg-lists).
Also consider using "::" for something, maybe type conversion, as Pg does.
Keep "::=" as for explicitly associating names with what they are naming.

* Have ";" as separator (opt lead or term) for both statements/vars/exprs
etc as well as whole materials.  This also comes together nicely for the
simpler routines that don't need to have bounders because they are just
single statements or expressions, for example:
    cube ::= function (Int <-- topic : Int) :
        topic ^ 3;
... and that's it.

* Remove the "var" and "attr" keywords or make them optional noisewords.
Simply having "foo : bar" in a procedure statement position should be
enough to know it is a variable declaration.
Likewise, "foo : bar" in a sca/tup/rel typedef can be known to be an attr
def.
Then, other things can gain optional noisewords, such as "result" before
the type in function sigs, or "param" before a param in routine sigs,
or "expr" or "stmt" optionally before those things in a routine, etc.
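For example, these hypothetical forms would then be equivalent, the
keyword being an optional noiseword:

    var foo : bar;    `explicit keyword`
    foo : bar;        `keyword elided; statement position implies var decl`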

* Update system catalog, if necessary, to support specifying where a named
expression, or a variable declaration, lives visually in a statement list.

* THIRD PRIORITY...
Add support for material and parameter synonyms.  And change what params
any positional arguments implicitly go with from topic|other to 0|1|...
But don't actually change any routines/params until later, except adding
0|1 to all topic|other.

* Update the array-specific postcircumfix concrete syntaxes to make them
more generic such that the array index/es (what's inside the "[]") may be
any arbitrary value expression rather than having to be an integer or
interval literal.  But if nothing else changes, this means the slice will
have to be spelled like "ary[{x..y}]" rather than "ary[x..y]", but
individual element access like "ary.[x]" will still work.  But now you can
actually have the x,y variables rather than those having to be constants.

* Consider taking a more Perl 6 like approach by turning ".." and its 3
friends into infix dyadic functions that take endpoint values and result in
interval values.  Then the surrounding curly braces are no longer needed,
and you can once again say "ary[x..y]".  Note that if people still
want/need
delimiters for an interval, they can always use parens, like "(x..y)".
If we also redefine an MPInterval to be a set_of.SPInterval, then any
{x..y} would unambiguously mean either a set or MPInterval, but we may then
lose the shorthand "x" meaning "x..x", but this could be an ok tradeoff.

* Consider also making the likes of "," and "=>" into dyadic functions
along the lines of Perl 6, though this would have further consequences.

* Demote the numeric operators that are more statistics-oriented from the
language core into a new Statistics extension or some such.  Specifically
this means these 5 in [Numeric|Rational|Integer]: range, frac_mean, median,
frac_mean_of_median, mode; and these 2 in Integer: whole_mean,
whole_mean_of_median.  Also, this "mean" is "arithmetic mean" (division of
sum); there is also "geometric mean" (root of product), etc.  After the
demotion, this set of ops can be changed or expanded to be something more
appropriate for statistical applications; some yet-missing SQL-standard
functions like pop-etc can then come in also.  Now these core-removed
functions are just shorthands for not-too-complicated expressions that
users can define for themselves with core ops, so they're not really
missing anything important if they only get the core.
For example, the current (arithmetic) mean is just:
    arith_mean ::= function (Rat <-- topic : bag_of.Rat)
        ([+]topic / #+topic)
... and geometric mean is something like:
    geom_mean ::= function (PRat <-- topic : bag_of.PRat)
        ([*]topic ** (1/#+topic))
... but any versions in the dedicated Statistics extension could be
implemented more efficiently.

* Drop special entity name embedded support for inline type declarations
like "foobag : bag_of.Foo"/etc; instead, this syntax is demoted to a
dialect-specific thing that is just sugar for something like "foobag :
relation-type Bar { attr value : Foo, attr count : PInt, primary-key {
value } }".  That way, we can always point to a specific material that
actually exists when asked what is the declared type of "foobag"; we are
also psychologically more free to just declare things as relation types
anyway, with the added flexibility that comes with that, such as in the
definition of the system catalog itself; and the concept of an entity name
chain is no longer overloaded.
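Laid out as code, the hypothetical desugaring described above would look
like:

    foobag : bag_of.Foo                  `dialect-specific sugar`
    foobag : relation-type Bar {
        attr value : Foo, attr count : PInt,
        primary-key { value } }          `what it is sugar for`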

* Generalize the Set/Array/Bag/Maybe-specific operators so that: 1. the
names of the value/index/count attributes can be specified with arguments
(that are optional, and default to the current ones if not given); 2. they
work with relations of arbitrary degree.  For example, merge the Counted
extension into Bag and call it Counted, and generalize Array into Ranked
("Ordered" is already taken and best left as is) which also absorbs the
ranking and quota functions from Relation.pod, and generalize Set into
Relation.  The Counted|Bag is then any 1+ degree relation with a
positive-integer typed attribute C that has a key (or superkey) on all of
the attributes except for C; it is treated as special by the functions,
which are analogies to general relational functions that work as normal on
all attributes but C and merge C.  The Array|Ranked is then any 1+ degree
relation with a nonnegative-integer typed attribute I that has a key on I
and is further constrained that "max(r{I})+1 = #r"; I is treated as special
by the functions.  The Maybe is then any relation with a nullary key.  With
these generalizations, some concrete syntax like .[N] will just compile
into special cases such as assuming certain special attribute names, and
you can use the foo() syntax when that isn't the case.  After these
generalizations, some Counted|Array|Maybe|etc functions can be core and
others can be pushed into extensions, as is appropriate.  After these
generalizations, we may or may not still have named Array|Bag|etc types,
which will probably keep their definitions, as special cases of the
generalized where the attribute names match the canonical ones.  Also
rename "index" to "rank" in Array perhaps.  After the
generalizations, the distinct usefulness of Set would decrease somewhat.
Note: For a generalization of Maybe, consider the Zoo name, inspired by
Database Explorations that discusses MD's canonical missing info solution,
or alternately call it C01 in the spirit of D0C0/D0C1/D0.
Still in question is what if anything to change about [S|M]PInterval/etc.
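As a hypothetical illustration using the "@:[...]:{...}" relation syntax
mentioned earlier, a Counted|Bag under this generalization would just be a
relation resembling:

    @:[fruit, C]:{ {'apple', 3}, {'pear', 1} }

... where C is positive-integer typed and {fruit} is a key; the bag
functions would merge C while otherwise acting like the general relational
ones on the remaining attributes.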

* Add official support for functions/expressions to be able to do some
things that they otherwise couldn't, such as have side-effects or be
quasi-non-deterministic.  To be specific, add support for side-effects that
occur external to the current in-DBMS process, such as output via some
side-channel like STDERR or a message queue, which can be used for
debugging a function.  But any such functionality can't directly affect the
current process, and in particular it can't affect the
function/expression's result value.  On the other hand, it is acceptable
for something to cause the function/expression to abort with a thrown
exception, since this isn't changing the result value.  There should be
metadata for any function which does or might do something like this, to
declare the fact.  In addition, we could support a limited form of
non-determinism, such as allowing a rand() or now() function that does
affect the calling function/expression's result, but that this is
constrained to be mutually deterministic within the whole of a single
Muldis D multi-update-statement.  That is, given the same arguments (or
none), now() would always return the same value within a
multi-update-statement, and might only change between different
multi-update-statements, and rand() likewise.  This might also give some
support for partial-sort functions, as long as they are consistent within
multiple calls in the same multi-update-statement.  Once again, such things
would need to be tagged with metadata.  Normal deterministic functions
always have the same result no matter how far apart the calls are.

* Make autonomous transactions / in-DBMS processes not so much startable
directly by a process, but rather have the kernel/etc process always do it
directly, with any other process asking to have such done by sending a
message to the kernel.  Similarly, DBMS-clients just become message
passers, and they start a process the same way as internally, by sending a
message to the kernel/etc to please call this procedure for me, and the
result to the client is also a message.  This also generalizes the
stateful/stateless thing and the streaming/cursor-or-not thing.  Also tied
into this are stimulus-response-rules, in that all stimuli are messages.
The kernel can also initiate messages, such as "this depot did mount", or
whatever.

* Consider relaxing the restriction of how much of a depot must be defined
just in terms of itself.  So, for example, only a depot's data types (and
dbvar) must be defined wholly internally to the depot.  But any routines in
a depot may invoke routines outside of the depot if the former aren't used
in the definition of a data type or dbvar.

* Consider adding some way of generating a type specification from a value
of that type and consider having something like a system catalog which
describes the actual database value rather than a prescribed database type,
such as to help introspection of a database whose declared type is just
'Database'.  The MST thing of TTM may tie into this.
See also how the "Pick" DBMS works, or something.

* Add a scm_foo to the system catalog next to any place that declares a
DBMS entity name, particularly an expr/var/material, to indicate whether
the declared name is considered explicitly user-specified or parser-gen.
There may be more than 2 possible values (making this an enum rather than a
Bool) that relate, say, to distinguishing explicitly named but inlined
items versus explicitly named and not inlined items.  The sys-cat might
restrict based on this such that it doesn't allow certain references to
entities whose names are marked parser-generated, because any generated
source code would have to make the references visible.  A related
implication is that any entity names marked as generated are not sacred and
are free to be automatically renamed by different catalog-updating actions
such as source code optimizers.  Maybe also have something to distinguish
things declared in positional format so "0"=> etc don't appear, maybe.

* Numeric updates ...
See http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html .
Excise the M;N format for bases 17..36 leaving just 2..16, absolutely.
Use "#" as separator rather than ";".
Write M as a base-10 integer rather than a single character.
These are then more like Ada "based" literals, read better, and free up
";".
So 16#FF is an integer, 16#'FF' is a blob.
Maybe also add Perl-6 inspired commalists, like this:
60#[43,5,12] (integer); no good reason for a blob analogy.
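To illustrate, each hypothetical based literal would evaluate by place
value in the given base, for example:

    16#FF         `= 15*16 + 15 = 255 in base 10`
    60#[43,5,12]  `= 43*60^2 + 5*60 + 12 = 155112 in base 10`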

* IN PROGRESS ...
Rewrite/update anything talking about matters affected by process isolation
to both declare that Muldis D is generally orthogonal or agnostic to such
matters and makes no guarantees in general that any routine, even a recipe
or updater/function invoked by one, will see a consistent view of the
database during its execution, and generally remove "atomic" terminology,
and rename "nested transaction" to some other terminology.  Rather, any
guarantees of serializability of a recipe/etc will need further work by
users such as to explicitly configure their isolation or locks or whatever
as appropriate, and of course everything's affected by what DBMS you use
and what concurrency models it supports, such as locking or MVCC.  Likewise
the model being used affects when conflict errors may manifest, eg at
commit or earlier, or when/if user tasks will block, or how complicated it
is to resolve or avoid a conflict.  Matters of the concurrency model or
isolation are best not legislated by Muldis D but be left up to the
implementations and users.  Muldis D just has to require that the database
is always in a consistent state on statement boundaries et al.

* Define how one can split a PTMD_STD depot into multiple text files since
you would conceptually put an entire potentially large program in one.

* Tweak the STD dialects to account for defining system modules with them.

* Update STDIO.pod and Cast.pod concerning the Text types split.

* PACKAGE:
- Support a variant of "<[ a..z A..Z _ ]><[ a..z A..Z 0..9 _ - ]>*"
nonquoted name strs that's more liberal, "<[ a..z A..Z 0..9 _ - ]>+", for
just attr names, possrep names, param and arg names, so any of "-foo",
"3", "-4" can be barewords.
- Update system catalog and grammars to add lightweight aliasing support
for whole materials, as a new "synonym" (name?) material.  These have no
mutual order but the actual non-synonym target is the "primary" name.
Grammar can be "synonym foo of nlx.lib.bar" et al in general form, or
"function foo|bar|baz (...) {...}" where original is the first one "foo"
and the other synonyms all live in the same subdepot, and in particular the
others are "not" inner materials of "foo".
- Also add [Integer, Rational, Boolean], make [Int,Rat,Bool] into synonyms.
- Likewise (and necessarily), subdepots themselves can have synonyms.
- Also update tuple (and by extension) database types to add attribute
synonyms which semantically are lightweight virtual attribute maps that
simply make 2 attributes always-identical so only one ever needs storing
and no map function is required.  Not the same as material/sdp synonyms.
- Update system catalog and grammars to add support for routine parameter
aliases, built-in to the definitions of the routines; all names for a param
are defined in an array, that ordering being source-code-metadata, and the
first item in the list being the "primary" name.
Grammar can be "function foo (Int <-- topic|0 : Int, other|1 : Int) {...}".
This is not supported for param names in generic expr context except for
the shorthand "=>foo", so "=>1" is allowed.
- Change grammar so any number positionals supported for both s-d and u-d,
always map to "0","1".."N" and *not* "topic","other".
Also, any ".foo" now is short for "0.foo" rather than "topic.foo".
- Consider changing param names of special routines like value-filter etc,
or at least change any "topic" to "0" (other "1") so it works with ".foo".
- Update the documented signatures of all system-defined routines to use
the updated grammars reflecting the above additions.  Add param aliases of
"0" and "1" for every "topic" and "other" respectively, keeping said old
names too, and add other aliases as appropriate.  Add routine synonyms for
every distinct way of spelling a routine that rtn-invo-alt-syn provided, so
one can then always use that spelling in "foo(...)" plain-rtn-inv syntax;
update all routine docs so that the "also known as" comments no longer
mention any declared synonyms, no longer mention anything as "C<foo>" but
rather just anything as "I<foo>".
- Just stick to that, basically, leave anything else such as Unicode or
rtn-invo-alt-syn alone/not-removed for this release.
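Pulling the above together, a hypothetical declaration combining material
synonyms and param aliases might read:

    function cube|cubed (Int <-- topic|0 : Int) {...}

... where "cube" is the primary name, "cubed" is a synonym living in the
same subdepot, and the argument may be passed by the name "topic" or
positionally as "0".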

* Make the fully qualified language names declarable by code to be more
flexible than the names declared by the language spec itself, so that the
ones in code can specify multiple language versions that they conform to,
as if the code is declaring that it only uses the parts of the language
spec that are unchanged or that intersect between all the specified
versions.  For example, let one say:
    Muldis_D:"http://muldis.com":{0.112..0.125,0.127..0.136}:PTMD_STD
... and then any implementation which takes one of those versions will
parse the code according to any of those same that it supports.  The idea
is to make code more easily compatible with a wider range of interpreters,
such as newer ones designed for version 25 who don't explicitly know how
to emulate version 23 or know what its differences are, and are just
trusting the code to be valid for version 25 even if declaring 23; by the
code declaring a range, it is declaring it is willing to take its chances.
- A multiplicity of stated authorities is also possible.
- A number of consequences still have to be thought out here.
- Specifying a closed range is the code saying it *knows* it is compatible
with all those versions, specifying an open range says take chances or is
not recommended and maybe won't be supported.

* Consider adding a midweight version of virtual-attribute-maps which is
like the fullweight version but that it expressly maps 1 attr to 1 attr; it
still uses a map function but that is no longer Tuple<--Tuple.

* Demote the "[array|set|etc]_of" types from a special concept knowable by
the backend (and explained in Basics.pod), where you can essentially use
some data types without them being declared as system catalog materials, so
that instead actual s-c materials *are* required; this syntax will remain
only as a dialect feature which is a shorthand for inline type definitions;
eg, these are now all equivalent:
    - param : set_of.Foo
    - param : relation-type { over tuple-type T { value : Foo } }
    - param : set-type over Foo
... or we might consider more material kinds specific to
[set|array|etc]-type so as to help preserve the user's syntax and be more
compact, maybe, but those 6 or so could probably be represented by a
single material kind which has an enum type attr.
Also thanks to the change about replacing N-adic with dyadic s-d routines,
and its precedent, there is less need for "foo_of" shorthands anyway.

* Externalize all the details of character string repertoires or encodings
from the Muldis D core, such that say all the details of Unicode become
part of a Muldis D extension instead, and maybe ASCII likewise.
More plans pending.

* Considering the following items where non-ASCII chars are much more
pervasive (though strictly optional), replace the "op_char_repertoire"
pragma with a pragma that affects all non-quoted code in general, including
all nonquoted (but not quoted) entity names.  The options would be, at
least, the 3: ASCII, Unicode_6.0.0_canon, Unicode_6.0.0_compat.  There
would separately be options for each kind of quoted character string:
quoted entity names, texts, comments; see later TODO item about this; as
per that, all of these could be part of a single pragma.  A simpler
implementation could support only ASCII across the board as literal
characters, while non-ASCII data could be supported as escape sequences.

* Enhance the cat-type/syntax for defining tuple types as attr lists (and
by extension, relations and scalar possreps) to let users provide an
optional hint for the order that tuple/sca-pr/etc attributes should be
consulted when doing an equality test between 2 tuples/etc so to direct the
DBMS to do the least expensive comparisons first, eg integer attributes,
prior to more expensive ones, eg blob attributes; since the test
short-circuits, and assuming the vast majority of compares would return
false, this should aid performance in a clean way without users resorting
to overloading operators or something for performance reasons.  This is a
separate hint from that garnered by marking relation attrs as key attrs,
and could work within that eg to suggest order within multi-attr keys.
Other areas in the language could probably be assisted by hints also.

* Update the PTMD_STD grammar to split up the "Name_payload" or its parts
further so that, rather than just the 2 "[|non]quoted_name_str", there is
at least the additional "nonquoted_rtn_invo_name_str" which is only allowed
to be used in a routine invocation context like <op><unspace>(...), with
trailing parenthesis, and not in a context lacking trailing parenthesis.  A
"nonquoted_rtn_invo_name_str" is a nonquoted string containing no
whitespace and, in addition to all the chars nonquoted_name_str allows,
also many other symbolic chars that wouldn't confuse the parser, so
bracketing chars would likely be disallowed, at least as leading or
trailing characters in the string, and trailing colon could be disallowed,
and leading comma or leading => etc.  The idea here is that people can then
write "+(foo,bar)" for addition or "++(foo)" for increment, or "=(foo,bar)"
for comparison, "@(t)" or "%(r)", or ":=(target,value)" for assign.
In this case, if infix ops are allowed, they'd have to have mandatory
surrounding whitespace.
We also generally have to revisit Unicode for what is allowed in bareword
variable/etc names such as non-Latin or accented letters in general.  The
parser would have to use Unicode character classes in its definitions,
then.  Look at what Perl 6 does for some guidance.
As per another change, also assume that the idea of the internal catalog
no longer using Unicode for sys-def entity names is no longer true.
So Muldis D would then much more be Polish notation (with parens) by
default, and it should be much easier to just use the whole language that
way when it is more terse like this.  Supporting Polish notation without
parens would be up to rtn-inv-alt-syn replacements while the above is in
plain-rtn-inv.
See also the 2nd(+?) next TODO item on splitting rtn-inv-alt-syn.
Also add yet another nonquoted...name_str that is just for use with
attribute/param/arg names and is only slightly less restrictive than the
old nonquoted_name_str in that it also allows strings of just or leading
digit chars; this is mainly so one can write positional params without
quotes.
Maybe just this last one can be added ASAP, and the other wait longer.

* Consider creating a branch of the Muldis D spec (and of the Muldis D
Manual) which retains all of the current spec features, and subsequently
strip out the whole rtn_inv_alt_syn catalog abstraction level in trunk so
that we can more radically evolve the language design at the more
fundamental level which plain_rtn_inv has access to, without worrying about
clashes or the complexity of a dozen-plus-precedence-level grammar.
Ideally the more fundamental level can evolve to the point that a
lot of what rtn_inv_alt_syn offers is no longer necessary in practice
with regards to making the code more terse.  The branch would merge in the
more fundamental changes with the old retained rtn_inv_alt_syn to see how
they might look together, or show how the new is absorbing the old; ideally
their differences would reduce over time without the branch losing
features.
In the interest of marketing, the reduced trunk would retain all or much of
the example code using the then-removed features, as well as gain ones
using not yet specced features.  Each examples section would potentially be
split in 2, with the normal "Examples" just using the reduced spec features
and a new "Potential Future Examples" having anything not yet specced.
Also, the 3 Dialect files wouldn't actually lose the rtn_inv_alt_syn
precedence level but rather it would be made impotent as the grammar would
just define it as a non-proper superset of plain_rtn_inv for now; mainly
the change is that the 2 main pod sections "FUNCTION INVOCATION ALTERNATE
SYNTAX EXPRESSIONS" and "IMPERATIVE INVOCATION ALTERNATE SYNTAX STATEMENTS"
would be removed, or alternately stripped down to collection of "Potential
Future Examples" sections with a bit of commentary to explain if needed.

* The new version may be a lot easier to learn, considering that SQL + many
other C-like languages actually don't have too many non "f()" format ops.
Perhaps the main use of rtn_inv_alt_syn later is for people that want their
code to look like math/logic/etc exprs rather than named function calls.
IDEA:  Split rtn_inv_alt_syn into 2 abstraction levels where the lower one
has just 1-2 dozen or so plain prefix/infix ops such as
[:=, =,≠,!=, <,>,≤,<=,≥,>=,--,++, not,!,and,or,xor, +,-,|-|,*,/, ~, @,%,#]
and few are allowed having modifiers or that aren't in most languages.
Likely disallowed in lower level are [<=>,abs,div,mod,exp,^,**,log], the
other math ops, all other or Unicode variants of logic ops, all hyper-ops
including hypers of := or !, practically all relational/set/array/etc ops
including membership or sub/super tests.  As a middle-ground, for which we
could probably have a middle-third level from the split, are all the
postcircumfix ops that do restricted-to-constants shorthands of the likes
of array element access, projection, rename, un/group, un/wrap etc.
Things like the full set of infix logic ops are reserved for highest level,
and likewise for majority of Unicode ops and their ASCII-symbolic versions.
Now assuming we get generic <sym-op>(...) in plain-rtn-inv, and so
"+(foo,bar)" etc is an option, then we should reprioritize the above 3
post-split levels so that a level adding just postcircumfix syntax for
project/group/ary-acc/rename/etc should be the lowest additional level, so
one can be able to say "foo{...}" without also needing support for foo+bar.
Maybe call that new lowest "rtn_inv_pcfx_alt_syn".  Making postcircumfix
the lowest alt syn is also fitting because it alone resembles some of the
lower levels, such as code-as-data, where using some syntaxes makes certain
inputs hard-coded, such as the attr names or interval-endpoint-flags,
versus taking variables in the more verbose generic syntaxes.
Presumably all levels higher than rtn_inv_pcfx_alt_syn are plain infix
or paren-less prefix with fully-variable arguments like generic functions.

* Maybe this isn't feasible, but ...
Consider formally making every function map 1:1 from a tuple input to a
tuple output; it declares exactly 1 parameter that is a tuple type and its
result declared type is a tuple type.  Consider making every updater
formally do something analogous, such as having exactly 2 tuple-typed
parameters where only 1 is subject-to-update.  A recipe is like that but
has 4 tuple-typed parameters, 2 like updater and 2 global alias analogies.
A virtual attribute map kind of resembles this already.
Doing this would require making tuple attribute accessors special, their
own expression/etc node kind and not just a function ... though they kind
of are already as an alternative; also, variable assignment would have to
be a special node kind and not just an updater; in both cases, to save
their definitions from being mutually recursive.
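As a rough Python analogy (Muldis D tuples modeled as dicts; the `add` and
`double` names are invented for illustration), the single-tuple-in,
single-tuple-out convention might look like:

```python
# Rough analogy: a Muldis D tuple is modeled as a Python dict.

def add(args):
    # A "function": exactly 1 tuple-typed parameter, tuple-typed result.
    return {"sum": args["x"] + args["y"]}

def double(subject, other):
    # An "updater" analogue: 2 tuple-typed parameters,
    # only `subject` is subject-to-update.
    subject["n"] = subject["n"] * other["factor"]

result = add({"x": 4, "y": 5})
assert result == {"sum": 9}

state = {"n": 3}
double(state, {"factor": 2})
assert state == {"n": 6}
```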

----------

* In all 3 STD.pod, add code examples for each of these 4 material kinds:
scalar-type, domain-type, subset-type, mixin-type.

* In all 3 STD.pod, complete the description text, defining interpretation
in PTMD_STD and structure in the 2 Perl-STD, for each of these 7 material
kinds: scalar-type, tuple-type, relation-type, domain-type, subset-type,
mixin-type, subset-constraint.

* In all 3 STD.pod, populate the entire pod sub-section for each of these 2
material kinds, to provide concrete grammar, description text, and code
examples: distrib-key-constraint, distrib-subset-constraint.

----------

* Eliminate the simple monadic postfix special syntax category.  Convert ++
and -- into simple prefix ops, because an expression with
that in it is no longer end-weighted, and it would be less likely to
confuse people into thinking the op is variable increment rather than just
returning a result.  Removing the category also simplifies the parser as
there are no longer pre vs post precedence conflicts, and helps open the
door to the parser being more generic.  Simply eliminate postfix "!"
factorial or change it to prefix "fact".

* Update Basics.pod or other places to distinguish between the 2 main ways
that a type can be infinite, such as with "outwardly infinite" and
"inwardly infinite"; the later is when any 2 values have an infinite number
of others between them, so eg a time-of-day type could be infinite in the
inward sense but not in the outward sense; the result type of sin() likewise.
Also, the singleton types -Inf, Inf only refer to outwardly infinite types.
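The inward sense can be illustrated with Python's `fractions.Fraction`:
between any two distinct rationals there is always a midpoint, so even a
bounded interval is inwardly infinite while staying outwardly finite:

```python
from fractions import Fraction

# Between any two distinct rationals there is always a third (their
# midpoint), so even a bounded interval of rationals is "inwardly
# infinite", while staying finite in the outward sense.
lo, hi = Fraction(0), Fraction(1)
for _ in range(5):
    mid = (lo + hi) / 2
    assert lo < mid < hi   # a new value strictly between the endpoints
    hi = mid               # the interval shrinks but never empties

assert hi == Fraction(1, 32)
```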

* Change the basic exception throwing mechanism from a function/procedure
to its own expression/statement node kind.  Call the new node kind "fail"
or "failure" or "throw" or "raise" or something.  The "fail" node has a child
expression node or references a variable node which defines an Exception
value.  Simply evaluating a "fail" expression node will throw the exception
so a "fail" expr node is expected to only be the child of a short-circuit
expression like ??!!.
- Add a "fail" term, which throws a generic/default Exception value,
and/or a tight-binding "fail" prefix-keyword which takes an Exception arg;
that term/prefix is the concrete syntax for the new fail node.
- The "assertion" function can then go away; instead of writing
[$foo asserting $foo != 0], say [$foo = 0 ?? fail !! $foo].
- Add a few simple functions that each result in a kind of generic
Exception value.  At least have a niladic one for the most gen exception.
Then one could write [<cond-expr> ?? gen_exception() !! <expr-when-ok>].
- The treated() function then is just a wrapper over ??!! + isa.
- The fail() procedure will go away, replaced with a term/keyword also,
which maps to the "fail" statement node.
- Maybe use 'fail' for niladic term and 'raise' for prefix term?
- New keyword spellings:
    - failure
    - raised <expr>
    - fail
    - raise <var>
- Maybe alternatively, make an assertion into a lexical entity that is like
an expr node but doesn't have its own node name, and so is always used
either inline or offside, the main point being that users don't have to
come up with another node name when the node represents the same value as
another node and should naturally just have the same name.
Example:
    foo ::= ...
    asserts bar( foo )
    baz( foo )
... here, the assertion only happens when baz() is going to be evaluated;
the spelling is "asserts" since it should be an adjective.
- There also needs to be a version that can assert multiple exprs.
- Or actually, the ??!! version may still be better?
- Naming the "duplicate" isn't actually that hard; just use a leading
underscore, eg:
    _foo ::= foo asserting bar
    _foo ::= bar ?? foo !! failure
... so maybe that's best?
- A BIG THING TO CONSIDER HERE IS, HOW DO FUNCTIONAL LANGUAGES MAKE
ASSERTIONS ON COMBINATIONS OF ARGUMENTS ... OR IS THE ANSWER THAT ALL
FUNCTIONS HAVE EXACTLY ONE ARGUMENT?  SEE WHAT HASKELL/ETC DOES.
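As a non-authoritative Python analogy of the fail-as-expression idea (the
`fail` and `checked_recip` names are invented), note how an if/else
expression only evaluates the chosen branch, which is what makes a throwing
expression node safe as a child of a short-circuiting ??!!:

```python
def fail(msg="generic exception"):
    # Evaluating "fail" throws; it has no normal result value.
    raise ValueError(msg)

def checked_recip(x):
    # Analogue of [$x = 0 ?? fail !! 1/$x]: the failing branch is only
    # evaluated when chosen, because if/else expressions short-circuit.
    return fail("division by zero") if x == 0 else 1 / x

assert checked_recip(4) == 0.25
```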

* Change the generic assertion mechanism from a function/procedure to its
own expression/statement node kind, analogous to the "fail" change above.

* Add support for materials to have aliases.  But this kind of alias would
be simple, just an alternate unqualified name that exists in the same
namespace and is for the same material.  Aliases would be declared with an
"aliases" attribute, typed set-of-Name, held directly in the same catalog
types that have "name" attributes; for example, add it to the "FunctionSet"
type.  So, R.count becomes a simple alias for R.cardinality, and we can add
a whole bunch more aliases, so to make it friendlier for people who prefer
to call routines with foo(x,y) syntax rather than alternate symbols.  A
common use could be to provide both "prefix" and "infix" reading names,
such as both "product" and "multiply", and especially to give shorthands.
Example: "function product|multiply|mul (Int <-- x : Int, y : Int) {...}".
The first one in the list is the primary name, remainder are the aliases.
Or actually, it would probably be better for FunctionSet et al to *not*
internalize aliases, but rather have each alias exist as a separate
material which cites what it aliases.  And then that version could exist in
any public namespace (usually nlx), and not just the same subdepot as what
is being aliased.
The SYNONYM schema object of Oracle and other dbs corresponds to this, and
maybe "synonym" is what I should call mine too, being what the specific
material kind is called, leaving "alias" as a more generic term.
Even if we have separate synonym materials for routines/etc, one can still
declare them bundled into their originals like in the above foo|bar example
as that would just be a dialect shorthand but produce separate materials.
Also useful in support of users having their own home subdepots which have
aliases to the things they use, without them having to know where they are.
Add alias for every 'op' node 2nd element for a routine, meaning eg add
"+" and "⋈" as aliases, and so then a Muldis D parser can then produce
calls to those, as if one said `"+"(4,5)` or `"⋈"(foo,bar)`, and so we can
better remember the individual syntactic choices that the users made.
But then, how do we deal with the idea of making logical-not into a meta-op
so that there is no actual is_not_same|"≠" function etc; how do we
preserve user's individual syntactic choices then?  So think about that.
While SQL synonyms can also be used for relvars, mine would probably only
be used for materials - types, routines, stim-resp-rules, themselves, etc;
perhaps leave relvar aliases to be handled by virtual attributes.
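A minimal Python sketch of the synonym-as-separate-material idea, where each
synonym is its own catalog entry citing its target (the `catalog`,
`declare`, `synonym`, and `resolve` names are hypothetical):

```python
# Each synonym is its own catalog entry citing its target, rather than
# extra names stored inside the original material.

catalog = {}

def declare(name, func):
    catalog[name] = ("function", func)

def synonym(alias, target):
    catalog[alias] = ("synonym", target)

def resolve(name):
    kind, payload = catalog[name]
    while kind == "synonym":          # follow the citation chain
        kind, payload = catalog[payload]
    return payload

declare("cardinality", len)
synonym("count", "cardinality")       # R.count aliases R.cardinality

assert resolve("count")({1, 2, 3}) == 3
```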

* With the improvements from having aliases or supporting "+"(x,y) etc, and
other language improvements, it becomes a lot more feasible for users to
be satisfied with "plain_rtn_inv", that being
sufficiently terse, and so there is less need for "rtn_inv_alt_syn" to be
implemented or available.

* Maybe also treat material names like `function "infix<+>" (...) {...}` as
special such that if a parser encounters a random "foo + bar" then it would
parse it as if it were `"infix<+>"(foo,bar)`, maybe.  But if this is
going to work in a general sense, including for user-defined things, then
general format rules have to be set out for the parser so that if it sees
anything like X, without knowing what ops are declared, then it treats it
as an operator rather than some other construct.  On the other hand, we're
sure to run into trouble in trying to support non foo(x,y) syntax for
user-defined operators (besides those overloading system-defined virtuals),
and so better off just not doing this period; "infix<+>" is not special.

* Add support for routine parameters to have aliases, that is, for a named
parameter to be able to bind with a named argument where the argument may
have several possible names.  One use for this would be to support
parameters where it is desired to refer to them within their routine using
one name, but to use a different name in the argument, such as because the
latter is shorter or reads better (the Perl 6 spec should have some
examples of this).  Another use for this is to provide better support for
mixtures of arbitrary numbers each of positional and named routine
arguments; any parameters that would be reasonable to have a positional
argument would have 2 names, where one is an integer and one is text.  All
Muldis D grammars would be updated to no longer consider 'topic' and
'other' as special, which is a contrived notion, and instead consider
'0','1',... special.  And so, for all system-defined or user-defined
routines, any `op(foo,&bar,baz)` would be parsed into the same thing as
`op("0"=>foo,&"1"=>bar,"2"=>baz)`, and `.name` would be `"0".name`.  Now it
will so happen that "topic","other" will be commonly used in parameter
names, typically paired with "0","1" but we can now be a lot freer to name
parameters something more descriptive, such as "addends", and not
artificially make them topic/other simply so they support positional
syntax.  An idea for declaration syntax when aliases exist is to use the
"|" char; eg `function foo (Int <-- topic|"0" : Int, other|"1" : Int)`.
Of course, this complexity is only in param lists; arg lists are unchanged
and still are plain tuples with a single name per attr/arg.
For simplicity, a single param name will be more important than the others,
and only that would be its "expression node name" or "variable name" within
its routine, by which it must be referenced; therefore, the current
system catalog for declaring parameters can remain unchanged, and new
rtn-decl-type rtn-heading-attrs can be added to declare aliases.
Largely for flexibility, and correctness where they don't make sense,
parameters will never automatically have a number alias, but rather only
when the routine definer explicitly gives it one.
Of course, these aliases only apply to regular params, not global params.
One result of this change is that the Muldis D grammars will no longer
consider positional ro and rw args in separate spaces such that they can
appear in either order; now all positional args must be in the correct
mixed relative order, as there is only one "0", not one per ro and rw.
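A rough Python sketch of binding under parameter aliases, where positional
args are first rewritten to the names "0","1",... (the `bind` helper and
its data shapes are invented for illustration):

```python
# Each parameter has one primary name plus optional aliases, including
# numeric-string positionals like "0", "1".  Argument lists stay plain
# name=>value maps; positional args are rewritten to "0","1",... first.

def bind(params, pos_args, named_args):
    """params: list of (primary_name, {aliases}); returns primary->value."""
    args = {str(i): v for i, v in enumerate(pos_args)}
    args.update(named_args)
    bound = {}
    for primary, aliases in params:
        names = {primary} | aliases
        hits = [n for n in names if n in args]
        assert len(hits) == 1, "need exactly one arg for " + primary
        bound[primary] = args[hits[0]]
    return bound

# function foo (Int <-- topic|"0" : Int, other|"1" : Int)
params = [("topic", {"0"}), ("other", {"1"})]
assert bind(params, [4, 5], {}) == {"topic": 4, "other": 5}
assert bind(params, [], {"other": 5, "topic": 4}) == {"topic": 4, "other": 5}
```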

* Add special syntax for more ops:
    - ?#foo - "has 1+ elements" - is_not_empty(foo)
    - !#foo - "has zero elements" - is_empty(foo)
    - foo :=!# - assign_empty(&foo)
... and maybe rename underlying routines in the process.

* Update the mixins feature to add support for mixins that define
attributes that types can compose, whereby we support some approximation of
"specialization by extension" while still actually being just
"specialization by constraint".
Maybe also it could be said ...
A primary purpose of mixins is to help with managing software reuse, mainly
when multiple types have a number of attributes in common, a mixin can
define these and then the multiple types can compose that mixin.  A mixin
or type that composes a mixin can both add additional attributes of its own
to what the mixin defines, and the composer can add extra constraints over
the composed attributes like forcing a subtype.
Maybe also do ...
Support delegation / 'handles'; for example:
    - Name explic delegate to Text attr
    - maybe Blob, Text explic delegate to String attr
    - a ColoredCircle would delegate to both Color and Circle attrs?
This will all take some work to get right; not /all/ Rat/etc can be subst.
Probably *only* those operators that Rational/etc explicitly declares can
be delegated to Rat/etc by TAIInstant/etc.

* Replace many N-adic routines with dyadic ones, specifically
those whose definition is a repetition of a dyadic operation (so, 'sum' or
'join' etc yes but 'mean' no), which users then can invoke by way of a
reduction function if they want N-adic syntax.  Also let system catalog
store more information such as whether or not functions are commutative or
associative or idempotent or symmetric etc; likewise, the function def can
store what the operation's identity value is, if it has one, as meta-data,
useable when comm/assoc; the reduction func can read this using a
meta-programming function or something.  Reduction will fail when given an
empty list if the base func doesn't define an identity.
The point of this change is to make the common dyadic case of N-adic
operators simpler, and also set a foundation for user-defined operators
that provide more information such that a compiler can be more effective
in optimizing them, or something.
The explicit/normal way, then, to indicate in code whether you want the
parser to produce a reduce op wrapper call rather than nested direct
invocations in the system catalog, is to just invoke the reduction
operator directly and explicitly pass an operand list; but the reduce op
would have special syntax, taking normal collection exprs, such as:
    [+] {5,23,5}
    [~] ['hello', 'world']
    [join] {order,inventory}
    [*] {1..5}
... or something.  Not using that would parse into nested dyadic calls
instead though the compiler can still rearrange.
Once we do that, it's also simple to add hyper-operators, though arguably
these are redundant with 'map' or 'extension' etc.
Or this would be better for simplicity, given it won't be used as often,
and any dyadic infix function at all may be used, spelled the same way:
    reducing + {4,23,5}
    reducing ~ [...]
    reducing join {...}
    reducing * {...}
    reducing <nlx.lib.myfunc>(a=>3) {...}
... and so the regular operators can be parsed as usual.
Or maybe:
    reduced {4,23,5} using +
    reduced {...} using <nlx.lib.myfunc>(a=>3)
... but that might have an end-weight problem?
Or, still go symbolic like the first one, but use prefix notation so that
it works well with both symbolic and wordy or inline-defined operators:
    []+ {5,23,5}
    []~ ['hello', 'world']
    []join {order,inventory}
    []* {1..5}
    []<nlx.lib.myfunc>(a=>3) {...}
Another consideration is that, when combined with routine synonyms that are
symbolic, the plain_rtn_inv alone would let you do this:
    reduce( <"+">(), {5,23,5} )
    reduce( <"~">(), ['hello', 'world'] )
    reduce( <join>(), {order,inventory} )
    reduce( <"*">(), {1..5} )
    reduce( <nlx.lib.myfunc>(a=>3), {...} )
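To illustrate the dyadic-op-plus-reduction idea in Python (the `OP_META`
catalog shape and the `reduction` wrapper are purely hypothetical):

```python
from functools import reduce

# Hypothetical catalog metadata: each dyadic op may declare an identity.
OP_META = {
    "sum":  {"op": lambda a, b: a + b, "identity": 0},
    # identity deliberately left undeclared here, to show the failure case
    "join": {"op": lambda a, b: a | b, "identity": None},
}

def reduction(op_name, values):
    meta = OP_META[op_name]
    if not values:
        if meta["identity"] is None:
            raise ValueError(op_name + " has no identity; empty list fails")
        return meta["identity"]
    return reduce(meta["op"], values)

assert reduction("sum", [5, 23, 5]) == 33   # like [+] {5,23,5}
assert reduction("sum", []) == 0            # identity used for empty input
```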

* Furthering the above, add somewhat generalized support for what Perl 6
calls "meta" operators, at least in that we define and exploit several.
The general reducer above would be one of these.  Another is the negated
relational, whose syntax is putting ! or not- in front of any
Bool-resulting op.
Another is the assignment, putting := in front of any function.
For !, we can then eliminate all the "not" variants of any Bool-resulting
functions, so eg "x != y" parses into "not(is_same(x,y))", same as if
they had said "!(x = y)".  As for the old intended purpose of all the not-
variants, which is to preserve the user's intent of how code should look,
we could simply have an alias for the not() function which is what is
parsed into when != is used, and the old not() is just parsed into when the
separate prefix op is used.  On the other hand, while lots of not- variants
would go away, we'll keep the alias-but-param-order-reversed dualities such
as less-than/greater-than and sub-superset; unlike these, what we're
eliminating would not result in losing track of which args are lhs/rhs.
A related change is infix ops like ≠ or ⊈ would parse into not(foo()) even
though they don't have the !; these would be aliases for the combos, same
as Perl 6 has != as an alias for !==.
For :=, we can eliminate all the updaters that are just shorthands for
doing an op and assigning the result to one of the args.  And so a
"foo :=union bar" would parse to "assign(&foo,union(foo,bar))".  Once
again, an alias for assign() can exist which such combos are parsed into,
where the regular assign() is used when users write "foo := foo union bar".
Of course, despite Muldis D requiring operator combos where singles used to
work, we assume that implementations will be smart enough to, say, use a
single "!=" or "insert into foo ..." etc when it sees the combination, so
there is no performance loss.
Probably, any meta'd operator would have the same precedence as the base
operator that it is modifying.
Adding the hyper-meta may not be useful since we already have map()/etc;
or alternately it might be useful in avoiding some uses of map or extend
or substitute etc where users are just adding/defining one attr.
Or maybe hyper-meta would only be useful with Set/Array/Bag because the
general map/extend/etc would require naming the attribute explicitly.
As for ASCII vs Unicode etc, that preference is never encoded in the system
catalog, so when code would be generated from the system catalog, it would
be up to the generator's configuration for which versions are used.
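A hedged Python sketch of the negation and assignment meta-operators as
generic wrappers derived from base functions (`negated` and `assigning` are
invented names, not spec'd ones):

```python
# ! and := as meta-operators: "x != y" is negated(is_same)(x, y) and
# "foo :=union bar" assigns union(foo, bar) back into foo.

def negated(bool_func):
    return lambda *a: not bool_func(*a)

def assigning(func):
    def updater(env, target, *rest):
        env[target] = func(env[target], *rest)
    return updater

is_same = lambda x, y: x == y
union = lambda a, b: a | b

assert negated(is_same)(3, 4) is True          # x != y

env = {"foo": {1, 2}}
assigning(union)(env, "foo", {2, 3})           # foo :=union {2,3}
assert env["foo"] == {1, 2, 3}
```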

* Add hyper-meta in a more general fashion, as per the Ranked general type
of which Array is a more specific kind.  The hyper-meta is fundamentally
associated with the join operator, because it typically involves taking 2
relations, joining them on one set of same-named attrs (exactly 1 usually),
and then taking another set of *same-named* attrs and applying the hypered
op pairwise and deriving a single replacement set of those attrs with the
results.  The argument attrs would be renamed distinct first.  For example,
given 2 relations A{key,value,x} and B{key,value,y}, where we assume that
"key" is a unary key of each relation, the expression
"A >>+<< B" is roughly like this code:
  with (
    a ::= A{%others_a<-!key,value}{value_a<-value}
    b ::= B{%others_b<-!key,value}{value_b<-value}
    ab ::= a join b
    f ::= function (Tuple <-- t : Tuple) {
      %{ value => t.value_a + t.value_b }
    }
    fr ::= extension( ab, <nlx.lib.f> ){!value_a,value_b}
  )
  fr{<-%others_a}{<-%others_b}
And the result is a relation with heading {key,value,x,y} but of course
with the more typical case the inputs and output are just {key,value}, in
which case that simplifies to:
  with (
    a ::= A{value_a<-value}
    b ::= B{value_b<-value}
    ab ::= a join b
    f ::= function (Tuple <-- t : Tuple) {
      %{ value => t.value_a + t.value_b }
    }
    fr ::= extension( ab, <nlx.lib.f> ){!value_a,value_b}
  )
  fr
A variant taking a relation and a tuple would be like the >>+>> /etc form.
We might have variants for join vs union etc or generalize this further so
that bag/counted variants of relational ops can be defined using this
generalized hyper in combination with the regular relational ops, maybe.
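The typical {key,value} case of that expansion can be mimicked in Python
over relations modeled as lists of dicts (the `hyper` helper is a made-up
analogy, not the spec'd semantics):

```python
import operator

# Rename-apart / join / extend / project collapsed into one pass over
# relations modeled as lists of dicts, each with a unary key "key".
def hyper(op, A, B):
    b_by_key = {t["key"]: t["value"] for t in B}
    return [
        {"key": t["key"], "value": op(t["value"], b_by_key[t["key"]])}
        for t in A
        if t["key"] in b_by_key       # the natural join on "key"
    ]

A = [{"key": 1, "value": 10}, {"key": 2, "value": 20}]
B = [{"key": 1, "value": 3},  {"key": 3, "value": 7}]

# analogue of "A >>+<< B"
assert hyper(operator.add, A, B) == [{"key": 1, "value": 13}]
```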

* About extra metadata in the system catalog for functions/etc, see
http://www.postgresql.org/docs/9/static/extend.html for some ideas, such
as 35.13.x on Pg's use of COMMUTATOR and NEGATOR where function pairs
declare their complement operator.  The first pairs up "<" and ">" say (and
"+" pairs with itself) while the second pairs up "<", ">=" (dbl-chk that).

* Note that Pg exts are like Muldis D system modules in what they do, such
as that they add types and routines etc to the language.

* Change multi-update to be a sequence of statements rather than a set, and
explicitly allow the same target to be used more than once ... this could
be the case anyway thanks to virtual relvars etc.

* Move or adapt more Text functions into Stringy.
- Fundamentally all Stringy funcs work on Text in terms of the
"maximal_chars" possrep; this will just work correctly for when all
func args are of the same Text subtype, such as Canon etc.
- The Stringy/Text ops are analogous to Rational ops such that it is like
doing fraction math.  catenation() is like sum(), replication() is like
multiply, a substring test is related to difference/subtract (maybe "?~"
and "!~" might work as infix ops for something?).
- Move cat_with_sep to Stringy; semantics are clear cut and generalizable.
- has_substr ought to work with Stringy no problem from the Text and Array
perspectives, but Blob presents an issue purely concerning bit alignment,
such as whether we're searching on bits or on octet/etc alignments.

----------

* Update the virtual attributes maps so there is a way to manually specify
a reverse function, so that the virtuals don't all have to be either
read-only or updatable only via an automatically generated reverse function,
which might vary by implementation and so may be considered broken.  Note
that the reverse functions might have to be defined as per-tuple
operations, separately for insert/substitute/delete.

* Add new "material" kinds that define state constraints (address as simple
nlx.*.data.*), like type constraints but ref in reverse.

* Update the "material" kinds that def stimulus-response rules / triggered
routines so that they work for more kinds of stimuli, and maybe change the
keywords.  The material kind has 2 main attributes, where the "stimulus"
defines what to look out for and "response" defines what to do when the
former is sighted.  Some possible keywords for the first are "stimulus",
"cause", "when"; for the latter, "response", "effect", "invoke".

* Add new "material" kinds that define descriptions of resource locks that
one wants to get, starting with basic whole dbvar, relvar locks (address as
simple fed.data.foo.*), as well as simple relvar tuple locks (addr as prior
plus lists of values to match like with a semijoin); leave out generic
predicate locks at first but note they will be added later.
Update the system catalog concerning managing shared|exclusive locks or
looking for consistent reads between statements, etc.

* Large updates to docs concerning transactions and resource locking.
Note:  Supposedly PostgreSQL and MySQL use read-committed isolation by
default while SQLite provides serializable.

* Rewrite the "Exception" catalog type so it can carry metadata on what
kind of exception occurred, not just that an exception occurred.

* Also study SQL concept of conditions and handlers, looks sort of like
something between exception handling and signals; or it is their exceptions.

* Also adapt something like Postgres' LISTEN/NOTIFY/UNLISTEN feature, which
is an effective way for DB clients to be sent signals, such as when a
database relvar has changed.

* Use a conceptual framework for database transactions that is strongly
inspired by how distributed source-code version control systems (VCSs)
work, in particular drawing on GIT specifically.  The fundamental feature
of the framework is that the DBMS is managing a depot consisting of 1..N
versions of the same database, where every one of these versions is both
consistent and durable.  Each version is completely defined in isolation,
conceptually, and so any versions in a depot may be deleted without
compromising each of the other versions' ability to define a version of the
entire database.  It is implementation-dependent as to how the versions are
actually stored, such as each having all of the data versus most of them
just having deltas from some other version; what matters is that each
version *appears* to be self-contained.  Every version is created as a
single atomic action, and it is never modified afterwards, though it may be
later deleted (also an atomic action).  Every in-DBMS user process,
henceforth called "user", has its own concept of the current state of the
database, which is one of the depot's versions that is designated a "head".
A user's current head is never replaced during the course of the in-DBMS
process unless the user explicitly replaces it, such as by either
performing an update or requesting to see the latest version (the latter
done such as with an explicit "synchronize" control statement).  Therefore,
each user is highly isolated from all the others, and is guaranteed
consistent repeatable reads and no phantoms; they will get repeatable reads
until they request otherwise.  The framework has no native concept of
"nesting transactions" or "savepoints" or explicit "commit" or "rollback"
commands.  Rather, every single DBMS-performed parent-most multi-update
statement (which is the smallest scope where TTM requires the database to
be consistent both immediately before and immediately after its execution),
is a durable atomic transaction all by itself.  The effect of a successful
multi-update statement is to both produce a new (durable) version in the
depot and to update the executing user's "head" to be that new version (the
prior version may then be deleted automatically depending on
circumstances); a failed multi-update statement is a no-op for the depot,
and the user gets a thrown exception.  A depot's versions are arranged in a
directed acyclic graph where each version save the oldest cites 1..N other
versions as its parents, and conversely each version may have 0..N
children.  A child version has exactly 1 parent when it was created as the
result of executing a multi-update statement in the context of the parent
version; the parent version is the pre-update state of the database and the
child is the post-update state of the database.  A child version has
multiple parents when it is the result of merging or serializing the
changes of multiple users' statements that ran in parallel.  One main
purpose of tracking parents like this is for reliable merging of parallel
changes, so that the intended semantics of each change can be interpreted
correctly, and potential conflicts can be easily detected, and effectively
resolved.  More on how this works follows below.  Note that versions simply
have unique identifiers to be referenced with, and there is no implied
ordering between them even if they are generated as serial numbers or date
stamps, though versions with earlier date stamps are given priority in the
case of a merge conflict.  So a multi-update statement is the only native
"transaction" concept, and it is ACID all by itself.  Now, the
multi-statement "transactions" or concepts of nested transactions or
savepoints would all be syntactic sugar over the native concept, and
basically involve keeping track of versions prior to the head and
optionally making an older one the head.  This framework uses the VCS
concept of "branching" (which is something that GIT strongly encourages the
use of, as GIT makes later "merging" relatively painless) as the native way
to manage concurrent autonomous database updates by multiple users.  By
default, when no users have made any changes to the database, a depot just
has a "trunk", and its childmost or only version is called "master"; every
database user process' "head" starts off as the "master" version when that
process starts.  Each (autonomous) user process that wants to update the
database will start by creating a new branch off of the trunk, and
subsequent versions of theirs will go into that, rather than into the trunk
or some other branch.  The trunk is shared by all users while each user's
branch is just for that user, as their private working space.  Note that,
unlike a VCS in general where branches can become long-lived and interact
with each other independently of the trunk, the framework instead follows
the typical needs of an RDBMS, which espouses a single world view as being
dominant over any others, and expects that any branches will be very
short-lived, not existing for longer than a conceptual "database
transaction" would; only the trunk is expected to be long-lived.  (This
isn't to say that a DBMS can't maintain them long term, but one that acts
like a typical RDBMS of today wouldn't.)  Note that the final action on a
branch, merging it into the trunk, would be perceived by all
other DBMS users as all of the changes wrought by the branch being a single
atomic update, though the user performing it may see several steps.
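A toy Python sketch of the depot/version model under the assumptions above
(immutable versions, per-user heads; the class and method names are
invented, and merging is omitted):

```python
import itertools

class Depot:
    """Every successful multi-update atomically creates a new immutable
    version whose parent is the user's current head, then moves that
    user's head forward."""
    _ids = itertools.count(1)

    def __init__(self, initial_db):
        vid = next(self._ids)
        self.versions = {vid: {"db": dict(initial_db), "parents": ()}}
        self.master = vid

    def head(self):
        # A new user process starts at the "master" version.
        return self.master

    def multi_update(self, head, changes):
        parent = self.versions[head]
        new_db = {**parent["db"], **changes}   # old version is untouched
        vid = next(self._ids)
        self.versions[vid] = {"db": new_db, "parents": (head,)}
        return vid                             # the user's new head

depot = Depot({"x": 1})
h1 = depot.head()
h2 = depot.multi_update(h1, {"x": 2})
assert depot.versions[h1]["db"] == {"x": 1}   # old version still consistent
assert depot.versions[h2]["db"] == {"x": 2}
assert depot.versions[h2]["parents"] == (h1,)
```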

* Flesh out matters related to starting or communicating between multiple
autonomous in-DBMS processes, in general, besides the special case about
sequence generators.

----------

* Add to Routines_Catalog.pod and other files
definitions of any remaining routines, eg String routines, that would be
needed so that for all system-defined types all the necessary
system-defined routines would exist that are necessary for defining said
types, especially their constraint or mapping etc definitions.  So in
String.pod we need [catenation, repeat, length, has_substr] etc.
Also add "is_coprime" or GCD or LCM or etc which are used either in the
constraint definition of Rat or in a normalization function for Rat; see
also "the Euclidean algorithm" as an efficient way to do the calculations.
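A sketch of those routines in Python using the Euclidean algorithm (the
function names are chosen to match the ones mentioned above):

```python
def gcd(a, b):
    # The Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b).
    while b != 0:
        a, b = b, a % b
    return abs(a)

def lcm(a, b):
    return 0 if a == 0 or b == 0 else abs(a * b) // gcd(a, b)

def is_coprime(a, b):
    return gcd(a, b) == 1

def normalize_rat(num, den):
    """Reduce a numerator/denominator pair to lowest terms."""
    g = gcd(num, den)
    return (num // g, den // g)

assert gcd(48, 18) == 6
assert lcm(4, 6) == 12
assert is_coprime(9, 14)
assert normalize_rat(10, 4) == (5, 2)
```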

* Consider adding type introspection routines like: is_incomplete() or
is_dh() or is_primitive|structure|reference|enumerated etc.  Or don't
since one could look that up in the system catalog.  But more tests on
individual values might be useful, or maybe we have enough already.

* Add ext/TAP.pod, which is a partial port of Perl 5's Test::More / Perl
6's Test.pm / David Wheeler's pgTAP to Muldis D; assist users in testing
their Muldis D code using TAP protocol.  The TAP messages have type Text.

----------

* Add concept of shallowly homogeneous / sh- relation types to complement
the deeply version, and named maximal types like SHRelation, SHSet,
SHArray, etc to complement the DH/etc, and sh_set_of/etc to complement
dh_set_of/etc; but not sh-scalar or sh-tuple as the concept doesn't make
sense there.  Then update functions like Relation.union/etc to take
sh_set_of.Relation rather than set_of.Relation, which more formally defines
some of their input constraints.

* Consider adding an imperative for-each looping statement; the main
question here is whether it should work on any (unordered) relation or just
on an Array (in which case it iterates through the tuples in sequence by
index); the question is what tasks the for-each would be used for; perhaps
both versions are useful; presumably the main reason to have for-each at
all is when I/O is involved and some derivative needs to be output either
where order matters or where order does not matter; but perhaps only a
routine is needed here, such as a catenate function plus normal I/O output.
The question is also which tasks would need an imperative for-each at all,
rather than being better served by functional constructs like the
list-processing relational functions.

----------

* Add a round-rule param to rat division, I suppose, since in general we'll
need it if we want to maintain a rational radix through every op (+,-,*
will already do so when all their args are in the desired radix).
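A sketch of what a round-rule parameter on rational division could look
like, using Python's fractions module; the name and parameters here are
assumptions for illustration, not proposed spec:

```python
# Rational division with an explicit rounding rule, so the result stays
# representable at a chosen radix/scale even when exact division (eg 1/3
# in base 10) would not be.
from fractions import Fraction
import math

def rat_div(num, den, radix=10, scale=4, round_rule=round):
    exact = Fraction(num) / Fraction(den)
    # Scale up, round with the caller-supplied rule, then scale back down.
    unit = Fraction(radix) ** scale
    return Fraction(round_rule(exact * unit), unit.numerator)
```

Callers could pass math.floor or math.ceil as alternative round rules;
addition, subtraction, and multiplication need no such parameter because
they preserve the radix when their arguments are already in it.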

* Add explicit support for +/- underflow, +/- overflow, NaNs, etc.
I'm inclined to think +/- zero is unnecessary when we have underflow and
can be confusing anyway (just a single normal number zero is better).
I'm not sure if +/- overflows are useful or if infinities cover them for
our purposes.  How this would work is that we define a set
of scalar singleton types, one for each of the special values.  Then we
define extended versions of the Int, Rat, etc types where the extended
types are defined in terms of being union types that union the regular
numeric types with the special singleton type values.  This approach also
means just one each of +Underflow, -Overflow, etc is needed and is a member
of extended Int or Rat etc.  Consider using the existing names "Int"/"Rat"
with the versions that include these special values, and make new names for
the current simpler versions that don't, such as "IntNS" (int no specials),
"RatNS", etc.  Either way, it is useful to support the full range of values
that a Perl 6 numeric can support, or that an IEEE float can support,
without users necessarily having to define it themselves.
IDEA:  Maybe make all normal math/etc ops work with the extended versions
(those with NaNs, infinities, etc) and in situations where users don't want
those special values they just use a declared type excluding them, and then
the normal type constraints will take care of throwing exceptions when one
divides by zero for example.
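A rough Python sketch of the union-type approach described above: one
singleton per special value, with the extended type being the plain
numeric type unioned with those singletons (all names illustrative):

```python
class Special:
    # One singleton per special value, shared by extended Int, Rat, etc.
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

POS_INF, NEG_INF, NAN = Special("+Inf"), Special("-Inf"), Special("NaN")
POS_UNDERFLOW, NEG_UNDERFLOW = Special("+Underflow"), Special("-Underflow")

def is_member_of_ext_int(v):
    # Extended Int = the plain ("IntNS") values union the special singletons.
    return isinstance(v, int) or isinstance(v, Special)

def ext_div(a, b):
    # Per the IDEA above: the normal op works over the extended type, and
    # division by zero yields a special value rather than raising; a user
    # who declares a specials-free type gets the exception via constraints.
    if isinstance(a, Special) or isinstance(b, Special):
        return NAN
    if b == 0:
        return NAN if a == 0 else (POS_INF if a > 0 else NEG_INF)
    return a / b
```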

* Flesh out Interval.pod to add a complement of functions for comparing
multiple intervals in different ways, such as is-subset, is-overlap,
is-consecutive, etc, as well as for deriving intervals from a
union/intersect/etc of others, as well as for treating intervals as normal
relations in some contexts, such as for joining or filtering etc, as well
as a function or 3 to do normalization of Interval values.
Maybe the type name 'Range' can be used for something.
Maybe the type name 'Span' or 'SpanSet' can be used for something;
there are Perl modules with those names concerning date ranges.
Input is welcome as to what interval-savvy functions Muldis D should have.
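As a starting point for discussion, a sketch of some of those
interval-savvy functions in Python, treating an interval as a closed
(lo, hi) pair with lo <= hi (names are hypothetical):

```python
def is_subset(a, b):
    # a lies entirely within b.
    return b[0] <= a[0] and a[1] <= b[1]

def is_overlap(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

def is_consecutive(a, b, succ=lambda x: x + 1):
    # True when one interval starts exactly where the other ends, under a
    # successor function appropriate to the endpoint type.
    return succ(a[1]) == b[0] or succ(b[1]) == a[0]

def interval_union(a, b):
    # Only defined when the inputs overlap or are consecutive, so the
    # result normalizes to a single interval.
    assert is_overlap(a, b) or is_consecutive(a, b)
    return (min(a[0], b[0]), max(a[1], b[1]))
```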

* Flesh out some window/partition funcs, which are kind of like a
generalization of aggregation/reduction functions.  A window()/partition()
wrapper func is like the summary() wrapper func but it has the same number
of output tuples as input ones; when wrapping an agg/reduc func, all output
tuples have the same value per tuple in the same group; when wrapping a
window/partition-oriented func, such as rank(), each tuple in the group
gets or can get a different value.
See these:
- http://www.postgresql.org/docs/9.0/interactive/tutorial-window.html
- http://www.postgresql.org/docs/9.0/interactive/functions-window.html
- http://www.postgresql.org/docs/9.0/interactive/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
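The key property described above can be sketched in Python: the wrapper
yields one output tuple per input tuple, and a rank-style wrapped function
gives tuples in the same group different values.  Ties are ignored here,
so this is closer to SQL's row_number() than to rank(); all names are
illustrative:

```python
from itertools import groupby

def window_rank(rows, partition_by, order_by):
    # rows: list of dicts standing in for the tuples of a relation.
    out = []
    keyed = sorted(rows, key=lambda r: (r[partition_by], r[order_by]))
    for _, group in groupby(keyed, key=lambda r: r[partition_by]):
        for rank, row in enumerate(group, start=1):
            out.append(dict(row, rank=rank))  # one output row per input row
    return out
```

Contrast with an aggregate under summary(): there, every output tuple for
a group carries the same value, and the cardinality shrinks to one row per
group.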

* IN PROGRESS ...
Add Bool-resulting relational operators EXISTS and FORALL, that provide
"existential quantification" and "universal quantification" respectively,
these being useful in constraint definitions.  See TTM book p168, pp394-5
for some info on those.  Also add analogies to Perl 5's List::MoreUtils
operators any(), all(), notall(), none(), true(), false(); some of those
may be the same as EXISTS/FORALL.  Also add an EXACTLY operator like the
Tutorial D language has, and a one() op that is between any() and none().
Maybe some pure boolean ops can be added analogous to the above also; eg
any() an alias for or() and all() an alias for and().
is_(any|all|one|none|notall|etc)_of_(restr|semijoin|semidiff|etc)
source is any|etc matching|where|etc filter|etc
ADD RELATIONAL OPERATORS THAT COMBINE BOOL OPS ADDED IN 0.80.0 WITH
RELATIONAL MAP/RESTRICTION/ETC AND ... The new functions are modelled after
some in Perl 5's List::MoreUtils module.
That is, add prefix ops exactly|all|any|one|none|etc
which take a relation and result in True or False depending on what that
relation's cardinality is.  In some cases, an extra arg is needed:
    - exactly((s⋉t),n) = (#(s⋉t) = n)
    - none((s⋉t)) = exactly((s⋉t),0) = !#(s⋉t)
    - any((s⋉t)) = !exactly((s⋉t),0) = ?#(s⋉t)
    - all((s⋉t),#s) = exactly((s⋉t),#s) = (#(s⋉t) = #s)
    - notall((s⋉t),#s) = !exactly((s⋉t),#s) = (#(s⋉t) != #s)
    - one((s⋉t)) = exactly((s⋉t),1) = (#(s⋉t) = 1)
OR MAYBE THESE AREN'T ANY MORE USEFUL THAN THEIR EQUIVALENT EXPRS.
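The cardinality identities listed above are easy to check concretely; a
Python sketch with sets standing in for relations and intersection
standing in for the semijoin s ⋉ t (trailing underscores avoid shadowing
Python builtins):

```python
def exactly(rel, n):
    return len(rel) == n

def none(rel):
    return exactly(rel, 0)

def any_(rel):
    return not exactly(rel, 0)

def all_(rel, total):
    return exactly(rel, total)

def notall(rel, total):
    return not exactly(rel, total)

def one(rel):
    return exactly(rel, 1)

s = {1, 2, 3}
t = {2, 3, 4}
matched = s & t   # stands in for the semijoin s ⋉ t
```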

* Consider adding sequence generator updaters|procedures in Integer.pod.

* Consider adding random value generators for data types other than integer
and rational numerics, such as for character strings or binary strings.

* Consider analogy to SQL's "[UNION|EXCEPT|INTERSECT] CORRESPONDING BY
(attr1,attr2,...)", which is a shorthand for combining projection and
union, that takes a list of attributes and unions the projections of those
attributes from every input relation; so this means, as with join(), that
the input relations don't need to have the same headings.
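A sketch of the CORRESPONDING BY semantics in Python, with relations as
lists of dicts; as the item notes, the headings need not match, because
each input is projected on the named attributes before the union (function
and variable names are illustrative):

```python
def union_corresponding_by(attrs, *relations):
    out = set()
    for rel in relations:
        for tup in rel:
            # Projection: keep only the named attributes, then union.
            out.add(tuple(sorted((a, tup[a]) for a in attrs)))
    return out

# Differing headings: r1 has attr "x", r2 has attr "y".
r1 = [{"a": 1, "b": 2, "x": 9}]
r2 = [{"a": 1, "b": 2, "y": 7}, {"a": 3, "b": 4, "y": 8}]
```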

----------

* In PTMD_STD, consider further changes to how character escape sequences
in strings/etc are done.  For example, whether the simple escape sequence
for each string delimiter char may be used in all kinds of strings (as they
are now) or just in strings having the same delim char as is being escaped.

* IN PROGRESS ...
Update the STD dialects to support inline definition of basic
routines (and types?) right in the expressions/etc where they are used,
such as filter functions in restriction() invocations, so many common cases
look much more like their SQL or Perl counterparts, or for that matter, a
functional language's anonymous higher order functions.  This syntax would
be sugar over an explicit material definition plus a FooRef val selection,
which means the inner def effectively is an expression node, and users can
choose to name or not name the FooRef selecting node as normal with value
expressions.  It is expected that the materials could be declared
anonymously and names for them (the inn.foo, not the FooRef's lex.foo)
would be generated as per inline expression nodes etc.

* Further to the previous item, add some special syntax, similar to how one
references a parameter to get its argument's value, which can see into the
caller's lexical scope.  This would be sugar over declaring parameters with
the same name and having the caller explicitly pass arguments to it,
without having to explicitly write that.  Generally this syntax would only
be used with inline-declared routines.  But similarly, add some special
syntax allowing one to essentially just write the body of a routine without
having to explicitly write its heading / parameter list, which is useful
for routines invoked directly from a host language, where said parameters
are attached to host bind variables.  Now one still has to say what the
expected data type is for these bind variables, but then the explicit
syntax for such Muldis D routines is more like that of a SQL statement you
plug into DBI or whatever, without the explicit framing.  May not work
anywhere, but should help where it does.  Maybe use $$foo rather than $foo
to indicate that the 'foo' wasn't explicitly declared in the current
lexical scope and we are referring to the caller or a bind variable.  Or
rather than $$foo, have something like "(param foo : Bar)" for an
expression-inline parameter definition and use, where the part after the
"param" has all the same syntax as an actual param list; this is the one
for host language bind parameters.  Actually that might be useful by
itself.  Similarly "(caller foo)" would be the syntax for looking into the
parent Muldis D lexical scope, or maybe $$foo would just do that, unless
this should have
an explicit type declaration still.  Note, if same inline-declared host
param used more than once, you just need "(param foo : Bar)" form once and
other uses can just say foo as per usual; in fact, it must be this way.

* Consider in all STD adding a new pragma that concerns whether data in
delimited character string literals is ASCII or Unicode etc.
Example PTMD_STD grammar additions:
            <ws>? ',' <ws>? str_char_repertoire <ws>? '=>' <ws>? <str_cr>
    <str_cr> ::=
        '{' <ws>?
            [<str_cr_describes> <ws>? '=>' <ws>? <str_char_reper>]
                ** [<ws>? ',' <ws>?]
        <ws>? '}'
    <str_cr_describes> ::=
        all | text | name | cmnt
    <str_char_reper> ::=
          ASCII
        | Unicode_6.0.0_canon
        | Unicode_6.0.0_compat
Example PTMD_STD code additions:
    str_char_repertoire => { text => Unicode_6.0.0_canon,
        name => Unicode_6.0.0_compat, cmnt => Unicode_6.0.0_compat },
    str_char_repertoire => { all => ASCII },
Of particular interest is the Unicode canonical vs compatibility, that is
NFC|D vs NFKC|D; it is generally recommended such as by the Unicode
consortium to use canonical for general data but to use compatibility for
things like identifiers or to avoid some kinds of security problems; see
http://www.unicode.org/faq/normalization.html.  Note that compatibility is
a smaller repertoire than canonical, so converting from the latter to the
former will lose information.  The text|name entries affect how delimited
char strs that are Text|Name are interpreted, and the effects are
orthogonal to whether characters are specified literally or in escaped
form (eg "\c<...>"); canonical will preserve exactly what is stated (but
for normalization to NFD) and compatibility will take what is stated and
fold it so semantically same characters become the same codepoints (as if
normalizing to NFKD).  The suggested usage is compatibility for Name to
help avoid security or other problems, and canonical for Text; as for
comments, I currently don't know which is better.  If ASCII is chosen, the
semantics are different; with either Unicode repertoire any input is
accepted but folded if needed; with ASCII, it is more likely an exception
would be raised if there are any codepoints outside the 0..127 range in
character strings.
The 'all' is a shorthand for giving the same value to all 3 text|name|cmnt
and is more likely to occur with ASCII but it might happen otherwise.
An additional reason to raise this feature is to set up support for other
char sets in future, such as Mojikyo, TRON, GB18030, etc, which go beyond
Unicode, eg no Han-unification (see http://www.jbrowse.com/text/unij.html +
http://www.ruby-forum.com/topic/165927), but the type system also needs
updating.

* Update HDMD_Perl6_STD.pod considering that a 2010.03.03 P6Syn update
eliminated the special 1/2 literal syntax for rats and so now one writes
<1/2> instead (no whitespace allowed beside the '/'); now 1/2 could still work
but now it does so using regular constant folding and so having a higher
precedence op nearby affects its interpretation.

* Update HDMD_Perl6_STD.pod considering names of Perl collection types,
such that "Enum" is the immutable "Pair" and "EnumMap" was renamed from
"Mapping", and "FatRat" is now the "Rat" of unlimited size, etc.

* Consider using postcircumfix syntax for extracting single relation
attrs into Set or Bag etc, meaning wrap_attr; eg "r.@S{a}", "r.@B{a}".
Now that might not work for Array extraction, unless done like
"(r.@A{a} ordered ...)" or some such, which isn't pure postcircumfix,
but that may be for the best anyway.

* Consider adding concrete syntax that is shorthand for multiple
single-attribute extractions where each goes to a separate named expression
node (or variable) but the source is a single collection-typed expr/var.
Or the source could be a multiplicity as well, or mix and match.
The idea here is to replicate some common idioms in Perl such as
"(x, y) = @xy[0,1]" or "(x, y) = %xy{'x','y'}", this being more useful
when the source is an anonymous arbitrary expression.
Proposed syntax is that, on each side of the "::=" or ":=", the source and
target lists are bounded in square brackets, indicating named items assign
in order, and syntax for collections supplying/taking multiple items are
identical to single-attr accessors (having a ".") except that a list is in
the braces/brackets; for example: "[x, y] ::= [3, 4]",
"[a, b] ::= t.{a,b}", "[c, d] ::= ary.[3,5]".  This syntax would
resolve into multiple single-attr accessors when it appears in the system
catalog.
The assignment variants of the above would naturally fall out of the
ability to have arbitrary expressions on both sides of the ":=", so what
you do is
have an array-valued expression on both sides, eg "[x,y] := [y,x]" works
because "[...]" is an array literal now.
We can overload ".[]" for tuples in general so they extract like projection
but return an array rather than a tuple, so we can then say
"[a,b] ::= t.[a,b]" or even "t1.[x,y] := t2.[a,b]" to multi-substitute,
that being a shorthand for "t1.x := t2.a, t1.y := t2.b".  We can't do that
for general relations though since the array subtype of rel is using it.
This mechanism also provides a general way for a function to have multiple
ordered return values; eg, "[x,y,z] := foo(...)"; like Perl's
"($x,$y,$z) = foo(...)".
A variable (or subject-to-update parameter), "bar", may be aliased using
"foo ::= bar" such that "foo" is an expr node, but like all named exprs in
procedures, "foo" is conceptually reevaluated per mu-statement.
Ordered tuples can be used instead of arrays, and in fact might be a better
solution for multiple reasons.  To do this, just say "%:{x,y,z}" rather
than "[x,y,z]"; the former is shorthand for '%:{"0"=>x,"1"=>y,"2"=>z}'.
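For comparison, the proposed destructuring forms correspond closely to
ordinary tuple unpacking in Perl or Python; a minimal analogy (the Muldis D
forms in the comments are the proposals above, not settled syntax):

```python
t = {"a": 3, "b": 4, "c": 5}
a, b = (t[k] for k in ("a", "b"))   # like "[a, b] ::= t.{a,b}"

ary = [10, 11, 12, 13, 14, 15]
c, d = ary[3], ary[5]               # like "[c, d] ::= ary.[3,5]"

x, y = 1, 2
x, y = y, x                         # like "[x, y] := [y, x]"
```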

* In PTMD_STD, consider loosening the grammar regarding some of the normal
prefix or postfix or infix operators so that rather than mandating
whitespace be present between the operators and their arguments, the
whitespace is optional where it wouldn't cause a problem.

----------

* Restore the concept of public-vs-private entities directly in sub|depots.

* Restore the concept of "topic namespaces" (analogous to SQL DBMS concept
of "current database|schema" etc) in some form if not redundant.

* Update the system catalog to deal with database users and privileges etc.

----------

* IN PROGRESS ...
A Muldis D host|peer language can probably hold variables lexical to
itself whose Muldis D value objects are External typed, and so it could
effectively pass around an anonymous closure of its own language.  Such a
value object would be a black box to the host
and can't be dumped to Muldis D source code.

* IN PROGRESS ...
Fully support direct interaction with other languages, mainly either peer
Parrot-hosted languages or each host language of the Muldis D
implementation.  Expand the definition of the "reference" main type
category (or if we need to, create a 5th similarly themed main category) so
that it is home to all foreign-managed values, which to Muldis D are simply
black boxes that Muldis D can pass between routines, store in transient
variables, and use as attributes of tuples or relations.  These of course
cannot be stored in a Muldis D depot/database, but they can be kept in
transient values of Muldis D collection types which are held in
lexical variables by the peer or host language; that language is then
really just using Muldis D as a library of relational functions to organize
and transform its own data.  We also need to add a top level namespace by
which we can reference or invoke the opaque-to-us data types and routines
of the peer or host language.  This can not go under sys.imp or
sys.anything because these are supposed to represent user-defined types and
routines, which in a dynamic peer language can appear or disappear or
change at a moment's notice, same as in Muldis D; on the other hand, types
or routines built-in to the peer/host language that we can assume are as
static as sys.std, could go under sys.imp or something.  This also doesn't
go under fed etc since fed is reserved for data under Muldis D control and
only ever contains pure s/t/r types.  Presumably this namespace will be
subdivided by language analogously to sys.imp or whatever syntax Perl 6
provides for calling out into foreign languages co-hosted on Parrot.  Since
all foreign values are treated as black boxes by Muldis D, it is assumed
that the Muldis D implementation's bindings to the peer/host language will
be providing something akin to a simple pointer value, and that it would
provide the means to know what foreign values are mutually distinct or
implement is_same for them.  One thing for certain is that every
foreign value is disjoint from every Muldis D value, and by default every
foreign value is mutually distinct from every other foreign value too,
unless identity is overloaded by the foreign language, like how Perl 6's
.WHICH works.
The foreign-access namespace may have a simple catalog variable
representing what types and routines it is exposing, but to Muldis D this
would be we_may_update=false.

* IN PROGRESS ...
About External type ... update Perl5_STD and
Perl6_STD to add a new selector node kind 'External' which takes any Perl
value or object as its payload; this is treated completely as a black box
in general within the Muldis D implementation.  For matters of identity
within the Muldis D environment, it works as follows:  Under Perl 6, the
Perl value's .WHICH result determines its identity.  Under Perl 5, if the
value is a Perl ref ('ref obj' returns true) then its memory address is
used, and this applies to all objects also (since all refs are mutable,
this seems to be the safest bet); otherwise ('ref obj' is false) then the
value's result in a string context, "obj", is used as the identity; the
mem addr and stringification would both be prefixed with some constant to
distinguish the 2 that might stringify the same.  By default, an
External supports no operators but is_same/not_same.
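The identity rule described for the Perl 5 case can be sketched in Python,
with id() standing in for a memory address and isinstance checks standing
in for 'ref obj' (all names are illustrative):

```python
def external_identity(obj):
    # Analogue of the Perl 5 rule: mutable references identify by their
    # memory address (id() here); plain values identify by their string
    # form; each gets a constant prefix so the two identity spaces stay
    # disjoint even when an address and a string happen to match.
    if isinstance(obj, (list, dict, set)) or hasattr(obj, "__dict__"):
        return "ref:" + str(id(obj))
    return "val:" + str(obj)

def is_same_external(a, b):
    return external_identity(a) == external_identity(b)
```

Two equal-looking mutable references are thus distinct External values,
while two equal plain strings are the same one, matching the "safest bet"
reasoning above.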

----------

* Add new "FTS" or "FullTextSearch" extension which provides weighted
indexed searching of large Text values in terms of their component tokens,
such as what would be considered "words" in human terms.  This is what
would map to the full text search capabilities that underlying SQL DBMSs
may provide, if they are sufficiently similar to each other, or there might
be distinct FTS extensions for significantly different ones?

* Add new "Perl5Regex" extension which provides full use of the Perl 5
regular expression engine for pattern matching and transliteration of Text
values.  Maybe the PCRE library can implement this on other foundations
than Perl 5 itself if they are sufficiently alike; otherwise we can also
have a separate "PCRE" extension.  Or the same extension can provide both?

* Add new "Perl6Rules" extension which provides full use of the Perl 6
rules engine for pattern matching and transliteration of Text values.

* Add new "PGE" or "ParrotGrammarEngine" extension, or whatever an
appropriate replacement is, for pattern matching and transliteration of
Text values.  This and "Perl6Rules" may or may not be sufficiently similar
to combine into one extension.

* Add functions for splitting strings on separators or catenating them with
such to above extensions or to Text.pod as appropriate.  Text has one now.

* Update or supplement the order-determination function for Text so that it
compares whole graphemes (1 grapheme = sequence starting with a base
codepoint plus N combining codepoints, or something) as single string
elements, rather than say comparing a base char against a combining char.
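A sketch of grapheme-wise comparison in Python, using the simplified
grapheme definition given above (base codepoint plus trailing combining
codepoints); function names are illustrative:

```python
import unicodedata

def graphemes(s):
    # Split into graphemes: a combining codepoint attaches to the
    # preceding base codepoint rather than standing alone.
    out = []
    for ch in s:
        if out and unicodedata.combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out

def text_lt(a, b):
    # Order determination over whole graphemes, so a base char is never
    # compared directly against a combining char.
    return graphemes(a) < graphemes(b)
```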

* Add new "Complex" extension which provides the numeric "complex" data
types (each expressed as a pair of real numbers with at least 2 possreps
like cartesian vs polar) and operators.  Note that the SQL standard does
not have such data types, but many general-purpose languages as well as
some hardware CPUs natively support them.  Probably make "Complex" a
mixin type
and have the likes of "RatComplex" and "IntComplex" composing it.  Note
that a complex number over just integers is also called a Gaussian integer.
A question to ask is whether a distinct "imaginary" type is useful; some
may say it is and Digital Mars' "D" has it, but I don't know if others do.
In any event, complex numerics should most likely not be part of the core,
even though their candidacy could be considered borderline; for one thing,
I would expect that most actual uses of them would work with inexact math.
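A sketch of the two-possrep idea and the Gaussian-integer notion in
Python; the class and method names are illustrative, not proposed spec:

```python
import cmath

class RatComplex:
    def __init__(self, real, imag):
        # Cartesian possrep: a pair of real numbers.
        self.real, self.imag = real, imag

    def polar(self):
        # Alternate possrep: (modulus, angle) of the same value.
        return cmath.polar(complex(self.real, self.imag))

    def is_gaussian_integer(self):
        # A complex number over just integers is a Gaussian integer.
        return float(self.real).is_integer() and float(self.imag).is_integer()
```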

* Add other mathematical extensions, such as ones that add trigonometric
functions et al, or ones that deal with hyperreal/hypercomplex/etc types,
or ones with variants of the core numeric types that propagate NaNs etc.

* Consider adding a sleep() system-service routine, if it would be useful.

* Add multiplication and division operators to the Duration types; these
would both be dyadic ops where the second operand is a Numeric.

* Consider adding a Temporal.pod type specific to representing a period in
time, maybe simply as an alias for 'interval_of.*Instant' or some such.
See also the PGTemporal project and its 'Period' type.

* Flesh out "Spatial" extension; provide operators for the spatial data
types, maybe overhaul the types.

* Consider another dialect that is JSON ... like HDMD in form, but stringy.

----------

* Add one or more files to the distro that are straight PTMD_STD code like
for defining a whole depot (as per the above) but instead these files
define all the system entities.  Or more specifically they define just the
interfaces/heads of all the system-defined routines, and they have the
complete definitions of all system-defined types, and they declare all the
system catalog dbvars/dbcons.  In other words these files contain
everything that is in the sys.cat dbcon; anything that users can introspect
from sys.cat can also be read from these files in the form of PTMD_STD
code, more or less.  The function of these files is analogous to the Perl 6
Setting files described in the Perl 6 Synopsis 32, except that the Muldis D
analogy explicitly does not define the bodies of any built-in routines.  An
idea is that Muldis D implementations could take these files as is and
parse them to populate their sys.cat that users see; of course, the
implementations can actually implement the routines/types as they want.
Note that although this Muldis D code would be bundled with the spec, it is
most likely that the PTMD_STD-written standard impl test suite will not.
Note that these files will not go in lib/ but in some other new dir.  Note
that it is likely any implementation will bundle a clone of these files
(suitably integrated as desired) rather than having an actual external
dependency on the Muldis::D distro.  Note that some explicit comment might
be necessary to say there are no licensing restrictions on copying this
builtins-interfaces-defining code into Muldis D implementations, or maybe
no comment is necessary.  Probably a good precedent is to look at what
legalities concern existing tutorial/etc books that have sample code.

* Create another distribution, maybe called Muldis::D::Validator, which
consists essentially of just a t/ directory holding a large number of files
that are straight PTMD_STD code, and that emit the TAP protocol when
executed.  The structure and purpose of this collection is essentially
identical to the official Perl 6 test suite.  A valid Muldis D
implementation could conceivably be defined as any interpreter which runs
this test suite correctly.  This new distro would be a "testing requires"
external dependency of both Muldis::Rosetta and any Parrot-hosted language
or other implementation, though conceivably either could bundle a clone of
Muldis::D::Validator rather than having an actual external dependency.
This test suite would be LGPL licensed.  This new distribution would have a
version number that is of X.Y.Z format like Muldis::D itself, where the X.Y
part always matches that of the Muldis D spec that it is testing compliance
with, while the .Z always starts at zero and increments independently of
the Muldis D spec, as often there may be multiple updates to ::Validator
for a while between releases of the language spec, and also since .Z updates
in the language spec only indicate bug fixes and shouldn't constitute a
change to the spec from the point of view of ::Validator.

----------

* Whatever else needs doing, such as, fixing bugs.
