950221

0. Improved efficiency.

Don't bother doing any performance analysis until most of the
following items are taken care of, because there's no question
they represent serious space/time problems, although some of
them show up only given certain kinds of (popular) input.

* Improve malloc package and its uses to specify more info about
  memory pools and, where feasible, use obstacks to implement them.

* Skip over uninitialized portions of aggregate areas (arrays,
  COMMON areas, EQUIVALENCE areas) so zeros need not be output.
  This would reduce memory usage for large initialized aggregate
  areas, even ones with only one initialized element.

* Prescan the statement (in sta.c) so that the nature of the statement
  is determined as much as possible by looking entirely at its form,
  and not looking at any context (previous statements, including types
  of symbols).  This would allow ripping out of the statement-
  confirmation, symbol retraction/confirmation, and diagnostic inhibition
  mechanisms.  Plus, it would result in much-improved diagnostics.  For
  example, "CALL some-intrinsic(...)", where the intrinsic is not a
  subroutine intrinsic, would result in an actual error instead of the
  unimplemented-statement catch-all.

* Throughout g77, don't pass ffewhereLine/ffewhereColumn pairs where
  a simple ffewhere type, which points to the error as much as is
  desired by the configuration, will do, and don't pass ffelexToken types
  where a simple ffewhere type will do.  Then, allow new default
  configuration of ffewhere such that the source line text is not
  preserved, and leave it to things like EMACS' next-error function
  to point to them (now that next-error supports column numbers).
  The change in calling sequences should improve performance somewhat,
  as should not having to save source lines.  It might even be possible
  to change ffewhere from a pointer to a single 32-bit item that has
  24 bits for line#, 8 bits for col#, or something like that, if it's
  worthwhile for performance' sake at that point.  It might also be
  worthwhile to make it easy to configure away preservation of column
  numbers if that might make g77 faster, though with most Fortran
  programs, column numbers are quite helpful.  (Whether this whole
  item will improve performance is questionable, but it should greatly
  improve maintainability.)
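
  A minimal sketch of the 24-bit-line/8-bit-column packing floated above.
  The names where32, where32_pack, where32_line, and where32_col are
  hypothetical and exist nowhere in g77; saturating out-of-range values
  is likewise just one possible choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packed source position: 24 bits of line, 8 bits of column.
   These names are illustrative and are not part of g77. */
typedef uint32_t where32;

#define WHERE32_LINE_MAX 0xFFFFFFu  /* largest line number that fits */
#define WHERE32_COL_MAX  0xFFu      /* largest column number that fits */

/* Pack a line/column pair, saturating values that do not fit. */
static where32 where32_pack(uint32_t line, uint32_t col)
{
    if (line > WHERE32_LINE_MAX)
        line = WHERE32_LINE_MAX;
    if (col > WHERE32_COL_MAX)
        col = WHERE32_COL_MAX;
    return (line << 8) | col;
}

/* Recover the line number from the upper 24 bits. */
static uint32_t where32_line(where32 w) { return w >> 8; }

/* Recover the column number from the lower 8 bits. */
static uint32_t where32_col(where32 w) { return w & 0xFFu; }
```

  Such a value can be passed and stored like an int, which is where any
  performance gain over a pointer-based ffewhere would have to come from.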

* Handle DATA (A(I),I=1,1000000)/1000000*2/ more efficiently, especially
  as regards the assembly output.  Some of this might require improving
  the back end, but lots of improvement in space/time required in g77
  itself can be fairly easily obtained without touching the back end.
  Maybe type-conversion, where necessary, can be speeded up as well in
  cases like the one shown (converting the "2" into "2.").
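
  One way to picture the saving, as a sketch only: keep "1000000*2" as a
  single (count, value) run and do the type conversion once, instead of
  converting and storing a million separate values.  The struct init_run
  and make_real_run names are hypothetical, not existing g77 code, and
  double stands in for whatever the target REAL type is.

```c
#include <assert.h>

/* Hypothetical run-length record for one "count*value" item in a DATA list. */
struct init_run {
    long count;    /* repeat count, e.g. 1000000 */
    double value;  /* constant already converted to the destination type */
};

/* Convert the integer constant to REAL once, then record a single run,
   rather than converting and storing the value `count` times. */
static struct init_run make_real_run(long count, long int_constant)
{
    struct init_run run;
    run.count = count;
    run.value = (double) int_constant;  /* the "2" -> "2." conversion, done once */
    return run;
}
```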

* If analysis shows it to be worthwhile, optimize lex.c.

1. Better optimization.

* Get the back end to produce at least as good code involving array
  references as does f2c+gcc.

* Do the equivalent of the trick of putting "extern inline" in front
  of every function definition in libf2c and #include'ing the resulting
  file in f2c+gcc -- that is, inline all run-time-library functions
  that are at all worth inlining.

* When doing CHAR_VAR = CHAR_FUNC(...), and it's clear that types line up
  and CHAR_VAR is addressable or not a VAR_DECL, make CHAR_VAR, not a
  temporary, be the receiver for CHAR_FUNC.

* Design and implement Fortran-specific optimizations that don't
  really belong in the back end, or where the front end needs to
  give the back end more info than it currently does.

* Design and implement a new run-time library interface, with the
  code going into libgcc so no special linking is required to
  link Fortran programs using standard language features.  This library
  would speed up lots of things, from I/O (using precompiled formats,
  doing single or small #s of calls for arrays or array sections, and
  so on) to general computing (array/section implementations of
  various intrinsics, implementation of commonly performed loops that
  aren't likely to be optimally compiled otherwise, etc.).  Among
  the important things the library would do are: be a one-stop-shop-type
  library, hence shareable and usable by all, in that what are now
  library-build-time options in libf2c would be moved at least to the
  g77 compile phase, if not to finer grains (such as choosing how
  list-directed I/O formatting is done by default at OPEN time, for
  preconnected units via options or even statements in the main program
  unit, maybe even on a per-I/O basis with appropriate pragma-like
  devices).

* Probably requiring the new library design, change interface to
  normally have COMPLEX functions return their values in the way
  gcc would if they were declared complex float, rather than using
  the mechanism currently used by CHARACTER functions (whereby the
  functions are compiled as returning void and their first arg is
  a pointer to where to store the result).  Don't append underscores on
  external names for COMPLEX functions in some cases once g77 uses
  gcc rather than f2c calling conventions.
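
  The two return conventions contrasted above look roughly like this when
  written in C.  The function names are hypothetical, and C99 <complex.h>
  is used for brevity (f2c itself represents COMPLEX as a struct of two
  floats); arguments are passed by reference in both versions.

```c
#include <assert.h>
#include <complex.h>

/* f2c-style convention: the function is compiled as returning void and
   stores its result through a pointer passed as the first argument. */
static void csum_f2c(float complex *result,
                     const float complex *a, const float complex *b)
{
    *result = *a + *b;
}

/* gcc-style convention: the value is returned directly, as for any
   function declared to return complex float. */
static float complex csum_gcc(const float complex *a, const float complex *b)
{
    return *a + *b;
}
```

  A caller of the first form must supply storage for the result; a caller
  of the second form simply uses the returned value.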

* Do something useful with "doiter" references where possible.  E.g.
  CALL FOO(I) cannot modify I if within a DO loop that uses I as the
  iteration variable, and the back end might find that info useful
  in determining whether it needs to read I back into a register after
  the call.  (It normally has to do that, unless it knows FOO never
  modifies its passed-by-reference argument, which is rarely the case
  for F77 code.)

2. Simpler porting.

* A new library (see above) should improve portability as well as
  produce more optimal code.  Further, g77 and the new library should
  conspire to simplify naming of externals, such as by removing unnecessarily
  added underscores, and to reduce/eliminate the possibility of naming
  conflicts, while making debugging more straightforward.  Also, it should
  make multi-language applications more feasible, such as by providing
  Fortran intrinsics that get Fortran unit numbers given C FILE *
  descriptors.

* Possibly related to a new library, g77 should produce the equivalent
  of a gcc "main(argc, argv)" function when it compiles a main program
  unit, instead of compiling something that must be called by a library
  implementation of main().  This would do many useful things such as
  provide more flexibility in terms of setting up exception handling,
  not requiring programmers to start their debugging sessions with
  "breakpoint MAIN__" followed by "run", and so on.

* The back end needs to understand the difference between alignment
  requirements and desires.  E.g. on x86 machines, g77 currently imposes
  overly strict alignment requirements, due to the back end, but it
  would be useful for Fortran and C programmers to be able to override
  these _recommendations_ as long as they don't violate the actual
  processor _requirements_.

3. More extensions.

* Support INTEGER/REAL/COMPLEX equivalents for all applicable back-end-
  supported types (char, short int, int, long int, long long int, and long
  double).  This means providing intrinsic support &c as well, and for most
  machines will result in automatic support of INTEGER*1, INTEGER*2,
  INTEGER*8, and so on.

* Provide as the default source-line model a "pure visual" mode, where
  the interpretation of a source program in this mode can be accurately
  determined by a user looking at a traditionally displayed rendition
  of the program (assuming the user knows whether the program is fixed
  or free form).  That is, assume the user cannot tell tabs from spaces
  and cannot see trailing spaces on lines, but has canonical tab stops
  and, for fixed-form source, has the ability to always know exactly
  where column 72 is.  Then provide common alternate models (Digital, f2c,
  &c) via command-line options.  This includes allowing arbitrarily long
  lines for free-form source as well as fixed-form source and providing
  pedantic limits and diagnostics as appropriate, plus even on a non-
  tabbed fixed-form line, treating a line with the first non-blank character
  starting with column 6 being a digit as a continuation line (to effect
  the "<TAB>1continuationline..." behavior in "pure visual" mode).

* Intrinsics in constant expressions.  This, plus F90 intrinsics such
  as SELECTED_INT_KIND, would give users the ability to write clear,
  portable code.

* Provide more intrinsics for system services like EXIT.

* A FLUSH statement that does what many systems provide via CALL FLUSH,
  but that supports * as the unit designator (same unit as for PRINT).

* Finish support for V027 VXT PARAMETER statement (like PARAMETER in
  stc but type of destination is set from type of source expression).

* Consider adding a NUMERIC type to designate typeless numeric constants,
  named and unnamed.  The idea is to provide a forward-looking, effective
  replacement for things like the VXT PARAMETER statement when people
  really need typelessness in a maintainable, portable, clearly documented
  way.

* Allow DATA VAR/.../ to come before COMMON /.../ ...,VAR,....

* Character-type selector/cases for SELECT CASE.

* Option to initialize everything not explicitly initialized to "weird"
  (machine-dependent) values, e.g. NANs, bad (non-NULL) pointers, and
  "-0" integers.

* Add run-time bounds-checking of array/subscript references a la f2c.

* Output labels for use by debuggers that know how to support them.  Same
  with weirder things like construct names.  It is not yet known if any
  debug formats or debuggers support these.

* Provide necessary g77/gdb support to enable better native Fortran-language
  debugging.  In the meantime, see item about writing a file named CALLING,
  which would help users understand how various Fortran features are
  implemented at the debugger-visible level.

* Support the POSIX standard for Fortran.

* Support DEC-style lossage of virtual blanks at end of source line
  if some command-line option specified.  This affects cases where
  a character constant is continued onto the next line in a fixed-form
  source file -- g77, and many other compilers, virtually extend
  the continued line through column 72 with blanks that become part
  of the character constant, but DEC Fortran normally didn't.  (Fairly
  recently, at least one version of DEC Fortran was enhanced to provide
  the g77 behavior when a command-line option is specified, apparently due
  to demand from readers of the USENET group comp.lang.fortran.)
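
  The "virtual blanks" behavior amounts to treating every fixed-form line
  as if it were exactly 72 columns wide.  A sketch, assuming a line with
  no tabs and no trailing newline; pad_fixed_form_line is a hypothetical
  helper, not a g77 function.

```c
#include <assert.h>
#include <string.h>

#define FIXED_FORM_WIDTH 72

/* Copy `line` into `out`, truncating at column 72 and padding shorter
   lines with blanks so that `out` always holds exactly 72 characters
   plus a terminating NUL. */
static void pad_fixed_form_line(const char *line,
                                char out[FIXED_FORM_WIDTH + 1])
{
    size_t len = strlen(line);
    if (len > FIXED_FORM_WIDTH)
        len = FIXED_FORM_WIDTH;
    memcpy(out, line, len);
    memset(out + len, ' ', FIXED_FORM_WIDTH - len);
    out[FIXED_FORM_WIDTH] = '\0';
}
```

  With this padding, a character constant left open at the end of a short
  line picks up the blanks through column 72; the DEC-style option would
  skip the padding step.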

* Implement # directives in f771 so preprocessing works better.

* Consider a preprocessor designed specifically for Fortran to replace
  cpp -traditional.  There are several on the 'net to look at.

* Support OPEN(...,KEY=(...),...).

* OPEN(NOSPANBLOCKS,...) is treated as OPEN(UNIT=NOSPANBLOCKS,...), so a
  later UNIT= in the first example is invalid.  Make sure this is
  what DEC Fortran users expect.

* Currently we disallow READ(1'10) since it is an obnoxious syntax, but
  supporting it might be pretty easy if needed (more details needed, such
  as whether general expressions separated by an apostrophe are supported,
  or maybe the record number can be a general expression, &c).

* Support STRUCTURE/UNION/MAP/RECORD fully.  Currently no support at all
  for %FILL in STRUCTURE and related syntax, whereas the rest of the
  stuff has at least some parsing support.

* F90 and g77 probably disagree about label scoping relative to INTERFACE/
  END INTERFACE and their contained SUBROUTINE/FUNCTION interface bodies
  (blocks?).

* F90: ENTRY doesn't support RESULT() yet, since that was added after S8.112.

* F90: Empty-statement handling (10 ;;CONTINUE;;) probably isn't consistent
  with the final form of the standard (it was vague at S8.112).

* It seems to be an "open" question whether a file, immediately after being
  OPENed, is positioned at the beginning, the end, or wherever -- it might
  be nice to offer an option of opening to "undefined" status, requiring
  an explicit absolute-positioning operation to be performed before any
  other (besides CLOSE) to assist in making applications port to systems
  (some IBM?) that OPEN to the end of a file or some such thing.

4. Generalize the machine model.

* Switch to using REAL_VALUE_TYPE to represent REAL/DOUBLE constants
  exclusively so the target float format need not be required.  This
  means changing the way g77 handles initialization of aggregate areas
  having more than one type, such as REAL and INTEGER, because currently
  it initializes them as if they were arrays of "char" and uses the
  bit patterns of the constants of the various types in them to determine
  what to stuff in elements of the arrays.
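
  For illustration, the byte-image approach described above behaves like
  the following sketch, in which a REAL and an INTEGER are laid down as
  raw host bit patterns.  The names area_image and build_area are
  hypothetical; the point is that the result is only right when host and
  target agree on float format, integer size, and byte order.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical image of an aggregate area that holds a REAL followed by
   an INTEGER, initialized as an array of raw bytes. */
struct area_image {
    unsigned char bytes[sizeof(float) + sizeof(int)];
};

/* Stuff the host bit patterns of both constants into the byte array. */
static struct area_image build_area(float r, int i)
{
    struct area_image img;
    memcpy(img.bytes, &r, sizeof r);             /* REAL at offset 0 */
    memcpy(img.bytes + sizeof r, &i, sizeof i);  /* INTEGER right after */
    return img;
}
```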

* Rely more and more on back-end info and capabilities, especially in the
  area of constants (where having the g77 front-end's IL just store
  the appropriate tree nodes containing constants might be best).

* Suite of C and Fortran programs that a user/administrator can run on a
  machine to help determine the configuration for GNU Fortran before building
  and help determine if the compiler works (especially with whatever
  libraries are installed) after building.

5. Useful warnings.

* In global.c for file-wide case, still warn about (consistently) initial-
  padded COMMON area.  E.g. even if COMMON /X/ I,D is seen in every
  program unit in a file and I is INTEGER*1 while D is DOUBLE PRECISION
  (which, on many machines, means that for D to immediately follow I,
  I must not actually start until several (padding) bytes into /X/),
  warn that the padding is needed.
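
  The amount of initial padding in a case like this is simple arithmetic.
  A sketch, assuming the second member must sit on a multiple of its
  alignment and must immediately follow the first; initial_padding is a
  hypothetical helper, and the 8-byte alignment for DOUBLE PRECISION in
  the usage note is an assumption about the target.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of initial padding needed so that a second member with alignment
   `next_align` lands on an aligned offset while immediately following a
   first member of `first_size` bytes. */
static size_t initial_padding(size_t first_size, size_t next_align)
{
    return (next_align - first_size % next_align) % next_align;
}
```

  For COMMON /X/ I,D with a 1-byte I and an 8-byte-aligned D,
  initial_padding(1, 8) gives 7: I starts 7 bytes into /X/ and D starts
  at offset 8.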

* Support -pedantic more thoroughly, and use it only to generate
  warnings instead of rejecting constructs outright.  Have it warn:
  if a variable that dimensions an array is not a dummy or placed
  explicitly in COMMON (the 77 standard does not allow it to be
  placed in COMMON via EQUIVALENCE); if specification statements
  follow statement-function-definition statements; about all sorts of
  syntactic extensions.

* Warn about modifying DO variables via EQUIVALENCE.  This test might
  be useful in setting the "doiter" flag for a variable or even array
  reference within a loop, since that might produce faster code someday.

* Warn if the brain-damaged auto-decimal-convert-constant-to-REAL*8
  feature might be expected in source (if such warnings are enabled); for
  example, warn in cases like "parameter (pi=3.14159);foo=pi*3d0;" because
  apparently in these and other cases, some compilers append decimal zeros
  to the original single-precision constant and convert the result to
  double-precision -- though undoubtedly they use an easier equivalent
  implementation (and I suppose g77 could, too, if this kind of dangerous
  feature were actually more useful than just fixing the source).
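
  The difference between the two behaviors can be seen with the standard
  C conversion routines.  This is a sketch of the arithmetic only, not of
  how any particular compiler implements it; widen_single and
  reread_as_double are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Plain widening: round the source text to single precision first,
   then convert that float exactly to double. */
static double widen_single(const char *text)
{
    return (double) strtof(text, NULL);
}

/* The "append decimal zeros" behavior: treat the same source text as
   if it had been written as a double-precision constant. */
static double reread_as_double(const char *text)
{
    return strtod(text, NULL);
}
```

  For "3.14159" the two results differ, because the nearest float is not
  the nearest double; for a constant such as "0.5" that is exact in both
  precisions they agree, which is why the discrepancy can go unnoticed.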

6. Better documentation of how GCC works and how to port it.

* Write CALLING, a text file that describes rules for how g77 passes
  arguments to subroutines and functions, handles COMPLEX return values,
  handles alternate returns, and so on.

* Develop and maintain a list of gcc compiler options supported for .f
  files.

7. Better internals.

* Make FFEEXPR_contextANY, generally make expression handling focus
  more on critical syntax stuff, leaving semantics to callers.  E.g.
  anything a caller can check, semantically, let it do so, rather
  than having expr.c do it.  (Exceptions might include things like
  diagnosing "FOO(I--K:)=BAR" where FOO is a PARAMETER -- if it seems
  important to preserve the left-to-right-in-source order of production
  of diagnostics.)

* Come up with better naming conventions for -D to establish requirements
  to achieve desired "language" via proj.h.

* In global, clean up used tokens and ffewheres in _terminate_1.

* Replace sta outpooldisp mechanism with malloc_pool_use.

* Check for opANY in more places in com.c, std.c, and ste.c.

* Utility to read and check bad.def msgs and their references in the
  code, to make sure calls are consistent with message templates.

* Make a symbol dumper for standalone FFE so testing can be more exhaustive.

* Search and fix "&ffe" and similar so that "ffe...ptr..." macros are
  available instead (a good argument for wishing we could have written all
  this stuff in C++, I suppose).

* Some modules truly export the member names of their structures (and the
  structures themselves); maybe fix this, and fix other modules that just
  appear to as well (by appending "_", though it'd be ugly and probably
  not worth the time).

* Implement C macros RETURNS(value) and SETS(something,value) in proj.h
  and use them throughout FFE source code so they can be tailored to catch
  code writing into a RETURNS() or reading from a SETS().

* Decorate throughout with "const" and other such stuff.

* All F90 notational derivations in the source code are still based
  on the S8.112 version of the draft standard.  Probably should update
  to the official standard, or put documentation of the rules as used
  in the code...uh...in the code.

* Some ffebld_new calls (those outside of ffeexpr or inside but invoked
  via paths not involving ffeexpr_lhs or ffeexpr_rhs) might be creating things
  in improper pools, leading to such things staying around too long or
  (doubtful, but possible and dangerous) not long enough.

* Some ffebld_list_new (or whatever) calls might not be matched by
  ffebld_list_bottom (or whatever) calls, which might someday matter.

* Probably not doing clean things when we fail to EQUIVALENCE something
  due to alignment/mismatch or other problems -- they end up without
  ffestorag objects, so maybe the backend (and other parts of the front
  end) can notice that and handle like an "opANY" (do what it wants, just
  don't complain or crash).

8. Better diagnostics.

* Implement non-F90 messages (especially avoid mentioning F90 things g77
  doesn't yet support).

* Generally continue processing for warnings and recoverable (user)
  errors whenever possible -- don't gratuitously make bad code.  Example:
  INTRINSIC ZABS;CALL FOO(ZABS);END when -ff2c-intrinsics-disable should
  complain about passing ZABS but still compile, instead of rejecting
  the entire CALL statement (some of this is related to improving sta.c
  to do the statement-preprocessing work).

* If -fno-ugly, reject badly designed trailing-radix quoted (typeless)
  numbers, such as '123'O.

* For diagnostics, it would be nice to say "In procedure XYZ:" or something
  like that (see what gcc does, etc.); perhaps even the line number within
  the procedure would be appropriate.

* -fflag-ugly, -fflag-automatic, -fflag-vxt-not-f90 (syn. -fflag-f90-not-vxt),
  -fflag-f90 all should flag places (via diagnostics) where ambiguities
  are found.

* When FUNCTION and ENTRY point types disagree (CHARACTER lengths,
  type classes, &c), ANY-ize the offending ENTRY point and any _new_ dummies
  it specifies.

* Complain when list of dummies containing an adjustable dummy array does
  not also contain every variable listed in the dimension list of the
  adjustable array.  Currently g77 does complain about a variable that
  dimensions an array but doesn't appear in any dummy list or COMMON area,
  but this needs to be extended to catch cases where it doesn't appear in
  every dummy list that also lists any arrays it dimensions.

* Make sure things like RETURN 2HAB are invalid in both source forms (must
  be RETURN (2HAB), which probably still makes no sense but at least can
  be reliably parsed).  Fixed form rejects it, but not free form, except
  in a way that is a bit difficult to understand.

* Speed up and improve error handling for data when repeat-count is
  specified, as in "integer x(20);continue;data (x(i),j=1,20)/20*5/;end",
  so 20 messages don't come out after the important one.

* Warn if size discrepancies exist for various specifications of a
  named COMMON area.

9. More library routines.

* The sort of routines usually found in the BSD-ish libU77 should be
  provided in addition to the few utility routines in libF77.  Some of
  this work has been done.
