This chapter describes the syntax and semantics of the AutoGen definition file. In order to instantiate a template, you normally must provide a definitions file that identifies itself and contains some value definitions. Consequently, we keep it very simple. For "advanced" users, there are preprocessing directives, sparse arrays, named indexes and comments that may be used as well.
The definitions file is used to associate values with names. Every value is implicitly an array of values, even if there is only one value. Values may be either simple strings or compound collections of name-value pairs. An array may not contain both simple and compound members. Fundamentally, it is as simple as:
| prog-name = "autogen";
flag = {
    name      = templ_dirs;
    value     = L;
    descrip   = "Template search directory list";
};
 | 
For purposes of commenting and controlling the processing of the
definitions, C-style comments and most C preprocessing directives are
honored.  The major exception is that the #if directive is
ignored, along with all following text through the matching
#endif directive.  The C preprocessor is not actually invoked, so
C macro substitution is not performed.
The first definition in this file is used to identify it as an
AutoGen file.  It consists of the two keywords,
`autogen' and `definitions', followed by the default
template name and a terminating semicolon (;).  That is:
| AutoGen Definitions template-name; | 
Note that, other than the name template-name, the words `AutoGen' and `Definitions' are searched for without case sensitivity. Most lookups in this program are case insensitive.
Also, if the input contains more identification definitions, they will be ignored. This is done so that you may include (see section Controlling What Gets Processed) other definition files without an identification conflict.
AutoGen uses the name of the template to find the corresponding template file. It searches for the file in the following way, stopping when it finds the file:
If AutoGen fails to find the template file in one of these places, it prints an error message and exits.
Any name may have multiple values associated with it in the definition file. If there is more than one instance, the only way to expand all of the copies of it is by using the FOR (see section FOR - Emit a template block multiple times) text function on it, as described in the next chapter.
There are two kinds of definitions, `simple' and `compound'. They are defined thus (see section Finite State Machine Grammar):
| compound_name '=' '{' definition-list '}' ';'
simple_name '=' string ';'
no_text_name ';'
 | 
No_text_name is a simple definition with a shorthand empty string
value.  The string values for definitions may be specified using any of
several formation rules.
| 2.2.1 Definition List | ||
| 2.2.2 Double Quote String | ||
| 2.2.3 Single Quote String | ||
| 2.2.4 Shell Output String | ||
| 2.2.5 An Unquoted String | ||
| 2.2.6 Scheme Result String | ||
| 2.2.7 A Here String | ||
| 2.2.8 Concatenated Strings | 
definition-list is a list of definitions that may or may not
contain nested compound definitions.  Any such definitions may
only be expanded within a FOR block iterating over the
containing compound definition.  See section FOR - Emit a template block multiple times.
Here are, again, the example definitions from the previous chapter, with three additional name-value pairs: two with an empty value assigned (first and last), and a "global" group_name.
| autogen definitions list;
group_name = example;
list = { list_element = alpha;  first;
         list_info    = "some alpha stuff"; };
list = { list_info    = "more beta stuff";
         list_element = beta; };
list = { list_element = omega;  last;
         list_info    = "final omega stuff"; };
 | 
A double quote string follows the C-style escaping (\, \n, \f,
\v, etc.), plus octal character numbers specified as \ooo.
The difference from "C" is that the string may span multiple lines.
Like ANSI "C", a series of these strings, possibly intermixed with
single quote strings, will be concatenated together.
A single quote string is similar to the shell single-quote string.  However, the
escape character \ is honored before another escape character, a single quote '
or a hash character #.  The last of these is honored specifically
to disambiguate lines starting with a hash character inside
of a quoted string.  In other words,
| fumble = ' #endif '; | 
could be misinterpreted by the definitions scanner, whereas this would not:
| fumble = ' \#endif '; | 
As with the double quote string, a series of these, even intermixed with double quote strings, will be concatenated together.
A shell output string is assembled according to the same rules as the double quote string, except that there is no concatenation of strings and the resulting string is written to a shell server process. The definition takes on the value of the output string.
NB The text is interpreted by a server shell.  There may be left over
state from previous server shell processing.  This scriptlet may also leave
state for subsequent processing.  However, a cd to the original
directory is always issued before the new command is issued.
A simple string that does not contain white space may be left
unquoted.  The string must not contain any of the characters special to
the definition text (i.e. ", #, ', (,
), ,, ;, <, =, >, [,
], `, {, or }).  This list is subject to
change, but it will never contain underscore (_), period
(.), slash (/), colon (:), hyphen (-) or
backslash (\\).  Basically, if the string looks like it is a
normal DOS or UNIX file or variable name, and it is not one of two
keywords (`autogen' or `definitions') then it is OK to not
quote it, otherwise you should.
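The rule of thumb above can be sketched as a small check. This is only an illustration in Python, not AutoGen's scanner: the function name may_be_unquoted is hypothetical, the special character list is copied from the text (which says it is subject to change), and comparing the two keywords without case sensitivity is an assumption.

```python
# Characters the text lists as special to the definition syntax.
SPECIAL = set("\"#'(),;<=>[]`{}")
# Assumption: the two keywords are matched without case sensitivity.
KEYWORDS = {"autogen", "definitions"}


def may_be_unquoted(text):
    """Return True if `text` looks safe to leave unquoted in a definition."""
    if not text or any(ch.isspace() for ch in text):
        return False                      # empty, or contains white space
    if any(ch in SPECIAL for ch in text):
        return False                      # contains a special character
    return text.lower() not in KEYWORDS   # must not be a keyword


print(may_be_unquoted("templ_dirs"))      # prints True
print(may_be_unquoted("two words"))       # prints False
```

Anything the check rejects should simply be written as a quoted string.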
A scheme result string must begin with an open parenthesis (.
The scheme expression will be evaluated by Guile and the
value will be the result.  The AutoGen expression functions
are disabled at this stage, so do not use them.
A `here string' is formed in much the same way as a shell here doc. It is denoted with a doubled less than character and, optionally, a hyphen. This is followed by optional horizontal white space and an ending marker-identifier. This marker must follow the syntax rules for identifiers. Unlike the shell version, however, you must not quote this marker. The resulting string will start with the first character on the next line and continue up to but not including the newline that precedes the line that begins with the marker token. No backslash or any other kind of processing is done on this string. The characters are copied directly into the result string.
Here are two examples:
| str1 = <<-  STR_END
        $quotes = " ' `
        STR_END;
str2 = <<   STR_END
        $quotes = " ' `
        STR_END;
STR_END;
 | 
The first string contains no new line characters. The first character is the dollar sign, the last the back quote.
The second string contains one new line character.  The first character
is the tab character preceding the dollar sign.  The last character is
the semicolon after the STR_END.  That STR_END does not
end the string because it is not at the beginning of the line.  In the
preceding case, the leading tab was stripped.
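The extraction rule can be modeled in a few lines of Python. This is a sketch of the rule as described here, not the AutoGen scanner: the function name here_string is hypothetical, the input is assumed to start at the doubled less than character, and it is assumed that the hyphen form strips leading tabs only.

```python
def here_string(text):
    """Extract one here string from `text`, which must start with '<<'."""
    header, _, body = text.partition("\n")
    if not header.startswith("<<"):
        raise ValueError("not a here string")
    strip_tabs = header.startswith("<<-")
    marker = header[3 if strip_tabs else 2:].strip()
    kept = []
    for line in body.split("\n"):
        if strip_tabs:
            line = line.lstrip("\t")      # assumption: only tabs are stripped
        rest = line[len(marker):]
        # The marker ends the string only as a whole token at line start.
        if line.startswith(marker) and not (rest[:1].isalnum() or rest[:1] == "_"):
            return "\n".join(kept)        # newline before the marker is dropped
        kept.append(line)
    raise ValueError("ending marker not found")


str1 = here_string("<<-  STR_END\n\t$quotes = \" ' `\n\tSTR_END;\n")
str2 = here_string("<<   STR_END\n\t$quotes = \" ' `\n\tSTR_END;\nSTR_END;\n")
print(str1.count("\n"), str2.count("\n"))   # prints 0 1
```

Applied to the two examples, the model gives a first string that starts with the dollar sign and a second string that starts with a tab and ends with a semicolon, as described above.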
If single or double quote characters are used, then you also have the option, a la ANSI-C syntax, of implicitly concatenating a series of them together, with intervening white space ignored.
NB You cannot use directives to alter the string content. That is,
| str = "fumble"
#ifdef LATER
      "stumble"
#endif
      ;
 | 
will result in a syntax error. The preprocessing directives are not carried out by the C preprocessor. However,
| str = '"fumble\n"
#ifdef LATER
" stumble\n"
#endif
';
 | 
will work.  It will enclose the `#ifdef LATER'
and `#endif' in the string.  But it may also wreak
havoc with the definition processing directives.  Hash
characters in the first column should be disambiguated with
an escape \ or joined with the previous line:
"fumble\n#ifdef LATER....
In AutoGen, every name is implicitly an array of values. When values are assigned, they are usually implicitly assigned to the next highest slot. They can also be specified explicitly:
| mumble[9] = stumble; mumble[0] = grumble; | 
If, subsequently, you assign a value to mumble without an
index, its index will be 10, not 1.
If indexes are specified, they must not cause conflicts.
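The slot assignment rule can be modeled as follows. This Python sketch is a toy model, not AutoGen code: the class ValueArray and its assign method are hypothetical names, and treating a repeated explicit index as an error is an assumption about what "conflict" means.

```python
class ValueArray:
    """Toy model of one AutoGen name: a sparse array of values."""

    def __init__(self):
        self.slots = {}                   # index -> value; gaps are allowed

    def assign(self, value, index=None):
        if index is None:
            # No explicit index: use the slot after the highest one in use.
            index = max(self.slots) + 1 if self.slots else 0
        if index in self.slots:
            raise ValueError(f"index {index} is already in use")
        self.slots[index] = value
        return index


mumble = ValueArray()
mumble.assign("stumble", 9)
mumble.assign("grumble", 0)
print(mumble.assign("bumble"))            # prints 10
```

The unindexed assignment lands in slot 10 because slot 9 is the highest one in use, even though slots 1 through 8 are still empty.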
#define-d names may also be used for index values.
This is equivalent to the above:
| #define FIRST 0
#define LAST  9
mumble[LAST]  = stumble;
mumble[FIRST] = grumble;
 | 
Not all values in a range have to be filled in. If you leave gaps, then you will have a sparse array. This is fine (see section FOR - Emit a template block multiple times). You have your choice of iterating over all the defined values, or iterating over a range of slots. This:
| [+ FOR mumble +][+ ENDFOR +] | 
iterates over all and only the defined entries, whereas this:
| [+ FOR mumble (for-by 1) +][+ ENDFOR +] | 
will iterate over all 10 "slots". Your template will likely have to contain something like this:
| [+ IF (exist? (sprintf "mumble[%d]" (for-index))) +] | 
or else "mumble" will have to be a compound value that, say, always contains a "grumble" value:
| [+ IF (exist? "grumble") +] | 
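The difference between the two iteration styles can be pictured with a Python dictionary standing in for the sparse array. This is only a model of the behavior described above; the variable names are illustrative and nothing here is AutoGen's implementation.

```python
mumble = {0: "grumble", 9: "stumble"}     # sparse: slots 1..8 are empty

# Like "FOR mumble": visit only the defined entries, in index order.
defined_only = [(i, mumble[i]) for i in sorted(mumble)]

# Like "FOR mumble (for-by 1)": visit every slot from the lowest to the
# highest index, so the loop body needs an existence test.
every_slot = [(i, mumble.get(i)) for i in range(min(mumble), max(mumble) + 1)]

print(len(defined_only))                  # prints 2
print(len(every_slot))                    # prints 10
print([i for i, v in every_slot if v is not None])   # prints [0, 9]
```

The final line plays the role of the exist? test: only slots 0 and 9 hold a value.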
There are several methods for including dynamic content inside a definitions
file.  Two of them are mentioned above (see sections Shell Output String and
Scheme Result String) in the discussion of string formation rules.
Another method uses the #shell processing directive.
It will be discussed in the next section (see section Controlling What Gets Processed).
Guile/Scheme may also be used to create definitions.
When the Scheme expression is preceded by a backslash and single quote, then the expression is expected to be an alist of names and values that will be used to create AutoGen definitions.
This method can be used as follows:
| \'( (name  (value-expression))
    (name2 (another-expr))  )
 | 
This is entirely equivalent to:
| name = (value-expression); name2 = (another-expr); | 
Under the covers, the expression gets handed off to a Guile function
named alist->autogen-def in an expression that looks like this:
| (alist->autogen-def
    ( (name (value-expression))  (name2 (another-expr)) ) )
 | 
Definition processing directives can only be processed
if the '#' character is the first character on a line.  Also, if you
want a '#' as the first character of a line in one of your string
assignments, you should either escape it by preceding it with a
backslash `\' or embed it in the string as in "\n#".
All of the normal C preprocessing directives are recognized, though
several are ignored.  There is also an additional #shell -
#endshell pair.  Another minor difference is that AutoGen
directives must have the hash character (#) in column 1.
The final tweak is that #! is treated as a comment line.
Using this feature, you can use:  `#! /usr/local/bin/autogen'
as the first line of a definitions file, set the mode to executable
and "run" the definitions file as if it were a direct invocation of
AutoGen.  This was done for its hack value.
The ignored directives are:
`#assert', `#ident', `#let', `#pragma' and `#if'.
Note that when ignoring the #if directive, all intervening
text through its matching #endif is also ignored,
including the #else clause.
The AutoGen directives that affect the processing of definitions are:
#define name [ <text> ]
Will add the name to the define list as if it were a DEFINE program argument. Its value will be the first non-whitespace token following the name. Quotes are not processed.
After the definitions file has been processed, any remaining entries in the define list will be added to the environment.
#elif
This must follow an #if; otherwise it will generate an error.
It will be ignored.
#else
This must follow an #if, #ifdef or #ifndef.
If it follows the #if, then it will be ignored.  Otherwise,
it will change the processing state to the reverse of what it was.
#endif
This must follow an #if, #ifdef or #ifndef.
In all cases, this will resume normal processing of text.
#endmac
This terminates a "macdef", but must not ever be encountered directly.
#endshell
Ends the text processed by a command shell into autogen definitions.
#error [ <descriptive text> ]
This directive will cause AutoGen to stop processing and exit with a status of EXIT_FAILURE.
#if [ <ignored conditional expression> ]
#if expressions are not analyzed.  Everything from here
to the matching #endif is skipped.
#ifdef name-to-test
The definitions that follow, up to the matching #endif, will be
processed only if there is a corresponding -Dname command line
option or if a #define of that name has been previously encountered.
#ifndef name-to-test
The definitions that follow, up to the matching #endif, will be
processed only if there is not a corresponding -Dname
command line option or there was a canceling -Uname option.
#include unadorned-file-name
This directive will insert definitions from another file into the current collection. If the file name is adorned with double quotes or angle brackets (as in a C program), then the include is ignored.
#line
Alters the current line number and/or file name.  You may wish to
use this directive if you extract definition source from other files.
getdefs uses this mechanism so AutoGen will report the correct
file and approximate line number of any errors found in extracted
definitions.
#macdef
This is a new AT&T research preprocessing directive. Basically, it is a multi-line #define that may include other preprocessing directives.
#option opt-name [ <text> ]
This directive will pass the option name and associated text to the AutoOpts optionLoadLine routine (see section optionLoadLine). The option text may span multiple lines by continuing them with a backslash. The backslash/newline pair will be replaced with two space characters. This directive may be used to set a search path for locating template files. For example, this:
| #option templ-dirs $ENVVAR/dirname | 
will direct autogen to use the ENVVAR environment variable to find
a directory named dirname that (may) contain templates.  Since these
directories are searched in most recently supplied first order, search
directories supplied in this way will be searched before any supplied on
the command line.
#shell
Invokes $SHELL or `/bin/sh' on a script that should
generate AutoGen definitions.  It does this using the same server
process that handles the back-quoted ` text.
CAUTION: let not your $SHELL be csh.
#undef name-to-undefine
Will remove any entries from the define list that match the undef name pattern.
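The conditional directives described above can be modeled with a small line filter. This Python sketch illustrates the stated rules only and is not how AutoGen is implemented: the function filter_definitions is a hypothetical name, its defined argument stands in for -Dname command line options, #undef is simplified to an exact name match, and all other directives are silently dropped.

```python
def filter_definitions(lines, defined=()):
    """Return the definition lines that survive conditional processing."""
    names = set(defined) | {"__autogen__"}    # __autogen__ is always defined
    stack = []        # one [opened_by_if, active] pair per open conditional
    out = []
    for line in lines:
        active = all(entry[1] for entry in stack)
        if not line.startswith("#"):          # directives need '#' in column 1
            if active:
                out.append(line)
            continue
        words = line[1:].split()
        directive = words[0] if words else ""
        if directive == "if":
            stack.append([True, False])       # #if bodies are always skipped
        elif directive in ("ifdef", "ifndef"):
            known = len(words) > 1 and words[1] in names
            stack.append([False, known == (directive == "ifdef")])
        elif directive == "else":
            if not stack:
                raise ValueError("#else without a conditional")
            if not stack[-1][0]:              # ignored when it follows #if
                stack[-1][1] = not stack[-1][1]
        elif directive == "elif":
            if not stack or not stack[-1][0]:
                raise ValueError("#elif must follow #if")
        elif directive == "endif":
            if not stack:
                raise ValueError("#endif without a conditional")
            stack.pop()
        elif directive == "define" and active and len(words) > 1:
            names.add(words[1])
        elif directive == "undef" and active and len(words) > 1:
            names.discard(words[1])
    if stack:
        raise ValueError("unterminated conditional")
    return out


example = ["#ifdef FEATURE", "sub_name = second;", "#else",
           "sub_name = first;", "#endif"]
print(filter_definitions(example, defined=["FEATURE"]))   # prints ['sub_name = second;']
print(filter_definitions(example))                        # prints ['sub_name = first;']
```

Note that an #if region never contributes any lines in this model, whatever its expression says, which matches the description of #if as an ignored directive.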
When AutoGen starts, it tries to determine several names from the
operating environment and put them into environment variables for use in
both #ifdef tests in the definitions files and in shell scripts
with environment variable tests.  __autogen__ is always defined.
For other names, AutoGen will first try to use the POSIX version of the
sysinfo(2) system call.  Failing that, it will try for the POSIX
uname(2) call.  If neither is available, then only
"__autogen__" will be inserted into the environment.
In all cases, the associated names are converted to lower case, surrounded
by doubled underscores and non-symbol characters are replaced with
underscores.
With Solaris on a sparc platform, sysinfo(2) is available.
The following strings are used:
SI_SYSNAME (e.g., "__sunos__")
SI_HOSTNAME (e.g., "__ellen__")
SI_ARCHITECTURE (e.g., "__sparc__")
SI_HW_PROVIDER (e.g., "__sun_microsystems__")
SI_PLATFORM (e.g., "__sun_ultra_5_10__")
SI_MACHINE (e.g., "__sun4u__")
For Linux and other operating systems that only support the
uname(2) call, AutoGen will use these values:
sysname (e.g., "__linux__")
machine (e.g., "__i586__")
nodename (e.g., "__bach__")
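The naming rule described above (lower case, doubled underscores, non-symbol characters replaced) can be sketched in Python. The function name predefine_name is hypothetical, and treating letters, digits and underscore as the only symbol characters is an assumption; the raw input strings are illustrative, not values reported by any particular system.

```python
import re


def predefine_name(raw):
    """Turn a system-reported string into a predefined name."""
    # Lower-case the string, then replace anything that is not a letter,
    # digit or underscore with an underscore (assumed symbol character set).
    body = re.sub(r"[^a-z0-9_]", "_", raw.lower())
    return f"__{body}__"


print(predefine_name("Linux"))             # prints __linux__
print(predefine_name("Sun Microsystems"))  # prints __sun_microsystems__
```

Names produced this way are what an #ifdef test in a definitions file would compare against.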
By testing these pre-defines in your definitions, you can select
pieces of the definitions without resorting to writing shell
scripts that parse the output of uname(1).  You can also
segregate real C code from autogen definitions by testing for
"__autogen__".
| #ifdef __bach__
location = home;
#else
location = work;
#endif
 | 
The definitions file may contain C and C++ style comments.
| /*
 * This is a comment.  It continues for several lines and closes
 * when the characters '*' and '/' appear together.
 */
// this comment is a single line comment
 | 
This is an extended example:
| autogen definitions template-name;
/*
 *  This is a comment that describes what these
 *  definitions are all about.
 */
global = "value for a global text definition.";
/*
 *  Include a standard set of definitions
 */
#include standards.def
a_block = {
    a_field;
    a_subblock = {
        sub_name  = first;
        sub_field = "sub value.";
    };
#ifdef FEATURE
    a_subblock = {
        sub_name  = second;
    };
#endif
};
 | 
The preprocessing directives and comments are not part of the grammar. They are handled by the scanner/lexer. The following was extracted directly from the generated defParse-fsm.c source file. The "EVT:" is the token seen, the "STATE:" is the current state and the entries in this table describe the next state and the action to take. Invalid transitions were removed from the table.
| dp_trans_table[ DP_STATE_CT ][ DP_EVENT_CT ] = {
  /* STATE 0:  DP_ST_INIT */
  { { DP_ST_NEED_DEF, NULL },                       /* EVT:  autogen */
  /* STATE 1:  DP_ST_NEED_DEF */
    { DP_ST_NEED_TPL, NULL },                       /* EVT:  definitions */
  /* STATE 2:  DP_ST_NEED_TPL */
    { DP_ST_NEED_SEMI, &dp_do_tpl_name },           /* EVT:  var_name */
    { DP_ST_NEED_SEMI, &dp_do_tpl_name },           /* EVT:  other_name */
    { DP_ST_NEED_SEMI, &dp_do_tpl_name },           /* EVT:  string */
  /* STATE 3:  DP_ST_NEED_SEMI */
    { DP_ST_NEED_NAME, NULL },                      /* EVT:  ; */
  /* STATE 4:  DP_ST_NEED_NAME */
  { { DP_ST_NEED_DEF, NULL },                       /* EVT:  autogen */
    { DP_ST_DONE, &dp_do_need_name_end },           /* EVT:  End-Of-File */
    { DP_ST_HAVE_NAME, &dp_do_need_name_var_name }, /* EVT:  var_name */
    { DP_ST_HAVE_VALUE, &dp_do_end_block },         /* EVT:  } */
  /* STATE 5:  DP_ST_HAVE_NAME */
    { DP_ST_NEED_NAME, &dp_do_empty_val },          /* EVT:  ; */
    { DP_ST_NEED_VALUE, &dp_do_have_name_lit_eq },  /* EVT:  = */
    { DP_ST_NEED_IDX, NULL },                       /* EVT:  [ */
  /* STATE 6:  DP_ST_NEED_VALUE */
    { DP_ST_HAVE_VALUE, &dp_do_str_value },         /* EVT:  var_name */
    { DP_ST_HAVE_VALUE, &dp_do_str_value },         /* EVT:  other_name */
    { DP_ST_HAVE_VALUE, &dp_do_str_value },         /* EVT:  string */
    { DP_ST_HAVE_VALUE, &dp_do_str_value },         /* EVT:  here_string */
    { DP_ST_HAVE_VALUE, &dp_do_str_value },         /* EVT:  number */
    { DP_ST_NEED_NAME, &dp_do_start_block },        /* EVT:  { */
  /* STATE 7:  DP_ST_NEED_IDX */
    { DP_ST_NEED_CBKT, &dp_do_indexed_name },       /* EVT:  var_name */
    { DP_ST_NEED_CBKT, &dp_do_indexed_name },       /* EVT:  number */
  /* STATE 8:  DP_ST_NEED_CBKT */
    { DP_ST_INDX_NAME, NULL }                       /* EVT:  ] */
  /* STATE 9:  DP_ST_INDX_NAME */
    { DP_ST_NEED_NAME, &dp_do_empty_val },          /* EVT:  ; */
    { DP_ST_NEED_VALUE, NULL },                     /* EVT:  = */
  /* STATE 10:  DP_ST_HAVE_VALUE */
    { DP_ST_NEED_NAME, NULL },                      /* EVT:  ; */
    { DP_ST_NEED_VALUE, &dp_do_next_val },          /* EVT:  , */
 | 
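To see how the table drives parsing, the next-state entries can be transcribed into a dictionary and walked by hand. This Python sketch keeps only the state transitions from the listing above (with the DP_ST_ prefix dropped) and omits the action routines; the function walk and the hand-written event list are illustrative and not part of AutoGen.

```python
# Next-state table transcribed from the listing above; actions are omitted.
TRANSITIONS = {
    "INIT":       {"autogen": "NEED_DEF"},
    "NEED_DEF":   {"definitions": "NEED_TPL"},
    "NEED_TPL":   dict.fromkeys(["var_name", "other_name", "string"], "NEED_SEMI"),
    "NEED_SEMI":  {";": "NEED_NAME"},
    "NEED_NAME":  {"autogen": "NEED_DEF", "End-Of-File": "DONE",
                   "var_name": "HAVE_NAME", "}": "HAVE_VALUE"},
    "HAVE_NAME":  {";": "NEED_NAME", "=": "NEED_VALUE", "[": "NEED_IDX"},
    "NEED_VALUE": {**dict.fromkeys(["var_name", "other_name", "string",
                                    "here_string", "number"], "HAVE_VALUE"),
                   "{": "NEED_NAME"},
    "NEED_IDX":   dict.fromkeys(["var_name", "number"], "NEED_CBKT"),
    "NEED_CBKT":  {"]": "INDX_NAME"},
    "INDX_NAME":  {";": "NEED_NAME", "=": "NEED_VALUE"},
    "HAVE_VALUE": {";": "NEED_NAME", ",": "NEED_VALUE"},
}


def walk(events):
    """Return the list of states visited, or raise on an invalid transition."""
    state = "INIT"
    visited = [state]
    for event in events:
        try:
            state = TRANSITIONS[state][event]
        except KeyError:
            raise ValueError(f"event {event!r} is not valid in state {state}") from None
        visited.append(state)
    return visited


# Events for:  autogen definitions list;  mumble[9] = stumble;
events = ["autogen", "definitions", "var_name", ";",
          "var_name", "[", "number", "]", "=", "var_name", ";", "End-Of-File"]
print(walk(events)[-1])                   # prints DONE
```

The table alone does not track how deeply compound definitions are nested; in this reading that bookkeeping belongs to the omitted action routines.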
There are several methods for supplying data values for templates.
It is entirely possible to write a template that does not depend upon
external definitions.  Such a template would likely have an unvarying
output, but be convenient nonetheless because of an external library
of either AutoGen or Scheme functions, or both.  This can be accommodated
by providing the --override-tpl and --no-definitions
options on the command line.  See section Invoking autogen.
AutoGen behaves as a CGI server if the definitions input is from stdin
and the environment variable REQUEST_METHOD is defined
and set to either "GET" or "POST".  See section AutoGen as a CGI server.  Obviously,
all the values are constrained to strings because there is no way
to represent nested values.
AutoGen comes with a program named xml2ag.  Its output can
either be redirected to a file for later use, or the program can
be used as an AutoGen wrapper.  See section Invoking xml2ag.
The introductory template example (see section A Simple Example) can be rewritten in XML as follows:
| <EXAMPLE  template="list.tpl">
<LIST list_element="alpha"
      list_info="some alpha stuff"/>
<LIST list_info="more beta stuff"
      list_element="beta"/>
<LIST list_element="omega"
      list_info="final omega stuff"/>
</EXAMPLE>
 | 
A more XML-normal form might look like this:
| <EXAMPLE template="list.tpl">
<LIST list_element="alpha">some alpha stuff</LIST>
<LIST list_element="beta" >more beta stuff</LIST>
<LIST list_element="omega">final omega stuff</LIST>
</EXAMPLE>
 | 
but you would have to change the template list_info references
into text references.
Of course. :-)
This document was generated by Bruce Korb on April 9, 2006 using texi2html 1.76.