
                               GN SECURITY GUIDE
                                       
   
     _________________________________________________________________
   
   This document discusses possible threats to the integrity of a system
   running the _GN_ server and to the data on that server. It also
   considers what the maintainer can and should do to enhance the level
   of security. There are two types of threat which we address
   separately: (1) external, from a client or purported client on a
   remote host, and (2) local, from a user with an account on the server
   host.
   
External Threats

   The maintainer's objective is to prevent any unauthorized access to
   (or alteration of) files on the host system. Scripts run on the server
   with the CGI or exec mechanisms cause special problems and are
   discussed separately below. If you do not need to use any executable
   scripts you should uncomment the #define FORBID_EXEC line in the file
   config.h. This disallows any attempt to execute a command on your
   server and does not allow any data sent by a client to even be written
   to a temporary disk file. In this situation the key to _GN_ security
   is twofold: no document is served without explicit permission from the
   maintainer and nothing is written to disk on the server except the
   logfile.
   
   The basic philosophy of _GN_ security is that by default no client
   requests are granted. Permission to serve a document must be
   explicitly granted by the maintainer. This is done in one of two ways:
   a file named _.cache_ which is in the _GN_ data hierarchy may be
   served, and a file in the hierarchy which is listed in a file with the
   _.cache_ file format may be served.
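
   The default-deny rule can be sketched in a few lines of Python. This
   is an illustration only, not _GN_'s own code: the name may_serve is
   hypothetical, and the listing is assumed to have been parsed already
   into a set of file names, since the _.cache_ file format itself is
   not reproduced here.

```python
import posixpath


def may_serve(rel_path, listed_names):
    """Return True if rel_path may be served under the default-deny rule.

    rel_path      path of the requested file relative to the data root
    listed_names  names listed in the .cache file of the file's directory
                  (hypothetical: assumed to be parsed already)
    """
    name = posixpath.basename(rel_path)
    # A .cache file inside the hierarchy may itself be served.
    if name == ".cache":
        return True
    # Anything else needs an explicit listing; the default is refusal.
    return name in listed_names
```

   The point of the design is that forgetting to list a file fails
   safe: an unlisted file is refused rather than served.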
   
   Despite this strong foundation several additional steps are prudent.
   The most important is that the maintainer must ensure that no
   untrusted source has write access to any part of the _GN_ hierarchy.
   For example an "incoming" anonymous ftp directory should never be part
   of a _GN_ hierarchy, because an attacker could put a _.cache_ file
   there granting access to any file in the hierarchy.
   
   Starting with version 2.14 of _GN_ there is optional support for
   additional security against counterfeit _.cache_ files. This is
   achieved by specifying a userid or groupid (not both) for .cache
   files. To do this use the "-k uid#" or "-K gid#" option to gn or sgn.
   When invoked in this way gn or sgn will not serve a document unless
   the .cache file listing it has the prescribed owner or gid. This uid#
   or gid# should be that of the maintainer _not_ the one under which gn
   or sgn runs. If on your server all .cache files are created by a
   single user or a single group I strongly recommend using this option.
   Note that for a given .cache file in a directory to be served the
   owner of the _different_ .cache file (which lists the given one and
   resides in the directory above it) must be correct, not the owner of
   the given .cache file itself. In particular, the top level .cache is
   always allowed as it is not listed in any .cache file. Sometimes even
   I get confused about this :)
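
   The ownership test behind "-k uid#" and "-K gid#" might look roughly
   like the following Python sketch. The function name cache_owner_ok
   and its parameters are assumptions made for illustration; how gn and
   sgn actually perform the check (for instance whether symbolic links
   are followed) is not stated in this guide.

```python
import os


def cache_owner_ok(cache_path, uid=None, gid=None):
    """Check a .cache file against a required owner uid or group gid.

    Mirrors the idea behind "-k uid#" / "-K gid#": exactly one of uid
    or gid is given, and the .cache file that lists a document must
    match it before the document is served.
    """
    if (uid is None) == (gid is None):
        raise ValueError("give exactly one of uid or gid")
    st = os.stat(cache_path)  # assumption: symbolic links are followed
    if uid is not None:
        return st.st_uid == uid
    return st.st_gid == gid
```

   Remember that the file to test is the .cache file which lists the
   requested document, not the document itself.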
   
   Also the server should be run with a USER_ID (which can be set in
   config.h) with as few permissions as possible. Of course it must have
   read permission on all the files served but it should not have write
   permission for any directory or file other than its logfile. If the
   syslog option for logging is enabled there is not even any need for
   write permission on a logfile. A good practice is to have files in
   your hierarchy which you intend to serve be owned by the maintainer or
   their creator. They should be world readable (assuming they are for
   general consumption) but with restricted write permission. The files
   in your hierarchy should _not_ be owned by the user id under which
   _GN_ will run.
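
   These ownership and permission guidelines can be checked
   mechanically. The Python sketch below is a maintainer's audit aid,
   not part of _GN_; the name audit_hierarchy and the wording of the
   reported problems are made up for this example.

```python
import os
import stat


def audit_hierarchy(root, server_uid):
    """Yield (path, problem) for entries that break the guidelines."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue  # permission bits of symlinks are not meaningful
            if st.st_uid == server_uid:
                yield path, "owned by the server userid"
            if st.st_mode & stat.S_IWOTH:
                yield path, "world writable"
```

   Running it with the numeric userid under which the server runs
   should ideally produce no output at all.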
   
   _GN_ does not use the chroot system call to further restrict the files
   which the server can access. Doing so would enhance security at the
   expense of extra work for the maintainer. The effect of chroot would
   be to prevent the server from even internally accessing any file which
   is not in your data directory. If you are especially concerned about
   security you may wish to run one of the public domain TCP wrappers in
   conjunction with _GN_. This will simultaneously enhance security for
   other TCP services like anonymous ftp.
   
  CGI AND EXEC
  
   Enabling the use of scripts or programs run on the server greatly
   enhances its functionality but also increases the potential risk of an
   attack. The greatest danger is that even though the script is under
   the control of the maintainer, the arguments passed to it can be set
   by a potential attacker. When _GN_ invokes a script it is actually
   done by passing the script name and its arguments to /bin/sh. If the
   arguments were in no way restricted an attacker could supply an
   argument like "arg; rm *" which would result in "command arg; rm *"
   being executed by the server. For this reason in its default
   configuration _GN_ removes any character in the user supplied argument
   which has special meaning to the shell.
   
   More precisely, if any of the characters
   

     ; !` ' | \ * ? - ~ < > ^ ( ) [ ] { } & $ \r \n  / or \

   
   
   occurs in an argument for an item of "exec" type it is replaced by a
   space. There are other programming constructs which would allow the
   invocation of the command without the intervention of a shell. However
   using them and not altering the arguments would merely pass the risk
   on to the script writer. While _GN_ would then arguably be blameless
   in the event of damage, it would still be very easy for an
   inexperienced maintainer to have a script line like

      exec mail "$maintainer $1"

   If $1 contained a ; followed by a dangerous command this could be
   disastrous. For that reason I have chosen to check the arguments and
   replace dangerous characters.
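
   The replacement step for "exec" arguments can be illustrated with a
   short Python sketch. It is not _GN_'s implementation: the name
   sanitize_exec_arg is hypothetical, and the character set is
   transcribed from the list above as closely as it can be read, so
   treat the exact set as an assumption.

```python
# Characters transcribed from the list above; the exact set is an
# assumption, and the GN source remains the authoritative reference.
DANGEROUS = frozenset(";!`'|\\*?-~<>^()[]{}&$\r\n/")


def sanitize_exec_arg(arg):
    """Replace each shell-special character in arg with a space."""
    return "".join(" " if ch in DANGEROUS else ch for ch in arg)
```

   With this function the argument "arg; rm *" becomes "arg  rm  ", so
   the shell sees only harmless extra words rather than a second
   command.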
   
   CGI (Common Gateway Interface) scripts work somewhat differently than
   the exec type. The CGI specification does not permit altering
   dangerous characters in the arguments. Briefly, here's what happens
   with CGI arguments.
   
   If the request is of type POST, information is read from the client
   and put in a temporary file on disk. Then the script is executed with
   no arguments and its standard input comes from this file. Security is
   the responsibility of the script writer. It is not so dangerous to
   have arguments come from standard input but the script writer must
   still exercise care.
   
   If the request is of type GET, the arguments are examined to see if
   they contain an '='. If they do, it is assumed that this is a CGI form
   response (something like name=John&toppings=pepperoni). In this case
   the script is executed with no arguments and the argument string is
   placed in an environment variable where the script can read it. Again
   this is fairly safe but the script writer must exercise care.
   
   Finally if the GET request has arguments but no '=' it is assumed to
   be an ISINDEX type request and the script should be executed with the
   given arguments. While the CGI specification does not permit the
   altering of arguments, it does say that if the arguments pose any
   security problems it is permissible to put the string in an
   environment variable and execute the script without arguments, just as
   in the CGI forms case described above. _GN_ takes a very strict
   position here and views any of the characters in the list above as a
   security problem requiring this action. It is quite possible that this
   will cause some scripts not to work with some user inputs, but this
   has not appeared to be a problem.
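
   The three cases can be summarized as a small decision function. The
   Python sketch below only restates the rules of this section and is
   not taken from the _GN_ source. The name plan_cgi_call and the
   returned dictionary are invented for illustration. QUERY_STRING is
   the variable name used by the CGI specification, and splitting
   ISINDEX words on '+' follows that specification as well; URL
   decoding is left out for brevity. The character set is again
   transcribed from the list above and is an assumption.

```python
DANGEROUS = frozenset(";!`'|\\*?-~<>^()[]{}&$\r\n/")


def plan_cgi_call(method, query=""):
    """Decide how request data reaches a CGI script.

    Returns a dict with keys:
      argv   list of command-line arguments for the script
      env    environment variables to set
      stdin  True if the request body is fed to standard input
    """
    if method == "POST":
        # Body goes to the script on standard input; no arguments.
        return {"argv": [], "env": {}, "stdin": True}
    if not query:
        return {"argv": [], "env": {}, "stdin": False}
    env = {"QUERY_STRING": query}
    if "=" in query or any(ch in DANGEROUS for ch in query):
        # Form response, or an ISINDEX query judged unsafe:
        # the string is available in the environment only.
        return {"argv": [], "env": env, "stdin": False}
    # Safe ISINDEX query: words are separated by '+' in the URL.
    return {"argv": query.split("+"), "env": env, "stdin": False}
```

   For example, a GET request with the query "name=John&toppings=pepperoni"
   yields no arguments, while "gopher+servers" yields the two arguments
   "gopher" and "servers".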
   
   Exercise care when writing scripts. If possible avoid /bin/sh scripts
   in favor of something like perl, or even better C. Anytime you get
   input from the client make sure it contains no funny characters. For
   example the perl lines
   

      $Name =~ s/[^A-Za-z. ]//g;
      $Phone =~ s/[^0-9()\- ]//g;

   delete any characters except letters, '.' and spaces from $Name and
   any characters except digits, parentheses, '-', and space from $Phone.
   
Internal Threats

   Whenever untrusted users have accounts on a system there is risk
   involved. The objective of _GN_ is to ensure that running the server
   does not increase this risk. If the server is wisely managed, I
   believe this goal can be achieved. Here are some guidelines.
   
   If possible, make sure that no untrusted user has write access to
   any part of your _GN_ hierarchy. As mentioned above, an attacker with
   write access to your hierarchy can create a .cache file which will
   give access to anything in your hierarchy. Even worse, she can create
   a shell script and a .cache file permitting it to be executed. A good
   principle to keep in mind is: _Everyone with write access to any part
   of your data hierarchy has all the permissions of the userid under
   which your server runs!_

   Of course _sgn_ is a special case when run as root. If you want to
   use _sgn_ on one of the standard ports it must be run as root because
   only root can access these ports. The first thing the server does is
   open the necessary socket and then immediately change its userid to
   the one set in the config.h file. It is then the permissions of this
   userid that are effectively transferred to a user with write access
   to your _GN_ hierarchy. I do not recommend making _sgn_ setuid root.
   This would allow users without root access to start up the server on
   any privileged port. If individuals without root access need to start
   or stop a server they can do so on a non-privileged port.

   Sometimes it is not possible or desirable to deny write access to
   your _GN_ hierarchy. For example, you may want to allow all users to
   have a subdirectory of the hierarchy in which to publish their "home
   pages". The "FORBID_EXEC" directive mentioned above may be a good
   idea in this case, to prevent any execution of scripts. You should
   note that there is no way to use _.access_ files to prevent users on
   your system with write access to the data hierarchy from gaining
   access to files you are serving. They can simply make a symbolic link
   in their part of the hierarchy to the file you want to restrict and a
   .cache file permitting it to be served. Since the server has access
   to the restricted file it will serve it if it is listed in a .cache
   file.
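
   A maintainer who allows user subdirectories may want to look for
   such symbolic links from time to time. The Python sketch below is an
   audit aid only and does nothing to stop the server from following a
   link; the name symlinks_leaving is made up for this example.

```python
import os


def symlinks_leaving(subtree):
    """Return (link, target) pairs for symlinks under subtree whose
    resolved target lies outside subtree."""
    base = os.path.realpath(subtree)
    found = []
    for dirpath, dirnames, filenames in os.walk(base):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                if os.path.commonpath([base, target]) != base:
                    found.append((path, target))
    return found
```

   Calling it on one user's subdirectory lists every link there that
   points at a file elsewhere, including restricted files in other
   parts of the hierarchy.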
   
   The most important thing to remember in this situation is the
   principle cited above. All users have some permissions and are denied
   others. Remember that any permissions you grant to the userid under
   which _GN_ runs are also granted to every user who can create a .cache
   file in your data hierarchy.
   
   
     _________________________________________________________________
   
    John Franks -- Dept of Math. Northwestern University
    <john@math.nwu.edu>
