                                                              -*- org -*-
#+TITLE: GNU Shepherd NEWS — history of user-visible changes
#+STARTUP: content hidestars

Copyright © 2002, 2003 Wolfgang Jährling
Copyright © 2013-2014, 2016, 2018-2020, 2022-2023 Ludovic Courtès <ludo@gnu.org>

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
  notice and this notice are preserved.

Please send Shepherd bug reports to bug-guix@gnu.org.

* Changes in 0.10.0

** Distinguish ‘starting’ and ‘stopping’ intermediate service statuses

In previous version, a service would be either “running” or “stopped”.  The
intermediate states “starting” and “stopping” are now properly captured and
you can see them when running ‘herd status’.

** ‘start’ and ‘stop’ block when service is already being started/stopped
  <https://issues.guix.gnu.org/54786#4>

With previous version, a client running ‘herd start SERVICE’ while SERVICE is
already being started would cause shepherd to attempt to start a second
instance of that service, ultimately resulting in confusion, disappointment,
and frustration.

This is no longer the case: when a service is already being started/stopped,
additional invocation of ‘herd start’ or ‘herd stop’ now block until the
service is running/stopped.

** ‘shepherd’ starts services in parallel

Services started with ‘start-in-the-background’ and more generally service
dependencies get started in parallel.  This can reduce startup times in case
of a “wide” service dependency graph with some services that take a while to
start.

** ‘shepherd’ keeps track of failures and status change times

For each service, shepherd maintains an event log including the time of recent
status changes as well as the time of startup failures, if any.  The ‘herd
status SERVICE’ command now shows the time when the service entered its
current status and whether it failed to start; ‘herd status’ also prominently
lists services that failed to start.

** New ‘herd log’ command

Related to the previous item, the new ‘herd log’ command displays an aggregate
of the service event logs, showing the time at which each service changed
statuses.

** New ‘herd graph’ command

The new ‘herd graph’ command emits a Graphviz/Dot representation of the
service dependency graph, which can be viewed for example with ‘xdot’:

  herd graph | xdot -

Guix System users get similar information with ‘guix system shepherd-graph’
(and likewise for Guix Home).  The difference here is that this reflects the
current system status, showing transient services, services that failed to
start, and so on.

** ‘herd’ output is colorized

At long last!  We hope you’ll enjoy a little bit of coloring to highlight
important bits in the output of various commands.

** New services shipped: ‘monitoring’ and ‘repl’

The Shepherd now ships with optional services—see “Service Collection” in the
manual.  The ‘monitoring’ service logs resource usage of the ‘shepherd’
process itself.  The ‘repl’ service runs a read-eval-print loop (REPL) in the
‘shepherd’ so you can hack it live—enjoy it, but handle it with care!

** Socket-actived, systemd-style services can now be started eagerly

The ‘make-systemd-constructor’ procedure has a new #:lazy-start? parameter.
It defaults to #true, meaning that the process is started lazily, on the first
connection to one of its sockets, as was the case in 0.9.x.  Passing
#:lazy-start? #false instructs shepherd to instead start the process eagerly,
as soon as the listening sockets are ready.

This is useful for services that require socket activation as a startup
synchronization mechanism, yet are expected to run as soon as possible.  An
example is ‘guix publish --advertise’: it should be started eagerly so it can
start advertising itself via Avahi.

** Each registered name maps to exactly one service

There used to be a fuzzy notion of “conflicting services”, when a given
service name could potentially refer to more than one service.  This has
proved to be confusing more than anything else; now, each registered service
name refers to exactly one service.  The interface related to that feature,
such as the ‘conflicts-with’ method, is done.

** For systemd and inetd services, retry ‘bind’ upon EADDRINUSE
   <https://issues.guix.gnu.org/58485#13>

Services started with ‘make-systemd-constructor’ and ‘make-inetd-constructor’
will now retry several times when ‘bind’ returns EADDRINUSE (“Address already
in use”) for their listening socket(s).

** ‘system’ and ‘make-system-constructor’ are now non-blocking
   <https://issues.guix.gnu.org/61803>

In versions up to 0.9.3, calling Guile’s ‘system’ procedure (which is what
‘make-system-constructor’ does) would block the ‘shepherd’ process until the
shell spawned by ‘system’ has terminated.  This is no longer the case.

** GOOPS interface is deprecated

When it was created in 2002, the Shepherd (née dmd) embraced GOOPS, Guile’s
object-oriented programming system, then a brand new and promising approach
for 21st century programs.  In hindsight, while there were a couple of classes
and a bunch of methods, the code base was not really making much use of GOOPS.
The current maintainer deemed it unnecessary and encouraging a programming
style at odds with the shiny horizon of purely functional, actor-style
programming.

The GOOPS interface is still available in 0.10.0; for example, you can still
write ~(make <service> #:provides …)~ in your configuration file.  However,
GOOPS support will be removed in the next major series, most likely labeled
1.0.

A new interface has been defined.  Check out the “Legacy GOOPS Interface”
section of the manual for more information, and email guix-devel@gnu.org if
you have any questions or concerns.

** Interfaces removed and changed

Several obscure or undocumented interfaces were removed:

  - support for the ‘unknown’ service;
  - support for “persistency” (sic);
  - the ‘cd’ action of the ‘root’ service;
  - the ‘launch-service’ procedure of (shepherd service).

New deprecations:

  - ‘make-actions’ is deprecated in favor of ‘actions’;
  - calling ‘register-services’ with an arbitrary number of arguments is now
    deprecated; you should now call it with a single argument, the list of
    services to register.

** Major internal overhaul

As you can guess from the list of user-visible changes above, the Shepherd has
undergone a major internal overhaul.  The 0.9.x series introduced the use of
Fibers, Guile’s lightweight concurrent facility; shepherd took advantage of it
notably with the introduction of systemd-style and inetd-style services.  This
new stable series takes it further.

In particular, each <service> record has an associated fiber called the
“service controller”.  Following the actor model, each of these fibers reacts
to messages it receives, be they event notification—e.g., process
termination—or user requests—e.g., querying the service status, requesting
that the service be stopped.  Other noteworthy actors include the “process
monitor” and the “service registry”.

This has allowed us to address a number of race conditions while also leading
to clearer code with linear flows that one can more easily reason about.
Overall, it makes the code base much more pleasant to work with and certainly
easier to hack than other implementations mired in the “callback hell”.

Documentation has been overhauled as well to reflect all these changes.  Check
out the new subsections under “Services” for more information.

* Changes in version 0.9.3

** Service ‘stop’ is now synchronous
   <https://issues.guix.gnu.org/58485>

Previously, ‘herd stop SERVICE’ would send SIGTERM to the service’s process
and immediately move on without waiting for the process to actually terminate.
This could cause problems for example when running ‘herd restart SERVICE’:
there was a possibility that a new instance of the service would be spawned
before the previous one had terminated.

This is now fixed: ‘stop’ only returns once the process has actually
terminated.  Furthermore, the destructor returned by ‘make-kill-destructor’
sends SIGKILL after some grace period has expired if the process is still
around; this is configurable with #:grace-period and
‘default-process-termination-grace-period’.

** Non-blocking replacement for ‘system*’
   <https://issues.guix.gnu.org/56674>.

Service code can now call ‘system*’ lightheartedly: shepherd installs a
cooperative, non-blocking replacement for Guile’s ‘system*’ procedure.
Concretely, it means that it’s OK to use ‘system*’, say, in the ‘start’ method
of a service: it won’t block shepherd, one can still interact with it with
‘herd’.

** Fewer continuation barriers

The ‘stop’ method of services, and ‘eval’ and ‘load’ actions of the ‘root’
service, and a few other points acted as “continuation barriers”, meaning that
user code would not be allowed to suspend the current fiber for example by
calling the ‘sleep’ procedure from (fiber).  These limitations have been
lifted.

** Reduced memory consumption while logging

Service output logging allocates less memory than before.

** Updated translations: ro, sr

* Changes in version 0.9.2
** File descriptors used internally are now all marked as close-on-exec

Previously, services started indirectly with ‘exec-command’ (which is usually
the case) would not inherit any file descriptor from shepherd because
‘exec-command’ would explicitly close all of them.  However, services started
with ‘make-system-constructor’ and processes created by some other means, such
as calling ‘system*’, would inherit some of those descriptors, giving them
more authority than intended.

The change here consists in marking all internally-used file descriptors as
“close-on-exec” (O_CLOEXEC), a feature that’s been available on GNU/Linux and
GNU/Hurd for years but that so far wasn’t used consistently in shepherd.  This
is now fixed.  As a side-effect, the file-descriptor-closing loop in
‘exec-command’ is now gone.

** Client connections with ‘herd’ are non-blocking

Previously, a misbehaving client could send an incomplete command
(s-expression), causing shepherd to hang while waiting for completion.  (Note
that said client is required to run with the same UID as shepherd, so this was
not a security issue.)

** Directory of log file is created if it doesn’t exist

When a service constructor is passed ‘#:log-file "/var/log/foo/bar.log"’,
shepherd now created /var/log/foo if it doesn’t exist; previously it would
fail gracelessly.

* Changes in version 0.9.1
** ‘make-inetd-constructor’ now accepts a list of endpoints

In 0.9.0, ‘make-inetd-constructor’ would take a single address as returned by
‘make-socket-address’.  This was insufficiently flexible since it didn’t let
you have an inetd service with multiple endpoints.  ‘make-inetd-constructor’
now takes a list of endpoints, similar to what ‘make-systemd-constructor’
already did.

For compatibility with 0.9.0, if the second argument to
‘make-systemd-constructor’ is an address, it is automatically converted to a
list of endpoints.  This behavior will be preserved for at least the whole
0.9.x series.

** ‘AF_INET6’ endpoints are now interpreted as IPv6-only

In 0.9.0, using an ‘AF_INET6’ endpoint for ‘make-systemd-constructor’ would
usually have the effect of making the service available on both IPv6 and IPv4.
This is due to the default behavior of Linux, which is to bind IPv6 addresses
as IPv4 as well (the default behavior can be changed by running
‘sysctl net.ipv6.bindv6only 1’).

‘AF_INET6’ endpoints are now interpreted as IPv6-only.  Thus, if a service is
to be made available both as IPv6 and IPv4, two endpoints must be used.

** ‘shepherd’ reports whether a service is transient
** ‘herd status’ shows whether a service is transient
** Fix possible file descriptor leak in ‘make-inetd-constructor’
   (<https://issues.guix.gnu.org/55223>)
** Fix value of ‘LISTEN_FDNAMES’ variable set by ‘make-systemd-constructor’
** Fix crash when logging IPv6 addresses
** ‘start-in-the-background’ returns *unspecified* instead of zero values

* Changes in version 0.9.0
** The Shepherd now depends on Fibers 1.1.0 or later
** ‘shepherd’ no longer blocks when waiting for PID files, etc.
** Services without #:log-file have their output written to syslog
** Services with #:log-file have their output timestamped
** New ‘make-inetd-constructor’ procedure for inetd-style services
** New ‘make-systemd-constructor’ for systemd-style “socket activation”
** New ‘start-in-the-background’ procedure
** Services can now be “transient” (see the manual for details)
** New #:supplementary-groups parameter for ‘make-forkexec-constructor’
** New #:create-session? parameter for ‘make-forkexec-constructor’
** New #:resource-limits parameter for ‘make-forkexec-constructor’
** Log file of unprivileged ‘shepherd’ is now under $XDG_DATA_DIR
** Do not reboot upon ‘quit’ when running as root but not PID 1
** Improved documentation and examples
** The Shepherd can no longer be built with Guile 2.0
** Work around Guile 3.0.[5-7] compiler bug
   (<https://bugs.gnu.org/47172>)
** Updated translations: da, de, sv, uk

* Changes in version 0.8.1
** Fix race condition that could lead shepherd to stop itself
   (<https://bugs.gnu.org/40981>)
** Use ‘signalfd’ on GNU/Linux to improve efficiency and simplify code
** Outdated bits have been removed from the manual
** Updated translation: sv

* Changes in version 0.8.0
** Kill the whole process group when the PID file doesn’t show up
   (<https://bugs.gnu.org/40672>)
** ‘make-kill-destructor’ kills the process group
** New ‘default-pid-file-timeout’ SRFI-39 parameter
** New #:file-creation-mask parameter for ‘make-forkexec-constructor’
** ‘make-forkexec-constructor’ creates log files as #o640
   (<https://bugs.gnu.org/40405>)
** Improve documentation and examples
** Ensure man pages are up to date
   (<https://bugs.gnu.org/39694>)
** Fix compilation on systems without ‘prctl’ such as GNU/Hurd
** Remove kludge that would send SIGALRM every second
** Address “error in finalization thread” warning
** ‘make-forkexec-constructor’ no longer supports old calling convention

The first argument must be a list of strings.  Passing several strings has
been deprecated since 0.1.

* Changes in version 0.7.0
** New crash handler allows shepherd as PID 1 to dump core on GNU/Linux
** (shepherd service) now exports ‘default-environment-variables’
** ‘make-forkexec-constructor’ no longer removes log file
** Disable reboot on ctrl-alt-del before loading the config file
   (<https://bugs.gnu.org/35996>)
** Exception handling adjusted for Guile 3.0.0
* Changes in version 0.6.1
** ‘herd status’ distinguishes between “stopped” and “one-shot” services
** ‘read-pid-file’ gracefully handles PID files not created atomically
   (<https://bugs.gnu.org/35550>)
** ‘shepherd’ no longer crashes when asked to load files with syntax errors
   (<https://bugs.gnu.org/35631>)
** New translations: de, sk
** Updated translations: da, es, fr, pt_BR
* Changes in version 0.6.0
** Services can now be “one-shot” (see the manual for details)
** ‘shepherd’ deletes its socket file upon termination
** ‘herd stop S’ is no longer an error when S is already stopped
** ‘herd’ exits with non-zero when executing an action that fails
** ‘shepherd’ ignores reboot(2) errors when running in a container
** Translation of error messages has been fixed
** New translation: ta (Tamil)
** Updated translations: da, es, fr, pt_BR, sv, ta, uk, zh_CN
* Changes in version 0.5.0
** Services now have a ‘replacement’ slot
** Restarting a service now restarts its dependent services as well
** Gracefully halt upon ctrl-alt-del when running as PID 1 on GNU/Linux
** Actions can now be invoked on services not currently running
** Guile >= 2.0.13 is now required; Guile 3.0 is supported
** Unused runlevel code has been removed
** Updated translations: es, fr, pt_BR, sv
* Changes in version 0.4.0
** When running as non-root, keep track of forked processes
** When running as root, log to /dev/log (syslogd) or /dev/kmsg by default
** ‘exec-command’ opens log file in append mode
** Add native language support (5 languages currently supported)
** ‘log-output-port’ is now a SRFI-39 parameter
** New ‘make-shepherd-output-port’ in lieu of ‘shepherd-output-port’
** Fix non-deterministic test suite issues

* Changes in version 0.3.2
** ‘herd status’ displays a bullet list
** No longer crash when ‘enable’ & co. are passed a wrong argument number
   (<http://bugs.gnu.org/24684>)
** ‘make-forkexec-constructor’ has a new #:pid-file-timeout parameter
** Processes that failed to create their PID file are now killed
** .go files are now installed in LIBDIR/guile/2.X/site-ccache
** Build system supports compilation with Guile 2.2

* Changes in version 0.3.1
** Process respawn limit is honored again (regression introduced in 0.3)
** ‘herd status SERVICE’ displays the last respawn time, if any
** (shepherd service) exports ‘&action-runtime-error’ and related bindings
** ‘mkdir-p’ adjusted to cope with GNU/Hurd file system behavior

* Changes in version 0.3

** GNU dmd becomes the GNU Shepherd

The GNU Shepherd herds your daemons!
See http://www.gnu.org/software/shepherd/#history for details.
As a side effect, many incompatible changes were made:

  - The ‘dmd’ command was renamed to ‘shepherd’.
  - The ‘deco’ command was renamed to ‘herd’.
  - The default system-wide config file is now /etc/shepherd.scm.
  - The default per-user config file is now ~/.config/shepherd/init.scm.
  - The special ‘dmd’ service is now called ‘root’ and ‘shepherd’.  Thus,
    instead of:
       deco load dmd foo.scm
    you would now type:
       herd load root foo.scm
  - Guile modules now live in the (shepherd …) name space.

** ‘herd status’ and ‘herd detailed-status’ assumes the ‘root’ service

That is, ‘herd status’ is equivalent to ‘herd status root’.

** ‘herd help’ returns a meaningful help message
** ‘shepherd’ stops itself when it receives SIGINT

This is what happens when ‘shepherd’ is running as PID 1 on GNU/Linux and
ctrl-alt-del is pressed (see ctrlaltdel(8)).

** ‘halt’ and ‘reboot’ connect to the system socket unconditionally
** ‘herd’ uses a non-zero exit code upon errors
** The ‘root’ service has a new ‘eval’ action
** Basic man pages are now provided
** ‘make-forkexec-constructor’ has new #:group and #:user parameters
** ‘make-forkexec-constructor’ has a new #:pid-file parameter
** (shepherd services) now exports ‘make-actions’ and ‘provided-by’
** ‘shepherd --pid=FILE’ writes FILE atomically
** The communication protocol is now entirely sexp-based (see the manual)
** ‘shepherd’ is more robust to misbehaving clients
** Cross-compilation is now supported
** The build system uses “silent rules” by default
** Internally, the coding style of various parts has been improved

* Changes in version 0.2

** Non-root configuration file is now ~/.dmd.d/init.scm.

For unprivileged uses of dmd, the configuration file used to be
~/.dmdconf.scm.  It is now ~/.dmd.d/init.scm

** Generate template configuration file when none is found.

A ~/.dmd.d/init.scm template configuration file is now generated when
dmd is started and no such file exists.

** The 'dmd' service has new 'unload' and 'reload' actions.

The 'unload' action allows a service to be stopped and its definition to
be unloaded; 'reload' allows a service to be unloaded, and a new
redefinition to be reloaded, atomically.  See the manual for details.

** 'make-forkexec-constructor' has a new calling convention.

In particular, the procedure now has #:environment-variables
and #:directory arguments.  See the manual for details.

** New 'exec-command' and 'fork+exec-command' convenience procedures.
** The 'status' action displays the running value of services (the PID.)
** 'dmd' has a new '--pid' option.
** Failures to connect to dmd are gracefully handled.
** Data is always appended to the log file.
** Assorted bug fixes and documentation improvements.

* Changes in version 0.1
** A single socket is used for communication with dmd, with a new protocol.

The new communication protocol between 'dmd' and 'deco' is simpler,
versioned, and extensible.

** The default socket name is now independent of the calling user.
** The socket directory is now created under $(localstatedir).
** The 'dmd' service has new actions 'power-off' and 'halt'; 'stop' reboots

When dmd is running as root, as is the case when it is used as a
PID-one init system, these actions allow 'root' to cleanly reboot or
halt the machine.

** New 'reboot' and 'halt' commands.
** 'dmd' only write to stdout when no client is connected.
** The configuration file is loaded in a fresh module.
** 'make-forkexec-constructor' closes all file descriptors after forking.
** License upgraded to GPL version 3 or later.
** Manual license upgraded to FDL version 1.3 or later.
** Many bug fixes, documentation improvements, etc.

* Changes in version -0.4
** Awaken from a 10-year nap.
** Ported to Guile 2.0.
** Modules are modules instead of being loaded.
** Build system fixes, cleanups, and upgrades.

* Changes in version -0.5
** dmd: `--socket=-' instead of `--socket=none'.
** Renamed `extra-action' to plain `action'.
** The result of user-defined stop code is ignored now.
** New action for all services: `dmd-status'.
** Distribution contains file `QUESTIONS'.
** Improved the `unknown' service implementation in `examples/'.
** Number of args given to actions is verified.
** Made docstrings for actions optional.
** Renamed `{en,dis}able-persistency' to `{,no-}persistency'.
** Can pass file name to dmd action `persistency'.

* Changes in version -0.6
** New action `doc' for displaying documentation.
** `list-actions' is a sub-action of `doc' now.
** New action `cd' for dmd, useful with `--socket=none'.
** Distribution contains example for an `unknown' service.
** At configure time, dmd checks for a Guile installation.
** Enable readline on `--socket=none' and non-dumb terminal.
** Startup time finally became completely unacceptable. :-)

* Changes in version -0.7
** Can fork into background via dmd extra-action `daemonize'.
** New action for all services: list-actions.
** New options for dmd: --logfile (-l), and --silent/--quiet
** Standard option --usage works for both dmd and deco.
** You can pass relative file names to deco.
** Never send respawn-output to deco by accident.
** Better handling of terminals and similar services.
** Documented evolution of runlevels.
** Service groups can be used to start/stop services at once.
** Persistency (i.e. safe state on exit and restore next time).
** Invoke actions of service `unknown' (if defined) as fallback.
** Read commands from standard input if socket file name is `none'.

* Changes in version -0.8
** Show output in deco, not only in dmd.
** New options in deco: --insecure (-I) and --result-socket (-r)
** --help displays the options for both dmd and deco.
** Disable services which are respawning too fast.
** New actions for all services: enable, disable and enforce.
** Default extra actions work even when the service is stopped.
** Documented some internals of dmd.

* Changes in version -0.9
** Example configuration added.
** New option for deco: --socket (-s).
** New option for dmd: --insecure (-I).
** Added tutorial and completed documentation.
** Create default socket dir on startup if desired.
** Added a real build system.

* Changes in Version -0.9.6
** Controlling dmd completely with deco is now possible.
** A few bugfixes for service handling.
** Long options can be abbreviated, short ones also work.
** Respawning of services works.

* Changes in version -0.9.7
** User-defined code is always protected with a `catch'.
** New options: --config and --socket.
** The new deco program can be used to send commands to dmd.

* Changes in version -0.9.8
** Starting and stopping of services by symbol works better.
** Performing extra actions on services possible.
** Improved documentation.
** More detailed output.

* Version -0.9.9
** Initial release.
