.\" To run off, use: eqn file | tbl | troff -ms | print
.tr ~
.EQ
delim @@
.EN
.ps 12
.nr PS 12
.vs 14
.nr VS 14
.TL
Design and Implementation of Parallel Make
.AU
Erik H. Baalbergen
.AI
Dept. of Mathematics and Computer Science
Vrije Universiteit
Amsterdam
The Netherlands
.AB
\fIMake\fP
is the standard 
.UX
utility for maintaining programs.
.UX
programmers have been using it for almost 10 years, and many 
.UX
programs nowadays are maintained by it.
The strength of \fImake\fP is that it allows the user
to specify how to compile program components,
and that, after an update, the system
is regenerated according to the specification with a minimum number of
recompilations.
With the appearance of multiple-processor systems, we expect that the
time needed to ``make'' a program, or \fItarget\fP, can be reduced
effectively.
Although the hardware provides parallelism, few tools are able to
exploit this parallelism.
The introduction of parallelism to \fImake\fP is the subject of this paper.
We describe a parallel \fImake\fP and give an analysis of its performance.
.AE
.NH
INTRODUCTION
.PP
Large programs are often written as a collection of small files rather than as
one big file so that changes to one source file only require recompilation of
that one file.
A large C~[1]
program, for example, may be split up over tens, or even hundreds
of files. 
Code which is common to a set of source files is often placed in a single
file, which is included (using a C Preprocessor ``#include'' line) in each
of the source files.
A consequence of the inclusion of source code is that if we change an
included file, we have to recompile all files that include the file.
Instead of recompiling all program components, we limit ourselves to recompiling
the affected source code only.
This efficiency, however, imposes a strict discipline on the
programmer, who has to remember which
files \fIdepend on\fP (i.e. ``include'') other files, and which commands are
used to regenerate components.
.PP
\fIMake\fP~[2]
is a program that keeps track of which object files are up to date and
which must be regenerated by compiling their corresponding sources.
Apart from the concept of efficiently regenerating components, we can achieve
more efficiency by speeding up the commands that \fImake\fP executes,
and adapting \fImake\fP itself.
With the advent of parallel processor systems, it has become possible to speed
up the operation of \fImake\fP by doing compilations in parallel.
Parallelizing \fImake\fP, however, is not at all straightforward;
there are numerous problems and pitfalls.
These problems, their solutions, and various optimizations form the body
of this paper.
We also discuss the performance of our parallel \fImake\fP and compare it
to work elsewhere.
.NH
SPEEDING UP THE MAKE PROCESS
.PP
Apart from parallelism, there are several methods of decreasing \fImake\fP's
response time.
On the one hand, it is possible to speed up single compilations, which
does not affect \fImake\fP itself.
On the other hand, we can optimize \fImake\fP, by, for example,
eliminating drivers, optimizing command invocation and task scheduling,
and compiling description files.
.NH 2
Concurrent compilations
.PP
Much work has been done in the area of \fIconcurrent\fP compilation.
A short overview of the work is given in~[3].
Strategies for concurrently compiling a single program are, among others,
\fIpipelining\fP~[4,\|5,\|6],
\fIsource-code splitting\fP~[3],
and \fIparallel evaluation of attribute grammars\fP~[7].
The main purpose of running a single compilation concurrently is to decrease
the response time.
We can indeed speed up the \fImake\fP process by invoking concurrent compilers, but,
in practice, few concurrent compilers exist yet.
Moreover, \fImake\fP is routinely used to invoke \- besides compilers \-
generators (e.g., \fIyacc\fP~[8]
and \fIlex\fP~[9]),
linkers, text formatters, and many other tools.
To effectively exploit parallelism inside tools, we need
concurrent tools, which still are rare.
.\" We may, however, not neglect the concurrent compilers, since
.\" \fImake\fP provides several implicit rules, which may be applied often,
.\" for doing compilations.
There are, however, a few compilers available that are able to do (part of)
their work concurrently.
A compilation in \fIACK\fP~[5],
for example, is done by passing the code through several compiler
components.
\fIMake\fP can exploit pipelining techniques by efficiently
scheduling the compiler components among the available processing power.
.NH 2
Eliminate Compiler Drivers
.PP
One possible improvement to \fImake\fP's response time
is to replace compiler ``drivers,'' such as
\fIcc\fP, by supplying rules to call the compiler phases explicitly.
Instead of having the rule
.DS
\&.c.o:
        $(CC) $(CFLAGS) \-c $*.c
.DE
we can introduce the rules
.DS
\&.c.i:
        /lib/cpp $(CFLAGS) $*.c >$*.i
\&.i.s:
        /lib/ccom $(CFLAGS) $*.i $*.s
\&.s.o:
        /bin/as $*.s $*.o
.DE
This is not normally done since the traditional \fImake\fP is unable
to transitively use implicit rules.
More important, compiler drivers provide a measure of compatibility, since,
for example, calling \fIcc\fP is portable among
.UX
systems, while the invocation of the compiler phases might differ.
.NH 2
Optimize Command Invocation
.PP
A considerable speed-up, which is already available (or easy to implement)
in existing \fImake\fPs, is optimizing the invocation of commands by shunting
out the shell whenever possible.
If a command line contains no shell-specific
constructs (such as \fB;\fP, \fB&&\fP,
\fB&\fP, \fB(\fP, \fB[\fP, \fB$\fP, etc.) and commands (\fIcd\fP, \fIfor\fP,
\fIif\fP, \fIcase\fP, etc.), \fImake\fP can easily
place the arguments in an argument array and invoke \fIexecve\fP itself,
instead of calling a shell to parse and execute the command line.
Another strategy is to have a shell running
as co-process to \fImake\fP,
and write command lines to the shell via a pipe.
The latter approach was adopted in \fInmake\fP~[10].
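.PP
The first strategy can be sketched as follows (a minimal Python sketch, not
taken from any actual \fImake\fP; the metacharacter set and builtin list are
illustrative assumptions):

```python
import subprocess

# Illustrative subsets; a real make would test against the full sh syntax.
SHELL_META = set(";&|()[]$<>*?`'\"\\")
SHELL_BUILTINS = {"cd", "for", "if", "case", "while"}

def run_command(line):
    """Run one command line, shunting out the shell when possible."""
    words = line.split()
    if any(ch in SHELL_META for ch in line) or words[0] in SHELL_BUILTINS:
        # Shell-specific constructs present: let /bin/sh parse the line.
        return subprocess.call(line, shell=True)
    # Plain command: build the argument array and invoke it directly,
    # the analogue of calling execve() without an intervening shell.
    return subprocess.call(words)
```

For a plain line such as ``cc \-c main.c'' this saves one process creation
and one command-line parse per command executed.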
.NH 2
Compile Description Files
.PP
Compiling description files instead of interpreting them introduces
some additional speed-up.
Instead of reading and parsing the description file at each invocation,
\fImake\fP compiles
it into a binary format, and uses the binary description file in subsequent
invocations.
The binary description file may consist of \fImake\fP's internal data structures,
built when parsing the description file.
\fIMake\fP has to recompile the description file, if it
is more recent than its compiled binary version.
The method of compiling description files is used in, for example, \fInmake\fP.
.PP
Another strategy is to translate the description file
into a compilable language, such as C, and compile the resulting program.
Instead of calling \fImake\fP to update a system, we,
or the \fImake\fP command itself,
can invoke the generated program.
Both compiled description files and generated \fImake\fP programs
offer a constant decrease in response time
since we only reduce the \fImake\fP overhead.
However, since most of the execution time comes from executing the commands,
rather than from \fImake\fP interpreting the rules, the gain here is small.
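.PP
The compilation mechanism can be sketched as follows (a Python sketch under
assumed names; the toy parser merely stands in for \fImake\fP's full
description-file grammar):

```python
import os
import pickle

def parse_description(path):
    """Toy parser: handles only `target: dep dep' lines."""
    rules = {}
    with open(path) as f:
        for line in f:
            if ":" in line and not line.startswith("\t"):
                target, deps = line.split(":", 1)
                rules[target.strip()] = deps.split()
    return rules

def load_rules(descfile, cache):
    """Reuse the compiled (pickled) form when it is newer than the
    description file -- the same timestamp test make applies to targets."""
    if (os.path.exists(cache)
            and os.path.getmtime(cache) >= os.path.getmtime(descfile)):
        with open(cache, "rb") as f:
            return pickle.load(f)          # fast path: no re-parsing
    rules = parse_description(descfile)    # ordinary (slow) parse
    with open(cache, "wb") as f:
        pickle.dump(rules, f)              # compile for later runs
    return rules
```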
.NH 2
Exploit Parallelism
.PP
If multiple processors are available to execute commands for \fImake\fP,
a potentially large speed-up is possible by running commands in parallel.
We can get an optimum speed-up if we schedule the commands cleverly.
Consider the rule
.DS
prog: main.o util.o prog.o
.DE
Assume that each of the compilations
of \fImain.c\fP and \fIutil.c\fP into \fImain.o\fP
and \fIutil.o\fP,
respectively,
takes \fIt\fP seconds, and the compilation
of \fIprog.c\fP into \fIprog.o\fP takes
\fI2t\fP seconds.
If there are 2 processors available, \fImain.o\fP and \fIutil.o\fP
are made in parallel.
The compilation of \fIprog.o\fP is postponed until one of the processors
finishes, which is (at least) at time \fIt\fP. 
From this moment on, it still takes \fI2t\fP seconds to make \fIprog.o\fP.
The \fImake\fP process totally takes \fI3t\fP seconds.
If, on the other hand, we let one processor make \fIprog.o\fP,
and let the other processor make \fImain.o\fP
and \fIutil.o\fP, as if the rule were
.DS
prog: prog.o main.o util.o
.DE
then the make only takes \fI2t\fP seconds.
.PP
The problem of running \fIn\fP independent tasks, with known execution times,
on \fIm\fP similar 
processors with minimal response, or \fIfinishing\fP, time,
is studied by the theory of deterministic sequencing and
scheduling~[11],
and is known as the @P || C sub max@ problem.
Since the problem is NP-complete, we have to rely on heuristics;
reference~[12] gives an easy-to-read introduction to the various
techniques involved in optimizing scheduling algorithms.
.PP
Known heuristics for a feasible schedule with minimum finishing time,
often called the \fIindependent task-scheduling problem\fP, are the
\fIlist-scheduling\fP algorithm, the \fILPT (Largest Processing Time)\fP
algorithm, and the \fIMULTIFIT\fP algorithm.
List scheduling treats the tasks in the given order, and assigns each task
in turn to a processor which is free if there are free processors,
or which is the first one to finish a task if all processors are busy.
A better scheduling, with lower finishing time, is \fILPT\fP,
which sorts the given
jobs according to decreasing execution time, and applies list scheduling to
the reordered list of jobs.
\fIMULTIFIT\fP~[13]
is the best scheduling algorithm (for our model) yet found.
The idea is to find a minimum value for the deadline, the time all tasks
need to have finished.
The tasks are again assigned in decreasing order of execution time, each to
the lowest indexed processor, if the total execution time on that
processor does not exceed the deadline.
Otherwise, the next processor is taken.
(This strategy is called the \fIfirst fit decreasing method\fP.)
For any value of the deadline, the division among the processors either
succeeds or fails, and the deadline is adapted accordingly.
\fIMULTIFIT\fP uses binary search to find a minimum value for the deadline.
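.PP
The two simpler heuristics can be sketched in a few lines of Python (an
illustrative sketch in which execution times are abstract numbers):

```python
import heapq

def list_schedule(times, m):
    """List scheduling: give each task, in the given order, to the
    processor that becomes free first; return the finishing time."""
    procs = [0] * m                   # finishing time per processor
    heapq.heapify(procs)
    for t in times:
        free_at = heapq.heappop(procs)
        heapq.heappush(procs, free_at + t)
    return max(procs)

def lpt_schedule(times, m):
    """LPT: list scheduling on tasks sorted by decreasing time."""
    return list_schedule(sorted(times, reverse=True), m)

# The example from the text, with t = 1: main.o and util.o each take t,
# prog.o takes 2t, and 2 processors are available.
print(list_schedule([1, 1, 2], 2))   # original order: finishes at 3t
print(lpt_schedule([1, 1, 2], 2))    # LPT order: finishes at 2t
```

\fIMULTIFIT\fP would wrap a binary search for the minimum feasible deadline
around a first-fit-decreasing assignment instead of the simple greedy choice
above.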
.PP
Should \fImake\fP itself reorder the
dependency list, or does the user have to specify the order himself?
The former approach requires \fImake\fP to have knowledge of the time needed to
execute a command block, whereas the latter forces the user to estimate
times, which may differ among various environments.
It is difficult to come up with a set of heuristics for \fImake\fP to
estimate the execution time (such as ``compilation time is proportional
to the length of the source''), since \fImake\fP has
no idea of what a command does.
One solution is to keep track of execution times of command blocks
in \fImake\fP runs, and use the results in future \fImake\fPs.
Possible implementations are to let \fImake\fP keep the timing results in
a global state file,
or to enable \fImake\fP to reorder the dependency lists and
overwrite a description file, or output a new description file.
.NH
PARALLELISM AND DISTRIBUTION
.PP
Before discussing the techniques and problems of implementing
a parallel \fImake\fP,
we consider the relation between parallelism and distribution.
Experiments~[14]
have shown that running several compilations in parallel on a single processor
in general does not result in a significant speed-up,
since compilations are usually CPU-bound.
At best, while one process is doing I/O, another one can compute.
.PP
Several practical problems arise when running compilations in parallel on a
single-processor
.UX
system.
First, processes compete for a fixed number of CPU cycles.
The more processes there are, the fewer cycles each one gets.
Moreover, time needed for swapping or paging increases as the number
of processes grows.
The net result is that processes running in parallel slow down each other.
Second, each
.UX
user is allowed to have only a limited number (commonly 20 or 25)
of processes running at a time.
This limit reduces the number of simultaneous compilations because each one
may need several processes.
.PP
If multiple processors are available, we can achieve a speed-up by
running each compilation on a different processor.
The trick is to arrange for this parallelism without burdening
the programmer with all the details.
Several approaches are discussed in~[14].
In what follows we will assume that the mechanics of forking off processes
to remote CPUs is handled by the operating system.
Our concern is \fIwhat\fP should be run in parallel, not \fIhow\fP
parallel execution is achieved.
We assume that the underlying operating system has a smart
processor allocation strategy; that multiple processors (say, at least 8)
are available; and that commands can be executed on any processor in the
network without losing efficiency.
A distributed operating system that serves our needs is \fIAmoeba\fP~[15,\|16,\|17].
.NH
PARALLELIZING DESCRIPTION FILES
.PP
In this section we consider various naive approaches in
making \fImake\fP run commands in parallel
by adapting description files.
These approaches lead to the conclusion that correct parallelization
of \fImake\fP is not achieved by naively altering the description files,
and that doing so cannot maintain compatibility with existing \fImake\fPs
and description files; they also expose a few
basic problems we have to solve in designing a parallel \fImake\fP.
.PP
A first approach to making \fImake\fP run commands in
parallel is to envelop each command line in parentheses and append an
ampersand to it.
The shell runs the command line in the background, and returns immediately.
This simple and naive approach will not work in practice due to the
following problems:
.IP [1]
The commands in a command block must execute in sequence, since
a command may use the result of a previous command within the same
command block.
.\" The rule
.\" .DS
.\" prog.c: prog.y
.\"	yacc prog.y
.\"	mv y.tab.c prog.c
.\" .DE
.\" illustrates the problem.
.\" The \fImv\fP command can execute properly only if the
.\" \fIyacc\fP command has successfully finished.
.\" Replacing the rule by (or interpreting the rule as)
.\" .DS
.\" prog.c: prog.y
.\"	yacc prog.y&
.\"	mv y.tab.c prog.c&
.\" .DE
.\" will cause the second command line to fail.
.\" \fIMake\fP does not know that the \fImv\fP command may execute only
.\" if the \fIyacc\fP command has successfully finished.
.IP [2]
Starting a job in the background via a shell causes the shell to return
immediately, and \fImake\fP to believe that the command has finished.
Moreover, \fImake\fP has no facility to wait for the background process to
finish, nor can it tell whether the command succeeded.
It may decide to activate a rule's command block, while
its dependencies still are absent or out of date.
.IP [3]
\fIMake\fP does not keep track of how many child processes are still alive.
The system may refuse to execute commands due to the
.UX
per-user process limit.
Letting \fImake\fP continuously try to fork off a process after a failure does
not solve the problem either.
If, for example, \fIcc\fP is unable to fork off a pass,
it does not try again; it just reports back failure.
.IP [4]
There are commands which cannot run in parallel with each other,
because they use fixed file names.
\fIYacc\fP is a standard example.
.\" Consider the \fImake\fP rules
.\" .DS
.\" prog: a.o b.o lex.o
.\" a.c: a.y
.\" 	yacc \-d a.y
.\" 	mv y.tab.c a.c
.\" 	mv y.tab.h a.h
.\" b.c: b.y
.\" 	yacc \-d b.y
.\" 	mv y.tab.c b.c
.\" 	mv y.tab.c b.h
.\" lex.o: a.h b.h
.\" a.o: a.h
.\" b.o: b.h
.\" .DE
.\" This example shows the problem that a few standard
.\" .UX
.\" tools, like \fIyacc\fP and \fIlex\fP,
.\" use fixed names for their output files.
.\" Running two \fIlex\fP or two \fIyacc\fP processes in parallel
.\" within the same directory leads to
.\" inconsistencies; one of the output files is simply overwritten.
.\" The \fIlex\fP problem, however, is solved by using \fIlex\fP' ability to
.\" produce its output on standard output, as is shown in
.\" .DS
.\" lex.c: lex.l
.\" 	lex \-t lex.l | sed 's/yy/lyy/g' >lex.c
.\" .DE
.\" \fIMake\fP cannot solve the problem itself, since it has no knowledge of
.\" the commands it executes.
.PP
Executing the commands in a command block sequentially, and the command
block as a whole in the background, solves problem [1].
This is achieved by surrounding the command block by parentheses, appending
an ampersand to the closing parenthesis, and using a single shell to execute
the command block as a whole.
Problems [2], [3] and [4], however, still exist.
Worse yet, the latter mechanism introduces another problem:
.IP [5]
Traditional \fImake\fP executes each command line of a command block in a
separate shell.
This means that a \fIcd\fP, ``change directory'', command has
effect only within a single command line, not in the succeeding commands within
the same block.
To get the same behavior if the command block is executed in a single shell,
we have to adapt the command block in the description file.
.PP
Problems [1] and [5] are solved by the mechanism to surround each command line
by parentheses (i.e., execute it in a separate shell);
place parentheses around the command block as a whole; and append an ampersand
to it.
This still leaves problems [2], [3] and [4] unsolved, and introduces a lot
more, almost dummy, shells.
.PP
The next section discusses an approach which solves the five problems.
.NH
DESIGN OF PARALLEL MAKE
.PP
In designing and developing our parallel \fImake\fP, which we have called
\fIpmake\fP, we consider a few issues that make it easy to use.
.NH 2
Design Goals
.PP
An important issue is to maintain upwards compatibility;
existing description files
still should be accepted and interpreted properly, and \fIpmake\fP
description files should be accepted by \fImake\fP.
A second important issue is to hide the parallelism completely from the
description file writer.
Programmers and \fIpmake\fP invokers
should not be confronted with the use of complicated constructs
to exploit parallelism.
Unfortunately, there are several serious obstacles which prevent our
goals of compatibility and transparency from being achieved completely in
parallel \fImake\fP.
The problems and possible solutions are discussed in the remainder of this
section.
.NH 2
Virtual Processors
.PP
To overcome the problems discussed in the previous section, we introduce
virtual processors as the basis of \fIparallel make\fP.
For each command block to be executed, \fIpmake\fP creates a child process,
which we call a \fIvirtual processor\fP.
\fIPmake\fP does not wait for the virtual processor to finish,
but continues processing the dependency list in which the target appeared
and which caused the command block's execution.
The virtual processor controls the execution of the command lines in the command
block.
The strategy is depicted globally in Figure 1.
.DS
make(\fItarget\fP):
    let \fIR\fP be the rule
        \fItarget\fP \fB:\fP \fIdependency-list\fP
            \fIcommand-block\fP
    for each \fIdependent\fP in \fIdependency-list\fP do make(\fIdependent\fP)
    if all makes succeed and any \fIdependent\fP is newer than \fItarget\fP then
        allocate virtual processor (i.e. check number of child processes)
        fork child: (* this is the virtual processor *)
	    report execute(\fIcommand\(rublock\fP)
    return (* do not wait for the virtual processor to finish *)

execute(\fIcommand\(rublock\fP):
   for each \fIcommand\fP in \fIcommand\(rublock\fP do
       execute \fIcommand\fP via a shell
       wait for the shell to finish
       if \fIcommand\fP failed then report \fIfalse\fP
   report \fItrue\fP
.ce 1
\fBFigure 1.\fP
.DE
The algorithm shows
that synchronization among the update commands is driven by
the dependency graph.
.PP
A virtual processor runs the command lines one after another,
each in a separate shell environment.
This mechanism solves problems [1] and [5].
As soon as one of the commands fails,
the virtual processor exits with status \fIfalse\fP.
If all commands succeed, then the virtual processor
stops with status \fItrue\fP.
.PP
When the target, which is the command block's result, is needed (i.e.,
when all dependencies of the parent rule have been checked),
\fIpmake\fP waits for all virtual processors dealing with the dependencies
to finish, before deciding whether to make the parent target.
This strategy solves the synchronization problem [2].
.PP
To avoid problem [3],
the number of available virtual processors within a single \fIpmake\fP
run is limited.
If \fIpmake\fP has to allocate a virtual processor while the maximum number of
virtual processors is running, it has to wait until at least one of them
is ready.
The number of available virtual processors must be carefully chosen, since
each one can also fork off processes.
The user can explicitly specify how many virtual processors may be
used with the \fB\-P\fP\fInum\fP command-line option.
Otherwise, a default number is chosen, and it is up to the system
administrator to select an efficient number, depending
on the number of available physical processors.
The virtual processor mechanism introduces a minor overhead
because an extra process is introduced for each virtual processor to
control the execution of the command block.
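.PP
The allocation scheme can be sketched as follows (a Python sketch that models
virtual processors as threads guarded by a counting semaphore; \fIpmake\fP
itself forks real child processes):

```python
import subprocess
import threading

MAX_VP = 4                       # the -P num limit, chosen arbitrarily
vp_slots = threading.Semaphore(MAX_VP)

def virtual_processor(command_block, result):
    """Run the command lines of one block in sequence, each in its
    own shell; stop and report failure as soon as one command fails."""
    try:
        result["ok"] = all(subprocess.call(cmd, shell=True) == 0
                           for cmd in command_block)
    finally:
        vp_slots.release()       # free the virtual processor

def start_block(command_block):
    """Allocate a virtual processor, blocking while MAX_VP of them
    are already running, then start the block without waiting for it."""
    vp_slots.acquire()
    result = {}
    vp = threading.Thread(target=virtual_processor,
                          args=(command_block, result))
    vp.start()
    return vp, result            # joined later, when the parent target
                                 # needs this dependent's outcome

# Two independent command blocks proceed in parallel; within each
# block the commands run strictly in sequence.
jobs = [start_block(["true", "true"]), start_block(["false"])]
for vp, result in jobs:
    vp.join()                    # the point where pmake synchronizes
```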
.PP
It is not possible to solve problem [4] in a transparent way,
since \fIpmake\fP
cannot deduce which commands may not run in parallel in the same directory.
\fIConcurrent make\fP, or \fIcmake\fP~[18],
provides a facility to specify which command blocks should execute mutually
exclusively, i.e., only one command block in a certain group of command blocks
may run at a time.
To do so, one has to define a \fImutex\fP on the group of command blocks, by
declaring a rule having target name \fI.MUTEX\fP, with the targets of the
command blocks as dependents.
\fILex\fP, for example, always writes its output to the fixed file name
\fIlex.yy.c\fP; the corresponding description file becomes
.DS
prog: a.o b.o
\&.MUTEX: a.c b.c
a.c: a.l
        lex a.l
        sed 's/yy/ayy/g' lex.yy.c > a.c
b.c: b.l
        lex b.l
        sed 's/yy/byy/g' lex.yy.c > b.c
.DE
To force \fIcmake\fP to run only one command at a time (i.e., to behave
like \fImake\fP), the description file should contain the rule
.DS
\&.MUTEX:
.DE
We adapted the \fI.MUTEX\fP mechanism in \fIpmake\fP.
Although it is certainly not transparent to the user,
it is a convenient way to solve problem [4].
In practice, there is hardly ever a need to define a mutex.
.PP
Note that a description file with a mutex is still compatible with
the traditional \fImake\fP.
.NH 3
Command Failures
.PP
Since the dependents of a target are checked and possibly updated in
parallel, we cannot prevent a dependent from being made if
one of its predecessors in the dependency list has failed.
This behavior is similar to \fImake\fP with the ``\-k'' flag, which
indicates that the \fImake\fP process should continue even if a command fails.
One way to get rid of the ``\-k'' behavior is to stop checking
the dependency list as soon as possible.
This leaves us in a situation where a nondeterministic
number of dependencies has already been made.
To prevent the nondeterminism as much as possible, we do not examine
the success or failure of making a dependent until
we need the results; this results in ``\-k'' behavior.
.NH 3
Multiple-target Rules
.PP
\fIMake\fP lacks a means of specifying that a command block
produces multiple files.
It seems, however, to be able to deal with commands that produce multiple
output files, though this is nothing but a coincidence.
We erroneously take the rule
.DS
y.tab.c y.tab.h: grammar.y
	yacc grammar.y
.DE
as a specification of how to produce both \fIy.tab.c\fP
and \fIy.tab.h\fP, using a single \fIyacc\fP command.
\fIMake\fP, however, considers the rule to be a short-hand specification of
.DS
y.tab.c: grammar.y
	yacc grammar.y
y.tab.h: grammar.y
	yacc grammar.y
.DE
.PP
In practice, we can hardly tell the difference, since if one of \fIy.tab.c\fP
and \fIy.tab.h\fP appears as dependent and is updated,
the other one is created too, at the 
same time.
If the latter file appears as a dependency, then the file already exists and
\fImake\fP decides not to create or update the file.
.\" Dick: This depends on the time evaluation order
Worse, programmers sometimes tacitly assume that \fIy.tab.h\fP is created when
using the rule
.DS
y.tab.c:  grammar.y
	yacc grammar.y
.DE
which indeed produces \fIy.tab.h\fP, although \fImake\fP does not note that
it has been created.
.\" (Dick) but prog: lex.o y.tab.o does not work --> dependencies are not a set
The programmer must take care that \fIy.tab.c\fP is created before
any other file that depends on \fIy.tab.h\fP is made.
In \fIpmake\fP, however,
neither the ``\fIy.tab.c\fP \fIy.tab.h\fP'' rule nor the ``\fIy.tab.c\fP''
rule will always work correctly.
For example, the construct
.DS
prog: y.tab.o lex.o
	$(CC) \-o prog y.tab.o lex.o
lex.o: y.tab.h
y.tab.o: y.tab.c
y.tab.c y.tab.h: parse.y
	yacc parse.y
.DE
is treated correctly by any sequential \fImake\fP.
\fIPmake\fP, however, tries to ``make''
\fIlex.o\fP and \fIy.tab.o\fP in parallel,
independently of each other.
Both actions require the last rule to be applied, resulting in two
\fIyacc\fP processes running in parallel, both writing output to the same files,
\fIy.tab.c\fP and \fIy.tab.h\fP.
As soon as one of the commands has finished, the result is used in creating
\fIlex.o\fP or \fIy.tab.o\fP,
while the other \fIyacc\fP process may still be busy writing
output to \fIy.tab.c\fP and \fIy.tab.h\fP,
files that are supposed to be complete.
Omitting \fIy.tab.h\fP from the last rule's target list is correctly dealt with
in sequential \fImake\fP,
but \fIpmake\fP may complain about \fIy.tab.h\fP's absence.
\fIPmake\fP does not know that \fIy.tab.h\fP is created as a side effect of creating \fIy.tab.c\fP.
Therefore, we have to take care that at least each file that is
created appears as target.
Furthermore, to prevent simultaneous execution of the \fIyacc\fP commands,
we can
define a mutex on \fIy.tab.c\fP and \fIy.tab.h\fP, or, even better,
on \fIlex.o\fP and \fIy.tab.o\fP.
The former mutex does not prevent \fIyacc\fP being invoked twice, while
the latter does.
.PP
A better solution, without changing \fImake\fP, is to introduce an
\fIintermediate\fP target, which is
an empty file, and which is newer than the other files
created by the commands in the command block.
The following code shows the use of an intermediate target,
called \fIyacc\(rudone\fP.
.DS
prog: y.tab.o lex.o
	$(CC) \-o prog y.tab.o lex.o
lex.o: y.tab.h
y.tab.o: y.tab.c
y.tab.h y.tab.c: yacc\(rudone
yacc\(rudone: parse.y
	yacc parse.y
	touch yacc\(rudone
.DE
The \fItouch\fP command creates or updates file \fIyacc\(rudone\fP,
and ensures that it becomes more recent than any of the files produced by
the preceding \fIyacc\fP command.
.NH
IMPLEMENTATION OF PARALLEL MAKE
.PP
Instead of building a parallel \fImake\fP from scratch
in order to get experience with the use of
\fIparallel make\fP, we developed a module which implements parallelism and
which can be plugged into traditional \fImake\fPs.
The module uses the command
block (i.e. the sequence of commands belonging to a rule)
as the basic unit of execution.
The idea is to collect the commands of a command block and
keep track of the target that caused the rule to be selected.
As soon as the last command of a command block is encountered, the \fImake\fP
process forks off a virtual processor, which executes the commands in sequence.
.PP
We equipped the
.UX
System V Release 2 \fIMake\fP~[19]
with the parallel module, and ran
the result on the \fIAmoeba\fP distributed operating system.
The configuration consisted of a processor pool, a disk, and
a terminal (actually a window on a SUN), connected by a 10 Mbps Ethernet.
.NH 2
Implementation Problems in \fIPmake\fP
.PP
Apart from implementation problems such as the absence of explicit
dependencies in the internal data structure of implicit rules,
signal handling (signal propagation to virtual processors), and
synchronization (the modification time of a target which is being created),
there is the problem of multiplexing diagnostic output.
.PP
The diagnostic output from the commands is multiplexed arbitrarily,
leaving the user with an incoherent collection of messages or, even worse,
characters.
If the commands produce output on standard output, and especially
if the characters are not buffered,
the resulting list of diagnostics may look messy.
One solution is to gather output from commands via pipes and present
them to the user, or an ``intelligent''
editor (e.g., \fIemacs\fP), in an ordered manner.
Since \fImake\fP may have several directories open at a time, each
occupying a file descriptor,
we have to be careful with using file descriptors for pipes.
A better approach is to redirect the output from commands onto files, and
present the contents of the files, preceded with an indication of the source.
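.PP
The file-based approach can be sketched as follows (a Python sketch; the
function name and output format are illustrative):

```python
import os
import subprocess
import tempfile

def run_logged(target, cmd):
    """Run one command with stdout and stderr redirected to a private
    file, so that parallel commands can never interleave their output."""
    with tempfile.NamedTemporaryFile(mode="w+", delete=False) as log:
        status = subprocess.call(cmd, shell=True,
                                 stdout=log, stderr=subprocess.STDOUT)
        log.seek(0)
        text = log.read()
    os.unlink(log.name)
    return target, status, text

# Replay each log afterwards, preceded by an indication of its source:
for target, status, text in (run_logged("main.o", "echo cc -c main.c"),
                             run_logged("util.o", "echo cc -c util.c")):
    print("==== commands for", target, "====")
    print(text, end="")
```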
.NH 2
Problems in the Test Environment
.PP
\fIAmoeba\fP is still under development and undergoing changes.
We made use
of a beta version of \fIAmoeba\fP, which suffers from various problems.
\fIAmoeba\fP currently lacks load balancing among the processors
in the pool.
Processes are assigned to physical processors at random.
The chance that some processors are overloaded
while other processors are idle is nonnegligible.
The random selection takes place if either a new program is loaded into memory
using the \fIexec\fP system call, or if a process forks itself,
using \fIfork\fP,
and the current processor does not have enough space for creating a copy of the
process.
.PP
Another problem is the inefficient use of memory,
because \fIAmoeba\fP currently lacks a shared text facility.
Forking a process causes the kernel to copy both text and data space in
memory, which results in both time-expensive forks and waste of memory.
With shared text, only the data space needs to be copied on a fork.
.PP
We take care of the influence of ``random'' processor assignment under
\fIAmoeba\fP by taking the minimum of the experimental results;
we believe that load balancing in
\fIAmoeba\fP will eventually result in a better distribution of processes
among physical processors.
.NH 2
Timing Results
.PP
The timing test consists of compiling a large set (several tens) of
C source files into corresponding
object code, using the \fIACK\fP~[5]
C compiler running on \fIAmoeba\fP.
We use the real time, measured under perfect conditions, which means
single user, no background processes.
Furthermore, we deliberately do not include the link \fIld\fP phase in
the measurements, since it cannot be done in parallel.
Finally, we manually reordered the dependency lists according to the
\fILPT\fP algorithm.
.\" Amoeba vs. SUN
.\" \fInumber of virtual processors\fP	\fISun/4.2 BSD\fP	\fIAmoeba\fP
.\" 1	107	114
.\" 2	 86	 66
.\" 3	 81	 52
.\" 4	 80	 45
.\" 5	 80	 42
.\" 6	 82	 40
.\" 7	 85	 38
.\" 8	 88	 37
.\" 9	 89	 40
.\" 10	 91	 44
.\" .TE
.PP
Table 1 shows the speed-up factor acquired
when using the indicated number of virtual processors.
During the test,
\fIAmoeba\fP ran on a pool of ten 68020 processors running at 16 MHz with
a file server and a directory server.
.TS
box, center;
c|c.
\fInumber of\fP
\fIvirtual processors\fP	\fIspeed-up\fP
_
1	1.00
2	1.85
3	2.53
4	3.18
5	3.60
6	3.74
7	3.97
8	3.86
9	3.74
10	3.81	
.TE
.ce 1
\fBTable 1\fP
.PP
The table shows that the speed-up is far below linear.
This is not surprising since we have to deal with some overhead in starting
a compilation.
\fIPmake\fP \fIruns\fP its compilations in parallel but has to \fIstart\fP
them sequentially.
Consider @H + C@ to be the response time of a compilation, where @H@ indicates
the time needed to start the compilation (i.e. to fork off the compiler
driver), and @C@ the time needed for executing the compilation.
If \fIpmake\fP has to run @N@ compilations, we cannot expect it to
fork them off simultaneously.
In the ideal case, when processors are available when needed, the
@N@-th compilation is started after @N - 1@ compilations have been started.
\fIPmake\fP's minimum response time then becomes @N * H + C@, which is
much more than the @H + C@ that linear speed-up would require.
If the number of compilations exceeds the number of processors, then \fIpmake\fP
has to wait from time to time until a processor becomes available,
in which case the total response time increases and the speed-up decreases.
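.PP
The effect of sequential start-up can be quantified with a small model:
a sequential \fImake\fP needs @N ( H + C )@ time for @N@ compilations,
while \fIpmake\fP needs at best @N * H + C@, so even with unlimited
processors the speed-up is bounded by @( H + C ) / H@.
The sketch below computes the resulting best-case speed-up; the values
of @H@ and @C@ are illustrative assumptions, not measurements.

```python
# Best-case speed-up under the start-up-overhead model:
# a sequential make runs N compilations one after another, N * (H + C),
# while pmake overlaps the compilations but must still fork them off
# one by one, giving a minimum response time of N * H + C.
# H and C are assumed, illustrative values, not measured ones.

def best_case_speedup(n, h, c):
    """Speed-up of pmake over a sequential make for n compilations."""
    sequential = n * (h + c)
    parallel = n * h + c
    return sequential / parallel

if __name__ == "__main__":
    H, C = 1.0, 10.0  # assumption: start-up costs 10% of a compilation
    for n in (1, 2, 5, 10, 50):
        print(f"{n:2d} compilations: speed-up {best_case_speedup(n, H, C):.2f}")
```

Note that the speed-up approaches its bound @( H + C ) / H@ (11 for the
assumed values) only slowly, and that waiting for free processors lowers
it further, in line with the sub-linear figures of Table 1.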
.NH
COMPARISON WITH OTHER PARALLEL MAKES AND DISCUSSION
.PP
This section gives a short overview of how several \fImake\fPs
provide parallelism.
Besides compatibility and transparency issues, we compare each \fImake\fP
with \fIpmake\fP.
.NH 2
Nmake
.PP
\fINmake\fP~[10],
or \fINew Make\fP or \fIfourth-generation make\fP, was
developed at AT&T Bell Laboratories.
\fINmake\fP executes the update commands by sending the command blocks to the
shell \fIsh\fP, which runs as a co-process.
As a consequence of sending complete command blocks to the shell, a \fIcd\fP
command, or the introduction of a shell variable, remains effective during
execution of the command block, as opposed to Feldman's \fImake\fP and
\fIpmake\fP.
If the user takes no special action, each command is executed by the shell,
denoted as the \fIforeground\fP shell.
If, however, the user specifies that update commands may execute
in parallel, the foreground shell starts a subshell, called the \fIbackground\fP
shell, for each update command block to be run.
In \fIpmake\fP, commands are always executed in a background shell.
Parallelism in \fInmake\fP is activated by specifying \fB\-j\fP\fIn\fP
(or \fB\-j\fP) as
command line option, which means that up to \fIn\fP (default 3)
background shells may be active at a time.
As in \fIpmake\fP, the dependency graph is used for synchronizing the jobs.
.PP
Specifying \fI.FOREGROUND\fP, or its synonym \fI.WAIT\fP, in the dependency
list of a target, causes the update command block of the target to
execute in the foreground shell, which in turn causes \fInmake\fP to block until
the commands have finished.
.PP
\fINmake\fP does not provide an explicit facility for executing certain
commands mutually exclusively.
It is possible, however, to run the critical commands in the foreground
shell, thus imposing sequential execution.
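As a sketch (the file names here are hypothetical), a description file
can serialize a critical \fIyacc\fP step by specifying
\fI.FOREGROUND\fP in its dependency list:
.DS
parse.c: parse.y .FOREGROUND
	yacc parse.y; mv y.tab.c parse.c
.DE
While this block runs in the foreground shell, no background shells are
started, so the command cannot clash with a concurrent \fIyacc\fP run.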
.NH 2
Concurrent Make
.PP
\fIConcurrent Make\fP~[18],
or \fIcmake\fP,
was developed primarily to reduce the time needed to
run a \fImake\fP process, not
to increase \fImake\fP's functionality.
It is written in \fIConcurrent C\fP~[20],
and is based on the Version 8
.UX
\fImake\fP.
The latter requires the user to indicate explicitly which commands
can execute in parallel, whereas \fIcmake\fP runs the update commands
in parallel by default.
Furthermore, the explicit indication in Version 8 \fImake\fP introduces a
non-portable construct in a description file.
.PP
The main difference between \fIcmake\fP and other parallel \fImakes\fP,
including
\fIpmake\fP and \fIVersion 8 make\fP, is that \fIcmake\fP takes care of
distribution among several processors.
The major assumption is that executing a command remotely has the
same result as executing it locally.
.\" For example, a C compiler on a remote machine is supposed to produce exactly
.\" the same code as the one on the local machine.
.\" The distribution and machine assignment policy is controllable by means
.\" of four environment variables and a special rule.
.\" The environment variables are \fIMAKE\(ruMACH\fP, which is the list of
.\" available machines; \fIMAKE\(ruMACH\(ruSPEED\fP, indicating the relative machine
.\" speeds;
.\" \fIMAKE\(ruMACH\(ruLOAD\fP, which specifies the number of commands executed
.\" in parallel on a machine (2 by default); and \fIMAKE\(ruMACH\(ruDIR\fP, which
.\" specifies for each machine the working directory in which the commands are
.\" to be executed.
The rule
.DS
\&.LOCAL: [\fItarget\fP ...]
.DE
forces \fIcmake\fP to run the update commands for the targets appearing in its
dependency list on the local machine.
An empty dependency list in a \fI.LOCAL\fP rule
forces \fIcmake\fP to run all update commands locally.
\fIPmake\fP and other parallel \fImake\fPs do not take distribution into
account.
They silently assume that the underlying operating system provides efficient
parallel execution among the available processors.
.PP
To prevent certain command blocks from executing in parallel, the rule
.DS
\&.MUTEX: [\fItarget\fP ...]
.DE
causes the update commands of the targets appearing in its dependency
list to execute mutually exclusively.
Parallelism can be suppressed by specifying a \fI.MUTEX\fP rule with an empty
dependency list.
The \fI.MUTEX\fP mechanism has also been adopted in \fIpmake\fP.
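For example (with hypothetical target names), two \fIyacc\fP-generated
parsers, whose update commands both write the file \fIy.tab.c\fP in the
working directory, can be kept from running simultaneously by the rule
.DS
\&.MUTEX: parse1.c parse2.c
.DE
while all other update commands may still run in parallel.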
.PP
\fICmake\fP description files, like that of \fIpmake\fP, are silently
accepted by \fIVersion 7\fP-compatible \fImake\fPs.
The compatibility results from the use of \fImake\fP's syntax
to specify parallel constructs and options.
\fIMake\fP interprets, for example, a \fI.MUTEX\fP rule
as a rule to be applied when creating a target \fI.MUTEX\fP,
rather than as a special directive.
.NH 2
Mk
.PP
\fIMk\fP~[21],
like \fInmake\fP, is an enhanced version of the original \fImake\fP.
The number of jobs run in parallel is user-settable by defining the macro
\fI$NPROC\fP.
The number of concurrent jobs is 1 by default, which implies serial execution
of the commands.
Unlike \fInmake\fP, \fIcmake\fP and \fIpmake\fP,
\fImk\fP has no provision for the mutually exclusive
execution of commands, although commands can be executed one after another,
using serial execution.
To use parallelism and
to deal with, for example, several \fIyacc\fP commands in a single \fImk\fP run,
the programmer has to take care of name clashes explicitly.
An implicit rule for creating a linkable object file out of a \fIyacc\fP specification
file is
.DS
%.o: %.y
	mkdir /tmp/$nproc; cp $stem.y /tmp/$nproc
	(cd /tmp/$nproc; yacc $stem.y; mv y.tab.c $stem.c)
	$CC $CFLAGS \-c /tmp/$nproc/$stem.c
	rm \-rf /tmp/$nproc
.DE
Although the implicit rule is artificial, there is now no need to
prevent several \fIyacc\fP commands from running in parallel with each other.
.NH 2
Parmake
.PP
\fIParmake\fP~[22]
is an extension of the traditional \fImake\fP, and provides concurrent execution
of the operations which have no mutual dependencies.
\fIParmake\fP has been implemented at DEC's System Research Center
on a local area network of shared-memory multiprocessor \fIFirefly\fP
workstations.
The processing power of idle workstations is supplied by a distant process
facility \fIdp\fP.
\fIParmake\fP itself orders independent jobs by topologically sorting
the dependency graph in the description file.
The description file is compatible with the traditional \fImake\fP, although a
syntactic mechanism is introduced to force left-to-right evaluation
of the dependencies.
.PP
A set of heuristics, controlled by parameters which reflect the relative
cost of the operations, is used to balance the local load, while \fIdp\fP
schedules the distant processes, based on machine-load statistics.
.PP
Experiments in using \fIparmake\fP in recompiling a large set of Modula-2+
files
have shown a maximum speed-up of 2.2, using the 5 local processors only,
and 13.5, using 20 concurrent local and distant processes.
An important observation is that the speed-up strongly depends on the nature
of the jobs being executed; the performance advantage increases along with
the ratio of computation to I/O.
A much smaller and faster compiler (for example, a C compiler) turned out to
be limited by disk speed: with 20 or more local and distant processes,
the speed-up was only 5.8.
.NH 2
DYNIX Make
.PP
\fIDYNIX Make\fP~[23]
provides a mechanism to activate parallelism explicitly.
If the string which separates a target from its dependencies is ``:&'' or
``::&'', then the command blocks to make the dependents can execute
simultaneously.
If two dependents are separated by ``&'', those two can be created in parallel.
The rule
.DS
target : dep1 dep2 dep3
.DE
causes \fImake\fP to update dep1, dep2, and dep3 sequentially.
The rule
.DS
target :& dep1 dep2 dep3
.DE
implies that dep1, dep2 and dep3 may be updated in parallel with each other.
The construct
.DS
target : dep1 & dep2 dep3
.DE
updates dep1 and dep2 in parallel, and then updates dep3.
The number of simultaneously active commands is controlled by
the \fB\-P\fP\fInum\fP command-line argument.
By default, three commands can run in parallel.
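For example, the invocation
.DS
make \-P 8
.DE
allows up to eight commands to run simultaneously
(the limit of eight is an arbitrary choice for illustration).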
.PP
Only the last command line of a multi-line command block is eligible for
asynchronous execution, while the other commands are executed sequentially.
In contrast with \fIpmake\fP, we have to combine multiple command
lines into a single command line explicitly (by appending a backslash to
each line but the last)
to force the command block as a whole to run asynchronously.
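As a sketch (with hypothetical names), the two-line command block
.DS
dep1.c: dep1.y
	yacc dep1.y
	mv y.tab.c dep1.c
.DE
has to be rewritten as
.DS
dep1.c: dep1.y
	yacc dep1.y; \e
	mv y.tab.c dep1.c
.DE
before \fIDYNIX Make\fP runs the block as a whole asynchronously.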
.PP
To preserve compatibility with \fImake\fPs in other systems, variable expansion
is done before the parsing for parallel constructs.
This allows rules to be written as
.DS
target :$(PAR) dep1 dep2 dep3
.DE
which is accepted by any \fImake\fP if $(PAR) is not defined,
and which is accepted and
interpreted as a parallel construct if we invoke \fIDYNIX Make\fP by specifying
.DS
make "PAR='&'"
.DE
.PP
\fIDYNIX make\fP precedes the error messages from
asynchronously executing commands with process identifiers, thus enabling
the programmer to find the source of errors.
.NH
Conclusions
.PP
We believe that \fIpmake\fP satisfies our requirements in that it shows
a considerable speed-up, its description files are compatible
with Feldman's \fImake\fP, and parallelism and the problems that come with
parallelism are almost transparent to the user.
.PP
The experiments have shown a considerable speed-up in using
\fIpmake\fP in a multiprocessor environment.
We believe that it is hardly possible to achieve linear speed-up
even when we disregard the shortcomings in our test environment
as discussed in section 6.2.
First, bookkeeping and the virtual processor mechanism introduce a minor
overhead in \fIpmake\fP.
Second, it is hard to determine an optimal distribution of tasks
among the virtual processors.
Applying the \fILPT\fP algorithm, discussed in section 2.4, might help but
requires external information.
\fIPmake\fP does not implement the \fILPT\fP algorithm, but we observed a minor
speed-up when we manually reordered dependency lists.
Third, we have to deal with ``bottlenecks'' in practice.
Compiling a multi-source-file program, for example, requires a link phase, which
does not run in parallel with any of the compiling phases, although it could be
made to run incrementally (so it would finish soon after the last object
came in).
.\" Dick: But that's a dif'rent story!
Fourth, if any of the available processors are multiprogrammed, we have to
deal with processes competing for CPU cycles.
The ideal situation is to have multiple monoprogrammed processors.
.PP
The syntactic compatibility with \fImake\fP description files is maintained.
\fIPmake\fP description files are accepted and interpreted 
correctly by \fImake\fP.
\fIMake\fP description files, however, need revision if commands, like
\fIlex\fP and \fIyacc\fP, must run mutually exclusively.
Either we have to take care of non-clashing file names, by running commands
in separate directories or by forcing the commands to use
non-standard file names,
or we need to define a \fI.MUTEX\fP rule on a group of commands.
In practice, we have not yet encountered a legal \fImake\fP description file
which is treated incorrectly by \fIpmake\fP.
.PP
\fICmake\fP and \fIparmake\fP description files, too,
are compatible with \fImake\fP description 
files, but a major disadvantage is the
explicit treatment of distribution.
\fIPmake\fP assumes the underlying operating system supplies efficient parallel
processing on multiple processors, while \fIcmake\fP has to distribute the
commands itself.
.NH
Acknowledgements
.PP
I would like to thank
Eduardo Krell, who gave me information about \fInmake\fP;
Andrew Hume, who sent me documentation on \fImk\fP;
Barton Schaefer, who supplied information on \fIDYNIX Make\fP;
the \fIAmoeba\fP team at the Vrije Universiteit, who made the \fIAmoeba\fP
system usable, and who had to suffer from
my complaints about \fIAmoeba\fP's shortcomings; and Andy Tanenbaum and
Dick Grune for their suggestions, useful comments and critical readings
of this paper.
.BP
.NH
REFERENCES
.nr [W \w'10'
.LP
.]<
.ds [F 1
.]-
.ds [T The C Programming Language
.ds [K clang
.ds [A B.W. Kernighan
.as [A " and D.M. Ritchie
.ds [I Prentice-Hall
.ds [C Englewood Cliffs, New Jersey
.ds [D 1978
.nr [T 0
.nr [A 0
.nr [O 0
.][ 2 book
.ds [F 2
.]-
.ds [T Make\(emA Program for Maintaining Computer Programs
.ds [A S.I. Feldman
.ds [J Software\(emPractice and Experience
.ds [V 9
.ds [N 4
.ds [P 255-265
.nr [P 1
.ds [D April 1979
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 3
.]-
.ds [T Concurrent Compilation
.ds [A V. Seshadri
.as [A ", I.S. Small
.as [A ", and D.B. Wortman
.ds [J Proc. IFIP Conference Distributed Processing
.ds [D October 1987
.ds [C Amsterdam, Netherlands
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 4
.]-
.ds [T A Pipelined DYNAMO Compiler
.ds [A W. Huen
.as [A ", O. El-Dessouki
.as [A ", E. Huske
.as [A ", and M. Evens
.ds [J Proc. of the International Conference on Parallel Processing
.ds [P 57-66
.nr [P 1
.ds [D August 1977
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 5
.]-
.ds [T A Practical Toolkit for Making Portable Compilers
.ds [A A.S. Tanenbaum
.as [A ", H. van Staveren
.as [A ", E.G. Keizer
.as [A ", and J.W. Stevenson
.ds [J Comm. of the ACM
.ds [V 26
.ds [N 9
.ds [P 654-660
.nr [P 1
.ds [D September 1983
.ds [K ack
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 6
.]-
.ds [T Distributed compilation: a case study
.ds [A J.A. Miller
.as [A " and R.J. LeBlanc
.ds [J Proc. 3rd IEEE Int. Conference on Distributed Computing Systems
.ds [P 548-553
.nr [P 1
.ds [D October 1982
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 7
.]-
.ds [T Parallel Attribute Grammar Evaluation
.ds [A H.J. Boehm
.as [A " and W. Zwaenepoel
.ds [R internal report
.ds [I Dept. of Computer Science, Rice University
.ds [C Houston, Texas
.ds [D October 1986
.nr [T 0
.nr [A 0
.nr [O 0
.][ 4 tech-report
.ds [F 8
.]-
.ds [T Yacc: Yet Another Compiler-Compiler
.ds [A S.C. Johnson
.ds [I Bell Laboratories
.ds [C Murray Hill, NJ
.ds [D July 1978
.ds [K yacc
.nr [T 0
.nr [A 0
.nr [O 0
.][ 2 book
.ds [F 9
.]-
.ds [T Lex\(emA Lexical Analyzer Generator
.ds [A M.E. Lesk
.as [A " and E. Schmidt
.ds [I Bell Laboratories
.ds [C Murray Hill, NJ
.ds [D 1978
.ds [K lex
.nr [T 0
.nr [A 0
.nr [O 0
.][ 2 book
.ds [F 10
.]-
.ds [T The Fourth Generation Make
.ds [A G.S. Fowler
.ds [J Proc. USENIX Summer Conference
.ds [C Portland, Oregon
.ds [P 159-174
.nr [P 1
.ds [D June 1985
.ds [K nmake
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 11
.]-
.ds [T Recent Developments in Deterministic Sequencing and Scheduling: a Survey
.ds [A E.L. Lawler
.as [A ", J.K. Lenstra
.as [A ", and A.H.G. Rinnooy Kan
.ds [B Deterministic and Stochastic Scheduling
.ds [E M.A.H. Dempster et al.
.ds [I Nato Advanced Study Series
.ds [C Dordrecht, The Netherlands
.ds [D July 1981
.nr [T 0
.nr [A 0
.nr [O 0
.][ 3 article-in-book
.ds [F 12
.]-
.ds [T Performance Guarantees for Scheduling Algorithms
.ds [A M.R. Garey
.as [A ", R.L. Graham
.as [A ", and D.S. Johnson
.ds [J Operations Research
.ds [V 26
.ds [N 1
.ds [P 3-21
.nr [P 1
.ds [D 1978
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 13
.]-
.ds [T An Application of Bin Packing to Multiprocessor Scheduling
.ds [A E.G. Coffman, Jr.
.as [A ", M.R. Garey
.as [A ", and D.S. Johnson
.ds [J SIAM Journal on Computing
.ds [V 7
.ds [N 1
.ds [P 1
.nr [P 0
.ds [D February 1978
.ds [K multifit
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 14
.]-
.ds [T Parallel and Distributed Compilations in Loosely-Coupled Systems:
.as [T " A Case Study
.ds [A E. H. Baalbergen
.ds [J Proc. Workshop on Large Grain Parallelism
.ds [C Providence, RI
.ds [D October 1986
.ds [K latest
.ds [K experiment compile
.ds [K published refereed
.ds [L 1986/10
.ds [K Flgp86.n
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 15
.]-
.ds [T Principles of Distributed Operating System Design
.ds [A S. J. Mullender
.ds [R Ph.D. dissertation
.ds [I SMC
.ds [C Amsterdam
.ds [D October 1985
.ds [K latest
.ds [K thesis principles
.ds [K published refereed
.ds [L 1985/10
.nr [T 0
.nr [A 0
.nr [O 0
.][ 4 tech-report
.ds [F 16
.]-
.ds [T The Design of a Capability-Based Distributed Operating System
.ds [A S. J. Mullender
.as [A " and A. S. Tanenbaum
.ds [J The Computer Journal
.ds [V 29
.ds [N 4
.ds [P 289-300
.nr [P 1
.ds [D March 1986
.ds [K latest
.ds [K protection
.ds [K published refereed
.ds [L 1986/03
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 17
.]-
.ds [T Using Sparse Capabilities in a Distributed Operating System
.ds [A A. S. Tanenbaum
.as [A ", S. J. Mullender
.as [A ", and R. van Renesse
.ds [J Proc. of the 6th Int. Conf. on Distr. Computing Systems
.ds [P 558-563
.nr [P 1
.ds [C Cambridge, MA
.ds [D May 1986
.ds [K latest
.ds [K capabilities naming protection
.ds [K published refereed
.ds [L 1986/05
.ds [K Fdcs6.n
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 18
.]-
.ds [T Concurrent Make: The Design and Implementation of a Distributed Program
.as [T " in Concurrent C
.ds [A B. Cmelik
.ds [B Concurrent C Project
.ds [I AT&T Bell Laboratories
.ds [C Murray Hill, NJ
.ds [D 1986
.ds [K cmake
.nr [T 0
.nr [A 0
.nr [O 0
.][ 3 article-in-book
.ds [F 19
.]-
.ds [T A Program for Maintaining Computer Programs (Make)
.ds [A AT&T
.ds [B System V Support Tools Guide
.ds [P 11-40
.nr [P 1
.ds [D June 1982
.ds [K make5.2
.nr [T 0
.nr [A 0
.nr [O 0
.][ 3 article-in-book
.ds [F 20
.]-
.ds [T Concurrent C
.ds [A N.H. Gehani
.as [A " and W.D. Roome
.ds [J Software\(emPractice and Experience
.ds [V 16
.ds [N 9
.ds [P 821-844
.nr [P 1
.ds [D September 1986
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 21
.]-
.ds [T Mk: A Successor to Make
.ds [A A. Hume
.ds [R Computing Science Technical Report No. 141
.ds [I AT&T Bell Laboratories
.ds [C Murray Hill, NJ
.ds [D November 1987
.ds [K mk
.nr [T 0
.nr [A 0
.nr [O 0
.][ 4 tech-report
.ds [F 22
.]-
.ds [T Parmake and Dp: Experience with a Distributed, Parallel Implementation
.as [T " of make
.ds [A E.S. Roberts
.as [A " and J.R. Ellis
.ds [J Proc. 2nd Workshop on Large-Grained Parallelism
.ds [I Carnegie-Mellon University
.ds [C Pittsburgh, Pennsylvania
.ds [D November 1987
.ds [P 74-76
.nr [P 1
.ds [K avail;
.ds [O Available as Tech. Rep. CMU/SEI-87-SR-5
.nr [T 0
.nr [A 0
.nr [O 0
.][ 1 journal-article
.ds [F 23
.]-
.ds [T DYNIX Make Manual Page
.ds [A Sequent
.ds [B DYNIX Programmer's Manual\(emRevision 1.15
.ds [K dynix
.ds [D August 1987
.nr [T 0
.nr [A 0
.nr [O 0
.][ 3 article-in-book
.nr [W \w'10'
.]>
