Previous: Compatibility Mode, Up: Implementation Differences [Contents][Index]
GNU
troff does not emit output if it has nothing to format.
For example,
it treats an input document consisting solely of
nr
and
tm
requests as empty,
and produces nothing on its standard output stream.
AT&T
troff does,
creating a blank page.
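A minimal sketch of such an "empty" document follows; the register name and the message text are illustrative only.

```roff
.\" This input sets a register and writes a diagnostic message,
.\" but supplies nothing to format.
.nr a 1
.tm register a is \na
```

Going by the description above, GNU troff writes the message to the standard error stream and nothing to the standard output stream, whereas AT&T troff additionally emits a blank page.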
Use of C0 control characters in identifiers is not portable;
Solaris,
Plan 9,
and Heirloom Doctools
troffs accept
Control+B,
Control+C,
Control+E,
Control+F,
and
Control+G (only);
DWB 3.3
troff does not.
GNU
troff rejects C0 controls in identifiers with an error diagnostic.
Formatters that don’t implement
GNU
troff extension request names
tend to ignore them,
and if they don’t support a
GNU
troff extension escape sequence,
they are liable to format its function selector character as text.
For example,
the adjustable,
non-breaking space escape sequence
\~
is also supported by Heirloom Doctools
troff 050915 (September 2005),
mandoc
1.9.5 (2009-09-21),
neatroff
(commit 1c6ab0f6e,
2016-09-13),
and Plan 9 from User Space
troff (commit 93f8143600, 2022-08-12),
but not by Solaris
or Documenter’s Workbench
troffs, which both render it as
‘~’.
Recall Manipulating Filling and Adjustment.
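One way a document might guard its use of this escape sequence is sketched below. The string name NS is hypothetical, and the fallback to an unadjustable space is an assumption about what a given document would find acceptable. Formatters that do not define the .g register interpolate 0 for it and so take the second branch, even if they happen to support the escape sequence.

```roff
.\" Register .g interpolates 1 in GNU troff.
.ie \n(.g .ds NS \~
.\" Elsewhere, fall back to an unadjustable space; the comment
.\" escape sequence guards the space from being overlooked.
.el .ds NS \ \"
.\" "Table" and "3" stay together on one output line.
See Table\*(NS3 for details.
```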
GNU
troff’s features sometimes cause incompatibilities with documents written
assuming old implementations of
troff.
Like GNU troff, AT&T troff discards trailing spaces from input
lines, but when it does so, it also cancels end-of-sentence
detection. To suppress end-of-sentence detection portably, use the
dummy character escape sequence \& instead of a trailing space.
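As an illustration, with invented sentence text, the dummy character can follow a period that does not end a sentence:

```roff
.\" The period after "al" does not end a sentence; \& placed after it
.\" keeps the formatter from adding inter-sentence space there.
See the survey by Smith et al.\&
for a longer discussion.
```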
When adjusting output lines to both margins,
AT&T
troff at first adjusts spaces starting from the right;
GNU
troff begins from the left.
Both implementations adjust spaces
from opposite ends on alternating output lines
in this adjustment mode
to prevent “rivers” in the text.
GNU
troff does not always hyphenate words as AT&T
troff does.
The AT&T implementation uses a set of hard-coded rules
specific to U.S. English,
while GNU
troff uses language-specific hyphenation pattern files derived from TeX.
Some versions of
troff reserved meager storage for hyphenation exception words
(arguments to the
hw
request);
GNU
troff has no such restriction.
When the
hy
request is invoked without an argument,
GNU
troff sets the automatic hyphenation mode to the value of the
.hydefault
register;
the AT&T implementation sets it to
‘1’,
which is not suitable in GNU
troff
for some languages,
including English.
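The following sketch assumes GNU troff, since it uses the long register name syntax and the .hy register. The value reported depends on the hyphenation default in effect, so none is shown.

```roff
.\" Disable automatic hyphenation for a stretch of text.
.nh
.\" ... material set without automatic hyphenation ...
.\" Restore the default mode by giving hy no argument.
.hy
.tm hyphenation mode is now \n[.hy]
```

A document that must also work with AT&T troff cannot rely on the mode selected by an argumentless hy being the same in both formatters.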
Unlike
GNU
troff, AT&T
troff does not recognize an occurrence of
\%
at the beginning of a word as suppressing its hyphenation;
instead,
it (uselessly) marks the start of the word
as a potential hyphenation point,
permitting output lines to end with hyphens
that are not interior to a word.
GNU
troff handles the dummy character
\&
differently from AT&T
troff when it is followed by the hyphenation control escape sequence
\%
at the beginning of a word.
GNU
troff does not regard the dummy character as “starting” the word;
AT&T
troff does.
Further,
Heirloom Doctools
troff does not honor an explicit hyphenation point marked with
\%
after a word-initial one.
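The two uses of the hyphenation control escape sequence discussed above can be sketched as follows; the words are arbitrary examples.

```roff
.\" Word-initial \% asks GNU troff not to hyphenate the word at all.
\%Kernighan
.\" Interior \% marks permissible hyphenation points.
im\%ple\%men\%ta\%tion
```

Only the second form is likely to behave alike across implementations, given the differences in handling a word-initial \% described above.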
GNU
troff interprets request arguments representing file names
and system commands
in the same way it does the
contents
argument to the
ds
and
as
requests:
it removes a leading neutral double quote
‘"’
from the argument to the
cf,
nx,
pi,
so,
and
sy
requests,
and the second argument
(if present)
to the
lf
request,
permitting initial embedded spaces in it,
and reads it to the end of the input line in copy mode.
Recall Copy Mode.
This difference permits the formatter to handle files
with spaces in their names,
but requires more care with trailing comments,
and doubling of an initial neutral double quote
‘"’
if the file name has one.
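A sketch of how these rules apply to the so request follows. All of the file names are hypothetical, and the remark about trailing comments is a cautious reading of the text above rather than a tested claim.

```roff
.\" Embedded spaces need no quoting.
.so chapter one.roff
.\" A leading neutral double quote is removed, preserving the spaces
.\" that follow it as part of the name.
.so "  indented name.roff
.\" A file name that itself begins with a double quote needs two.
.so ""quoted".roff
.\" Keep comments on lines of their own; spaces before a trailing
.\" comment may otherwise be read as part of the file name.
```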
The existence of the
.T
string is a common feature of device-independent
troffs—DWB 3.3, Solaris,
Heirloom Doctools,
and Plan 9
troffs all support it—but valid values are specific to each implementation.
The (read-only) register
.T
interpolates 1
if GNU
troff is run with the
-T
option,
and 0 otherwise.
In contrast,
AT&T
troff interpolated 1 only if
nroff was the formatter and was run with
-T.
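Because the string and the register share a name, the interpolation escape sequence determines which one is meant. A small sketch, with illustrative message text:

```roff
.\" \*(.T interpolates the string; \n(.T interpolates the register.
.tm output device string: \*(.T
.tm -T option register: \n(.T
```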
AT&T
troff ignored attempts to remove read-only registers;
GNU
troff honors such requests.
Recall Built-in Registers.
The
lf
request sets the number of the
current
input line in AT&T
troff and the
next
in GNU
troff.
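The difference can be observed with a sketch like the one below, in which the file name is hypothetical. Going by the description above, GNU troff should report 100 and AT&T troff 101.

```roff
.lf 100 generated.ms
.\" The .c register interpolates the current input line number.
.tm this line is numbered \n(.c
```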
AT&T
troff had only environments named
‘0’,
‘1’,
and
‘2’.
In GNU
troff,
any number of environments may exist,
using any valid identifiers for their names.
Recall Identifiers.
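The following sketch contrasts a portable environment switch with one that relies on the GNU extension; the environment name caption is hypothetical.

```roff
.\" Portable: switch to environment 1, then return.
.ev 1
.ps 8
Text set in environment 1.
.br
.ev
.\" GNU troff only: an environment with a descriptive name.
.ev caption
.ps 8
Text set in the caption environment.
.br
.ev
```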
As noted in Using Fractional Type Sizes,
AT&T
troff’s ps
request ignores scaling units
and thus
‘.ps 10u’
sets the type size to 10 points,
whereas in GNU
troff it sets the type size to
10 scaled
points,
possibly a much smaller measurement.
AT&T’s behavior also means that
‘.ps 10p’
and
‘.ps 10z’
are portable.
The ab request differs from AT&T troff: GNU
troff writes no message to the standard error stream if no
arguments are given, and it exits with a failure status instead of a
successful one.
The
bp
request differs from AT&T
troff: GNU
troff does not accept a scaling unit on the argument,
a page number;
AT&T troff does (uselessly).
In AT&T
troff, the
pm
request reports
macro,
string,
and
diversion
sizes in units of 128-byte blocks,
and an argument reduces the report to a sum of the above in the same
units.
GNU
troff reports their lengths in characters or nodes if given no arguments,
and otherwise dumps
the JSON-encoded name,
contents,
and other properties of each named argument.
AT&T
troff ignores the
ss
request if the output is a terminal device;
GNU
troff rounds down the values of minimum inter-word and additional
inter-sentence space each to the nearest multiple of 12.
GNU
troff distinguishes characters from glyphs.
Characters can be ordinary,
special,
or indexed,
and populate strings and macros.
Characters
per se
have not (yet) been formatted.
Glyphs represent graphemes
(supplied by the output device)
and populate diversions
(recall Diversions).
Formatting converts characters into
(sequences of)
glyphs.
GNU
troff stores properties of the environment
that affect how a glyph is rendered with the glyph node’s data.
Thus,
subsequent formatting operations do not affect it,
including
bd,
cs,
tkf,
tr,
and
fp
requests.
Normally,
a macro or string
contains only a list of characters
and a diversion
contains only a list of nodes.
However,
applying the
asciify
or
unformat
requests to a diversion converts some of its nodes back into characters.
Where the formatter cannot recover the character representation
of a node,
it stores a null character in the character list
corresponding to a single node in the node list.
Consequently,
a glyph node does not behave as a character does in macro interpolation:
it does not inherit special properties
that the character from which it was constructed might have had.
For example,
the input

    .di x
    \\\\
    .br
    .di
    .x

produces ‘\\’ in GNU troff. Each pair of backslashes
becomes one backslash glyph; the resulting backslashes are thus
not interpreted as escape characters when they are interpolated
as the diversion is output. AT&T troff would
interpret them as escape characters when interpolating them and end up
printing one ‘\’.
One correct way to obtain a printable backslash in most documents is to
use the \e escape sequence; this always prints a single instance
of the current escape character, regardless of whether it is used in a diversion; it also works
in both GNU troff and AT&T troff.
The other correct way,
appropriate in contexts independent of the backslash’s common use as a
roff
escape character—perhaps in discussion of character sets or other
programming languages—is the special character escape sequence
\(rs or \[rs],
for “reverse solidus”,
from its name in the ECMA-6 and ISO 10646
standards.
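Both approaches are sketched below with invented sentence text.

```roff
.\" \e prints the current escape character, a backslash by default.
In roff input, type \e" to begin a comment.
.\" \(rs denotes the reverse solidus glyph itself.
The path is C:\(rsTEMP\(rsout.txt.
```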
To store in a diversion an escape sequence
that is interpreted when the diversion is interpolated,
either use the traditional
\!
transparent output facility,
or,
if this is unsuitable,
the new
\?
escape sequence.
Recall Diversions and GNU troff Internals.
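A minimal sketch of the transparent output facility inside a diversion follows; the diversion name xx and the message text are illustrative.

```roff
.di xx
.\" The rest of the next line is stored in the diversion uninterpreted.
\!.tm diversion xx is being interpolated
Diverted text.
.br
.di
.\" The stored tm request should run at this point,
.\" not when xx was being defined.
.xx
```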
Like AT&T
troff, GNU
troff maintains a buffer of device-independent output commands,
populating the buffer as formatted output accumulates.
GNU
troff always flushes this buffer when processing a break;
AT&T
troff does so according to no obvious schedule.
(Perhaps,
if the buffer is of fixed size,
the formatter performs the flush when the buffer runs out of room.)
In the somewhat pathological case where a diversion exists
containing a partially collected line
and a partially collected line at the top-level diversion
has never existed,
AT&T
troff outputs a partially collected but otherwise empty line
(as if
‘\c’
were in the top-level diversion)
at the end of input;
GNU
troff does not.