

5.39.3 Other Differences

GNU troff does not emit output if it has nothing to format. For example, it treats an input document consisting solely of nr and tm requests as empty, and produces nothing on its standard output stream. AT&T troff does, creating a blank page.
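As a minimal sketch of such an "empty" document, the following input only assigns a register and writes a diagnostic message. The register name ct is an arbitrary choice for illustration. Going by the description above, GNU troff should write the message to the standard error stream and nothing to standard output, whereas AT&T troff would also produce a blank page.

```roff
.\" Nothing here is formatted: one register assignment, one message.
.nr ct 3
.tm ct is \n(ct
```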

Use of C0 control characters in identifiers is not portable; Solaris, Plan 9, and Heirloom Doctools troffs accept Control+B, Control+C, Control+E, Control+F, and Control+G (only); DWB 3.3 troff does not. GNU troff rejects C0 controls in identifiers with an error diagnostic.

Formatters that don’t implement GNU troff extension request names tend to ignore them, and if they don’t support a GNU troff extension escape sequence, they are liable to format its function selector character as text. For example, the adjustable, non-breaking space escape sequence \~ is also supported by Heirloom Doctools troff 050915 (September 2005), mandoc 1.9.5 (2009-09-21), neatroff (commit 1c6ab0f6e, 2016-09-13), and Plan 9 from User Space troff (commit 93f8143600, 2022-08-12), but not by Solaris or Documenter’s Workbench troffs, which both render it as ‘~’. Recall Manipulating Filling and Adjustment. GNU troff’s features sometimes cause incompatibilities with documents written assuming old implementations of troff.

AT&T troff discards trailing spaces from input lines, like GNU troff, but when it does so, AT&T troff also cancels end-of-sentence detection. Use of the dummy character escape sequence \& is more portable.
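For instance, a document that wants to keep an abbreviation at the end of an input line from being treated as the end of a sentence can append the dummy character rather than rely on a trailing space. The sentence itself is only an illustration.

```roff
.\" The \& after the period keeps it from ending a sentence,
.\" without depending on how trailing spaces are handled.
He cites Smith et al.\&
in support of the claim.
```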

When adjusting output lines to both margins, AT&T troff at first adjusts spaces starting from the right; GNU troff begins from the left. Both implementations adjust spaces from opposite ends on alternating output lines in this adjustment mode to prevent “rivers” in the text.

GNU troff does not always hyphenate words as AT&T troff does. The AT&T implementation uses a set of hard-coded rules specific to U.S. English, while GNU troff uses language-specific hyphenation pattern files derived from TeX. Some versions of troff reserved meager storage for hyphenation exception words (arguments to the hw request); GNU troff has no such restriction. When the hy request is invoked without an argument, GNU troff sets the automatic hyphenation mode to the value of the .hydefault register; the AT&T implementation sets it to ‘1’, which is not suitable in GNU troff for some languages, including English.
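A short sketch of the requests mentioned above follows; the exception words are arbitrary examples. In GNU troff, the final hy request restores the mode stored in .hydefault, while AT&T troff would select mode 1.

```roff
.\" Exception words: embedded hyphens mark the permitted break points.
.hw data-base type-set-ter
.\" Disable automatic hyphenation, then re-enable it without an argument.
.nh
.hy
```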

Unlike GNU troff, AT&T troff does not recognize an occurrence of \% at the beginning of a word as suppressing its hyphenation; instead, it (uselessly) marks the start of the word as a potential hyphenation point, permitting output lines to end with hyphens that are not interior to a word.

GNU troff handles the dummy character \& differently from AT&T troff when it is followed by the hyphenation control escape sequence \% at the beginning of a word. GNU troff does not regard the dummy character as “starting” the word; AT&T troff does. Further, Heirloom Doctools troff does not honor an explicit hyphenation point marked with \% after a word-initial one.
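The two uses of \% can be illustrated as follows; the words are arbitrary. The first line relies on the GNU troff behavior described above and is therefore not portable to AT&T troff. The example keeps the word-initial and word-interior uses in separate words to sidestep the Heirloom Doctools limitation.

```roff
.\" Word-initial \% suppresses hyphenation of the word in GNU troff.
The name \%Ossanna is never broken across lines.
.\" Word-interior \% marks explicit hyphenation points.
The word hy\%phen\%a\%tion may break only where marked.
```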

GNU troff interprets request arguments representing file names and system commands in the same way it does the contents argument to the ds and as requests: it removes a leading neutral double quote ‘"’ from the argument to the cf, nx, pi, so, and sy requests, and the second argument (if present) to the lf request, permitting initial embedded spaces in it, and reads it to the end of the input line in copy mode. Recall Copy Mode. This difference permits the formatter to handle files with spaces in their names, but requires more care with trailing comments, and doubling of an initial neutral double quote ‘"’ if the file name has one.
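The following sketch applies these rules to the so request. All file names are hypothetical, and the behavior shown is what the paragraph above describes for GNU troff, not for other implementations.

```roff
.\" The rest of the line is the file name, embedded spaces included.
.so chapter one.roff
.\" A leading neutral double quote is removed; it protects initial spaces.
.so "  indented name.roff
.\" Start a trailing comment directly after the name, since spaces
.\" before the comment would become part of the argument.
.so appendix.roff\" comment follows the name immediately
.\" Double an initial neutral double quote that belongs to the name.
.so ""quoted".roff
```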

The existence of the .T string is a common feature of device-independent troffs—DWB 3.3, Solaris, Heirloom Doctools, and Plan 9 troffs all support it—but valid values are specific to each implementation.

The (read-only) register .T interpolates 1 if GNU troff is run with the -T option, and 0 otherwise. In contrast, AT&T troff interpolated 1 only if nroff was the formatter and was run with -T.
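A document can combine the register and the string of the same name, as in this sketch. It uses AT&T-compatible syntax, but the condition is true in AT&T troff only under the circumstances described above, and the value reported is specific to each implementation.

```roff
.\" Report the output device only if one was selected with -T.
.if \n(.T .tm output device: \*(.T
```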

AT&T troff ignored attempts to remove read-only registers; GNU troff honors such requests. Recall Built-in Registers.

The lf request sets the number of the current input line in AT&T troff and the next in GNU troff.
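The difference can be probed with the .c register, which interpolates the current input line number. The sketch below has not been checked against either implementation; if the description above holds, the two formatters should report values that differ by one.

```roff
.\" In GNU troff, 100 becomes the number of the line after the request;
.\" in AT&T troff, it becomes the number of the request line itself.
.lf 100
.tm this input line is number \n(.c
```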

AT&T troff had only environments named ‘0’, ‘1’, and ‘2’. In GNU troff, any number of environments may exist, using any valid identifiers for their names. Recall Identifiers.
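A sketch of a named environment follows; the name caption and the type size and vertical spacing values are arbitrary. A document meant to be portable to AT&T troff would use ‘.ev 1’ or ‘.ev 2’ instead.

```roff
.\" Switch to a named environment (GNU troff only).
.ev caption
.ps 8
.vs 10
A caption set with its own type size and vertical spacing.
.br
.\" Return to the previously selected environment.
.ev
```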

As noted in Using Fractional Type Sizes, AT&T troff’s ps request ignores scaling units and thus ‘.ps 10u’ sets the type size to 10 points, whereas in GNU troff it sets the type size to 10 scaled points, possibly a much smaller measurement. AT&T’s behavior also means that ‘.ps 10p’ and ‘.ps 10z’ are portable.
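The portable and non-portable forms discussed above look like this:

```roff
.\" Portable: both implementations set the type size to 10 points.
.ps 10p
.ps 10z
.\" Not portable: 10 points in AT&T troff, 10 scaled points in GNU troff.
.ps 10u
```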

The ab request differs from AT&T troff: GNU troff writes no message to the standard error stream if no arguments are given, and it exits with a failure status instead of a successful one.

The bp request differs from AT&T troff: GNU troff does not accept a scaling unit on the argument, a page number; AT&T troff does (uselessly).

In AT&T troff, the pm request reports macro, string, and diversion sizes in units of 128-byte blocks, and an argument reduces the report to a sum of the above in the same units. GNU troff reports their lengths in characters or nodes if given no arguments, and otherwise dumps the JSON-encoded name, contents, and other properties of each named argument.

AT&T troff ignores the ss request if the output is a terminal device; GNU troff rounds down the values of minimum inter-word and additional inter-sentence space each to the nearest multiple of 12.

GNU troff distinguishes characters from glyphs. Characters can be ordinary, special, or indexed, and populate strings and macros. Characters per se have not (yet) been formatted. Glyphs represent graphemes (supplied by the output device) and populate diversions (recall Diversions). Formatting converts characters into (sequences of) glyphs. GNU troff stores properties of the environment that affect how a glyph is rendered with the glyph node’s data. Thus, subsequent formatting operations do not affect it, including bd, cs, tkf, tr, and fp requests. Normally, a macro or string contains only a list of characters and a diversion contains only a list of nodes. However, applying the asciify or unformat requests to a diversion converts some of its nodes back into characters. Where the formatter cannot recover the character representation of a node, it stores a null character in the character list corresponding to a single node in the node list.

Consequently, a glyph node does not behave as a character does in macro interpolation: it does not inherit special properties that the character from which it was constructed might have had. For example, the input

.di x
\\\\
.br
.di
.x

produces ‘\\’ in GNU troff. Each pair of backslashes becomes one backslash glyph; the resulting backslashes are thus not interpreted as escape characters when they are interpolated as the diversion is output. AT&T troff would interpret them as escape characters when interpolating them and end up printing one ‘\’.

One correct way to obtain a printable backslash in most documents is to use the \e escape sequence; this always prints a single instance of the current escape character, regardless of whether it is used in a diversion; it also works in both GNU troff and AT&T troff.

The other correct way, appropriate in contexts independent of the backslash’s common use as a roff escape character (perhaps in discussion of character sets or other programming languages), is the special character escape sequence \(rs or \[rs], for “reverse solidus”, from its name in the ECMA-6 and ISO 10646 standards.
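Both approaches are shown in the sketch below; the sentences are illustrative only. The \(rs form is used because it is accepted by both GNU troff and AT&T troff, whereas \[rs] is GNU syntax.

```roff
.\" \e prints the current escape character, here followed by "fB".
In roff input, \efB selects the bold font.
.\" \(rs prints a reverse solidus regardless of the escape character.
A path such as C:\(rsTEMP contains a reverse solidus.
```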

To store in a diversion an escape sequence that is interpreted when the diversion is interpolated, either use the traditional \! transparent output facility, or, if this is unsuitable, the new \? escape sequence. Recall Diversions and GNU troff Internals.
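A minimal sketch of the traditional facility follows. The diversion name x matches the earlier example and the message text is arbitrary. The line introduced by \! is stored in the diversion uninterpreted, so the tm request should run when the diversion is interpolated rather than when it is collected.

```roff
.di x
\!.tm diversion x is being interpolated
Some diverted text.
.br
.di
.\" Interpolating the diversion interprets the stored request line.
.x
```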

Like AT&T troff, GNU troff maintains a buffer of device-independent output commands, populating the buffer as formatted output accumulates. GNU troff always flushes this buffer when processing a break; AT&T troff does so according to no obvious schedule. (Perhaps, if the buffer is of fixed size, the formatter performs the flush when the buffer runs out of room.)

In the somewhat pathological case where a diversion exists containing a partially collected line and a partially collected line at the top-level diversion has never existed, AT&T troff outputs a partially collected but otherwise empty line (as if ‘\c’ were in the top-level diversion) at the end of input; GNU troff does not.

