A glyph is a graphical representation of a character. Whereas a character is an abstraction of semantic information, a glyph is an intelligible mark visible on screen or paper. A character has many possible representation forms; for example, the character ‘A’ can be written in an upright or slanted typeface, producing distinct glyphs. Sometimes, a sequence of characters maps to a single glyph: this is a ligature, the most common being ‘fi’.
Space characters never become glyphs in
GNU
troff. Unless discarded
(as when trailing on text lines),
they are represented in the output by horizontal motions.
In a
troff system,
a font description file
(recall Font Directories)
lists all of the glyphs a particular font provides.
If the user requests a glyph
not available in the currently selected font,
the formatter looks it up in an ordered list of
special fonts.
By default,
the
‘ps’
(PostScript)
and
‘pdf’
output devices support the two special fonts
‘SS’
(slanted symbol)
and
‘S’ (symbol);
and these devices’
DESC
files arrange them such that the formatter searches
the former before the latter.
Other output devices use different names for special fonts.
Fonts mounted with the
fonts
keyword in the
DESC
file are globally available.
GNU
troff’s special
and
fspecial
requests alter the list of fonts treated as special on a general basis,
or only when a certain font is currently selected,
respectively.
The formatter supports three kinds of character. An ordinary character is the most commonly used, has no special syntax, and typically represents itself. Interpolate a special character with the ‘\[xxx]’ or ‘\C'xxx'’ escape sequence syntax, where xxx is an identifier. An indexed character bypasses most character-to-glyph resolution logic, uses the ‘\N'i'’ syntax, and selects a glyph from the currently selected font by its integer-valued position i in the output device’s representation of that font.
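As a rough illustration of the three syntaxes just described, the following Python sketch classifies a single input token. It is not how GNU troff parses its input; the function name `classify` is hypothetical, and only the neutral apostrophe delimiter is handled.

```python
import re

# Special characters: \[xxx] or \C'xxx' (apostrophe delimiter only, by assumption).
_SPECIAL = re.compile(r"\\\[([^\]]+)\]|\\C'([^']+)'")
# Indexed characters: \N'i' with a non-negative decimal integer i.
_INDEXED = re.compile(r"\\N'(\d+)'")


def classify(token):
    """Return (kind, value) for one ordinary, special, or indexed character."""
    m = _INDEXED.fullmatch(token)
    if m:
        return ("indexed", int(m.group(1)))
    m = _SPECIAL.fullmatch(token)
    if m:
        # Exactly one of the two groups matched, depending on the syntax used.
        return ("special", m.group(1) or m.group(2))
    if len(token) == 1 and token != "\\":
        return ("ordinary", token)
    raise ValueError(f"not a single character: {token!r}")
```

Because the `\C` form ends at its delimiter rather than at ‘]’, the sketch accepts names containing ‘]’ in that form only.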
User-defined characters
are similar to string definitions,
and permit extension of or substitution within the character repertoire.
Any ordinary,
special,
or indexed character can be user-defined.
The
char,
fchar,
schar,
and
fschar
requests create user-defined characters
employed at various stages of the character-to-glyph resolution process.
GNU
troff employs the following procedure to resolve an input character
into a glyph.
User-defined characters make this resolution process recursive.
The first step that succeeds ends the resolution procedure
for the character being formatted,
which may not be the last in the sequence interpolated
by a user-defined character.
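1. Check whether the character is defined with the char request; if so, apply this procedure to each character in its definition.
2. Check for a glyph corresponding to the character in the currently selected font.
3. Check whether the character is defined with the fchar request; if so, apply this procedure to each character in its definition.
4. Check each font made special for the currently selected font with the fspecial request, for a glyph corresponding to the character.
5. Check whether the character is defined with the fschar request for the currently selected font; if so, apply this procedure to each character in its definition.
6. Check each font made special with the special request for a glyph corresponding to the character.
7. Check whether the character is defined with the schar request; if so, apply this procedure to each character in its definition.
8. Consult each mounted font in order of font position; if its description file contains the special directive, check it for a glyph corresponding to the character.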
This stage of the resolution process
can sometimes lead to surprising
results since the
fonts
directive in the
DESC
file often contains empty positions that are filled
by a macro file or document employing the
fp
request
after the formatter initializes.
For example, consider the following:
fonts 3 0 0 FOO
This mounts font FOO at font position 3. We assume that
FOO is a special font, containing glyph foo, and that no
font has been loaded yet. The line
.fspecial BAR BAZ
makes font BAZ special only if font BAR is active. We
further assume that BAZ is really a special font, i.e., the font
description file contains the special keyword, and that it also
contains glyph foo with a special shape fitting to font
BAR. After executing fspecial, font BAR is loaded
at font position 1, and BAZ at position 2.
We now switch to a new font
XXX,
trying to access glyph
foo
that is assumed to be missing.
There are neither font-specific special fonts for
XXX
nor any other fonts made special with the
special
request,
so the formatter starts the search for special fonts
in the list of already mounted fonts,
with increasing font positions.
Consequently,
it finds
BAZ
before
FOO
even for
XXX,
which is not the intended behaviour.
See Device and Font Description Files, and Special Fonts, for more details.
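The search order described in this section can be sketched in Python. This is a simplified model and not GNU troff's implementation: the `Font` and `Formatter` classes, the representation of a character definition as a list of character names, and the single recursion guard shared by all user-defined stages are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Font:
    name: str
    glyphs: set
    special: bool = False  # description file contains the "special" directive


@dataclass
class Formatter:
    mounted: list                                 # fonts in order of font position
    char: dict = field(default_factory=dict)      # char definitions
    fchar: dict = field(default_factory=dict)     # fchar definitions
    fschar: dict = field(default_factory=dict)    # {(font name, c): definition}
    schar: dict = field(default_factory=dict)     # schar definitions
    fspecial: dict = field(default_factory=dict)  # {font name: [Font, ...]}
    special: list = field(default_factory=list)   # fonts named by "special"


def resolve(fmt, current, c, active=frozenset()):
    """Return a list of (font name, glyph name) pairs; [] if nothing matches."""
    stages = [
        ("user", fmt.char.get(c)),
        ("fonts", [current]),
        ("user", fmt.fchar.get(c)),
        ("fonts", fmt.fspecial.get(current.name, [])),
        ("user", fmt.fschar.get((current.name, c))),
        ("fonts", fmt.special),
        ("user", fmt.schar.get(c)),
        ("fonts", [f for f in fmt.mounted if f.special]),
    ]
    for kind, value in stages:
        if kind == "user":
            # A character occurring in its own definition counts as undefined.
            if value is not None and c not in active:
                return [g for d in value
                        for g in resolve(fmt, current, d, active | {c})]
        else:
            for font in value:
                if c in font.glyphs:
                    return [(font.name, c)]
    return []


# The BAR/BAZ/FOO scenario from the text, with XXX as the selected font.
bar = Font("BAR", {"a"})
baz = Font("BAZ", {"foo"}, special=True)
foo = Font("FOO", {"foo"}, special=True)
xxx = Font("XXX", {"a"})
fmt = Formatter(mounted=[bar, baz, foo, xxx], fspecial={"BAR": [baz]})
print(resolve(fmt, xxx, "foo"))  # [('BAZ', 'foo')]
```

In this model, BAZ wins for font XXX only because it occupies a lower font position than FOO, which mirrors the surprise described above.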
The groff_char(7) man page houses a complete list of predefined special character names, but the availability of any as a glyph is device- and font-dependent. For example, say
man -T dvi groff_char > groff_char.dvi
to obtain those available with the DVI device and default font
configuration.
If you want to use an additional macro package to change the fonts used,
you must run
groff
(or
troff)
directly.
groff -T dvi -m ec -m an groff_char.7 > groff_char.dvi
Special character names not listed in groff_char(7) are
derived algorithmically, using a simplified version of the Adobe Glyph
List (AGL) algorithm, which is described in
https://github.com/adobe-type-tools/agl-aglfn. The (frozen)
set of names that can’t be derived algorithmically is called the
groff glyph list (GGL).
• A glyph for a Unicode character that is not a composite is named uXXXX[X[X]]. X must be an uppercase hexadecimal digit. Examples: u1234, u008E, u12DB8. The largest Unicode value is 0x10FFFF. There must be at least four X digits; if necessary, add leading zeroes (after the ‘u’). No zero padding is allowed for character codes greater than 0xFFFF. Surrogates (i.e., Unicode values greater than 0xFFFF represented with character codes from the surrogate area U+D800-U+DFFF) are not allowed either.
• A glyph representing more than a single input character is named
‘u’ component1 ‘_’ component2 ‘_’ component3 …
Example: u0045_0302_0301.
For simplicity, all Unicode characters that are composites must be maximally decomposed to NFD; for example, u00CA_0301 is not a valid glyph name since U+00CA (LATIN CAPITAL LETTER E WITH CIRCUMFLEX) can be further decomposed into U+0045 (LATIN CAPITAL LETTER E) and U+0302 (COMBINING CIRCUMFLEX ACCENT). u0045_0302_0301 is thus the glyph name for U+1EBE, LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE.
• groff decomposes algorithmically derived names of composite characters itself; for example, u0100 (LATIN CAPITAL LETTER A WITH MACRON) is automatically decomposed into u0041_0304. Additionally, a glyph name of the GGL is preferred to an algorithmically derived glyph name; groff also automatically does the mapping. Example: The glyph u0045_0302 is mapped to ^E.
• Glyph names of the GGL cannot be used in composite glyph names; for example, ^E_u0301 is invalid.
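The naming rules above can be checked mechanically. The following Python sketch is one reading of them, not groff's own validation code: the function names are hypothetical, the NFD test relies on Python's `unicodedata` tables (whose Unicode version may differ from groff's), and any leading zero beyond four digits is rejected, which is an assumption slightly stricter than the text.

```python
import re
import unicodedata

_COMPONENT = re.compile(r"[0-9A-F]{4,6}")  # uppercase hexadecimal digits only


def _valid_component(hex_digits):
    if not _COMPONENT.fullmatch(hex_digits):
        return False
    if len(hex_digits) > 4 and hex_digits[0] == "0":
        return False  # assumption: no zero padding beyond four digits
    value = int(hex_digits, 16)
    if value > 0x10FFFF:
        return False  # beyond the largest Unicode value
    return not 0xD800 <= value <= 0xDFFF  # surrogates are not allowed


def is_valid_unicode_glyph_name(name):
    """Check a uXXXX[X[X]] or u<c1>_<c2>_... glyph name against the rules."""
    if not name.startswith("u"):
        return False
    components = name[1:].split("_")
    if not all(_valid_component(c) for c in components):
        return False
    if len(components) > 1:
        # Composite names must already be maximally decomposed (NFD).
        text = "".join(chr(int(c, 16)) for c in components)
        return unicodedata.normalize("NFD", text) == text
    return True
```

A single-component name such as u0100 passes here even though it is decomposable, since the text says groff decomposes such names automatically.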
\(nm
\[name]
\[base-glyph combining-component …]
Typeset a special character name (two-character name nm) or a composite glyph consisting of base-glyph overlaid with one or more combining-components. For example, ‘\[A ho]’ is a capital letter “A” with a “hook accent” (ogonek).
There is no special syntax for one-character names—the analogous form
‘\n’ would collide with other escape sequences. However, the
four escape sequences \', \-, \_, and \`,
are translated on input to the special character escape sequences
\[aa], \[-], \[ul], and \[ga], respectively.
A special character name of length one is not the same thing as an
ordinary character: that is, the character a is not the same as
\[a].
If name is undefined, a warning in category ‘char’ is produced and the escape is ignored. See Warnings, for information about the enablement and suppression of warnings.
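GNU troff resolves \[…] with more than a single component as follows:
• Any component that is found in the GGL is converted to the uXXXX form.
• Any component uXXXX that is found in the list of decomposable glyphs is decomposed.
• The resulting elements are then concatenated with ‘_’ in between, dropping the leading ‘u’ in all elements but the first.
No check for the existence of any component (similar to the tr request) is done.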
Examples:
\[A ho]
‘A’ maps to u0041, ‘ho’ maps to u02DB, thus the final glyph name would be u0041_02DB. This is not the expected result: the ogonek glyph ‘ho’ is a spacing ogonek, but for a proper composite a non-spacing ogonek (U+0328) is necessary. Looking into the file composite.tmac, one can find ‘.composite ho u0328’, which changes the mapping of ‘ho’ while a composite glyph name is constructed, causing the final glyph name to be u0041_0328.
\[^E u0301]
\[^E aa]
\[E a^ aa]
\[E ^ ']
‘^E’ maps to u0045_0302, thus the final glyph name is u0045_0302_0301 in all forms (assuming proper calls of the composite request).
It is not possible to define glyphs with names like ‘A ho’
within a groff font file. This is not really a limitation;
instead, you have to define u0041_0328.
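The construction of a composite glyph name, as walked through in the examples, can be sketched in Python. The three tables below are tiny hypothetical excerpts, not groff's data: the entries for ‘ho’ come from the text, and the entries for ‘aa’, ‘'’, ‘a^’, and ‘^’ are inferred from the statement that all four forms yield the same name.

```python
import re

# Hypothetical excerpts of the tables involved (not groff's actual data).
NAME_TO_U = {"A": "u0041", "E": "u0045", "^E": "u0045_0302", "ho": "u02DB"}
COMPOSITE = {"ho": "u0328", "aa": "u0301", "'": "u0301",
             "a^": "u0302", "^": "u0302"}
DECOMPOSE = {"u00CA": "u0045_0302"}

_U_FORM = re.compile(r"u[0-9A-F]{4,6}(_[0-9A-F]{4,6})*")


def composite_glyph_name(base, *components, composite=COMPOSITE):
    """Build the glyph name for \\[base component ...]."""
    # The composite mapping rewrites combining components only, not the base.
    parts = [base] + [composite.get(c, c) for c in components]
    codes = []
    for part in parts:
        # Convert names to the uXXXX form, then decompose where possible.
        u_name = part if _U_FORM.fullmatch(part) else NAME_TO_U[part]
        u_name = DECOMPOSE.get(u_name, u_name)
        # Drop the leading 'u'; it is restored once for the whole name.
        codes += u_name[1:].split("_")
    return "u" + "_".join(codes)
```

Passing `composite={}` reproduces the unwanted result with the spacing ogonek (u0041_02DB), which shows why the mapping in composite.tmac is needed.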
\C'xxx'
Typeset the special character
xxx.
Normally,
it is more convenient to use
‘\[xxx]’,
but
\C
has some advantages:
it is compatible with AT&T device-independent
troff
(and therefore available in compatibility
mode)
and can interpolate special characters with
‘]’
in their names.
The delimiter need not be a neutral apostrophe;
recall Delimiters.
.composite c1 c2
Map ordinary or special character name c1 to c2 when
c1 is a combining component in a composite character. See above
for examples. This is a strict rewriting of the special character name;
no check is performed for the existence of a glyph for either.
Typically, composite is used to map a spacing character to a
combining one. A set of default mappings for many accents can be found
in the file composite.tmac, loaded by the default troffrc
at startup.
You can obtain a report of mappings defined by composite on the
standard error stream with the pcomposite request.
See Debugging.
\N'n'
Format indexed character numbered
n
in the current font
(n is
not
the input character code).
n can
be any non-negative decimal integer.
Most devices number glyphs with codes between 0 and 255 only;
the
utf8
output device uses codes in the range 0–65535.
If the current font does not contain a glyph with that code,
special fonts are
not
searched.
The
\N
escape sequence can be conveniently used in conjunction with the
char
request.
.char \[phone] \f[ZD]\N'37'
The code of each glyph is given in the fourth column in the font
description file after the charset command. It is possible to
include unnamed glyphs in the font description file by using a name of
‘---’; the \N escape sequence is the only way to use these.
No kerning is applied to glyphs accessed with \N. The delimiter
need not be a neutral apostrophe; see Delimiters.
A few escape sequences are also special characters.
\'
An escaped neutral apostrophe is a synonym for \[aa] (acute accent).
\`
An escaped grave accent is a synonym for \[ga] (grave accent).
\-
An escaped hyphen-minus is a synonym for \[-] (minus sign).
\_
An escaped underscore (“low line”) is a synonym for \[ul] (underrule). On typesetting devices, the underrule is font-invariant and drawn lower than the underscore ‘_’.
.cflags n c …
Assign properties encoded by non-negative integer n to each character or class c. Spaces need not separate c arguments.
Characters, whether ordinary, special, or indexed, have certain associated properties. The first argument is the sum of the desired flags and the remaining arguments are the characters to be assigned those properties.
The non-negative integer n is the sum of any of the following. Some combinations are nonsensical, such as ‘33’ (1 + 32).
1
Recognize the character as ending a sentence if followed by a newline or two spaces. Initially, characters ‘.?!’ have this property.
2
Enable breaks before the character. A line is not broken at a character with this property unless the characters on each side both have non-zero hyphenation codes. This exception can be overridden by adding 64. Initially, no characters have this property.
4
Enable breaks after the character. A line is not broken at a character with this property unless the characters on each side both have non-zero hyphenation codes. This exception can be overridden by adding 64. Initially, characters ‘\-\[hy]\[em]’ have this property.
8
Mark the glyph associated with this character as overlapping other instances of itself horizontally. Initially, characters ‘\[ul]\[rn]\[ru]\[radicalex]\[sqrtex]’ have this property.
16
Mark the glyph associated with this character as overlapping other instances of itself vertically. Initially, the character ‘\[br]’ has this property.
32
Mark the character as transparent for the purpose of end-of-sentence recognition. In other words, an end-of-sentence character followed by any number of characters with this property is treated as the end of a sentence if followed by a newline or two spaces. This is the same as having a zero space factor in TeX. Initially, characters ‘"')]*\[dg]\[dd]\[rq]\[cq]’ have this property.
64
Ignore hyphenation codes of the surrounding characters. Use this in combination with values 2 and 4 (initially, no characters have this property).
For example, if you need an automatic break point after the en-dash in numeric ranges like “3000–5000”, insert
.cflags 68 \[en]
into your document. However, this practice can lead to bad layout if
done thoughtlessly; in most situations, a better solution than
changing the cflags value is to insert \: right after the
en-dash at the places that really need a break point.
The remaining values were implemented for East Asian language support; those who use alphabetic scripts exclusively can disregard them.
128
Prohibit a line break before the character, but allow a line break after the character. This works only in combination with flags 256 and 512 and has no effect otherwise. Initially, no characters have this property.
256
Prohibit a line break after the character, but allow a line break before the character. This works only in combination with flags 128 and 512 and has no effect otherwise. Initially, no characters have this property.
512
Allow line break before or after the character. This works only in combination with flags 128 and 256 and has no effect otherwise. Initially, no characters have this property.
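Since the first argument to cflags is a sum of the values listed above, it can help to see such a sum taken apart. The Python sketch below is only an illustration; the `CFLAGS` table paraphrases the descriptions in this section, and the function name `decode_cflags` is hypothetical.

```python
# Short paraphrases of the flag values documented above.
CFLAGS = {
    1: "ends a sentence",
    2: "break allowed before",
    4: "break allowed after",
    8: "overlaps itself horizontally",
    16: "overlaps itself vertically",
    32: "transparent for end-of-sentence recognition",
    64: "ignore surrounding hyphenation codes",
    128: "no break before (East Asian)",
    256: "no break after (East Asian)",
    512: "break before or after (East Asian)",
}


def decode_cflags(n):
    """List the properties encoded by the non-negative integer n."""
    if n < 0:
        raise ValueError("cflags value must be non-negative")
    return [meaning for bit, meaning in CFLAGS.items() if n & bit]


print(decode_cflags(68))
# ['break allowed after', 'ignore surrounding hyphenation codes']
```

The value 68 used in the en-dash example is thus 4 + 64: a break is allowed after the character regardless of the hyphenation codes of its neighbors.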
In contrast to values 2 and 4, the values 128, 256, and 512 work pairwise. If, for example, the left character has value 512, and the right character 128, no break will be automatically inserted between them. If we use value 6 instead for the left character, a break after the character can’t be suppressed since the neighboring character on the right doesn’t get examined.
.char c ["][contents]
.fchar c ["][contents]
.fschar f c ["][contents]
.schar c ["][contents]
Define an ordinary, special, or indexed character c as contents.
Omitting contents gives c an empty definition.
GNU
troff removes a leading neutral double quote
‘"’
from
contents,
permitting initial embedded spaces in it,
and reads it to the end of the input line in copy mode.
See Copy Mode.
Defining
(or redefining)
a
character c
creates a formatter object
that
GNU
troff recognizes like any other ordinary,
special,
or indexed character on input,
and produces
contents
on output.
When
formatting
c,
GNU
troff processes
contents
in a temporary environment and encapsulates the result
in a node,
disabling compatibility mode
and setting the escape character
to \
while interpreting
contents.
Any emboldening,
constant spacing,
or track kerning applies to this object
rather than to individual glyphs resulting from the formatting of
contents.
A character defined by these requests
can be used just like a glyph provided by the output device.
In particular,
other characters can be translated to it with the
tr
and
trin requests;
it can be made the tab or leader fill character with the
tc
and
lc
requests,
respectively;
sequences of it can be drawn with the
\l
and
\L
escape sequences;
and,
if the
hcode
request is used on
c,
it is subject to automatic hyphenation.
However, a user-defined character c does not participate at its boundaries in kerning adjustments or italic corrections.
The formatter prevents infinite recursion
by treating an occurrence
of a character in its own definition
as if it were undefined;
when interpolating such a character,
GNU
troff emits a warning in category ‘char’.
The tr and trin requests take precedence if char
accesses the same symbol.
.tr XY
X
⇒ Y
.char X Z
X
⇒ Y
.tr XX
X
⇒ Z
The
fchar
request defines a fallback glyph:
troff
checks for glyphs defined with
fchar
only if it cannot find the glyph in the current font.
troff
performs this test before checking special fonts.
fschar
defines a fallback glyph for font f:
troff
checks for glyphs defined with
fschar
after the list of fonts declared as font-specific special fonts
with the
fspecial
request,
but before the list of fonts declared as global special fonts
with the
special
request.
Finally,
the
schar
request defines a global fallback glyph:
troff
checks for glyphs defined with
schar
after the list of fonts declared as global special fonts
with the
special
request,
but before the already mounted special fonts.
See Character Classes.
Caution:
These requests remove a leading neutral double quote
‘"’
and treat the remainder of the input line
as their second argument,
including any spaces,
up to a newline or comment escape sequence.
See the discussion of the
ds
request in Strings.
.rchar c …
.rfschar f c …
Remove definition of each
ordinary,
special,
or
indexed
character
c,
undoing the effect of a
char,
fchar,
or
schar
request.
Spaces need not separate
c
arguments.
The character definition removed
(if any)
is the first encountered in the resolution process documented above.
Glyphs,
which are defined by font description files,
cannot be removed.
rfschar
removes character definitions created by
fschar
for
font f.