DKIM Access Control and Differential Changes

steffen@sdaoden.eu

General DKIM

This document specifies a DKIM (RFC 6376) iteration that allows cryptographic verification of SMTP (RFC 5321) envelope data, and of any signature along the message path, even beyond IMF (RFC 5322) message content changes. It addresses existing security glitches, and introduces active mitigations to contain the collateral damage effects of email solutions of the recent past through a standardized solution, also by moving complexity away from lower network protocol layers, where problems cannot be solved. It updates DKIM in certain aspects that reality has proven to be superfluous, incomplete, or obsolete.

Introduction

DKIM was not designed to cover SMTP envelope data, which allows malicious third parties to replay valid, verifiable messages to an unbounded set of recipients, undetectable by sender and recipients. (Rationale: to aid SMTP delivery to recipients in various conditions, even the existing but optional "x=" expiration tag timestamp must be chosen so far in the future that malicious players have plenty of time to misuse messages.) Whereas DKIM standardized rudimentary, incomplete approaches to inspect at least header field modifications of IMF message content that happen along the message path (the "z=" tag content; also to point to the "l=" tag), the overall design was agreed upon not to survive them (compare, for example, ).

The resulting paradigm is "as long as one signature can be successfully verified, DKIM verification will succeed". It is a context-free "take it and accept it" approach.

This is problematic as message content changes may be falsely attributed to the address(es) in the IMF originator field(s). Later policy-enforcing standards effectively complicated the situation: false attribution may now technically be avoidable, but practical mitigations like "user A via B" will still be attributed to "A" by a human, and, in short, anything is valid if just one signature is. ( elaborates more context.) Potentially many signatures may exist in a message. DKIM gives hints on how verification can be performed, but in practice mitigations are applied in order to reduce excessive and useless verifications on hops down the message path: older, especially broken, signatures are removed or renamed as changes are performed on message content. Especially mailing-lists, or, in general, hops that cross the definition of a "final delivery for the message", act like this. A standardized approach to avoid excessive network traffic and, in parts, CPU work during message verification will mitigate careless configurations.

Conventions and Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here. The term "FOSS" refers to Free and Open Source Software. The term "privately encrypt" is used to denote that the software should take appropriate steps to ensure data encryption and key security.

DKIM Vivid Adjustments

This document obsoletes certain unused / incomplete aspects of DKIM, and adjusts certain vivid parts as follows. The full context of the changes will become clear as the remainder of this document unfolds.

Informative remark: The protocol changes apply only to signatures signalling compliance with the iterated variant presented in this document.

Temporary errors encountered during DNS record lookups MUST be passed through to the SMTP layer as such via a 451 or 455 reply code; if enhanced status codes are used, 4.4.3 ("routing server failure") or 4.7.5 ("cryptographic failure") MUST be used.
Informative remark: Verification and creation of cryptographic signatures become crucial, therefore DNS failures can no longer be subject to local policy. Among others, the DKIM sections 4.2 "Interpretation", 6.3 "Interpret Results/Apply Local Policy", and 8.15 "Attacks Involving Extra Header Fields" are declared "changed" in meaning, in that they yield to this paradigm.
It is herewith clarified that during "relaxed" canonicalization of header fields any encountered carriage return (CR) or line feed (LF) MUST be ignored. This makes official the approach taken by existing implementations to deal with compatibility issues required by IMF (RFC 5322). Likewise, care has to be taken to ensure that only the sequence CRLF is interpreted as a line terminator in body processing.
Informative remark: The milter protocol, in widespread use for implementing SMTP extensions or support like DKIM, does not pass CRLF during header processing, but uses LF. Also, because DKIM processing is byte-based and does not know about "quoted-pair" at all, the specification could never be complied with in full in reality.
The signature expiration tag "x=" is no longer optional. It MUST be used to place a lifetime constraint. The maximum "t=" to "x=" delta MUST NOT be greater than 864000 seconds (ten days: to reach into the next working week). Example delta values for tag auto-generation may be the bounce defaults 432000 seconds (five days: used for example by the Mailman2 and mlmmj mailing-list managers, and the postfix MTA), 345600 seconds (four days: OpenSMTPD MTA), or 172800 seconds (two days: Exim MTA).
Informative remark: In that respect, compare the "reasonable validation interval before [keys are] being removed from the key server" defined in DKIM section 5.2.
Informative remark: For "ingress" signatures with the DKIM/ACDC "I" flag, as below, this limit does not apply: it MAY be as high as local policies desire, in order to support delayed verification of validation results.
The AUID (DKIM, section 3.5, "i=" tag description) SHOULD be used, and the value should enable the signer to identify the originator for whom it creates the signature. The value SHOULD NOT hinder obfuscation, for example hiding real (sender etc.) address(es) for the purpose of cryptographically protecting email in an end-to-end security scenario.
Configuration directives to verify that for locally created messages SMTP envelope MAIL FROM matches the first address of the IMF From: header field SHOULD be offered, and may be enabled by default.
Informative remark: Offering an easily accessible constraint to ensure matching identities of envelope and message data for messages newly injected into the email system prevents an entire type of attack, which is not easy to counteract otherwise. Since parsing of "RFC5322.From" is a common DKIM operation (compare DMARC), the task of bolstering a domain's operational stringency is in good hands there.
Informative remark: Reading , not only for more context on operational email status, and using the Author: header field described there is recommended.
DKIM section 3.7 defines how "Computing the Message Hashes" has to be performed. Unlike the RSA algorithm (RFC 8017), the only one defined for DKIM at the time section 3.7 was written, modern algorithms include checksumming themselves. Section 3.7 is hereby modified in that the input to "sig-alg", the "data-hash", can adapt to standardized algorithms as appropriate. If an algorithm chooses adaptation, "hash-alg" is only used to produce the "body-hash", whereas the input formerly used to create the "data-hash" is fed in full into "sig-alg", instead of to "hash-alg". More formally, the new pseudo-code for the signature algorithm is:
Informative remark: An algorithm that exists in "adapted" and "non-adapted" variants MUST be treated as a single algorithm. For example, a constraint like "one X per algorithm" would not allow both variants to be used.
Informative remark: Unlike plain DKIM, DKIM/ACDC, as below, requires keeping the normalized content of header fields around for consumption by the differential changes algorithm. The immediate consumption of all input by "hash-alg", as was and is implemented by certain (not all) DKIM software, can therefore no longer be a primary design goal. Whether the larger amount of data ("data-hash" vs all of "h-headers", "D-SIG", "body-hash"), when fed into "sig-alg", has a negative impact is to a large extent a matter of digital signature quality of implementation and of hardware performance.
A new "Key Type" (DKIM section 7.6) is added: "EDA-SHA256". It is identical to ED25519 except for using the above adaptation. DKIM/ACDC compatible software MUST support "ed25519-sha256" as well as "eda-sha256". The type "eda" MUST NOT be used for the backward compatible DKIM-Signature: header fields, but only for the "DKIX-" DKIM/ACDC-only variants, as below.
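The modified section 3.7 computation described above can be illustrated with a minimal sketch. It only shows which bytes reach "sig-alg" in the classic and in the adapted case; the signing step itself is left out. The function names, the use of SHA-256 as "hash-alg", and the plain concatenation of "h-headers", "D-SIG" and "body-hash" are assumptions made for illustration, not definitions taken from this document.

```python
import hashlib


def body_hash(canon_body: bytes) -> bytes:
    """"body-hash": "hash-alg" over the canonicalized body (SHA-256 assumed)."""
    return hashlib.sha256(canon_body).digest()


def sig_alg_input(h_headers: bytes, d_sig: bytes, canon_body: bytes,
                  adapted: bool) -> bytes:
    """Return the bytes that would be handed to "sig-alg" (hypothetical helper)."""
    # Assumption: the argument list of the pseudo-code is modelled as a
    # simple concatenation of its members.
    data = h_headers + d_sig + body_hash(canon_body)
    if adapted:
        # Adapted algorithm: the former "data-hash" input is fed in full
        # into "sig-alg"; "hash-alg" was only used for the "body-hash".
        return data
    # Classic DKIM: "sig-alg" signs the "data-hash".
    return hashlib.sha256(data).digest()
```

In the classic case the result is always a 32 byte digest, in the adapted case its length grows with the number and size of the signed header fields.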

DKIM/ACDC

The DKIM iteration Access Control and Differential Changes:

Changes paradigm: there is absolute trust in public key cryptography.
Informative remark: A correct cryptographic signature is the only known way to safely ensure message validity. A failing integrity check is described by from 2003 as "the message was corrupted or altered", meaning the system is "unable to validate the message". Further processing and delegating of such a message is misguided, not least in a world in which not even communication handshakes succeed in an identical situation.
Introduces a flag second: base DKIM, and any base DKIM compatibility items introduced below, become obsolete after "t=" 2054847098 (0x7A7A7A7A hexadecimal), which corresponds to ISO 8601 2035-02-11T22:51:38Z. DKIM/ACDC aware software MUST NOT generate nor interpret compatibility items when the "t=" tag of signatures denotes a larger timestamp. It MAY have configuration options to do so much earlier, in general or for certain network communication partners.
Informative remark: A "t=" older than the allowed maximum "t=" to "x=" delta (as above), or more than 84421 seconds in the future, MUST cause the signature to be ignored, or the message to be rejected in case the signature "is vivid"; in the latter case the reply code 550 is to be used; if enhanced status codes are used, 5.4.7 ("delivery time expired") MUST be used. (A DKIM/ACDC DKIX-Sig: with the highest "sequence" is "vivid".)
Introduces new header fields: DKIX-Sig: is the signature, DKIX-DC: covers differential changes, as necessary, and DKIX-AC: constitutes an access-control signature. There is also a "temporary" DKIX-Store: header field. They should be treated like trace headers, just like DKIM-Signature:.
Informative remark: DKIM/ACDC treats the DKIM-Signature: header field as actively maintained legacy that is left alone, except that any such field generated by DKIM/ACDC aware software links to its corresponding DKIX-Sig: successor via the new "w=" tag, and may (will) be removed by later DKIM/ACDC aware hops as part of active mitigations : the DKIM section 4.2 "Interpretation" is cleared away in this respect. (Past the flag second any such field MUST be removed.)
DKIX signatures MUST NOT be generated with the RSA algorithm ("rsa-sha256"). If a legacy DKIM-Signature: header field is generated, it may, or, for maximum compatibility, should be generated with it. Any DKIX header field MUST NOT have FWS surrounding the "=" separator in key/value lists. More formally, this specification obsoletes the use of FWS in "tag-spec". (DKIM/ACDC thus "reverts" back to "original policies" as used for example by MIME.)
Places signatures in an ordered, numbered, randomly accessible sequence whose states correlate. (Signatures generated at the same hop share a sequence number.)
Informative remark: With DKIM/ACDC it can be, and usually is, sufficient to verify only the cheaply detectable highest numbered signature.
Adds reversible data difference tracking, and as such supports cryptographical content verification of any (DKIM/ACDC aware) intermediate message state, up to the initial variant as sent by the author.
Cryptographically protects the SMTP envelope, that is, RCPT TO addresses as well as the MAIL FROM address.
Informative remark: Replay of valid messages to initially not addressed recipients, as well as backscatter bounces to random addresses instead of the originator, becomes detectable.
The advice of DKIM sections 5.4 and 5.4.1, "Determine the Header Fields to Sign" and "Recommended signature content", respectively, is replaced with a dedicated header field database , the members of which MUST be signed. Database members are not announced via the "h=" tag. Instead DKIX-Sig: uses a "visualized" bitset that is stored in the "hfdb-bits" subtag of the "acdc=" tag.
Informative remark: For DKIX-Sig: "h=" becomes an optional tag that is only used to denote those fields which are not covered by the database.
Informative remark: The concept of "sealing" or "oversigning", that is, the DKIM saying of section 3.5, "h=" tag description, to "sign fields not present at the time of signing" does not apply. Each hop signs the message state it sees, and content changes are recorded via DKIX-DC: differential changes.
Side note: It is not up to DKIM(/ACDC) to decide upon the validity of changes. With DKIM/ACDC it is just like today, but in a standardized fashion: each hop "mitigates away" older message instances to create a valid signature environment for hops further down the message path. Unlike today, however, these hops can cryptographically verify and inspect older instances up the path, back to the message author. Since DKIM/ACDC mitigations require message content changes to "take message ownership", this is always visible to and verifiable by end users. Oversigning has no meaning: without content changes nothing is mitigated, with changes mitigations can be unrolled.
Only makes use of the "relaxed" canonicalization type. The "simple" variant MUST NOT be used. (A rationale for choosing "relaxed" is given in the section "Further DKIM Updates".)
Allows recognition of certain flagged conditions (along the message path) only by looking at the highest numbered signature.
Allows cryptographically verifiable collection of statistics of organizational trust (, section 2.5) along the entire message path.
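The time constraints stated above (mandatory "x=", the maximum "t=" to "x=" delta, the limits on the age and on the future offset of "t=", and the flag second) can be sketched as follows. This is a minimal illustration under stated assumptions: the function and constant names are hypothetical, and exempting "I" ingress signatures from the age check as well as from the delta limit is an interpretation, not something the text spells out.

```python
from typing import Optional

MAX_DELTA = 864000        # maximum "t=" to "x=" delta: ten days
MAX_FUTURE = 84421        # tolerated offset of "t=" into the future
FLAG_SECOND = 0x7A7A7A7A  # 2054847098, 2035-02-11T22:51:38Z


def signature_times_ok(t: int, x: Optional[int], now: int,
                       ingress: bool = False) -> bool:
    """Check the "t=" and "x=" tags of one signature (hypothetical helper)."""
    if x is None or x <= t:
        return False      # "x=" is no longer optional
    if t - now > MAX_FUTURE:
        return False      # "t=" too far in the future
    if ingress:
        return True       # assumption: no delta or age limit for "I"
    if x - t > MAX_DELTA:
        return False      # lifetime exceeds ten days
    if now - t > MAX_DELTA:
        return False      # "t=" older than the allowed maximum delta
    return True


def compat_items_allowed(t: int) -> bool:
    """Base DKIM compatibility items are obsolete past the flag second."""
    return t <= FLAG_SECOND
```

A caller would ignore a signature for which `signature_times_ok` is false, or reject the message if that signature is the "vivid" one.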

The DKIM iteration Access Control and Differential Changes creates DKIX-Sig: signature header fields, which are identical to DKIM-Signature: header fields except for the new "acdc=" and "dch=" tags, the missing "v=" tag, and the changed semantics of the now optional "h=" tag. The "dch=" tag holds the checksum of the canonicalized data of all existing DKIX-DC: header fields, each header field concluded with a CRLF, sorted in reverse header stack order as defined by DKIM section 5.4.2, "Signatures Involving Multiple Instances of a Field" (which should be equal to the reverse "sequence" order), with the DKIX-DC: header created for this signature, if any, last (at the top of the stack). The "acdc=" tag consists of multiple subtags separated with colon (U+003A, :). For efficiency reasons it SHOULD be placed early, before tags like the new "dch=", but also "bh=", "b=", and "h=", for example.
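How a "dch=" value could be derived from the DKIX-DC: header fields is sketched below. The sketch applies "relaxed" header canonicalization as in DKIM section 3.4.2, ignoring any CR and LF as clarified earlier, and concludes each field with CRLF. Several points are assumptions for illustration only: the checksum is taken to be SHA-256, the result is taken to be base64 encoded like "bh=", the function names are hypothetical, and the caller is expected to pass the fields already in the order prescribed above (the field created for this signature last).

```python
import base64
import hashlib
import re
from typing import Iterable, Tuple


def relaxed_header(name: bytes, value: bytes) -> bytes:
    """"relaxed" canonicalization of one header field, concluded with CRLF."""
    # Any encountered CR or LF is ignored (this also unfolds the field).
    value = value.replace(b"\r", b"").replace(b"\n", b"")
    # Runs of whitespace collapse to one SP; surrounding whitespace is dropped.
    value = re.sub(rb"[ \t]+", b" ", value).strip(b" ")
    return name.strip().lower() + b":" + value + b"\r\n"


def dch_checksum(ordered_fields: Iterable[Tuple[bytes, bytes]]) -> str:
    """Checksum over all DKIX-DC: fields, given in the prescribed order."""
    digest = hashlib.sha256()  # assumption: SHA-256
    for name, value in ordered_fields:
        digest.update(relaxed_header(name, value))
    return base64.b64encode(digest.digest()).decode("ascii")
```

Because the fields are hashed as one stream, swapping two DKIX-DC: fields changes the checksum, which is what ties the "dch=" value to the order of the differential changes.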

The tag starts with the "sequence" subtag, a decimal number starting at 1, or incremented by 1 from the highest DKIM/ACDC "sequence" encountered in the message; the maximum value is 999: if incrementing would result in overflow, the message MUST be rejected; detected sequence holes MUST also cause rejection (but see below); in both cases SMTP reply code 550 is to be used; if enhanced status codes are used, 5.5.4 MUST be used.
Informative remark: The chosen limit seems sufficiently high to never cause problems in practice; compare for example SMTP section 6.3 "Loop Detection".
Multiple signature header fields with the same "sequence" MAY be generated by a domain, in which case each field MUST use a different "s=" selector, and MUST NOT use more than one selector per algorithm.
The second subtag after "sequence" is "hfdb-bits", a "visualized" (base 36) bitset that stores 5 bits per byte (exemplary C code exists ), and records presence of header fields which are part of the header field database . (Standard conforming messages always have multiple header fields announced by "hfdb-bits", but "hfdb-bits" MUST at least be represented by "0".)
Informative remark: The optional "h=" tag is only used for header fields which are not included in the database. An empty "h=" tag MUST NOT be generated.
The third subtag contains a list of flags that announce state and conditions of the message at its current point in the message path. If a flag is said to be necessary, all flags that it implies must also be set, even if not explicitly mentioned. Flag description is normative. (Again: note the missing FWS separators around "="!) ABNF:

A
Alongside the "V" flag (see there) only: all existing signatures were verified.
D
The message content was modified at this hop, differential changes were generated, and are stored in a DKIX-DC: header field. The "Y" flag has to be set.
C
The hop signals interest in collection and (periodical) report of statistical information regarding (this) message(s). The exact semantics are out of scope for this document. As an example of what this could be DMARC aggregate reporting (RFC 9990) may be mentioned.
E
The SMTP envelope (MAIL FROM and/or RCPT TO) was modified. The "O" or "N" flag has to be set if the MAIL FROM changed. The "y" flag has to be set. But for "I"ngress signatures a new "Access Control" evaluation has been performed. Existing DKIX-AC: header fields MUST be removed.
I
This signature header field was generated at ingress. Special rules apply to these signatures, for example unlimited "x=" tag expiration.
Informative remark: Such fields offer a cryptographically verifiable message state authentication contract: for as long as the ("local") "s=" selector announced key is available, the message state at the time it entered the "local" email processing system is assurable by for example user interfaces.
All signature instances with this flag set MUST be removed when messages enter and leave the email system. This is meant as simple: if the flag is set, remove the field.
Informative remark: If a local "I" message is removed on egress, the newly generated DKIX-Sig: takes over its logical flag subset.
L
This hop announces that it supports the SMTP extension STARTTLS. The flag MUST only be set if any incoming SMTP connection will reach a TLS-enabled endpoint. Author remark: "Dies mein Abendgruss folgend dem SRV DNSSEC Blues" ("here my evening greeting following the SRV dnssec cheatin'").
N
The hop detected an unprotected or "irregularly changed" SMTP envelope (compare the mutually exclusive "O" flag), but the message will be accepted, necessarily alongside the "Z" flag. Some non DKIM/ACDC aware hop changed the SMTP envelope. If there is a DKIX-AC: header field, the access control check MUST instead fail if (at least the domain of) the MAIL FROM is unchanged. The "E" flag has to be set.
Informative remark: Except for messages with "sequence" 1 the "N" state is usually mitigated , causing the "O" flag condition.
O
This hop claims the message origin. This either means that the message originated at this hop, in which case the signature (usually, DKIM-typical) refers to the first address of the From: header field, and the "sequence" is 1. Or it means the current hop was the, quoting , "final delivery for the [original] message", that the message got a "new envelope return address", that is, the MAIL FROM of the SMTP envelope was changed. In this case the "E" flag has to be set.
P
Postmaster mode. With this flag set the behavior of DKIM/ACDC borders test mode in that rejections must not occur (due to DKIM/ACDC). This is to allow for a communication possibility window in a situation where messages would always be rejected, due to misconfigurations et cetera, and as such reflects SMTP section 4.5.1 Minimum Implementation. If the "sequence" is 1, message recipients have to be inspected. If the IMF header fields To: and Cc: only contain a single addressee with the local-part postmaster, and if the same "postmaster" is addressed as the only SMTP RCPT TO recipient, then the "P" flag has to be set. Once set, all future DKIM/ACDC signatures must copy it. It MUST, however, be removed when in conjunction with the "E" flag the according SMTP envelope conditions are no longer satisfied.
R
Reputation check to collect organizational trust (, section 2.5) along the signature chain was performed. On top of the "V" (and possibly "A") flag(s) this means that all differential changes have been applied, and all signatures (at least one per "sequence") along the chain have been verified, and the entire chain validated correctly. Only in signatures with a "sequence" greater than 1, and without the "z" flag.
Informative remark: The presence of "R" reveals local state publicly; however, in a chain of trust this even seems desirable. The use of organizational trust may for example mean to perform full reputation checks more and more sparingly, the higher the trust, falling back to only random checks. (For a more complete example, see , section 2.5.)
S and s
Only in conjunction with the "I" flag: upon ingress the SPF state was, or was not, respectively, successfully verified.
Informative remark: From DKIM/ACDC's point of view SPF is legacy, and it actively mitigates it to transpose trust to DKIX-AC: (also see the "N" flag). With DKIM/ACDC SPF users can announce the strict -all mode that allows SPF verifiers to apply policy.
T
This hop requires complete trust to be put into its signature. In general all DKIX-DC: header fields were removed; applying changes for verification of older signatures is therefore impossible. Corresponding additional flags have to be set, like "Z". Please read about the "t" flag.
t
This is like "T", except that DKIX-DC: header fields are not (yet) removed, so that older signatures (to the extent indicated by the usual DKIM/ACDC flag machinery) can be verified.
Informative remark: the "T" and "t" flags are meant to adapt to operational reality. There "trusted proof points" are hired to handle email, to apply all the necessary checks for, and removal of, spam, malicious, dangerous, or otherwise undesired message (MIME part) content, before passing the results on to their real recipients. As of today only the "equivalent of T" is a known mode of operation; DKIM/ACDC, however, allows for a new business model via "t": the "trusted proof point" readily prepares messages just like today, but also creates and includes a DKIX-DC: header field to undo these modifications, as well as keeping all older DKIX-DC: header fields intact. (Read: simply through the normal DKIM/ACDC mode of operation, except for setting the "t" flag in addition.) Turning a "t" message into a "T" message practically means nothing but removing the DKIX-DC: header fields: an operation that can quickly and safely be performed by the simplest command line utilities or scripting languages, thanks to the plain-text nature of SMTP and of IMF messages. DKIM/ACDC aware software MAY also offer a mode which removes DKIX-DC: header fields after the signature verification step (and creation of an "I" signature) for messages tagged "t" coming from a configurable "trusted proof point".
Informative remark: because DKIX-DC: header fields are covered by the "dch=" hash, removing them still allows for successful signature verification, simply by trusting the original "dch=" checksum. DKIM/ACDC's "t" flag allows customers to perform a complete "R" reputation check on data delivered by "trusted proof points". (Message access software, yet to be written or extended, could also be allowed to access more portions via DKIX-DC:.) It is only their users verifying "I" ingress signatures who have no option but to put trust into "dch=" hashes.
V
DKIM/ACDC signature verified successfully. The signature with the highest "sequence" has been verified correctly, the (otherwise untested) DKIM/ACDC signature chain is complete, and their flags make sense (in the sequence). In conjunction with the flag "R" even deeper inspection was performed. If multiple signatures with the same highest "sequence" exist, the verifier behavior is unspecified in that "V" signals success: at least one signature was checked, and all tested signatures verified successfully. If however all signatures were verified, the "A" flag SHOULD be set; in single-signature cases the "A" flag MAY be omitted. Only in signatures with a "sequence" greater than 1.
v
DKIM signature verified successfully. In signatures with "sequence" 1, then missing the "O", but with the "N" flag, it means the message originated at a non DKIM/ACDC aware hop, and normal DKIM processing was performed and succeeded. If the signature covering "RFC5322.From" verified the "Z" flag must be set, otherwise "z". In messages with a higher "sequence" it comes alongside the "X" flag: necessarily the DKIM/ACDC chain was broken, and the message changed, by an intermediate non DKIM/ACDC aware hop. The "z" flag must be set.
X
DKIM/ACDC verification failed. Also see "v" and "x" flags. The "z" flag must be set.
x
"Plain old DKIM verification" failed, or there was no (more) signature to verify. In signatures with "sequence" 1, then missing the "O", but with the "N" flag, it means the message originated at a non DKIM/ACDC aware hop, and normal DKIM processing was performed and failed. The "z" flag must be set. Otherwise, with an existing DKIM/ACDC chain, it comes alongside the "X" flag: necessarily the chain was broken, and the message changed, by an intermediate non DKIM/ACDC aware hop. The "z" flag must be set.
Y
The message has seen IMF modifications: somewhere along the chain the message data was modified. Once set, all future DKIM/ACDC signatures must copy it.
y
The message has seen SMTP envelope modifications: somewhere along the chain the envelope was modified. Once set, all future DKIM/ACDC signatures must copy it.
Z
Announces the DKIM/ACDC chain is incomplete. The message was processed by DKIM/ACDC unaware hops. However, the message verifies correctly and seems to have never been modified non-reversibly. Once set, all future DKIM/ACDC signatures must copy it, unless later downgraded to the "z" flag.
z
The message has seen non-reversible modifications, and cannot be cryptographically verified back to its origin. Once set, all future DKIM/ACDC signatures must copy it. When a message newly enters, or "reenters", the "z" state, all existing DKIX-DC: header fields MUST be removed.
Informative remark: Often "z" signals a condition that MUST cause message rejection, for example in conjunction with the "x" flag. Local policy MAY behave differently for certain conditions, but SHOULD NOT, as the flag combination may reduce the hop's organizational trust (, section 2.5).
"id"
The optional "message and bounce identifier" offers enough room for Universally Unique IDentifiers.
Informative remark: It MAY be generated to help sending domains to uniquely identify messages within the "t=" and "x=" time delta, as well as to ensure that successively sent identical messages are not detected as being the same. It MUST be generated if the signature does not cover a Message-ID: header field, and it SHOULD be used if the uniqueness of the "msg-id" is dubious.
Informative remark: Receiving domains SHOULD NOT use this identifier due to the denial of service attack surface, regardless of collected organizational trust.

Unknown flags MUST be ignored. Invalid flag combinations and flag misuse, as far as detectable, and false "hfdb-bits" specifications MUST result in rejection with SMTP reply code 550; if enhanced status codes are used, 5.5.4 MUST be used.
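Some of the "acdc=" rules above lend themselves to mechanical checking. The sketch below splits an "acdc=" value into its colon separated subtags, validates the "sequence" range, and tests a subset of the flag relations that the flag descriptions state explicitly. It rests on assumptions: the subtag layout "sequence:hfdb-bits:flags[:id]" is inferred from the prose, the function and table names are hypothetical, and only unambiguous relations are covered (for example, "N" is only checked for "E", not for "Z" versus "z").

```python
import re
from typing import Dict, List, Set

# flag -> flag that has to be set alongside it (subset of the descriptions)
REQUIRES = {"A": "V", "D": "Y", "E": "y", "N": "E",
            "X": "z", "x": "z", "S": "I", "s": "I"}


def parse_acdc(value: str) -> Dict[str, object]:
    """Split an "acdc=" value; the layout used here is an assumption."""
    parts = value.split(":")
    if len(parts) not in (3, 4):
        raise ValueError("unexpected number of subtags")
    seq, bits, flags = parts[0], parts[1], parts[2]
    if not re.fullmatch(r"[1-9][0-9]{0,2}", seq):
        raise ValueError("sequence must be within 1..999")
    if not re.fullmatch(r"[0-9A-Za-z]+", bits):
        raise ValueError("hfdb-bits must at least be 0, in base 36")
    return {"sequence": int(seq), "hfdb_bits": bits, "flags": set(flags),
            "id": parts[3] if len(parts) == 4 else None}


def flag_errors(acdc: Dict[str, object]) -> List[str]:
    """Return detected flag misuse; unknown flags are simply ignored."""
    flags, seq = acdc["flags"], acdc["sequence"]
    errors = [f"{f} requires {need}" for f, need in REQUIRES.items()
              if f in flags and need not in flags]
    if {"O", "N"} <= flags:
        errors.append("O and N are mutually exclusive")
    if seq == 1 and flags & {"V", "R"}:
        errors.append("V and R need a sequence greater than 1")
    if {"R", "z"} <= flags:
        errors.append("R is not allowed alongside z")
    return errors


def next_sequence(seen: Set[int]) -> int:
    """Sequence for a new signature; holes and overflow mean rejection."""
    if not seen:
        return 1
    top = max(seen)
    if seen != set(range(1, top + 1)):
        raise ValueError("sequence hole")
    if top >= 999:
        raise ValueError("sequence overflow")
    return top + 1
```

A verifier built along these lines would map a non-empty error list, or a raised `ValueError`, to the 550 rejection described above.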

The DKIX-Store header field

The DKIX-Store: header field has no meaning in the email system. The sole purpose of mentioning it is to announce that it MUST be removed when messages enter and leave the email system. It could for example be temporarily created and used by non-integrated mail solutions that consist of otherwise unrelated software, to pass informational data in between the "ingress" and the "egress" processing side. To address possible software bugs and configuration errors this specification enforces removal of all occurrences.

Informative remark: In order to achieve locality it is suggested to "privately encrypt" data passed around in this temporary header field.

Access Control

SMTP delivers messages to individual domains. With DKIM/ACDC, whenever an SMTP envelope is created or changed, all distinct domain names found within the list of intended SMTP envelope RCPT TO addressees are collected, because messages need to be actively forged on a per-domain basis: DKIM/ACDC will create and include DKIX-AC: header fields covering SMTP envelopes as messages are sent to individual domains. The domains' _dkimacdc DNS entries, as below, are queried. Depending upon the detected state, the DKIX-AC: header fields will either contain exact envelope info (DKIM/ACDC supported), or only domain names. In any case the completely prepared message, including the readily prepared signatures, is forged, (a) DKIX-AC: header field(s) is/are generated which cover(s) the logical recipient subset, and the resulting message is then sent.

Informative remark: MTA-integrated DKIM/ACDC implementations can create perfect fit DKIX-AC: header fields only for recipients truly accepted by the receiving MTA (not hindering even SMTP pipelining), and use successive transmissions until all recipients have been processed.

DKIM/ACDC aware recipient domains are expected to manage a DKIX-AC: identity cache to mitigate replay attacks. (Hint: a verified DKIX-AC: signature seems like a natural cache key source, see below.)

Informative remark: The now mandatory and constrained "x=" tag allows for finite identity cache sizes.

Informative remark: Perfect fit DKIX-AC: header fields can create write-once cache entries.
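The identity cache mentioned above can be very small. The sketch below keys entries by the verified DKIX-AC: signature value, as hinted, and uses the "x=" expiration to bound the cache size. The class and method names are hypothetical, and an in-memory dictionary stands in for whatever storage a real deployment would use.

```python
from typing import Dict


class IdentityCache:
    """Replay detection keyed by a verified DKIX-AC: signature (sketch)."""

    def __init__(self) -> None:
        self._expiry: Dict[str, int] = {}  # signature -> "x=" timestamp

    def seen_before(self, signature: str, x: int, now: int) -> bool:
        """Record the signature; return True if it was already known."""
        # Entries past their "x=" can no longer verify, so drop them.
        self._expiry = {sig: exp for sig, exp in self._expiry.items()
                        if exp > now}
        if signature in self._expiry:
            return True
        self._expiry[signature] = x  # write-once entry
        return False
```

A second delivery attempt that presents the same DKIX-AC: signature within its lifetime makes `seen_before` return true, which is the replay condition the cache is meant to detect.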

A DKIM/ACDC aware hop that receives a message that contains at least one DKIM/ACDC enabled signature, and that does not contain a DKIX-AC: header field, MUST reject it with SMTP reply code 550; if enhanced status codes are used, 5.5.4 MUST be used. It MUST reject messages which fail the signature check of a DKIX-AC: or signature header field, or the condition and flag check verification, with SMTP reply code 550; the enhanced status code MUST be 5.7.7 ("message integrity failure"). It MUST likewise fail if the DKIX-AC: header field does not correspond to the SMTP envelope data, with exceptions as documented for the "N" flag of the "acdc=" tag of DKIX-Sig: signatures. It MUST test for a superset of recipients, and only fail if an envelope recipient is not included in the DKIX-AC: header field. DKIX-AC: header fields with an "ec=" are treated specially. Senders MAY use Delivery Status Notifications to fine-tune the resulting behavior.
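The superset test on recipients described above can be sketched as follows. The sketch assumes that the addresses covered by a verified DKIX-AC: header field have already been extracted into plain "local-part@domain" strings; the function names are hypothetical. Domains are compared case-insensitively and local-parts exactly. The exceptions for the "N" flag and the special handling of "ec=" are not modelled.

```python
from typing import Iterable, Tuple


def split_addr(addr: str) -> Tuple[str, str]:
    """Split at the last "@" so that quoted local-parts stay intact."""
    local, _, domain = addr.rpartition("@")
    return local, domain.lower()


def envelope_matches(ac_from: str, ac_rcpts: Iterable[str],
                     mail_from: str, rcpt_tos: Iterable[str]) -> bool:
    """True if the SMTP envelope is covered by the DKIX-AC: data."""
    if split_addr(mail_from) != split_addr(ac_from):
        return False
    covered = {split_addr(a) for a in ac_rcpts}
    # DKIX-AC: may list more recipients than this transaction carries;
    # only an envelope recipient that is not listed causes a failure.
    return all(split_addr(r) in covered for r in rcpt_tos)
```

A false result corresponds to the rejection case in which the DKIX-AC: header field does not correspond to the SMTP envelope data.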

The DKIX-AC header field

The syntax of this header field is the usual semicolon separated list of DKIM-style tags of unspecified order; unknown tags MUST be ignored. It is used to cryptographically link the SMTP envelope to the sent IMF mail message. The "w=" tag is the linked DKIX-Sig: "sequence", best placed early. Multiple signatures with the same "sequence", but different algorithms, may exist, and so may DKIX-AC: header fields. The selector of the linked signature is given by the "s=" tag, the used algorithm can be deduced from there. The "o=" tag is the domain of the SMTP MAIL FROM, the "f=" tag denotes the "local-part".

Informative remark: In conjunction with the "acdc=" "N" flag these do not correspond to the "local email system".

The "d=" tag value is the recipient domain, with one to multiple "t=" tag(s) for the "local-part"s of RCPT TOs.

Warning: The "d=" tag may have an empty value alongside "P"ostmaster mode!

Warning: SMTP address "local-part"s permit "quoted-string"s!

In case the recipient domain for a particular message forge has not announced support for DKIM/ACDC, and to strengthen SMTP envelope anonymity in permanent IMF message data, the tag "f=", as well as any "t=" tag MUST be omitted, and instead a "privately encrypted" "ec=" tag be placed: the content of this tag is BASE64 encoded, and MUST correlate to the hidden "f=" and "t=" tags.

Informative remark: The SMTP envelope domains are cryptographically fixated even in the minimal non-DKIM/ACDC variant of DKIX-AC:, protecting users of DKIM/ACDC aware hops against replay. The security enhancement was considered worth the resulting unfortunate leakage of this minimal DKIX-AC: header field to permanent storage.

Mirroring DKIM-Signature: the tag list is concluded with the "b=" tag that is the cryptographic signature data of the DKIX-AC: header field. To ensure proper linkage and uniqueness of the "b=" signature the reassembled (see DKIM section 3.5) value of the linked DKIX-Sig: signature is "temporarily assigned" to "b=" before creating it; Thereafter the "b=" tag is assigned its own value. All instances of DKIX-AC: header fields MUST be removed by DKIM/ACDC aware software as soon as possible: they MUST NOT be delivered by local delivery agents as part of the message. They MUST, however, exist in rejected messages. However, if a domain is only an intermediate, which was neither directly addressed nor which originated the mail, and which does not modify the SMTP envelope either, then it MUST NOT remove the "current" DKIX-AC: header field(s), and it MUST NOT generate (a) new one(s).

The _dkimacdc.DOMAIN DNS TXT RR The syntax of this DNS resource record is the usual semicolon separated list of DKIM-style tags of unspecified order; unknown tags MUST be ignored. DNS CNAME chains MUST be followed when looking up this DNS RR. The optional tag "c=" (%x63 "=") MUST have the value "n" (%x6E). It announces that the DKIM-Signature: legacy header field need not be generated for messages sent to this host. Senders MAY follow this advice if they can ensure that the message will not pass intermediate hops. The optional tag "a=" (%x61 "=") represents a colon-separated list of supported algorithm names, interpreted case-insensitively. Unknown list entries MUST be ignored. The entries "ed25519-sha256" and "eda-sha256" are implied.
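As an illustration of how the "a=" tag value might be evaluated, the following sketch tests whether an algorithm is announced. The function names are hypothetical, the tag value is assumed to have been extracted from the record already, and folding whitespace around list entries is assumed to have been removed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first alen bytes of a with string b. */
static int algo_ieq(const char *a, size_t alen, const char *b){
    size_t i;

    if(strlen(b) != alen)
        return 0;
    for(i = 0; i < alen; ++i)
        if(tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* list: value of the "a=" tag, or NULL if the tag is absent.
 * Returns 1 if algo is implied or listed, 0 otherwise. */
static int acdc_algo_announced(const char *list, const char *algo){
    size_t alen;

    assert(algo != NULL);
    alen = strlen(algo);

    /* These two entries are implied, whatever the record says */
    if(algo_ieq(algo, alen, "ed25519-sha256") ||
            algo_ieq(algo, alen, "eda-sha256"))
        return 1;

    while(list != NULL && *list != '\0'){
        size_t n = strcspn(list, ":");

        if(algo_ieq(list, n, algo))
            return 1;
        list += n; /* entries that do not match are simply skipped */
        if(*list == ':')
            ++list;
    }
    return 0;
}
```

Unknown list entries need no special handling here: they never match, and are therefore ignored as the text requires.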

Differential Changes Whenever a DKIM/ACDC enabled domain detects during signature creation that the canonicalized representation of a message, whether header fields and/or body data, was modified, a new DKIX-DC: header field has to be created.

The DKIX-DC header field The syntax of this header field is the usual semicolon separated list of DKIM-style tags of unspecified order; unknown tags MUST be ignored. The "w=" tag is the linked DKIX-Sig: ACDC "sequence", best placed early. The "h=" tag is used to store differential data for header fields, "b=" that for body content. Both tags are optional, but at least one MUST exist in a valid DKIX-DC: header field, and a given one MUST NOT have an empty value. The differential data is stored in the patch format as below, which is first compressed with ZLIB, and then BASE64 encoded.

Preparing patch creation The differential changes are created with canonicalized header fields and body data, respectively, as seen on egress, alongside the equally canonicalized data present before modifications took place, that is, on ingress. All header fields covered by the header field database MUST be included. All header fields covered by former signatures of the DKIM/ACDC chain MUST be included. DKIM/ACDC enabled signatures, and any other "DKIX-" header field MUST NOT be included. The header fields MUST be sorted byte-wise (by numeric ASCII byte value) by-name, the formed subgroups MUST remain in the (reverse) header stack order defined by DKIM section 5.4.2, "Signatures Involving Multiple Instances of a Field". Differential changes are then expressed with the patch content as below.
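The ordering requirement can be illustrated with a stable sort. This is a minimal sketch, assuming that header field names have already been canonicalized (so that a byte-wise comparison is meaningful) and that the input array is already in the header stack order of DKIM section 5.4.2; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct hf{
    const char *name;  /* canonicalized header field name */
    const char *value; /* canonicalized header field value */
};

/* Stable insertion sort by name. strcmp() compares as unsigned bytes, which
 * matches "numeric ASCII byte value". Fields of equal name keep their
 * relative input order, so each subgroup remains in stack order. */
static void hf_sort(struct hf *v, size_t n){
    size_t i, j;

    assert(v != NULL || n == 0);

    for(i = 1; i < n; ++i){
        struct hf t = v[i];

        for(j = i; j > 0 && strcmp(v[j - 1].name, t.name) > 0; --j)
            v[j] = v[j - 1];
        v[j] = t;
    }
}
```

qsort() is deliberately avoided in this sketch, because ISO C does not guarantee that it is stable.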

Patch content The patch format is an adapted variant of the BSDiff algorithm patch format, as below. Overall it consists of a header, followed by control data. Thereafter the two byte (8-bit octet) streams of differential data (in reverse order) and extra data conclude the patch. Erroneous patch data MUST cause rejection. The header and the control data consist of 32-bit signed integers, stored in network byte order (MSF; most significant byte first). The header consists of four values denoting the length of the control data tuple block in bytes, the length of the differential data block in bytes, the length of the extra data block in bytes, concluded by the length of the original "data target" in bytes; The sum of the first three values must be at most one less than the maximum positive 32-bit signed integer. The number of control data tuples MUST NOT exceed the length of the original "data source" (in bytes) plus one. The control data is a stream of tuples of three values each. The first denotes the length, in bytes, of differential data to join in, or 0. Differential data is joined in by adding "the current" differential byte to "the current" byte of the "data source", then storing the addition result byte in the "data target". The read positions within the differential data and the "data source" move forward accordingly. The write position in the "data target" moves forward accordingly. The second value denotes the length, in bytes, of extra data to copy, or 0. The read position within the extra data moves forward accordingly. The write position in the "data target" moves forward accordingly. All tuples are worked in order, and within each tuple first the differential data, if any, then the extra data, if any, is worked. The last value of control data tuples denotes the number of bytes to seek relatively in the "data source" before the next tuple is worked. Of all the values, only this one may be negative. The overall offset within the "data source" MUST NOT become negative after the seek.

Informative remark: Of all control data tuples, only the first may perform nothing but a seek adjustment, without also storing data in the "data target". Even if it is otherwise ignored, the seek value of the last control data tuple must result in a valid offset. (0 is always valid.) Other conditions MUST be treated as errors.
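Applying a patch of this format could look roughly as follows. This is a hedged sketch and not the reference implementation: it operates on the already BASE64 decoded and decompressed patch, all names are hypothetical, and it assumes that "reverse order" means the differential block is consumed from its last byte towards its first, and that a valid seek offset lies between 0 and the "data source" length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one signed 32-bit integer in network byte order. */
static int64_t be32(const unsigned char *p){
    int64_t v = ((int64_t)p[0] << 24) | ((int64_t)p[1] << 16) |
            ((int64_t)p[2] << 8) | (int64_t)p[3];

    return (v > INT32_MAX) ? v - 4294967296LL : v;
}

/* Returns the number of bytes written to dst, or -1 on erroneous data. */
static long patch_apply(const unsigned char *patch, size_t plen,
        const unsigned char *src, size_t slen,
        unsigned char *dst, size_t dcap){
    int64_t clen, dlen, elen, tlen, i, j;
    int64_t spos = 0, dpos = 0, dused = 0, eused = 0;
    const unsigned char *ctl, *diff, *extra;

    assert(patch != NULL && dst != NULL);
    if(plen < 16)
        return -1;
    clen = be32(patch), dlen = be32(patch + 4);
    elen = be32(patch + 8), tlen = be32(patch + 12);
    if(clen < 0 || dlen < 0 || elen < 0 || tlen < 0 || clen % 12 != 0)
        return -1;
    if(clen + dlen + elen > (int64_t)INT32_MAX - 1)
        return -1;
    if((int64_t)plen - 16 != clen + dlen + elen)
        return -1;
    if(clen / 12 > (int64_t)slen + 1 || (uint64_t)tlen > dcap)
        return -1;
    ctl = patch + 16, diff = ctl + clen, extra = diff + dlen;

    for(i = 0; i < clen; i += 12){
        int64_t nd = be32(ctl + i), ne = be32(ctl + i + 4);
        int64_t seek = be32(ctl + i + 8);

        if(nd < 0 || ne < 0 || (i > 0 && nd == 0 && ne == 0))
            return -1; /* only the first tuple may be seek-only */
        if(nd > dlen - dused || nd > (int64_t)slen - spos || nd > tlen - dpos)
            return -1;
        for(j = 0; j < nd; ++j, ++dused) /* differences: last byte first */
            dst[dpos++] = (unsigned char)(src[spos++] + diff[dlen - 1 - dused]);
        if(ne > elen - eused || ne > tlen - dpos)
            return -1;
        memcpy(dst + dpos, extra + eused, (size_t)ne);
        dpos += ne, eused += ne;
        spos += seek;
        if(spos < 0 || spos > (int64_t)slen)
            return -1;
    }
    return (dpos == tlen) ? (long)dpos : -1;
}
```

For instance, a 31 byte patch with one control tuple (3, 0, 0) and the stored differential bytes 01 00 00 turns the "data source" "abc" into the "data target" "abd".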

BSDiff patch content adaption

32-bit integers are used for length and offset values. This almost halves memory usage, and produces smaller patch control data. It is deemed sufficient for email purposes.
Data is stored in big endian (network; MSF; most significant byte first) instead of little endian (LSF; least significant byte first) byte order.
In order to allow for memory usage reduction during patch generation the adaption uses a shared memory region for differential and extra data: the former is therefore stored in reversed order, top down. (Reduces memory usage by the size of the target data set.)
The entire, readily prepared patch is passed through a compressor. (The original uses three separate bzip2 streams to sequentially serialize control, differential and extra data separately.)
The original header did not contain the size of the extra data, which was stored last, with its size implicitly extending to the end of the patch. The adaption includes the extra data size in the header, allowing more verification tests to be applied with only the header being readily parsed. This also enables the I/O layer to allocate perfectly sized memory with only the header data being available.

An example algorithm A very fast and simple algorithm that processes data linewise is presented here.

Informative remark: Like SMTP, DKIM does not know about MIME; it treats the body (and the header fields, in that respect) as CRLF terminated lines of bytes, with certain byte-based ("relaxed") normalizations applied on top. MIME reencoding (may) happen along the message path: the MIME content-transfer-encoding type, its line length(s), as well as character sets, and more, can theoretically be in flux. With the advent of tracking of differential changes it is expected that software becomes smarter, by adapting more to what exists in messages, instead of performing "brute force modifications". For example, a service provider that rewrites URIs within messages can ensure that the line lengths of a base64 formatted input are preserved after the rewrite, as base64, and with the underlying character set being unchanged, and that, where modifications took place, the lengths of only the modified lines are adjusted, so as to keep the differential changes as minimal as possible. In such a world a very fast linewise algorithm is sufficient.

Initialize a "list of lines" that can be iterated over uni-directionally to 0. While there is still input data from the target (egress) data set:

Search for LF line feed. If found, advance over it, otherwise use the entire remaining input.
Verify the resulting data length fits into 31-bit, minus 1.
Create a hash of the data that is proof against complexity attacks.
Put the data at the end of the "list of lines".

Initialize a "hashmap" to 0. If any, store all members of the "list of lines" therein, indexed by their hashes. Initialize an "absolute position", a "current length of differences", and a "current length of extra data" to 0. Initialize a "former line" to 0. Also initialize an "overall length of differences", and an "overall length of extra data" to 0. While there is still input data from the source (ingress) data set:

Search for LF line feed. If found, advance over it, otherwise use the entire remaining input.
Verify the resulting data length fits into 31-bit, minus 1.
If the "hashmap" is not 0:
- Create a hash of the data (with the same algorithm as above).
- If the "former line" is not 0, check whether its next line (in the "list of lines"), if any, matches the current data. If so, make the next line the "former line", verify the resulting "overall length of differences" will fit in 31-bit, minus 1; Add that many bytes to the "absolute position", add that many bytes to the "current length of differences", add that many zero bytes to "differences".
- Otherwise search "hashmap" for the hash (ensure an exact data match, as necessary). If a match is found, then if either the "current length of differences" is not 0, or the "current length of extra data" is not 0, or if no control tuple has yet been dumped, then dump a control tuple first, and set the "current length of extra data" to 0; Make it the "former line", verify the resulting "overall length of differences" will fit in 31-bit, minus 1; If the offset of the start of the "former line" (within the target (egress) data set) does not equal the "absolute position", ensure the relative seek of the formerly dumped control tuple is set to the subtraction of the start of the "former line" and the "absolute position"; Set the "current length of differences" to that many bytes, set the "absolute position" to the start of the "former line" plus that many bytes, add that many zero bytes to "differences". (Optimization: dumping an otherwise "empty" initial control tuple is not necessary if no relative seek is to be applied.)
Otherwise the line is extra data. Verify the resulting "overall length of extra data" will fit in 31-bit, minus 1; Add that many bytes to the "current length of extra data", copy that many bytes to "extra data".

After all the data has been worked, then if either the "current length of differences" is not 0, or the "current length of extra data" is not 0, dump a control tuple. Ensure the relative seek of the last control tuple, if any, is 0. Here are exemplary results of the FOSS plug-and-play ISO C99 and perl reference implementation that generates the above patch format with either (the string suffix sort based) BSDiff algorithm of Colin Percival, or the above textual variant:

Rationale Differences are included to allow DKIM verifiers to restore previous message content for the purpose of cryptographically verifying elder signatures. This for example allows for collecting trustworthy statistics of organizational trust (, section 2.5) in an automated fashion. Alternatively or in addition per-user decisions for certain message paths, involving certain modification per path hop, are made possible, and can be taken into account.

For example, user interfaces could use traffic light semantics that unfold on click to traffic light semantics of all message versions, which would (with precautions) visualize differences. This can empower users to make decisions on the trustworthiness of intermediates, and, for example, request display of the From: header field as created by the original message sender for a message path that crosses a mailing-list. (As in, "yes, i accept this hop changes From:, it is a mailing list".)

Informative remark: The data exists in the DKIM "relaxed" normalized variant: former states are not meant to be usable messages. This is deemed acceptable because of the purpose of including differential changes, and because their visualization of a successfully verified DKIM/ACDC chain should still be sufficient to allow users, and automated systems, to make responsible decisions.

Mitigations for Future At the time of this writing the email infrastructure is deeply penetrated by mitigation code that circumvents problems incurred by standards like DMARC and SPF, driven by the desire to keep existing infrastructures (configurations) in a usable state. For example, SPF will not survive a single hop, which means that alias expansion, a widely used core feature of the email infrastructure, no longer works easily. The IETF has no solution for this problem, but the software world has created a "Sender Rewriting Scheme", involving dedicated software to implement mitigations, so that aliases can be used regardless. As another example, DMARC causes a lot of mailing-lists to apply mitigations of various form and style: old signatures are removed, or renamed, often the From: header field is rewritten in a "User A via List B" style, and the Reply-To: header field will announce the real sender, unless that was already set. The introduction of this requirement of blind trust into "A via B" displays seems like a devastating psychological failure.

Mitigations This memo suggests to apply mitigations actively as part of DKIM processing, at minimum temporarily, until, at some future time, the email infrastructure has adapted to a new reality. Future engineers can then decide how to proceed further. In any case it seems wise to move decisions on actual content changes away from the SMTP layer, to reduce failures to cryptographical signature failures, and let users and/or algorithms on a higher layer decide whether a certain content change or applied mitigation is "acceptable", or not.

Remove existing DKIM/ACDC announcing DKIM-Signature: header fields. In case no mitigations have yet been applied to "RFC5322.From", and no such mitigation will be applied, as below, the signature linked to "sequence" 1 is an exception. This mitigation MUST be applied. The mitigation MAY be applied to non DKIM/ACDC linked DKIM-Signatures: as well, insofar as local policy allows. Before the flag day DKIM/ACDC will create a single DKIM-Signature: that will verify correctly.
Mitigate non-local MAIL FROM envelopes. Because a possible SPF check will fail on the next hop (in situations where a strict SPF policy is applied), if a message that does not originate locally leaves the email system on egress, with an SMTP envelope MAIL FROM of a foreign domain, mitigate such addresses, so that the current hop becomes the, quoting , "final delivery for the [original] message". DKIM/ACDC software SHOULD offer options to exclude certain domains from these mitigations. To mitigate, synthesize for example an address of the local domain with a "local-part" starting with DKIX=, followed by at least 16 bytes of the BASE64 encoded HMAC of a dedicated cryptographic private key and the original MAIL FROM. Alternatively, using a dedicated subdomain is an approach that avoids any possible "local-part" ambiguities. Then for example the IETF mailing-list From: header field DMARC mitigation approach could be used, which decomposes the original MAIL FROM by replacing the commercial at (U+0040, @) with its "hexadecimal value in quoted-printable notation" to end with "local-part=40domain", followed by the domains of the mitigating host: local-part=40domain@subdomain.domain.
Informative remark: the SMTP size limit of "local-part" is 64 octets, however the overall "reverse-path" limit of RFC 821 and RFC 2821 was 256 octets.
The synthesized address MUST be linkable to the original MAIL FROM for at least 864000 seconds (ten days: to reach into the next working week). It SHOULD be linkable only by Delivery Status Notifications or (other) message bounces. If the bounce transports enough message data content this MAY be further constrained to verifiable DKIM signatures of the local domain, even the exact message for which the address was synthesized. The optional bounce identifier "id" may be usable for this purpose.
Informative remark: Except for linking purposes to the original envelope the synthesized address is otherwise "transparent", and should appear as if it does not exist: DKIM/ACDC software is expected to cause appropriate rejection on SMTP level at the earliest possible time.
Mitigate From: header fields, if necessary. When a message was changed between ingress and egress, so that the DKIM signature (not only: related to the From: header field) will no longer verify, then, if the From: header field was not already locally mitigated (by for example mailing-list software), actively mitigate the From: header field, so that the current hop becomes the, quoting , "final delivery for the [original] message" in respect to the IMF message that is visible to recipients. DKIM/ACDC software SHOULD offer options to exclude certain domains from these mitigations. To mitigate, place the original "name-addr" in the Reply-To: header field, unless that already exists, set the Author: header field if it can be ensured that From: was, actually, the author, and replace From: with synthesized content. The examples of non-local MAIL FROM envelope mitigation apply also here in respect to "addr-spec"; yet, the dedicated subdomain approach results in visually more appealing header field content. For the "display-name" a "From: X via <Y>" notation MAY be used, where "X" denotes the original "display-name". For example, if the original content was "Forename Surname <for.sur@example1.net>" then the mitigation could be "Forename Surname via" <for.sur=40example1.net@dkim.example2.net>. Without dedicated subdomains a variant of the widely known construct "Forename Surname <for(DOT)sur(AT)example1(DOT)net>" via <dkix-dedicated@example2.net> may be used. Whether DKIM/ACDC dedicated subdomain or DKIM/ACDC dedicated address, the backing software implementation is expected to rewrite the address
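The dedicated subdomain variant of the address mitigations above can be sketched as a small string transformation. The function name is hypothetical, and the sketch makes two assumptions: only the commercial at is rewritten to "=40" (as in the examples of this memo), and the synthesized "local-part" is rejected when it exceeds the SMTP limit of 64 octets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes "local-part=40domain@host" for orig "local-part@domain" into out.
 * Returns the resulting string length, or -1 if out is too small or if
 * the synthesized "local-part" would exceed 64 octets. */
static int acdc_mitigate_addr(const char *orig, const char *host,
        char *out, size_t outlen){
    size_t o = 0, hlen;

    assert(orig != NULL && host != NULL && out != NULL);

    for(; *orig != '\0'; ++orig){
        if(*orig == '@'){
            if(o + 3 >= outlen) /* keep room for the terminating NUL */
                return -1;
            memcpy(&out[o], "=40", 3);
            o += 3;
        }else{
            if(o + 1 >= outlen)
                return -1;
            out[o++] = *orig;
        }
    }
    if(o > 64) /* SMTP "local-part" size limit */
        return -1;

    hlen = strlen(host);
    if(o + 1 + hlen >= outlen)
        return -1;
    out[o++] = '@';
    memcpy(&out[o], host, hlen);
    o += hlen;
    out[o] = '\0';
    return (int)o;
}
```

With the addresses used in this section, "for.sur@example1.net" and the mitigating host "dkim.example2.net" yield "for.sur=40example1.net@dkim.example2.net".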

Example An example that shows the flow of a single message with multiple different recipients, including mailing-lists and aliases. It assumes all recipients announced DKIM/ACDC support. It provides full mitigations and support for SPF and DMARC.

   RCPT TO:
   RCPT TO:
   ...
   DKIX-AC: w=1; s=K1ed; o=b.c; f=a; d=f.g; t=d; t=e; b=..
   DKIM-Signature: w=1; s=K1; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   To: d@f.g, e@f.g, x@y.z, u@v.w, r@s.t, o@p.q
   ...

   f.g, local delivery (to d@ and e@):
   ...
   DKIX-Sig: acdc=2:N0C:AIV; s=K2ed; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   ...

   x@y.z -- a mailing-list! It redistributes after RFC 2369 and
   RFC 2919 additions, in-message unsubscribe footer, and From:
   mitigated (in best RFC 3461 manner):
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=y.z; f=x; d=m.n; t=l; b=..
   DKIX-DC: w=2; h=BASE64; b=BASE64
   DKIM-Signature: w=2; s=K2; ..
   DKIX-Sig: acdc=2:V0C0U3:ADEOVYy; s=K2ed; dch=..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a(AT)b(DOT)c via
   Reply-To: a@b.c
   ...
   List-Unsubscribe: bla

   u@v.w -- an expanded alias! The hop honours RFC 3461, and
   changes MAIL FROM; it keeps DKIM-Signature: w=1 for DMARC
   compatibility:
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=v.w; f=u; d=realv.realw; t=realu; b=..
   DKIM-Signature: w=2; s=K2; ..
   DKIX-Sig: acdc=2:N0C:EOVy; s=K2ed; ..
   DKIM-Signature: w=1; s=K1; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   ...

   r@s.t -- an expanded alias! Note: invalid DKIM/ACDC, because
   no MAIL FROM update, will later fail SPF; it keeps
   DKIM-Signature: w=1 for DMARC compatibility:
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=b.c; f=a; d=reals.realt; t=realr; b=..
   DKIM-Signature: w=2; s=K2; ..
   DKIX-Sig: acdc=2:N0C:EVy; s=K2ed; ..
   DKIM-Signature: w=1; s=K1; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   ...

   ... the same, but DKIM/ACDC compliant:
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=s.t; f=DKIX=a=40b.c; ..
   DKIM-Signature: w=2; s=K2; ..
   DKIX-Sig: acdc=2:N0C:EOVy; s=K2ed; ..
   DKIM-Signature: w=1; s=K1; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   ...

   o@p.q -- a mailing-list! Note: invalid DKIM/ACDC, because no
   From: mitigation, c/would later fail DMARC; it redistributes
   after RFC 2369 and RFC 2919 additions, and in-message
   unsubscribe footer.
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=p.q; f=o; d=X.X; t=X; b=..
   DKIX-DC: w=2; h=BASE64; b=BASE64
   DKIM-Signature: w=2; s=K2; ..
   DKIX-Sig: acdc=2:V0C0U3:DEOVYy; s=K2ed; dch=..
   DKIX-Sig: acdc=1:V0C0U3:O; s=K1ed; ..
   From: a@b.c
   ...
   List-Unsubscribe: bla

   ... the same, but DKIM/ACDC compliant (using dedicated
   mitigation subdomain):
   MAIL FROM:
   RCPT TO:
   ...
   DKIX-AC: w=2; s=K2ed; o=p.q; f=o; d=X.X; t=X; b=..
   DKIX-DC: w=2; h=BASE64; b=BASE64
   DKIM-Signature: w=2; s=K2; d=dkim.p.q; ..
   DKIX-Sig: acdc=2:V0C0U3:DEOVYy; s=K2ed; d=dkim.p.q; dch=..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: "a@b.c" via
   Reply-To: a@b.c
   ...
   List-Unsubscribe: bla

   l@m.n (recipient of x.y.z mailing-list), local delivery:
   ...
   DKIX-Sig: acdc=3:V0C0U3:IVYy; s=K3ed; ..
   DKIX-DC: w=2; h=BASE64; b=BASE64
   DKIX-Sig: acdc=2:V0C0U3:ADEOVYy; s=K2ed; dch=..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a(AT)b(DOT)c via
   ...

   realr@reals.realt -- expanded alias target, local delivery:
   ...
   DKIX-Sig: acdc=3:N0C:IVy; s=K3ed; ..
   DKIX-Sig: acdc=2:N0C:EOVy; s=K2ed; ..
   DKIX-Sig: acdc=1:N0C:O; s=K1ed; ..
   From: a@b.c
   ...

IANA Considerations IANA is asked to add the header fields DKIX-Sig:, DKIX-AC:, DKIX-DC:, and DKIX-Store: to the "Permanent Message Header Field Names" registry. IANA is asked to add "eda" to the "DKIM Key Type" registry. IANA is asked to add the tag "w=" to the "DKIM-Signature Tag Specifications" registry.

Security Considerations Public key cryptography is the safest approach to identification of counterparts and verification of data. This specification enables DKIM to cryptographically verify SMTP envelopes, and to cryptographically verify all message transitions back to the original message sender.

Normative References Informative References BSDIPA, a mutation of BSDiff

Header field database The database of header fields, in an automatically extractable form. Lines starting with EQUALS SIGN U+003D form start and, with a following SOLIDUS U+002F, end tags. The tag "HFDB" encloses the entire database. The header fields are case-insensitive, followed by whitespace, followed by its assigned bit number. Lines starting with NUMBER SIGN U+0023 are comments.

Example code for managing "hfdb-bits" Example C code to interpret and create the DKIM/ACDC "hfdb-bits" tag.

   #include <string.h> /* memset() */
   #include <strings.h> /* ffs() (POSIX) */

   /* 0-based */
   #define HFDB_ENTRIES (42 +1)

   #define HFDB_STORE_BITS ((HFDB_ENTRIES + (8 - 1)) & ~(8 - 1))
   #define HFDB_STORE_SIZE (HFDB_STORE_BITS / 8)

   #define HFDB_STORE_C2OFF(C) ((hfdb_u8)(C) / 8)
   #define HFDB_STORE_C2BIT(C) ((hfdb_u8)(C) & (8 - 1))

   #define HFDB_STORE_SET(SP,BIT) \
      ((SP)->s_dat[HFDB_STORE_C2OFF(BIT)] |= (1u << HFDB_STORE_C2BIT(BIT)))
   #define HFDB_STORE_CLEAR(SP,BIT) \
      ((SP)->s_dat[HFDB_STORE_C2OFF(BIT)] &= ~(1u << HFDB_STORE_C2BIT(BIT)))
   #define HFDB_STORE_TEST(SP,BIT) \
      ((SP)->s_dat[HFDB_STORE_C2OFF(BIT)] & (1u << HFDB_STORE_C2BIT(BIT)))

   #define HFDB_STRING_SIZE (HFDB_STORE_BITS / 5)

   typedef unsigned char hfdb_u8;

   struct hfdb_store{
      hfdb_u8 s_max; /* Max bit set in .s_dat, +1 */
      hfdb_u8 s_dat[HFDB_STORE_SIZE];
   };

   /* Number of bits set/characters stored, -1 on error */
   static int hfdb_from_cp(struct hfdb_store *sp, char const *cp);
   static int hfdb_to_cp(struct hfdb_store *sp,
         char cp[HFDB_STRING_SIZE +1]);

   static int
   hfdb_from_cp(struct hfdb_store *sp, char const *cp){
      int rv, addbits;

      memset(sp, 0, sizeof(*sp));
      rv = 0;

      for(addbits = 0; *cp != '\0'; addbits += 5, ++cp){
         int bs;

         /* Go via unsigned char so that bytes >= 0x80 stay positive */
         bs = (int)(unsigned char)*cp;
         /* Support case-insensitivity */
         if(bs >= 'a')
            bs -= 'a' - 'A';
         /* 5 bits/char: 0b11111: 36#V: 0-9A-V */
         if(bs > 'V' || bs < '0')
            break;
         bs -= '0'; /* "atoi" */
         if(bs > 9){
            bs -= 7;
            /* ..but ASCII U+003A..U+0040 not allowed */
            if(bs <= 9)
               break;
         }

         /* Store each set bit of this character, lowest bit first */
         for(; bs != 0; ++rv){
            int b;

            b = ffs(bs);
            --b;
            bs ^= 1 << b;
            b += addbits;
            if(b >= HFDB_ENTRIES)
               goto jleave;
            HFDB_STORE_SET(sp, b);
            sp->s_max = ++b;
         }
      }

   jleave:
      return (*cp == '\0' ? rv : -1);
   }

   static int
   hfdb_to_cp(struct hfdb_store *sp, char cp[HFDB_STRING_SIZE +1]){
      int xmax, addbits;
      char *cursor, *lx;

      lx = cursor = cp;
      xmax = (sp->s_max <= HFDB_ENTRIES) ? sp->s_max : HFDB_ENTRIES;

      for(addbits = 0; addbits < xmax; addbits += 5){
         int c, b;

         b = 5;
         c = xmax - addbits;
         if(b > c)
            b = c;

         /* Collect up to 5 bits into one character value */
         for(c = 0; b > 0;){
            int bx;

            bx = addbits + --b;
            if(HFDB_STORE_TEST(sp, bx))
               c |= 1u << b;
         }

         *cursor++ = "0123456789ABCDEFGHIJKLMNOPQRSTUV"[c];
         /* Remember the last non-zero character: trailing 0 is cut */
         if(c != 0)
            lx = cursor;
      }

      *lx = '\0';
      return (int)(lx - cp);
   }

Further DKIM Updates

This specification obsoletes the simple canonicalization type; It MUST NOT be used by DKIM/ACDC compatible software. Rationale: in order to minimize the processing cost in time and space of differential processing, being able to work with only one data representation is beneficial. The "extremely crude ASCII Art attacks" mentioned in DKIM section 8.1 are considered to be a rather artificial attack vector. Furthermore, the statement of DKIM section 5.4.1, "Recommended Signature Content", that e-commerce sites (etc) will generally prefer "simple" canonicalization does not match reality in the author's experience. Almost exclusively, such sites rely on enriched format messages with dedicated whitespace rules, like HTML, or on even more refined (compressible, encryptable, optionally interactive, scripted) formats, like PDF. In addition, and foremost, choosing space-preserving MIME content transfer encodings is the natural choice, and the one used in practice, as appropriate. As a personal comment (because they are rarely used in practice), there exist possibilities to sign and/or encrypt actual message content to ensure its privacy on a per recipient base, like S/MIME and PGP, which use dedicated, replicable normalization algorithms to protect their content. In operational reality the "relaxed" normalization is by far the most commonly used form; percent-wise it can possibly be estimated in the high nineties. It follows that whitespace granularity that exact does not matter for domain signatures, and if reduction to a single algorithm is desired, "relaxed" is the only viable form for header fields (anyway), and using it for message bodies is a most widely accepted variant.
This specification obsoletes the DKIM "l=" tag that restricts the number of DKIM covered bytes of the normalized message body. It MUST NOT be used by DKIM/ACDC compatible software, and all the message body MUST always be used to create the body hash. Rationale: "l=" has always been insufficient to deal with message changes caused by mailing-lists etc, but effectively includes the security risk that message parts which are not covered by the signature appear as "valid content" to users looking at a DKIM verified message. The DKIM/ACDC differential changes offer a better approach to deal with message changes, while completely covered message bodies ensure content validity.
For the "i=" tag this specification obsoletes the possible use of DKIM-Quoted-Printable for the optional "Local-part". Rationale: because the syntax is "a standard email address where the local-part MAY be omitted", quoted-printable encoding is not necessary for representation.
This specification obsoletes the DKIM "z=" tag that was defined "for diagnostic use" to copy a freely defined set of header fields and their values present during signature creation. It MUST NOT be used by DKIM/ACDC compatible software. Rationale: the DKIM/ACDC differential changes provide access to the same information.
For the "q=" tag this specification obsoletes the possible use of DKIM-Quoted-Printable for the optional "x-sig-q-tag-args" of possibly introduced future query types. Rationale: should a new type ever become standardized beside the dns/txt that has been with DKIM from the very start, that standard can very well give meaning to a "hyphenated-word" proxy identifier without making use of byte values which would require encoding.
This specification obsoletes the DKIM key representation tag "n=" that was meant to include "notes that might be of interest to a human", "intended for use by administrators, not end users", and which "should be used sparingly". Rationale: no use case has been encountered in the DNS, let alone serious such; if future space unconstrained key providers other than DNS should ever exist and be used to distribute DKIM keys, it is likely that they support inclusion of strings via some method that need not be included in the DKIM key representation itself.
Because the above changes remove all use cases for the "dkim-quoted-printable" encoding defined in RFC 6376, section 2.11, this specification obsoletes the DKIM-Quoted-Printable encoding.

Acknowledgements Thanks to, in the order of appearance, Jesse Thompson, Richard Clayton for arguments against reliance on header field stacks, and pro the numbering scheme, and especially for noticing the partial transaction replay attack problem, Douglas Foster, Michael Thomas for explicit man-in-the-middle replay addressing; Alessandro Vesely inspired the explicitness of the E flag, Bron Gondwana for the inspiration to split up binary differences of headers and body, as well as the IANA registry revision, Lasse Collin for LZMA2 (and clarification vs XZ), and Julian Seward for bzip2. A big fat acknowledgment is due to Murray S. Kucherawy. Special thanks to Klaus Schulze, Manuel Goettsching, both also as Ash Ra Tempel, Laeuten der Seele, Laurent Garnier, as well as the Sleeping Environmental Bot broadcast.