Threading
threadId is a stable conversation identifier, calculated deterministically
from a single message’s own headers — never from any other message that has or
hasn’t been seen. This is the core determinism claim of AECS-1 §5: threadId
“must be identical for all messages in the same conversation, across
implementations.”
The algorithm (AECS-1 §5.2)
Evaluated in order — the first rule that produces a result wins:
1. References present → scan its entries in order and use the first entry that is a valid Message-ID. An invalid entry is skipped; it does not end the scan. For example, References: garbage, <valid@example.com> resolves to valid@example.com rather than falling through to rule 2.
2. Otherwise, In-Reply-To present and valid → use it as threadId.
3. Otherwise, own Message-ID present and valid → use it (this is the root message of a new thread).
4. Otherwise (no valid Message-ID found anywhere on the message) → generate a deterministic fallback hash: SHA-256(from_email + ":" + subject_lowercased_trimmed + ":" + date_utc_iso8601), encoded as lowercase hex over UTF-8 input. Both from_email and the subject are Unicode-normalized to NFC first, so two independent implementations can't diverge on visually identical strings with different byte sequences.
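The fallback hash from the last rule can be sketched as follows. This is a minimal illustration, not the normative procedure: the function name `fallback_thread_id`, the order of the normalize, trim, and lowercase steps, and the exact ISO 8601 rendering (second precision with a trailing `Z`) are assumptions that AECS-1 §5.2 may pin down differently.

```python
import hashlib
import unicodedata
from datetime import datetime, timezone


def fallback_thread_id(from_email: str, subject: str, date: datetime) -> str:
    """Sketch of the fallback: SHA-256 over "from:subject:date", lowercase hex."""
    if date.tzinfo is None:
        raise ValueError("date must be timezone-aware so it can be converted to UTC")
    # NFC-normalize first so visually identical strings hash identically.
    sender = unicodedata.normalize("NFC", from_email)
    # Assumption: normalize, then trim, then lowercase.
    topic = unicodedata.normalize("NFC", subject).strip().lower()
    # Assumption: UTC, second precision, trailing "Z".
    stamp = date.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    payload = f"{sender}:{topic}:{stamp}"
    # hexdigest() already returns lowercase hex.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under these assumptions, a subject written with a precomposed "é" and one written with "e" plus a combining accent produce the same hash, as do two Date values that name the same instant in different time zones.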
Angle brackets are stripped and whitespace trimmed before any comparison. A
valid Message-ID (AECS-1 §5.1) is, informally: non-empty after trimming and
stripping one optional pair of enclosing angle brackets, containing exactly one
@ with non-empty text on each side.
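The validity check and the first three rules (References, In-Reply-To, own Message-ID) can be sketched together. The names `normalize_message_id` and `thread_id_from_headers` are hypothetical, the header mapping is assumed to use exactly these key spellings, and References entries are assumed to be separated by whitespace and/or commas. Returning `None` signals that the fallback hash applies.

```python
import re
from typing import Mapping, Optional


def normalize_message_id(raw: str) -> Optional[str]:
    """Return the bare Message-ID if it passes the informal validity rule, else None."""
    candidate = raw.strip()
    # Strip one optional pair of enclosing angle brackets.
    if len(candidate) >= 2 and candidate.startswith("<") and candidate.endswith(">"):
        candidate = candidate[1:-1].strip()
    local, at, domain = candidate.partition("@")
    # Exactly one "@", with non-empty text on each side.
    if not at or not local or not domain or "@" in domain:
        return None
    return candidate


def thread_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Apply the first three rules in order; None means the fallback hash applies."""
    # References: take the first *valid* entry; invalid entries are skipped.
    # Assumption: entries are separated by whitespace and/or commas.
    for entry in re.split(r"[\s,]+", headers.get("References", "")):
        found = normalize_message_id(entry)
        if found is not None:
            return found
    # Then In-Reply-To, then the message's own Message-ID.
    for name in ("In-Reply-To", "Message-ID"):
        found = normalize_message_id(headers.get(name, ""))
        if found is not None:
            return found
    return None
```

Because the function reads only the one message's headers, calling it on the same message always yields the same result, whatever else has been processed.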
Why no JWZ reparenting
Mail clients implementing the Jamie Zawinski (“JWZ”) threading algorithm build a container tree incrementally and re-parent messages as earlier context arrives out of order — an orphaned reply gets retroactively attached once its missing parent finally shows up.
AECS-1 deliberately does not do this. threadId is a pure function of one
message’s own headers and never depends on which other messages have or
haven’t been seen. This is a narrower guarantee than JWZ reparenting, traded for
the property AECS-1 is built around: threadId is computable from a single
message in isolation, with no external state, and is guaranteed stable
regardless of processing order.
An implementation that wants JWZ-style merge-on-discovery behavior can build it
as an application-layer feature that groups AECS-1 threadIds together after
the fact — that grouping logic is intentionally out of scope for AECS-1 itself.
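One possible shape for such an application-layer grouping is a union-find over Message-IDs, sketched below. Nothing here is defined by AECS-1: `group_thread_ids` and the record layout are hypothetical, and the inputs are assumed to be already normalized. Each threadId stays exactly as computed; only the grouping on top of it changes as more messages are seen.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Each record: (threadId, every valid Message-ID the message mentions,
# including its own). Hypothetical layout, assumed pre-normalized.
Record = Tuple[str, List[str]]


def group_thread_ids(records: Iterable[Record]) -> List[Set[str]]:
    """Group threadIds whose messages are linked through a shared Message-ID."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    thread_ids: Set[str] = set()
    for thread_id, mentioned in records:
        thread_ids.add(thread_id)
        find(thread_id)  # register threads that mention nothing else
        for message_id in mentioned:
            union(thread_id, message_id)

    groups: Dict[str, Set[str]] = defaultdict(set)
    for thread_id in thread_ids:
        groups[find(thread_id)].add(thread_id)
    return list(groups.values())
```

For instance, if a reply's References header was truncated so that its threadId is `b@x`, and a later-seen message with Message-ID `b@x` has threadId `a@x`, the two threadIds end up in one group regardless of the order in which the records are supplied.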
thread.position
thread.position cannot be computed from a single message — it requires knowing
every other message in the thread:
- A single, isolated parse() of one message always sets thread.position to null; it has no view of the rest of its thread.
- EmailThread.from(messages) is the operation that populates it: sort ascending by metadata.timestamp (the sender-supplied Date header; see the note below), then assign position = 0, 1, 2, ....
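The sort-then-number step can be sketched as below. This is not the library's implementation of EmailThread.from(): the `Message` class and `assign_positions` are hypothetical stand-ins, and the tie-break for equal timestamps (input order, via a stable sort) is an assumption the source does not specify.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Message:
    message_id: str
    timestamp: datetime              # stands in for metadata.timestamp
    position: Optional[int] = None   # stands in for thread.position; None when parsed alone


def assign_positions(messages: Sequence[Message]) -> List[Message]:
    """Sort ascending by timestamp and number the results 0, 1, 2, ..."""
    # sorted() is stable, so equal timestamps keep their input order.
    # Assumption: AECS-1 may define a different tie-break.
    ordered = sorted(messages, key=lambda m: m.timestamp)
    # Return copies so the individually parsed messages keep position=None.
    return [replace(m, position=i) for i, m in enumerate(ordered)]
```

All timestamps passed in should be timezone-aware (or all naive), since Python refuses to compare the two kinds.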
The ordering key is metadata.timestamp, not the order in which an
implementation received or processed each message. The two differ whenever mail
is delayed or backdated, or a sender's clock is skewed, and Date is
sender-controlled, untrusted input (AECS-1 §7). Implementations that need true
receipt order, for robustness against clock skew or spoofing, should order by
processing.processedAt rather than relying on thread.position.
See Threads & wrappers for EmailThread.from() in
practice, and the full normative text in
AECS-1 §5.