RFC 6531 – SMTP Extension for Internationalized Email
émilie@exemple.fr or an address in Hindi, Japanese, or any other writing system.
Why This RFC Exists
Email was designed in the era of ASCII. The core SMTP specification (RFC 5321, like the original RFC 821 before it) restricts email addresses and envelope commands to 7-bit ASCII characters. This works fine for English, but excludes the majority of the world's languages. An email address like 田太郎@例.jp was simply impossible.
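RFC 6531 defines the SMTPUTF8 extension, which allows UTF-8 encoding in the SMTP envelope, specifically in the MAIL FROM and RCPT TO addresses. The EHLO domain is the exception: the client sends EHLO before it knows whether the server supports SMTPUTF8, so that argument must stay ASCII (an A-label for an internationalized domain). This is part of a suite of RFCs collectively known as Email Address Internationalization (EAI), which also includes RFC 6532 for internationalized message headers.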
This extension opens email to every writing system supported by Unicode, which is essential for global email adoption.
How It Works
- The client sends EHLO and confirms the server advertises SMTPUTF8 in its capability list.
- When sending a message that uses non-ASCII addresses (in the envelope, headers, or both), the client adds the SMTPUTF8 parameter to the MAIL FROM command.
- The MAIL FROM and RCPT TO addresses may now contain UTF-8 characters.
- The message headers may also contain UTF-8 (per RFC 6532), replacing the old RFC 2047 encoded-word workarounds.
- If the next-hop server does not support SMTPUTF8, the sending server must either downgrade the message or reject it — it cannot silently strip the international characters.
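The steps above can be sketched with Python's standard smtplib module. This is a minimal illustration, not a complete sender: the function name send_internationalized is hypothetical, the smtp argument is assumed to behave like a connected smtplib.SMTP object, and only the header block of the message is inspected for non-ASCII bytes.

```python
import smtplib


def send_internationalized(smtp, sender, recipients, message_bytes):
    """Send a message, declaring SMTPUTF8 only when it is needed.

    `smtp` is assumed to behave like a connected smtplib.SMTP instance.
    """
    smtp.ehlo_or_helo_if_needed()  # make sure the capability list is known

    # Headers end at the first blank line; the body is not relevant here.
    header_block = message_bytes.partition(b"\r\n\r\n")[0]
    addresses = [sender, *recipients]
    needs_utf8 = (not all(addr.isascii() for addr in addresses)
                  or not header_block.isascii())

    options = []
    if needs_utf8:
        if not smtp.has_extn("smtputf8"):
            # Refuse rather than silently altering the addresses.
            raise smtplib.SMTPNotSupportedError(
                "server does not advertise SMTPUTF8")
        options.append("SMTPUTF8")  # becomes a MAIL FROM parameter

    return smtp.sendmail(sender, recipients, message_bytes,
                         mail_options=options)
```

Passing the message as bytes matters: smtplib encodes str messages as ASCII, which fails for UTF-8 headers.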
SMTP Example
Sending a message with internationalized addresses:
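As a rough illustration of the client side of such a session, the Python sketch below builds the envelope commands in the order a client would send them. The function name client_commands, the utf8_headers flag, and the host name client.example are assumptions made for this example; server replies and the message data are left out.

```python
def client_commands(sender, recipients, utf8_headers=False,
                    helo_domain="client.example"):
    """Return the envelope commands a client would send, in order."""
    needs_utf8 = utf8_headers or not all(
        addr.isascii() for addr in [sender, *recipients])

    mail_from = f"MAIL FROM:<{sender}>"
    if needs_utf8:
        mail_from += " SMTPUTF8"  # declared once, on MAIL FROM only

    commands = [f"EHLO {helo_domain}", mail_from]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")
    return commands
```

For the sender émilie@exemple.fr and the recipient 田太郎@例.jp, the function returns EHLO client.example, MAIL FROM:<émilie@exemple.fr> SMTPUTF8, RCPT TO:<田太郎@例.jp>, and DATA, in that order.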
Key Technical Details
The SMTPUTF8 Parameter
The SMTPUTF8 keyword on MAIL FROM signals that this message uses internationalized content. It must be present whenever any of the following contain non-ASCII characters:
- The MAIL FROM address
- Any RCPT TO address
- The message headers (From, To, Cc, Reply-To, etc.)
If the SMTPUTF8 parameter is not declared and the envelope contains non-ASCII characters, the server must reject the command.
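Seen from the server side, this rule amounts to a small piece of session state. The sketch below is a simplified illustration in Python: the class name EnvelopeChecker is hypothetical, the parsing only handles the plain <address> form, and it returns accept or reject as a boolean rather than choosing a reply code.

```python
class EnvelopeChecker:
    """Tracks whether SMTPUTF8 was declared for the current transaction."""

    def __init__(self):
        self.smtputf8 = False

    @staticmethod
    def _split(line, verb):
        # Split "VERB:<address> PARAM ..." into the address and its parameters.
        head, lt, rest = line.partition("<")
        address, gt, tail = rest.partition(">")
        if not (lt and gt) or head.strip().upper() != verb:
            raise ValueError(f"not a {verb} command: {line!r}")
        return address, tail.split()

    def mail_from(self, line):
        """Return True if the MAIL FROM command is acceptable."""
        address, params = self._split(line, "MAIL FROM:")
        self.smtputf8 = any(p.upper() == "SMTPUTF8" for p in params)
        return self.smtputf8 or address.isascii()

    def rcpt_to(self, line):
        """Return True if the RCPT TO command is acceptable."""
        address, _ = self._split(line, "RCPT TO:")
        return self.smtputf8 or address.isascii()
```

A non-ASCII recipient is only accepted here when the preceding MAIL FROM carried the SMTPUTF8 keyword, which is why the flag lives on the checker rather than on each command.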
Domain Names: IDN and UTF-8
Internationalized domain names (IDN) have existed for years using Punycode encoding (e.g., xn--e1afmkfd.xn--p1ai for пример.рф). SMTPUTF8 allows the UTF-8 form of domain names (U-labels) directly in SMTP, though the Punycode form (A-labels) remains valid. For DNS lookups, the domain must still be converted to its A-label form.
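The conversion between the two forms can be tried with Python's built-in idna codec. Treat this as a sketch: the helper names are made up for this example, and the built-in codec implements the older IDNA 2003 rules, so production code may want a library that follows IDNA 2008 instead.

```python
def to_a_label(domain):
    """Convert a domain to its ASCII (Punycode) form for DNS lookups."""
    return domain.encode("idna").decode("ascii")


def to_u_label(domain):
    """Convert an A-label domain back to its Unicode form for display."""
    return domain.encode("ascii").decode("idna")


def dns_name_for(address):
    """Return the A-label domain to query for an email address."""
    # The domain is everything after the last "@".
    _local, _at, domain = address.rpartition("@")
    return to_a_label(domain)


# to_a_label("пример.рф") gives "xn--e1afmkfd.xn--p1ai"
# to_u_label("xn--e1afmkfd.xn--p1ai") gives "пример.рф"
```

Plain ASCII domains pass through unchanged, so the same helper can be applied to every address before an MX, A, or AAAA lookup.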
Downgrading and Fallback
The biggest challenge with SMTPUTF8 is interoperability with servers that don't support it. When relaying a message to a non-SMTPUTF8 server:
- If the message can be downgraded (e.g., only the display name contains non-ASCII, addresses themselves are ASCII), the server may perform downgrade conversion.
- If the envelope addresses themselves contain non-ASCII characters, downgrading is impossible — the message cannot be delivered to that server. The sending server must generate a bounce.
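The two cases above reduce to a short decision procedure. The Python sketch below is one way to express it; the Action enum and the function name next_hop_action are hypothetical, and the actual downgrade conversion of headers is out of scope.

```python
from enum import Enum


class Action(Enum):
    RELAY = "relay"          # pass the message on unchanged
    DOWNGRADE = "downgrade"  # rewrite headers to ASCII-safe forms, then relay
    BOUNCE = "bounce"        # cannot deliver to this server


def next_hop_action(next_hop_supports_smtputf8, envelope_addresses,
                    headers_have_utf8):
    """Decide what to do with a message for a given next-hop server."""
    if next_hop_supports_smtputf8:
        return Action.RELAY
    if not all(addr.isascii() for addr in envelope_addresses):
        # A non-ASCII envelope address has no ASCII equivalent to fall back to.
        return Action.BOUNCE
    if headers_have_utf8:
        return Action.DOWNGRADE
    return Action.RELAY
```

Checking the envelope before the headers mirrors the text: a non-ASCII envelope address rules out downgrading no matter what the headers contain.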
Interaction with Authentication
Email authentication mechanisms need updates for internationalized addresses:
- SPF: Uses the domain from MAIL FROM. If the domain is internationalized, it must be converted to A-label form for the DNS lookup.
- DKIM: Signs and verifies headers. With SMTPUTF8, headers may contain raw UTF-8, so the DKIM implementation must handle this correctly.
- DMARC: Domain alignment checks must account for both UTF-8 and A-label forms of the same domain.
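One way to avoid mismatches between the two forms is to reduce every domain to a single canonical form before comparing or querying. The Python sketch below uses the built-in idna codec (IDNA 2003 rules) and hypothetical helper names. It covers only the normalization step: real DMARC alignment also involves organizational-domain logic, which is not shown.

```python
def canonical_domain(domain):
    """Return a lowercase A-label form suitable for comparison and DNS."""
    return domain.rstrip(".").encode("idna").decode("ascii").lower()


def domains_match(first, second):
    """Compare two domains regardless of UTF-8 or A-label spelling."""
    return canonical_domain(first) == canonical_domain(second)


def spf_query_domain(mail_from):
    """Return the domain to use for the SPF lookup of a MAIL FROM address."""
    _local, _at, domain = mail_from.rpartition("@")
    return canonical_domain(domain)
```

With this in place, пример.рф and xn--e1afmkfd.xn--p1ai compare as equal, so a DKIM d= value in one form can be checked against a From domain in the other.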
Common Mistakes
- Omitting the SMTPUTF8 parameter. If your message contains any non-ASCII in addresses or headers, you must include SMTPUTF8 on the MAIL FROM command. Forgetting it causes the server to reject the addresses.
- Assuming universal SMTPUTF8 support. Adoption is growing but not universal. As of 2025, major providers like Gmail and Outlook support SMTPUTF8, but many smaller servers do not. Your sending infrastructure must handle fallback gracefully.
- Not converting domains for DNS lookups. Even with SMTPUTF8, DNS queries require the A-label (Punycode) form of internationalized domains. You can use UTF-8 in the SMTP envelope, but the MX/A/AAAA lookups must use the ASCII-compatible encoding.
- Storing internationalized addresses without normalization. Unicode has multiple representations for the same character (e.g., composed vs. decomposed forms). Use NFC normalization for email addresses to ensure consistent matching and deduplication.
- Silently stripping non-ASCII characters. If your system can't handle internationalized addresses, it must reject them cleanly — never silently modify addresses by removing characters. That changes the recipient.
- Forgetting bounce addresses. If the sender's address is internationalized and the message bounces, the bounce must also be sent via SMTPUTF8. If the bounce path doesn't support SMTPUTF8, the bounce is lost.
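The normalization point is easy to demonstrate with Python's standard unicodedata module. This is a small sketch for addresses stored and matched inside your own system; the helper name normalize_address is an assumption of this example.

```python
import unicodedata


def normalize_address(address):
    """Return the NFC form of an address for storage and comparison."""
    return unicodedata.normalize("NFC", address)


composed = "\u00e9milie@exemple.fr"     # "é" as a single code point
decomposed = "e\u0301milie@exemple.fr"  # "e" followed by a combining accent

# The raw strings differ even though they render identically.
raw_equal = composed == decomposed

# After NFC normalization both spellings collapse to the same string.
normalized_equal = normalize_address(composed) == normalize_address(decomposed)

# Deduplicating a list therefore needs the normalized form as the key.
unique = {normalize_address(a) for a in [composed, decomposed]}
```

Here raw_equal is False, normalized_equal is True, and unique contains a single address.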
Deliverability Impact
- Reaching a global audience. SMTPUTF8 support signals that your sending infrastructure is modern and inclusive. As internationalized addresses become more common, supporting them becomes a baseline expectation.
- Bounce handling complexity. Messages to internationalized addresses may bounce differently if intermediate servers don't support SMTPUTF8. Your bounce processing must handle both ASCII and UTF-8 addresses.
- Authentication alignment. SPF, DKIM, and DMARC must all work correctly with internationalized domains. Mismatches between UTF-8 and A-label representations can cause spurious authentication failures.
- Major provider support is strong. Gmail, Microsoft 365, and other major platforms accept SMTPUTF8 messages. Support for sending from internationalized addresses is expanding rapidly.
- Future-proofing. The proportion of email addresses using non-ASCII characters is likely to keep growing. Investing in SMTPUTF8 support now prevents deliverability gaps as global adoption increases.