gTLD Registry Constituency GNSO


Potential of IDN for malicious abuse

Measures agreed by the gTLD Registry Constituency to clarify the situation, and substantially reduce the risk for deceptive IDN registration

The gTLD Registry Constituency provides a forum for the consideration of the shared concerns of the generic top-level domain registries. It is a member of ICANN's Generic Names Supporting Organization and is represented on the GNSO Council. The following statement is an expanded version of a resolution unanimously adopted at the Constituency's meeting on 16 February 2005. It addresses immediate operational and policy aspects of the registries that currently accept the registration of internationalized domain names (IDN) -- .biz, .com, .info, .museum, .net, .org -- and will apply to the other member registries as they commence this service.

The ICANN Guidelines for the Implementation of Internationalized Domain Names state that "as the deployment of IDNs proceeds, ICANN and the IDN registries will review these Guidelines at regular intervals, and revise them as necessary based on experience."

Issues requiring consideration in such revision are illustrated by the recent attention that has been called to the opportunity for malicious exploitation of graphic similarity between characters at different Unicode code points. Although the underlying concern was recognized early in the process of IDN development, it has now manifested itself in a manner to which the gTLD Registry Constituency wishes to respond immediately, pending more deliberate action toward the preparation of newer versions of the Guidelines.

The acute concern is with the visual overlap between the Cyrillic and Latin alphabets. To avert the risk for confusion, the inclusion in the same label of letters in the Unicode Cyrillic code chart [1] and the Latin-based code charts [2] will be blocked for all relevant languages. The digits 0-9 and the hyphen sign may, however, appear in all labels containing Cyrillic or Latin letters. Individual registries may subsequently adopt more detailed policies for dealing with requests for names with justifiable and secure cross-script components in these code point ranges.
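The blocking rule above can be sketched in a few lines of Python. This is only an illustration, not any registry's actual implementation: the function names are hypothetical, the code point ranges are copied from notes [1] and [2], and "small letters" is assumed to mean Unicode general category Ll. The label is assumed to be in its lowercased Unicode form already.

```python
import unicodedata

# (start, end, small_letters_only), copied from notes [1] and [2].
CYRILLIC_RANGES = [
    (0x0430, 0x045F, False),
    (0x048B, 0x04F9, True),
]
LATIN_RANGES = [
    (0x0061, 0x007A, False),
    (0x00E0, 0x0233, True),
    (0x0250, 0x02AF, True),
    (0x1E00, 0x1EFF, True),
]


def in_ranges(ch, ranges):
    """Return True if ch falls in one of the listed ranges.

    Assumption: "small letters" is read as general category Ll.
    """
    cp = ord(ch)
    for start, end, small_only in ranges:
        if start <= cp <= end:
            if not small_only or unicodedata.category(ch) == "Ll":
                return True
    return False


def mixes_cyrillic_and_latin(label):
    """Return True if the label contains letters from both sets.

    Digits and the hyphen belong to neither set, so they never
    cause a label to be blocked.
    """
    has_cyrillic = any(in_ranges(ch, CYRILLIC_RANGES) for ch in label)
    has_latin = any(in_ranges(ch, LATIN_RANGES) for ch in label)
    return has_cyrillic and has_latin
```

Under these assumptions, a label that substitutes Cyrillic U+0430 for the Latin letter a in an otherwise Latin string would be blocked, while an all-Cyrillic label with digits and a hyphen would pass.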

Registration requests submitted without a value assigned to the language tag will only be accepted for non-IDN registration which, by definition, is restricted to the LDH character table [3]. We recognize the requirement in the ICANN Guidelines for establishing language-based IDN policies but deem it inappropriate to use English (or any other language) as a baseline designator for the LDH repertoire. We anticipate a need for English language IDN registration, and do not wish to constrain the ability of registries to adopt explicit English language inclusion tables on a peer footing with any other language they may wish to support. Until a registry has implemented such a table for English, it will use the LDH character table as a default for registration requests submitted with EN or ENG as the language tag value.

For languages that use the LDH characters and for which a registry does not have explicit inclusion tables, the LDH table may be used as a default table for registration requests tagged with those languages, pending fully developed inclusion tables becoming available during the course of dialogue between the registries and their reference groups in the respective language communities. The decision to apply the LDH default to a given language will be made in consideration of the requirements applying to the other scripts that might appear in the full inclusion table, such as the restrictions on bidirectional strings imposed by the IDNA standard.
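The language tag handling described in the two preceding paragraphs can be sketched as follows. This is a hedged illustration only: the function and parameter names are hypothetical, inclusion tables are modeled as plain sets of characters keyed by an upper-case language tag, and only the character repertoire is checked (not other label rules such as hyphen placement or the bidirectional restrictions of IDNA).

```python
import string

# LDH repertoire as described in note [3].
LDH = frozenset(string.ascii_letters + string.digits + "-")


def is_ldh(label):
    """Return True if the label is non-empty and uses only LDH characters."""
    return bool(label) and all(ch in LDH for ch in label)


def accept_request(label, language_tag, inclusion_tables,
                   extra_ldh_default_tags=frozenset()):
    """Decide whether a request passes the repertoire check.

    inclusion_tables: dict mapping language tag -> set of permitted
        characters (hypothetical representation).
    extra_ldh_default_tags: languages, other than English, to which
        the registry has decided to apply the LDH default.
    """
    # No language tag: only a non-IDN (LDH) registration is accepted.
    if not language_tag:
        return is_ldh(label)

    tag = language_tag.upper()

    # An explicit inclusion table takes precedence, English included.
    if tag in inclusion_tables:
        table = inclusion_tables[tag]
        return bool(label) and all(ch in table for ch in label)

    # EN/ENG without an English table falls back to LDH, as do any
    # other languages the registry has chosen to default to LDH.
    if tag in ("EN", "ENG") or tag in extra_ldh_default_tags:
        return is_ldh(label)

    # Assumption: a tagged request with neither a table nor an LDH
    # default is rejected.
    return False
```

The final rejection branch is an assumption of this sketch; the statement itself leaves the treatment of such requests to the individual registry.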

These measures will be implemented in the shortest possible time by all of the IDN registries listed above. Further means are being developed for flagging and blocking the deliberate homographic confusion that these measures will not reduce. The registries are reviewing the IDN Guidelines in detail and suggest that their formal revision be initiated together with ICANN without delay. Establishing a more nuanced policy basis appears particularly urgent, extending the language-based approach to include script- and locale-based policies as well. This is exemplified by the need for making "LDH" available as a language tag value.

The gTLD Registry Constituency notes that CENTR, an organization representing a large number of ccTLDs, has issued a similar statement [4], encouraging its registries to adopt appropriate policies for their user communities in consideration of the security impact of the present situation. We congratulate the efforts of the ccTLDs in the development and maintenance of local policies extending the range of languages on the Internet, and their ongoing efforts to internationalize the DNS.

Notes:

  1. code points in the ranges 0430..045F, and the small letters in 048B..04F9

  2. 0061..007A, the small letters in the ranges 00E0..0233, 0250..02AF, and 1E00..1EFF

  3. LDH = "letter/digit/hyphen", with characters restricted to the
    26-letter Latin alphabet <A-Z a-z>, the digits <0-9>, and the hyphen
    <->. It should be noted that the risk for confusing similar
    characters exists in this range as well, for example between the
    lower case letter <l> and the digit <1>, and the upper case letter
    <O> and the digit <0>.

  4. http://www.centr.org/docs/2005/02/homographs.html

Copyright 2005 GNSO gTLD Registry Constituency. All Rights Reserved.