Internationalized Domain Names (IDN) Committee

Input to the IETF on Permissible Code Point Problems
27 February 2002

On the basis of an analysis summarized in the accompanying briefing paper, the ICANN IDN Committee urges that the Internet Engineering Task Force (IETF) determine the set of code points to be permitted by the IDNA standard -- i.e., what code points should be added to the existing LDH hostname specification -- in a cautious, conservative manner, in order to avoid imposing potentially harmful policy choices by default. More specifically, the committee recommends that the IETF employ an "inclusion-based" mechanism for identifying permissible code points, and that at least the following sets of characters not be included, pending further analysis:

  • line and symbol-drawing characters,
  • symbols and icons that are neither alphabetic nor ideographic language characters, such as typographical and pictographic dingbats,
  • punctuation characters, and
  • spacing characters.

In this manner, the IETF could proceed with deployment of the draft IDN standards currently under review by including the coded characters of the conventional scripts defined in the Unicode Standard, while leaving aside for the moment the problematic characters. The immediate introduction of the problematic Unicode characters would create a serious risk of confusion, spoofed registrations, and other security problems for DNS users. Accordingly, a delay in their inclusion is warranted.
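As an illustration only, an "inclusion-based" test of the kind recommended above might look like the following Python sketch. It is not part of the committee's input or of any IDNA draft. The set of permitted Unicode general categories (letters, combining marks, decimal digits) and the helper names `is_permitted` and `rejected_code_points` are assumptions chosen for the example; the actual set would be for the IETF to determine.

```python
import unicodedata
from typing import List

# Assumption for this sketch: only these Unicode general categories are
# treated as "language characters" and therefore placed on the inclusion list.
PERMITTED_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Nd"}


def is_permitted(ch: str) -> bool:
    """Return True only if the character is explicitly on the inclusion list."""
    if ch == "-":
        # The hyphen is already allowed by the existing LDH hostname rule.
        return True
    return unicodedata.category(ch) in PERMITTED_CATEGORIES


def rejected_code_points(label: str) -> List[str]:
    """List the code points in a label that the inclusion list does not cover."""
    return [f"U+{ord(ch):04X}" for ch in label if not is_permitted(ch)]
```

Under these assumptions, anything not listed is rejected by default, so box-drawing characters, dingbats, punctuation, and spaces are excluded without having to be enumerated one by one.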

The ICANN IDN Committee intends this note as useful input to the IETF for its consideration, not as some sort of demand. We recognize that the problem of distinguishing among the Unicode sets or collections of characters is not easy, and is not ultimately the IETF's responsibility. Nevertheless, it may be possible for the IETF to discuss workable solutions with the Unicode Consortium. At a minimum, it may be possible to distinguish (and refrain from including) non-language-based code charts such as Box Drawing, Block Elements, Geometric Shapes, Miscellaneous Symbols, Dingbats, Byzantine Musical Symbols, Musical Symbols, Mathematical Alphanumeric Symbols, Letterlike Symbols, Number Forms, Arrows, Mathematical Operators, and Miscellaneous Technical. The problem of identifying punctuation characters is more difficult, but perhaps dialogue with the Unicode Consortium would lead to a workable approach.
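One way to picture the chart-level distinction described above is a simple range lookup, sketched here in Python. The code point ranges are those of the correspondingly named blocks in the Unicode Standard; the table name `EXCLUDED_BLOCKS` and the function `excluded_block` are hypothetical, and the sketch does not address punctuation, which the committee notes is harder to identify.

```python
from typing import Optional

# Ranges of the non-language-based code charts named in the text,
# as (first code point, last code point, chart name).
EXCLUDED_BLOCKS = [
    (0x2100, 0x214F, "Letterlike Symbols"),
    (0x2150, 0x218F, "Number Forms"),
    (0x2190, 0x21FF, "Arrows"),
    (0x2200, 0x22FF, "Mathematical Operators"),
    (0x2300, 0x23FF, "Miscellaneous Technical"),
    (0x2500, 0x257F, "Box Drawing"),
    (0x2580, 0x259F, "Block Elements"),
    (0x25A0, 0x25FF, "Geometric Shapes"),
    (0x2600, 0x26FF, "Miscellaneous Symbols"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x1D000, 0x1D0FF, "Byzantine Musical Symbols"),
    (0x1D100, 0x1D1FF, "Musical Symbols"),
    (0x1D400, 0x1D7FF, "Mathematical Alphanumeric Symbols"),
]


def excluded_block(ch: str) -> Optional[str]:
    """Return the name of the excluded chart containing ch, or None."""
    cp = ord(ch)
    for first, last, name in EXCLUDED_BLOCKS:
        if first <= cp <= last:
            return name
    return None
```

A chart-level test of this kind is coarse. Letterlike Symbols, for example, contains code points such as U+212A KELVIN SIGN whose general category is an uppercase letter, which suggests why a per-chart decision and a per-character property test could give different answers.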

The objective of the recommended approach is to allow the relevant Internet policy coordination bodies to understand, evaluate, and develop sensible global policies for the introduction and use of these problematic Unicode code points, in order to minimize the problems for users that are otherwise likely to arise.

The ICANN IDN Committee stresses that this note does not relate to the various CJK issues that have been the subject of much recent discussion. The types of code points addressed above are neither ideographic nor alphabetic in a general sense, and as such raise conceptually distinct concerns.


Page Updated 27-Feb-2002
©2002  The Internet Corporation for Assigned Names and Numbers. All rights reserved.