the Internationalized Domain Names Working Group Responses
to Survey A
Posted: 28 August
to Survey A: Technical Questions
What different technologies currently enable the use of non-Latin scripts
as domain names?
WALID, Inc. (WALID)
considers that the deployed or proposed approaches to Internationalized
Domain Names (IDNs) have focused mainly on four approaches:
- Upgrading the DNS
protocol in certain ways to support the tagging of UTF-8 or
codepage data in DNS packets, typically using some of the as-yet
unused bits in the DNS packet header. This has been proposed
in a number of IETF Internet-Drafts;
- Sending UTF-8 or
codepage data (sometimes unmarked) on the wire using the existing
protocol, and upgrading the authoritative DNS servers to store
and process that data;
- Sending UTF-8 or
codepage data (sometimes unmarked) to a DNS proxy server or
other network agent, which performs an ACE transformation on
the data and then presents the encoded name to the DNS for resolution;
- Performing ACE transformations
directly on the DNS client node, in the resolver and/or the
application layer. WALID is in favor of this approach to IDNs,
and the approach is embodied in WALID's WorldConnect technology.
Recently, other technology providers have begun to produce similar
Standards to enable
the use of non-Latin scripts as domain names have not been finalized.
Broadly speaking, three different methods are currently in use:
1. Domain names are
sent in a local encoding (such as GB, BIG5, SJIS, etc.)
2. Domain names are
sent in a Unicode Transformation Format, such as UTF-8.
3. Domain names are
converted to a "safe" representation using only the
subset of ASCII characters currently supported by the Internet's
infrastructure before sending.
VeriSign Global Registry
Services (VeriSign GRS) will conform its testbed to whatever standard
emerges from the IETF process.
There are in general
three approaches to multilingualize the domain name system:
- Brute Force Approach
- the DNS was designed to transport domain name characters in
unsigned octets. Therefore, the protocol itself is actually
capable of carrying 8bit information. The reason for restricting
it to the US-ASCII scheme is simply for backward compatibility
issues at the time it was devised. While the DNS has been arbitrarily
constrained to English only alphanumeric names, implementers
were not advised to reject names outside of the constraints,
because the DNS will ultimately determine the existence of a
domain name through its hierarchical search. It is therefore
possible to force 8-bit character information (such as UTF-8
or local encoding schemes: Big5, GB, JIS, etc.) through the
DNS, and existing implementation experiments indicate that root
servers and middleware are usually unaffected.
- Protocol Extension
Approach - An approach to solve the multilingual DNS problem
is to introduce additional flagging or tags within the DNS packet
to alert servers of the encoding scheme used by the request.
Whether it is an encoding tag or simply a multilingual flag,
the protocol approach utilizes the bit format within the existing
DNS packet to notify the receiving end of the context of the
domain name in question. A number of drafts have been proposed
at the Internet Engineering Task Force (IETF) discussion on
multilingual domain names, including the DNSII mechanism [DNSII-MDNP],
utilization of EDNS to signify multilingual labels [IDNE], use
of the reserved header bits [UDNS] as well as the introduction
of a new DNS class for universal characters [ClassUC]. The
proposals have different relative advantages: DNSII
supports the use of multiple encoding schemes, IDNE utilizes
the newly standardized EDNS mechanism for DNS extensions, UDNS
makes it possible for information to be tunneled all the way
to the authoritative server, and the introduction of Class UC
could effectively create a coherent but entirely new namespace.
- ASCII Conversion
Approach - The allocation of pain, in other words the question of
where most of the effort for multilingual domain names should
be put, drives the discussion towards an ASCII conversion approach
in which the legacy servers and transport protocol do
not need to change and all multilingual character information
is transformed into ASCII Compatible Encoding (ACE) formatted
DNS requests. The assumption for an ASCII conversion approach
is that all existing users on the Internet would upgrade to
a multilingual aware DNS resolver, which would perform a standardized
transformation mechanism to transform multilingual characters
into alphanumeric strings that would fit into the original DNS
specifications. In addition to transforming the multilingual
characters to ASCII strings, an in-label identifier is usually
added to the domain name. For example, the Row-based ASCII
Compatible Encoding [RACE] scheme calls for the use of a "bq--"
prefix. The common objective for all ACE schemes is to represent
multilingual characters in alphanumeric form, fitting names
within the current character constraints of the DNS. Each has
a slightly different transformation mechanism, from a simple
hex dump such as TRACE [DNSII-TRACE] to a multiple compression
scheme approach such as RACE.
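The ASCII conversion idea can be illustrated with a toy hex-dump encoding. This is only a sketch: it is not the RACE or TRACE algorithm, and the "zq--" identifier and the function names are hypothetical, chosen for illustration.

```python
# Toy "hex dump" ACE. NOT the real RACE or TRACE algorithm.
PREFIX = "zq--"  # hypothetical in-label identifier, for illustration only


def to_ace(label: str) -> str:
    """Return an LDH-only form of one label; pure-ASCII labels pass through."""
    if label.isascii():
        return label
    # Dump the UTF-16 code units as hexadecimal digits (0-9, a-f).
    return PREFIX + label.encode("utf-16-be").hex()


def from_ace(label: str) -> str:
    """Invert to_ace for labels that carry the identifier."""
    if not label.startswith(PREFIX):
        return label
    return bytes.fromhex(label[len(PREFIX):]).decode("utf-16-be")


def name_to_ace(name: str) -> str:
    """Convert each dot-separated label of a name independently."""
    return ".".join(to_ace(part) for part in name.split("."))


print(name_to_ace("中文.example"))  # zq--4e2d6587.example
```

A hex dump costs four ASCII characters per 16-bit code unit, which is why schemes such as RACE add compression before the final encoding step.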
Besides the three main
approaches, it is also possible to have a hybrid approach that
mixes and matches the different technologies. It is also Neteka's
opinion that the best strategy forward to deploy multilingual
domain names consists of a hybrid of all three approaches and
contemplates a phased transition:
Brute force approach - with the brute force approach, registries
could immediately offer functional multilingual domain names
to satisfy user demand. Neteka's technology allows the use of
UTF-8 as well as any other local encoding schemes (Big5, GB,
JIS, KUC, etc.) to be resolved at the server without requiring
any client side reconfiguration or plug-in.
ASCII Conversion approach - a server end ASCII conversion approach
is best used as a consolidation strategy for the different IDN
solutions. It offers a common platform for the convergence of
the technologies and provides a smooth transition and migration
from the existing system (including with brute force multilingual),
to a longer-term solution. Neteka's technology utilizes both
RACE and TRACE as a platform for administration for multilingual
names in brute force format, ASCII converted format as well
as protocol extension (mode bit flagging) format.
Protocol Extension approach - for a longer-term solution, a
protocol extension mechanism is generally considered the best
approach because it eliminates ambiguity by clearly identifying
multilingual names and does not compromise the efficiency of
the domain system. Neteka's technology employs the DNSII bit
flagging approach as well as the EDNS approach to transport
and identify multilingual requests. The DNSII approach also
allows the tagging of the encoding of the requested string,
making it more precise and effective.
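A minimal sketch of what header bit flagging could look like follows. It assumes the RFC 1035 header layout, in which three "Z" bits of the flags word were originally reserved. The choice of bit 0x0040 and the names ML_FLAG, build_header and is_multilingual are hypothetical and are not taken from the DNSII or UDNS drafts.

```python
import struct

Z_MASK = 0x0070   # the three "Z" bits reserved in the RFC 1035 flags word
ML_FLAG = 0x0040  # hypothetical: one Z bit used as a "multilingual" marker
RD_FLAG = 0x0100  # recursion desired


def build_header(query_id: int, multilingual: bool) -> bytes:
    """Build a 12-octet DNS query header, optionally setting the marker bit."""
    flags = RD_FLAG
    if multilingual:
        flags |= ML_FLAG
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def is_multilingual(header: bytes) -> bool:
    """Check whether the hypothetical marker bit is set in a header."""
    _, flags = struct.unpack("!HH", header[:4])
    return bool(flags & ML_FLAG)


header = build_header(0x1234, multilingual=True)
print(header.hex(), is_multilingual(header))
```

A server that does not know about the marker sees only a query with a reserved bit set, which is the deployment risk the respondents raise for protocol extensions.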
With all three approaches
pre-installed into Neteka's NeDNS, it is immediately deployable
as a server end solution for registries to offer multilingual
names, and prepared for the migration towards a longer-term solution,
whatever stream it might turn out to be.
Two general types of
technologies attempt to enable the use of non-Latin scripts as
domain names. The first of these approaches involves the transmission
of native encodings as part of the DNS labels used within queries
and/or responses. Native encodings are character encodings, such
as ISO-foo or Shift-JIS used to represent non-Latin scripts. These
encodings are generally 8-bit and always involve the use of characters
outside those permitted within DNS labels by RFC 1034 [RFC1034].
The second type of
approach is the conversion of IDNs into a domain name that conforms
with RFC 1034. These approaches involve the use of ASCII-compatible
encodings (ACEs) of non-Latin scripts. Generally, ACE-based proposals
involve both the compression of non-ASCII data as well as a transformation
into an RFC 1034 compliant string.
For both of these approaches,
various parties have proposed a variety of specific implementations.
Internet Drafts currently describe several ACEs, as well as various
approaches that describe the use of ACEs within various parts
of the DNS. Different approaches using 8-bit character transmissions
within the DNS have also been described, including on-the-wire
transmission of native encodings as well as common formats such
Finally, some more
radical approaches, such as the creation of a new DNS class or
the use of a new directory layer to replace traditional DNS functionality,
have been suggested.
follow IETF IDN WG discussion. The application solution complies with the
IDNA architecture, with NAMEPREP and ACE.
(1) Interim case:
a. Using NAMEPREP
to convert IDN into English domain name (ACE encoding) for IDN
b. Setting up a DNS
(web) proxy to support IDN resolving. The DNS (web) proxy converts
the IDN into an English domain name.
c. Supporting various
zone file encodings on the server side.
(2) Test bed case:
a. Modifying BIND
software to support clean 8 bits (native encoding) and UTF-8
b. Modifying related
software: Apache, Squid, etc., to support clean 8 bits (native
encoding) and UTF-8 encoding.
Two basic mechanisms
enable the use of non-Latin scripts as domain names:
- encapsulated transport
of extensions to the standard ASCII LDH character repertoire,
also called "ASCII Compatible Extension" labels ("ACE
labels"), but more correctly described as "Encodings
Contained in ASCII", and
- native transport
of extensions to the standard character repertoire, also called
"binary labels", and "utf-8 labels".
Both mechanisms rely
upon the Universal Character Set (UCS) for a single ASCII preserving
character set containing a large, but incomplete, repertoire of
- Adopting "multi
8-bit encoding" solution
- Adopting ACE solution
- Adopting "Directory"
- Adopting EDNS resolution
It is our understanding
that the following are the more well known technology providers
in the Multilingual (ML) field:
- TUCOWS/Open SRS
Status Report: Server-based,
client-based and hybrids
What are the strengths and weaknesses of the technologies referenced
in Question 1? Please give concrete examples.
A number of the proposed
approaches described above treat the problem of IDN as if it were
a 'DNS protocol problem', instead of a 'domain name problem'.
That is to say, if the DNS protocol or infrastructure can be changed
to support non-Latin scripts, then the problem would be solved.
The rough consensus of the technical community, however, is that
this approach is fundamentally incorrect, and perhaps the approach
stems from a view that the only applications running on the Internet
that need to be considered are web browsers and web servers (and
sometimes only particular web browsers or servers).
WALID would suggest
that any approach to IDNs must take into account the entire deployed
base of protocols, applications, and implementations that run
on the Internet today, many of which are crucial for ensuring
the stability, security, and operation of the network. Many of
these protocols and implementations will not support characters
outside of the LDH (letters, digits, and hyphen) set, either in
forward or reverse resolution contexts.
These fundamental issues
aside, approaches that focus on changes to the infrastructure,
either by deploying a new protocol, new servers, or new types
of server applications face the inertia associated with the deployed
nameservice infrastructure. The DNS is everywhere, and attempting
to make significant changes to the DNS as a whole would likely
take at least a decade for complete deployment, risking the creation
of islands of dis-connectivity in the process. Infrastructure-based
approaches also suffer from the problem that updates are difficult
to deploy. In this respect, one need only consider the large numbers
of very old BIND distributions still in operation with serious
known security vulnerabilities.
The approaches above
that would send Unicode data directly (typically in the UTF-8
encoding) also ignore the issues relating to name equivalence,
and ultimately would create a serious security problem, given
that many applications and protocols rely on the DNS for performing
authentication and authorization checks. Many Unicode codepoint
sequences, which are visually identical, can be different at a
binary level, creating the opportunity for a malicious user to
fool someone into connecting to a different host than the one
they think they are connecting to. At a more basic level, without
some sort of canonicalization step during resolution, many users
will have a difficult time making IDNs work reliably. Within the
IETF, this requirement has been called the 'business card test'.
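The equivalence problem can be seen in a few lines of Python. This is a minimal sketch: NFKC plus case folding is used here as one plausible canonical form and is not presented as the exact nameprep profile under discussion in the IETF; the function name canonical is illustrative.

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point (U+00E9)
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent (U+0301)


def canonical(label: str) -> str:
    """Reduce a label to one canonical form before comparison or lookup."""
    return unicodedata.normalize("NFKC", label).casefold()


# The two strings render identically but differ at the binary level.
print(composed == decomposed)                        # False
print(canonical(composed) == canonical(decomposed))  # True
```

Without the canonicalization step, the two spellings would be distinct names in the DNS even though a user cannot tell them apart on a business card.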
WALID's approach to
IDNs, currently in use as part of the VeriSign GRS multilingual
testbed, is to perform a canonicalization and transformation process
of the IDNs on the end user's system. IDNs are normalized to address
the equivalence problem described above, and encoded using a transformation
algorithm from Unicode into the subset of ASCII permitted in DNS
host labels. IDNs that are presented to the DNS for resolution
thus use the same range of characters as standard domain names.
The significant advantage to this approach is that no changes
are made to the deployed base of infrastructure systems, and the
operational stability of the network is not compromised. Our experience
in working with ccTLD registries has shown that infrastructure-based
approaches, by contrast, are quite unworkable, both because of
the inertia associated with the DNS resolution infrastructure
and the large numbers of web proxy servers that are on the network.
To deploy the ACE-based
approach completely, applications which process DNS hostnames
will need to be upgraded to handle IDNs. In the short-term, a
mechanism must be widely deployed to enable immediate resolution
of IDNs in the applications that end-users use most often, such
as web browsers and e-mail applications. WALID is addressing these
needs by making freely available for download its WALID WorldConnect
technology, to enable immediate resolution of IDNs, and its WALID
WorldApp to enable application developers to incorporate
standard IDN transformation capabilities into their applications.
Methods one and two
above involve an application sending binary (i.e., non-ASCII)
data through an infrastructure not designed to handle it, which
certainly has the potential to cause problems. Application protocols,
such as SMTP, call for domain names to be encoded in ASCII. Not
all DNS resolvers and name servers are "8-bit clean"
(i.e., able to handle binary data without issues). The deployed
base is huge, with endless combinations of components, and it
is impossible to test every scenario for its ability to handle
binary data. We do not know of any completed studies, although
MINC is planning such testing. (Please see http://www.minc.org/WG/testing/interop/.)
For this reason, the
IETF Internationalized Domain Name (IDN) Working Group has focused
on the Internationalized Domain Names in Applications (IDNA) solution,
which involves transforming internationalized domain names (as
described in method three above) at the application level, so
that they can be sent in application protocols and through the
Internet's DNS infrastructure in a known safe format.
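An application-level conversion of this kind can be sketched with Python's built-in "idna" codec. Note the assumption: that codec implements the later RFC 3490 standard (nameprep plus Punycode), not the ACE candidates under discussion when these responses were written. The helper names to_wire_name and is_ldh are illustrative.

```python
import string

LDH = set(string.ascii_letters + string.digits + "-")


def to_wire_name(name: str) -> str:
    """Convert a user-typed name to its ASCII form before any protocol sees it."""
    return name.encode("idna").decode("ascii")


def is_ldh(name: str) -> bool:
    """True if every label uses only letters, digits and hyphen."""
    return all(set(label) <= LDH for label in name.split("."))


ascii_name = to_wire_name("bücher.example")
print(ascii_name)          # xn--bcher-kva.example
print(is_ldh(ascii_name))  # True
```

Because the conversion happens in the application, SMTP, HTTP and the resolver only ever see the ASCII form.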
Brute Force Approach
- The advantage of this approach is that multilingual names
could be deployed immediately at the server end to parse the multilingual
name information and be reachable by a good percentage of users
over the Internet. However, this approach also has a considerable
number of disadvantages that could cause inconsistent responses.
These include character encoding conflicts as well as proxy and
application blockages. Character encoding conflict is
particularly prominent. The same encoding value could represent
entirely different characters if a different encoding scheme is
used. Conversely, the same characters might be represented
with different encoding values under different schemes. Both of
these issues lead to problems for the DNS, where names must be
unique and the user expects to be transported to the same
domain regardless of their input mechanism.
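Both directions of the encoding conflict can be reproduced with Python's standard codecs. This is a small sketch; the sample byte value and character are chosen only for illustration.

```python
# Same octets, different characters: the receiver cannot tell which was meant.
raw = b"\xb0\xa1"
as_gb = raw.decode("gb2312", errors="replace")
as_big5 = raw.decode("big5", errors="replace")
print(as_gb == as_big5)  # False

# Same character, different octets: one name yields several wire forms.
char = "中"
forms = {enc: char.encode(enc) for enc in ("gb2312", "big5", "utf-8")}
for enc, octets in forms.items():
    print(enc, octets.hex())
print(len(set(forms.values())))  # 3
```

An untagged 8-bit query therefore maps to different registrations depending on which encoding the server assumes.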
Protocol Extension Approach - The common advantage of using a protocol approach
is that the efficiency of the DNS is not compromised at all and
that there will be no ambiguity as to the exact characters a domain
name query is referring to. Also, with the introduction of an
extension, versioning and future extensions could also be built
in. In essence, a protocol extension approach is generally considered
a better long-term solution for multilingual domain names. The
common disadvantage of the protocol approach however is that it
requires changes and upgrades from both the server-end as well
as the client/application-end. This may result in the slower adoption
of the system.
ASCII Conversion Approach - The most prominent advantage of using an ASCII
conversion scheme is that no change is necessary at the server
end, because servers will continue to expect and react to requests
that are formatted within the specifications of the original DNS
standards. Conversely, the major disadvantage is that users that
wish to use multilingual domain names must consciously upgrade
their software to be able to reach the multilingual domains. The
average user however is not likely to be technically sophisticated
and would expect multilingual domain names to function the same
as English only ones. Also, it introduces an additional procedure
in domain resolution and takes away the feature of the DNS to
keep the transportation format consistent with the presentation
format of domain names.
Hybrid Approach - It makes most economical sense for implementers to tackle
the issue with an all-inclusive hybrid approach because the efforts
in development of the solution will not be wasted.
On one hand, the inclusion of a brute force approach ensures that
once multilingual domains are deployed, a good number of users
could immediately be able to access and utilize these names. On
the other hand, more alert or early adopters would likely embrace
the ACE technology and already have converters installed, therefore,
to take care of these requests, the database should include an
ACE formatted record. Finally, the system should be made aware
of an eventual protocol approach where the incoming packet would
effectively announce the exact encoding scheme and format of the
multilingual name. By developing a three-fold strategy, the implementer
may be able to assure that it will be prepared for any situation
that might transpire out of the dynamic standardization process.
As the Internet matures,
it should no longer be a purely technology push mechanism for
implementing new features, but should also consider the customer
pull factor. In the hybrid deployment model, first the brute force
approach is used so that registries could begin allowing registrants
to obtain functional multilingual domains and use them immediately
without any client-end reconfiguration. Only the registry name
servers and the registrant's hosting server need to be upgraded.
As the need for accessing multilingual domains increases, users
will become more aware and knowledgeable of the ACE approach,
which will provide a good consolidation towards a common
protocol and make administration much easier. Eventually, this
would encourage middleware and other applications to upgrade to
the protocol approach to make the entire process much more efficient
and truly multilingual aware.
The advantage of 8-bit
character transmission is that these approaches seem to be the
most simple and elegant solutions. These approaches allow fairly
direct representations of IDNs and may allow DNS data to be human-readable
for those with terminals capable of recognizing and displaying
the relevant character encodings. Unfortunately, although the
DNS protocol itself allows for the transmission of 8-bit domain
name data, many of the application protocols that rely on the
DNS were not designed to handle such domain name data, and these
protocols would likely need to be individually re-designed in
order to provide IDN capability.
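The point that the DNS protocol itself can carry 8-bit data follows from its wire layout, in which each label is a length octet followed by arbitrary octets. A minimal sketch, assuming the RFC 1035 name format without compression; the function name encode_name is illustrative:

```python
MAX_LABEL = 63  # octets per label


def encode_name(name: str, charset: str = "utf-8") -> bytes:
    """Encode a dotted name as length-prefixed labels (RFC 1035 wire layout)."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        octets = label.encode(charset)
        if not 0 < len(octets) <= MAX_LABEL:
            raise ValueError(f"label must be 1..{MAX_LABEL} octets: {label!r}")
        out.append(len(octets))  # length octet precedes the label data
        out += octets
    out.append(0)  # the zero-length root label terminates the name
    return bytes(out)


wire = encode_name("中文.example")
print(wire.hex())
```

Nothing in this layout rejects octets above 0x7F; the restrictions come from the applications and protocols that later handle the name as text.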
ACE-based approaches generally provide a high degree of compatibility, because they
continue to use RFC 1034 compliant labels to represent all DNS
data. Some ACE-based approaches have been designed which move
all IDN work to the application or local resolver, and as a result
require no modification of the name servers which are currently
running. Such an approach allows individual users to essentially
"opt-in" to the use of IDNs by installing updated software
on their computers without impacting other users or affecting
the stability of the network at large.
The more radical approaches
mentioned above offer the potential for significant elegance and
potentially large amounts of innovation, but the time to implement
such solutions is likely to be unacceptably long.
Strength: It can be realized with current protocols.
Weakness: It reduces the string size of each label. It requires
character set / encoding conversion.
Test bed case has the
following strengths and weaknesses.
(1) Users do not need
to download client software.
(2) No ACE conversion
is needed on the client side.
(3) The zone file
is readable for the administrator, so it is easy to maintain.
(1) Some Chinese
characters contain '\' or '@' codes, which cause problems for Internet applications
(such as BIND, Apache, and firewalls) that do not support clean 8 bits
The label length is
limited to 63 bytes.
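The effect of the 63-byte limit on name length can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: three UTF-8 octets per character (typical for Han characters), and an ACE that carries 16 bits per character in base32 behind a four-character identifier. Real ACE schemes add compression and header octets, so their figures differ somewhat.

```python
MAX_LABEL = 63  # octets per DNS label
PREFIX_LEN = 4  # a four-character in-label identifier such as "bq--"


def utf8_capacity(bytes_per_char: int) -> int:
    """Characters that fit in one label when sent as UTF-8."""
    return MAX_LABEL // bytes_per_char


def base32_capacity(bits_per_char: int = 16) -> int:
    """Characters that fit when carried as raw bits in base32 (5 bits per ASCII char)."""
    usable_bits = (MAX_LABEL - PREFIX_LEN) * 5
    return usable_bits // bits_per_char


print(utf8_capacity(3))   # 21
print(base32_capacity())  # 18
```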
ACE allows up to 17
to 19 characters under the best conditions. UTF8 allows 21 or
Labels are "visible"
outside of the resolver-nameserver context, e.g., in email headers
ACE is not ASCII preserving,
ACE transformations of names containing Latin characters not in
ASCII, or non-Latin characters, bear no resemblance to the original
name, even if only one non-ASCII character is present, e.g., umlaut.
This feature is absent
with UTF8, due to its ASCII preserving property.
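The ASCII-preserving property can be made visible by comparing the two forms of a name with a single umlaut. In this sketch, base32 over UTF-16 octets stands in for an ACE; it is an assumption for illustration and not an implementation of RACE. The helper names are illustrative.

```python
import base64


def utf8_view(label: str) -> str:
    """Show UTF-8 octets, printing ASCII octets as themselves."""
    return "".join(
        chr(b) if b < 0x80 else f"<{b:02x}>" for b in label.encode("utf-8")
    )


def base32_view(label: str) -> str:
    """Base32 over UTF-16 octets: an ACE-like form, not actual RACE."""
    encoded = base64.b32encode(label.encode("utf-16-be")).decode("ascii")
    return encoded.rstrip("=").lower()


print(utf8_view("müller"))    # m<c3><bc>ller
print(base32_view("müller"))  # no recognizable trace of the original letters
```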
Labels are "processed"
outside of the resolver-nameserver context, e.g., in email headers.
ACE encapsulates extensions
to ASCII in ASCII, resulting in no requirement for change to infrastructure
such as mail transport agents which route mail based upon domain
names in mail headers.
UTF8 requires changes
to infrastructure such as mail transport agents which route mail
based upon domain names in mail headers. This change is frequently
described as "8 bit clean processing".
a. Advantages: Break
through the limitation of current DNS protocol and size of domain
Disadvantages: Alters the DNS and relevant applications
b. Advantages: No need
to alter the current DNS, a technology with good interoperability.
Disadvantages: Alters relevant application of DNS, awkward expressive
performance in displaying and relevant applications, has more
limitation on the length of domain names.
c. Advantages: Leave
some of the difficult problems on the representation layer
Disadvantages: Need to modify applications related to DNS, need
to add additional directory inquiring transactions.
d. Disadvantages: the
application scale is not broad.
- DNS clients ACE
unaware would look up entries exactly
- DNS clients ACE
aware, would display IDN characters, but convert and submit
transparently in ACE format
- DNS servers would
be ACE unaware, and simply store ACE entries "as is",
therefore requiring no change
has been active on the Internet for 3 years, beginning with a
Pan-Asian test bed strategy; we stand behind our product and
can say with confidence that it runs across the current Internet
without causing it to break.
i-DNS.net is the Technology Enabler and Resolution Partner-of-Choice
to VGRS in the Multilingual Testbed. Our technology is in use
by major Registrars like Register.com, Melbourne IT (INWW), interQ;
as well as ccTLD Registrars handling .tv, .cc, .com.au, .la and
We have established
affiliations with industry organizations (IETF, APNG, APTLD, APIA,
PBEC, W3C) and in-country players worldwide. Most importantly,
i-DNS.net is the pioneer of the Internationalized Domain Name
System (iDNS), the First Registry for Fully Multilingual Domain
Names, the industry leader in terms of market penetration and
has the longest operating track record since 1999.
As part of our corporate
objectives, we aim to provide a technology that remains compliant
with the workings of the IETF; right now this means Client side
and ACE based, and I-DNS.net technology is fully compliant with
WALID have patented
an IDN client solution and we understand this has raised some
concerns to the IETF IDN working Group.
Native Names (and a
few others) provide Server Side solutions only, and would (in
our opinion) not be compliant with IETF.
Status Report: Server-based
- no client intervention required but long assimilation by servers
worldwide. Client-based - NAMEPREPed on application, as per IETF
but require installation.
Are there more problems relating to particular scripts? Why?
IETF IDN Working Group and the Unicode Consortium have been investigating
the complexities associated with introducing non-Latin scripts into
the context of DNS hostnames, and attempting to ensure that end-user
expectations are met. We fully support the work of these two expert
organizations in this area.
in these languages and scripts are in the best position to answer
general, scripts with more local encoding schemes are more problematic
initially for quick deployment of multilingual domain names. Other
language issues are local script dependent. For example, there is
the traditional and simplified Chinese issue. Part of the debate
is whether a folding or mapping should occur automatically and built
into the IDN protocol. This coupled with conflicting local character
encoding schemes also makes the deployment of Chinese, Japanese
and Korean scripts more difficult. Neteka's perspective on the Chinese
character folding issue is that it should be a policy matter and
controlled during registration and be dependent on the registry
policy. ICANN should however provide guidelines as to what the issues
are and suggest a number of alternatives to solve the problem. Other
languages also have their own language issues, such as Arabic, where
spaces within phrases change the meaning and the form of a character,
and Hebrew, where characters could be omitted.
the more different scripts are from traditional Latin scripts, the
more likely problems are to occur. Languages such as Chinese and
others that use the Han ideographs can be problematic due to the
sheer size of the character repertoire. Some languages have a large
number of encodings to represent essentially the same character
set, which can make it problematic to identify and transform raw
data into a common, universally understood format.
requires Unicode as its base character set, but many PCs use local
character sets such as JIS. This causes normalization problems due
to character set conversion, namely 1-to-N mappings.
The second byte of Big5-encoded characters can fall within the ASCII range,
so the DNS may return erroneous data (DNS software treats ASCII
characters case-insensitively)
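The Big5 problem can be reproduced with Python's standard "big5" codec. A minimal sketch: the helper names are illustrative, and the search range is simply the basic CJK block.

```python
def trail_byte_is_ascii(ch: str) -> bool:
    """True if the second Big5 octet of ch falls in the ASCII range."""
    encoded = ch.encode("big5")
    return len(encoded) == 2 and encoded[1] < 0x80


def first_case_sensitive_char():
    """Find a character whose Big5 octets change under ASCII case folding."""
    for cp in range(0x4E00, 0x9FA6):
        try:
            octets = chr(cp).encode("big5")
        except UnicodeEncodeError:
            continue  # not representable in Big5
        # bytes.lower() changes only octets 0x41-0x5A, as an ASCII
        # case-insensitive comparison would.
        if octets.lower() != octets:
            return chr(cp), octets
    return None


print("功".encode("big5"))  # b'\xa5\\' : the second octet is a backslash
print(first_case_sensitive_char())
```

Software that folds or escapes ASCII octets without knowing the encoding will therefore alter such characters or match the wrong record.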
European scripts, normally
comprehensible with diacritical simplification become "ASCII
gibberish" under ACE transformation.
Scripts which use European
characters and non-European characters have very poor ACE length
properties, as ACE length optimization relies upon code page utilization.
a. Sequence of Chinese
Domain names: Chinese language is totally different from Latin
in word and sentence structure.
b. The Simplified -
Original Chinese character mapping: Chinese characters have two
written forms that correspond to each other. For Chinese people,
this kind of correspondence is as important as case folding
is to ASCII domain name users.
c. The correspondence
between "." & "。". This problem arises from
the characteristics of Chinese input methods,
and it should be resolved to fulfill users' needs.
(e.g. traditional and simplified Chinese, Arabic space phrasing,
Hebrew omitted characters), could be resolved by manual entry
of the multiple versions into the DNS at registration time
The IETF working group
places special emphasis on conversion from Local -> Unicode
-> ACE. It is script (not language) based, and the NAMEPREP
process deals with the various nuances of each script. Standing
back, we see the following issues:
Some scripts have no
INPUT METHOD ENGINE (IME), which means that it is impossible to
enter the language into the computer (e.g. some Indian scripts).
Along a similar theme, some scripts lack a font renderer, so they
cannot easily be displayed on the screen
Some languages have
yet to have standardized scripts
Some languages have
multiple scripts, in which case NAMEPREP needs to support consistent
canonicalization and case folding (often an area of political
debate amongst different linguists)
Status Report: As above
To the extent there are weaknesses in the technologies, what groups
are working to develop solutions?
The group most active
in addressing the need for technical standards to support IDNs
is the Internet Engineering Task Force (IETF). The IETF IDN
Working Group has made considerable progress in the past year
in defining an overall set of technical and operational requirements
for internationalized domain names, has vetted a broad set of
technical proposals, and has chosen an approach consistent with
those requirements. Many IETF participants are also active in
the Unicode Consortium, the W3C, and other standards bodies,
and the IETF IDN Working Group and IDN community as a whole
benefits from their experience, coordination, and support.
The Multilingual Internet Names Consortium (MINC) has also been active in developing
policy in the IDNs area, as well as providing a forum for performing
interoperability testing. While this has been somewhat less
successful, MINC provides a good forum for representing the
interests and needs of its broad constituency and can support
the efforts of the IETF and other technical standards bodies.
As MINC moves forward with its mandate, we expect to see MINC
play an important role in supporting the deployment of internationalized
domain names and in promoting cooperation, compliance, and interoperability
between the systems that are deployed today.
share the opinion of others in the IETF IDN Working Group that
the issue that should be tackled is internationalizing domain
names, not internationalizing the DNS protocol. Thus the issue
is broader than some "quick fixes" or partial solutions
advocated by some technology providers, such as simply upgrading
DNS clients and servers. Any complete IDN solution must involve
end-user applications, such as web browsers, as well. The IDN
Working Group is developing standards for IDN and is, we believe,
the primary focal point for a complete solution.
Neteka's DNSII (www.dnsii.org)
and OpenIDN (www.openidn.org)
initiatives encourage and allow more people to be involved in
this important transition on one of the core technologies of
the Internet. DNSII is a forum for discussing different multilingual
approaches and currently archives Neteka's proposals. OpenIDN
is an open source multilingual DNS, allowing interested parties
to try out multilingual names as well as use the source code
to enhance the features on their own.
IETF is mainly concerned
with the protocol and tries to determine which approach to use
and what the eventual format should look like.
MINC is a quasi-iDNS
initiative started by iDNS advocates. The discussion includes
both protocol issues as well as language or policy issues. In
Neteka's view, both of these functions are already carried
out by IETF and ICANN, and the responsibility should really go
back to these two bodies, which have a more comprehensive
view of the problems and could therefore provide better results.
IETF continues to work on a variety of issues surrounding the
IDN problem space.
IDN Taskforce, JP-CN-KR-TW NIC's Joint Engineering Team and IETF
Chinese technology task force, CDNC, JET, IETF, etc.
The Unicode Technical
Committee and ISO JTC1/SC2/WG2 (character sets) and JTC1/SC2/WG22
(internationalization) are working on deficiencies in the ISO
10646 character repertoire, normalization, and canonicalization.
The IETF IDN Working
Group is working on the name preparation (nameprep) step. It
is also working on selection of the better ACE algorithm, and
on the infrastructural work UTF8 requires.
The W3C Internationalization
Activity is working on internationalization of the URI namespace.
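The need for a name preparation step can be illustrated with a minimal sketch. It is not the nameprep profile under development in the IETF; it only assumes that preparation involves Unicode normalization plus case folding, and approximates that with Python's standard `unicodedata` module. The function name `prepare_label` is hypothetical.

```python
import unicodedata


def prepare_label(label: str) -> str:
    """Approximate name preparation (an assumption, not the IETF profile)."""
    # NFKC folds compatibility forms (for example fullwidth letters) and
    # composes base letters with combining marks into one canonical form.
    normalized = unicodedata.normalize("NFKC", label)
    # Case-fold, then normalize again because folding can yield new sequences.
    return unicodedata.normalize("NFKC", normalized.casefold())


composed = "caf\u00e9"      # 'é' stored as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# The two strings look identical on screen but differ as code point sequences.
print(composed == decomposed)
# After preparation they compare equal, so they would map to one identifier.
print(prepare_label(composed) == prepare_label(decomposed))
```

Without such a step, two visually identical names could be registered as distinct identifiers.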
CDNC, JET, MINC, I-DNS, VeriSign, etc.
There are 2 key groups
focusing on solution, each working at different levels. I-DNS.net
is an active contributor to both groups.
IETF (working on
on operational policies)
Status Report: Also,
UC, CDNC. Some suggest local/regional control over problems
specific to particular language/region
What are the different solutions under consideration? Which are the
most promising? How much longer will it take to develop a solution that
The current proposal
before the IETF IDN working group is "Internationalizing
Domain Names in Applications (IDNA)." From a technical
standpoint, we understand that the WG has established rough
consensus around the core concepts of normalization and transformation
taking place within the application. Assuming that certain non-technical
issues are resolved, the IETF could have a standard ready by
the end of 2001.
The consensus in
the IETF IDN working group is not complete, however, and some
have suggested that the working group is failing to consider
questions relating to language and language use, and the expectations
of end-users of the DNS. While these questions are certainly
important, we are not convinced that they concern issues that
can or should be solved by the DNS. Many participants feel that
these questions are outside of the scope of the charter of the
IETF IDN working group, which is focused on enabling use of
non-Latin scripts in the DNS, and thus should be addressed separately.
work of the IDN Working Group is public; more information is available
at http://www.i-d-n.net/. The most promising proposal is called
IDNA (Internationalized Domain Names in Applications), which calls
for applications to convert IDNs to an ASCII-only "safe"
format using an ACE (ASCII Compatible Encoding). More details
are available in the IDNA Internet-Draft at http://www.i-d-n.net/draft/draft-ietf-idn-idna-01.txt.
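As a rough illustration of application-side conversion, the sketch below uses Python's built-in `idna` codec. That codec implements the IDNA standard as later published (RFC 3490, with Punycode as the ACE), which is an assumption relative to this survey: the draft cited above had not yet settled on an ACE algorithm. The helper names `to_ace` and `from_ace` are hypothetical.

```python
def to_ace(domain: str) -> str:
    """Convert a domain name to its ASCII-only form, label by label."""
    # The codec prepares each label and encodes non-ASCII labels as ACE.
    return domain.encode("idna").decode("ascii")


def from_ace(domain: str) -> str:
    """Convert an ACE-form domain name back to Unicode for display."""
    return domain.encode("ascii").decode("idna")


print(to_ace("bücher.example"))          # xn--bcher-kva.example
print(from_ace("xn--bcher-kva.example"))  # bücher.example
```

The DNS only ever sees the ASCII form, which is why this approach needs no change to servers or to the protocol.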
IETF - while a good
number of proposals have been presented to the IETF, until recently
discussions centered on the IDNA (IDN in Applications) approach. This,
however, collides with a patent issued to WALID. Recent discussions
have included ways to work around the patent as well as hybrid
Neteka - Neteka is
a proponent of a hybrid approach to ensure that the migration
is transparent to the end user and smooth for the operators.
We believe this is the most promising approach in that it already
works for the majority of the people on the Internet immediately.
It also provides a clear path towards the longer-term approach
where the entire Internet will become fully multilingual aware.
Neteka's system is also compatible with email addressing systems
and Neteka already has the technology to introduce multilingual
iDNS - the iDNS Proxy
solution assumes that multilingual domain names are redirected
to the iDNS servers for resolution. This creates a bottleneck
for the system and introduces unnecessary complications.
WorldNames - as far
as Neteka understands, WorldNames' NUBIND, currently implemented
at the dotNU registry, is essentially a redirector technology
and multilingual names registered using this system could not
be utilized for email addresses.
indicated above, there are a number of solutions currently under
consideration. Currently, the [IDNA] solution proposed within
the IETF's IDN working group seems extremely promising; recently,
however, intellectual property concerns have slowed the development
of that particular approach. More generally, ACE-based solutions
seem to have the greatest traction and operational experience
to date, and the advantages that they yield in backwards compatibility
are probably a strong argument in their favor. A final solution
to this problem space still seems to be at least six months away.
other than the above is thought of.
Common Name Resolution Protocol + DNS solution:http://www.ietf.org/internet-drafts/draft-ietf-cnrp-09.txt
(2) Depending on
when the IETF finalizes the RFC, after that, it would take 1 or 2
See question 1. The
most promising approach is binary labels with nameprep. This
has been implemented and deployed in the .CN ccTLD. Providing IDNs
fully, globally, and seamlessly requires changes to host operating
systems and to core internet infrastructure. The general rule
for major feature changes to operating system products is one
or more years after announcement to new feature general availability,
and one or more years after announcement to old feature withdrawal.
period is shortened by incorrectly posing the question in the
form of a surf-the-web model. The internet is more than the
a. Multi 8-bit encoding (long term): need to renew all the DNS and relevant applications
b. ACE (short term): need to renew all the applications related to DNS
c. Directory Services (long term): need to renew all the applications related to DNS
ACE type encoding
We see the IETF IDN
group as THE focal point for development of standards, and so
far they have made considerable progress. They have made the
strategic choice of Client vs. Server side (easier to implement,
and in line with IETF philosophy that the Internet is a dumb
network carrying bits; intelligence is located on the edge).
They have also worked
out how to convert local language into ASCII form for carrying
over the net (Local → Unicode → ACE), and the I-DNS
technology solution is compliant with this. We believe the approach
set out by IETF works NOW - our technology uses it and resolves
across the Internet.
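The conversion chain from a local encoding through Unicode to ACE can be sketched as follows. This is an illustration under stated assumptions, not I-DNS's implementation: it uses Python's standard `big5` and `gb2312` codecs for the local step and the built-in `idna` codec (Punycode, standardized later) for the ACE step. The function name `local_to_ace` is hypothetical.

```python
def local_to_ace(raw: bytes, local_encoding: str) -> str:
    """Turn a name typed in a local encoding into an ASCII-only name."""
    unicode_name = raw.decode(local_encoding)            # local -> Unicode
    return unicode_name.encode("idna").decode("ascii")   # Unicode -> ACE


# The same name as it might arrive from two differently configured clients.
big5_bytes = "中文.example".encode("big5")
gb_bytes = "中文.example".encode("gb2312")

# The byte sequences differ on the wire...
print(big5_bytes != gb_bytes)
# ...but both resolve to the same ASCII-only name once routed through Unicode.
print(local_to_ace(big5_bytes, "big5") == local_to_ace(gb_bytes, "gb2312"))
```

Routing every local encoding through Unicode is what lets clients with different code pages agree on a single name.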
We believe that there
will never be "just one solution" or "just one
standard". Rather we see that over time, the basic approach
adopted today by IETF will be refined, even improved, much as
the Internet has improved over time. Standardisation is not
a "first past the post" race, rather it is a process
of continual improvement, based upon a sound strategy - which
we believe is more or less in place through the IETF IDN.
Critical to any solution
provider will be that their solutions are "interoperable"
with other providers. Critical to any customer, is that they
are aligned with a technology provider that keeps their solution
in line with standards as they develop.
i-DNS.net is a Technology
Partner to VGRS and its iDNS, and our technology is compliant
with the current standards as defined by the IETF working group.
Status Report: IETF
(IDNA), WALID (patent), Neteka (working on hybrid client/server)
Currently there are no accepted standards for IDN. Is this because there
are competing technologies, or because the underlying problem is sufficiently
difficult that a "best" solution has not yet emerged?
believes that competition is a healthy and necessary part of the
development of any emerging industry, and a useful tool for providing
real-world experience concerning the viability of various approaches
to solving a given technical problem. The 'IDN Subject' is certainly
a complex one, and some have characterized it as one of the most
difficult challenges that the Internet technical standards community
has faced. The IETF and other standards bodies have made extremely
good progress in addressing it within a relatively short time.
IETF IDN Working Group is moving relatively quickly to produce
an IDN standard.
While competing technologies
imply that there is no de facto standard, it is because some
initial attempts are not satisfactory that competing technologies
arise. This is therefore a multifold issue: first of all there
is a differing opinion on what the "best" solution
should be. The underlying problem is sufficiently difficult
in that there have to be compromises, and a decision can only
be made by giving more consideration to some key issues
and focusing less on others. Unfortunately, it is very difficult
to build a consensus on which of the many issues these
"key issues" should be. There are really three main camps:
a. System administrators/operators
- this group generally has the view that the allocation of
pain should be on the user and that it is absolutely important
that the servers are not threatened by multilingual requests
even though they might not break down. They also view that
server-side migration would be lengthy.
b. End users -
there are two groups within this sector: the registrants and
the users of domain names. Both of these groups are eager
to have functional multilingual domain names without requiring
client reconfiguration. They expect multilingual names to
work exactly like English only names and will be confused
and frustrated if they are not. They are also technically
less sophisticated and may not understand why and what needs
to be done to get multilingual names working. Therefore they
also believe that the allocation of pain should be on the
server end where the technical expertise is.
- these are the design engineers and architects who believe
that a long-term solution should be made extensible and cater
not only to the operators but also to the end users. They have the
view that eventually both servers and clients should be upgraded
to enable a fully multilingual Internet. The servers should
be first as that is where the technical expertise is, while
the client end would slowly migrate as new applications are
introduced. Meanwhile existing applications should also be
able to access multilingual names.
real problem is that there is no ideal solution. All proposed
solutions to date have drawbacks, and it has been difficult to
develop consensus about which of these drawbacks is the most tolerable.
The underlying problem is indeed an extremely difficult one, and
even if a "best" solution has emerged, it will take
time and careful study in order to recognize and adopt it. Also,
because of the critical nature of the DNS to the Internet community,
it is important to develop an in-depth understanding of the pros
and cons of all possible solutions, and to move towards adoption
in a manner that does not jeopardize the stability of the Internet.
IDN WG has come to consensus, as noted in the answer to Q1, so the WG is finalizing
the proposed technologies and will move the result onto the standards
track. The most worrying hurdle in the process is WALID's patent
problem is sufficiently difficult that a "best" solution
has not yet been accepted.
the problem itself is sufficiently difficult, the solution that
can be generally adopted by the international societies has not
See 5 above, we believe
it is inappropriate to see the IDN world as only having "one
standard". IETF have already defined a standards framework;
Client side, ACE based.
They are finessing
this, e.g. we have recently seen options for DUDE over RACE
debated, and we fully expect this polishing process will continue
long into the future. One expects "standards" to be
set, and then to improve (e.g. medical, emissions, noise); the
Internet will not be any different in our opinion.
continual improvement does not mean that sufficiently robust
technologies cannot be provided over the Internet. Clearly they
have to conform to the standards of the day, and need to be
From a customer standpoint,
suppliers need to keep up to date and, most importantly, make
THEIR customers aware of what it is that is being purchased,
and any issues that may impact the use of that product/service.
Again, we see that the Internet should not be any different
to the normal market place.
IDN Standards have
sufficiently crystallized to the point where solutions can be
safely deployed over the Internet, to address the significant
market pressure that has built up in non-English speaking countries.
Status Report: As above
Do the existing "testbeds" and pre-registrations help or hinder
the resolution of the technical issues relating to IDN? In what manner?
Would the testing impact the ongoing operation of the Internet?
are an important mechanism for gathering useful operational experience
in this area, and help to gauge demand and user expectations for
IDNs. Some of the testbed projects underway have been very careful
to not disturb the use of the existing DNS, while others have
not been as focused on the operational stability requirements
of the network.
A testbed that supports
the IDN standard development process, such as VeriSign's testbed,
is helpful. For example, the VeriSign GRS IDN testbed has offered
technical feedback to the IDN Working Group on the complexity
of the Row Based ASCII Compatible Encoding (RACE) algorithm
(one possible ACE). Partially based on this feedback, the IDN
Working Group has decided that RACE is not suitable for use
in the eventual IDN standard.
In addition, the
VeriSign testbed has been conducted in a progressive, phased
approach. This allows for the completion of predefined milestones
before moving to subsequent phases and thereby reduces the possibility
of creating DNS stability problems.
It is difficult to
imagine how a testbed could interfere with the operation of
the Internet. It is highly unlikely that even a testbed that
uses domain names in a binary format (unlike VeriSign's testbed)
would negatively impact the Internet's DNS infrastructure (including
the root and gTLD name servers). Because so many applications
already send DNS queries in one binary format or another, the
root and gTLD name servers are already deluged with such queries
as part of the normal DNS resolution process, all with no impact
aside from the additional volume.
Depending on how
the domain resolution strategy is eventually deployed, pre-registrations
should not hurt the introduction of multilingual names. Other
so called functional "testbeds" may hinder the progress,
especially the establishment of alternative namespace beyond
that recognized by ICANN. This is a very serious issue as these
"testbeds" would redirect all multilingual requests
to their own alternative namespace meaning that even if later
on the existing namespace introduces multilingual names, the
requests under the "testbed" system will be routed
to the alternative namespace causing confusion. Pre-registration
however is safer as it essentially means that the multilingual
name is only stored in a database and not being used. Any technical
solution could be deployed later for domain resolution. It also
serves to be an indicator of user demand. Even when users know
that these names do not work, a lot of people are registering
for them in the hope that they will be able to use them soon.
Beyond the "testbeds"
and pre-registrations, Neteka in fact views the faulty implementations
in existing browsers and unnecessary blockages at proxies,
cache servers and firewalls as an even larger hindrance to the
implementation of multilingual domain names. Please refer to
section A:16 for more information.
The operational experience
gained from legitimate testbeds can be extremely helpful in
moving solutions from theory into practice. Due to the large
number of commercial interests in play, however, some of these
testbeds might be seen as attempts to force the Internet community
to accept certain technologies regardless of their appropriateness
or quality. Generally, policy considerations have lagged behind
technology in the IDN space, and as a result there have sometimes
been inadequate assurances that testbeds serve the internet
community by providing valuable operational experience as opposed
to benefiting certain commercial interests at the expense of
should not affect the ongoing operation of the Internet. It
is important that end users' expectations of these testbeds
be managed carefully, however; these users may be under the impression
that the testbeds are an operational portion of the Internet,
and may view technical failures within the testbeds as operational
problems rather than a normal part of the testing process.
provides a lot of 'real samples' to evaluate proposed technologies
such as ACEs, which are useful for identifying issues. The impact on
the operation is that DNS or Web server operators must learn how
to convert IDN to ACE. Testing provides a good opportunity to learn
it. Testing also provides much information about IDN to end-users,
engineers, developers, and service providers.
promotion on a test bed product is not good. It is better to provide
service until the standard of IDN is ready. But if the local testbed
does not influence the Internet stability, it would be helpful for
of proprietary products hinder the resolution of technical issues
relating to IDN. The "sale" of over one million RACE
names removed an estimated USD$35,000,000 from the pool of capital
available for funding work on IDN enablement, and created unrealistic
expectations in speculators, trade mark owners, ICANN, MINC,
and even the IETF, that a solution was both "easy"
and "soon". This is "mindshare capture".
The testing question
in the context of "testbeds" and pre-registrations
is a non sequitur. There is no program for complex system testing
of IDNs in the DNS or in the internet infrastructure, there
is only string warehousing, and worse.
The only technically
defensible testbed is the .CN operational support of UTF8 encoded
Traditional and Simplified Chinese characters.
benefits for testing the feasibilities of various technical resolutions.
It can also promote the development of IDN to some extent.
The use of the word
"test bed" has some very negative connotations when
used by some industry players. Clearly there is some political
debate here over the definition of a test bed -- is it a pure
controlled lab experiment run by scientists, or is it an applied
market focused approach run by marketers? In our opinion this
is a philosophical and perhaps political issue, and it may be
holding back deployment of ML solutions, without adding any
value to the ML debate.
The Testbed process
is helping prove that the technology can work; it flushes out
areas where additional development is required for other ancillary
services (e.g. ML hosting); pre-registrations also give a gauge
to the level of market interest.
Market demand will
not wait for a technically perfect solution (we expect to see
a sufficiently robust and workable solution deployed, not one
"final perfect" solution). The test bed, with its
explicit disclaimers on non-performance (very important to advise
Customers on what they are buying and any related issues), offers
a good sandbox environment in which to test the technology's
ability to administer and handle the diverse requirements. The
feedback and analysis of testing here is invaluable to technical
Status Report: As
Are natural languages so complex, rich and varied that a true IDN system
that responds completely to user expectations is beyond current technological
capability? Can the problem be solved incrementally in a manner that
does not interfere with the operation of the entire domain name system?
IDN problem in our view is not one of natural language, but rather
one of adding support for a wider range of scripts to be used as
identifiers in the DNS. As such, issues involving natural language
and the often context-sensitive expectations of users are outside
of the scope of the IDN-related efforts currently underway. Some
within the community have proposed creating directory service layers
above the DNS to meet the expanding needs, and we strongly encourage
and support any work in this direction. Natural language issues
are language- and locale-specific, and any proposals to address
them should be developed based on participation by native members
of the locale as well as general linguistics expertise.
The IDN Working Group
is already developing a technical solution to support a true IDN
As noted above, we
support the introduction of IDNs in a phased manner that does
not risk interference with the operation of the DNS.
This depends on the
perspective of what constitutes a "domain name". Some
technical persons maintain that a domain name is nothing more
than a string of characters for the identification of a resource
over the Internet. Neteka however believes that domain names have
evolved from their origins to represent an identity of a person
or a corporation on the Internet, whether it is being used as
part of an email address or simply a web address. Natural language
rules can definitely be introduced to the DNS; Neteka's technology
has shown that the use of phrases, punctuation and even spaces
is possible. Therefore a fully natural language domain name is
However, it is important
to also understand that the domain name system is useful because
of unique names and this rule should not be violated or confusion
would occur. The same phrase must result in the same resource
regardless of which locale or platform it is accessed from. This
means that certain user education is required to understand that
Mikeshoes.tld may not be the same Mikeshoes as the one in your local mall.
domain name system was never intended to serve as a directory service
with the capability to consistently find the appropriate result
to a natural language query. Although the original design of the
DNS includes certain characteristics which are designed to reduce
language-related errors (for example, case folding, or even the
original limitation of domain names to use only ASCII characters),
it still is not capable of distinguishing between variants of words
(e.g. "color" versus "colour") or appreciating
the other subtle nuances of language. Regardless of the IDN solution
that eventually emerges, it will be important to educate users regarding
the use of the Internet. A good IDN solution will not solve natural
language problems, but will allow many more users to take advantage
of the Internet using their native language and their native character
believe that IDN doesn't introduce 'languages' to DNS, but introduces
non-alphanumerical scripts or characters.
natural languages will not be a domain name; users may use natural
languages on a search engine to find data. But proper normalization
of DNS is required even though it is very difficult.
DNS uses identifiers, not languages or words, though lots of words
in lots of languages are used (in ASCII transliteration) in the
DNS as identifiers. DNS labels are not general purpose objects for
unconstrained writings, and the question reflects the (unfortunately
widespread) misunderstanding concerning the fundamental properties
of "identifier and lookup", wishfully substituting "word
and search" in their place.
can find a progressive way to solve the problem.
DNS is "identifier" only. Natural language queries could
use web (http/url) or LDAP search to return identifier, for subsequent
lookup in DNS system
Natural languages are
complex. However, a workable and acceptable solution (i.e. one
where non-English speakers can use their own language for domain
names) does not need to perfectly cater for every single nuance
in a language.
Compromises can be
found (such as the substitution of space by the hyphen in ASCII
domain names). In our opinion, most languages could be implemented
in a way that provides very high utility over ENGLISH by dealing
with 95% of the language issues - the remaining 5% would be out
of bounds and not catered for.
Once more, a clear
expression to the consumer is required so that they understand
what they purchase.
it has developed a solution, which adequately satisfies market
expectations of a true IDN system. And yes, it can progress incrementally
without disrupting existing DNS operations.
Status Report: Dichotomy
between identifier and identity
How do different technologies affect the size limitation of domain names?
What, if any, are the possible solutions?
Domain name segments
are limited to 63 octets per segment, and an overall domain name
length of 255 octets. In the context of the ACE-based proposals,
Unicode codepoints can expand to multiple octets, thus reducing
the number of actual non-Latin characters that can be used in
a domain name. Even in non-ACE proposals (particularly those that
rely on UTF-8) this same issue exists.
There are a number
of proposals under consideration by the IETF IDN working group
to address this issue through efficient encoding of Unicode sequences.
The challenge in this area is to find an encoding algorithm that
is both very efficient yet relatively simple to describe and implement.
The current draft before the IETF IDN Working Group ACE design
team comes very close to meeting these requirements.
names are limited to 255 octets in length and individual labels
(i.e., between periods) are limited to 63 octets. This is a fundamental
limitation of the DNS protocol and cannot be changed without altering
the DNS protocol. Different representations of different character
sets require more or fewer octets depending on their design. For
example, UTF-8 is a variable length encoding of the Unicode character
set. In a given number of octets, some scripts require more space
than others. The IDN Working Group has been sensitive to this issue
during the design of the various ACE algorithms that are candidates
for inclusion in the final IDN standard. A requirement of the final
ACE algorithm is a roughly equal treatment of all scripts in Unicode.
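The uneven cost of UTF-8 across scripts can be checked directly. The sketch below is only an illustration: it picks one sample character per script (the choice of samples is an assumption) and reports how many such characters would fit in a 63-octet label if the label carried raw UTF-8.

```python
MAX_LABEL_OCTETS = 63  # per-label limit cited in the responses above


def utf8_label_octets(label: str) -> int:
    """Number of octets the label occupies when encoded as UTF-8."""
    return len(label.encode("utf-8"))


def fits_in_label(label: str) -> bool:
    """True if the UTF-8 form of the label is non-empty and within the limit."""
    return 0 < utf8_label_octets(label) <= MAX_LABEL_OCTETS


# One sample character per script; these are illustrative picks only.
samples = {"Latin": "a", "Greek": "α", "Arabic": "ع", "Han": "中"}
for script, ch in samples.items():
    per_char = utf8_label_octets(ch)
    print(script, per_char, MAX_LABEL_OCTETS // per_char)
```

Under these assumptions a Latin label holds 63 characters, a Greek or Arabic label 31, and a Han label 21, which is the kind of imbalance an ACE design would try to even out.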
Brute Force Approach
- utilizes existing packet format therefore will only allow
a maximum of 63 bytes. Depending on the byte length of the character
encoding scheme used, the number of characters possible could
range from 63 to 15.
Approach - new size limit could be introduced so length can
become a non-issue.
Approach - utilizes existing packet format. Depending on compression
scheme, domain length per label ranges between 15 - 20 characters.
they transform eight bit characters into what is approximately a
five bit (37 possible values) storage format, ACE-based solutions
generally reduce the number of native characters that may be present
within a single DNS label. Most of the existing ACE proposals contain
compression mechanisms in order to increase the size of the native
domain name as much as possible.
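The "approximately five bit" figure, and its effect on label capacity, can be worked through numerically. The sketch assumes a 37-symbol alphabet (letters, digits, hyphen), a 4-octet ACE tag at the start of each label, and a plain base-32 payload with no compression. The tag length and the payload scheme are illustrative assumptions, not the format of any specific proposal.

```python
import math

LABEL_OCTETS = 63            # DNS per-label limit
LDH_SYMBOLS = 37             # 26 letters + 10 digits + hyphen
ASSUMED_PREFIX_OCTETS = 4    # assumption: ACE tag marking the label as encoded


def bits_per_ldh_octet() -> float:
    """Information one case-insensitive letter/digit/hyphen octet can carry."""
    return math.log2(LDH_SYMBOLS)


def uncompressed_capacity(bits_per_native_char: int,
                          bits_per_ace_char: int = 5) -> int:
    """Native characters that fit in one label without any compression."""
    payload_bits = (LABEL_OCTETS - ASSUMED_PREFIX_OCTETS) * bits_per_ace_char
    return payload_bits // bits_per_native_char


print(round(bits_per_ldh_octet(), 2))   # 5.21
print(uncompressed_capacity(16))        # 18
print(uncompressed_capacity(8))         # 36
```

Under these assumptions a label of 16-bit characters holds about 18 characters, which is why the proposals add compression to recover some of the lost length.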
answered in Q2, ACE reduces the size of each label. Therefore ACE
must involve an effective compression algorithm. JPNIC is evaluating
many ACEs and contributing to IDN WG ACE team.
is more length limitation on ACE encoding. Native encoding (local
encoding like big5) has less length limitation on domain names.
the response to Question 2.
Both multi 8-bit encoding
and ACE can affect the size limitation of domain names.
One of the resolutions
is to expand DNS protocol.
labels longer than 63 chars could be resolved through using more
levels, one per word or syllable for instance. 255 chars overall
should be adequate
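The suggestion of spreading a long name over more levels can be sketched as below. This is a hypothetical illustration: the helper names are invented, labels are assumed to be already in ASCII (or ACE) form, and the overall limit is checked against the wire format, where each label costs one length octet plus its content and the name ends with a single zero octet for the root.

```python
from typing import List

MAX_LABEL_OCTETS = 63
MAX_NAME_OCTETS = 255


def wire_length(labels: List[str]) -> int:
    """Octets the name occupies on the wire, including the root label."""
    return sum(1 + len(label.encode("ascii")) for label in labels) + 1


def split_into_labels(phrase: str, suffix: str) -> List[str]:
    """Use one label per word of the phrase, followed by the suffix labels."""
    labels = phrase.lower().split() + suffix.split(".")
    for label in labels:
        if not 0 < len(label.encode("ascii")) <= MAX_LABEL_OCTETS:
            raise ValueError(f"label out of range: {label!r}")
    if wire_length(labels) > MAX_NAME_OCTETS:
        raise ValueError("name exceeds 255 octets")
    return labels


labels = split_into_labels("the quick brown fox", "example")
print(".".join(labels))       # the.quick.brown.fox.example
print(wire_length(labels))    # 29
```

Each extra level costs one additional octet, so the 255-octet ceiling still applies however the name is divided.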
The ASCII DNS already
imposes size limitations, probably as its original designers never
imagined that it would be used today to identify names of companies
etc. The IETF strategy (Local → Unicode → ACE) often
requires more than one ASCII byte to represent each character
so this means a ML DNS does not allow such long strings.
Compounding this, some
countries do not use acronyms to the extent that we do in the
English speaking world (e.g. Arabic), preferring to register the
full company name.
IETF's proposals do
deal with the size issue, and it is important to note that DUDE
has some advantage over RACE in this regard. However, it is also
important to note that regardless of technology restrictions,
customers need to understand what is and is not achievable in
a ML DNS. The fact that a ML DNS implementation does not cater
for the full 63-character DNS label will not invalidate its utility.
Perfection is a laudable design goal, but in reality most countries
just want to use their language now, and 80% is better than 0%.
Status Report: Existing
Latin-based DNS limitations
Do IDNs pose special problems for the technical operation of WHOIS databases?
If so, what problems? What are the possible solutions?
to the WHOIS public registration databases tends to be provided
in two ways: via web-based interfaces, and through the TCP port
43 whois/nicname service. One of the challenges for operating
a WHOIS database will be in ensuring that queries arrive in a
form that can be accurately matched against the database contents.
WALID considers that a positive solution would be to use the IDNA
approach and upgrade the deployed 'WHOIS' clients. These upgraded
applications would need to normalize and transcode IDNs into their
ACE equivalents, and then use the transformed name as the query
to the WHOIS server. This is a strength of the IDNA approach,
in that it addresses not only the question of IDNs in the DNS,
but also in all of the applications, such as WHOIS, which use
domain names as application protocol elements.
No, although WHOIS
services must be internationalized if the domain names they
hold are internationalized. One possibility is internationalizing
the WHOIS protocol itself, along with clients and servers. Another
is adopting the IDNA approach: IDNs would be stored in an ACE
format and WHOIS clients would be required to convert internationalized
user input into ACE format before querying a WHOIS server.
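The client-side conversion described above can be sketched as follows. This is a minimal illustration, not VeriSign's tool: it assumes the server stores names in the ACE form produced by Python's built-in `idna` codec (Punycode, standardized later) and accepts a plain query line on TCP port 43. The function names and the server argument are hypothetical.

```python
import socket


def build_whois_query(name: str) -> bytes:
    """Convert internationalized input to ACE and frame it as a query line."""
    return name.encode("idna") + b"\r\n"


def query_whois(server: str, name: str, timeout: float = 10.0) -> str:
    """Send the ACE-form query to a WHOIS server (hypothetical usage)."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(build_whois_query(name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Only the query construction is exercised here; no network access is needed.
print(build_whois_query("bücher.example"))  # b'xn--bcher-kva.example\r\n'
```

Because the conversion happens before the query leaves the client, the server and the protocol stay unchanged.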
VeriSign GRS is presently
developing an IDN Whois service. In the interim, an IDN conversion
tool is provided.
domain names should not present special problems not encountered
by the domain name server. Depending on the approach used, WHOIS
databases may, however, need to be upgraded to handle multilingual
requests. For example, if a protocol extension approach is used,
the WHOIS side should determine whether the mode bit is required
or whether it should force all requests into a standardized format.
IDN problems should not significantly affect the operation of
the WHOIS database. It may be necessary to display WHOIS data
in non-Latin scripts, but this problem can largely be viewed independently
of the IDN effort.
problems of WHOIS are expressions in query and display. The short-term
solution is to deploy IDN-aware whois clients. The long-term solution
is to improve the WHOIS protocol.
WHOIS database cannot accept clean 8-bit data or queries. The problem
could be solved if the IETF finalizes the standard for WHOIS database
support of IDN as soon as possible.
problem, either raw ACE encoding, or UTF-8 for descriptive references
WHOIS is a very old
product, probably as much used as it is maligned. Its roots
go back to the days of ASCII, and a world without browsers and
Registries based on database engines. The Internet community
is separately debating what it wants its WHOIS to be.
However, WHOIS has
become synonymous with data base searching for a Domain Name,
and so the ML solution has to address this.
With ML, there is
no longer one standard encoding, so the WHOIS product needs
to understand, detect and handle multilingual queries. From
a customer standpoint, it is important to stress that this does
not only mean catering for an ML domain name, but local language
CONTACT DETAILS too.
The i-DNS.net's implementation
of the WHOIS database is already IDN-aware.
Status Report: Whois
must be internationalized on server side or IDNA'ed client-side.
Long term server side solution preferred.
Are any IDNs related technologies covered by patents or other intellectual
property rights? If so, will this have an effect on the implementation
We understand that
there are a number of granted patents and patent applications
that cover various areas relating to internationalized domain
names, including U.S. Patent No. 6,182,148, which was issued
to WALID, Inc. on January 30, 2001, a related PCT application
by WALID, and at least one pending patent application by i-DNS.Net.
We consider that intellectual property rights need not impede
implementation of IDNs, and may even encourage a more rapid
adoption of a single and optimal technical standard.
patent and PCT application, we have supplied the following IPR
Statement to the IETF on November 3rd, 2000. We understand that
this statement is in accordance with many such statements that
have been filed with the IETF by numerous companies in the past:
Pursuant to the requirements
of RFC 2026, Section 10 ("INTELLECTUAL PROPERTY RIGHTS"),
WALID, Inc. ("WALID") gives notification to the IETF
Secretariat that one or more patent applications relating to
a METHOD AND SYSTEM FOR INTERNATIONALIZING DOMAIN NAMES have
been filed. Should the implementation and practice of any part
of an IETF standard relating to the above subject matter require
the use of technology disclosed in any granted WALID patent,
WALID is prepared to make available, upon written request, a
non-exclusive license under reasonable and non-discriminatory
terms and conditions, based on the principle of reciprocity,
consistent with established practice.
For any questions
regarding WALID intellectual property and license, please contact:
J. Douglas Hawkins
State Technology Park
2245 S. State St.
Ann Arbor, MI 48104
companies have patents surrounding the IDN space. WALID, Inc.
has notified the IETF of a patent that may cover the work of the
IDN Working Group. The working group is currently taking this
patent into account as it decides whether or not to proceed with
the IDNA solution.
that there are at least the following three patented approaches:
Neteka - Parts of
Neteka's multilingual technologies are patent pending and are
submitted as Internet drafts to the IETF and archived both at
the IETF site as well as at http://www.DNSII.org. Neteka's technology
however is available as open source and is freely available
at http://www.OpenIDN.org. This ensures that even if Neteka's
technology is used, the Internet community is guaranteed to
have a freely available source of the technology for their utilization.
Walid - In essence,
Walid's technology is a client-side or pre-DNS-server ASCII
conversion approach. Neteka's understanding is that the patent
surrounds a technology that intercepts multilingual requests
sent from the client and performs a conversion of the multilingual
characters into an alphanumeric form acceptable by the existing
DNS and reformulating the request to carry this alphanumeric
string before sending to existing DNS servers for domain resolution.
Servers therefore do not need to be upgraded as requests remain
in ASCII format.
iDNS - To
Neteka's knowledge, iDNS utilizes a proxy solution that performs
similar interception of multilingual domain names as prescribed
by Walid. However, the conversion and detection are done in a
proxy server beside the domain name system. All requests must
first go through this proxy before going through a DNS resolution
companies claim intellectual property rights over various portions
of the IDN solution space. These claims could affect the implementation
of IDN if groups such as the IETF make decisions regarding whether
or not to use a technology based on its IPR encumbrances, or if
the holder of intellectual property rights regarding a particular
solution seeks to prevent others from using the technology.
doesn't have any patent to IDN related technologies.
covered by Walid's patent is an obvious example. It will have an
effect on the implementation of IDNs, but TWNIC does not use an ACE
solution at the current stage.
Walid and Ydsig both have patent claims which may apply to some
ACE in particular, and possibly to any ACE. The IETF has an IPR
policy, and if the rights to an encumbered technology cannot be
reconciled with that policy, it is IETF practice to discard the
technology is covered by intellectual property rights.
Yes - We understand
that Walid has a patent and this is under discussion inside
Status Report: WALID
patent granted, Neteka (patents pending). Proponents suggest
tech under Open Source License.
Are you participating (or have you participated in) the IETF standards
process for IDN?
has been an active participant in the IETF IDN Working Group,
and has submitted Internet-Drafts supporting the Working Group's
efforts. However, in conformity with RFC 2026 Section 10, WALID
has not proposed any of its proprietary technology to the IETF
for inclusion in a standard, and WALID participants in the IETF
were vigilant to avoid making any contribution related to our
patent application to the IDN Working Group before filing our
IPR Statement on November 3, 2000.
GRS is an active participant in the IETF standards process, including
the IDN working group.
Neteka is actively participating in the IETF IDN working group and
has submitted three Internet drafts as proposed solutions for
multilingual domain names. These are also archived at the DNSII
are participants within the IETF standards process for IDN.
we are. We have been participating in the IETF IDN WG from the very beginning
we have attended IETF IDN WG meetings several times, and there is an IETF
IDN WG status update at every JET meeting.
Yes. James Seng (our
CTO) is the Co-Chair of the IDN working group. Our commitment
to this group is a reflection of the I-DNS.net commitment to
the standards process.
Status Report: Many
respondents are involved in the IETF IDN WG.
Once IETF adopts an IDN standard, how quickly will it be incorporated
into applications such as browsers? Are any problems with this incorporation
anticipated? What can the IETF and ICANN do to facilitate the incorporation
Should an ACE-based
approach to IDNs be chosen by the IETF and accepted by the Internet
community, we would expect that major application suites could
be upgraded within a few months of the adoption of the standard.
In order to ensure rapid adoption, ICANN could move swiftly to
endorse and support the standard with a policy focused on encouraging
consensus and interoperability in this area. In the short-term,
end-users are going to demand enabling software to resolve IDNs
immediately. ICANN can reduce the potential for fragmentation
during the period before the final standard is issued by encouraging
the distribution and adoption of these enabling technologies.
If a non-ACE-based
solution were to be chosen, we would expect to see a much slower
deployment and adoption cycle. Many experts within the IETF believe
that an infrastructure-based solution could take as long as eight
to ten years to fully deploy, and we would expect to see a significant
amount of fragmentation and non-interoperability in the area of
IDNs as a result.
application developers can answer the first two parts of this question.
The IETF can facilitate the process by developing an IDN standard
in a timely manner. ICANN can facilitate the process by supporting
the IETF's efforts and the eventual standard.
The speed of adoption
will be dependent on the solution chosen and intellectual property
rights (IPR) issues surrounding it. Existing browsers have already
implemented some measures for multilingual domain names albeit
often faulty and problematic, it is therefore likely that a patent
protected approach might not be embraced by the browser community.
believes that regardless of the standard adopted, there needs
to be a transition period and registries will have to embrace
a solution for them to be able to immediately deploy multilingual
domains that can be used by most of the people on the Internet.
This would very likely mean a hybrid solution more or less like
that described in section A:1.
speed at which IDN is adopted into applications may depend on the
particular IDN solution that is adopted by the IETF. Some approaches
are easier than others to implement at the application layer, and
as a result would likely see faster uptake by application developers.
of IDN-aware applications depends heavily on two things: an IDN toolkit
and a definition of how IDN is handled in application protocols. Once a
toolkit is available, applications such as telnet or ftp, which only handle
hostnames, will be easy to develop. But applications such as browsers or
mailers, which carry domain names inside the application protocol, will not.
The IETF or another organization such as the W3C should define how IDN is
treated in application protocols. ICANN should elaborate criteria for judging
whether each accredited registry properly adopts IDN technology. ICANN should
also support the budget for fundamental software such as BIND.
perhaps within one or two years.
(2) Once the IETF finalizes the IDN standard, vendors will adopt it
as soon as possible.
question assumes that only some ACE variant will be adopted as the
IDN standard. See "mindshare capture", in the response
to Question 7, above.
depends on how application suppliers support such an IDN standard. There
are some problems that cannot be anticipated. The IETF and ICANN should
do their best to listen to various suggestions from the whole Internet
community, in order to reach the most favorable IDN resolution.
very slow incorporation, therefore need to support extended co-existence
period 4-10 years
We believe that adoption
will be relatively fast, notwithstanding that we do not envisage
"one standard". However, we perceive that as more applications
become "IDN aware", there is greater potential for the
"standards" to become less compliant. For instance,
both Microsoft and Netscape provide browser solutions, but there
are "nuances" that exist between them causing them to
perform slightly differently under similar circumstances.
applications may also lead to interoperability issues with the
currently deployed client-side plug-ins distributed by in-country
players. IETF and ICANN can ease its adoption via the promotion
of a Universal Client - a cousin to its server-side equivalent
known as iBIND
Status Report: Depends
on speed of adoption.
Will the IETF standard be interoperable with other IDN standards? What
can be done to eliminate interoperability problems (assuming not all
ccTLDs adopt the IETF standard)?
the diverse range of approaches currently deployed to support IDNs,
it is impossible for the IETF to issue a standard that provides
for complete interoperability with all existing deployments, nor
is such an expectation reasonable. Adoption of any technical standard
is of course voluntary, and we would expect user and market demands
to promote standardization and uniformity in this area. To ensure
interoperability during the transition period, WALID is adding support
to our WorldConnect system enabler to enable end-users to continue
to resolve IDNs that may have been registered using different standards.
With a client-based approach such as WALID's, it is possible to
support de-facto or national standards in addition to the final
standard the IETF recommends.
are no IDN standards at this time with which an eventual IETF standard
could interoperate. There are various IDN experiments, none of which
can be expected to interoperate with an IDN standard. We believe
compliance with an IETF IDN standard should be a requirement for
all ccTLD and gTLD operators now offering IDNs.
of the solution embraced by the IETF, Neteka's hybrid solution should
be able to make sure that interoperability would not be a concern.
It is already interoperable with some of the ccTLDs' solution as
well as the IDNA solution currently contemplated by the IETF. Should
a protocol extension approach be adopted, Neteka's solution is also
prepared for it and could consolidate different approaches. In short,
there is not too much interoperability concerns so long as alternative
namespaces and unnecessary name checks are not created to complicate
to the wide variety of IDN approaches, it is likely that the IETF
standard will not be interoperable with various other IDN approaches.
For this reason, it is extremely important that all interested parties
be active participants within the IETF process and that registries
and registrars do not make irrevocable technology decisions prior
to the adoption of a formal standard.
is no IDN standard yet. The IETF will standardize only one, so the
interoperability of concern lies between IDN and current domain names.
think all the ccTLDs will follow the IETF standard. It would be better
to hold a dialogue with IDN users while the IETF IDN WG defines the standard.
Encouraging IDN users to participate and get involved in the IETF IDN WG
would help push forward the formation of IDN standards.
Both ACE and UTF8 rely
upon an underlying untransformed UCS. Names "in UCS"
and equivalently "nameprepped" and transported encoded
into ASCII or encoded into UTF8 may resolve to the same internet
character repertoires, equivalent name preparation, equivalence
classing (or other means, e.g., secondary A records, C names) and
"transport independent resolution" all will contribute
to the elimination of interoperability problems, assuming both
UTF8 and a single ACE are the de facto or de jure standards for
If multiple ACEs are
deployed, then the problem is equivalent to the known hard problem
of code set negotiation, and practically intractable.
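As an illustration only of the point that UTF-8 and an ACE are two transports for the same prepared UCS name, a minimal Python sketch follows. It assumes Python's standard `encodings.idna` module, which implements the Nameprep profile and Punycode-based ACE later published as RFC 3490/3491, not any of the ACE candidates under evaluation at the time of this survey. The label "Bücher" and the helper names are hypothetical.

```python
from encodings import idna


def prepare(label: str) -> str:
    """Apply Nameprep (case folding, NFKC normalization) to one label."""
    return idna.nameprep(label)


def to_utf8_wire(label: str) -> bytes:
    """Transport the prepared label as UTF-8 bytes."""
    return prepare(label).encode("utf-8")


def to_ace_wire(label: str) -> bytes:
    """Transport the label as an ASCII-compatible encoding.

    ToASCII applies Nameprep itself for non-ASCII labels.
    """
    return idna.ToASCII(label)


def from_utf8_wire(data: bytes) -> str:
    """Recover the UCS label from its UTF-8 transport form."""
    return data.decode("utf-8")


def from_ace_wire(data: bytes) -> str:
    """Recover the UCS label from its ACE transport form."""
    return idna.ToUnicode(data)


label = "Bücher"  # hypothetical example label
via_utf8 = from_utf8_wire(to_utf8_wire(label))
via_ace = from_ace_wire(to_ace_wire(label))
print(via_utf8 == via_ace)
```

Both paths recover the same prepared name because the transport encoding is applied only after name preparation. With two different ACEs in play, a receiver would additionally need to know which decoder to apply, which is the negotiation problem described above.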
To make IETF standard
IETF should consult
extensively for suggestions before finalizing its standard.
ACE encoded chars directly in "non-supporting" domains
this moment, we believe so, and the approach adopted to date in
setting an IDN framework via IETF is a sound approach to address
interoperability issues. However, we sense that the time taken by
IETF is far too long for the market, and the pressure is mounting
to deploy. In our opinion, this may lead to other technologies driving
ahead to set a de facto standard outside of the IETF.
Status Report:
Unlikely that IETF will be interoperable with all IETF deployments.
Underlying principle is that compliance with IETF standards is a
must, and speed is essential if IETF wishes to maintain any form of
"control" on the process going forwards.
Are there other end user needs concerning IDN that need to be addressed?
question that has not been discussed sufficiently concerning IDNs
is the use of IDNs in document contexts, such as URLs embedded in
HTML or XML documents. End users are going to expect to be able
to generate URLs containing domain names in native characters, so
the IDNA approach (in its current form) needs to address these issues
before it can be considered complete.
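The embedded-URL concern can be made concrete with a small sketch: only the host part of a URL is a domain name, so only that part would be converted to an ACE form, while path and query need separate treatment. This is a hedged illustration, not part of any respondent's proposal. It assumes Python's built-in `idna` codec (the later RFC 3490 ACE), and the function name and sample URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def url_host_to_ace(url: str) -> str:
    """Rewrite only the host part of a URL into its ASCII-compatible form.

    Simplifications in this sketch: userinfo (user:password@) is dropped,
    and IPv6 literal hosts are not handled.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The "idna" codec converts each non-ASCII label to an ACE label.
    ace_host = host.encode("idna").decode("ascii")
    netloc = ace_host
    if parts.port is not None:
        netloc = f"{ace_host}:{parts.port}"
    # Path, query and fragment are passed through untouched.
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(url_host_to_ace("http://bücher.example/katalog"))
```

A document author would write the native-character URL, and a tool along these lines would produce the form that existing DNS infrastructure can resolve.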
survey appears to address key user needs.
Neteka believes that
it is very important for multilingual domain names to be immediately
usable by most client systems on the Internet today without requiring
client side modifications or plug-ins. This is a very strong demand
from all of Neteka's clients and represents the major concern
for multilingual domain name registrants and users. The average
user is usually not technically sophisticated enough to understand
the complexities of multilingual domain names and will be frustrated
and confused if multilingual names do not work as expected and
the same as English names.
Beyond providing multilingual
characters, symbols and punctuation are also very important as
a component of language. The introduction of multilingual characters
opens up the opportunity to introduce some symbols as well and
they should not be excluded.
hand side of E-mail addresses, path part of URL, electronic signature,
and so on. RFC2825 addresses it clearly. Domain name is a fundamental
component of communication on the Internet. The requirement of the
end-user is not only resolving IDN as hostname, but also indicating
certain entity on the Internet.
compatibility and general Internet application adoption.
Yes. Direct access
to DNS labels in applications is the pre-IDN norm. In the interposed-keyword,
interposed-proxy, interposed-root, and ACE transformed solution
frameworks, the end-user loses direct, unmediated access to DNS
labels. This is incompatible with open architectures.
The following list
of applications MUST use IDNs:
is a concern, and domain names need to handle linguistic issues
such as traditional/simplified Chinese, different uses of diacritics
etc. Some languages also share the same script, such as Chinese
and Japanese, Arabic and Urdu. IDN solutions need to incorporate
"linguistic policy" at the NAME PREP level. The rules
need to be clearly conveyed to the domain name holding publics
in each language.
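To illustrate why linguistic policy is a separate question from mechanical name preparation, the sketch below uses the Nameprep implementation in Python's standard `encodings.idna` module (the later RFC 3491 profile, assumed here as a stand-in for whatever NAME PREP rules are finally adopted). The sample strings and the helper name are hypothetical.

```python
from encodings import idna


def same_after_nameprep(a: str, b: str) -> bool:
    """Return True if two labels become identical after Nameprep."""
    return idna.nameprep(a) == idna.nameprep(b)


# Differences of case and of full-width versus ordinary Latin letters
# are folded away by Nameprep.
print(same_after_nameprep("EXAMPLE", "example"))
print(same_after_nameprep("ｅｘａｍｐｌｅ", "example"))

# Traditional and simplified forms of the same word are distinct code
# points and remain distinct labels after Nameprep.
print(same_after_nameprep("中國", "中国"))
```

Under this profile the traditional and simplified spellings remain two separate names, so any equivalence between them would have to come from registry policy or an additional mapping layer.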
However, the biggest
concern that we see arises from FUD. There appears to be a current
climate of "fear" for IDNs, partly promoted through IETF,
partly through ICANN. We believe that IDN is doable, and would
welcome positive, clarifying statements from key leader groups
such as ICANN. We welcome informed debate and believe that ICANN
has a position to ensure that balance is put to such a debate.
Customers are frustrated
at the time it is taking to make IDN available, and becoming confused
as a result of conflicting technical aspects that are currently
Status Report: Use
of IDNs in documents, HTML/XML, punctuation. Other issues include
lack of end-user sophistication and inability/reluctance to handle
plug in installations
Are there any other technical issues we should know about?
survey appears to cover the major technical issues.
No matter how multilingual
names are deployed, a set of problematic glitches would arise
as the transition takes place and as users learn to understand
more about these issues. The main reason being that the average
user will not immediately understand why they might not be able
to access multilingual domain names using their existing system.
These could range from the client side software settings to the
ISP settings or even the authoritative end hosting handling. A
more technically comprehensive documentation on these known issues
could be found at http://www.OpenIDN.org.
Browser & DNS Client
Application Issues - some browsers simply block all entry of domain
names, others try to implement some form of transformation of
the name causing loss of character information, which is sometimes
irrecoverable. There are four main types of behaviors among the
browsers and client side applications when encountering a multilingual
- Send as is without
interfering - while it is positive that the request is being
sent without faulty alterations, because character encoding
information is not provided, it is very difficult to determine
precisely the intended domain name;
- Attempt to convert
to UTF-8 - most implementations to date are problematic due
to complex application (browser) and operating system kernel
intricacies. In some occasions, the double conversion occurs
(UTF-8 on UTF-8 bytes), others drop ending bytes, still others
perform unnecessary case folding causing character information
loss that may be irrecoverable;
- Attempt to convert
to some form of ASCII string - similar to the UTF-8 issues,
these implementations sometimes create inconsistent results.
Notable is the different behavior of the application depending on
whether it goes through a proxy or not;
- Refuses to send
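Two of the failure modes listed above, double UTF-8 conversion and dropped ending bytes, can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the faulty client is simulated by treating already-encoded bytes as Latin-1 text, and the sample name and function names are hypothetical.

```python
def utf8_once(name: str) -> bytes:
    """Correct behavior: encode the name to UTF-8 exactly once."""
    return name.encode("utf-8")


def utf8_twice(name: str, assumed_charset: str = "latin-1") -> bytes:
    """Simulate a client that re-encodes bytes that are already UTF-8,
    after misreading them as text in a single-byte charset."""
    first_pass = name.encode("utf-8")
    return first_pass.decode(assumed_charset).encode("utf-8")


def drop_last_byte(data: bytes) -> bytes:
    """Simulate a client that truncates the final byte of the name."""
    return data[:-1]


def decodes_cleanly(data: bytes) -> bool:
    """Return True if the bytes are well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "中文"  # hypothetical two-character label
print(len(utf8_once(name)), len(utf8_twice(name)))
print(decodes_cleanly(drop_last_byte(utf8_once(name))))
```

The doubly encoded form is still valid UTF-8, so a server cannot reject it on syntax alone; it simply names something else. The truncated form is no longer decodable at all.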
DNS Resolver Issues
- in general, the DNS resolver resides at the ISP level. There
are three areas of trouble for multilingual domain names at this
level: 1) the ability to match multilingual requests with cached
records; 2) the ability to refer the request accurately to its
nearest match (TLD/root) server; and 3) the ability to cache the
results of multilingual requests. It is very important that these
"messengers" in the DNS do not choke on multilingual
requests. Because the original DNS protocol itself is 8-bit capable,
this middlemen level usually simply passes requests along the
DNS path, however proxy and caching issues could complicate matters
Authoritative DNS Databases
- authoritative DNS databases range from root servers and top-level
domain (TLD) registry servers to individual domain hosts. While
they are critical to the functioning of the Internet, especially
the root servers and TLD servers, their tolerance of multilingual
requests is higher because they seldom perform caching and will
implement multilingual names only when they have prepared for
it. Multilingual requests to root servers will either be authoritatively
dropped because the particular TLD does not exist or will be referred
to existing ASCII TLDs.
Beyond the direct implications
of multilingual domain names on the registration system and the
domain resolution system, a handful of other peripheral issues
arise as multilingual names are being introduced to the Internet:
Proxy Servers &
Cache Servers - first and foremost, proxy servers and cache servers
will be affected because they depend on URLs and domain names
to function properly. They also contribute to the blocking of
multilingual names and thus present a huge barrier for multilingual
names to be transparently deployed. A multilingual aware, patched
version of Squid is currently available from Neteka.
Web Servers & Digital
Signatures - web servers are next in line to require some
work in order to perform virtual hosting accurately,
as this also depends on domain names. Digital signatures and
certificates are also an area of concern as they also use domain
names as a key identifier. As DNS security is being deployed,
this becomes even more important. A multilingual aware web server
based on Apache is also available from Neteka.
& Databases - besides the immediate critical transportation
nodes, other applications such as databases that hold domain names
and email addresses will have to be taken into consideration.
These include customer databases, mailing lists and other directory,
search as well as storage applications. Neteka's API solution
for a quick fix for these applications is the NeMate library which
utilizes an ASCII transformation engine to force multilingual
names into unique ASCII identifiers without losing character
should define ACE prefix (ACE identifier) as soon as possible. JPNIC
proposed a determination process in draft-ietf-idn-aceid-01.txt.
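The role of an ACE prefix is to let software tell an encoded label apart from an ordinary ASCII label. A minimal sketch of such a check follows. The prefix is left as a parameter because it had not been fixed when this survey was written; the `xn--` default is the prefix later standardized in RFC 3490 and is an assumption here, as are the function names and sample names.

```python
def is_ace_label(label: str, prefix: str = "xn--") -> bool:
    """Return True if a single DNS label carries the given ACE prefix.

    DNS labels are case-insensitive, so the comparison is too.
    """
    return label.lower().startswith(prefix.lower())


def ace_labels(domain: str, prefix: str = "xn--") -> list[str]:
    """List the labels of a domain name that carry the ACE prefix."""
    return [
        label for label in domain.split(".") if is_ace_label(label, prefix)
    ]


print(ace_labels("www.xn--bcher-kva.example"))
print(ace_labels("www.example.com"))
```

Any ordinary ASCII name that happened to start with the chosen prefix would be misread as encoded, which is one reason the choice of prefix needs a defined process.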
1. Consider modifying
BIND and Internet applications to support a clean 8-bit (native
encoding) and UTF-8 encoding environment, in order to accept IDN.
2. The technology for
converting between Traditional and Simplified Chinese encodings.
The majority of the
issues raised here are either protocol-design (or interpretation)
or market behavior and analysis ones. They are important issues.
But, they, especially the protocol ones, are not going to be settled
properly by counting heads or otherwise determining a majority
opinion from the community.
More generally, I believe
that this issue is, with the exception of one area that has, IMO,
been persistently dodged, out of ICANN's scope:
- Design, evaluation,
and approval of protocols falls into different space. Nothing
gives ICANN any authority or responsibility in this area until
the point at which parameter assignment is involved, and ICANN
has little discretion about most parameter assignment issues.
- The IETF process
in this area will take as long as it takes to get things right.
There is enough pressure on the area that I do not believe it
is likely to take one week longer than that. But pressure from
various interests, including ICANN, is unlikely to produce
quicker results of high quality and may impede the final schedule.
For example, I had intended to spend this morning wrapping up
the next draft of a set of documents that lay the foundation
for a "search environment" system clearly enough that
we might start thinking about working groups and area allocations.
Instead, I'm attempting to respond to your "survey"
- As most of you know,
there has been a gradual shift in the technical community --driven
by increased understanding of user needs, requirements, and
expectations-- away from the belief that a DNS-only solution
will be adequate. The revised opinion is that additional mechanisms,
which support "search"-type operations rather than
only the DNS's exact-match lookups, will surely be needed and
that "the IDN problem" will not be solved or protocol
work completed until they are. I make no prediction as to whether
IETF will agree on a partial/ temporary/ interim in-DNS approach
while those other scenarios work themselves out.
- Any common/standardized
approach, whether layered on the DNS or part of it, that moves
outside the traditional DNS, hostname, and Class=IN rules, is
going to raise important strategic issues for ICANN and the
community. There are no approaches of this type that I consider
plausible (e.g., not fragmenting of the Internet) that do not
have at least some aspects of a "unique root" situation
or other way to ensure uniqueness of names. But any of them
--whether a new class, a directory-like structure, or something
else-- imply, technically, the opportunity to go back and revisit
the governance and authority questions and to do so without
any significant claims of US Government ownership, authority,
or oversight responsibility. I would personally prefer to see
ICANN take on the necessary roles, if only because I don't want
to revisit the battles and traumas of the last four or five
years. But I think your ability to gain acceptance in that role
will be significantly enhanced if you demonstrate to the community
that you are able to resist efforts to drive you toward expansion
of your role beyond your natural charter. And IDN surveys and
evaluation at this point are expansionist.
- The one area where
I believe you clearly do have scope -by virtue of inheritance
of IANA's role under RFC 1591-- is to protect the Internet against
abuses of the DNS that create the risk of damage to existing,
conforming and deployed software, or of ambiguous or non-unique
naming. The risks in those areas of ill-defined testbeds, "just
send 8" strategies, encouragement of multilingual cybersquatting,
etc., are considerable and have been identified repeatedly to
ICANN. The solution is to start warning the relevant domains
of the impact, with the potential of starting a redelegation
process --clearly contemplated by 1591-- if they continue to
encourage these efforts. If, as I suspect is the case, ICANN
is effectively powerless to do this, then admit that and get
out of this area until the various issues sort themselves out
in the marketplace.
I wonder, why you
call this "Internationalized Domains". Is a domain
in an American Indian's script an "international domain",
or rather a "multilingual domain" (MLD)?
2. Verisign's "Testbed"
Verisign started its
"testbed" with mixed appreciation of its usefulness.
ISOC discouraged it, but Verisign went ahead, and by indicating
that they would transfer testbed registrations later without
additional charge to the live gTLD zones, they put registrars
into a difficult situation: comply with ISOC's requests and
wait with MLD registrations, or accept MLD registrations in order
not to lose customers.
Registrants on the
other side, as much as they might have wanted to honour ISOC's
request, had to register their rightful names in the testbed
in order to be sure not to lose out once MLDs are accepted
in the live gTLDs, i.e. existing testbed registrations would
be transferred to the live zones.
For everybody it
has to look, like Verisign is dictating the conditions, not
3. Verisign and NSI
Verisign had published
a time table when they would accept registrations for which
script. UNICODE was after a while scheduled for 5th of April.
At that time, the Network Solution webpage for testbed registrations
was way outdated. I think it said UNICODE registrations would
be available by early March, i.e. the page was done early February
and had not been updated until 5th of April. The wording of the
page did not suggest that NSI was waiting for Verisign,
but rather that they themselves would be ready with their setup by the
given dates. Without further explanation, Verisign then delayed
UNICODE for the 19th of April. On that date the page on NSI
changed and they accepted registrations.
One cannot help but
wonder whether Verisign delayed the process because NSI was
not yet ready, and starting earlier would have meant much lost revenue
for NSI (other registrars were ready already).
I am aware that
this is a vague suspicion, but in case it were true, who
could prove it?
one of the few registrars, which accepted "pre-registrations"
for UNICODE domains, even before the 5th of April, claiming,
that they would try to register them, as soon as possible. On
19th - and even until the 23rd of April, none of the Register.com's
pre-registered domains showed up in whois, and it was even possible
to register them with NSI (again). Some days later, Register.com
informed registrants, that their domains had been accepted and
charged for it. However, until today, those domains show up
in whois only as "registered by Register.com", but
don't show the registrant (whereas the NSI registrations do).
This leaves registrants neither a chance to check "first
come, first serve" principles, nor to fight cybersquatting
at an early stage.
5. Client Applications
As far as I know,
none of the current client applets (to do the foreign script
to *ACE conversion) supports UNICODE. Customers in "UNICODE
countries" therefore cannot participate in Verisign's "Phase
3.2" (current) and "Phase 3.3" (which should
start soon). A "testbed" where the testing cannot
be done is rather useless.
6. Blocking of MLDs
I didn't find any
policy stated, what would happen to domains, which are directly
registered in their *ACE form, before "official" registrations
(or the transfer of testbed domains into the live zones) will
7. Ease of use of Whois.
To check whois info
on MLDs (in the testbed) is right now a cumbersome multistep-procedure:
transform the MLD version via an online tool into its RACE version,
then copy and paste this RACE version into a whois form on some
There need to be
tools to make this easier for non-techies.
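A tool of the kind asked for here could fold the two manual steps into one. The sketch below is an assumption-laden illustration, not a description of any existing testbed tool: it uses Python's built-in `idna` codec, which produces the later `xn--` ACE form rather than the RACE form used in the testbed, and the `server` argument is a hypothetical whois host supplied by the caller.

```python
import socket


def build_whois_query(domain: str) -> bytes:
    """Convert a native-script domain to its ACE form and frame it as a
    WHOIS query (the query text followed by CRLF)."""
    ace_name = domain.encode("idna")
    return ace_name + b"\r\n"


def whois_lookup(domain: str, server: str, port: int = 43,
                 timeout: float = 10.0) -> str:
    """Send the ACE form of the domain to a whois server and return the
    raw reply. The server name must be supplied by the caller."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


print(build_whois_query("bücher.example"))
```

The user types the name in its own script; the conversion step that currently requires an online tool and copy/paste happens inside `build_whois_query`.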
8. Open Source
I strongly suggest
adopting only technology for which, and "going live" only when, the
required tools (like those applets below) are available under
an Open Source License, so that they can be easily adapted to
local languages and to different computing platforms.
9. MLDs and "alternative
During the introduction
of the MLDs, every Internet user who wishes to access those
MLD domains has to install a small applet to do the conversion
to a DNS compatible *ACE string. This will make it very easy
for companies like New.net to offer those applets with "double
functionality": new MLDs and at the same time a new root
(e.g. the New.net root). Looking at the latest published numbers
from New.net, it seems to me that ICANN is well on the way
to losing the battle.
If it obviously cannot
win on its own (anymore), then it might make the most sense
to look for allies, and the group around the ORSC/Superroot
seems to me the best option. By peering the ICANN root with
their root, there would be immediately lots of new TLDs available
for everybody on the Net (without the need for plug-ins), and
New.net with their conflicting TLDs would have to fight against lots
of TLD holders. The ORSC looks very reasonable, has obviously
the most "historical legitimacy", and seems to be
willing to co-operate with ICANN.
- The existence of
standard encodings other than ASCII.
- The issues of encoding
discovery and negotiation.
- The difference between
glyph-centric and character-centric approaches to scripts.
- The prevalence of
7-bit processing in IETF standard protocols.
- The prevalence of
8-bit clean processing in POSIX and proprietary operating systems.
- The basic contours
of the i18n/l10n applications and system markets, by market
area and major vertical market segment.
- The 8-bit clean
"readiness" of bind9.
- The UTF-8 "readiness"
The problem seems to
me to be one of presentation, not infrastructure change. ACE encoding
would seem to be transparent on servers and clients, except visually.
Consequently, clients could use one of these options:
- rely on the client
browser to display/convert ACE encodings
- live with raw ACE
encodings visible in the client, and obtain them manually from web
sites using, for instance, Java applets, Java/ECMAScript, or CGI
scripts, and then copy/paste them
- point to DNS proxies
that would convert IDN characters to ACE?
Note: URL links would
be ACE encoded only, so non-ACE browsers would work correctly.
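The claim that ACE is transparent to servers and clients "except visually" can be illustrated with a minimal Python sketch. It assumes the Punycode-style ACE form (with an "xn--" prefix) implemented by Python's built-in `idna` codec; that particular encoding is used here only as a stand-in and is not necessarily the ACE scheme the respondent had in mind. The name `bücher.example` is hypothetical.

```python
# Sketch: an ACE-encoded name is ordinary ASCII, so DNS servers and
# non-ACE-aware clients can carry it unchanged; only the display differs.
# Assumption: Punycode-style ACE as implemented by Python's "idna" codec.

def to_ace(name: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")

def to_display(ace_name: str) -> str:
    """Convert an ACE name back to Unicode for presentation only."""
    return ace_name.encode("ascii").decode("idna")

unicode_name = "bücher.example"      # hypothetical name
ace_name = to_ace(unicode_name)

print(ace_name)                      # what travels in links and DNS queries
print(to_display(ace_name))          # what an ACE-aware client could show
```

A client that does not understand ACE would simply show the first printed form, which is the "live with raw ACE encodings" option listed above.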
Some languages do not
have a standardized encoding, and ISO 10646 is unable to satisfy their
needs. This means these languages may not be able to be Multilingualised
until such time as they get ISO encoded.
Some also do not have
generic Input Method Engines that allow IDN to be used. This means
that the IME will need development prior to deployment in an ML
Note, these should
not be reasons to hold the rest of the world back.
There is also
a significant demand for Internationalized Email applications
from our users.
Status Report: As above.
Mr. Katoh and the other members of the committee for a balanced
and informative report. I have a few hurried comments.
First, it would be useful if the task force would compile a list
of the major papers and reports that currently exist on IDNs, and
provide links to their URLs if they are on the Internet.
Second, as a Communications
PhD I would express my strong support for the viewpoint that domain
names are just unique, hierarchically organized identifiers for
computers or other resources on the Internet. Their primary function
is technical. Policy should NOT be based on the assumption that
domain names should be something more than that. It will prove
dangerous and counterproductive to attempt to make them approximate
"natural language" and all its inherent contextual ambiguity.
Such a path will simply lead to complex, expensive and ultimately
futile attempts to regulate and control the use of DNS labels
on a worldwide basis.
Thus, ICANN policies
that attempt to be sensitive to extremely local and culturally
specific variations in scripts must be avoided. Whatever problems
of this sort arise as a byproduct of technical change can be handled
through regional treaties and national systems of law.
Finally, ICANN needs
to clarify its relationship to the standardization process. There
is pressure from some quarters to leverage ICANN's contractual
relationship with registries to impose specific technical standards
regarding IDNs on them. This is wrong and should not be done.
I hope that from this process the Board will make an explicit
statement that it does not have and should not have any authority
to select from among competing technical standards and require
registries or registrars to employ one of them exclusively.
ICANN was never intended
to be a standard-setting or enforcing organization. It is purely
an assignment authority within the framework of Internet standards.
During the public forum discussion of IDNs in Stockholm, several members
of the public observed that adoption of an IDN standard was "the
easy part" of the IDN process. What are the "hard parts"
we have to look forward to? Why are they going to be so hard?
Several difficult challenges
lie before the DNS technical community: the IDN mess, which may
transition through an application-specific (ACE) mechanism before
binary (UTF-8 or better) deployment in the DNS; changes to fundamental
constants such as label length (currently 63 bytes); and the addition
of DNSSEC and IPv6 addressing. Each requires a transition period
of non-trivial duration and of non-trivial complexity.
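To make the label-length constraint concrete, the following is a rough Python sketch that compares the size of a single label in UTF-8 and in an ACE form against the 63-byte limit mentioned above. The ACE form is assumed to be Punycode with an "xn--" prefix (via Python's built-in `punycode` codec), and name preparation steps are skipped; both are simplifying assumptions, and the labels are made up for illustration.

```python
# Sketch: how a 63-byte label limit interacts with different encodings.
# Assumptions: ACE = "xn--" + Punycode; no name preparation is applied.

MAX_LABEL_BYTES = 63  # per-label limit cited in the text

def label_sizes(label: str) -> dict:
    """Return the byte length of one label in UTF-8 and in ACE form."""
    utf8 = label.encode("utf-8")
    if all(ord(ch) < 128 for ch in label):
        ace = label.encode("ascii")               # plain ASCII needs no ACE
    else:
        ace = b"xn--" + label.encode("punycode")  # prefix plus encoded label
    return {
        "utf8_bytes": len(utf8),
        "ace_bytes": len(ace),
        "utf8_fits": len(utf8) <= MAX_LABEL_BYTES,
        "ace_fits": len(ace) <= MAX_LABEL_BYTES,
    }

# Hypothetical labels: ASCII, Latin with one accent, and a long CJK run.
for label in ("example", "bücher", "例" * 25):
    print(label, label_sizes(label))
```

The point is only that the number of characters a user can register depends on the encoding chosen, which is one reason a change to the limit or to the encoding needs a coordinated transition.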
In each case it is
"easy" to specify the "right answer" (or even
a reasonably good "wrong answer"), and much, much harder
to deploy the change in a global nameserving mesh of hosts, some
of which are running bind4, some bind8, some bind9, some other
implementations of the DNS protocol, across a space in which global
consistency is formally required, or which has reasonable convergence
1). The particularities
of local languages:
The architecture of IDNs includes two important elements: the definition
of the client interface and the DNS. The original intention of
IDNs is to permit all people in the world to locate Internet
resources using their local language, so the special requirements
of local languages and characters should be resolved within the
whole architecture. It is inevitable that so many users will have
different requirements, and it is technically feasible to implement
the whole architecture to meet those requirements. Many solutions
may exist, so which solution to use, and
where in the whole architecture to apply it, is
an issue of technique selection.
2). Alternative: UTF8/ACE
The outcome of
this selection has a bearing on the future development
of IDNs. In the long term, it also affects the applications and services
of the Internet.
Upon adoption of a
technical Standard, the harder problems include deployment and
migration issues like the following:
It may take an exceedingly
long time to update all DNS applications around the world. The
solution would still depend on client-side resolution plug-ins
that likewise need to be updated to be interoperable with each
As the standards improve
over time, there is a major commitment to keep the IDN solutions
(plug-ins, WHOIS, etc.) up to date.
The standards will
be just a base, and any "Policy" encoded may be enhanced
within each country, much like each ccTLD has slightly different
The "easy part"
is taking so long that we fear providers may break away and
implement "workable" de facto solutions unless the IETF
and ICANN move ahead within a reasonable period of time.
Others would include
linguistic sensitivity issues e.g. normalization/canonicalization,
lack of official font, language, encoding, input method, renderer?
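The normalization/canonicalization issue mentioned here can be sketched briefly in Python. The example assumes Unicode NFKC normalization plus lowercasing as the canonical form; that choice is an illustration only, since the responses do not name a specific normalization form, and the label "café" is hypothetical.

```python
# Sketch: two visually identical labels can differ at the code-point
# level until they are normalized to one canonical form.
# Assumption: NFKC plus lowercasing stands in for "canonicalization".
import unicodedata

composed = "caf\u00e9"      # "é" as a single precomposed code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent

def canonical(label: str) -> str:
    """Map a label to one canonical form before comparison or lookup."""
    return unicodedata.normalize("NFKC", label).lower()

print(composed == decomposed)                         # raw comparison
print(canonical(composed) == canonical(decomposed))   # after normalization
```

Without an agreed canonical form, a registry and a client could treat the two spellings as different names even though users cannot tell them apart on screen.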
The IETF standard isn't finalized yet, but the IDN requirements from
Internet users are very urgent. Moreover, even if the standard is
finalized in time, deployment of the related applications will still
take more than several years. It requires modifying much existing
client and server software, such as browsers, email
These are the hard parts, we think. Deploying this kind of
large-scale service at the infrastructure level is extremely difficult.
The Internet Corporation for Assigned Names and Numbers.