ICANN Stockholm Meeting Topic: Report of the Internationalized Domain Names Working Group - Responses to Survey A
Posted: 29 May 2001
Appendix - Responses to Survey A: Technical Questions
1. What different technologies currently enable the use of non-Latin scripts as domain names?
WALID | WALID, Inc. (WALID) considers that the deployed or proposed approaches to Internationalized Domain Names (IDNs) have focused mainly on four approaches:
|
Verisign | Standards to enable the use of non-Latin scripts as domain names have not been finalized. Broadly speaking, three different methods are currently in use:
VeriSign Global Registry Services (VeriSign GRS) will conform its testbed to whatever standard emerges from the IETF process. |
Neteka | There are in general three approaches to multilingualize the domain name system:
Besides the three main approaches, it is also possible to have a hybrid approach that mixes and matches the different technologies. It is also Neteka's opinion that the best strategy forward to deploy multilingual domain names consists of a hybrid of all three approaches and contemplates a phased transition:
With all three approaches pre-installed into Neteka's NeDNS, it is immediately deployable as a server end solution for registries to offer multilingual names, and prepared for the migration towards a longer-term solution, whatever stream it might turn out to be. |
Register.com | Two general types of technologies attempt to enable the use of non-Latin scripts as domain names. The first of these approaches involves the transmission of native encodings as part of the DNS labels used within queries and/or responses. Native encodings are character encodings, such as ISO-foo or Shift-JIS used to represent non-Latin scripts. These encodings are generally 8-bit and always involve the use of characters outside those permitted within DNS labels by RFC 1034 [RFC1034]. The second type of approach is the conversion of IDNs into a domain name that conforms with RFC 1034. These approaches involve the use of ASCII-compatible encodings (ACEs) of non-Latin scripts. Generally, ACE-based proposals involve both the compression of non-ASCII data as well as a transformation into an RFC 1034 compliant string. For both of these approaches, various parties have proposed a variety of specific implementations. Internet Drafts currently describe several ACEs, as well as various approaches that describe the use of ACEs within various parts of the DNS. Different approaches using 8-bit character transmissions within the DNS have also been described, including on-the-wire transmission of native encodings as well as common formats such as UTF-8. Finally, some more radical approaches, such as the creation of a new DNS class or the use of a new directory layer to replace traditional DNS functionality, have been suggested. |
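As a rough illustration of the distinction drawn in the Register.com response, the Python sketch below checks whether a label stays within the letters, digits, and hyphen permitted for host labels, and shows that native 8-bit encodings of a non-Latin label do not. The helper name `is_ldh_label`, the simplified rule it applies, and the sample labels are assumptions made for illustration; they are not taken from any respondent's implementation.

```python
import re

# Simplified host-label rule (an assumption for this sketch): 1 to 63 octets,
# only ASCII letters, digits, and hyphen, with no hyphen at either end.
LDH_LABEL = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: bytes) -> bool:
    """Return True if the raw label octets fit the simplified LDH rule."""
    return LDH_LABEL.fullmatch(label) is not None


# The same kind of label in plain ASCII and in two 8-bit native encodings.
samples = {
    "ascii": "example".encode("ascii"),
    "utf-8": "日本語".encode("utf-8"),
    "shift_jis": "日本語".encode("shift_jis"),
}

results = {name: is_ldh_label(raw) for name, raw in samples.items()}
print(results)
```

Only the ASCII label passes the check; the UTF-8 and Shift-JIS forms contain octets above 0x7F, which is why the ACE-based proposals transform such labels before they reach the DNS.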
JPNIC | We follow IETF IDN WG discussion. Application solution complies the IDNA architecture, with NAMEPREP and ACE. |
TWNIC | (1) Interim case:
(2) Test bed case:
|
2. What are the strengths and weaknesses of the technologies referenced in Question 1? Please give concrete examples.
WALID | A number of the proposed approaches described above treat the problem of IDN as if it were a 'DNS protocol problem', instead of a 'domain name problem'. That is to say, if the DNS protocol or infrastructure can be changed to support non-Latin scripts, then the problem would be solved. The rough consensus of the technical community, however, is that this approach is fundamentally incorrect, and perhaps the approach stems from a view that the only applications running on the Internet that need to be considered are web browsers and web servers (and sometimes only particular web browsers or servers). WALID would suggest that any approach to IDNs must take into account the entire deployed base of protocols, applications, and implementations that run on the Internet today, many of which are crucial for ensuring the stability, security, and operation of the network. Many of these protocols and implementations will not support characters outside of the LDH (letters, digits, and hyphen) set, either in forward or reverse resolution contexts. These fundamental issues aside, approaches that focus on changes to the infrastructure, either by deploying a new protocol, new servers, or new types of server applications face the inertia associated with the deployed nameservice infrastructure. The DNS is everywhere, and attempting to make significant changes to the DNS as a whole would likely take at least a decade for complete deployment, risking the creation of islands of dis-connectivity in the process. Infrastructure-based approaches also suffer from the problem that updates are difficult to deploy. In this respect, one need only consider the large numbers of very old BIND distributions still in operation with serious known security vulnerabilities. 
The approaches above that would send Unicode data directly (typically in the UTF-8 encoding) also ignore the issues relating to name equivalence, and ultimately would create a serious security problem, given that many applications and protocols rely on the DNS for performing authentication and authorization checks. Many Unicode codepoint sequences, which are visually identical, can be different at a binary level, creating the opportunity for a malicious user to fool someone into connecting to a different host than the one they think they are connecting to. At a more basic level, without some sort of canonicalization step during resolution, many users will have a difficult time making IDNs work reliably. Within the IETF, this requirement has been called the 'business card test'. WALID's approach to IDNs, currently in use as part of the VeriSign GRS multilingual testbed, is to perform a canonicalization and transformation process of the IDNs on the end user's system. IDNs are normalized to address the equivalence problem described above, and encoded using a transformation algorithm from Unicode into the subset of ASCII permitted in DNS host labels. IDNs that are presented to the DNS for resolution thus use the same range of characters as standard domain names. The significant advantage to this approach is that no changes are made to the deployed base of infrastructure systems, and the operational stability of the network is not compromised. Our experience in working with ccTLD registries has shown that infrastructure-based approaches, by contrast, are quite unworkable, both because of the inertia associated with the DNS resolution infrastructure and the large numbers of web proxy servers that are on the network. To deploy the ACE-based approach completely, applications which process DNS hostnames will need to be upgraded to handle IDNs. 
In the short-term, a mechanism must be widely deployed to enable immediate resolution of IDNs in the applications that end-users use most often, such as web browsers and e-mail applications. WALID is addressing these needs by making freely available for download its WALID WorldConnect technology, to enable immediate resolution of IDNs, and its WALID WorldApp to enable application developers to incorporate standard IDN transformation capabilities into their applications. |
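The equivalence problem and the client-side canonicalize-then-transform step described in the WALID response can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses Python's built-in `idna` codec, which implements the later IDNA 2003 standard (nameprep plus Punycode) rather than the encoding used in the 2001 testbed, and the sample strings are invented for the example.

```python
import unicodedata

# Two code point sequences that render identically but differ in binary form.
composed = "caf\u00e9"      # 'e with acute' as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

same_before = composed == decomposed

# Canonicalization step: normalize both inputs to one form (NFKC here).
canon_a = unicodedata.normalize("NFKC", composed)
canon_b = unicodedata.normalize("NFKC", decomposed)
same_after = canon_a == canon_b

# Transformation step: encode the normalized label into the ASCII subset
# allowed in host labels. The "idna" codec stands in for an ACE here.
ace = canon_a.encode("idna")

print(same_before, same_after, ace)
```

Without the normalization step the two inputs would be distinct names at the binary level; after it, both resolve to the same ASCII-only label, so the deployed DNS infrastructure sees nothing outside its usual character range.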
Verisign | Methods one and two above involve an application sending binary (i.e., non-ASCII) data through an infrastructure not designed to handle it, which certainly has the potential to cause problems. Application protocols, such as SMTP, call for domain names to be encoded in ASCII. Not all DNS resolvers and name servers are "8-bit clean" (i.e., able to handle binary data without issues). The deployed base is huge, with endless combinations of components, and it is impossible to test every scenario for its ability to handle binary data. We do not know of any completed studies, although MINC is planning such testing. (Please see http://www.minc.org/WG/testing/interop/.) For this reason, the IETF Internationalized Domain Name (IDN) Working Group has focused on the Internationalized Domain Names in Applications (IDNA) solution, which involves transforming internationalized domain names (as described in method three above) at the application level, so that they can be sent in application protocols and through the Internet's DNS infrastructure in a known safe format. |
Neteka | Brute Force Approach - The advantage of this approach is that multilingual names could be deployed immediately at the server end to parse the multilingual name information and be reachable by a good percentage of users over the Internet. However, it also has a considerable number of disadvantages that could cause inconsistent responses. These include character encoding conflicts as well as proxy and application blockages. The character encoding conflict is particularly prominent. The same encoding value could represent entirely different characters if a different encoding scheme is used. Conversely, the same characters might be represented with different encoding values under different schemes. Both of these issues cause problems for the DNS, where names must be unique and the user expects to be transported to the same domain regardless of their input mechanism. Protocol Extension Approach - The common advantage of using a protocol approach is that the efficiency of the DNS is not compromised at all and that there will be no ambiguity as to the exact characters a domain name query is referring to. Also, with the introduction of an extension, versioning and future extensions could also be built in. In essence, a protocol extension approach is generally considered a better long-term solution for multilingual domain names. The common disadvantage of the protocol approach, however, is that it requires changes and upgrades at both the server end and the client/application end. This may result in slower adoption of the system. ASCII Conversion Approach - The most prominent advantage of using an ASCII conversion scheme is that no changes are necessary at the server end, because servers will continue to expect and react to requests that are formatted within the specifications of the original DNS standards. 
Conversely, the major disadvantage is that users that wish to use multilingual domain names must consciously upgrade their software to be able to reach the multilingual domains. The average user however is not likely to be technically sophisticated and would expect multilingual domain names to function the same as English only ones. Also, it introduces an additional procedure in domain resolution and takes away the feature of the DNS to keep the transportation format consistent with the presentation format of domain names. Hybrid Solution - It makes most economical sense for implementers to tackle the issue with an all inclusive hybrid approach because the efforts in development of the solution will not become totally emaciated. On one hand, the inclusion of a brute force approach ensures that once multilingual domains are deployed, a good number of users could immediately be able to access and utilize these names. On the other hand, more alert or early adopters would likely embrace the ACE technology and already have converters installed, therefore, to take care of these requests, the database should include an ACE formatted record. Finally, the system should be made aware of eventual protocol approach where the incoming packet would effectively announce the exact encoding scheme and format of the multilingual name. By developing a three-fold strategy, the implementer may be able to assure that it will be prepared for any situation that might transpire out of the dynamic standardization process now underway. As the Internet matures, it should no longer be a purely technology push mechanism for implementing new features, but should also consider the customer pull factor. In the hybrid deployment model, first the brute force approach is used so that registries could begin allowing registrants to obtain functional multilingual domains and use them immediately without any client-end reconfiguration. 
Only the registry name servers and the registrant's hosting server need to be upgraded. As the need for accessing multilingual domains increases, users will become more aware of and knowledgeable about the ACE approach, which provides a good consolidation towards a common protocol and makes administration much easier. Eventually, this would encourage middleware and other applications to upgrade to the protocol approach to make the entire process much more efficient and truly multilingual aware. |
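The character encoding conflict that Neteka describes for the brute force approach can be shown with a small Python sketch. The byte values and the sample character are assumptions chosen for illustration; the codecs are those shipped with Python's standard library.

```python
# One two-byte value, interpreted under two different local encoding schemes.
raw = b"\xa4\xa4"
as_big5 = raw.decode("big5")        # read as Traditional Chinese (Big5)
as_gb2312 = raw.decode("gb2312")    # read as Simplified Chinese (GB2312)

# Conversely, one character, represented under the same two schemes.
char = "中"
big5_bytes = char.encode("big5")
gb2312_bytes = char.encode("gb2312")

print(as_big5, as_gb2312)
print(big5_bytes.hex(), gb2312_bytes.hex())
```

A server that receives only the raw octets, with no indication of the encoding scheme, cannot tell which of the two readings the user intended, which is the source of the inconsistent responses mentioned above.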
Register.com | The advantage of 8-bit character transmission is that these approaches seem to be the most simple and elegant solutions. These approaches allow fairly direct representations of IDNs and may allow DNS data to be human-readable for those with terminals capable of recognizing and displaying the relevant character encodings. Unfortunately, although the DNS protocol itself allows for the transmission of 8-bit domain name data, many of the application protocols that rely on the DNS were not designed to handle such domain name data, and these protocols would likely need to be individually re-designed in order to provide IDN capability. ACE-based approaches generally provide a high degree of compatibility, because they continue to use RFC 1034 compliant labels to represent all DNS data. Some ACE-based approaches have been designed which move all IDN work to the application or local resolver, and as a result require no modification of the name servers which are currently running. Such an approach allows individual users to essentially "opt-in" to the use of IDNs by installing updated software on their computers without impacting other users or affecting the stability of the network at large. The more radical approaches mentioned above offer the potential for significant elegance and potentially large amounts of innovation, but the time to implement such solutions is likely to be unacceptably long. |
JPNIC | Strengths: It can be realized with current protocols. Weakness: It reduces the string size of each label. It requires character set / encoding conversion. |
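The weakness JPNIC notes, that an ACE reduces the usable length of each label, follows from the 63-octet label limit in the DNS. The sketch below measures it using Python's built-in `punycode` codec and the `xn--` prefix. Both were standardized after this survey and are used here only as a stand-in for an ACE; the encodings under discussion in 2001 expand labels by different amounts. The helper name `ace_length` and the sample labels are hypothetical.

```python
MAX_LABEL_OCTETS = 63  # per-label length limit in the DNS


def ace_length(label: str) -> int:
    """Octets needed for the label as 'xn--' plus its Punycode form."""
    return len(b"xn--" + label.encode("punycode"))


short = "日本語"
short_ace_len = ace_length(short)

# A label of 63 distinct ideographs: 63 characters for the user, but more
# than 63 octets once converted, so it cannot be carried in a single label.
long_label = "".join(chr(0x4E00 + i) for i in range(63))
fits = ace_length(long_label) <= MAX_LABEL_OCTETS

print(len(short), short_ace_len, fits)
```

Every non-ASCII character costs at least one output octet on top of the fixed prefix, so the number of native characters that fit in a label is always smaller than 63.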
TWNIC | Test bed case has the following strengths and weaknesses. Strengths:
Weaknesses:
|
3. Are there more problems relating to particular scripts? Why?
WALID | The IETF IDN Working Group and the Unicode Consortium have been investigating the complexities associated with introducing non-Latin scripts into the context of DNS hostnames, and attempting to ensure that end-user expectations are met. We fully support the work of these two expert organizations in this area. |
Verisign | Experts in these languages and scripts are in the best position to answer this question. |
Neteka | In general, scripts with more local encoding schemes are more problematic initially for quick deployment of multilingual domain names. Other language issues are local script dependent. For example, there is the traditional and simplified Chinese issue. Part of the debate is whether a folding or mapping should occur automatically and be built into the IDN protocol. This, coupled with conflicting local character encoding schemes, also makes the deployment of Chinese, Japanese and Korean scripts more difficult. Neteka's perspective on the Chinese character folding issue is that it should be a policy matter, controlled during registration and dependent on the registry policy. ICANN should however provide guidelines as to what the issues are and suggest a number of alternatives to solve the problem. Other languages also have their own issues, such as Arabic, where spaces within phrases change the meaning and the form of a character, and Hebrew, where characters could be omitted. |
Register.com | Essentially, the more different scripts are from traditional Latin scripts, the more likely problems are to occur. Languages such as Chinese and others that use the Han ideographs can be problematic due to the sheer size of the character repertoire. Some languages have a large number of encodings to represent essentially the same character set, which can make it problematic to identify and transform raw data into a common, universally understood format. |
JPNIC | ACE requires Unicode as its base character set, but many PCs use a local character set such as JIS. This causes normalization problems, because character set conversion is a 1-to-N mapping. |
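One well-known instance of the 1-to-N conversion problem JPNIC mentions can be sketched in Python: the same two-byte JIS code is mapped to different Unicode code points by different conversion tables. The choice of byte sequence and of the two codecs (`shift_jis` and `cp932`, both in Python's standard library) is an assumption made for illustration, not something taken from the JPNIC response.

```python
import unicodedata

raw = b"\x81\x60"  # one two-byte code in a Shift-JIS encoded string

# Two widely used conversion tables for the same local character set.
via_jis_table = raw.decode("shift_jis")
via_ms_table = raw.decode("cp932")

# Compatibility normalization does not bring the two results back together.
norm_jis = unicodedata.normalize("NFKC", via_jis_table)
norm_ms = unicodedata.normalize("NFKC", via_ms_table)

print(f"U+{ord(via_jis_table):04X}", f"U+{ord(via_ms_table):04X}")
print(norm_jis == norm_ms)
```

Two users typing the same key on differently configured systems could therefore submit different Unicode strings, and hence different ACE labels, for what they regard as the same name.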
TWNIC | The second byte of a Big5-encoded character can fall within the ASCII range, so DNS software, which does not distinguish case in ASCII letters, may return erroneous data.
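The Big5 issue TWNIC raises can be reproduced with a short Python sketch. It assumes a name server or resolver that applies ASCII case folding to raw label octets; `bytes.lower()` is used to imitate that behavior, and the byte sequence is an arbitrary example whose second byte happens to be an ASCII uppercase letter.

```python
# A two-byte Big5 sequence whose second byte is 0x41, the ASCII code for 'A'.
original = b"\xa4\x41"

# bytes.lower() changes only ASCII letters, imitating DNS software that
# folds case on raw octets without knowing they belong to a Big5 character.
folded = original.lower()

before = original.decode("big5")
after = folded.decode("big5")

print(original.hex(), folded.hex())
print(before, after, before == after)
```

After folding, the octets decode to a different Big5 character, so a query for one name can match, or be answered with, data for another.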
4. To the extent there are weaknesses in the technologies, what groups are working to develop solutions?
WALID | The group most active in addressing the need for technical standards to support IDNs is the Internet Engineering Task Force (IETF). The IETF IDN Working Group has made considerable progress in the past year in defining an overall set of technical and operational requirements for internationalized domain names, has vetted a broad set of technical proposals, and has chosen an approach consistent with those requirements. Many IETF participants are also active in the Unicode Consortium, the W3C, and other standards bodies, and the IETF IDN Working Group and IDN community as a whole benefit from their experience, coordination, and support. The Multilingual Internet Names Consortium (MINC) has also been active in developing policy in the IDNs area, as well as providing a forum for performing interoperability testing. While this has been somewhat less successful, MINC provides a good forum for representing the interests and needs of its broad constituency and can support the efforts of the IETF and other technical standards bodies. As MINC moves forward with its mandate, we expect to see MINC play an important role in supporting the deployment of internationalized domain names and in promoting cooperation, compliance, and interoperability between the systems that are deployed today. |
Verisign | We share the opinion of others in the IETF IDN Working Group that the issue that should be tackled is internationalizing domain names, not internationalizing the DNS protocol. Thus the issue is broader than some "quick fixes" or partial solutions advocated by some technology providers, such as simply upgrading DNS clients and servers. Any complete IDN solution must involve end-user applications, such as web browsers, as well. The IDN Working Group is developing standards for IDN and is, we believe, the primary focal point for a complete solution. |
Neteka | Neteka's DNSII (www.dnsii.org) and OpenIDN (www.openidn.org) initiatives encourage and allow more people to be involved in this important transition of one of the core technologies of the Internet. DNSII is a forum for discussing different multilingual approaches and currently archives Neteka's proposals. OpenIDN is an open source multilingual DNS, allowing interested parties to try out multilingual names and to use the source code to enhance the features on their own. IETF is mainly concerned with the protocol and tries to determine which approach to use and what the eventual format should look like. MINC is a quasi-iDNS initiative started by iDNS advocates. The discussion includes both protocol issues as well as language or policy issues. In Neteka's perspective, both these functions are already carried out by IETF and ICANN, and the responsibility should really go back to these two bodies for a more comprehensive view of the problems and therefore better results. |
Register.com | The IETF continues to work on a variety of issues surrounding the IDN problem space. |
JPNIC | JPNIC IDN Taskforce, JP-CN-KR-TW NIC's Joint Engineering Team and IETF IDN WG. |
TWNIC | TWNIC Chinese technology task force, CDNC, JET, IETF, etc. |
5. What are the different solutions under consideration? Which are the most promising? How much longer will it take to develop a solution that works?
WALID | The current proposal before the IETF IDN working group is "Internationalizing Domain Names in Applications (IDNA)." From a technical standpoint, we understand that the WG has established rough consensus around the core concepts of normalization and transformation taking place within the application. Assuming that certain non-technical issues are resolved, the IETF could have a standard ready by the end of 2001. The consensus in the IETF IDN working group is not complete, however, and some have suggested that the working group is failing to consider questions relating to language and language use, and the expectations of end-users of the DNS. While these questions are certainly important, we are not convinced that they concern issues that can or should be solved by the DNS. Many participants feel that these questions are outside of the scope of the charter of the IETF IDN working group, which is focused on enabling use of non-Latin scripts in the DNS, and thus should be addressed separately. |
Verisign | The work of the IDN Working Group is public; more information is available at http://www.i-d-n.net/. The most promising proposal is called IDNA (Internationalized Domain Names in Applications), which calls for applications to convert IDNs to an ASCII-only "safe" format using an ACE (ASCII Compatible Encoding). More details are available in the IDNA Internet-Draft at http://www.i-d-n.net/draft/draft-ietf-idn-idna-01.txt. |
Neteka | IETF - while a good number of proposals have been presented to the IETF, until recently discussions centered on the IDNA (IDN in Applications) approach. This however collides with a patent issued to Walid. Recent discussions have included ways to work around the patent as well as hybrid approaches. Neteka - Neteka is a proponent of a hybrid approach to ensure that the migration is transparent to the end user and smooth for the operators. We believe this is the most promising approach in that it already works for the majority of the people on the Internet immediately. It also provides a clear path towards the longer-term approach where the entire Internet will become fully multilingual aware. Neteka's system is also compatible with email addressing systems, and Neteka already has the technology to introduce multilingual email addresses. iDNS - the iDNS Proxy solution assumes that multilingual domain names are redirected to the iDNS servers for resolution. This creates a bottleneck for the system and introduces unnecessary complications. WorldNames - as far as Neteka understands, WorldNames' NUBIND, currently implemented at the dotNU registry, is essentially a redirector technology, and multilingual names registered using this system could not be utilized for email addresses. |
Register.com | As indicated above, there are a number of solutions currently under consideration. Currently, the [IDNA] solution proposed within the IETF's IDN working group seems extremely promising; recently, however, intellectual property concerns have slowed the development of that particular approach. More generally, ACE-based solutions seem to have the greatest traction and operational experience to date, and the advantages that they yield in backwards compatibility are probably a strong argument in their favor. A final solution to this problem space still seems to be at least six months away. |
JPNIC | None other than the above is thought of. |
TWNIC | (1) UNAME:
(2) It depends on when the IETF finalizes the RFC; after that, it would take 1 or 2 years. |
6. Currently there are no accepted standards for IDN. Is this because there are competing technologies, or because the underlying problem is sufficiently difficult that a "best" solution has not yet emerged?
WALID | WALID believes that competition is a healthy and necessary part of the development of any emerging industry, and a useful tool for providing real-world experience concerning the viability of various approaches to solving a given technical problem. The 'IDN Subject' is certainly a complex one, and some have characterized it as one of the most difficult challenges that the Internet technical standards community has faced. The IETF and other standards bodies have made extremely good progress in addressing it within a relatively short time. |
Verisign | The IETF IDN Working Group is moving relatively quickly to produce an IDN standard. |
Neteka | While competing technologies imply that there is no de facto standard, it is because some initial attempts are not satisfactory that competing technologies arise. This is therefore a multifold issue: first of all, there are differing opinions on what the "best" solution should be. The underlying problem is sufficiently difficult in that there have to be compromises, and a decision can only be made by giving more consideration to some key issues and focusing less on others. Unfortunately, it is very difficult to build a consensus on which of the many issues these "key issues" should be. There are really three main camps:
|
Register.com | The real problem is that there is no ideal solution. All proposed solutions to date have drawbacks, and it has been difficult to develop consensus about which of these drawbacks is the most tolerable. The underlying problem is indeed an extremely difficult one, and even if a "best" solution has emerged, it will take time and careful study in order to recognize and adopt it. Also, because of the critical nature of the DNS to the Internet community, it is important to develop an in-depth understanding of the pros and cons of all possible solutions, and to move towards adoption in a manner that does not jeopardize the stability of the Internet. |
JPNIC | The IETF IDN WG has come to consensus, as noted in our answer to Q1, so the WG is finalizing the proposed technologies and is going to take the result onto the standards track. The most worrying hurdle in the process is WALID's patent issue. |
TWNIC | Both of them. |
7. Do the existing "testbeds" and pre-registrations help or hinder the resolution of the technical issues relating to IDN? In what manner? Would the testing impact the ongoing operation of the Internet?
WALID | Testbeds are an important mechanism for gathering useful operational experience in this area, and help to gauge demand and user expectations for IDNs. Some of the testbed projects underway have been very careful to not disturb the use of the existing DNS, while others have not been as focused on the operational stability requirements of the network. |
Verisign | A testbed that supports the IDN standard development process, such as VeriSign's testbed, is helpful. For example, the VeriSign GRS IDN testbed has offered technical feedback to the IDN Working Group on the complexity of the Row Based ASCII Compatible Encoding (RACE) algorithm (one possible ACE). Partially based on this feedback, the IDN Working Group has decided that RACE is not suitable for use in the eventual IDN standard. In addition, the VeriSign testbed has been conducted in a progressive, phased approach. This allows for the completion of predefined milestones before moving to subsequent phases and thereby reduces the possibility of creating DNS stability problems. It is difficult to imagine how a testbed could interfere with the operation of the Internet. It is highly unlikely that even a testbed that uses domain names in a binary format (unlike VeriSign's testbed) would negatively impact the Internet's DNS infrastructure (including the root and gTLD name servers). Because so many applications already send DNS queries in one binary format or another, the root and gTLD name servers are already deluged with such queries as part of the normal DNS resolution process, all with no impact aside from the additional volume. |
Neteka | Depending on how the domain resolution strategy is eventually deployed, pre-registrations should not hurt the introduction of multilingual names. Other so-called functional "testbeds" may hinder the progress, especially through the establishment of an alternative namespace beyond that recognized by ICANN. This is a very serious issue, as these "testbeds" would redirect all multilingual requests to their own alternative namespace, meaning that even if the existing namespace later introduces multilingual names, the requests under the "testbed" system will be routed to the alternative namespace, causing confusion. Pre-registration however is safer, as it essentially means that the multilingual name is only stored in a database and not being used. Any technical solution could be deployed later for domain resolution. It also serves as an indicator of user demand. Even when users know that these names do not work, a lot of people are registering for them in the hope that they will be able to use them soon. Beyond the "testbeds" and pre-registrations, Neteka in fact views the faulty implementations in existing browsers and unnecessary blockages at proxies, cache servers and firewalls as an even larger hindrance to the implementation of multilingual domain names. Please refer to section A:16 for more information. |
Register.com | The operational experience gained from legitimate testbeds can be extremely helpful in moving solutions from theory into practice. Due to the large number of commercial interests in play, however, some of these testbeds might be seen as attempts to force the Internet community to accept certain technologies regardless of their appropriateness or quality. Generally, policy considerations have lagged behind technology in the IDN space, and as a result there have sometimes been inadequate assurances that testbeds serve the Internet community by providing valuable operational experience as opposed to benefiting certain commercial interests at the expense of technology. Generally, testbeds should not affect the ongoing operation of the Internet. It is important, however, that end users' expectations of these testbeds be managed carefully: these users may be under the impression that the testbeds are an operational portion of the Internet, and may view technical failures within the testbeds as operational problems rather than a normal part of the testing process. |
JPNIC | They provide a lot of 'real samples' for evaluating proposed technologies such as ACEs, which are useful for enumerating issues. The impact on operation is that DNS or Web server operators must learn how to convert IDNs to ACE. Testing provides a good opportunity to learn it. Testing also provides much information about IDN to end-users, engineers, developers, and service providers. |
TWNIC | Commercial promotion on a test bed product is not good. It is better to provide service until the standard of IDN is ready. But if the local testbed does not influence the Internet stability, it would be helpful for IDN development. |
8. Are natural languages so complex, rich and varied that a true IDN system that responds completely to user expectations is beyond current technological capability? Can the problem be solved incrementally in a manner that does not interfere with the operation of the entire domain name system?
WALID | The IDN problem in our view is not one of natural language, but rather one of adding support for a wider range of scripts to be used as identifiers in the DNS. As such, issues involving natural language and the often context-sensitive expectations of users are outside of the scope of the IDN-related efforts currently underway. Some within the community have proposed creating directory service layers above the DNS to meet the expanding needs, and we strongly encourage and support any work in this direction. Natural language issues are language- and locale-specific, and any proposals to address them should be developed based on participation by native members of the locale as well as general linguistics expertise. |
Verisign | The IDN Working Group is already developing a technical solution to support a true IDN system. As noted above, we support the introduction of IDNs in a phased manner that does not risk interference with the operation of the DNS. |
Neteka | This depends on the perspective of what constitutes a "domain name". Some technical persons maintain that a domain name is nothing more than a string of characters for the identification of a resource over the Internet. Neteka however believes that domain names have evolved from their origins to represent the identity of a person or a corporation on the Internet, whether used as part of an email address or simply as a web address. Natural language rules can definitely be introduced to the DNS; Neteka's technology has shown that the use of phrases, punctuation and even spaces is possible. Therefore a fully natural language domain name is possible. However, it is important to also understand that the domain name system is useful because of unique names, and this rule should not be violated or confusion would occur. The same phrase must result in the same resource regardless of which locale or platform it is accessed from. This means that some user education is required to understand that Mikeshoes.tld may not be the same Mikeshoes as the one in your local mall. |
Register.com | The domain name system was never intended to serve as a directory service with the capability to consistently find the appropriate result to a natural language query. Although the original design of the DNS includes certain characteristics which are designed to reduce language-related errors (for example, case folding, or even the original limitation of domain names to use only ASCII characters), it still is not capable of distinguishing between variants of words (e.g. "color" versus "colour") or appreciating the other subtle nuances of language. Regardless of the IDN solution that eventually emerges, it will be important to educate users regarding the use of the Internet. A good IDN solution will not solve natural language problems, but will allow many more users to take advantage of the Internet using their native language and their native character sets. |
JPNIC | We believe that IDN doesn't introduce 'languages' to DNS, but introduces non-alphanumerical scripts or characters. |
TWNIC | Usually natural language will not be used as a domain name; users may use natural language in a search engine to find data. But proper normalization in the DNS is required, even though it is very difficult. |
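The distinction several respondents draw between script support and natural language can be illustrated with a minimal sketch. It assumes a hypothetical normalization step (Unicode NFKC followed by lowercasing); the survey responses do not specify any particular normalization, and the function names here are illustrative only. It shows that case folding and normalization can make equivalent encodings of the same characters match, while spelling variants remain distinct names.

```python
import unicodedata


def ascii_fold(label: str) -> str:
    """Case-fold a label by lowercasing, mimicking case-insensitive DNS comparison."""
    return label.lower()


def normalize_label(label: str) -> str:
    """Hypothetical normalization step (an assumption): NFKC, then lowercasing."""
    return unicodedata.normalize("NFKC", label).lower()


composed = "caf\u00e9"     # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

print(ascii_fold("Example") == ascii_fold("EXAMPLE"))            # True
print(composed == decomposed)                                    # False
print(normalize_label(composed) == normalize_label(decomposed))  # True
print(normalize_label("color") == normalize_label("colour"))     # False
```

Normalization makes the two encodings of the accented name resolve to one identifier, but it does nothing for "color" versus "colour", which is the kind of natural-language question the respondents place outside the scope of IDN.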
9. How do different technologies affect the size limitation of domain names? What, if any, are the possible solutions?
WALID | Domain name segments are limited to 63 octets per segment, and an overall domain name length of 255 octets. In the context of the ACE-based proposals, Unicode codepoints can expand to multiple octets, thus reducing the number of actual non-Latin characters that can be used in a domain name. Even in non-ACE proposals (particularly those that rely on UTF-8) this same issue exists. There are a number of proposals under consideration by the IETF IDN working group to address this issue through efficient encoding of Unicode sequences. The challenge in this area is to find an encoding algorithm that is both very efficient yet relatively simple to describe and implement. The current draft before the IETF IDN Working Group ACE design team comes very close to meeting these requirements. |
Verisign | Domain names are limited to 255 octets in length and individual labels (i.e., between periods) are limited to 63 octets. This is a fundamental limitation of the DNS protocol and cannot be changed without altering the DNS protocol. Different representations of different character sets require more or fewer octets depending on their design. For example, UTF-8 is a variable length encoding of the Unicode character set. In a given number of octets, some scripts require more space than others. The IDN Working Group has been sensitive to this issue during the design of the various ACE algorithms that are candidates for inclusion in the final IDN standard. A requirement of the final ACE algorithm is a roughly equal treatment of all scripts in Unicode. |
Neteka | Brute Force Approach - utilizes the existing packet format and therefore allows a maximum of only 63 bytes. Depending on the byte length of the character encoding scheme used, the number of characters possible could range from 63 down to 15. Protocol Extension Approach - a new size limit could be introduced, so length can become a non-issue. ASCII Conversion Approach - utilizes the existing packet format. Depending on the compression scheme, domain length per label ranges between 15 and 20 characters. |
Register.com | Because they transform eight bit characters into what is approximately a five bit (37 possible values) storage format, ACE-based solutions generally reduce the number of native characters that may be present within a single DNS label. Most of the existing ACE proposals contain compression mechanisms in order to increase the size of the native domain name as much as possible. |
JPNIC | As answered in Q2, ACE reduces the size of each label. Therefore ACE must involve an effective compression algorithm. JPNIC is evaluating many ACEs and contributing to the IDN WG ACE team. |
TWNIC | ACE encoding imposes a tighter length limitation. Native encoding (a local encoding such as Big5) imposes less of a length limitation on domain names. |
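The arithmetic behind these answers can be sketched as follows. The 63-octet label limit and the 37-value alphabet (letters, digits and hyphen) come from the responses above. The 16-bit code unit, the 5 usable bits per ACE character and the 4-character prefix are assumptions chosen for illustration and do not describe any specific ACE proposal.

```python
import math

MAX_LABEL_OCTETS = 63  # per-label limit cited by the respondents


def label_octets(label: str, encoding: str = "utf-8") -> int:
    """Number of octets a label occupies in the given encoding."""
    return len(label.encode(encoding))


def max_chars(octets_per_char: int, budget: int = MAX_LABEL_OCTETS) -> int:
    """Characters that fit in one label at a fixed width per character."""
    return budget // octets_per_char


def ace_capacity(code_unit_bits: int = 16, bits_per_ace_char: int = 5,
                 prefix_len: int = 0) -> int:
    """Uncompressed code units that fit in one ACE label (illustrative assumptions)."""
    usable_bits = (MAX_LABEL_OCTETS - prefix_len) * bits_per_ace_char
    return usable_bits // code_unit_bits


print(label_octets("example"))                # 7
print(label_octets("пример"))                 # 12 (2 octets per Cyrillic letter)
print(label_octets("日本語"))                  # 9  (3 octets per character)
print([max_chars(n) for n in (1, 2, 3, 4)])   # [63, 31, 21, 15]
print(round(math.log2(37), 2))                # 5.21 bits per ACE character at most
print(ace_capacity())                         # 19
print(ace_capacity(prefix_len=4))             # 18
```

Under these assumptions the results agree with the ranges quoted above: 63 down to 15 characters for native or UTF-8 style encodings, and somewhat under 20 characters per label for an uncompressed ASCII conversion, which is why the respondents stress compression.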
10. Do IDNs pose special problems for the technical operation of WHOIS databases? If so, what problems? What are the possible solutions?
WALID | Access to the WHOIS public registration databases tends to be provided in two ways: via web-based interfaces, and through the TCP port 43 whois/nicname service. One of the challenges for operating a WHOIS database will be in ensuring that queries arrive in a form that can be accurately matched against the database contents. WALID considers that a positive solution would be to use the IDNA approach and upgrade the deployed 'WHOIS' clients. These upgraded applications would need to normalize and transcode IDNs into their ACE equivalents, and then use the transformed name as the query to the WHOIS server. This is a strength of the IDNA approach, in that it addresses not only the question of IDNs in the DNS, but also in all of the applications, such as WHOIS, which use domain names as application protocol elements. |
Verisign | No, although WHOIS services must be internationalized if the domain names they hold are internationalized. One possibility is internationalizing the WHOIS protocol itself, along with clients and servers. Another is adopting the IDNA approach: IDNs would be stored in an ACE format and WHOIS clients would be required to convert internationalized user input into ACE format before querying a WHOIS server. VeriSign GRS is presently developing an IDN Whois service. In the interim, an IDN conversion tool is provided. |
Neteka | Multilingual domain names should not present special problems beyond those encountered by the domain name server. Depending on the approach used, however, WHOIS databases may need to be upgraded to handle multilingual requests. For example, if a protocol extension approach is used, the WHOIS side should determine whether the mode bit is required or whether it should force all requests into a standardized format. |
Register.com | Generally, IDN problems should not significantly affect the operation of the WHOIS database. It may be necessary to display WHOIS data in non-Latin scripts, but this problem can largely be viewed independently of the IDN effort. |
JPNIC | The problems of WHOIS are expression in queries and in display. The short-term solution is to update the whois client to be IDN-aware. The long-term solution is to improve the WHOIS protocol. |
TWNIC | Some WHOIS databases cannot accept clean 8-bit data or queries. The problem could be solved if the IETF finalizes a standard for WHOIS database support of IDN as soon as possible. |
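The client-side conversion that WALID and VeriSign describe (normalize, transcode to ACE, then query) can be sketched as below. This is an illustration under stated assumptions, not the mechanism any respondent deployed: it uses Python's built-in `idna` codec, which implements the later IETF IDNA standard and its ACE rather than the drafts under discussion in 2001, and the helper names and any server host passed to them are hypothetical.

```python
import socket


def to_ace(domain: str) -> str:
    """Convert a domain name to its ASCII-compatible form.

    Assumption: Python's built-in 'idna' codec, which prepares and
    encodes each non-ASCII label and leaves ASCII labels unchanged.
    """
    return domain.encode("idna").decode("ascii")


def build_whois_query(domain: str) -> bytes:
    """Build the bytes a port 43 whois client would send: the query plus CRLF."""
    return (to_ace(domain) + "\r\n").encode("ascii")


def send_whois_query(server: str, domain: str, timeout: float = 10.0) -> bytes:
    """Send the ACE form of the query to a WHOIS server (host name is hypothetical)."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


print(to_ace("Bücher.example"))             # xn--bcher-kva.example
print(build_whois_query("bücher.example"))  # b'xn--bcher-kva.example\r\n'
```

Because the server receives only ASCII, an existing WHOIS database that cannot accept 8-bit queries is unaffected; displaying the stored name in its native script remains a separate client-side step.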
11. Are any IDN-related technologies covered by patents or other intellectual property rights? If so, will this have an effect on the implementation of IDNs?
WALID | We understand that there are a number of granted patents and patent applications that cover various areas relating to internationalized domain names, including U.S. Patent No. 6,182,148, which was issued to WALID, Inc. on January 30, 2001, a related PCT application by WALID, and at least one pending patent application by i-DNS.Net. We consider that intellectual property rights need not impede implementation of IDNs, and may even encourage a more rapid adoption of a single and optimal technical standard. Regarding WALID's patent and PCT application, we supplied the following IPR Statement to the IETF on November 3rd, 2000. We understand that this statement is in accordance with many such statements that have been filed with the IETF by numerous companies in the past: Pursuant to the requirements of RFC 2026, Section 10 ("INTELLECTUAL PROPERTY RIGHTS"), WALID, Inc. ("WALID") gives notification to the IETF Secretariat that one or more patent applications relating to a METHOD AND SYSTEM FOR INTERNATIONALIZING DOMAIN NAMES have been filed. Should the implementation and practice of any part of an IETF standard relating to the above subject matter require the use of technology disclosed in any granted WALID patent, WALID is prepared to make available, upon written request, a non-exclusive license under reasonable and non-discriminatory terms and conditions, based on the principle of reciprocity, consistent with established practice. For any questions regarding WALID intellectual property and license, please contact: J. Douglas Hawkins |
Verisign | Several companies have patents surrounding the IDN space. WALID, Inc. has notified the IETF of a patent that may cover the work of the IDN Working Group. The working group is currently taking this patent into account as it decides whether or not to proceed with the IDNA solution. |
Neteka | Neteka understands that there are at least the following three patented approaches: Neteka - Parts of Neteka's multilingual technologies are patent pending and are submitted as Internet drafts to the IETF and archived both at the IETF site and at http://www.DNSII.org. Neteka's technology however is available as open source and is freely available at http://www.OpenIDN.org. This ensures that even if Neteka's technology is used, the Internet community is guaranteed to have a freely available source of the technology for their utilization. Walid - In essence, Walid's technology is a client-side or pre-DNS-server ASCII conversion approach. Neteka's understanding is that the patent covers a technology that intercepts multilingual requests sent from the client, converts the multilingual characters into an alphanumeric form acceptable to the existing DNS, and reformulates the request to carry this alphanumeric string before sending it to existing DNS servers for domain resolution. Servers therefore do not need to be upgraded, as requests remain in ASCII format. iDNS - To Neteka's knowledge, iDNS utilizes a proxy solution that performs a similar interception of multilingual domain names to that prescribed by Walid. However, the conversion and detection are done in a proxy server beside the domain name system. All requests must first go through this proxy before going through a DNS resolution process. |
Register.com | Several companies claim intellectual property rights over various portions of the IDN solution space. These claims could affect the implementation of IDN if groups such as the IETF make decisions regarding whether or not to use a technology based on its IPR encumbrances, or if the holder of intellectual property rights regarding a particular solution seeks to prevent others from using the technology. |
JPNIC | JPNIC does not hold any patents on IDN-related technologies. |
TWNIC | ACE covered by Walid's patent is an obvious example. It will have an effect on the implementation of IDNs, but TWNIC does not use an ACE solution at the current stage. |
12. Are you participating (or have you participated) in the IETF standards process for IDN?
WALID | WALID has been an active participant in the IETF IDN Working Group, and has submitted Internet-Drafts supporting the Working Group's efforts. However, in conformity with RFC 2026 Section 10, WALID has not proposed any of its proprietary technology to the IETF for inclusion in a standard, and WALID participants in the IETF were vigilant to avoid making any contribution related to our patent application to the IDN Working Group before filing our IPR Statement on November 3, 2000. |
Verisign | VeriSign GRS is an active participant in the IETF standards process, including the IDN working group. |
Neteka | Yes, Neteka is actively participating in the IETF IDN working group and has submitted three Internet drafts as proposed solutions for multilingual domain names. These are also archived at the DNSII site. |
Register.com | We are participants within the IETF standards process for IDN. |
JPNIC | Yes, we are. We have participated in the IETF IDN WG from its very beginning. |
TWNIC | Yes, we have attended IETF IDN WG meetings several times, and there is an IETF IDN WG status update at every JET meeting. |
13. Once IETF adopts an IDN standard, how quickly will it be incorporated into applications such as browsers? Are any problems with this incorporation anticipated? What can the IETF and ICANN do to facilitate the incorporation process?
WALID | Should an ACE-based approach to IDNs be chosen by the IETF and accepted by the Internet community, we would expect that major application suites could be upgraded within a few months of the adoption of the standard. In order to ensure rapid adoption, ICANN could move swiftly to endorse and support the standard with a policy focused on encouraging consensus and interoperability in this area. In the short-term, end-users are going to demand enabling software to resolve IDNs immediately. ICANN can reduce the potential for fragmentation during the period before the final standard is issued by encouraging the distribution and adoption of these enabling technologies. If a non-ACE-based solution were to be chosen, we would expect to see a much slower deployment and adoption cycle. Many experts within the IETF believe that an infrastructure-based solution could take as long as eight to ten years to fully deploy, and we would expect to see a significant amount of fragmentation and non-interoperability in the area of IDNs as a result. |
Verisign | Only application developers can answer the first two parts of this question. The IETF can facilitate the process by developing an IDN standard in a timely manner. ICANN can facilitate the process by supporting the IETF's efforts and the eventual standard. |
Neteka | The speed of adoption will depend on the solution chosen and the intellectual property rights (IPR) issues surrounding it. Existing browsers have already implemented some measures for multilingual domain names, albeit often faulty and problematic ones; it is therefore likely that a patent-protected approach would not be embraced by the browser community. Furthermore, Neteka believes that regardless of the standard adopted, there needs to be a transition period, and registries will have to embrace a solution that lets them immediately deploy multilingual domains usable by most of the people on the Internet. This would very likely mean a hybrid solution more or less like that described in section A:1. |
Register.com | The speed at which IDN is adopted into applications may depend on the particular IDN solution that is adopted by the IETF. Some approaches are easier than others to implement at the application layer, and as a result would likely see faster uptake by application developers. |
JPNIC | Deployment of IDN-aware applications depends heavily on two things: an IDN toolkit and a definition of IDN in application protocols. Once a toolkit is available, applications such as telnet or ftp that handle hostnames will be easy to develop. But applications such as browsers or mailers that handle domain names within an application protocol will not. The IETF or another organization such as the W3C should define how IDNs are treated in application protocols. ICANN should elaborate criteria for whether each accredited registry properly adopts IDN technology. ICANN should also support the budget for fundamental software such as BIND. |
TWNIC | (1) Perhaps within one or two years. (2) Once the IETF finalizes the IDN standard, vendors will adopt it as soon as possible. |
14. Will the IETF standard be interoperable with other IDN standards? What can be done to eliminate interoperability problems (assuming not all ccTLDs adopt the IETF standard)?
WALID | Given the diverse range of approaches currently deployed to support IDNs, it is impossible for the IETF to issue a standard that provides for complete interoperability with all existing deployments, nor is such an expectation reasonable. Adoption of any technical standard is of course voluntary, and we would expect user and market demands to promote standardization and uniformity in this area. To ensure interoperability during the transition period, WALID is adding support to our WorldConnect system enabler to enable end-users to continue to resolve IDNs that may have been registered using different standards. With a client-based approach such as WALID's, it is possible to support de-facto or national standards in addition to the final standard the IETF recommends. |
Verisign | There are no IDN standards at this time with which an eventual IETF standard could interoperate. There are various IDN experiments, none of which can be expected to interoperate with an IDN standard. We believe compliance with an IETF IDN standard should be a requirement for all ccTLD and gTLD operators now offering IDNs. |
Neteka | Regardless of the solution embraced by the IETF, Neteka's hybrid solution should be able to ensure that interoperability is not a concern. It is already interoperable with some of the ccTLDs' solutions as well as the IDNA solution currently contemplated by the IETF. Should a protocol extension approach be adopted, Neteka's solution is also prepared for it and could consolidate different approaches. In short, there are few interoperability concerns so long as alternative namespaces and unnecessary name checks are not created to complicate this problem. |
Register.com | Due to the wide variety of IDN approaches, it is likely that the IETF standard will not be interoperable with various other IDN approaches. For this reason, it is extremely important that all interested parties be active participants within the IETF process and that registries and registrars do not make irrevocable technology decisions prior to the adoption of a formal standard. |
JPNIC | There is no IDN standard yet. The IETF will standardize only one, so the interoperability of concern lies between IDNs and current domain names. |
TWNIC | I think all the ccTLDs will follow the IETF standard. It would be better to have a dialogue with IDN users when the IETF IDN WG defines the standard. Encouraging IDN users to participate and get involved in the IETF IDN WG would help push forward the formation of IDN standards. |
15. Are there other end user needs concerning IDN that need to be addressed?
WALID | One question that has not been discussed sufficiently concerning IDNs is the use of IDNs in document contexts, such as URLs embedded in HTML or XML documents. End users are going to expect to be able to generate URLs containing domain names in native characters, so the IDNA approach (in its current form) needs to address these issues before it can be considered complete. |
Verisign | This survey appears to address key user needs. |
Neteka | Neteka believes that it is very important for multilingual domain names to be immediately usable by most client systems on the Internet today without requiring client side modifications or plug-ins. This is a very strong demand from all of Neteka's clients and represents the major concern for multilingual domain name registrants and users. The average user is usually not technically sophisticated enough to understand the complexities of multilingual domain names and will be frustrated and confused if multilingual names do not work as expected and in the same way as English names. Beyond multilingual characters, symbols and punctuation are also very important as a component of language. The introduction of multilingual characters opens up the opportunity to introduce some symbols as well, and they should not be excluded. |
JPNIC | The left-hand side of e-mail addresses, the path part of URLs, electronic signatures, and so on. RFC2825 addresses this clearly. The domain name is a fundamental component of communication on the Internet. The end user's requirement is not only to resolve an IDN as a hostname, but also to identify a certain entity on the Internet. |
TWNIC | Backward compatibility and general Internet application adoption. |
16. Are there any other technical issues we should know about?
Verisign | This survey appears to cover the major technical issues. |
Neteka | No matter how multilingual names are deployed, a set of problematic glitches will arise as the transition takes place and as users learn more about these issues. The main reason is that the average user will not immediately understand why they might not be able to access multilingual domain names using their existing system. The causes could range from client side software settings to ISP settings or even handling at the authoritative hosting end. More technically comprehensive documentation on these known issues can be found at http://www.OpenIDN.org. Browser & DNS Client Application Issues - some browsers simply block all entry of domain names, while others try to implement some form of transformation of the name, causing loss of character information that is sometimes irrecoverable. There are four main types of behaviors among the browsers and client side applications when encountering a multilingual domain name:
DNS Resolver Issues - in general, the DNS resolver resides at the ISP level. There are three areas of trouble for multilingual domain names at this level: 1) the ability to match multilingual requests with cached records; 2) the ability to refer the request accurately to its nearest match (TLD/root) server; and 3) the ability to cache the results of multilingual requests. It is very important that these "messengers" in the DNS do not choke on multilingual requests. Because the original DNS protocol itself is 8-bit capable, this middleman level usually simply passes requests along the DNS path; however, proxy and caching issues could complicate matters (Section 4). Authoritative DNS Databases - authoritative DNS databases include root servers, top-level domain (TLD) registry servers and individual domain hosts. While they are critical to the functioning of the Internet, especially root servers and TLD servers, their tolerance of multilingual requests is higher because they seldom perform caching and will implement multilingual names only when they have prepared for it. Multilingual requests to root servers will either be authoritatively dropped because the particular TLD does not exist or be referred to existing ASCII TLDs. Beyond the direct implications of multilingual domain names for the registration system and the domain resolution system, a handful of other peripheral issues arise as multilingual names are introduced to the Internet: Proxy Servers & Cache Servers - first and foremost, proxy servers and cache servers will be affected because they depend on URLs and domain names to function properly. They also contribute to the blocking of multilingual names and thus present a huge barrier to the transparent deployment of multilingual names. A multilingual aware, patched version of Squid is currently available from Neteka. 
Web Servers & Digital Signatures - web servers are next in line to require some work in order to perform accurate virtual hosting, as this also depends on domain names. Digital signatures and certificates are also an area of concern, as they also use domain names as a key identifier. As DNS security is being deployed, this becomes even more important. A multilingual aware web server based on Apache is also available from Neteka. Other Applications & Databases - besides the immediate critical transportation nodes, other applications such as databases that hold domain names and email addresses will have to be taken into consideration. These include customer databases, mailing lists and other directory, search and storage applications. Neteka's API solution for a quick fix for these applications is the NeMate library, which utilizes an ASCII transformation engine to force multilingual names into unique ASCII identifiers without losing character information. |
JPNIC | IANA should define an ACE prefix (ACE identifier) as soon as possible. JPNIC proposed a determination process in draft-ietf-idn-aceid-01.txt. |
TWNIC | 1. Consider modifying BIND and Internet applications to support clean 8-bit (native encoding) and UTF-8 encoding environments, in order to accept IDNs. 2. The technology for converting between Traditional and Simplified Chinese encodings. |
Klensin | The majority of the issues raised here are either protocol-design (or interpretation) or market behavior and analysis ones. They are important issues. But, they, especially the protocol ones, are not going to be settled properly by counting heads or otherwise determining a majority opinion from the community. More generally, I believe that this issue is, with the exception of one area that has, IMO, been persistently dodged, out of ICANN's scope:
|
Probst | 1. Naming
2. Verisign's "Testbed"
3. Verisign and NSI
4. Register.com
5. Client Applications
6. Blocking of MLDs
7. Ease of use of Whois.
8. Open Source
9. MLDs and "alternative TLDs".
|