III. Technical Plan (Including Transition Plan)

C16. The third section of the .org Proposal is a description of your technical plan. This section must include a comprehensive, professional-quality technical plan that provides a full description of the proposed technical solution for transitioning and operating all aspects of the Registry Function. The topics listed below are representative of the type of subjects that will be covered in the technical plan section of the .org Proposal. [ICANN will extensively review and analyze this section of the .org Proposal. The content, clarity, and professionalism of this section will be important factors in ICANN's evaluation of applications. We strongly recommend that those who are planning to apply secure professional assistance from engineers and/or other technical consultants to aid in the formulation of the technical plan and the preparation of the technical plan section of the .org Proposal.]

C17. Technical plan for performing the Registry Function. This should present a comprehensive technical plan for performing the Registry Function. In addition to providing basic information concerning the proposed technical solution (with appropriate diagrams), this section offers the applicant an opportunity to demonstrate that it has carefully analyzed the technical requirements for performing the Registry Function. Factors that should be addressed in the technical plan include:

C17.1. General description of proposed facilities and systems. Address all locations of systems. Provide diagrams of all of the systems operating at each location. Address the specific types of systems being used, their capacity, and their interoperability, general availability, and level of security. Describe buildings, hardware, software systems, environmental equipment, Internet connectivity, etc.

Locations of Systems

Registry services for critical elements of the Internet infrastructure must be housed in world-class facilities. The UIA Team will operate the .org TLD from VeriSign's three major Internet data centers in the continental United States, from which VeriSign currently provides numerous critical Internet services, including:

  • Management of the Internet root zone and operation of 2 of the 13 Internet root servers (a.root and j.root)
  • Management of the .com, .net and .org registries
  • PKI certificate authentication
  • Trusted payment services

Additionally, the UIA Team has partnerships with major collocation facilities throughout the United States, Europe and Asia that provide increased reliability and redundancy for critical Internet services (e.g., nameserver resolution). These partner facilities must conform to rigorous standards and are subjected to a detailed physical inspection prior to being selected. The locations of the data centers and the partner collocation centers are depicted in Figure C17.1-1.

  • Broad Run Internet Data Center - Sterling, Virginia
  • Lakeside II Internet Data Center - Dulles, Virginia
  • Mountain View Internet Data Center - Mountain View, California
  • AOL Data Center - Ashburn, Virginia
  • Internap Collocation Center - Atlanta, Georgia
  • Terremark Collocation Center (NAP of the Americas) - Miami, Florida
  • Internap Collocation Center - Seattle, Washington
  • AOL Data Center - Sunnyvale, California
  • Internap Collocation Center - Los Angeles, California
  • TeleHouse Collocation Center - London, United Kingdom
  • NIC/SE Data Center - Stockholm, Sweden
  • Global Switch Collocation Center - Amsterdam, Netherlands
  • KDDI Data Center - Tokyo, Japan
  • PCCW Data Center - Hong Kong, China

Figure C17.1-1: Data Centers and Partner Collocation Centers

Global resolution sites are critical for a TLD with a global focus. The reliability and stability of any global TLD dictate that DNS responses be given as quickly as possible and as close as possible to the point where the query is issued.
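
As an illustration only (not a description of the production monitoring system), the following Python sketch measures the round-trip time of a DNS query against a set of hypothetical resolution sites. The site names and IP addresses are placeholders, and the third-party dnspython package is assumed to be available:

    # Illustrative sketch only: probe DNS query latency against hypothetical
    # resolution sites. Site names and addresses below are placeholders.
    import time
    import dns.message   # third-party "dnspython" package (assumed available)
    import dns.query

    SITES = {"us-east": "192.0.2.1", "eu-west": "192.0.2.2", "asia-pac": "192.0.2.3"}

    def probe(server_ip, qname="example.org", rdtype="SOA", timeout=2.0):
        """Return the UDP query round-trip time to one nameserver, in milliseconds."""
        query = dns.message.make_query(qname, rdtype)
        start = time.perf_counter()
        dns.query.udp(query, server_ip, timeout=timeout)
        return (time.perf_counter() - start) * 1000.0

    for site, ip in SITES.items():
        try:
            print(f"{site}: {probe(ip):.1f} ms")
        except Exception as exc:
            print(f"{site}: unreachable ({exc})")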

Each data center has the following features:

  • 7x24 onsite Network Operations Center (NOC)
  • Redundant UPS
  • Redundant generators
  • Redundant Power Distribution Units (PDUs) with redundant circuits to each equipment rack
  • Redundant fire suppression (FM200 gas as primary with dry-pipe individually activated sprinkler heads as back-up)
  • N+1 cooling and humidification

However well designed, no single facility can be depended upon for 100% reliability. For that reason, two of the major data centers house sufficient equipment to independently operate the .org registry.

Additionally, the Lakeside II data center is designed as two data centers in one, with separate infrastructure and security. This design, coupled with the ability to load-balance and/or shift services between facilities, provides an exceptionally robust facilities infrastructure for the .org registry.

In this architecture (depicted in Figure C17.1-2), registry services are load-balanced between data center A and data center B at the Lakeside II facility and also between the Lakeside II and Broad Run facilities.

Figure C17.1-2: Data Center Redundancy

This architecture offers maximum protection and reliability for registry provisioning services. The secondary data center is located several miles away from the primary data center rather than across the country. This separation has three distinct advantages:

  • Critical data can be synchronized in real time from the primary facility to the secondary facility. Such synchronization is not possible with facilities located across the country.
  • The most technically qualified personnel can perform recovery services more quickly when the recovery site is a reasonable distance away, avoiding both air travel and the hiring and training of separate personnel at the secondary site.
  • Production RRP traffic can be load-balanced between sites as a normal mode of operation. Load balancing provides extra capacity as well as a high degree of confidence (in addition to formal testing) that the secondary site could quickly assume all functions if a major event disabled the entire primary site; a simplified sketch of this load-shifting logic appears below.
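
The following Python sketch is a purely illustrative simplification of such health-check-driven load shifting; the health-check URLs, site names, and polling interval are hypothetical and do not describe the production load balancers:

    # Illustrative sketch only: adjust the share of traffic sent to each site
    # based on a simple health check. All names and URLs are placeholders.
    import urllib.request

    SITES = {
        "lakeside-ii": "https://registry-a.example.net/health",   # placeholder URL
        "broad-run":   "https://registry-b.example.net/health",   # placeholder URL
    }

    def healthy(url, timeout=3):
        """Return True if the site's health endpoint answers with HTTP 200."""
        try:
            return urllib.request.urlopen(url, timeout=timeout).status == 200
        except Exception:
            return False

    def traffic_weights():
        """Split load evenly across healthy sites; send nothing to failed ones."""
        up = [name for name, url in SITES.items() if healthy(url)]
        return {name: (1.0 / len(up) if name in up else 0.0) for name in SITES} if up else {}

    weights = traffic_weights()
    print(weights)   # in operation this check would run continuously and drive the load balancers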

Internet DNS is, by its very nature, quite robust, but that is no reason to forgo additional DNS functions designed to improve DNS reliability and security. Each nameserver resolution site around the globe must adhere to strict facility standards. Beyond this, operational processes and procedures have been developed so that DNS services can be quickly moved from one site to another. A DNS "swing" site will continue to be maintained at the Lakeside II data center, to which DNS traffic from any of the 13 resolution sites can be quickly redirected. The swing site is a major element of the business continuity plan and will be used to support site maintenance.
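
One operational check that supports such a move is verifying, before traffic is swung, that the candidate site serves the same zone serial as the other resolution sites. The sketch below is illustrative only; the server addresses are placeholders and the dnspython package is assumed:

    # Illustrative sketch only: compare the .org SOA serial across resolution
    # sites before redirecting traffic. Addresses below are placeholders.
    import dns.message
    import dns.query
    import dns.rdatatype

    RESOLUTION_SITES = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]   # hypothetical

    def soa_serial(server_ip, zone="org."):
        response = dns.query.udp(dns.message.make_query(zone, dns.rdatatype.SOA),
                                 server_ip, timeout=2.0)
        return response.answer[0][0].serial   # serial field of the SOA record

    serials = {ip: soa_serial(ip) for ip in RESOLUTION_SITES}
    if len(set(serials.values())) == 1:
        print("all sites serve the same zone serial; safe to swing traffic")
    else:
        print("zone serials differ:", serials)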

Types of Systems

The hardware systems that the UIA Team will use for the .org registry have been extensively tested and validated in a state-of-the-practice engineering lab. IBM enterprise servers (e.g., S80 and P680 models) running the AIX operating system serve as database servers. Oracle is the database management system (DBMS). Application and gateway servers are a mixture of IBM and Linux servers. Web and FTP servers are a combination of IBM, Linux and Sun servers. Cisco, Foundry and Alteon provide network and load-balancing equipment.

The server functions will be protected with hot stand-by servers, using IBM HACMP for automated failover monitoring and execution. The data itself will be housed on EMC Symmetrix equipment and will be synchronized in real time to multiple secondary EMC Symmetrix devices located in multiple data center facilities. This architecture is capable of processing more than 300,000 transactions per minute and operates at reliability rates of better than 99.9%. More details of the demonstrated capacity and reliability of this architecture are discussed in Section C17.3.
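
For context, the following back-of-the-envelope calculation (illustrative only, using the figures quoted above) converts the stated capacity and reliability into per-second throughput and an annual downtime budget:

    # Illustrative arithmetic only, based on the figures quoted above.
    capacity_per_minute = 300_000            # stated transaction capacity
    availability = 0.999                     # stated reliability (99.9%)

    per_second = capacity_per_minute / 60    # = 5,000 transactions per second
    downtime_hours_per_year = (1 - availability) * 365 * 24   # roughly 8.8 hours per year

    print(f"throughput: {per_second:,.0f} transactions/second")
    print(f"maximum downtime at 99.9%: {downtime_hours_per_year:.1f} hours/year")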

It is contrary to security policy to publish the architecture or capacity of the global DNS constellation because of its criticality to the stable operation of the Internet and because it is often a target of Internet "hackers". However, that architecture does utilize redundant hardware systems with no single points of failure. It currently handles more than 90K queries per second on a regular basis and has successfully handled peaks of nearly 400K queries per second.

Facility Infrastructure

A facility must be judged by the robustness and reliability of the infrastructure that supports elements such as electrical power, cooling, humidity and fire suppression. Data center facilities proposed for use by the UIA Team are designed to provide a redundant electrical infrastructure all the way to the individual racks in the data center. As an example, Figure C17.1-3 shows the electrical infrastructure design of the Lakeside II facility.

Figure C17.1-3: Electrical Infrastructure

With this design, there are no single points of failure. Since most I/T equipment today is outfitted with dual electrical connections and dual power supplies, each component of the electrical infrastructure, from the street to the CPU, is fully redundant. Redundancy also supports scheduled maintenance. The electrical infrastructure is designed to continue to provide service in the event of a failure or planned shutdown of any component.

Each data center HVAC and humidity control system is designed to a minimum of N+2 redundancy. This means that two HVAC and/or humidifier units could fail (or be taken down for maintenance) and the remaining units would still provide proper cooling and humidity. Temperature is maintained at 70 degrees Fahrenheit with a variance of plus/minus 3 degrees. Humidity is maintained at 50% with a variance of plus/minus 5%.

Primary fire suppression is provided by FM200 gas with individually activated sprinkler heads as secondary. The sprinkler system is "dry pipe," which means that compressed air keeps water out of the overhead pipes in the data center to avoid the risk of water leaks damaging equipment. In the event of an FM200 discharge, all power to the data center would be turned off. However, the FM200 will not damage equipment, and a data center equipped with FM200 can be back up and running following a discharge as soon as the reason for the discharge is identified and fixed. No equipment cleanup is required.

Data Center Design

At the primary data center facility, an extra step has been taken by designing two separate data centers in one facility. Each data center has its own electrical infrastructure, HVAC, humidity control, and fire suppression. This design provides three distinct advantages. First, redundant system architectures that might not normally be distributed across geographically separate facilities can be distributed between two data centers that have completely separate supporting infrastructures. Figure C17.1-4 shows this architecture as it applies to the .org registry equipment.

Figure C17.1-4: Redundant Data Centers within a Single Facility

The second advantage is that although production is distributed between both data centers, development and test equipment is kept in only one. Therefore, one data center will be dedicated solely to production equipment and services. There will be less activity in this data center, making it less susceptible to "collateral damage" that can occur as a result of changes. Finally, different levels of physical security will be applied to each data center, ensuring that staff with access to development and test systems are not able to access the data center that is dedicated to production services.

Server Architecture

The UIA Team will use a three-tiered architecture for the .org registry as shown in Figure C17.1-5. Technologies appropriate to each tier will be used to provide redundancy at every tier. For example, at the database tier, the EMC SRDF product will be used to replicate data in real time to multiple locations, both within the primary facility (e.g., between the two data centers in the Lakeside II facility) and to the secondary data center. Additionally, hot stand-by servers with automated failover using IBM's HACMP function provide for redundancy of the database server. Load-balancing transactions across multiple gateway servers and application servers will provide reliability and redundancy in the other tiers.

Figure C17.1-5: Redundancy at Each Server Tier

Software Systems

Software systems include more than just the products used and their design. They also include the process by which the software is designed, developed, tested, and deployed. As shown in the previous section, the proposed .org registry is designed in a three-tiered architecture. This structure separates gateway functions (e.g., login, session management, service auditing), application functions (e.g., business rules) and database functions. This separation improves security, makes problems easier to diagnose, and allows modifications to be tested and deployed more easily and reliably. Standard industry languages and tools (e.g., Java, C, and C++) will be utilized as appropriate at each tier to facilitate performance and compatibility. WebLogic will be used for web application server development.

A rigorous quality assurance and testing methodology will be utilized that includes a separate, fully functional, production "look alike" environment where new software can be tested prior to deployment. Additionally, a "staging" environment enables deployments to be practiced repeatedly to ensure that they can be executed seamlessly within maintenance windows. The staging environment also enables accurate prediction of deployment duration and validation of the back-out plan, should one be necessary.
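
The skeleton below is a purely illustrative sketch (not the registry code base) of how this separation of concerns might look, with the gateway tier handling sessions, the application tier enforcing business rules, and the database tier isolated behind its own interface. All class and method names are hypothetical:

    # Illustrative skeleton only: the tier separation described above,
    # with hypothetical class and method names.
    class DatabaseTier:
        """Owns all persistence; nothing else touches the data store directly."""
        def register(self, domain, registrar):
            ...   # e.g., a database insert in the production system

    class ApplicationTier:
        """Enforces business rules before any change reaches the database."""
        def __init__(self, db):
            self.db = db
        def register_domain(self, domain, registrar):
            if not domain.endswith(".org"):
                raise ValueError("only .org names are accepted")
            return self.db.register(domain, registrar)

    class GatewayTier:
        """Handles login, session management, and service auditing."""
        def __init__(self, app):
            self.app = app
            self.sessions = {}
        def handle(self, session_id, command, *args):
            if session_id not in self.sessions:
                raise PermissionError("registrar is not logged in")
            return getattr(self.app, command)(*args)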

Level of Security

Beyond the risk of natural disasters, in the post-11 September environment no one doubts that physical facilities present an obvious target for terrorist activity. The data center facilities of the UIA Team possess the essential characteristics of physical security, including:

  • Low profile (e.g., no external markings or signage)
  • Hardened against regional weather events (e.g., high winds or hurricanes)
  • Located outside of flood areas
  • Multi-level physical security, including 7x24 onsite security force, badge readers and biometric access control devices
  • 7x24 video surveillance

A dedicated I/T security team provides for logical (or I/T) security. This team will be responsible for the development and implementation of security standards, the management of security devices (e.g., firewalls), security monitoring, audits and tests (including third-party penetration tests), and participation in government and industry I/T security forums. The characteristics of the proposed I/T security functions are outlined in greater detail in Section C17.9.

Internet Connectivity

Internet connectivity is a critical element for any facility supporting registry and global DNS functions. Sufficient bandwidth is the primary defense against Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks. Internet connectivity is provisioned through multiple providers and through multiple physical routes. The Lakeside II and Broad Run data centers have multiple DS-3 and OC-3 connections to the Internet provisioned through diverse providers. At data center facilities, redundant Internet connections enter the facility through diverse cable conduits, travel to the border routers via separate conduits within the facility, and terminate at border routers positioned in separate cabinets located in different sections of the data center. Nameservers positioned with collocation partners have a minimum of diverse 100 Mbps connections. "Super" sites have diverse 1 Gbps connections. Although currently only one "super" site exists, three more are planned by the end of 2002. Currently, five OC-3 circuits connect the primary and secondary data centers; they will soon be replaced by dark fiber. As Figure C17.1-6 also shows, OC-48 fiber has been run and is ready to be "lit" if needed.

Figure C17.1-6: VGRS Data Center Network Connectivity
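
As a rough, illustrative calculation only (the link rates are nominal values, and the average DNS response size is an assumption), the query rate that a single nameserver link can sustain can be estimated as follows:

    # Illustrative arithmetic only. Link rates are nominal; the average DNS
    # response size (roughly 100 bytes over UDP) is an assumption.
    LINK_MBPS = {"DS-3": 45, "OC-3": 155, "100 Mbps": 100, "1 Gbps": 1000, "OC-48": 2488}
    AVG_RESPONSE_BYTES = 100

    for name, mbps in LINK_MBPS.items():
        queries_per_second = (mbps * 1_000_000) / (AVG_RESPONSE_BYTES * 8)
        print(f"{name}: roughly {queries_per_second:,.0f} responses/second")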

The data center facilities proposed by UIA and the facilities of its collocation partners provide the secure underlying physical infrastructure required to support a growing critical Internet infrastructure at a time when external attacks (both physical and logical, malicious and non-malicious) are an ever-growing reality. All facilities are available for inspection by ICANN.

 
