D15.2 Technical Plan for the proposed registry operation

 

This should present a comprehensive technical plan for the proposed registry operations. In addition to providing basic information concerning the operator's proposed technical solution (with appropriate diagrams), this section offers the registry operator an opportunity to demonstrate that it has carefully analyzed the technical requirements of registry operation. Factors that should be addressed in the technical plan include:

 

The technical plan for KDDISOL's new gTLD Registry is composed of two phases.  The first is expected to last approximately one year.  During this time our subcontractor, VeriSign Global Registry Services (“NSI”), will be responsible for operating a "virtual" registry.  This will give us time to secure facilities, equipment and, most of all, the technical know-how that we believe to be vital to the successful implementation of the new gTLD.  Every effort will be made to expedite the transition from this initial operational period to "Phase 2", at which time we will assume full operating responsibility for the registry.  We believe that by drawing on the advanced technology and experience of NSI we will comfortably introduce not only the new gTLD but also ourselves as the registry operator.

The overriding concern which directed us to opt for a phased approach and to use the services of NSI was that we considered the requirement to assure the continued stability of the internet to be the highest priority of the global community regarding the introduction of new gTLDs.  As is set forth in this document, the experience and capacity of KDD in basic research, technical innovation and international marketing and telecommunications provides a remarkably compatible opportunity for the selection of an Asian-based registry.  There is no doubt that we might have been able to successfully implement our new gTLD without the assistance of NSI.  We believe, however, that speculation about a matter of this significance was not in the interest of either ourselves or the global Internet community.  It was in this environment that our technical plan for the proposed registry was developed.

 

D15.2.1        General Description of Facilities and Systems

Address all locations of systems. Provide diagrams of all of the systems operating at each location. Address the specific types of systems being used, their capacity, and their interoperability, general availability, and level of security. Describe in detail buildings, hardware, software systems, environmental equipment, Internet connectivity, etc

 

Figure D15.2-1 - Domain Name Registration and Resolution Overview

 

A Shared Registration System (SRS) and a Top-Level Domain (TLD) infrastructure are the two major components of the Registry. The Registry SRS enables the Registration Service, Directory Service (Whois), and Customer Service, while supporting the Domain Name Resolution Service by generating and distributing zone files. The TLD system provides the infrastructure and common platform for the Domain Name Resolution Service.

The SRS is a protocol and associated hardware and software that permits multiple registrars to provide Internet domain-name registration services within the TLDs administered by the Registry. The SRS provides equivalent access to all Registrars to register domain names in the TLDs administered by the Registry. The System will generate the zone files for the new TLD and distribute them to a TLD constellation to enable domain-name resolution across the Internet. 

A Whois service will be provided through the SRS that will allow users to query the availability of a domain name. 

Registrars access the System through the Registry Registrar Protocol (RRP) to register domain names and perform domain name-related functions such as registering name servers, renewing registrations, and deleting, transferring, and updating domain names registered by that registrar. Registrars also have a web-based interface to the System for performing administrative functions, generating reports, performing global domain name updates, and carrying out other self-service maintenance functions not available through RRP.

The Registry invoices the registrars for the domain names registered, renewed, and transferred. The Registry provides support to the registrars through Customer Support Representatives (CSRs). The CSRs have their own web-based interface to the Registry, through which they can query and perform updates per the registrar requests after authenticating the registrar. Registry CSRs are trained to provide first-level customer support, and are proficient in customer care skills.

Other external interfaces include Registry users who perform Whois queries to the System to determine the availability of a particular domain name or names. The Whois service is available via both a standard command-line interface and a web-based interface.

The TLD infrastructure includes geographically dispersed TLD name servers.  These name servers will be located within the Internet at the topological cores, which roughly correspond to major peering centers for the backbone network providers.  Locating these servers at or near the major peering centers ensures low-latency access from networks that carry the bulk of the Internet traffic. Initially, there will be three name servers, located in Asia and the United States. This server placement strategy enhances the overall performance of the Internet and of the services that depend on name resolution.

 

D15.2.1.1 System location

In phase 1, services will be provided by KDDISOL through its subcontractor, VeriSign Global Registry Services, at the subcontractor's new state-of-the-art facility in Lakeside II, a 101,875-square-foot office building in Lakeside @ Loudon Tech Center in Sterling, Virginia. The space will include the computer facility, known as the data center, and most personnel involved with the Proposed Registry, including operations personnel, engineering, quality assurance staff, administrative support staff, and customer care support staff (See D15.2.1.6.1).

In phase 2, services will be provided from KDDISOL’s own servers in the KDD Otemachi Building, a 380,000-square-foot office building in Otemachi, Tokyo. KDDISOL will also distribute the gTLD servers worldwide, ensuring geographic and topological diversity (See D15.2.1.6.2 and D15.2.1.6.3).

 

D15.2.1.2 System/network Diagrams

D15.2.1.2.1 Registry Architecture

Figure D15.2-2 Sample Registry Architecture

 

D15.2.1.2.2 System configurations

The registry onsite and TLD system configurations will consist of multi-processor UNIX configurations with up to 16 GB of memory. Other equipment used to support the Registry includes large-capacity border routers, high-performance firewalls, load balancers, and switches. The entire system and network are built so that there is no single point of failure, and include mechanisms to fail over automatically when errors are detected. A second level of redundancy is provided by an offsite Disaster Recovery (DR) facility to which the Registry processes can be migrated on short notice.

To accommodate future growth, the configuration can be scaled to handle additional registrar connections and registrations. There is an n-to-n relationship of RRP Application Gateways to RRP Application Servers; depending on where the bottlenecks occur, additional servers can simply be added. Because changing the database systems is more complex, they are designed to support the full complement of registrations expected over the next four to five years.

Equipment, processes and procedures have been designed for the seamless operation and support of the Registry and TLD systems. A Registry Command Center (RCC) will be established and equipped with the latest monitoring tools for monitoring all the components on a pro-active basis in order to identify and resolve issues before they become problems. There will be an isolated Operations, Test, and Evaluation (OT&E) environment for Registrars to test their interface to the SRS software. KDDISOL will also test any new versions of SRS software or hardware configuration upgrades before they are introduced into the production environment.

 

Table D15.2-1 Equipment List for Sample Registry Architecture

Description       Product               No.
Networking        Load balancer         2
                  Cisco Router          2
RRPAG             Unix Server           2
DNS               Unix Server (Sun)     2
Web               Unix Server (Sun)     2
Mail              Unix Server (Sun)     2
FTP               Unix Server (Sun)     2
Firewall          Redundant Firewall    2
Dynamo            Unix Server (Sun)     2
Whois             Unix Server (Sun)     2
RRPAS             2CPU Unix Server      2
                  2CPU Unix Server      2
RRPRS/SS          2CPU Unix Server      1
                  External Storage      1
ZCK               2CPU Unix Server      1
                  External Storage      2
DB1               2CPU Unix Server      1
                  External Storage      1
DB2               2CPU Unix Server      1
                  External Storage      1
Storage           EMC                   1
Backup solution   EDM Symmetric         1

 

D15.2.1.3 System Capacities

The systems will initially be configured with up to 16 GB of memory and over 100 GB of storage. This is more than sufficient to support the introduction of a new TLD.  When needed, the systems are scalable both vertically, through the addition of memory and disk space, and horizontally, with additional systems.

 

D15.2.1.4 System Interoperability

The Shared Registration System (SRS) is a protocol and associated hardware and software that permit multiple registrars to provide Internet domain-name registration services within the TLDs. It has been designed and is operated as a single, interoperable system, where each component is a critical element in the registry processing. An extensive evaluation and quality assurance process ensures compatibility and interoperability when new features, software, or hardware are added to the system.

 

D15.2.1.5 System Availability

The objective of the Registry design is to provide 100% planned system availability.  This is accomplished through complete system and configuration redundancy, and a process commitment to not execute any system or application changes until they are thoroughly tested in an isolated Operations, Test, and Evaluation (OT&E) environment. 

 

D15.2.1.6 Facility and Site Descriptions

D15.2.1.6.1 VeriSign Global Registry Production Data Center.

This data center is located in the Lakeside II building in the Lakeside Technical Center in Sterling, VA. The 10,600 data center is operated 24x7x365. Onsite staff from the Registry Command Center (RCC) operate and monitor the site and the equipment in the data center room. This data center is not located in a flood plain. Ceiling height is a minimum of 8.5 feet, with ventilation provided via under-floor airflow generated by eight air-cooled HVAC units of 25 tons each, providing N+3 redundancy.  Temperature is maintained at 70 degrees Fahrenheit +/- 2 degrees. Static conditions are maintained within equipment manufacturers' tolerances.

Power to this facility is routed through an Uninterruptible Power Supply (UPS) capable of sustaining the data center for at least 15 minutes. However, the UPS is needed only for the few seconds it takes for a 750 kW generator to start automatically. A second 900 kW generator is available as additional backup. Power is routed through eight power distribution units (PDUs), with each server being redundantly supplied via two separate PDUs. All racks and equipment are grounded.

 

D15.2.1.6.2 KDD Otemachi Building

This data center is located in the KDD Otemachi Building in Otemachi, Tokyo. It is operated 24x7x365. Onsite staff from the Registry Command Center (RCC) operate and monitor the site and the equipment in the data center room. This data center is not located in a flood plain. Ceiling height is a minimum of 9 feet, with ventilation provided via under-floor airflow generated by air-cooling units on each floor. Temperature is held constant at all times. Static conditions are maintained within equipment manufacturers' tolerances.

Power to this facility is routed through an Uninterruptible Power Supply (UPS) capable of sustaining the data center for at least 15 minutes. However, the UPS is needed only for the few seconds it takes for a 3000 kW generator to start automatically. All racks and equipment are grounded.

 

D15.2.1.6.3 TLD Remote Sites

KDDISOL will distribute the gTLD name servers worldwide to best serve the Internet community. Each remote site is required to meet high standards for support of the TLD servers. The geographically and topologically diverse sites provide space in secure, high-availability collocation centers designed and built using industry best practices. At these sites, TLD servers are housed in secure areas and supported by n+1 power and cooling capabilities. They are redundantly connected to the facility’s switching fabric with full-duplex 100 Mbps connections and have diverse access to large-capacity backbone circuits. Access to the TLD servers is controlled by Access Control Lists (ACLs) on border routers that exclude all traffic from the Internet other than UDP and TCP queries. Connectivity, power, and cooling are each subject to uptime requirements of 99.7% or better to ensure uninterrupted availability.
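The ACL policy described above can be illustrated with a simple predicate. This is a minimal sketch and not router configuration: the Packet type and the permit function are hypothetical names, and port 53 is the standard DNS port rather than a value stated in this plan.

```python
from dataclasses import dataclass

DNS_PORT = 53  # standard DNS port; an assumption, not stated in the plan


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of an inbound packet."""
    protocol: str   # "udp", "tcp", "icmp", ...
    dst_port: int


def permit(packet: Packet) -> bool:
    """Permit only DNS queries over UDP or TCP; everything else is dropped."""
    return packet.protocol in ("udp", "tcp") and packet.dst_port == DNS_PORT
```

Under these assumptions, a UDP or TCP packet addressed to port 53 is admitted, while SSH, ICMP, and all other Internet traffic is excluded at the border router.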

 

D15.2.1.7 Internet connectivity

Refer to Network Capacities in Section D15.2.10.3

 

D15.2.2 Registry Registrar Model

Please describe in detail.

 

D15.2.2.1 Registry Registrar Model

The Registry accepts registrations and registration service requests from all accredited, licensed registrars, while protecting the integrity of registrations from unauthorized access and interference by third parties. Every new domain name application is checked to ensure that the domain name is not already registered. This function demands exceptional speed and accuracy to confirm registrations definitively and to arbitrate near-simultaneous requests for the same domain name.
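The arbitration of near-simultaneous requests can be sketched as an atomic check-and-insert. The example below is an illustrative in-memory model only; the NameRegistry class is hypothetical, and the production Registry performs this function in its database rather than in application memory.

```python
import threading


class NameRegistry:
    """Toy in-memory registry: the first request for a name wins."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._registrar_by_name: dict[str, str] = {}

    def register(self, domain: str, registrar: str) -> bool:
        key = domain.lower()  # domain names are case-insensitive
        with self._lock:      # the check and the insert must be one atomic step
            if key in self._registrar_by_name:
                return False  # already registered; the later request loses
            self._registrar_by_name[key] = registrar
            return True
```

Because the availability check and the insertion happen under one lock, two registrars requesting the same name at almost the same moment can never both receive a confirmation.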

The Registry maintains domain name and name server registration information, including domain name, name servers, IP address, registrar name, transfer date, registration period, expiration date, status, registration creation date, created by, updated date, and updated by. The Registry is the authoritative source for its TLD zone file content (i.e., domain name, name server, and associated IP address). This is commonly referred to as a “thin” registry model. The registrar of the particular domain name or name server maintains all other customer data. This protects customer privacy, gives greater flexibility to registrars, and allows them to determine their business model. KDDISOL will have a formal contractual relationship with each individual registrar accredited for registering domain names in the new TLD.

The Registry database used to support inquiries to identify the registrar associated with a specific domain name is currently called “Whois.” Whois enables registrars and potential registrants to establish the availability for registration of selected domain names. Internet users also use it to identify the registrar controlling a domain name.

Registration of a domain name or name server in the Registry database does not automatically create entries in the Internet DNS. For this to occur, a zone file associating all registered domain names with their corresponding IP addresses is generated and exported to the name servers for the TLD. KDDISOL will operate and maintain distributed TLD name servers to which the zone file is exported and from which the domain name information is disseminated to the Internet community. The deployment and operation of the new TLD name servers is the responsibility of KDDISOL.

To enable close to 100% Registry availability, multiple database servers are used, with off-site backup to protect against catastrophic data loss. Redundancy is found at almost every level within the Registry to ensure high availability of the systems and applications for the Registrars.

SRS is the Registry architecture and set of processes used to enable registrations by multiple registrars. It includes the Registry Registrar Protocol (RRP), which is used to support communications between the Registry and Registrars, and provides the security and authentication functions to protect the Registry database while supporting all necessary registrar operations. RRP is also used during the certification process for accredited registrars, for operational testing and evaluation of registrar implementations of the RRP prior to commencement of actual registrar operations.  KDDISOL will be responsible for providing the RRP software interfaces, documentation, and training to accredited registrars for the new TLD. Hands-on technical support will be available from KDDISOL to assist new registrars in resolving difficulties in interfacing with the Registry.

 

D15.2.2.2 RRP Description

RRP was developed by the Network Solutions, Inc. Registry under the auspices of the Shared Registration System program. The protocol was initially deployed in April 1999 as part of a test bed implementation of the Shared Registration System with five registrars.  Additional registrars began using the protocol in July 1999. RRP has been published as Informational RFC 2832, and open source software is available for both clients and servers.

The Registry stores information about registered domain names and associated name servers. A domain name's data includes its name, name servers, registrar, registration expiration date, and status. A name server's data includes its server name, IP addresses, and registrar. RRP provides a mechanism to perform various functions to domain names, such as:

 

·        Update the name servers of a domain name.
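As an illustration of what such a request might look like on the wire, the sketch below formats a line-oriented request: a command name, a series of Name:Value attribute lines, and a terminating line containing a single period. The layout is loosely modeled on RFC 2832; the build_request helper is hypothetical, and the command and attribute names shown should be checked against the RFC before use.

```python
CRLF = "\r\n"


def build_request(command: str, attributes: list[tuple[str, str]]) -> str:
    """Format one request: command line, Name:Value lines, then a lone '.'."""
    lines = [command]
    lines += [f"{name}:{value}" for name, value in attributes]
    lines.append(".")
    return CRLF.join(lines) + CRLF


# Hypothetical request that adds a name server to a domain name.
request = build_request("mod", [
    ("EntityName", "Domain"),
    ("DomainName", "example.tld"),
    ("NameServer", "ns1.example.tld"),
])
```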

 

Each RRP session is encrypted using the Secure Sockets Layer (SSL) v3.0 protocol. SSL provides privacy services that reduce the risk of inadvertent disclosure of registrar-sensitive information, such as the registrar's user identifier and password.

All registrant specific information is retained by Registrars.

 

D15.2.3 Database Capabilities

Database size, throughput, scalability, procedures for object creation, editing, and deletion, change notifications, registrar transfer procedures, grace period implementation, reporting capabilities, etc.

 

D15.2.3.1 Size

The Registry uses Oracle RDBMS to store all of the domain names for a TLD. Since the size of the Registry is determined by the number of domain names which are to be stored, the size will vary as new domains are added. Oracle is used by many organizations around the world to store large amounts of information – in many cases, significantly more than will be required for even the largest domain.

 

D15.2.3.2 Throughput

The throughput of the system depends on several characteristics of the hardware being used: the number of processors, the amount of memory, and the disk drive configuration. The current configuration can support well in excess of 600 million transactions a month.
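For a sense of scale, the monthly figure can be converted into an average sustained rate. The short calculation below assumes a 30-day month and an even spread of load; actual peak rates would be higher than this average.

```python
TRANSACTIONS_PER_MONTH = 600_000_000   # figure quoted above
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

average_tps = TRANSACTIONS_PER_MONTH / SECONDS_PER_MONTH
print(f"{average_tps:.0f} transactions per second on average")
# prints "231 transactions per second on average"
```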

 

D15.2.3.3 Scalability

Oracle can scale in a variety of ways, depending on the requirements placed upon it. Given the anticipated size of the new gTLD, we do not expect database scaling to present a problem.

 

D15.2.3.4 Object Management

The Registry implementation performs management of the Registry objects at both the database and business layer levels. In general, the business layer validates any request to the database and an Oracle stored procedure is used to perform the actual changes to the database.

 

D15.2.3.5 Domain Level Capabilities

D15.2.3.5.1 Change Notification

For each instance where a second level domain holder wants to change its Registrar for an existing domain name (i.e., a domain name that appears in a particular top-level domain zone file), the gaining Registrar shall obtain express authorization from an individual who has the apparent authority to legally bind the second level domain holder (as reflected in the database of the losing Registrar). In those instances when the Registrar of record is being changed simultaneously with a transfer of a domain name from one party to another, the gaining Registrar shall also obtain appropriate authorization for the transfer. This information shall be provided to the losing registrar if requested. The form of the authorization is left to the discretion of the gaining registrar.

The registration agreement between each Registrar and its second level domain holder shall include a provision explaining that a second level domain holder will be prohibited from changing its Registrar during the first 60 days after initial registration of the domain name with the Registrar.  
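The 60-day restriction can be expressed as a simple date check. This is a minimal sketch: the transfer_allowed function is a hypothetical name, and the treatment of the boundary (blocked through day 60, allowed from day 61) is our assumption rather than a rule stated above.

```python
from datetime import date, timedelta

TRANSFER_LOCK = timedelta(days=60)


def transfer_allowed(registered_on: date, requested_on: date) -> bool:
    """True once the initial 60-day lock after registration has elapsed.

    Assumption: the request is blocked through day 60 and allowed from day 61.
    """
    return requested_on - registered_on > TRANSFER_LOCK
```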

 
D15.2.3.5.2 Registrar Transfer Procedures

The transfer procedure is an RRP command executed by the gaining registrar.

 
D15.2.3.5.3 Grace Period

The SRS will automatically renew domain names as their current registration periods expire. Following an auto-renewal, a Registrar has a 45-day grace period in which to delete the domain name. Any names not deleted during the 45-day grace period will be included on the auto-renewal invoice.
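The grace period rule can be sketched as follows. The is_billable function is a hypothetical name, and counting a deletion on day 45 as still within the grace period is our assumption.

```python
from datetime import date, timedelta
from typing import Optional

AUTO_RENEW_GRACE = timedelta(days=45)


def is_billable(auto_renewed_on: date, deleted_on: Optional[date]) -> bool:
    """An auto-renewal is invoiced unless the name was deleted in the grace period.

    Assumption: a deletion on day 45 itself still falls within the grace period.
    """
    if deleted_on is None:
        return True  # never deleted, so the renewal is invoiced
    return deleted_on - auto_renewed_on > AUTO_RENEW_GRACE
```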

 
D15.2.3.5.4 Reporting

The system will be able to produce a variety of reports to help monitor and analyze the type of operations performed on the system. These reports are summarized in the following table:

 

Table D15.2-2 Registrar Reports Summary

Generation Date  Type/Description                                                                             Audience                                     How Available
Daily            Describe registrar transactions pertaining to that particular registrar                      Registrar-specific report to that registrar  Registrar tool or FTP site
Transfer         Describe domain transfers pertaining to that particular registrar                            Registrar-specific report to that registrar  Registrar tool or FTP site
Common           Each row contains a full Registrar description.                                              ICANN, Third-Party Escrow Company            FTP Site
Weekly           Total domain name count, total name server count, total domains hosted by name server count  Registrar-specific report to that registrar  Registrar tool or FTP site

 

D15.2.3.6 Registrar Add/Delete/Modify Procedures

Additions, deletions, and modifications to domain name records are performed by the registrars through RRP. During the certification process the Registrars are instructed on how to process new registrations and make changes to existing records.

Refer to Section D15.2.14.5 for a complete description of the Registrar Tool that the registrars use to interact with the backend registry.

 

D15.2.4 Zone File Generation

Procedures for changes, editing by registrars, updates. Address frequency, security, process, interface, user authentication, logging, data back-up.

 

D15.2.4.1 Registrar Manipulation of Zone Data

Registrars can access their domain data via three methods (presented in order of automation):

 

1.      Using RRP directly, as specified in Informational RFC 2832.

2.      Using a web browser and the Registrar Tool web interface, which in turn uses RRP to communicate with the Registry database.

3.      Contacting the Registry Customer Service Representative, who uses the web-based Customer Service Tool to access and manipulate domain and registrar data directly for unusual scenarios.

 

D15.2.4.2 Zone File Generation Process Overview

Custom applications have been developed to securely and accurately extract domain registration data from the registry database to construct the appropriate zone files. The overall process is as follows:

 

1.      A database “snapshot” is prepared

2.      Custom applications are launched to extract data from the database and format the data into zone files

3.      Validation checks are performed on the static zone files

4.      Zone files are loaded on production-like servers and dynamic checks are performed against the server

5.      Validated zone files are moved to the zone distribution process
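Steps 1 and 2 above can be illustrated with a small sketch that renders zone file text from a point-in-time copy of the delegation data. This is not the Registry's custom application: the render_zone function, the SOA host names, and the timer values are hypothetical placeholders, and the remaining steps (validation and distribution) are not shown.

```python
def render_zone(tld: str, serial: int, delegations: dict[str, list[str]]) -> str:
    """Render an SOA record and NS records from a point-in-time snapshot."""
    lines = [
        # Hypothetical SOA host names and timers, for illustration only.
        f"{tld}. IN SOA ns.registry.example. hostmaster.registry.example. "
        f"{serial} 1800 900 604800 86400"
    ]
    for domain in sorted(delegations):
        for name_server in sorted(delegations[domain]):
            lines.append(f"{domain}. IN NS {name_server}.")
    return "\n".join(lines) + "\n"


# The snapshot is copied first so later database updates cannot alter the extract.
live_data = {"example.tld": ["ns2.example.net", "ns1.example.net"]}
snapshot = {name: list(servers) for name, servers in live_data.items()}
zone_text = render_zone("tld", 2000100201, snapshot)
```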

 

D15.2.4.3 Validation

After the zone files are created, a number of checks are performed against the files to ensure they contain valid data in the proper format. Serial numbers, data values, and file size checks are performed on the resultant static zone files.

The zone files are then copied to a name server (to simulate the distribution process) and loaded to verify that the named application loads them properly. After the process is started, the name server log file is reviewed to verify that no error messages resulted. Once the name server is operational, the serial numbers are verified again and sample queries are run against the database.
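The static checks on serial number and file size might look like the following sketch. The static_checks function is hypothetical, and the 10% size-change threshold is an illustrative assumption, not a value used by the Registry.

```python
def static_checks(zone_text: str, expected_serial: int, previous_size: int,
                  max_change: float = 0.10) -> list[str]:
    """Return a list of problems found; an empty list means the file passed."""
    problems = []

    # Serial number check: exactly one SOA record carrying the expected serial.
    soa_lines = [line for line in zone_text.splitlines() if " SOA " in line]
    if len(soa_lines) != 1:
        problems.append("expected exactly one SOA record")
    elif str(expected_serial) not in soa_lines[0].split():
        problems.append("unexpected serial number")

    # File size check: flag a large swing relative to the previous zone file.
    size = len(zone_text.encode())
    if previous_size and abs(size - previous_size) / previous_size > max_change:
        problems.append("file size changed more than expected")

    return problems
```

A zone file that fails any check would be held back rather than passed on to the distribution process.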

 

D15.2.4.4 Frequency

Zone files are generated at a minimum twice daily at 12-hour intervals. The database is constantly being updated but the zone files are generated from a point-in-time version of the database to avoid corruption of previously extracted data.

 

D15.2.4.5 Security

The RRP Application Gateway (RRPAG) is a gateway to the RRP Application Server (RRPAS) from the outside world. The Application Server runs behind the firewall, whereas the Gateway runs on a machine that is visible to the outside world and listens on a well-known port. Registrars connect to RRPAG using SSLv3.

The primary purpose of the Gateway is to provide transport layer security using SSLv3.  The RRPAG authenticates each initial connection based on the X.509 certificate that the connecting registrar presents at the time of the connection. After a successful SSL handshake, the Gateway opens a dedicated connection with the Application Server for the connecting entity.

Because the RRPAG is exposed to the outside world, it is vulnerable to attack. If the RRPAGs suffer a Denial of Service (DoS) or Distributed DoS (DDoS) attack, no connections will be possible through RRP. To protect the system against intrusion, an Intrusion Detection System (IDS) must be deployed to detect intrusions and respond to them. KDDISOL proposes to use EMERALD, a sophisticated IDS developed by SRI International. EMERALD should be applied to all servers outside the firewall, such as the DNS, FTP, and WWW servers, as well as to servers on the internal network. See D15.2.9.1 for more details.

The database and the zone generation and validation processes run on the registry's internal network and systems, which are protected by firewalls that restrict access to the network.  File replication between the systems behind the firewalls is controlled by a File Replication Tool that copies files between hosts via encrypted channels.

Access to the systems is limited to a “need-to-know” basis. Physical data center access is limited to selected Registry engineering and operations staff. System logon IDs and passwords are provided only to technical staff in Operations who are involved in the zone generation and distribution processes, and secure shell (SSH) is used for all logins.  User logins are monitored and logged for audit purposes and to recreate any sequence of events if a failure occurs.

 

D15.2.4.6 Interface

The zone generation process is done via custom interactive applications that are controlled by operations personnel. Some applications are automated but manual checks are performed at many points in the process to ensure proper construction of the zone files before they proceed to the distribution process.

 

D15.2.4.7 User Authentication

All production registry systems require the use of SSH with public/private keys and encryption for interactive login sessions.

 

D15.2.4.8 Logging

All transactions that impact the zone files are captured in activity and status log files using standard (e.g. Syslog) and custom-built logging utilities.

Processing logs will be created to capture processing statistics, such as number of records processed, passed, or failed, for each audit rule.  The format of the logs will comply with the monitoring tool requirements so that the monitoring tool can be used to monitor the processing. 
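A processing log entry of this kind could be produced as in the sketch below. The run_audit_rule function, the record layout, and the log line format are hypothetical; the actual format would follow the requirements of the monitoring tool.

```python
import logging
from collections import Counter
from typing import Callable, Iterable


def run_audit_rule(name: str, rule: Callable[[dict], bool],
                   records: Iterable[dict], log: logging.Logger) -> Counter:
    """Apply one audit rule to every record and log the resulting statistics."""
    stats = Counter(processed=0, passed=0, failed=0)
    for record in records:
        stats["processed"] += 1
        stats["passed" if rule(record) else "failed"] += 1
    # One key=value line per rule, so a monitoring tool can parse it easily.
    log.info("rule=%s processed=%d passed=%d failed=%d",
             name, stats["processed"], stats["passed"], stats["failed"])
    return stats
```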

The Customer Service Representative (CSR) and Registrar tools use the registration system’s configuration-driven logging system. The developer and operator can specify how to log messages, given their origin, type, and severity. The log message provides valuable information to pinpoint when the event occurs and for what reason.

 

D15.2.4.9 Backup

The EMC Data Manager (EDM) Symmetrix Timefinder Replication tool is used by the Registry to perform backups of the systems and databases. Timefinder is a utility that makes exact physical copies of Symmetrix disk volumes on a second set of Symmetrix disks called Business Continuance Volumes (BCVs). The BCVs can then be mounted on a server, producing an exact physical copy of the original disks. Timefinder can be integrated with Oracle's online backup procedure to allow the replication of a database instance, and to greatly enhance the speed and functionality of database backup and recovery. The copied data is then backed up to tape by an EDM network backup.

 

D15.2.5 Zone File Distribution and Publication

Locations of nameservers, procedures for and means of distributing zone files to them.

 

D15.2.5.1 Name Server Location

TLD name servers will be located in diverse geographic locations and on diverse Internet Service Provider (ISP) networks. The selected TLD server sites will all be housed within leading Internet collocation centers located at or near major centers of peering among Internet backbone providers. Each of these sites will be chosen using a rigorous set of requirements covering network, security, power, fire suppression, and other key factors. In terms of network availability, the following requirements are met by all of the sites:

 

·        Diverse Internet connectivity – a minimum of two diverse circuits

·        Extensive public and private peering – the number and quality of peering and transit relationships in force at each of the proposed facilities (at Otemachi, Tokyo, 2.4 Gbps peering connections to over 50 ISPs)

·        Fully redundant routing and switching infrastructure – each facility network follows accepted best practices for high availability, including the use of multiple ingress/egress routers, dynamic routing protocols (BGP and OSPF), redundant layer 2 switching infrastructure, and HSRP (or VRRP) for default router redundancy

·        Facilities – secure facilities with n+1 power and cooling capabilities

·        On-site support – each facility operator has a 24x7x365 NOC with on-site “hands and eyes” support

 

D15.2.5.2 Distribution Procedures

Zone files are distributed over an infrastructure that is completely separate from the zone generation process, so the two processes do not impact one another. Once the extraction process generates zone files, they are transferred to dedicated machines for preparation and distribution to TLD servers.

TLD zones will be distributed on a separate infrastructure from the .com, .net & .org infrastructure to avoid interruption of service. The Service Level is designed to be comparable.

Distribution of zone files is performed by the rsync application over an encrypted channel using SSH and an encrypted private VPN to all TLD servers. Distribution via this method uses compression and a Unix “diff” type file to decrease transfer time, and uses MD5 to verify the integrity of the file received after the transfer process. Multiple instances of the process will be started to update all TLD servers within a narrow time interval. Name servers are restarted at staggered intervals to avoid disrupting DNS service and to also ensure the proper operation of name servers with the new zone files.

The distribution procedure will be semi-automated and closely controlled and monitored by operations personnel.  NOC personnel monitor the distribution process from start to finish and can intercede at any time should a situation require the interruption of the process.

 

D15.2.5.3 Validation

Operations personnel use an MD5 checksum application on the final TLD zone file to verify its integrity against the reference zone file. Once the zones are verified, the name server will be restarted. Operations personnel will monitor the name server error log files during application restart to verify the error-free loading of new zone files.  Dynamic queries will then be run against the name server to verify proper operation and accurate responses.
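The checksum comparison can be sketched with the standard hashlib module. The function names are hypothetical, and in practice the reference digest would be computed on the generation side and compared on each TLD server after transfer.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def zone_is_intact(received: bytes, reference_digest: str) -> bool:
    """True when the received zone file matches the reference checksum."""
    return md5_hex(received) == reference_digest
```

A mismatch indicates that the file was altered or truncated in transit, in which case the name server should not be restarted with it.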

 

D15.2.6 Registrar Billing

Technical characteristics, system security, accessibility.

 

Finance reports are used for financial analysis of KDDISOL’s Internet domain name registration business and for billing purposes. These reports facilitate KDDISOL’s invoice preparation and distribution processes and aid registrars in invoice reconciliation.  Finance reports are available to Registry staff through the Registrar tool of the Shared Registration System (SRS) and the reporting server FTP site.

Detail and summary reports are produced on a monthly basis for billing. Only summary reports are generated for revenue analysis and made available internally to the Finance department. Detailed reports with domain names that meet specified criteria for registration renewals, transfers, and deletions are distributed to each registrar.

 

D15.2.6.1 Billing Tools

Registrar Tool. A Registrar will be able to check its available credit using the Registrar Tool on the Registry’s web site.

Low Balance Emails. Prior to beginning registrations, each registrar selects a “Low Balance Notification Percentage” value. The Low Balance Notification Percentage indicates at what point a registrar wishes to be notified of a low account balance.  When a registrar’s available credit is equal to or less than its Low Balance Notification Percentage times its total credit limit, the system sends automated email notifications to the Registrar’s routine email address. Emails are generated at 7:00 AM and 7:00 PM JST (Japan Standard Time).
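The notification rule above reduces to a single comparison. The sketch below is illustrative only; the function name and the convention of passing the percentage as a number between 0 and 100 are assumptions.

```python
def needs_low_balance_email(
    available_credit: float,
    credit_limit: float,
    notification_percentage: float,
) -> bool:
    """True when available credit is at or below the notification threshold.

    notification_percentage is expressed as 0-100 (assumed convention).
    """
    threshold = credit_limit * notification_percentage / 100
    return available_credit <= threshold
```

For example, with a credit limit of 10,000 and a notification percentage of 20, a notification is due once available credit is 2,000 or less.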

 

D15.2.6.2 Technical Characteristics

The Registry provides billing reports to its Registrar customers so that they can review and reconcile their accounts. These reports are generated automatically and made available through a secure web site or from a secure FTP server. The Registry also uses these reports to prepare monthly invoices, which are currently prepared and submitted manually. No changes will be made to the SRS for billing until volumes increase to a point where manual processes are inadequate.


Table D15.2-3 Billing Report Summary

Generation Date | Type/Description | Audience | How Available
--------------- | ---------------- | -------- | -------------
Weekly | Summary revenue. Subtotals by registrar within each report | Finance | E-mail distribution
1st of month | Summary revenue | Finance | E-mail distribution
7th of month | Summary billing & revenue | Finance | E-mail distribution
7th of month | Detail reports for registrations, transfers, extensions, and refund/no refund deletions for each transaction type; registrar-specific | Registrars (registrar-specific info only) | Registrar tool and FTP site
17th of month | Auto-renewal | Finance | E-mail distribution

 

Examples of the reports to be generated for the registrars are as follows:

 

Table D15.2-4 Billing Report Examples

Monthly Billing Reports (Detailed and Summary as currently in SRS)

·        Monthly Registration Report

·        Monthly Transfer Report

·        Monthly Auto-renewal Report

·        Monthly Additional Years Added Report

·        New Registration Deletion Report (Refund and Non-Refund)

·        Auto-Renewal Deletion Report (Refund and Non-Refund)

·        Additional Years Deletion Report (Refund and Non-Refund)

·        Transfer Deletion Report (Refund and Non-Refund)

 

Revenue Reports (Monthly and weekly as currently in SRS)

·        Registration Report

·        Transfer Report

·        Auto-Renewal Report

·        Additional Years Added Report

 

D15.2.6.3 Accessibility and Security

There are two ways to access the registrar billing reports: through the Registrar Tool using a browser, and by logging on to a secure FTP site and downloading the reports.  IP filtering based on source address restricts access to the FTP server to accredited registrars, and all logon attempts are logged and periodically checked.
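Source-address filtering of the kind described above can be sketched as follows. The address blocks shown are reserved documentation ranges used purely as placeholders, and the function name is hypothetical; the actual filtering would typically be enforced on routers or the FTP server itself.

```python
import ipaddress


def is_allowed(source_ip: str, allowed_blocks: list[str]) -> bool:
    """True when source_ip falls inside one of the permitted address blocks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(block) for block in allowed_blocks)


if __name__ == "__main__":
    # Placeholder blocks standing in for accredited registrars' networks.
    registrar_blocks = ["192.0.2.0/24", "198.51.100.0/24"]
    print(is_allowed("192.0.2.10", registrar_blocks))
    print(is_allowed("203.0.113.5", registrar_blocks))
```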

Denial of Service (DoS) attacks occur when one or more systems flood a network or individual services on that network with disruptive traffic. These attacks may come from many source addresses–a so-called distributed DoS (DDoS) attack–or from a single address. In either case, recovery options are limited and involve quenching the source of the attack either by filtering traffic at network routers or tracing the attack back to the origin and taking the originating server(s) off the network.

It is therefore crucial to detect DoS and DDoS attacks as early as possible, which requires an automatic detection mechanism. KDDISOL will deploy an Intrusion Detection System (IDS) called EMERALD, which combines rule-based and statistical intrusion detection techniques.

All logon access to the registrar billing information is limited to specific points of contact at the registrars, who are provided unique IDs and passwords. Any changes to registrar contacts must be authorized and authenticated through Customer Support.

 

D15.2.7 Data Escrow and Backup

Frequency and procedures for backup of data. Describe hardware and systems used, data format, identity of escrow agents, procedures for retrieval of data/rebuild of database, etc.

 

D15.2.7.1 Overview

The goal of the Escrow Process is to periodically encapsulate all Registrar-specific information into a single Escrow File and to make this file available to a third party for escrow storage.

Existing Daily and Weekly reports as well as a new Registrars Report will be used to construct the Escrow File because these reports, when taken together, describe completely the entire set of Registrars.

The Escrow Process employs a method of encapsulation whereby the Daily, Weekly, and Registrar reports are concatenated, compressed, signed, and digested into a single file. The format of this encapsulation enables the single file to be verified for Completeness, Correctness, and Integrity by a third party.

 

D15.2.7.2 Escrow Process

The escrow process first creates a format file for each report file. A “tar” utility then concatenates the files into a single data file, which is compressed. For authentication, a digital signature is applied to the data file. A “checksum” algorithm is then run over the digitally signed file to create a message digest. The message digest is then concatenated to the data file to create a single file suitable for escrow.
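The encapsulation steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the Registry's escrow format: an HMAC with a shared key stands in for the digital signature (a real deployment would use a public-key signature), SHA-256 is assumed for the message digest because the source does not name an algorithm, the per-report format files are omitted, and all function names are hypothetical.

```python
import hashlib
import hmac
import io
import tarfile

SIGNATURE_LEN = 32  # HMAC-SHA256 output size (stand-in for a real signature)
DIGEST_LEN = 32     # SHA-256 output size (assumed digest algorithm)


def build_escrow_file(reports: dict[str, bytes], signing_key: bytes) -> bytes:
    """Concatenate reports with tar, compress, sign, and append a digest."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in sorted(reports.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    compressed = buffer.getvalue()
    signature = hmac.new(signing_key, compressed, hashlib.sha256).digest()
    signed = compressed + signature
    return signed + hashlib.sha256(signed).digest()


def verify_escrow_file(escrow: bytes, signing_key: bytes) -> dict[str, bytes]:
    """Check digest and signature, then return the recovered reports."""
    signed, digest = escrow[:-DIGEST_LEN], escrow[-DIGEST_LEN:]
    if hashlib.sha256(signed).digest() != digest:
        raise ValueError("message digest mismatch")
    compressed, signature = signed[:-SIGNATURE_LEN], signed[-SIGNATURE_LEN:]
    expected = hmac.new(signing_key, compressed, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch")
    with tarfile.open(fileobj=io.BytesIO(compressed), mode="r:gz") as archive:
        return {m.name: archive.extractfile(m).read() for m in archive.getmembers()}
```

Because the digest covers the signed data and the signature covers the compressed archive, a third party can check integrity first and authenticity second before unpacking anything.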

 

D15.2.7.3 Data Verification

The verification process uses layers of meta-data encapsulated in the escrow file to construct a verification report, which indicates whether an escrow file meets the above authentication requirements.

 

D15.2.7.4 Data Format

Standard UNIX utilities are used to concatenate and compress the files into a single file for more efficient storage and recovery.

 

D15.2.7.5 Restoration Process from Escrow Data

If file recovery from the escrow data is required, the tapes are retrieved from the offsite storage facility and the escrow steps reversed to uncompress and recover the files. 

 

D15.2.7.6 Backup Procedures

The domain name database is backed up fully on a daily basis.

 

D15.2.7.7 Backup Hardware and Software

KDDISOL will use EMC and Storage Tek hardware and Veritas software for backing up the files for escrow.

 

D15.2.7.8 Escrow Agent Identity

KDDISOL will engage a reliable escrow agent in Phase 2.

 

D15.2.7.9 Recovery Procedures

If escrow data are needed, KDDISOL’s offsite storage is contacted and the appropriate tape or tapes are couriered back to KDDISOL.

 

D15.2.8 Publicly accessible lookup/Whois Service

Address software and hardware, connection speed, search capabilities, coordination with other Whois systems, etc.

 

D15.2.8.1 Hardware and software

Hardware

The Whois daemon will run on multiple servers that are scalable with more memory, CPUs and disk space as needed. These servers are actively/dynamically load balanced to provide optimum response time and reliability. Each server accepts connections from a variety of clients, and accesses a local copy of the Whois data files. This architecture is scalable as query traffic increases by adding additional servers and/or increasing the capacity of the existing servers.

 

Software

The Whois service is implemented via two major software components:

1.      Data extraction and format applications

2.      Whois server daemon

 

The Whois data extraction applications generate the Whois data files and indexes from a static read-only portion of the Registry database. These applications will run on servers located on the internal network of the registry and cannot be accessed by the Internet population.

The formatted Whois data files are then transported to the Whois server machines. All Whois servers have the same data and will be actively load balanced. These Whois servers handle Internet users’ queries directly once the queries have passed through the site load balancing equipment.

The Whois daemon runs on each of several servers, accepts connections from a variety of clients, and accesses a local copy of the generated files. The daemon is configured using a configuration file that may be edited and then re-read on the fly. This configuration file controls much of the dynamic behavior of the daemon, including disclaimer and other query response output, maximum load, and speculation control. The daemon may be configured to have different properties for each of several ports, thus allowing users of different classes to obtain different qualities of service.

The Whois daemon gives the administrator as much control as is reasonable over the number, type, and behavior of incoming sockets. This control does not affect the rest of the daemon architecture (e.g., logging, error handling, searching, and state management).

In the daemon, two fundamental objects must be configured: sockets and behaviors.  Customizing these objects enables the Registry to tune the operation of the server to provide almost any level of service required.

 

D15.2.8.2 Network Connectivity

Whois servers will be located in a segmented LAN configuration to segregate them from other internal Registry functions for performance and security reasons. The Whois service is supported by the same Internet connectivity that supports the Registrar-to-Registry interaction. Multiple connections to multiple ISPs provide the capacity and redundancy required for high availability Whois services. See Section D15.2.10.3 for more network connectivity details.

 

D15.2.8.3 Search Capabilities

The Whois implementation will use the standard Whois server application used by the Internet population. This application can be used to look up records in the registry database (via the Whois data files) to provide information about domains, nameservers, and registrars. Searches for text strings embedded in domain information fields will be supported to the extent that current standard Whois server implementations allow.

In the future, KDDISOL will require registrars, as part of their contracts, to include additional information, such as a more detailed company profile, in the records of the registrar database. KDDISOL will also develop new Whois client software that is upward compatible with existing software, so that registrants can obtain more useful information from the registrar database. The stability of the Internet will be maintained because no modification of the registry database is required.
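For readers unfamiliar with how a standard Whois lookup works on the wire, the sketch below shows a minimal client. It assumes the conventional Whois exchange documented in RFC 3912: the client opens a TCP connection to port 43, sends the query followed by CRLF, and reads until the server closes the connection. The server name passed in is the caller's choice; nothing here is specific to the Registry's own daemon.

```python
import socket


def build_query(name: str) -> bytes:
    """Format a Whois query: the search string followed by CRLF."""
    return name.strip().encode("ascii") + b"\r\n"


def whois_lookup(server: str, name: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send one query to a Whois server and return the full text response."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_query(name))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A call such as whois_lookup(server, "example.com") returns whatever text the chosen server publishes for that name.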

 

D15.2.8.4 Coordination with Other Whois systems

Referral Whois (RWhois) can be implemented in a controlled, test-bed fashion if interaction with other registrars’ or registries’ Whois services is required.  This service is not currently supported at the Registry.

 

D15.2.9 System Security

Technical and physical capabilities and procedures to prevent system hacks, break-ins, data tampering, and other disruptions to operations. Physical security.

 

D15.2.9.1 Registry System/Network Security

The Registry will be connected to the Internet via two border routers and multiple DS3 connections for diversity. Border routers will use Access Control Lists (ACLs) to control access from the Internet. RRP Application Gateway, Whois, and web servers will reside behind the border routers but outside the firewalls, and have access to them controlled by destination IP address and port number. Access to the application Gateway is also filtered by source address block, ensuring that no one other than the accredited registrars will gain access. One of the TLD servers will also reside on this network and be accessible from the Internet to answer queries.

The Application Gateway servers will be configured with internal and external interfaces, each assigned to a different subnet. External interfaces will receive queries and registration requests from the Internet, whereas the internal interface will be used for communicating to the application and database servers. Acting as a proxy, the Application Gateway will accept and pass query requests and registration information through the firewall to the application server, thereby eliminating direct registrar access to the backend servers. This approach provides superior security from hackers or other Internet based threats.

Firewalls will be used to secure the internal network and the application and database servers. The firewall will be configured with rules to allow only data traffic between the Application Gateway on the external network and the application and the database servers on the internal network. Additional rules will allow the Registry’s internal management systems to access the servers for monitoring purposes and to refresh files as necessary.

Changes to the ACLs and firewall rules are tightly managed by Operations, who use structured change management techniques to oversee changes when registrars are added or deleted, or other changes are made. The Registry utilizes security scanning software to constantly monitor its network for security leaks, and has contracted with an outside firm to run “friendly” scans against the network at least twice a year.  Results of the scan are promptly reported to Registry Operations.

Even with these standard security measures, a determined attacker can bypass the defenses.  Moreover, perimeter security weakens as the registry business opens its networks to registrars and Internet users.  The need to secure core assets therefore calls for an Intrusion Detection System (IDS) to monitor for and respond to misuse.  As the Registry IDS shown in Figure D15.2-3, we chose EMERALD. Developed by SRI International, EMERALD is a comprehensive, highly scalable open system, and MIT Lincoln Laboratory reported on 13 Dec. 1999 that it had the highest overall performance in their intrusion detection evaluation program.

 

Figure D15.2-3 Registry Business Requires Distributed Monitoring

 

EMERALD consists of monitors of host computers and network traffic together with rule-based and statistical analysis engines, and has a three-tier system architecture that separates data collection from analysis and reporting.  This highly customizable platform allows rapid addition of new monitors, analysis engines, correlators, and reports as new intrusion threats emerge during the course of Registry business.  In rule-based intrusion detection, a stream of events is mapped against an abstract representation (i.e., “rules”) of target activity, and an expert system characterizes known attacks and vulnerabilities.  The analysis engine sends an alarm when matches are identified.  Because Internet technology is evolving rapidly, new attacks continue to appear.  In our assessment, IDSs other than EMERALD are weak against these new or unknown attacks and are not suitable for the Registry service.  EMERALD’s advantage is its Statistical Anomaly Detection capability for such unknown attacks, which, as shown in Figure D15.2-4, builds a profile of ‘normal’ activity (individual user profiling, client-server session analysis, network traffic profiling), compares short- and long-term activity patterns, and raises an alarm when use departs from established patterns.

 

Figure D15.2-4 Statistical Anomaly Detection Required for New Hackers
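The general idea of statistical anomaly detection described above, comparing short-term activity with a long-term profile, can be illustrated with a simple deviation test. This sketch is not EMERALD's algorithm; the single metric (for example, requests per minute), the threshold of three standard deviations, and the function name are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def is_anomalous(
    long_term: list[float],
    short_term: list[float],
    threshold: float = 3.0,
) -> bool:
    """Flag recent activity that departs from the long-term profile.

    long_term:  historical observations forming the 'normal' profile
    short_term: the most recent observations
    threshold:  allowed deviation, in long-term standard deviations
    """
    baseline = mean(long_term)
    spread = pstdev(long_term)
    recent = mean(short_term)
    if spread == 0:
        # A perfectly flat profile: any change at all is a departure.
        return recent != baseline
    return abs(recent - baseline) / spread > threshold
```

Unlike a rule that matches a known attack signature, a test of this kind needs no prior description of the attack, only a history of normal behavior.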

 

Security Breach Recovery

A security breach occurs when one or more systems are accessed (and potentially modified) by unauthorized personnel. Often such breaches occur via a network connection. Recovery from security breaches is straightforward, but is often time consuming and potentially disruptive to the services hosted on the affected systems. Certain security breaches may disable a service, for example Registration, for the duration of the recovery and cleanup activities. Following is a summary of the steps involved in recovering from a security breach:

1.      Using the IDS (Intrusion Detection System) and its logging data, identify affected systems and remove them from the network to prevent further damage

2.      Identify the mode of access (how the attacker gained access), for example, a compromised account ID and password or an exploited service

3.      Notify appropriate law-enforcement authorities of the event

4.      Correct weaknesses exploited on all systems, including those not breached

5.      Collect and preserve evidence and other information for turnover to law enforcement

6.      Cleanse affected systems by reformatting disks and re-installing the operating system, software, and data from the most recent back-up prior to the breach

7.      Reconnect systems to the network and restore services

 

D15.2.9.2 Physical Security

Physical security for the Registry is of paramount importance based on the value of the services provided to the Internet community. In this regard, the following precautions will be enabled:

 

Base Building

 

Physical Security

 

D15.2.9.3 Others

The Registry must be operated according to well-documented principles for information and physical security, implemented by adequately trained personnel. It will be a paramount target of sophisticated hackers worldwide, motivated by curiosity, malice, or greed. It therefore must incorporate the most robust information assurance technology to protect the database and other servers from corruption, preclude theft of private information by unauthorized third parties, and resist external denial-of-service attacks.

The physical Registry system must be secured against intrusion and protected against normal vicissitudes of operation that might compromise operational security.   The Shared Registration System (SRS) is the Registry architecture and processes used to enable registrations by multiple registrars.  It includes the Registry Registrar Protocol (RRP), which is used to support communications between the Registry and Registrars, and provides the security and authentication functions to protect the Registry database while supporting all necessary registrar operations.  RRP is also used during the certification process for accredited registrars for operational testing and evaluation of registrar implementations of the RRP prior to commencement of actual registrar operations.  KDDISOL will be responsible for providing the RRP software interfaces, documentation, and training to accredited registrars for the new TLD.  Hands-on technical support to new registrars should be available from KDDISOL to assist them in resolving difficulties in successful interfacing with the Registry.

Personnel responsible for software implementation and hardware operation must be screened carefully to eliminate potential internal security risks. KDDISOL Registry provides a web-based maintenance tool, Registry CSR Tool, for domain updates and administrative functions.  This site is password protected to maintain security for individual Registrar information.  Through this site, the Registrar will have access to daily and weekly reports, billing information, and the ability to update administrative and domain information.  To assist in providing quality support, KDDISOL CS will have access to view and update individual Registrar administrative and billing information through this web-based tool.  KDDISOL CS will also have the ability to update domain information for the Registrar and view Registrar reports.

In addition, to ensure the sanctity of remote distribution of the Registry products, the Registry must have 100 percent control of the remote distribution services, that is, the TLD servers.

 

D15.2.10 Capacities

Technical capability for handling a larger-than-projected demand for registration or load. Effects on load on servers, databases, back-up systems, support systems, escrow systems, maintenance, personnel.

 

D15.2.10.1 Average System Capacities

KDDISOL carefully selects the system vendors, IBM and Sun, based on their reliability, serviceability, performance, and scalability. Their respective average system capacities are dependent on their individual configurations, which will change as requirements and demands change. An architectural goal of KDDISOL is that these systems operate under 50% utilization, so that they can handle 100%+ peak loads, as well as supporting fail over scenarios where one server may have to assume the workload of two. These systems are constantly monitored, and proactively upgraded when average system utilization exceeds a pre-determined threshold. Memory and disk space utilization are also monitored as part of this process and upgraded as needed.
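The reasoning behind the 50% utilization goal above can be made concrete with a little arithmetic. The sketch assumes a pair of identical servers and treats utilization as a fraction between 0 and 1; the function names and the default threshold argument are illustrative.

```python
def failover_utilization(utilization_a: float, utilization_b: float) -> float:
    """Utilization of the surviving server if it absorbs its peer's workload.

    Assumes two identical servers, so workloads simply add.
    """
    return utilization_a + utilization_b


def needs_upgrade(average_utilization: float, threshold: float = 0.50) -> bool:
    """True when average utilization exceeds the pre-determined threshold."""
    return average_utilization > threshold
```

With both servers at 45%, the survivor of a failover runs at 90% and still has headroom; at 55% each, the survivor would need 110% of its capacity, which is why the threshold sits at one half.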

 

D15.2.10.2 Peak System Capacities

Peak system capacities are dependent on equipment configurations. KDDISOL is designing the new TLD registry infrastructure to accommodate numbers and growth rates similar to .com. As of June 2000, the subcontractor was processing over 20 million transactions a day and had over 19 million domain names.  Individual system capacities, including those of the escrow and backup systems, are scalable as needs require; in addition, the registry systems are designed to be expanded by adding systems and load balancing between them. By expanding horizontally with additional systems as well as vertically with additional processors, memory and disk space, there is huge growth potential. The Oracle database supports significantly more records than required even for the largest domain.

 

D15.2.10.3 Network Capacities

Phase 1

In Phase 1, KDDISOL will subcontract with the VeriSign Global Registry. The subcontractor designed and constructed its registry network to deliver exceptional availability, performance, scalability, security, and maintainability. In terms of bandwidth and connectivity, the registry supports four DS3 connections to the Internet from four different major ISPs. The border routers pass up to 1 million packets per second to and from the Internet. KDDISOL and the subcontractor monitor the circuits constantly for utilization and upgrade the circuits when they reach 50% average utilization.

 

Phase 2

KDDISOL designs and constructs its network to deliver exceptional availability, performance, scalability, security, and maintainability by connecting to KDDI’s public Internet service at sufficient bandwidth. In terms of bandwidth and connectivity at the registry’s main location (Otemachi, Tokyo), KDD supports about 2.4Gbps of IX and direct peering connectivity to over 50 ISPs in Japan and has approximately 1.7Gbps of connections to the US and 330Mbps to Asia. Each border router passes at least 1 million packets per second to and from the Internet. KDDISOL and KDD monitor the circuits constantly for utilization and upgrade the circuits when they reach 50% average utilization.

 

Future upgrades to the registry production network will include increasing the size of the circuits to the Internet and replacing fast Ethernet links with gigabit Ethernet links. 

 

D15.2.10.4 System Scalability

As indicated in earlier sections, the key to a successful registry implementation is the ability to scale as the demands on the systems increase. The subcontractor has architected scalability into the registry design to ensure sufficient capacity to manage large amounts of growth. Individual systems can be upgraded or additional systems added to increase the capacity of the registry.

The TLD configurations are also designed to scale in the same manner as the size of the zones and the number of queries increase.

 

D15.2.10.5 Personnel

KDDISOL operates on a 24x7x365 basis with a full complement of support staff for supporting the registry, back office, and TLD infrastructures. In critical situations, all the technical staff can be contacted via pagers or cell phones. Sufficient personnel are available to monitor and maintain current systems, troubleshoot, and develop additional features for the registry infrastructure. Each server location is a major technology center in its region, with access to a deep pool of engineering and operations talent.

 

D15.2.11 System Reliability

Define, analyze, and quantify quality of service.

 

D15.2.11.1 System Reliability, Availability, Serviceability

The Registry system is designed to be highly reliable, with state-of-the-practice architectural elements and operational procedures applied throughout. Using elements such as component redundancy, load balancing, high-availability (HA) configurations, hot spares, aggressive vendor maintenance contracts, and optionally, multi-site operations, the Registry will be able to ensure the uninterrupted availability of Registry services. The Registry will be designed to meet the following goals:

·        Provide uninterrupted service redundancy to mitigate the risk of most system failures

 

In addition to the core Registry infrastructure, the TLD name servers are to be distributed in multiple locations throughout the world. Although each TLD site depends on the facility where it resides, the TLD system, as a whole, will not depend on the Registry site except for updated zone files. Even with a loss of the Registry, the global TLD servers will continue to provide basic Domain Name Resolution Service within current zones.

 

D15.2.11.2 Database Integrity

The Registry will use the Business Continuity Volume (BCV) software feature of the EMC Symmetrix array to periodically perform backups, ad-hoc and regularly scheduled reporting, and corruption detection. Backups and restores are performed using the EMC EDM backup product, and complete images of the Oracle database are written to tape on a daily basis. Both ad-hoc and regularly scheduled reports are constructed from a physically separate reporting server connected to the Symmetrix array using BCV technology for the daily Oracle database image. Exhaustive Oracle block-level corruption detection and application-level data scrubbing are performed on the BCV image so operations personnel can detect corruption, determine an actionable root cause of failure, and implement solution alternatives early in the process. Both the primary and secondary sites have equal and compatible backup and restore technology.

 

D15.2.11.3 System Support

The Registry will provide a variety of tools to support the system. For problems that occur within the normal operation of the system (e.g., Customer Service requests), a web-based tool is available that allows for a variety of domain operations to be performed. For troubleshooting of system problems, a Registry Diagnostic Tool will be used which interrogates each of the system components to verify their proper functioning. This includes:

 

D15.2.11.4 Processes and Procedures

KDDISOL will document and use standard operating procedures (SOPs) in running the registry. Each step in the process of registering domain names, generating zone files, distributing zone files, and maintaining the backend infrastructure will be tested in an isolated Quality Assurance (QA) environment before being released. The QA environment will be designed to closely emulate the operational environment, and QA Engineers stress test hardware, software, and processes and procedures to ensure they will integrate cleanly and not be the cause of an interruption of service. The results of the tests are thoroughly documented and reported back to Engineering and Operations. This process is a closed loop process; any problems encountered during testing are fed back through the process, corrected, and retested.

For the most part, registry processes will be automated. Where operations intervention is required, there will be strict guidelines and checklists to ensure that all steps process correctly. The RCC monitors all the processes on a 24x7x365 basis. When a problem occurs, the RCC staff follows pre-defined procedures to identify and resolve the problem.  If the problem cannot be quickly resolved, there will be an aggressive escalation path to quickly involve the appropriate technical management and staff.

Registrars will be required to be accredited by ICANN. Once accredited, they must pass certification by KDDISOL to begin registering domain names. This process is an essential ingredient in ensuring that registrars will not face complications when beginning to register domain names in production mode. To assist when needed, there will be CSRs available on a 24x7x365 basis to answer questions and provide transactional assistance when required.

 

D15.2.11.5 Change management

We will use change management systems and processes in both Engineering and Operations departments to keep KDDISOL’s Registry Systems in operation. This includes periodic planned outages to perform maintenance on the registry systems. As indicated above, integrating changes into the registry requires passing a rigorous testing and evaluation stage before being allowed.

KDDISOL will employ technical project managers to plan and track execution of changes made to the Registry. They will conduct a risk analysis of any proposed change, and ensure that all affected parties are involved in any change.

 

D15.2.11.6 Service Level Agreement (SLA) Summary

The Registry will provide a world-class level of service to its customers. A Service Level Agreement, coupled with a yet-to-be-defined Registrar License and Agreement, will provide metrics for measuring the performance of the Registry and remedies that give accredited and licensed Registrars credits for certain substandard performance by the Registry.

Shared Registration System ("SRS") Availability shall mean when the SRS is operational. By definition, this does not include Planned Outages or Extended Planned Outages. Planned outage shall mean the periodic pre-announced occurrences when the SRS will be taken out of service for maintenance or care. The Registry will achieve 99.4% or better availability for the SRS system.
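To put the 99.4% figure in concrete terms, the sketch below converts an availability target into an unplanned-outage budget. It assumes that planned outages are excluded from the measured period, in line with the definition above, and uses a 30-day month as an example; the function names are hypothetical and the final SLA may define the calculation differently.

```python
def availability(
    total_minutes: float,
    unplanned_outage_minutes: float,
    planned_outage_minutes: float = 0.0,
) -> float:
    """Fraction of scheduled service time during which the SRS was operational."""
    scheduled = total_minutes - planned_outage_minutes
    return (scheduled - unplanned_outage_minutes) / scheduled


def allowed_unplanned_minutes(
    total_minutes: float,
    target: float = 0.994,
    planned_outage_minutes: float = 0.0,
) -> float:
    """Unplanned-outage budget that still meets the availability target."""
    return (total_minutes - planned_outage_minutes) * (1 - target)


if __name__ == "__main__":
    minutes_in_30_days = 30 * 24 * 60  # 43,200 minutes
    print(allowed_unplanned_minutes(minutes_in_30_days))
```

Under these assumptions, a 30-day month with no planned outages allows roughly 259 minutes, a little over four hours, of unplanned SRS outage.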

Unplanned outages are generally defined as the amount of time recorded between a trouble ticket first being opened by the Registry in response to a Registrar’s claim of SRS unavailability for that Registrar through the time when the Registrar and Registry agree the SRS Unavailability has been resolved with a final fix or a temporary work around, and the trouble ticket has been closed. Unplanned outages are also defined as any time that exceeds the planned outage time or the planned outage time interval.

SRS Unavailability shall mean when, as a result of a failure of systems within the Registry’s control, the Registrar is unable to either:

a) Establish a session with the SRS gateway which shall be defined as:

b) Execute a 3 second average round trip for 95% of the RRP check domain commands and/or less than 5 second average round trip for 95% of the RRP add domain commands, from the SRS Gateway, through the SRS system, back to the SRS Gateway as measured during each monthly Timeframe.
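One way to evaluate the round-trip criterion above against measured data is sketched below. The wording of the criterion admits more than one reading; this sketch assumes it means that at least 95% of commands of a given type complete within the stated limit during the measurement period. The function name and parameters are hypothetical.

```python
def meets_round_trip_target(
    samples_seconds: list[float],
    limit_seconds: float,
    required_fraction: float = 0.95,
) -> bool:
    """True when enough measured round trips finish within the limit.

    samples_seconds: round-trip times for one command type in the period
    limit_seconds:   e.g. 3 for check domain, 5 for add domain (per the text)
    """
    if not samples_seconds:
        raise ValueError("no samples recorded for the period")
    within = sum(1 for sample in samples_seconds if sample <= limit_seconds)
    return within / len(samples_seconds) >= required_fraction
```

A registrar or the Registry could run such a check over each month's logged RRP timings, once the negotiated SLA fixes the exact definition.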

The Whois service will be updated once a day and availability will be equal or better than that defined for the SRS system.

TLD servers will be updated a minimum of once a day and the collection of servers as a whole will provide 100% query service availability to the Internet population. The TLDs geographic and network diversity ensures that multiple servers will be operating at any given time.

If any service levels are not met during a defined interval (e.g. Month), a credit based on the volume of add domain transactions will be given to the affected registrar(s). The maximum credit provided will be limited to 5% or 10% depending on the metric that was exceeded or not met.

A specific SLA will be negotiated after contract award.

 

D15.2.12 System Outage Prevention

Procedures for problem detection, redundancy of all systems, back up power supply, facility security, technical security, availability of back up software, operating system, and hardware, system monitoring, technical maintenance staff, server locations.

 

Although high-availability features will be designed into all the registry systems and services, efforts will be concentrated on making core services “bullet-proof”. Core services are those that are required for the smooth operation of the Internet and whose failure would be immediately evident to the Internet community. These core services include:

Other services that are important to the operation of the Registry, but whose failure or degradation would not affect operation of the Internet include:

 

D15.2.12.1 Primary and Secondary Systems

The Registry intends to employ IBM and Sun UNIX systems in high-availability configurations to ensure no single point of failure. In addition, we expect to use offsite tape storage and an offsite disaster recovery facility that will be constantly updated with current information. Such a site would be utilized during a full outage and in some partial outage scenarios. See Section D15.2.1 for more system information, and Section D15.2.13 for more fail-over information.

Note:  Not all registry services include secondary facility support.

 

D15.2.12.2 TLD Systems and Constellation

The TLD configurations are designed so there are no single points of failure. This is accomplished through the use of redundant components, both at the system and component level. For example, multiple switches and load balancing devices will back one another up in the event one fails, and the devices will be configured with dual power supplies when available. Configurations are designed so that when a failure is detected, the service will fail over to the backup systems. High-availability operational procedures as established in RFC 2870, “Root Name Server Operational Requirements”, will be used as guidelines for building and maintaining the name servers.

There will initially be three geographically distributed TLD name servers to support the new TLD in Phase 2. Up to six more servers will then be added over the following five years, bringing the total to nine. These name servers will be strategically placed at topological cores of the Internet, that is, in those areas that serve the greatest number of hosts and users. In addition to topological diversity, there will be geographic diversity to ensure that man-made or natural disasters in a single region will not affect the ability of the remaining servers to answer queries. It is anticipated that the name servers will be placed in the following locations:

1.      Tokyo, Japan
2.      Los Angeles, CA
3.      Washington, DC
4.      London, UK
5.      Hong Kong, China
6.      Paris, France
7.      New York, NY
8.      Frankfurt, Germany
9.      Singapore

TLD query rates will be constantly monitored, and the TLD name servers re-deployed as necessary to best serve the needs of the Internet users of the new TLD.

The DNS software is also designed to handle a failure of one or more name servers, so a failure of one or more servers in the constellation will not materially affect TLD resolution services.

 

D15.2.12.3 Network Architecture

The network infrastructure is designed with redundant devices, multiple physical routes and physical diversity. The objective is to isolate single-point failures with no interruption of services or degradation in performance. In most cases, isolation of failures is automatic and occurs within a few seconds of the event. It would take a minimum of two simultaneous network-component failures to disable the network infrastructure. Certain component failures (such as firewall failure) may require manual intervention to complete the fail-over.

Internet connectivity is enabled through KDD’s own backbone as well as through peering and transit relationships with multiple ISPs. A failure of the KDD backbone or another ISP’s network will not disable access by registrars.

 

D15.2.12.4 System Monitoring

KDDISOL will utilize a range of standard and custom enterprise systems management tools to monitor and manage the registry production systems and the globally dispersed TLD constellation. These tools will be used by both the Network Operations Center and the Registry Operations staff for system and network monitoring. A brief description of each tool and its use is given below.

WebNM is an SNMP-based monitoring tool used to monitor system attributes such as:

Tool features will include monitoring real-time system availability for servers and network devices, an interactive web interface, and graphical displays of historical performance data. Thresholds can be set from which alarms are generated and forwarded to the RCC.

Concord SystemEDGE is an agent-based monitoring tool that uses SNMP to monitor system-specific attributes, including:

 

This tool's features will also include an integrated alert manager, an interactive web interface, system self-monitoring, and logfile monitoring. Thresholds can be set from which alarms are generated and forwarded to the RCC.
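The threshold mechanism described for both tools can be pictured with a minimal sketch. It is not the interface of either product; the attribute names, limits and alarm text are hypothetical, and forwarding to the RCC is represented simply by returning the list of alarms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    attribute: str   # hypothetical attribute name, e.g. "cpu_pct"
    limit: float     # an alarm is raised when a reading exceeds this value


def check_thresholds(readings: dict, thresholds: list) -> list:
    """Compare polled readings against thresholds and return alarm messages."""
    alarms = []
    for t in thresholds:
        value = readings.get(t.attribute)
        # Attributes missing from this poll are skipped rather than alarmed.
        if value is not None and value > t.limit:
            alarms.append(f"ALARM {t.attribute}={value} exceeds {t.limit}")
    return alarms


rules = [Threshold("cpu_pct", 90.0), Threshold("disk_pct", 85.0)]
alarms = check_thresholds({"cpu_pct": 97.0, "disk_pct": 40.0}, rules)
```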

A DNS Remote Real Time Monitor will be used to monitor the real-time traffic flow of root and TLD DNS servers. It monitors the following attributes:

§         Response time of last DNS query

§         Real-world query to server and compare to expected result
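The two monitored attributes can be sketched as a single probe that times one query and compares the answer with the expected result. The query function is injected by the caller, so the sketch makes no assumption about which resolver library the monitor would use; the names and the documentation-range address are hypothetical.

```python
import time
from typing import Callable, NamedTuple


class ProbeResult(NamedTuple):
    response_time_sec: float   # response time of the last DNS query
    matches_expected: bool     # whether the real-world answer was as expected


def probe_name_server(query: Callable[[str], str], name: str,
                      expected: str) -> ProbeResult:
    """Issue one query through `query`, time it, and check the answer."""
    start = time.perf_counter()
    answer = query(name)
    elapsed = time.perf_counter() - start
    return ProbeResult(elapsed, answer == expected)


# A stand-in resolver for illustration; a real monitor would query the server.
def fake_query(name: str) -> str:
    return {"example.com": "192.0.2.1"}.get(name, "")


result = probe_name_server(fake_query, "example.com", "192.0.2.1")
```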

EMERALD is an intrusion monitor that shows the real-time results of IDS operations over the target Registry system (locations of intrusion, methods of attack, internal network traffic trends, etc.).

TeamQuest is a performance analysis, diagnostic, management and modeling product suite.  It incorporates highly detailed operating system statistics, process accounting, custom data, and RDBMS performance data, including:

§         Identification of server problems

§         Drill-down investigation of events, alarms, and unusual system behavior

§         Root cause analysis of system performance issues

§         Trend analysis

§         Correlation of cause and effect

§         Compliance with service level objectives

§         Understanding the impact of substantial changes or new applications

§         Modeling  (Analytical Queuing Analysis or Discrete Event Simulation)

 

D15.2.12.5 Trouble Reporting

When problems are either reported to or observed by the NOC, the NOC staff will open a trouble ticket and perform preliminary analysis to determine the severity, diagnose the root cause and correct the problem if possible. Problems are assigned one of the following categories:

·      Severity 1 – service outage; severe or potentially severe impact

·      Severity 2 – service degradation; impact is not severe

·      Severity 3 – component outage; redundant components or workarounds prevent any service impact.
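The three categories can be expressed as a simple classification rule. The sketch below assumes that the NOC has already determined whether the service is down or degraded; the function and flag names are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    SERVICE_OUTAGE = 1        # severe or potentially severe impact
    SERVICE_DEGRADATION = 2   # impact is not severe
    COMPONENT_OUTAGE = 3      # redundancy or workarounds prevent service impact


def classify(service_down: bool, service_degraded: bool) -> Severity:
    """Map the observed service impact to a trouble-ticket severity."""
    if service_down:
        return Severity.SERVICE_OUTAGE
    if service_degraded:
        return Severity.SERVICE_DEGRADATION
    # A component has failed, but the service itself is unaffected.
    return Severity.COMPONENT_OUTAGE
```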

 

If the remote Registry NOC cannot resolve the problem, it will immediately escalate through the KDDISOL NOC to either the on-call System Administrator (SA) or the on-call DNS engineer in KDDISOL Technical Operations (depending on the nature of the problem). In the unlikely event that the problem cannot be resolved at this level, the problem is escalated to KDDISOL Engineering. A workaround may be provided until the issue is resolved. The KDDISOL NOC will update the remote Registry via phone or email on a periodic basis until the problem is resolved.

Monitoring of the remote Registry will also be conducted from the KDDISOL NOC. Any detected problems at the NOC will be communicated to the new Registry NOC for resolution. If the problem cannot be resolved locally, the problem will be escalated through the NOC as described above.

 

D15.2.12.6 Physical Security

Phase 1: VeriSign Global Registry Production Data Center

The VeriSign Global Registry production data center is protected by onsite security staff 24x7x365 and the use of card readers. Only employees are permitted unescorted access to the building. Additionally, the data center room is further restricted (via card readers) to only those employees who perform hardware installations or maintenance. Between the hours of 7pm and 7am all card access is disabled, and anyone requiring access to the data center must obtain a special entry badge from the Network Operations Center.

 

Phase 2: KDD Otemachi Data Center

The KDD Otemachi data center is protected by onsite security staff 24x7x365 and the use of card readers. Only employees are permitted unescorted access to the building. Additionally, access to the data center room is further restricted (via card readers) to only those employees who perform hardware installations or maintenance.

 

Remote Sites

All remote sites provide 24x7x365 onsite security that meets or exceeds the security at KDDISOL. KDDISOL equipment is contained in locked cabinets and, in some cases, locked cages. Most sites also provide separate data center rooms with limited access to each room.

 

D15.2.12.7 High-Availability

Please refer to Section D15.2.11

 

D15.2.12.8 Facilities

VeriSign Global Registry is located in a new state-of-the-art facility in Dulles, Virginia. The 10,600 square foot data center will house primary Registry systems and personnel engaged in Phase 1 activities. Please refer to Section D15.2.1.7 for more primary site details.

 

The secondary data center is located at a facility in suburban Maryland that provides secondary site support services.  There are multiple high-speed direct connections to this site from the VeriSign Global Registry Production Data Center to facilitate backup and fail-over scenarios.  The facility is supported by n+1 power and cooling, and is staffed 24x7x365. 

KDDISOL’s Registry will be located in KDD Otemachi Building in Otemachi, Tokyo. The 66,000 square foot data center will house primary Registry systems and personnel. Please refer to Section D15.2.1.7 for more primary site details.

 

D15.2.12.9 Natural and Man-Made Disaster Impact and Fire Suppression

VeriSign Global Registry Production Data Center.

This data center, located in northern Virginia is not in an earthquake zone, and therefore does not need protection against earthquakes. It does provide protection from flooding, but only limited protection from other natural disasters. Fire suppression is provided by an FM200 system that is smoke activated. As a backup, a heat-activated water sprinkler system will engage sprinkler heads individually.

VeriSign Global Registry Secondary Data Center

Same as above except that protection from all natural disasters is provided in a structurally reinforced facility.

 

KDD Otemachi Data Center

This data center is protected against earthquakes, and fire suppression is provided by a heat-activated, non-water-based system using halon gas. It does provide protection from flooding, but only limited protection from other natural disasters.

 

Remote Sites.

Some remote sites provide earthquake “hardening”, depending on the specific location. All the sites are in data colocation centers that are designed to withstand natural disasters endemic to the respective area. The sites all have non-water-based fire suppression systems similar to that employed in the KDD data center.

 

D15.2.12.9.1 Power Backup/HVAC and Redundancy

Redundant UPS units protect the data center. Additional redundant power features include:

 

 

Heating, ventilating and air conditioning (HVAC) units are air cooled, so no cooling water pipes are located within any of the aforementioned data centers. Additionally, the current HVAC units provide sufficient redundancy that some of them could fail and the remaining units would still maintain the data center within designed tolerances.

 

D15.2.12.10 Network Diversity

WAN connectivity has been designed with physical and logical diversity as a design goal. First-tier Internet Service Providers have been selected to guarantee network and routing diversity in case one or two carriers experience problems. Physical diversity is realized by working with the local access provider(s) to ensure that diverse physical routing of circuits is used where possible. At Otemachi, Tokyo, the main registry location in Phase 2, KDDISOL uses KDD’s public Internet service, which has sufficient diversity. KDD supports about 2.4 Gbps of IX and direct peering connections to over 50 ISPs in Japan, and has approximately 1.7 Gbps of connectivity to the US and 330 Mbps to Asia.

Local Area Network diversity is enabled through diverse pathing and employing routing and switching configurations that automatically detect failures and re-route packets transparently. The network is designed to exclude any single point of failure.

 

D15.2.12.11 Technical Maintenance Staff

KDDISOL’s technical maintenance staff check logs from all critical servers and routers several times a day via an automatic error log monitoring system and proactively look for symptoms of critical failures, system intrusions, and so forth. The staff continuously follow the activities of security advisory bodies such as CERT (Computer Emergency Response Team) and JPCERT. If these bodies announce a problem that threatens our systems, KDDISOL will remedy it as soon as possible.

 

D15.2.13 System Recovery Procedures

Procedures for restoring the system to operation in the event of a system outage, both expected and unexpected. Identify redundant/diverse systems for providing service in the event of an outage and describe the process for recovery from various types of failures, the training of technical staff who will perform these tasks, the availability and backup of software and operating systems needed to restore the system to operation, the availability of the hardware needed to restore and run the system, backup electrical power systems, the projected time for restoring the system, the procedures for testing the process of restoring the system to operation in the event of an outage, the documentation kept on system outages and on potential system problems that could result in outages.

 

As described in the System Reliability section of this document, the Registry will employ infrastructure and operational processes to mitigate the possibility of a crippling failure. However, a variety of methods are also available to handle the various system problems that might occur.

Business continuity and reliability are not aftermarket products. They are designed into services and systems from the outset. The Registry’s application of business continuity design elements, coupled with rigorous test and validation procedures, ensures that the critical services provided by the Registry, and the systems that support them, are sufficiently robust to mitigate the risk of potential business interruptions.

To support the scope of this section, Registry Services are separated into Critical Services and Non-critical Support Functions. The Registry Critical services are those required for the smooth operation of the Internet. They include:

• Domain Name Resolution Service

• Registration Service

• Whois Service

• Customer Service

 

Critical Services are defined as those services that directly support registrars and  DNS resolution services available to all Internet users at large. Non-critical Support Functions are other processes for which the external impact of an outage would be minor or nonexistent.

 

D15.2.13.1 Failure Scenarios

D15.2.13.1.1 DNS Service Failures

Two types of failures can impact providing DNS services to the Internet at large:

 

1.      Zone file generation failure

2.      TLD server failure

 

(1) Zone File Generation Failure

Full fail-over

A full fail-over means all processes are manually shifted in a controlled manner to operate on the secondary site. During a full fail-over, any zone-generation processes running at the primary site may be terminated (as necessary) to allow for the secondary site to take over these functions. Any zone files currently under construction are treated as unreliable and are discarded. If fail-over to the secondary site occurs while the zone-generation process is not running, no steps are necessary for the fail-over to occur.

 

Partial fail-over

A partial fail-over means all processes are shifted to operate on the secondary site in an uncontrolled manner. During a partial fail-over, terminating zone-generation processes running at the primary site may or may not be necessary.

If the zone-generation process is not running at the time at which fail-over to the secondary site occurs, no steps are necessary to fail-over zone generation.  If, however, the fail-over occurred during zone file distribution, then the administrator will execute procedures to initiate the file distribution process to the sites affected.

 

Zone Data Corruption

Through the use of the Business Rules Engine in the Registry systems, data is validated before it is placed in the Registry database. If the data in the database has been corrupted, then the administrator will perform database cleansing procedures. In addition, an attempt would be made to determine if the corrupted data has been propagated to the TLD servers. If it has, the administrator will follow procedures for reverting the TLD servers to a previous copy of the affected zone file(s).

Zone files are distributed within and outside the Registry system and their contents are validated at each step. If the validation of a copy ever disagrees with the master copy, the replication is considered to have failed and the flawed copies are destroyed. If a host intrusion on the zone file tagging area or on any of the root and TLD servers is detected, the zone file(s) on the affected host(s) should be compared with the master copies on the zone generation machine inside the Registry firewall. Standard Operating Procedures for rolling back corrupt zone files on a root or TLD server should be followed to repair the damage.
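One way to picture the comparison of distributed copies against the master copy is a digest check, sketched below. The use of SHA-256 is an assumption made for this example, as this document does not specify how the validation is performed, and the host names are hypothetical.

```python
import hashlib


def digest(zone_bytes: bytes) -> str:
    """Return a hex digest of a zone file (SHA-256 assumed for illustration)."""
    return hashlib.sha256(zone_bytes).hexdigest()


def find_flawed_copies(master: bytes, copies: dict) -> list:
    """Return the hosts whose copy does not match the master zone file.

    copies maps a host name to the zone file bytes held on that host.
    """
    expected = digest(master)
    return sorted(host for host, data in copies.items()
                  if digest(data) != expected)


master_zone = b"example. 86400 IN NS ns1.example.\n"
flawed = find_flawed_copies(master_zone, {
    "tld-a": master_zone,                               # intact copy
    "tld-b": b"example. 86400 IN NS evil.example.\n",   # altered copy
})
```

Hosts returned by the check would have their copies destroyed and replaced according to the rollback procedures.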

 

(2) TLD Server Failure

Hardware Failure

Various components at the TLD locations are configured for high availability. Should a redundant component fail, the “backup” component is designed to take over automatically. If a specific hardware component is not redundant, NOC personnel will work with onsite personnel to isolate the problem. Once the failed component is found, NOC personnel will initiate procedures to replace the defective component. Due to the “load balancing” nature of the DNS protocol, the failure of any single TLD server is dynamically accommodated by standard DNS processes, and a different TLD server will be used.

 

Name Server Application Software Failure

The Registry NOC will constantly monitor the health of the TLD constellation to maintain performance and availability goals. Once an anomaly is detected by NOC management systems, troubleshooting procedures will be initiated by NOC personnel to isolate the problem. Name server log files on the TLD server and archived log files will be reviewed to determine the nature of the problem.

Once the problem is corrected, log files are reviewed and queries are performed against the server to verify proper operation. Depending on the size of the zone files being used, the name server application will resume operation within 2 to 20 minutes after a restart of the application has been initiated.

 

Corrupt Zone Data

Extensive procedures have been developed to ensure that zone data files located on the TLD servers are error-free. Nevertheless, situations may occur in which one or more zone files resident on a TLD server become corrupted, accidentally or intentionally. Once a determination is made that a current zone file is corrupt, NOC processes will be executed to restart the name server application using local copies of previously used zone files. These local versions are created automatically and stored in an archive area on the local hard disk each time a new version of the zone files is loaded.
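The archive-and-revert behavior can be sketched with a small in-memory model. This is an illustration only: the class name and retention count are hypothetical, and a real TLD server would keep the versions on local disk and restart the name server application after reverting.

```python
class ZoneArchive:
    """Keep the most recently loaded zone file versions for rollback."""

    def __init__(self, keep: int = 3):
        if keep < 2:
            raise ValueError("need at least the current and one previous version")
        self._keep = keep
        self._versions = []

    def load(self, zone_data: str) -> None:
        """Archive a newly loaded version, discarding the oldest beyond `keep`."""
        self._versions.append(zone_data)
        del self._versions[:-self._keep]

    def current(self) -> str:
        return self._versions[-1]

    def rollback(self) -> str:
        """Discard the corrupt current version and return the previous one."""
        if len(self._versions) < 2:
            raise LookupError("no previous zone file available")
        self._versions.pop()
        return self._versions[-1]


archive = ZoneArchive()
archive.load("zone v1")
archive.load("zone v2 (corrupt)")
restored = archive.rollback()
```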

 

D15.2.13.1.2 Registration Service Failure

The registration service is primarily supported by the Shared Registration System (SRS), which consists of a protocol and the associated hardware and software that permits multiple registrars to provide Internet domain-name registration services within the TLD administered by the Registry.

A number of entities interface with the SRS, primarily registrars and Registry Customer Service Representatives (CSRs). Registrars access SRS through the Registry-Registrar Protocol (RRP) to register domain names and perform domain-name related functions such as the registration of name servers, renewal of registrations, deletions, transfers, and updates to domain names registered by that registrar. Registrars also have a web-based interface to access SRS to perform administrative functions, generate reports, perform global domain-name updates, and perform other self-service maintenance functions not available via RRP. The Registry provides support to the registrars for the SRS through the CSRs. The CSRs have a separate web-based interface to the Registry, through which, after authenticating the registrar, they can query and perform updates per the registrar requests.

 

The SRS consists of the following components:

The majority of disasters result in some sort of physical damage to the SRS hardware, facilities or communication channels; however, some disasters are less obvious in nature. For example, a denial of service attack could adversely affect the performance of gateway servers, rendering them useless for the duration of the attack. A hacker could compromise security and subsequently jeopardize the integrity of the SRS data. A software virus could infect one of the production servers and adversely affect performance, or result in data corruption.

There are different levels of severity associated with each of the potential disaster scenarios. For example, a small flood may destroy only a small section of a data center, bringing down one set of components in the system. On the other hand, a severe flood could damage or destroy the entire building, resulting in a complete loss of the primary data center. The disaster recovery process that would be followed for the former case may differ from the process followed for the latter. A careful review of the potential failure scenarios identified four categories of failure:

1.  Full fail-over

2.  Partial fail-over

3.  Non-fail-over

4.  Business reconstruction

 

Full fail-over

There are many types of failures that would result in a full fail-over from the primary site to the secondary site. For example, if the primary site were unavailable to the registrars because of a fiber cut, then a full fail-over would be necessary. If the primary site data center were destroyed or rendered unserviceable as a result of a severe natural disaster (e.g. flood, tornado, or earthquake), then a full fail-over would obviously be warranted.

 

Since the other secondary site components should all be in stand-by mode, they would not need to be reconfigured. All of the secondary site processes should be started. The registrars should be notified of this fail-over and instructed to use the secondary address(es) only to access the SRS.

 

Partial fail-over

Certain types of failures can occur which would be considered a disaster, but would not require a full fail-over to the secondary site. An example of this type of disaster would be some sort of primary site Oracle HA cluster failure. The servers themselves could be physically destroyed, or the power supply to the cluster could be interrupted indefinitely. Whatever the reason for the failure, a partial fail-over to the secondary would be required. A partial fail-over is when one or more components fail over to the secondary site, but a portion of the primary site remains operational.

 

Certain types of failures or disasters will not require a fail-over to the secondary site at all. If the hardware and physical network are still available, then it’s probable that the failure is due to user behavior, a security breach, or a software issue of some sort. These types of failures would most likely affect both the primary and secondary site and should be directly rectified, if possible. For example, if performance of the system were degraded as a result of a denial of service attack, both the primary and secondary sites would be affected by the attack. In this situation a full or partial fail-over to the secondary site would not make any sense.
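The choice among the first three categories can be summarized as a decision rule. The sketch below is a simplification under stated assumptions: the component names are hypothetical, only hardware or facility failures are counted as failed components, and business reconstruction is left out because it is not described in this section.

```python
from enum import Enum


class Recovery(Enum):
    FULL_FAIL_OVER = "full"
    PARTIAL_FAIL_OVER = "partial"
    NON_FAIL_OVER = "none"


def choose_recovery(primary_reachable: bool, failed: set, components: set) -> Recovery:
    """Pick a recovery category from the state of the primary site.

    failed: primary-site components lost to hardware or facility failure.
    components: all primary-site components.
    """
    if not components:
        raise ValueError("component inventory must not be empty")
    failed_here = failed & components
    if not primary_reachable or failed_here == components:
        return Recovery.FULL_FAIL_OVER      # site lost or cut off from registrars
    if failed_here:
        return Recovery.PARTIAL_FAIL_OVER   # some components move, the rest stay
    # Hardware and network are intact: a software, security or usage problem
    # that would affect both sites, so it is fixed in place.
    return Recovery.NON_FAIL_OVER


site = {"gateway", "app-server", "oracle-ha-cluster"}
```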

 

D15.2.13.1.3 Whois Service Failures

Directory service consists of two major components: Whois servers and the Whois data extraction process.

The Whois daemon runs on each of several servers, accepts connections from a variety of clients, and accesses a local copy of the directory service database to answer these queries. The Whois data extraction process generates the directory service database from the Registry database.

Directory service is able to run at both the primary and secondary sites. Whois queries are load balanced to the directory-service servers across both sites. Also, the directory service process is run in test mode at the secondary site to verify functionality and accuracy in case site fail over is required.

 

Full fail-over

In full fail-over, the directory service is manually switched over from primary site to secondary site. Since Whois daemons on both the sites provide directory service, if all the daemons at one site fail, the daemons at the other site continue to provide the service. There is no fail-over required.

If the Whois file generation system becomes unavailable, the Whois file generation service is failed over to the secondary site and the Whois daemon servers are shut down on the primary site. The Whois file generation process on the secondary site is configured to run in production mode. It generates the Whois database, validates it and replicates it on Whois daemon servers on the secondary site only.

If the database becomes unavailable on the primary site, the Whois file generation process is disabled on the primary site. The Whois daemon servers are shut down on the primary site. Whois file generation process is enabled on the secondary site to run in production mode. It generates the Whois database, validates it and replicates it on Whois daemon servers on the secondary site only.

 

Uncontrolled fail-over

In an uncontrolled fail-over there is no opportunity to gracefully shut down the service on the primary site. In this scenario, both the Whois daemon and Whois file generation go out of service due to unforeseen circumstances, and the disaster results in the service being unavailable. In such a situation the service is manually enabled on the secondary site. The Whois file generation process is enabled on the secondary site to run in production mode. It generates the Whois database, validates it and replicates it on the Whois daemon servers on the secondary site only.

 

Non-fail-over

Denial of service (DoS)

Directory service can also become unavailable because of a DoS attack. The Whois daemon has built-in defenses against DoS attacks: it is configured to block IP addresses that send more than a pre-configured number of queries per second. This is not a complete defense, because Whois daemon resources are still consumed in determining the IP address of the client sending the queries, which degrades the quality of the directory service. Failing over to the secondary site is not a solution, because the directory service load is distributed across both sites and hence both sites are under attack. Denial of service attacks are best handled at the border router, where the offending IP address is blocked directly. This spares the directory service the work of identifying and blocking offending IP addresses. KDDISOL will also use an Intrusion Detection System (IDS) called EMERALD, which can find new and unknown attacks by means of statistical anomaly detection, making the system considerably more robust.
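The per-address query limit described for the Whois daemon can be sketched as a sliding one-second window. This is not the daemon's actual implementation; the class name, the permanent block list and the explicit timestamp argument (used to keep the example deterministic) are assumptions.

```python
from collections import defaultdict, deque


class QueryRateLimiter:
    """Block any client IP that exceeds max_per_second queries in one second."""

    def __init__(self, max_per_second: int):
        self.max_per_second = max_per_second
        self.blocked = set()
        self._recent = defaultdict(deque)   # ip -> timestamps within the window

    def allow(self, ip: str, now: float) -> bool:
        """Record a query from `ip` at time `now`; return False if refused."""
        if ip in self.blocked:
            return False
        window = self._recent[ip]
        # Drop timestamps that are one second old or older.
        while window and now - window[0] >= 1.0:
            window.popleft()
        window.append(now)
        if len(window) > self.max_per_second:
            self.blocked.add(ip)
            return False
        return True


limiter = QueryRateLimiter(max_per_second=2)
decisions = [limiter.allow("192.0.2.1", t) for t in (0.0, 0.1, 0.2, 5.0)]
```

The sketch also shows the limitation noted above: every query, including those that are refused, still costs the daemon a lookup, which is why blocking at the border router is preferable.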

If a hacker compromises the Whois daemon servers and the service is consequently unavailable, a full fail-over to the secondary site is initiated.

 

Data Corruption

If the Whois database on one of the Whois daemon servers at the primary or secondary site is corrupted, that server is shut down, uncorrupted data is copied over from one of the other Whois daemon servers, and the server is brought back up. If all the Whois daemon servers at one site have a corrupted database, all of the Whois daemons at that site are shut down, uncorrupted data is copied over from the other site, and the servers are brought back up. If all the Whois daemon servers at both sites have a corrupted database, all the Whois daemons at both sites are shut down. The Whois database is reverted to the previous day’s known-good database and the Whois daemons are restarted. The Whois dumper is then started to regenerate the database on the primary site. Once Whois database generation is complete, the database is replicated to the Whois daemon servers at both sites, and all the Whois daemon servers are restarted to refresh their data.
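The three corruption cases above form a small decision tree, sketched here. The site names, the action labels and the per-server boolean flags are hypothetical; a mixed case (one site fully corrupted, the other partly) is treated as a copy from the other site, which is a simplification.

```python
def whois_recovery_action(corrupt: dict) -> str:
    """Choose a recovery action from per-server corruption flags.

    corrupt maps a site name to a list of booleans, one per Whois daemon
    server at that site (True means that server's database is corrupted).
    """
    if not corrupt or any(len(flags) == 0 for flags in corrupt.values()):
        raise ValueError("every site must list at least one server")
    fully_corrupt = [site for site, flags in corrupt.items() if all(flags)]
    if len(fully_corrupt) == len(corrupt):
        return "revert_to_previous_day"     # no good copy left at any site
    if fully_corrupt:
        return "copy_from_other_site"       # one whole site is corrupted
    if any(any(flags) for flags in corrupt.values()):
        return "copy_from_other_server"     # a good copy remains at the same site
    return "none"
```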

 

D15.2.13.1.4 Customer Service System Failure

Customer Service provides 24-hour technical support via telephone and e-mail. One-on-one support includes both general information and problem resolution. CSRs have their own Web-based tool (CSR Tool) for querying and modifying the database. This tool gives the CSRs the ability to query registration information at the request of the contacting registrar. CSRs with appropriate access levels can modify the registration information to correct errors made by the registrars. If a problem occurs that is beyond the scope of the CSRs to rectify, a well-defined escalation process is followed to alert appropriate Operations and Engineering personnel.

 

Impacts of Failures

The following scenarios address the system-level disaster recovery processes (tools and E-mail). There are two failure points in the systems: CSR web-server fail-over and underlying database fail-over. Along with the system fail-over decisions, the decision must be made whether to relocate the CSRs to the backup location, entailing rerouting of telephone communications.

 

Full fail-over

In a full fail-over, it will be necessary to complete any write transactions (database modifications) in progress in the CSR tool. After the write transactions are complete, the next action depends on the area where the failure was detected:

·       If the failure occurs in the CSR web servers, the underlying network routing mechanisms will automatically route further actions to the operational web servers. No further actions are necessary besides disabling the currently active web server.

 

·       If the failure occurs in the underlying database, the web servers will have to be pointed to the secondary database. This action requires changing a configuration parameter (IP address) on both of the web servers and restarting the web server application.

 

·       If the decision is made to relocate the CSRs, the CSRs will physically move to the secondary site and begin their operations at that site. No system changes are necessary.

 

Partial fail-over

For an uncontrolled fail-over, the process is the same as for a full fail-over, except that there is the possibility that transactions in progress have not completed successfully. Once the underlying systems have successfully failed over, the CSRs will have to query the database to determine if their last action was completed successfully (using the CSR Tool). At this point, it may be necessary for the CSRs to contact the customer to ensure that the data is correct.

 

Non-fail-over

Denial of service (DoS)

Because almost all of the CSR Tool operates on an internal network, it is less susceptible to many typical service interruptions (loss of communications lines, DoS attacks, etc.). In addition, KDDISOL will run an Intrusion Detection System (IDS) called EMERALD on the WWW server that is connected directly to the Internet outside the firewall.

For the identified areas of vulnerability, the actions are:

·          CSR Tool–follow the process for uncontrolled fail-over.

·          Database–follow the appropriate process for database fail-over.

 

Data Corruption

The CSRs are a resource for identifying data corruption (e.g., when a customer notices a failure in a registered domain or name server). However, the CSR tools have no inherent capability for detecting or correcting data corruption. In the event of large-scale data corruption, the database recovery procedure would be followed.

 

D15.2.13.2 Data Restoration

D15.2.13.2.1 Data Recovery

To protect and recover data associated with critical services, the Registry will employ the EMC Symmetrix Remote Data Facility (SRDF) product in conjunction with the Oracle Database Management System (DBMS). SRDF provides for significant operational flexibility in the following areas:

 

D15.2.13.2.2 TLD data restoration

Each TLD location maintains a tape backup of its system configuration in case of a hardware failure. If multiple name servers are present at the location, once the downed system has been repaired/replaced, it is rebuilt from system tapes. The zone data is either copied to the TLD server from the NOC or is transferred locally in the case of a multiple name server location.

TLD servers also keep backup copies of previous valid zone files in case the current zone file becomes corrupt or the application has problems using the current zone file. Restoration of name server operation will occur with the backup copies of the zone data until a valid current zone file is available.

 

D15.2.13.3 Network Recovery

The network infrastructure (both WAN and LAN) is designed to isolate single-point failures with no interruption of services or degradation in performance. In most cases, isolation of failures is automatic and occurs within a few seconds of the event. Certain component failures (such as firewall failure) may require manual intervention to complete the fail-over.

The Network Operations Center (NOC) will proactively monitor all equipment and WAN circuit activity at local Registry data centers as well as remote TLD sites to prevent outages. If an outage does occur, the NOC will act immediately to isolate the problem and initiate repairs.

 

D15.2.13.3.1 Denial of Service Recovery

Denial of Service (DoS) attacks occur when one or more systems flood a network or individual services on that network with disruptive traffic. These attacks may come from many source addresses–a so-called distributed DoS (DDoS) attack–or from a single address. In either case, recovery options are limited and involve quenching the source of the attack either by filtering traffic at network routers or tracing the attack back to the origin and taking the originating server(s) off the network.

It is therefore crucial to detect DoS and DDoS attacks as early as possible, which requires an automatic detection mechanism. KDDISOL will deploy an Intrusion Detection System (IDS) called EMERALD, which applies advanced intrusion detection techniques, for this purpose.
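One simple form of automatic detection is per-source rate thresholding. The sketch below is not a description of how EMERALD works; it only illustrates the general idea of flagging sources for router-level filtering. The function name, the `(timestamp, source address)` packet representation, and the threshold values are all assumptions made for this example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def sources_over_threshold(packets: Iterable[Tuple[float, str]],
                           window_start: float,
                           window_seconds: float,
                           max_packets: int) -> List[str]:
    """Return source addresses that sent more than max_packets
    within [window_start, window_start + window_seconds)."""
    window_end = window_start + window_seconds
    counts = Counter(src for ts, src in packets
                     if window_start <= ts < window_end)
    return sorted(src for src, n in counts.items() if n > max_packets)


# Illustrative traffic: one noisy source and one normal source.
packets = [(0.1 * i, "192.0.2.1") for i in range(8)] + [(0.5, "192.0.2.2")]
print(sources_over_threshold(packets, 0.0, 1.0, max_packets=5))
```

A per-source threshold of this kind only catches single-address floods. A distributed attack spreads traffic across many addresses, so detecting it would also require watching aggregate load on the target service.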

 

D15.2.13.3.2 Security Breach Recovery

A security breach occurs when one or more systems are accessed (and potentially modified) by unauthorized personnel. Often such breaches occur via a network connection. Recovery from security breaches is straightforward, but is often time-consuming and potentially disruptive to the services hosted on the affected systems. Certain security breaches may disable a service, for example Registration, for the duration of the recovery and cleanup activities, which proceed as follows:

1.      Identify affected systems and remove them from the network to prevent further damage

2.      Identify mode of access (how the attacker gained access)–for example, account ID and password compromised or service exploited

3.      Notify appropriate law-enforcement authorities of the event

4.      Correct weaknesses exploited on all systems including those not breached

5.      Collect and preserve evidence and other information for turnover to law enforcement

6.      Cleanse affected systems by reformatting disks and re-installing operating system, software, and data from the most recent back-up prior to breach

7.      Reconnect systems to network and restore services

 

D15.2.13.4 Redundancy/diversity

Please refer to Section D15.2.11 for information on system redundancy.

 

D15.2.13.5 Training of Technical Staff

KDDISOL will train its staff to recover systems. Staff members are chosen from experts with deep knowledge of IP and UNIX technology and extensive experience in designing large IP networks and server systems. KDDISOL staff are expected to have skills equal to or exceeding those of holders of certifications such as Cisco Certified Internetwork Expert (CCIE) and Oracle Certified Professional (OCP).

 

D15.2.13.6 Facilities

Please refer to Sections D15.2.1.7 and D15.2.12.8 for information on facilities.

 

D15.2.13.7 Process and Procedures

KDDISOL maintains a four-tiered data storage architecture for production data that includes the following:

 

1.      Primary on-line data and Critical Data Archive (CDA)

2.      Periodic disk copies for quickly restoring production data and read-only and batch archives

3.      On-site tape backup and archive

4.      Off-site tape backup and archive

 

The primary on-line data is dynamic data that is created and maintained on a real-time basis as the Registry performs normal business operations. The rate of change ranges from hundreds of times a second to periodic ad-hoc changes. Full-copy disk mirroring protects most primary tier-1 on-line data. Critical Data Archive (CDA) is also a process for storing tier-1 data, but represents data that has been moved off of the production OLTP database for capacity reasons. Tier-2 data is less critical because it is copied periodically from the production systems.

 

Periodic, or tier-2, disk copies are kept for several purposes. First, they serve as the backup for tier-1 data. Second, they provide the ability to execute read-only instructions and batch activities without impacting performance on the main production OLTP database.

 

On-site, or tier-3, backups and archives are stored in automated tape libraries.  These tapes contain not only backups of data, but system configurations as well.  Retention periods vary based on the nature and criticality of the data.

 

Off-site, or tier-4, tape backups and archives are copies of a subset of the on-site backups.  There is nothing off-site that does not also exist on-site. Critical backups (for disaster recovery) and long retention archives are stored off-site.
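The rule that nothing exists off-site without an on-site copy is a subset relation between tier-4 and tier-3 holdings, which can be checked mechanically. The sketch below assumes each backup can be identified by a string label; the function name and the tape identifiers are hypothetical.

```python
from typing import Set


def offsite_only_items(onsite: Set[str], offsite: Set[str]) -> Set[str]:
    """Return backup identifiers held off-site but missing on-site.

    Under the tiering rule described above, this set should be empty.
    """
    return offsite - onsite


onsite_tapes = {"db-full-mon", "db-full-tue", "sys-config-mon"}  # tier 3 (illustrative IDs)
offsite_tapes = {"db-full-mon", "sys-config-mon"}                # tier 4 (illustrative IDs)

missing = offsite_only_items(onsite_tapes, offsite_tapes)
assert not missing, f"off-site backups without an on-site copy: {sorted(missing)}"
```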

 

D15.2.13.8 Documentation

KDDISOL thoroughly documents the following items:

 

·        Backup and Archive Policies

·        Technical Operations Plan

 

D15.2.14 Technical and Other Support

Support for registrars and for Internet users and registrants. Describe technical help systems, personnel accessibility, web-based, telephone and other support, support services to be offered, time availability of support, and language-availability of support.

 

D15.2.14.1 Customer Service

Customer Services provides 24-hour technical support via telephone and e-mail.  One-on-one support includes both general information and problem resolution. CSRs have their own Web-based tool (CSR Tool) for querying and modifying the database. This tool gives the CSRs the ability to query registration information at the request of the contacting registrar. CSRs with appropriate access levels can modify the registration information to correct errors made by the registrars. If a problem occurs that is beyond the scope of the CSRs to rectify, a well-defined escalation process is followed to alert appropriate Operations and Engineering personnel.

KDDISOL intends to contract with a translation service to provide real-time translation for over 155 languages. When Customer Service receives a call from a contact who speaks neither Japanese nor English, the translation service will be conferenced in so that the problem or issue can be addressed immediately.

 

D15.2.14.2 Registry Command Center

The NOC provides 24x7x365 global systems monitoring and support. Automated systems monitoring tools and technology (see System Outage Prevention) continually assess the health and well-being of servers, networks, and applications. This often enables the Command Center to detect and address anomalies before they result in service outages. Strong problem management and escalation procedures ensure that issues are identified, escalated, and quickly resolved.

 

D15.2.14.3 Registry Technical Operations

The KDDISOL Technical Operations staff provides 24x7x365 on-site or on-call support for all production systems operated by KDDISOL. This includes the following operational systems management disciplines:

·        Performance & Capacity Planning

·        Data Center Planning & Management

·        Deployment Planning & Execution

·        Data & Systems Backup, Restore & Archive

·        Business Continuity & Disaster Recovery

·        Problem & Change Management

·        Asset & Configuration Management

·        Metrics Collection & Reporting

The Technical Operations staff is continuously on-site or on-call to address urgent problems and/or service degradation. Routine inquiries and requests (such as reports, metrics, etc.) are handled during standard business hours.

 

D15.2.14.4 Remote TLD Site Technical Support

Contractual arrangements for technical support are in place at each remote TLD site. This support includes 24x7x365 “smart hands” support from staff employed at the site as well as quick response by vendor field engineers.

 

D15.2.14.5 Tools

The Registry provides web-based tools that are used by both the registrars and Registry Customer Support Representatives.  Registrars can use the Registrar Tool to access domain name and name server status and availability information, update registrar information, and generate Registrar Daily Transaction and Weekly Snapshot Reports.  The CSR Tool provides the ability to add, delete, or modify domain name and name server information.

 

Registrar Tool

The Registrar Tool site provides the Registrar with access to registrar specific information about transactions with the Registry. It is accessed through the Registry web site and uses SSL as supported by version 4.0 and above of Netscape, Microsoft Internet Explorer and AOL browsers for securing the connection. The registrars can perform the following tasks with the tool:

 

CSR Tool

The CSR version of the tool provides all the above functionality, but has additional capabilities to allow the CSRs to access the database and make changes directly to the domain name and name server records. This real-time capability provides superior service by enabling the CSRs to address and resolve issues immediately. The following functions can be performed by CSRs with the CSR Tool:

·        Query, add, update, delete, transfer, renew, and purge a domain on behalf of the registrar

·        Query, add, update, and delete a name server on behalf of the registrar

·        Delete domain Credit

·        Query, add, and update a Registry user

·        Update a registrar’s credit

·        Produce various reports

·        Administer a registrar’s account. This includes querying, adding, and updating registrar information, as well as querying, adding, and updating registrar contact information.

 

The CSR tool will not allow CSRs to register new domain names on behalf of a registrar. Registrars must enter this information themselves.

To further empower the registrars, the Registrar Tool will be enhanced in the near future to provide all the functionality in the CSR Tool, except for the ability to add domain names.

 

Figure D15.2-5 Customer Support Process Diagram

 

D15.2.14.6 Personnel Accessibility

The Registry will have multiple layers of personnel dedicated to ensuring the uninterrupted operation of the SRS, TLD, and other systems, and to providing registrar support around the clock. Pre-established escalation procedures ensure that the appropriate person can be contacted at all times to deal quickly and effectively with any issues that may arise. Phone and email support are both used at various points in the escalation process.

 

Table D15-2.4 Personnel Accessibility

Resource                            Time of Availability                Contact

Customer Service Representatives    24x7x365                            Phone, email

Technical Operations                24x7x365                            Phone, email

Engineering                         8x5 plus 24x7x365 emergency call    Phone, email

Management                          8x5 plus 24x7x365 emergency call    Phone, email
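The availability rules in the table can be expressed as a small lookup, which is the kind of check an escalation procedure performs before paging someone. This is a sketch under stated assumptions: the table does not define the 8x5 window, so Monday to Friday, 09:00 to 17:00 local time is assumed here, and the function name `reachable` is hypothetical.

```python
from datetime import datetime

ALWAYS_AVAILABLE = {"Customer Service Representatives", "Technical Operations"}
# Reachable 8x5, plus emergency call at any time.
BUSINESS_HOURS_ONLY = {"Engineering", "Management"}


def reachable(resource: str, when: datetime, emergency: bool = False) -> bool:
    """Return True if the resource can be contacted at the given time."""
    if resource in ALWAYS_AVAILABLE:
        return True
    if resource in BUSINESS_HOURS_ONLY:
        # Assumed 8x5 window: Monday-Friday, 09:00-17:00 local time.
        in_hours = when.weekday() < 5 and 9 <= when.hour < 17
        return in_hours or emergency
    raise ValueError(f"unknown resource: {resource}")


# 03:00 on a Monday: Engineering is reachable only via emergency call.
monday_night = datetime(2000, 10, 2, 3, 0)
print(reachable("Engineering", monday_night))
print(reachable("Engineering", monday_night, emergency=True))
```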

 

D15.2.14.7 Operations Testing and Evaluation Support (OT&E)

The OT&E environment will provide a protected environment in which to validate the operability of prospective registrars. It will replicate the production software environment, separate from all production data and operations, and allow for debugging of interoperability issues. It also will be an ongoing test area for evaluating future system upgrades.

The OT&E process will ensure that a registrar’s system is compatible with the Registry.  Participation in the process involves the following steps:

1.      Registrar requests OT&E activation

2.      Registrar tests their registration system in the OT&E environment

3.      Registrar requests formal evaluation time during which they must demonstrate fully operational, well-behaved registration system

4.      Registry evaluates results of the formal evaluation and either confirms successful completion or returns failure results; if failed, registrar fixes problems and returns to step 2

5.      Registrar passes OT&E and is activated in the production environment.
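The OT&E steps listed above form a small state machine with one loop: a failed evaluation sends the registrar back to testing. The sketch below models that flow; the names `OTEState` and `next_state` are hypothetical and chosen only for illustration.

```python
from enum import Enum, auto


class OTEState(Enum):
    REQUESTED = auto()    # step 1: registrar requests OT&E activation
    TESTING = auto()      # step 2: registrar tests in the OT&E environment
    EVALUATION = auto()   # step 3: formal evaluation is under way
    ACTIVATED = auto()    # step 5: registrar is activated in production


def next_state(state: OTEState, evaluation_passed: bool = False) -> OTEState:
    """Advance one step; a failed evaluation returns the registrar to testing."""
    if state is OTEState.REQUESTED:
        return OTEState.TESTING
    if state is OTEState.TESTING:
        return OTEState.EVALUATION
    if state is OTEState.EVALUATION:
        # Step 4: the Registry evaluates the results of the formal evaluation.
        return OTEState.ACTIVATED if evaluation_passed else OTEState.TESTING
    return state  # ACTIVATED is terminal


# A registrar that fails the first evaluation and passes the second.
state = OTEState.REQUESTED
for passed in (False, False, False, False, True):
    state = next_state(state, evaluation_passed=passed)
print(state.name)
```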

The OT&E environment will have an RRP gateway outside a firewall. All other activities will be directed through the Registry Application and Database servers with other equipment added as needed. Initial capability will be hosted on multi-processor UNIX servers. 

 

D15.2.14.8 Non-Technical Registrar Support

D15.2.14.8.1 Account Management

Account Management will be responsible for maintaining and nurturing the relationship between the Registry and the Registrars (our clients). This team will be dedicated to constantly interfacing with the registrars and providing feedback to the Registry regarding the level and quality of service. As often as possible, the Account Managers will meet face-to-face with the registrars to discuss the relationship and explore ways to improve it. 

 

D15.2.14.8.2 Customer Affairs Office

The Customer Affairs staff will be responsible for the contractual relationship with the registrars, and for support during the ramp-up process. They will also be responsible for interpreting and ensuring compliance with ICANN guidelines, and for communicating this information both internally and to the registrars.

 

D15.3 Subcontractors

If you intend to subcontract any of the following:

 

·        all of the registry operation function;

·        any portion of the registry function accounting for 10% or more of overall costs of the registry function; or

·        any portion of any of the following parts of the registry function accounting for 25% or more of overall costs of the part: database operation, zone file generation, zone file distribution and publication, billing and collection, data escrow and backup, and Whois service

 

please (a) identify the subcontractor; (b) state the scope and terms of the subcontract; and (c) attach a comprehensive technical proposal from the subcontractor that describes its technical plans and capabilities in a manner similar to that of the Technical Capabilities and Plan section of the Registry Operator's Proposal. In addition, subcontractor proposals should include full information on the subcontractor's technical, financial, and management capabilities and resources.

 

KDDISOL will subcontract most of the registry functions to Network Solutions, Inc. (NSI) during the first of two phases currently planned for the implementation of its new gTLD.

During the initial period of KDDISOL’s TLD registry administration, many of the basic responsibilities will be handled by our subcontractor, Network Solutions, Inc. (NSI) of Herndon, Virginia. NSI is the acknowledged world leader in registry services, with sufficient financial and technical resources (see attached 10K) to accommodate our requirements. The duration of this phase is anticipated to be approximately one year. NSI’s responsibilities will include designing systems and software sufficient for KDDISOL to operate and manage a world-class registry. The basic operational relationship envisioned between KDDISOL and NSI is designed to diminish over time. By the end of Phase 1, KDDISOL will be fully capable of operating its registry, and by the end of Phase 2, the necessity for NSI’s direct operational participation will have been eliminated.

We understand that ICANN has had a long and close relationship with NSI. In this sense, supporting documentation of NSI’s Registry capabilities seems relatively unnecessary. However, should you require additional information about NSI or about NSI’s relationship with KDDISOL, aside from what is included herein, please advise.

Please see attached Onsite Registry Service Proposal.


 

    


 

 

 _______________________________

 Signature

 

 Tohru Asami____________________

 Name (please print)

 

 _______________________________

 Title

 

 _______________________________

 Name of Registry Operator

 

 _______________________________

 Date