C17.13. System reliability. Define, analyze, and quantify quality of service.

Registry system reliability and quality of service can be aggregated into three basic categories:

  • Reliability and quality of the database and surrounding components that comprise the provisioning function
  • Reliability and quality of the nameservers and surrounding components that comprise the resolution function
  • Quality and integrity of the zone files

For the .org registry database, quality of service is measured in terms of the following three elements:

  • Availability of the database (system up-time)
  • Response time for database transactions
  • Equivalent access

Figure C17.13-1: .Org Registry Database Availability

The UIA Team has a history of reliable registry database operations. As Figure C17.13-1 shows, the current .org database has operated within its contractual SLAs throughout 2001 and into 2002. In 2001, its combined availability (counting both planned and unplanned outages) was 99.63%; counting unplanned outages only, its availability was 99.99%. Thus far in 2002, its combined availability is 99.5%, and its availability counting unplanned outages only is 99.995%.
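To put these percentages in more tangible terms, the following minimal sketch converts an availability figure into the hours of outage it implies. It assumes the 2001 figures were measured over the full calendar year (8,760 hours); that measurement window and the function name downtime_hours are illustrative assumptions, not something stated in this proposal.

```python
# Sketch: convert an availability percentage into implied outage hours.
# Assumption: availability is measured over a full, non-leap calendar year.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Return the hours of outage implied by an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1.0 - availability_pct / 100.0) * period_hours


# 2001 figures quoted in the text above.
combined_2001 = downtime_hours(99.63)   # planned + unplanned outages
unplanned_2001 = downtime_hours(99.99)  # unplanned outages only

print(f"2001 combined outage time:  {combined_2001:.1f} hours")
print(f"2001 unplanned outage time: {unplanned_2001:.1f} hours")
```

Under that assumption, 99.63% combined availability corresponds to roughly 32.4 hours of total outage in 2001, of which less than one hour (99.99%) was unplanned.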

As Figure C17.13-2 shows, response times for the current .org registry database are well within established SLAs and are continually improving.

Figure C17.13-2: Historic .Org Registry Response Times

Equivalent access for the .org registry will be managed by a QoS device that acts as a front end to the SRS. This device will manage two critical aspects of registrar access to the .org database. The first is the number of SSL connections: the QoS device will ensure that each registrar is permitted the same number of SSL connections to the .org database. Equal connection counts alone are not sufficient, however, so the QoS device will also ensure that each registrar receives an equivalent amount of network bandwidth. This way, no registrar can use its SSL connections to "hog" more than its fair share of transaction bandwidth. This subject is discussed further in Section C17.3.
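The two controls described above can be pictured with a small sketch. This is not the QoS device's actual configuration or logic, which this section does not describe; the function names, the registrar identifiers, and the pool sizes are all hypothetical, and splitting a fixed pool evenly is only one way such equal limits might be derived.

```python
# Sketch of equal per-registrar limits (hypothetical names and numbers).


def equal_shares(registrars, total_connections, total_bandwidth_kbps):
    """Give every registrar the same connection cap and bandwidth cap."""
    if not registrars:
        raise ValueError("at least one registrar is required")
    n = len(registrars)
    per_registrar = {
        "max_connections": total_connections // n,  # whole connections only
        "bandwidth_kbps": total_bandwidth_kbps / n,
    }
    return {name: dict(per_registrar) for name in registrars}


def may_open_connection(limits, open_connections, registrar):
    """Admit a new SSL connection only while the registrar is under its cap."""
    in_use = open_connections.get(registrar, 0)
    return in_use < limits[registrar]["max_connections"]


# Hypothetical example: three registrars sharing 30 connections and 3 Mbit/s.
limits = equal_shares(["registrar-a", "registrar-b", "registrar-c"], 30, 3000.0)
open_now = {"registrar-a": 10, "registrar-b": 4}

print(limits["registrar-a"])
print(may_open_connection(limits, open_now, "registrar-a"))  # at its cap
print(may_open_connection(limits, open_now, "registrar-b"))  # under its cap
```

The connection cap alone would still let one registrar saturate the link over its permitted connections, which is why the sketch carries a separate bandwidth cap for each registrar.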

For the nameserver constellation, QoS is measured in terms of the following two elements:

  • Availability of each site (system up-time)
  • Capacity (queries per second), including queries answered and not answered

Although DNS is extremely tolerant of the loss of a single site, this holds only if the remaining sites have enough spare capacity to assume the load. Section C17.10 has more information about the capacity of the nameserver constellation currently serving (and proposed to continue serving) the .org TLD. As Figure C17.13-3 shows, the proposed DNS constellation ran at 99.84% availability in 2001 and has run at 99.99% availability thus far in 2002.
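The capacity condition above can be expressed as a simple check: after removing any one site, the remaining sites must still be able to answer the peak query load. The sketch below is illustrative only; the function name and the per-site capacities and peak load are hypothetical values, not figures from this proposal (see Section C17.10 for actual capacity information).

```python
# Sketch: can the constellation lose any single site and still carry peak load?
# All capacities and loads below are hypothetical, in queries per second.


def survives_single_site_loss(site_capacities_qps, peak_load_qps):
    """Return True if peak load fits on the remaining sites after any one loss."""
    if len(site_capacities_qps) < 2:
        return False  # with one site (or none), losing a site leaves nothing
    # The worst case is losing the highest-capacity site.
    remaining = sum(site_capacities_qps) - max(site_capacities_qps)
    return remaining >= peak_load_qps


hypothetical_sites = [5000, 5000, 8000, 8000]

print(survives_single_site_loss(hypothetical_sites, peak_load_qps=15000))
print(survives_single_site_loss(hypothetical_sites, peak_load_qps=20000))
```

Checking against the largest site is sufficient, because if the load fits after the worst-case loss it also fits after the loss of any smaller site.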

Figure C17.13-3: .Org Nameserver Constellation Availability

Finally, the primary quality of service metric, which encompasses both the database and the nameservers, is the integrity of the zone files. In other words, how many errors are in the zone files, and how long do those errors persist at the nameservers, where they degrade resolution on the Internet? There are many opportunities for failure in the process of zone file generation and distribution. With the current process generating and distributing a fresh zone file every 12 hours for 2.5 million domains, there are more than 1.8 billion opportunities for failure in a given year (2.5 million domains × 2 zone files per day × 365 days). As Figure C17.13-4 shows, in 2001 the .org zone file achieved a reliability rate of 99.99999995%. This represents one domain in error for one 12-hour period in 2001. Thus far in 2002, the .org zone file has achieved an integrity and quality rating of 100%.
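The arithmetic behind these figures can be reproduced directly from the numbers quoted above: 2.5 million domains, one zone file every 12 hours, and a single domain-level error in 2001. The sketch below treats each domain in each generated zone file as one opportunity for failure, which is an interpretation of the text; the function and constant names are illustrative.

```python
# Sketch: zone file integrity as (1 - errors / opportunities), using the
# figures quoted in the text. One opportunity = one domain in one zone file.
DOMAINS = 2_500_000
ZONE_FILES_PER_DAY = 2  # a fresh zone file every 12 hours
DAYS_PER_YEAR = 365


def opportunities_per_year(domains=DOMAINS,
                           files_per_day=ZONE_FILES_PER_DAY,
                           days=DAYS_PER_YEAR):
    """Count domain-level opportunities for failure in one year."""
    return domains * files_per_day * days


def integrity_pct(errors, opportunities):
    """Percentage of opportunities that were error-free."""
    return (1.0 - errors / opportunities) * 100.0


total = opportunities_per_year()
rate_2001 = integrity_pct(errors=1, opportunities=total)

print(f"Opportunities per year: {total:,}")
print(f"2001 integrity rate:    {rate_2001:.8f}%")
```

This gives 1,825,000,000 opportunities per year, and one error among them yields 99.99999995% when rounded to eight decimal places, matching the rate reported for 2001.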

Figure C17.13-4: .Org Zone File Quality

 
