ICANN Meetings in São Paulo, Brazil
Captioning SSAC Open Meeting
6 December 2006
Note: The following is the output of the real-time captioning taken during the SSAC Open Meeting held on 6 December 2006 in São Paulo, Brazil. Although the captioning output is largely accurate, in some cases it is incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid to understanding the proceedings at the session, but should not be treated as an authoritative record.
>>STEVE CROCKER: Thank you for your patience and welcome to SSAC public meeting. I'm Steve Crocker, chair of the Security and Stability Advisory Committee. I'll introduce the other people up here in just a second.
The general format and purpose of this meeting is to share the results of what the Security and Stability Advisory Committee has been working on, what it plans to work on, and to solicit interactive feedback.
We're being Webcast and, I don't know, but I hope that it's possible for people to interact with us somehow, although I confess that I haven't looked at the mechanism.
One word in advance. We have a team of scribes, Chuck and Jennifer, in back, and we're requested to speak as clearly as possible and to identify ourselves when we speak, so that they can get this down properly.
The scribes have, over the long period of time that I've been dealing with ICANN, done an absolutely extraordinary job, and it's a great service.
To my immediate left, Lyman Chapin, who is chairing a committee whose name I have yet to figure out the full name of, but it's RSTEP, and you're going to explain all that.
>>LYMAN CHAPIN: It will be on a slide.
>>STEVE CROCKER: It will be on a slide. Good.
Next to him is David Piscitello, the ICANN fellow. Suzanne Woolf, who is a member of the -- a member of SSAC and is also an integral part of the Root-Server System Advisory Committee and is parallel with me as a liaison from that committee to the board, so she and I both serve on the ICANN board. And Russ Mundy, member of SSAC and this year's offering -- I'm sorry, committee member on the ICANN NomCom, which is an onerous and expensive task, in the sense of the amount of time that has to be committed, and I'm very grateful that he's agreed to take this on.
Are there any other members of SSAC immediately here? David Conrad is trying to hide in the background over there, general manager of IANA and a permanent guest, I think, of the -- of our -- and sometime beneficiary of our deliberations.
I'm going to take just a second and talk about what we're going to do today and what we are, and then quickly turn the floor over to -- for presentations, and then come back at the end with question-and-answer, a little bit of insight into what we're thinking about doing in the future, and I think that will complete it here.
We have a nice chunk of time. We'll try to move at a good clip, and stay within the time. In the unlikely event that we finish early, I'll consider that to be a bonus for all of us, and I'm sure we'll all find things to do.
Dave Piscitello has a presentation on investigations into some privacy aspects related to the WHOIS database. Suzanne Woolf will present the work that has been a joint effort with RSSAC and SSAC on what might seem a very arcane and specific issue about IPv6 addresses in the root zone for the root servers as opposed to IPv6 addresses in the root zone for TLD name servers. And some of the details about why that's interesting, why that's hard, and so forth, will emerge.
Lyman, as I said, will talk about RSTEP activities, and then I'll come back.
I'm going to permute the order that's shown here and ask Lyman to lead off and then continue with Dave and then with Suzanne.
A word about SSAC. We're a volunteer committee of experts. At least we think we're expert. All of the people on the committee are heavily involved in one aspect or another of the domain name or addressing business. Some are from registries, some are security experts, some are operators of one sort or another, or from the address community or, in principle, the registrar committee, although we lost the person who was there.
And we try to have a relatively well-balanced mix of people in all of the desirable dimensions: Skill set, some attention to geography, which is desirable but it's not our first driver. The other critical thing about our structure is that we're an advisory committee and we take that quite seriously. It is quite empowering. It allows us to say whatever we want and leave the responsibility for choosing whether to follow what we say to others. That gives us flexibility and a degree of comfort and safety that we can focus on the issues, sort of no matter how controversial they are, at least in principle, and not have to also weigh the equities of whether everybody's been heard from. I mean, there's another layer or two that is a check-and-balance on our activities, and that is quite good from where we're positioned.
Another aspect is that we're structured as a committee chartered by the ICANN board and one might think that, therefore, the interactions are between the board and us, that they task us with something and that we respond to them.
There is certainly that kind of interaction that takes place, but our position is considerably broader. We are frequently -- take on -- in fact, I would say in the majority of times, choose the topics of importance ourselves. And we speak not only to the board but to other parts of ICANN, to the various constituencies, supporting organizations and so forth. And equally, publicly to the broad Internet community.
We do have the capability of holding our deliberations in confidence when that's appropriate, but we try, as much as possible, to be quite open about what we're doing and make our advice and our thoughts visible.
We have a number of publications that are on the Web site for people who have been following us. Probably the biggest and most visible event was the -- in 2003 and 2004 on the Site Finder activity. Domain name hijacking, another high-visibility topic. Denial of service attacks we featured in -- at the Wellington meeting in -- I guess it was the spring, right? And so forth. And a whole variety of topics. And I'll come back to this in a bit but let me now turn attention -- turn the floor over to Lyman. And in addition to turning the floor over, I have to turn this dongle.
>>LYMAN CHAPIN: More than the floor.
>>STEVE CROCKER: Yeah.
>>LYMAN CHAPIN: Fantastic. Could we look at it in blue, though? It's a good thing we haven't relied on color for anything in here.
Well, thank you, Steve. As Steve has said, I'm going to give you an update on the -- the new RSTEP, which is the Registry Services Technical Evaluation Panel, which is a mouthful.
I will apologize in advance. I've put the -- the URLs for a number of Web sites up here. They all tend to be fairly long and I'd be glad to just make a copy of this presentation available to anybody to save them having to sit there with a pencil trying to write any of this down.
My name is Lyman Chapin. I have an address to which comments and any other useful or non-useful input can be sent: Rstepchair@rsteppanel.org.
A little bit about RSTEP. It's been around long enough now that there's fairly good information available on the ICANN Web site, so I won't go into great detail about this.
In a very brief summary, the mission or role of this panel is, as written up there: To evaluate specific proposals for new gTLD registry services. And evaluate them with respect to potential impact on security and stability. And there are a few important words in that brief statement.
One is "specific." The panel only deals with proposals that arrive at ICANN from registry operators and for which ICANN thinks there's a reason to believe -- a good reason to believe -- that there might be an impact on security and stability. So the panel does not review all registry service proposals. Only those that the ICANN staff feel that might have an impact on security and stability.
The other thing is "security" and "stability." There actually are specific definitions of those two terms that are contained within the enabling documentation that creates RSTEP. So it's not -- not nearly the same broad mandate that the SSAC, for example, has, to look over the horizon and to all -- you know, to right and to left and so forth to try to be out in front of new and potential threats to security and stability.
It's very narrowly focused on specific questions having to do with specific proposals, and within the specific context of "security" and "stability" as they're very carefully defined in the GNSO policy that created the RSTEP.
It was established by ICANN specifically to implement this GNSO consensus policy. GNSO finished their work back on the 30th of June, and ICANN actually launched this RSTEP process just this past July and August.
There are four steps to the process and the first step: A registry operator that wants to introduce a new service uses a tool that ICANN has prepared, an on-line tool, that essentially is a Web-based form that enables them to submit a proposal.
In the second step, the ICANN staff posts that proposal for public comment. There's a public comment period. And they also make a determination, what's called a predetermination, of whether or not a RSTEP review of the proposal is going to be necessary. Not all proposals, obviously, involve the kinds of questions that would trigger a technical review by RSTEP.
If the staff decides that the proposal does, in fact, require a review, then a review team selected from the panel -- the panel consists of about 24 or 25 people, and the review teams are 5-person teams. The review team that evaluates the proposal then reports to the board and its report is posted for public comment. And then the board, taking into account the public comments, the RSTEP report, the ICANN staff report and summary, and any other information that may have been made available to it, either from the community or from the registry operator, makes a decision -- a final decision on whether or not to approve the change to the registry agreement that would be necessary to allow the registry to implement the new service proposal.
So it's important to recognize that the RSTEP is, by no means, the last step in the process. The ultimate decision on whether or not to approve or not approve a proposal from a registry has to be made by the board, and they can, of course, take into account a great many issues other than the technical issues with which the RSTEP is concerned.
Since we started in mid-August of this year, we've had two registry service proposals that have been referred to the RSTEP and have been evaluated by RSTEP review teams. The first is a proposal from Tralliance for a wildcard that they've -- a wildcard service that they've called search dot travel and that was -- the review team for that started its work on the 18th of September, and the second is a proposal from GNR to make use of what were initially reserved two-character strings at the second level of dot name, and that was started on the 20th of October.
And I've noted at the bottom something that I've already said, which is that there are other proposals that have been made by registry operators during the same period that were not referred to the RSTEP, and ICANN maintains a page at which they keep track of all submitted applications, not just those that are undergoing technical review, but those that were -- that ICANN was able to dispose of on its own.
To take the first one, Tralliance, which is the organization that operates the dot travel TLD had proposed to add a wildcard to the dot travel zone. The review team consists of the five people that you see up here on the slide. The team's report was posted for public comment on the 9th of November, and a board resolution rejecting the Tralliance proposal was made and adopted on the 22nd of November. So that process, the process for this review, has been complete all the way to the end of those four steps and I'll go into a little bit of detail on this, and, you know, people should feel free to ask questions, either after I'm done or at Steve's discretion, wait until the end.
What Tralliance proposed to do was really the service they proposed had two parts. The first was to add a wildcard resource record -- and those are defined in RFC-1034 -- to the dot travel zone, and the effect of that would be to ensure that any DNS query for a name string that was not registered -- in other words, for which there were no resource records already present in the zone -- would be -- would receive, instead of the otherwise -- what would otherwise be the "does not exist" or "nxdomain" response, it would receive a synthesized response containing the IP address of a particular server, the search.travel server that Tralliance would set up and maintain. So the objective, obviously, is to catch any queries that might represent mistyped domain names, for example, in a browser window, and instead of simply returning "not found," return a response which is a valid IP address. So the response looks just like a valid response to any DNS query. But it contains the IP address of this search.travel server, not the IP address of the host that was the subject of the query.
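[Editorial illustration] The difference between the two behaviors described above can be sketched in a few lines of Python. This is a simplified model, not the Tralliance design: the zone contents, the addresses (taken from the 192.0.2.0/24 documentation range), and the `resolve` function are all hypothetical, and real wildcard matching in the DNS has more rules than this shows.

```python
from typing import Optional

# Hypothetical zone contents, for illustration only.
ZONE = {
    "hotels.travel": "192.0.2.10",
    "flights.travel": "192.0.2.20",
}

# Stand-in for the address of the search.travel server (assumed value).
WILDCARD_TARGET = "192.0.2.99"


def resolve(name: str, wildcard: bool) -> Optional[str]:
    """Return an address for `name`, or None to represent NXDOMAIN."""
    name = name.lower().rstrip(".")
    if name in ZONE:
        # Registered names resolve normally in both cases.
        return ZONE[name]
    if wildcard and name.endswith(".travel"):
        # With a wildcard, unregistered names get a synthesized answer.
        return WILDCARD_TARGET
    # Without a wildcard, the client learns that the name does not exist.
    return None


print(resolve("hotles.travel", wildcard=False))
print(resolve("hotles.travel", wildcard=True))
```

The mistyped name yields "no such name" in the first call and an ordinary-looking address in the second, which is the change in behavior the review team examined.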
And the second part of the service would be to then deploy what Tralliance called a landing page at that search.travel site which would do two things.
First of all, in typical fashion, it would advertise the availability of the string for registration as a domain name within the dot travel zone, and it would also provide a search box that would be preloaded with that -- with that string, and which, if used -- in other words, if you used the box -- would then return a series of search results with dot travel results being ranked higher than results from other TLDs.
So it's a fairly classic example of the sort of thing that you've run into browsing the Web, where if you type in something that isn't recognized, you frequently get not a "does not exist" response, but a page back that gives you the option to search for various terms and so forth. And there are lots of different ways to implement that and the wildcard is one of them.
The review team's assessment was that the service proposed by Tralliance, the wildcard service, would, in fact, have a meaningfully adverse effect on security and stability, and the report is fairly lengthy and goes into considerable detail on this. I'll just give some -- some -- some summary overview of what the review team found.
First of all, they found both security and stability risks and identified those and went into some great -- some considerable detail to explain them.
It was -- it also noted -- the review team also noted that the wildcard would do two things. First of all, it would disable a technique that quite a few applications use to detect the -- detect erroneous or misleading input, and obviously if the response to a query is always something that looks like a valid response -- in other words, it's always an IP address and not an indication that the name, in fact, is not registered -- then you would make it extremely difficult, if not impossible, to detect the fact that the name was not registered and to then possibly recognize that the user had typed in something wrong or there had been some other input error.
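[Editorial illustration] The detection technique mentioned above can be sketched from the application side. This is a minimal sketch under stated assumptions: `check_hostname`, `plain_zone`, and `wildcard_zone` are hypothetical names, the resolver is passed in as a plain function so the example runs without network access, and the addresses are from a documentation range.

```python
from typing import Callable, Optional

# A resolver here is any function mapping a name to an address,
# with None standing in for a "does not exist" answer.
Resolver = Callable[[str], Optional[str]]


def check_hostname(name: str, resolver: Resolver) -> str:
    """Use the 'does not exist' answer to catch bad input early."""
    address = resolver(name)
    if address is None:
        return f"'{name}' does not exist; check the spelling"
    return f"connecting to {address}"


def plain_zone(name: str) -> Optional[str]:
    # Hypothetical zone without a wildcard.
    return {"hotels.travel": "192.0.2.10"}.get(name)


def wildcard_zone(name: str) -> Optional[str]:
    # Same zone with a wildcard: every unregistered name gets an answer.
    return plain_zone(name) or "192.0.2.99"


print(check_hostname("hotles.travel", plain_zone))
print(check_hostname("hotles.travel", wildcard_zone))
```

With the wildcard in place the application never sees the error branch, so it proceeds as if the mistyped name were valid, whatever protocol it speaks.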
And the other important point, in addition to the specific security and stability risks was the fact that the effects of implementing a wildcard are felt by all current and future applications and protocols that rely on the DNS.
There's no way, given the current state of DNS and Internet technology and practice, to send a query to a DNS server and say, "I'm only -- I am a Web browser and I'm only making this query on behalf of someone who is browsing the Web."
You can ask -- obviously you can ask for specific resource record types, but you can't restrict a query to a DNS server to simple HTTP Web traffic and that, of course, is the context in which Tralliance was trying to offer this as a service, to help people who were browsing the Web and to give them something a little more meaningful than just a "does not exist" response when they typed in a name that hadn't been registered. The specific impacts on security and stability that are identified in the report are listed up here. I won't read through all of these because for the most part, they are simply shorthand for pages of explanatory material that are in the report, but it's sufficient for the purposes here to say that in each case, the review team was careful to observe the specific characterization of what constitutes security and what constitutes stability that's in the -- that was in the GNSO consensus policy document, and also to observe very carefully the fact that these concerns, if they are, in fact, used as the basis for the conclusion in this case that there was an adverse impact, must be concerns that affect systems, applications, protocols, and other kinds of systems, that are operating in conformance to and compliance with applicable Internet standards.
So these are things that are not issues just for misconfigured systems or systems that are running software that doesn't comply with Internet standards. They're issues with systems that are properly configured and operating correctly with respect to the applicable standards. Which is an important point relative to the background of how we got to this point of doing this kind of formal review, because the criteria that are used to evaluate these sorts of things have to be well-established enough and deterministic enough that registry operators can have some expectation that there will be some regularity to the process and predictability to the process.
The second review, which has just been completed, is a proposal -- is the proposal from Global Name Registry to make use of two-character strings at the second level of the dot name TLD. Again, I've listed the five members of this review team. They posted their report just a matter of hours ago, earlier today, this morning, and the extremely lengthy URL for the report is given here.
So in this case, since the report has only just been posted for public comment, the board's review and decision on this matter is still pending.
What GNR proposed to do was to allow the two-character strings at the second level, which had been reserved in all -- in all of the TLDs since the agreement reached in 2001 by ICANN -- to enable the registration of personal names that involve two-character family names. And I just picked two names out of the preregistration database for this ICANN meeting, so if Guanghao Li or Tony Ng happen to be in the room, I apologize for appropriating your name, but I couldn't think of a better source off the top of my head. And the idea here is not to delegate the two-character SLDs directly, as in allow someone to apply to register a two-character SLD, but to allow third-level registrations under those SLDs.
So the canonical form would be given name dot -- in this example -- li dot name. And the issue that arose with this and the reason that it was referred for a RSTEP review is the possibility of confusing those two-character -- what are intended, in this context, to be two-character surnames or family names, with two-character country codes that are used for ccTLD names. And in the examples I've given, dot li is the country code for Liechtenstein and dot ng is the country code for Nigeria.
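[Editorial illustration] The naming pattern and the overlap with country codes can be sketched as follows. The function names are hypothetical, and `CCTLD_CODES` is a small illustrative subset of two-letter country codes rather than a complete list.

```python
# Illustrative subset of two-letter country codes used as ccTLDs
# (li = Liechtenstein, ng = Nigeria, br = Brazil, co = Colombia).
CCTLD_CODES = {"li", "ng", "br", "co"}


def parse_name_registration(domain: str) -> tuple:
    """Split a third-level .name registration into (given, family)."""
    labels = domain.lower().rstrip(".").split(".")
    if len(labels) != 3 or labels[-1] != "name":
        raise ValueError(f"not of the form given.family.name: {domain!r}")
    given, family, _tld = labels
    return given, family


def family_label_matches_cctld(domain: str) -> bool:
    """True when the second-level label is also a two-letter country code."""
    _given, family = parse_name_registration(domain)
    return len(family) == 2 and family in CCTLD_CODES


print(parse_name_registration("tony.ng.name"))
print(family_label_matches_cctld("guanghao.li.name"))
```

The check only flags that a string coincides with a country code; whether that coincidence causes any technical problem is the separate question the review team was asked.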
So the question was whether or not there was a technical question, a technical issue with respect to the possibility of confusion.
The reason I belabor that a little bit is, of course, there may very well be policy reasons why you would be concerned about the visual confusion of a two-character name at the second level of dot name and the use of the same two-character string or visually the same two-character string as a ccTLD.
The issue that was referred to the RSTEP was not nearly as broad as that, it was simply are there technical problems with respect to security and stability that would arise if you allowed dot name to include two-character strings at the second level.
And the conclusion of the review team after an extremely exhaustive exploration of every possible way in which this might create a problem was that, in fact, it did not have any meaningful adverse effects on security and stability. And the three principal reasons for that conclusion are listed here. First of all, as I am sure all of you know, there are two-character names at the second level of almost every TLD already. No new two-character strings have been registered since the agreement in 2001 to put a moratorium on that.
But the operation of TLDs that already have two-character domain names registered at the second level suggests there are very few, if any, operational issues that have been reported.
We also -- the review team also did a thorough analysis of a data set that was provided by the registry operator for a ccTLD of queries that -- well, a ccTLD, sorry, that does allow two-character SLDs. An analysis of those queries shows that the proportion of erroneous queries that end in .TLD.TLD, for example, co.co, is very small. This is important because one of the failure modes that was identified in bringing this to the RSTEP in the first place is one that is discussed at fairly great length in RFC 1535, and that's the possibility that DNS clients using resolvers that perform a particular algorithm for adding or appending strings to the domain name that's typed in by a user or user interface might result in the resolver querying first for a name that might be, say, name.co.co before it ever queried for name.co if the user had typed in, for example, name.co.
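[Editorial illustration] The failure mode described above comes down to the order in which a resolver tries candidate names. The sketch below is a simplified model, not the code of any real resolver: the `candidates` function and the one-entry search list `["co"]` are assumptions chosen to reproduce the name.co.co example.

```python
def candidates(name: str, search_list: list, try_as_is_first: bool) -> list:
    """Return the names a stub resolver would query, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no searching.
        return [name.rstrip(".")]
    appended = [f"{name}.{suffix}" for suffix in search_list]
    if try_as_is_first and "." in name:
        # Later behavior: a name containing a dot is tried as typed first.
        return [name] + appended
    # Older behavior: search-list suffixes are appended before the name
    # is tried as typed.
    return appended + [name]


# Hypothetical client whose search list contains "co".
print(candidates("name.co", ["co"], try_as_is_first=False))
print(candidates("name.co", ["co"], try_as_is_first=True))
```

In the older ordering the query for name.co.co goes out first, so if such a name existed it would answer in place of the name the user typed.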
This is obviously something, you can probably tell from the RFC number, RFC 1535, is something that dates back to about 1993, and is not a problem that can occur with modern versions of DNS systems.
So in particular, the current version of -- the last version of BIND that exhibited this behavior was 4.9.2 and all subsequent versions of BIND have not exhibited this behavior. The review team conscientiously examined this, analyzed the data and determined that, in fact, out in the field, in practice in the real world, the number of these that occur is actually very small.
The review team also conducted an independent honeypot experiment where they registered a number of domain names and set them up to encourage the erroneous behavior they were looking for and they floated it out there for a while to see how many hits they would get. This is obviously not a conclusive experiment. To design a conclusive experiment would obviously involve a lot more data points.
But the data gathered at least from this experiment suggests the number of misdirected queries represents -- I should have put something even more forceful than microscopic in there because it is a really, really microscopic fraction of overall traffic, which suggests that the problem is not very widespread, although it certainly does exist.
And at the moment, we have no further registry service proposals that are on our plate, so the panel when it is not actively investigating a proposal is essentially inactive.
We do not have the kinds of ongoing activities that the SSAC does. We are simply, essentially, a set of potential review teams in waiting and so if, for instance, between now and the next ICANN meeting in Lisbon, there should be no review team activity. In other words, nothing referred to us from ICANN. Presumably there will be no need to give another report because nothing will have happened. We have no activities in between reviews. Thank you, Steve.
>>STEVE CROCKER: Thank you very much, Lyman. I have a question or two but I will hold it until the end, unless there is something urgent, I want to try to move forward here.
Dave, you're up next.
One of the qualities about SSAC is that the range of activities, although we're -- the scope is security and stability, actually covers a considerable breadth from user-level issues to the internal stability -- the engineering stability -- of some of the architectures.
This is more along the first part.
I'm in the picture. Excuse me.
>>LYMAN CHAPIN: It is neat to have your head in shadow.
>>STEVE CROCKER: Yeah, yeah, yeah. We have three Macs and getting three different results.
>>DAVE PISCITELLO: Three Macs and three different shapes.
>>LYMAN CHAPIN: While we are dealing with this, don't you love the fact that the Security and Stability Advisory Committee is all Macs except for the root servers who apparently feel comfortable relying on Windows.
>>STEVE CROCKER: That's probably a Linux machine, I would bet, or FreeBSD or something.
>>SUZANNE WOOLF: It has multiple personas and all of them can serve DNS queries.
>>STEVE CROCKER: While we are vamping here for a minute, tomorrow I will be chairing a DNSSEC session from 11:00 to 1:30, I think -- no, later even -- embedded within a day-long ccTLD, DDNS workshop which will be -- are we in Sao Paulo 1 or 3 here? I think it is in Sao Paulo 1. Yeah, so I think it is in this room actually tomorrow.
So logically DNSSEC is a core area of concern but it is a big enough topic that it runs on its own track so there will be a major amount of activity tomorrow, including starting off with a hijack demonstration and continuing with the details of the state of deployment so that's the ad for that.
Do you want to switch machines?
>>DAVE PISCITELLO: No. I will go back to what I was doing.
>>SUZANNE WOOLF: That's what we love about our fellow.
>>DAVE PISCITELLO: Sorry. For those of you who will get seasick, I apologize. Let's try to labor on, under the refresh rate mismatch that we are stuck with.
As somewhat of an introduction to what I am talking about here, in the hacking world, there are actually several phases of an anatomy of an attack. Generally speaking when you are going to attack what you need to do is acquire a target. And the first stage of acquiring a target is called the information-gathering phase.
It is sometimes called surveillance if it is the law enforcement agencies that are actually doing this as opposed to a hacker. The next phase is often called probing or doorknob rattling. Next phase is typically choosing the tools from the tool kit that match whatever doorknobs you happen to have discovered, had some rattling opportunities. And then you -- that typically in the electronic world corresponds to choosing a set of vulnerabilities that are applicable to the target. And then once you have chosen a set of vulnerabilities, you attempt to exploit them.
What we are talking about here is the very, very first stage of some sort of attack that one might attempt to execute by using domain name registration records. In particular, my objective in this analysis was to determine the extent to which I could extract personal contact information from the domain name registration records that I had acquired.
So immediately this begs the question of what is personal contact information, and if you Wikipedia or you Google or you search for the term "personal contact information," you will find dozens of legal regulatory and other definitions for "personal contact information."
For the purpose of my study, what I wanted to find were sufficient attributes to feel confident that the registrant is an individual or an individual operating a business from a residence. In other words, from his home or her home.
And that it is possible using the information collected to speak with or visit the individual at his or her residence, making personal contact. I want to assure you all that during the course of this entire study I never knocked on a door. I never rattled a doorknob. I never called any of the people that I dug up in the WHOIS database. And the information that I have and the names of the people that I culled are safely on a machine at home and I will dispose of them at the end of this study.
The methodology I used is, as I said, the information-gathering technique used by computer network attackers. As I mentioned already, the same methods and resources are used by law enforcement agencies.
Actually, in order to confirm this and in order to stimulate a little bit more of an idea of how I might go about this, I attended a conference in the July time frame of this year where -- it is called Techno Security 2006. And the majority of people who attend this conference are law enforcement agents and people who are involved in network forensics.
So when you are in the early stage of this methodology, you need to acquire a set of potential targets. My acquisition of targets was to filter 5,000 registration records from over 2 million that I acquired from bulk data and the filtering argument or the search argument through the database was Philadelphia, Pennsylvania. And I will explain to you why Philadelphia, Pennsylvania in a moment.
Once I had my initial targets, I used publicly accessible resources to collect bits and threads of data from not only the registrant administrative contact information but also examining at the time the technical contact information to see if it provided any additional information. And I will not only provide a summary but dig a little bit deeper into some of the methodology in a moment.
So this was a hands-on, eyeballing effort. This was not a build a Perl script, run a C program, do whatever you felt was necessary to go through the data in an automated manner and extract information based on some logic that you wrote into the program.
I felt it was necessary to actually do this because when I looked at the kinds of resources that I wanted to use to build that confidence that I mentioned at the outset, I really didn't think it was going to be possible to actually do some of this without some of the eyeballing. And I will explain that in a moment.
So what were the resources that I used? Well, obviously I began with domain name registration records acquired in bulk using WHOIS protocol. I actually purchased CDs from a company that has since ceased operation that offered me 20 million registration records on six CDs in zip files, as ASCII tab-delimited text.
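[Editorial illustration] The later review of individual records was done by hand, but the initial narrowing of bulk records by city could be sketched as below. This is only a sketch: the column names (`domain`, `registrant`, `city`, `state`), the sample rows, and the `filter_by_city` function are invented for illustration and do not reflect the layout of the data the speaker actually used.

```python
import csv
import io

# Invented sample in tab-delimited form; not real registration data.
SAMPLE = (
    "domain\tregistrant\tcity\tstate\n"
    "example-one.com\tJ. Doe\tPhiladelphia\tPA\n"
    "example-two.com\tExample Corp\tPittsburgh\tPA\n"
    "example-three.org\tA. Smith\tphiladelphia\tpa\n"
)


def filter_by_city(tsv_text: str, city: str, state: str) -> list:
    """Keep rows whose city and state match, ignoring case and padding."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wanted = (city.lower(), state.lower())
    return [
        row
        for row in reader
        if (row["city"].strip().lower(), row["state"].strip().lower()) == wanted
    ]


matches = filter_by_city(SAMPLE, "Philadelphia", "PA")
print([row["domain"] for row in matches])
```

Matching on normalized text is a design choice made here because hand-entered registration fields tend to vary in capitalization.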
I also used a research database that is used by real estate agents in the Philadelphia area called trulia.com. I used the Internet telephony phone directory, whitepages.com. Several search engines including Yahoo and Google. I used aerial photographs. I will explain what that was for. E-maps and the companies and industries directory from Hoovers, personal familiarity with the geographic region. I chose Philadelphia, Pennsylvania because I lived there for 30 years. I was very familiar with most of the neighborhoods that I ended up coming in contact with and those that I wasn't familiar with, I used Google Earth. I said, Gee, I remember the general area and I have a vague idea of who lives there and I have a fairly good idea of how I might be able to match some of these addresses and deduce that this was a residence as opposed to a business.
Finally, I visited Web sites that were hosted at the registered domain. Some of these Web sites were not hosted, but we will go into how I built up my confidence in a moment.
I classified results initially into the five classes that appeared obvious as I went through the first 10,000 records and then the subsequent 5,000 records I used. As I said before, the personal contact records were those that identified an individual or home-operated business.
There was also an obvious category of business contacts and this would be like Dupont.com. It would be any of the other major operators of banks and financial institutions in the Philadelphia area, some national and global companies that had registered addresses, using a Philadelphia street address.
The domain name business was just a category that I lumped into any secondary market or tasting or monetization that I could identify.
There was also the domain name proxy agent. By "proxy" I mean someone who actually substituted the address of the hosting company as the registrant's address. So this is very similar to the domain names by proxy services that we are seeing today.
The last category was obviously inconclusive data. This is where I looked at the record and there was just insufficient accurate information in the record for me to deduce anything.
The methodology here, because I was doing this record-by-record, field-by-field, search on each piece of information, it was very much like trying to match up points on a fingerprint and come up with a sufficient number of points in the fingerprint to feel very confident that I had a match of one fingerprint from a database to one I had taken off a potential criminal.
So if I go through trying to classify a record as a personal contact, first clues were it is a personal name. Another clue was the registrant's phone number is a residential listing that I managed to identify through a reverse phone number search. Another clue was the registrant's neighbors are neighbors -- are also individuals. And then you can actually do this using the whitepages.com. You can say who lives close to this number.
The registrant address contains an apartment number. In the United States, in at least some of the formal ways to identify your postal address, if you put apartment 2F, fairly high confidence this is an apartment building. I also then confirmed by looking to see if the apartment building was registered as an apartment building or a multi-tenant dwelling.
Real estate listings near the registrant's address are residential. I will make a caveat here right at the outset, so I perhaps avoid some questions later. This methodology works in the United States, and in particular in a part of the United States where there is a very clear delineation and most addresses are in a residential neighborhood, like a number of single family dwellings in a cul-de-sac that Google Earth very easily recognizes as homes. It is also very interesting that in the United States there are many, many neighborhoods that are entirely residential.
In those neighborhoods that are not entirely residential and have apartments above a pizza parlor, apartments next to an Asian restaurant or an apartment next to a Thai restaurant, it becomes a little bit muddier. When I saw that muddiness, I didn't include that field. And I said this is not a reliable piece of data. I will go on and see if I can find some other matches.
Registrant phone number is a cell phone. When I did a reverse phone number search, there are opportunities in some databases to determine it is registered to an individual. Registrant address looks like a residence. This was from an aerial photograph. As I mentioned before, if it is a single family dwelling in a neighborhood that I recognized or that I had visited or had friends in and they are like here it is, this is your classic American two-story colonial, this is probably not a business.
We have very careful zoning laws in many parts of the country, especially in Philadelphia, where you can't run a business legally out of a home.
The registrant's neighborhood is known to be residential. Again, that was familiarity with region. And then the registrant's Web site reveals personal information. It is quite often the case that someone who is willing to put their personal information in the registrant record is hosting a personal site and provides even more personal information once you get there.
So how do I classify a record as containing a domain business. It was a registrant that was a public corporation operating under a fictitious name in the EDGAR database, identified in Hoovers. The registrant phone number comes up as a business listing in the Yellow Pages or a phone number search. The registrant's neighbors are also businesses, shopping mall kind of scenario, the address contains a suite number. Often -- especially since I was familiar with the area, I know that 1600 Chestnut Street in Philadelphia is a multi-story building and no one lives there. It is entirely populated with professional offices.
Also from a Google Earth search, I could deduce from certain professional offices that are sort of campuses in the suburban area that this was a place where people don't live. It is a place where people go to do business. So I won't go through the rest of these because they sort of fall in parallel. What I would like to point out is that in most cases, if I couldn't get six or seven matches out of the ten or 11 that I had, I didn't include that as a conclusive deduction that this was a personal or a business address.
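Illustration (not part of the captioned remarks): the clue-matching approach described above can be sketched as a simple tally. The clue names, the exact lists, and the threshold of six are assumptions chosen for the example; the talk only says that a record was treated as conclusive when roughly six or seven of ten or eleven clues matched.

```python
# Hypothetical clue names, loosely following the clues mentioned in the talk.
PERSONAL_CLUES = (
    "personal_name",
    "residential_phone_listing",
    "neighbors_are_individuals",
    "apartment_number_in_address",
    "nearby_listings_residential",
    "cell_phone_registered_to_individual",
    "aerial_photo_shows_residence",
    "neighborhood_known_residential",
    "website_reveals_personal_info",
)
BUSINESS_CLUES = (
    "registrant_is_listed_corporation",
    "business_phone_listing",
    "neighbors_are_businesses",
    "suite_number_in_address",
    "address_known_office_building",
    "aerial_photo_shows_office_campus",
)


def classify(observed, min_matches=6):
    """Return 'personal', 'business', or 'inconclusive' for one record.

    `observed` maps clue names to True/False. Clues that were judged
    unreliable for a record are simply left out (treated as no match).
    """
    personal = sum(1 for clue in PERSONAL_CLUES if observed.get(clue))
    business = sum(1 for clue in BUSINESS_CLUES if observed.get(clue))
    if personal >= min_matches and personal > business:
        return "personal"
    if business >= min_matches and business > personal:
        return "business"
    return "inconclusive"


if __name__ == "__main__":
    record = {clue: True for clue in PERSONAL_CLUES[:6]}
    print(classify(record))
```

Leaving a muddy clue out of `observed`, rather than guessing, mirrors the speaker's remark about skipping a field when the neighborhood data was unreliable.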
So I started out with around 5,000 filtered records and then because I couldn't come to conclusion or there was insufficient information, I ended up with about 4,400.
Of the 4,400 in the sample, about 505 were from dot net. Dot com had approximately 3,334 names. Org was around 520. Other domain names, biz, info, were about 85.
The first set of findings: when I looked at the type of contact based on the registrant contact fields, I ended up concluding that around 377 records, or 9%, were clear personal contacts. Contacts that I had very high confidence I could go up and knock on a door, I could make a phone call and contact a person.
About 2500 or 56% were business contacts. 269 or 6% were domain name businesses. These are people who are clearly doing name speculation. In fact, I went to the Web site of three of the 19 or so that fell into this category and they literally advertise themselves as speculators.
562 were records that represented a proxy or some effort by the individual to allow someone else to host and hold the name as the registrant. That was approximately 13%.
Home-operated businesses. Now home-operated businesses again is where the personal contact name was a business, like The Embroidery Girl. But The Embroidery Girl operates out of apartment 3F on 1400 Locust Street in Philadelphia. That's another fictitious name. I am not actually revealing where The Embroidery Girl lives. But she has a very nice Web site, by the way.
So that was about 3%. And then the 600, as I pointed out before, or 14% were inconclusive. I wanted to drill this and look at it a little bit deeper. And I said, well, let me remove the inconclusive and proxy domains and let me compare again because I am looking to see whether it is business or personal.
I combined the personal contacts and the home-operated businesses and that ends up being approximately 13.4%. There was rounding in the first slide and there is no rounding here.
The business contacts now end up in this sampling at 65% and the domain name businesses end up at 21.6%.
So then I said, all right, suppose I don't have enough information in the administrative contact -- I'm sorry, in the registrant contact information, maybe I get more in the administrative contact or maybe I find more individuals because there are people like me who register domains on behalf of other individuals because I happen to host a Web site and I'm an individual and I'm in the administrative contact field. And the person who is in the registrant field is the person for whom I am hosting the site.
Does that make sense to everyone? So there might be more. And, in fact, there were. Of the 377 records that contained personal contacts, 347 contained the same contact information in the admin and the registrant fields. 13 contained information that identified a different individual. Eight contained information that identified a business and nine had inconclusive data. If you go and look carefully at the registration records, you find that there is a terrible inconsistency across a lot of the contact fields. Incompleteness is rather remarkable and it is something I will talk about in a moment.
Of those 138 records that contain home business contacts, 125 contain the same contact in the admin fields as in the registrant fields. Three contain information that identified a different individual. And four contained information that identifies a business contact and then again some inconclusive data.
You map out some of the -- some of this in graphing, these are pretty much the results you will see.
So what can I conclude from this? Oh, by the way, I did not use technical contact information because the overwhelming percentage of the technical information is much more accurate -- well, the good news is that it is much more accurate than the admin contact fields and the registrant fields. So that means that the people who are actually running services and doing registrations and registering names recognize the importance of technical contact which I think is sometimes overlooked. And I also found that by the time I got through the 3,000th record, I had maybe six that were personal contacts. Most people let the technical contact be whoever it is who is actually hosting the machine.
So, okay, why did I do this? One of the reasons I did it was because I had been to a sufficient number of WHOIS task force and WHOIS meetings to realize that people kept claiming that there wasn't sufficient data to make any distinction or to concretely decide whether or not there was a reasonable amount or an unreasonable amount of personal information. Of course, that's very subjective to begin with. I am not going in that direction. I am not going to draw some conclusions, but I am going to offer at least this one set of findings that fills that void. I think I missed a slide here.
Okay. So if you want to draw some conclusions, all I want to do is do a summary of the numbers. The personal contact information can be extracted from approximately one in seven domain name registration records. How much that is -- that varies from methodology to methodology or from the sampling is a good question. I did not do this for a European city.
If you use a European city, my suspicion in talking to European residents is that you would have to find some other methodologies. But this is sort of an interesting experiment for someone else in a different geographic location to, perhaps, attempt. What criteria would you use if you were trying to hack information and find people beginning with the WHOIS database?
The other thing is that approximately one in seven registration records contain insufficient information to conclusively distinguish whether or not contacts are businesses or individuals.
I think that's an interesting point as well and you can mull that over and send me some e-mail later and we can talk about it.
One of the things that I observed as I was trolling through these things -- trust me, at times this is a deadly, deadly boring thing to do -- is that the causes for and remedies to reduce the number of incomplete records truly merit attention. 456 of the 5,000 originally sampled records were entirely unusable. And of the remaining 4464, 600 were missing information used to classify a contact. I started with 5,000 and ended up with only 3,700, which isn't all that great.
There should be one more slide and I think I missed it someplace. Here we go. To give you an idea of the incomplete record problem, of the 4,444 records that I was able to use in this study, 24% were missing the registrant phone number. 87% were missing the registrant fax number. 10% are missing the admin contact name. 11% were missing the admin contact e-mail. 12% were missing admin contact address, and 60% are missing the admin contact fax.
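Illustration (not part of the captioned remarks): percentages like these can be tallied from tab-separated registration records with a few lines of code. The column names and the sample rows below are hypothetical, and the sketch assumes a header row, which the purchased data may not have had.

```python
import csv
import io

# Hypothetical sample: header row plus four records, tab-separated.
SAMPLE = (
    "registrant_phone\tregistrant_fax\tadmin_email\n"
    "+1.2155550100\t\tadmin@example.com\n"
    "\t\tadmin@example.org\n"
    "+1.2155550101\t+1.2155550102\t\n"
    "+1.2155550103\t\tadmin@example.net\n"
)


def missing_rates(tsv_text, fields):
    """Return {field: percent of records where the field is empty}."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    rates = {}
    for field in fields:
        missing = sum(1 for row in rows if not (row.get(field) or "").strip())
        rates[field] = 100.0 * missing / len(rows) if rows else 0.0
    return rates


if __name__ == "__main__":
    fields = ["registrant_phone", "registrant_fax", "admin_email"]
    for field, pct in missing_rates(SAMPLE, fields).items():
        print(f"{field}: {pct:.0f}% missing")
```

A field that is present but contains only whitespace is counted as missing here; whether the study treated such fields the same way is not stated.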
So if you are a registrar or you are trying to problem solve and contact people using a fax, you are in deep sneakers right now. It just is not happening. And that's it. If anyone has any questions, I will be happy to entertain them or do you want to hold those?
>>STEVE CROCKER: Yes, I think I want to hold those. Well, you look like you are ready to ask? Go ahead. You are just holding the mike for somebody else. Sorry. She works here. [laughter]
You don't have any questions yourself?
>> Lucy: No, no.
>>STEVE CROCKER: You look so ready.
>>SUZANNE WOOLF: We need someone who speaks Portuguese.
>>STEVE CROCKER: Suzanne Woolf will take us through the intricacies of IPv6 addresses for root servers in the root zone.
>>SUZANNE WOOLF: Now, if you'll just bear with me briefly. I will say that I've been wondering a little bit about this threat to Internet security and stability represented by the hospitality of our hosts and the local invention of the caipirinha, but I'll defer to the chairman and talk about IPv6.
>>STEVE CROCKER: The chairman had too many last night, too.
>>SUZANNE WOOLF: Hence, the threat represented.
And instead, I will talk about IPv6 and the root. Some of this gets fairly technical. I'll try to flag the sort of take-away points but some of the technical detail is included for reference. You know, this will be on the record, and will be posted as part of the proceedings of the meeting, so don't worry if not all of it is obvious, but we'll try to flag the most important points.
So what exactly are we talking about? IP Version 6 is the successor to Version 4, inspired primarily by the need for larger addresses, and expected to accommodate global addressing needs for the foreseeable future. But transition to V6 from V4 is a lengthy and incremental process, given the amount of work that has to be done to change existing systems and create -- and manage compatibility for future systems.
So this matters, however, because we have a technical need to extend Internet protocols, including DNS, to allow for advancing technology and new capabilities for users. We need V6 into the future. We will need other technologies. We will need DNS to -- and future protocols like it to be compatible as we evolve the Internet.
Given the importance of the Internet as infrastructure, however -- and it's only getting more important -- we have a practical need to make sure that such advances are made prudently, with appropriate focus on consequences and the security and stability of the underlying infrastructure. So partly we want to see IPv6 addresses in the root, and the expansion of the IPv6 Internet. We also need to make sure that we're evolving our processes also, so we're trying to create a case study for resolving similar questions in the future.
So the current situation. The DNS standards accommodate IPv6. DNS support for IPv6 is available today in top-level domains, second-level domains. Five of the root name servers are capable of serving IPv6 data over IPv6 transport. Now, the key point -- the thing we're trying to get past here -- is that IPv6 name resolution is not available at the root of the DNS. What that means is that you cannot ask a root name server for the IPv6 address of a root name server.
This is a limitation. It propagates a dependency on IPv4 simply to use the DNS.
However, because we're involving -- because we're evolving this critical infrastructure, there are some impediments, there are some concerns. The technical issues come down to, we're changing -- we are reflecting changes in our underlying protocol. DNS response packet sizes increase when IPv6 addresses are present because V6 addresses were built to be larger than IPv4 addresses. In addition, we need to use a different record type in the DNS to support IPv6 addresses. I'm not sure whose idea of a joke it was to call these Quad A records, because the traditional IPv4 addresses -- address record is referred to as a single A, but the -- the key technical question is: How will these changes affect the installed base? The nontechnical or process questions are basically that we're in kind of new territory. This is -- we're looking at significant changes for the -- for DNS just to -- and we just -- we have not made that many changes at that level to the DNS. There's a need to make sure that we're not proceeding too quickly.
And there's analytic complexity, with evolving the underlying infrastructure at this level. There's a lot of questions we can answer by protocol analysis, which is why we talked to a wide variety of DNS experts. There's also certain things we don't know until we test them, so there's a significant amount of effort involved in making sure we're looking at the right things with appropriate tools.
So we can get into the nitty-gritty of what's holding us back.
This will be elaborated on, you know, in some more detail here, but the DNS has a historic limit on the size of a response. It goes back to the original specification of the DNS and a much smaller network with fewer demands being placed on it, but including IPv6 data may cause the response to be larger than this historic limit.
DNS protocol has a way to handle the larger responses. It's -- it allows modern clients to accept more data in a single response. It's designed to be safe for older clients, but because the root is a special case and any change in the root has such a wide impact, we need to verify that the critical data for reaching the root still fits in the older packet format and size.
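Illustration (not part of the captioned remarks): the size concern can be made concrete with a rough byte count. The sketch below assumes the 512-byte UDP message limit from the original DNS specification (RFC 1035), thirteen root name servers named `a.root-servers.net.` through `m.root-servers.net.`, and ideal name compression. It ignores the extra record a client adds when it signals support for larger responses, so treat the figures as an estimate, not a measurement.

```python
HEADER_BYTES = 12        # fixed DNS message header
LEGACY_UDP_LIMIT = 512   # historic DNS-over-UDP message size limit
ROOT_SERVER_COUNT = 13   # assumption: a through m
RR_FIXED_BYTES = 10      # type (2) + class (2) + TTL (4) + rdlength (2)
POINTER_BYTES = 2        # compression pointer to an earlier name


def wire_name_length(name):
    """Length of an uncompressed domain name in DNS wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_response_size(aaaa_count=0):
    """Estimate the size of a root NS response with `aaaa_count` AAAA records."""
    question = wire_name_length(".") + 2 + 2
    # First NS record spells out the server name in full.
    first_ns = wire_name_length(".") + RR_FIXED_BYTES + wire_name_length(
        "a.root-servers.net."
    )
    # Later NS records need one new single-letter label plus a pointer.
    other_ns = wire_name_length(".") + RR_FIXED_BYTES + (1 + 1) + POINTER_BYTES
    ns_section = first_ns + (ROOT_SERVER_COUNT - 1) * other_ns
    a_record = POINTER_BYTES + RR_FIXED_BYTES + 4      # 4-byte IPv4 address
    aaaa_record = POINTER_BYTES + RR_FIXED_BYTES + 16  # 16-byte IPv6 address
    return (
        HEADER_BYTES
        + question
        + ns_section
        + ROOT_SERVER_COUNT * a_record
        + aaaa_count * aaaa_record
    )


def max_aaaa_within_legacy_limit():
    """How many AAAA records fit before the estimate exceeds 512 bytes."""
    per_aaaa = priming_response_size(1) - priming_response_size(0)
    return (LEGACY_UDP_LIMIT - priming_response_size(0)) // per_aaaa


if __name__ == "__main__":
    print(priming_response_size(0), max_aaaa_within_legacy_limit())
```

Under these assumptions the IPv4-only response already uses most of the legacy budget, which is why only a couple of IPv6 address records fit before a client has to accept larger responses.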
To elaborate a little bit further, every resolver, every DNS client needs to know the address of at least one root name server when it starts up. We're simplifying a little bit here, but this -- there needs to be somewhere to start in finding the DNS, and finding the root. This is a configuration issue for operating systems and system administrators, not typically for users. But it involves the verification of initial data, referred to as the priming exchange.
The configuration information is the hints file, and it's manually installed or it's preconfigured with the OS or it's bundled with DNS software installations. One of the things that needs a close look here is that this is not user visible infrastructure. This is stuff that users don't notice until it breaks.
So the thing that we need to be careful about, we need to know that the priming query that start -- that allows the DNS to start up for a client, and later queries but particularly the priming query because that's the special case, will be successful, even for clients that don't understand IPv6 addresses, don't understand larger DNS answers, don't care about any of that.
Clients in this situation may be applications, may be enterprise-wide resolvers like your ISP or your corporate IT department runs or middle boxes like firewalls.
The key point, again, backwards compatibility is not optional. A situation that would cause problems for any significant fraction of the installed base while evolving towards the future is not going to be acceptable.
I'm not going to spend a lot of time on this, but just so -- just so you've seen it, the hints file that came with your DNS resolver is a text file of IPv4 addresses; to accommodate IPv6, additional resource records have to be included. This is not the complicated part. There are a few issues here.
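Illustration (not part of the captioned remarks): a hints file is a short list of name server and address records, and adding IPv6 means adding Quad A (AAAA) lines next to the existing A lines. The sketch below parses such text and reports which servers have no IPv6 address yet. The server names and addresses are placeholders from ranges reserved for documentation, not real root server data, and the parser handles only the simple one-record-per-line layout shown.

```python
import ipaddress

# Hypothetical hints data; names and addresses are documentation placeholders.
SAMPLE_HINTS = """\
; placeholder hints data
.                        3600000  NS    A.ROOT-SERVERS.EXAMPLE.
A.ROOT-SERVERS.EXAMPLE.  3600000  A     192.0.2.1
A.ROOT-SERVERS.EXAMPLE.  3600000  AAAA  2001:db8::1
.                        3600000  NS    B.ROOT-SERVERS.EXAMPLE.
B.ROOT-SERVERS.EXAMPLE.  3600000  A     192.0.2.2
"""


def parse_hints(text):
    """Return {server name: {"A": [...], "AAAA": [...]}} from hints text."""
    servers = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        owner, rtype, rdata = parts[0].lower(), parts[-2].upper(), parts[-1]
        if rtype == "NS":
            servers.setdefault(rdata.lower(), {"A": [], "AAAA": []})
        elif rtype in ("A", "AAAA"):
            address = ipaddress.ip_address(rdata)
            if address.version != (4 if rtype == "A" else 6):
                raise ValueError(f"{rtype} record with wrong address family: {rdata}")
            entry = servers.setdefault(owner, {"A": [], "AAAA": []})
            entry[rtype].append(str(address))
    return servers


def ipv4_only(servers):
    """Names of servers that have A records but no AAAA records."""
    return sorted(n for n, rec in servers.items() if rec["A"] and not rec["AAAA"])


if __name__ == "__main__":
    print(ipv4_only(parse_hints(SAMPLE_HINTS)))
```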
Again, the picture is available primarily for reference, but the key point, as initially designed, DNS performs this initialization step very efficiently. Almost -- if you can really say this about anything in the DNS -- gracefully. Things fit together just so. The query is sent out, a response arrives that fits exactly into the message size that the client is expecting, and it contains the critical information.
The situation, as we push the limits, if you will, with IPv6 or other reasons to include additional records -- but we'll get back to that -- the response will be larger. And we need to make sure that the -- that the characteristic of -- the characteristic efficiency is preserved.
The possibility that a priming response with IPv6 addresses in it would be useful to IPv6 users somehow, but at the expense of less modern users, is the -- the thing we're trying to make sure we have paid very close attention to.
I'm sorry, that was a very convoluted sentence. Let me go back and try that again.
At this point, the concern arises that the V6 compatible response will serve the clients that are using V6, will serve the future and the modern -- the more modern software base, but will take away or cause problems for the less modern client. That's the thing we have to make sure does not happen.
Another class of potential issues, it's kind of a variation on the theme and, again, it's backwards compatibility we're most concerned about. This includes intermediate systems, often called middle boxes. The most typical ones are firewalls and DNS proxies. The user may not know that they're there. The user may not have any control over what they do.
But if they ignore or change a DNS response that they don't understand, because they've never seen an IPv6 compatible DNS message before, the user could still be left without usable data, in response to the priming query or another query.
Again, the analysis needs to be done to make sure that that's not going to happen. So the findings to date on the joint effort between SSAC and RSSAC, there's a few things we can finalize at this point, although there are some things still under discussion. Some finer points.
The constituency for IPv6 availability, all the way from the root of the DNS tree is significant and growing. This is important to an increasing number of people. And the limitation currently imposed that there's a dependency on IPv4 to use the public Internet DNS is something we need to address.
Adding V6 addresses at the root of the DNS affects the root hints file and the priming exchange, the initialization phase for DNS. But it remains important to move forward.
So we're looking into the details.
One thing we're comfortable with saying at this point, the existing procedures for publishing the root hints and doing that phase of the initialization don't need to be changed to add additional Quad A addresses.
Changes to the composition of the DNS root response are needed to include V6 addresses. There are protocol mechanisms designed to make that easy to do in a backwards compatible way. We think those work. We've looked closely at the protocol issues and we don't see a problem, and some of my colleagues are in the room and will undoubtedly argue with me if I'm going too far here.
But we see no -- we anticipate no harm in the expansion of the standard root response to accommodate the assignment of IPv6 addresses to the -- to all of the root name servers.
We also note that there are already Quad A records in the root for a number of TLDs, requests are routine, this is something IANA knows how to do, and this is part of the administration of the root zone.
In addition, we have done -- I believe Dave Piscitello is responsible principally for this -- thank you -- testing of well-known firewalls and other middle boxes that has shown no problems with either large DNS messages or Quad A responses in priming responses. This is a potential problem and therefore needs to be looked at -- needed to be looked at but it does not appear to be an actual problem in the field. And, again, there is an installed base that uses Quad A records regularly in other DNS responses and no widespread firewall/proxy issues with them. There is experience out there that this is not causing major problems.
So there's still some concern, though -- after that, it gets a little tricky. We've analyzed and tested quite a few things but we don't know everything. So engineering conservatism requires that we have a plan, a roadmap, that allows flexibility as we go forward and allows for the possibility that we'll find issues we haven't dealt with yet.
So the outline of the plan -- the outline of a plan, although we're finalizing details, would look at conducting tests or other means of identifying implementations, versions, configuration settings, software behavior that will permit systems to behave correctly when root hints and DNS priming responses contain Quad A addresses. That's probably going to wind up in the formal report. It summarizes to: We're going to look at ways of verifying that a particular piece of software configuration does the right thing.
Document configurations or workarounds that accommodate Quad A addresses but result in different default behavior or security policy enforcement from what is currently understood and deployed. Again, find out in detail exactly what systems, configurations, software might be doing the wrong thing and figure out how to make sure that we've documented how to change their behavior.
In addition, a deployment plan will probably have to include providing advance public notice of the intent to include Quad A records in the root hints file and DNS priming responses from the root name servers, and then go ahead and include the Quad A's for the roots that are currently V6 capable in the hints file, and in the live zone, and eventually have all of the root name servers able to serve IPv6 compatible data in the production zone.
But the phased deployment approach is going to be important, because we need to make sure that we identify issues. Even ones we have not thought of yet.
I've referred to the roadmap as kind of -- as tentative. The consensus of the committee is not -- the committees, the joint working group, I guess, is not complete yet on all points. And the final recommendation we present needs to be both a detailed suggestion for the way forward for IANA and a clear discussion of the issues and why -- how we feel we've resolved them for the Internet community, so that the process of deciding what to do and what the rationales have been will be public.
We do hope to have a consensus recommendation and a roadmap before the Portugal meeting. And that is what we have, and I'll be happy to take questions when our chairman allows.
>>STEVE CROCKER: And I think that's what we should do now. I have a couple more things that I want to cover, but I think the right thing to do is to jump into questions on each of these quite substantive and meaningful reports.
I'm not going to connect up anything at the moment, so we'll just leave you connected -- no, you're disconnected.
So Suzanne presented on IPv6 addresses, Dave on WHOIS privacy, Lyman on the RSTEP process.
Let me just open the floor for questions. I have the impression that there's people who have been holding questions for a while and are eager to jump in here.
>>DAVE PISCITELLO: What gave you that impression?
>>STEVE CROCKER: Ah, come on.
>>SUZANNE WOOLF: Everybody had too many caipirinhas last night.
>>STEVE CROCKER: I've got some questions, if nobody else does, but --
>>STEPHANE BORTZMEYER: Yes. But could we ask questions also about the --
>>STEVE CROCKER: For -- would you, for the scribes, please identify yourself first?
>>STEPHANE BORTZMEYER: Yeah, right. Stephane Bortzmeyer from AFNIC's .fr registry.
Could we also ask questions about the previous talks or only about IPv6 in the root?
>>STEVE CROCKER: No. This is wide open. You can have at it. Whatever you like.
>>STEPHANE BORTZMEYER: Wide open. Okay. Because first I would like to ask a question about wildcard in dot travel proposal. Everything that you said in the report under wildcard in dot travel was perfectly correct, but it also applies to any wildcard proposal, not only Tralliance's specific proposal. So what about dot museum, which also has wildcards with exactly the same problems?
>>LYMAN CHAPIN: Yes. The -- the review was specifically about the Tralliance proposal, so although there is a lot of material in the report that would lead you to believe that it applies to any wildcard, the review team's conclusion is technically restricted just to the Tralliance proposal. And all that means is that if someone else proposed a wildcard, it wouldn't automatically be rejected by ICANN just because we had studied the Tralliance version. They would have to look at it and find out if there was anything different about it that would perhaps make it more acceptable under the guidelines that we used for the evaluation.
So I guess all I'm really trying to say is that this review team conclusion doesn't automatically apply across the board, because it's possible that there was something specific about the Tralliance wildcard that would be different from some other form of wildcard that someone else might propose. Okay? But that having been said, the expectation is that the technical work that is -- that is captured in the report could be used by anyone to evaluate another wildcard and basically come to a conclusion about whether or not it looks exactly like the Tralliance proposal, or enough like it that there's a likelihood that it's probably going to suffer the same fate. In other words, that it's going to have the same problems or that a technical review would find the same problems.
>>STEVE CROCKER: Let me -- let me add a bit to that, if I might. And thank you for that -- you stayed precisely within the prescription and the formal bounds.
Just to widen the scope in two respects. First of all, in 2003 VeriSign turned on wildcard and there was a big to-do about that, and eventually a year later, we issued a report, along with a lot of other people, and in broad terms, without looking at fine-grain details, the dot travel covers roughly the same ground and it got roughly the same answer.
As a matter of being precise -- and if lawyers were in the room, they'd want to make sure that we didn't make overly committing statements. As a matter of form, every case gets looked at. Every case gets a fair evaluation. But it's -- in common sense, it would be hard to find a distinction that would be meaningful.
And then the other thing that I want to be quick to say, because -- and particularly because you raised it and because the dot travel folks, the Tralliance people raised it, is: Okay. So what about dot museum?
The thing that you should expect is that this is going to go away in dot museum. That it's been there and the reasons why it was there in the first place have a kind of a strange little history. It was thought at the time that it was the right thing, and in retrospect, and particularly with experience, it hasn't worked out that way, and my understanding -- and I don't want to be in the position of speaking for them, but it's my understanding that they have no commitment to -- no strong desire to keep it, and that it's embedded in some contractual language in a very strange sort of backward way, and that that's in the process of being undone.
So it will stop serving as a -- as an example for others to use to make similar application and try to smooth out the landscape here. Yeah.
>>LYMAN CHAPIN: Yeah. I think it might be interesting to note that the reason that the dot museum folks plan to stop using the wildcard is that they discovered that they couldn't make it do what they wanted it to do. The idea in a domain like dot museum, there's a fixed well-known set of second-level domains, which are the names of -- you know, in other words, if you're a museum, that's a second-level domain, and the idea that they had in doing the wildcard was -- was actually not a bad idea.
They wanted to set up a system so that if someone typed in a second-level domain that wasn't registered, that what they would get is essentially a page showing all of the registered domains in dot museum, on the assumption that they had simply, you know, typed in "Institute of Contemporary Art" instead of "Contemporary Art Institute." You know, that they'd made some error. In the kind of domain like dot museum, it's possible to do that because there's a limited number of things that can actually show up in the second level.
What they discovered is that they couldn't figure out a way to actually implement the top-level wildcard in such a way that it would actually have that effect. In other words, that it would, in fact, catch those things and enable them to put up that kind of a page.
So they're withdrawing it not because they agree with -- with the RSTEP review team's logic, which talks about security and stability implications for the wildcard in dot travel. They're withdrawing it because it doesn't do what they want it to. In other words, it's not been effective, and, therefore, there's -- there's no point in their continuing to support it. Thanks very much. Any other -- any other questions?
>>STEVE CROCKER: I've got -- I've got a couple of questions here.
For you, Lyman, so now you've gotten two of these processes complete and on a very fast pace, and I haven't read the second report which just --
>>LYMAN CHAPIN: Just came out.
>>STEVE CROCKER: -- just came out, and I haven't had a chance to read it, but I read the first one and it was very well done and I assume the second one is comparable.
Do you have any general thoughts on the process itself and on sort of what this initial experience feels like in terms of utility and efficiency and effectiveness?
>>LYMAN CHAPIN: Yes. I think that both of the first two reviews have been more successful than I might have anticipated, in the sense that the -- the people that conducted the reviews did an absolutely fabulous job of sticking to the agenda. In other words, not getting caught up in all sorts of theorizing about the larger meaning of the question that they had been asked and so forth. You can imagine that the kinds of people that we have conducting these reviews might have all sorts of things to say about all sorts of related topics. That would not get you through a 45-day, very fast process to finish your review.
So I think that we've demonstrated that we're able to conduct these reviews in a very short period of time, and in a very focused way.
The -- one of the most interesting discoveries, having been through two of these, is how difficult it is to limit the scope to the question that's been asked. A lot of the people who are involved in these reviews are also involved in other forums where they have much greater flexibility and freedom to investigate questions and to bring in arguments from other parts of their experience that in a larger context would be relevant and in the narrower context of the kinds of reviews that we're conducting really isn't relevant, so although we have some interesting discussions on those topics, one of the most important things -- and it's one of our instructions to future chairs of review teams -- is to be very careful to keep the focus on the question, the specific question that's been asked, which has to do with a specific proposal.
Because 45 days is simply not long enough to have the kinds of wide-ranging discussion that a topic, for example, like wildcards might generate.
>>STEVE CROCKER: Let me ask Suzanne a question related to this work.
This is the first -- certainly the first official and probably the first substantive piece of work that was done as a full-scale joint effort with -- between the Root-Server System Advisory Committee and SSAC. So two questions. One is your feelings about the process and about any lessons learned out of this or any take-aways in that respect, and then the other, with respect to the specific issue that we're pursuing here, what's your sense of coming to conclusion -- coming to closure and finding a way forward and I guess I want to ask David Conrad the same question with respect to the effort as presented so far, since David is in the inescapable position of being the target -- "target" is the wrong word -- the addressee if you will, ultimately.
>>SUZANNE WOOLF: Stuckee.
>>STEVE CROCKER: The stuckee for having to implement things. So I'll give David a moment and let Suzanne jump in.
>>SUZANNE WOOLF: Sure. I will try to pull some thoughts together here.
First of all, on the process, as a first effort, I have been reasonably happy with the way certain strengths or certain pieces of the mission of the two committees fit together. RSSAC is somewhat on the hook to provide specific advice to IANA on things that impinge on the operation of the root -- of the DNS at the root because RSSAC is composed so significantly of the people who are responsible for those operations, and are kind of on the spot, together with IANA, for anything that breaks or for any bad advice we give.
At the same time, SSAC has developed a significant strength as far as not only being able to do the analytic work, but to be able to discuss and explain it, and be able to -- be able to lay out, even for the less technical but still affected audience what the issues are, how they're to be resolved, what's the way to go forward. And in particular, I think Dave Piscitello is a great strength. Having the SSAC fellow to mediate that process is a great strength to both groups.
As far as the specific issue in getting to closure, I've thought at least four times we were about there, so I don't speculate anymore. It turns out that there are so many fine points and distinctions and specific details that it's really quite difficult to get a picture of the world that, well, particularly a fairly contentious and opinionated bunch like ourselves are going to agree on. I think where we're going now, there's a -- there is a tension we have to respect between engineering conservatism, making sure we're not hurting anything, and being so afraid to touch the infrastructure that nothing changes. And that we can't evolve new capabilities.
>>STEVE CROCKER: Yeah.
>>SUZANNE WOOLF: So I think we're sort of feeling our way towards a -- a compromise position that will let us back off, if we're -- if it turns out we're pushing things too far. But it means we will not dither indefinitely.
>>STEVE CROCKER: Yeah, you alluded to the iterative nature of the thinking you were almost done and finding one more thing and having to probe into that. This effort -- this is really sort of the second push that we've tried to make in this, and we had an effort that didn't come to closure, sort of embarrassingly, a couple of years ago.
The original effort kicked off with a thought experiment that said the packet that contains the priming response, that contains the list of root server addresses, is pretty full and there's no way to put IPv6 addresses in there on top of the -- or in addition to the IPv4 addresses. How are we ever going to make that work?
And that's when -- and this was my sort of little knowledge is a dangerous thing. I said, "Wait, I know that these packets are limited to 512 bytes," and then that reasoning continued from there.
So the first important thing is the packets are not in fact limited to 512 bytes. By agreement, if the requester asks for bigger ones, then the responder can send big ones. And then more subtle, much more subtle kinds of questions began to emerge as we looked closely into this. As Suzanne explained, the possibility that some pieces of software would be sensitized to Quad A records and break, the possibility that middle boxes were too educated and had overly strong views as to what the packets should look like, and we are trying to run all that exactly down to details and make sure there is no nightmare scenario where we turn on Quad A records and a lot of things break and, even though they are not our fault, we get blamed for them.
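Note: The arithmetic behind the 512-byte concern discussed above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the 13 servers are named a.root-servers.net through m.root-servers.net, standard DNS name compression is applied throughout, each server has one A and one AAAA record, and the EDNS0 OPT record is left out of the count. The helper names are hypothetical.

```python
# Rough size of a priming response (". NS" query) with and without AAAA glue.
# Assumptions: 13 servers named <letter>.root-servers.net, full name
# compression, one A and one AAAA record per server, no EDNS0 OPT record.

HEADER = 12          # fixed DNS header
RR_FIXED = 10        # type (2) + class (2) + TTL (4) + RDLENGTH (2)
POINTER = 2          # compression pointer to an earlier name
CLASSIC_LIMIT = 512  # UDP payload limit when the requester does not use EDNS0
SERVERS = 13


def wire_name_length(name: str) -> int:
    """Uncompressed wire length: a length byte per label, plus the root byte."""
    labels = [label for label in name.split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def priming_sizes() -> dict:
    root = wire_name_length(".")                      # owner name of each NS record
    question = root + 4                               # QNAME + QTYPE + QCLASS
    first_ns = root + RR_FIXED + wire_name_length("a.root-servers.net.")
    # Later NS targets: one single-letter label plus a pointer to "root-servers.net".
    later_ns = root + RR_FIXED + (1 + 1 + POINTER)
    ns_section = first_ns + (SERVERS - 1) * later_ns
    a_glue = SERVERS * (POINTER + RR_FIXED + 4)       # 4-byte IPv4 address
    aaaa_each = POINTER + RR_FIXED + 16               # 16-byte IPv6 address
    v4_only = HEADER + question + ns_section + a_glue
    return {
        "v4_only": v4_only,
        "v4_and_v6": v4_only + SERVERS * aaaa_each,
        "aaaa_that_fit_in_512": (CLASSIC_LIMIT - v4_only) // aaaa_each,
    }


print(priming_sizes())
```

Under these assumptions the IPv4-only response comes to 436 bytes and the response with all 13 AAAA records comes to 800 bytes, so only two AAAA records fit under 512 bytes unless the requester asks for larger responses.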
>>SUZANNE WOOLF: You reminded me of something I meant to say earlier.
>>STEVE CROCKER: Good. So I just wanted to comment that this iterative nature is, I think, one of the -- it is aggravating in the process, but I think it is enormously important and healthy in the overall scheme of things. And we are trying to be quite thorough about all that.
>>SUZANNE WOOLF: Yeah. I referred earlier to this as kind of a case study and the reason for that is Quad A addresses for the root name servers is not the last change of this kind that we will have to make to the root zone. A signed root zone with DNSSEC has similar implications, and I certainly hope as we continue to evolve the technology between now and the day we can successfully replace the DNS altogether that there will be other situations where there is a need to make sure we are not doing harm to the installed base but we can still continue to evolve the infrastructure and make new services and capabilities available.
Sort of feeling our way through the rough spots on this, I think, is providing valuable experience and hopefully the next similar set of issues will at least have a roadmap to what considerations to spend our time on.
>>STEVE CROCKER: Now it is your time, David.
>>DAVE PISCITELLO: Give me all your questions.
>>STEVE CROCKER: Different David. David Conrad, general manager, IANA.
>>DAVID CONRAD: I think I have been very pleased with the information that's been provided by the two Advisory Committees to IANA with regards to the changes that are being proposed. I think -- my view is this -- as Advisory Committees are providing advice to ICANN in order to formulate a proper response on behalf of ICANN, it is input -- I do anticipate there to be additional input from other bodies in particular with regards to making changes for the Quad A records in the roots.
I'm hoping that essentially a public comment period will enable additional information with regards to potential impacts that, perhaps, the committees have not anticipated will be exposed. If not, then there may not be any reason for concern about changing the root DNS in a way that is -- that has not yet been experienced and we can simply move on.
I also think that this is -- this particular exercise has been extremely beneficial in helping to create a more -- I don't want to say formal -- but a more structured approach to dealing with questions that have global impact with regards to security and stability. So from that perspective it has been all positive.
>>STEVE CROCKER: Thank you. Question over here. And do introduce yourself. Thank you.
>>ROQUE GAGLIANO: Hi, okay. Roque Gagliano from (saying name). I wonder, as a process matter, when all of this is done, are you going to wait until the 13 root servers have (inaudible) address in order to change the hints in the root zone, or going to move on with the servers now and be configured at the time?
>>SUZANNE WOOLF: We are not assuming we will have to wait for all of them to be able to answer queries over IPv6 about IPv6 addresses. That's not necessary for it to be useful to have some of them. However, we also anticipate that all of them will have the capability within a very reasonable period of time.
I'm hoping the recommendation will be done and some of them will start being available in the root zone before, perhaps, everybody's fully deployed v6 capabilities. But both of those things do seem to be relatively near-term events.
>>STEVE CROCKER: There is, what, five that are capable now and what do we know about the other -- 13 minus five is eight. I am in trouble here. I really did have way too many last night.
[laughter]
She was there, too.
>> (inaudible).
>>SUZANNE WOOLF: This has been very quiet today. As I said, the threat to the security and stability of the Internet represented by Brazilian hospitality. I am actually not sure of the whole list. There are five servers that could provide v6 -- could provide service today. The rest, I believe, the last time we ran around the table on that, we were anticipating that capability in a year at the outside.
>>STEVE CROCKER: Yeah. It is a bit of a race. By the time we actually get all this put to bed and everybody is completely comfortable, there may be considerably more than five. But it is not a structural requirement to wait.
All right. Another question.
>>STEPHANE BORTZMEYER: I need a small technical clarification about what Suzanne said in her talk. When you said that the name server has to provide -- when you ask for the name server of the root, it has to provide all the names. But you also said twice that they had to provide in the additional section all the I.P. addresses of the root name servers, which struck me as odd because it is not true. Your name server is not forced to give all the addresses in the additional section. I don't think so. It can select a subset of the I.P. addresses. I tried with BIND 9.4 that we use here, and it sent back only some of the I.P. addresses of root name servers.
>>SUZANNE WOOLF: I'm sorry, it sent back only?
>>STEPHANE BORTZMEYER: Only 12 of the I.P. addresses. "I" root was missing. I don't know why "I." But it seems to me it is (inaudible).
>>SUZANNE WOOLF: Was it that it sent all of the names and not all of the glue -- not all of the DNS records?
>>STEPHANE BORTZMEYER: Right.
>>SUZANNE WOOLF: There are some finer points. Frankly, I was trying to stay away from the technical details of how the server selects what's available and what to send in the additional section for this room.
The higher-order point is that we do want to be able to provide the names of all the servers and all of the glue for them to clients that want that complete information without creating a situation where the subset of data that is sent to a particular client is not the subset of the data that is useful to that client. For example, you want to make very sure that your server does not serve Quad A data as the only thing that will fit in the space for a response for a client that only wants A records and IPv4.
>>STEVE CROCKER: And vice versa.
>>SUZANNE WOOLF: If you have done the analysis, it is as we are hinting here. It gets very involved. We want certainty about what the proper behavior is so we can easily tell whether a specific server or client is behaving properly and tell people how to fix it if it is not. There is infinite variation out there but not all of it is correct behavior.
>>STEVE CROCKER: Thank you. Another question?
>> Sorry.
>>STEVE CROCKER: Your name?
>>JAY DALEY: Jay Daley from Nominet. I have just created a program to test that. We are getting 13 every single time.
>>STEVE CROCKER: But it knows it is you.
[laughter]
>>SUZANNE WOOLF: It likes you better.
>> (inaudible).
>>SUZANNE WOOLF: Maybe there is a firewall, Stephane.
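Note: A test of the kind described in this exchange could look roughly like the following Python sketch. It is not the program that was run in the room. The function names are hypothetical, and the sketch only reports the section counts and the truncation flag from the response header rather than parsing every record.

```python
import socket
import struct


def build_priming_query(query_id=0x1234, edns_payload=None):
    """Build a '. NS' query; optionally advertise a larger UDP size via EDNS0."""
    arcount = 1 if edns_payload else 0
    # Header: ID, flags (plain query), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, arcount)
    # Question: root name (single zero byte), QTYPE=NS (2), QCLASS=IN (1)
    packet = header + b"\x00" + struct.pack("!HH", 2, 1)
    if edns_payload:
        # OPT pseudo-record: root name, TYPE=41, CLASS=payload size, TTL=0, RDLENGTH=0
        packet += b"\x00" + struct.pack("!HHIH", 41, edns_payload, 0, 0)
    return packet


def section_counts(response):
    """Read the record counts and the TC (truncated) bit from a DNS header."""
    _id, flags, _qd, answer, authority, additional = struct.unpack(
        "!HHHHHH", response[:12]
    )
    return {
        "truncated": bool(flags & 0x0200),
        "answer": answer,
        "authority": authority,
        "additional": additional,  # includes an OPT record if the server sends one
    }


def ask(server_ip, edns_payload=None, timeout=3.0):
    """Send the priming query over UDP to server_ip and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_priming_query(edns_payload=edns_payload), (server_ip, 53))
        data, _ = sock.recvfrom(4096)
    return section_counts(data)
```

Calling `ask()` with the address of a name server, once without `edns_payload` and once with a value such as 4096, would show whether the number of records in the additional section changes with the advertised size. A firewall or middle box on the path could change the result, which may be one reason two people in the same room saw different answers.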
>>STEVE CROCKER: More questions?
>>STEPHANE BORTZMEYER: A different one this time. We discussed only the problem of the size of the response when there are also some IPv6 addresses for the root name servers. But there is another issue that was experienced by everyone who tried to put a Quad A, as we call it, for a Web server or name server. Very often the client machine has different connectivity in IPv4 and IPv6. Most of the time the connectivity is worse in IPv6. It goes through a tunnel or Teredo or things like that.
So it can happen that a client can reach all root name servers in IPv4 but with IPv6 it cannot ping a true IPv6 address, which would be very slow and (inaudible) or things like that.
When we set -- when we moved -- when we added an IPv6 address to our Web server, two people complained that it worked before and it didn't work after because they went through a tunnel in IPv6 which blocked ICMP "packet too big" or anything like that. This is the sort of problem that could happen with IPv6 addresses for the root name servers.
Do you have an idea of how to test if it is a real problem, how to test how many people will be affected?
>>SUZANNE WOOLF: That's a really tricky question, especially within this scope, because that's not -- it sounded to me like you are talking about -- correct me if I misunderstood your question, is that when a client that has connectivity over v4 and v6 receives both v6 and v4 reachability data from the DNS, it will have to make a choice based on the information it receives about which transport to use. Try to use the v6 network capabilities or the v4 network capabilities.
That issue is not only outside what the root name servers or the authoritative name servers have the ability to determine, it is outside the DNS.
And I think the only thing the DNS can do is provide the most complete information, again, which gets back to making sure that we don't have to make a choice over which subset of the information to include. We can include all of it and let clients that are most familiar with local conditions kind of figure that out because that's a very complex set of decisions and all we can really do is provide the most complete information possible and let the knowledgeable client decide.
>>BILL MANNING: I would like to add to Suzanne's point. For the most part, root server operator people are a careful bunch. And for those who are expecting or anticipating IPv6 transport capability for their servers, there is a reasonable amount of testing being done to make sure that, as far as they can tell, those prefixes are available and they are not filtered by intermediate ISPs. They chase these things down and try to make sure there is good reachability.
But Suzanne is correct, it is outside the scope of the DNS. It really has to do with whether or not Jordi is going to block my packets. And he does.
So if you want to reach my name server, I point you at Jordi. That's outside the scope. We really try hard to make sure we know where our reachability issues are and try to resolve them before we turn this on in production.
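Note: The point made in this exchange, that the DNS supplies both kinds of addresses and the client decides which transport to use, can be illustrated with a small Python sketch. It is a hypothetical client-side policy, not something specified by either committee. The `pick_transport` function and the probe callback are invented for illustration, and the addresses come from the ranges reserved for documentation.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, str]  # (record type, address), e.g. ("AAAA", "2001:db8::1")


def pick_transport(
    addresses: Iterable[Address],
    probe: Callable[[str, str], bool],
    prefer: Tuple[str, ...] = ("AAAA", "A"),
) -> Optional[Address]:
    """Try addresses in order of local preference; return the first that works.

    `probe` stands in for whatever local reachability check the client has
    (a timed query, a connection attempt, cached history). Record types that
    are not listed in `prefer` are not handled by this sketch.
    """
    ordered = sorted(addresses, key=lambda item: prefer.index(item[0]))
    for record_type, address in ordered:
        if probe(record_type, address):
            return record_type, address
    return None


# A fake probe modelling a client whose IPv6 tunnel silently drops traffic.
def broken_v6_probe(record_type: str, address: str) -> bool:
    return record_type == "A"


candidates = [("A", "192.0.2.1"), ("AAAA", "2001:db8::1")]
print(pick_transport(candidates, broken_v6_probe))
```

With the fake probe shown, the client tries the IPv6 address first, finds it unreachable, and falls back to the IPv4 address. The server has no part in that decision beyond supplying complete data.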
>>STEVE CROCKER: Other questions?
>>ROQUE GAGLIANO: Hi, this is Roque again. I am going to change topic. I was earlier today in a meeting where they were talking about domain tasting, and there was a lot of concern about security issues, not only stability, because there are some numbers -- finite numbers, we are talking about like 30% of the domain registrations being used for tasting -- and they were talking about phishing, spamming, et cetera. Is this a concern for this group, or is there any study going on about that?
>>STEVE CROCKER: Thank you for that question. We have been following the domain tasting and the "delete storm" phenomena for quite a while. And I must say that when I watch the enormous amount of traffic generated by the registrars for both phenomena and the load that places on the registries, I thought that that was extraordinary and dangerous in the sense that it would overwhelm those facilities similar to a denial of service attack. I have to say that the registries that are involved sent back very mixed messages. On the one hand they made it clear they were under stress, there was a tremendous amount of traffic being put on them and in a way asked for a certain amount of sympathy.
But they didn't actually ask to turn it off. And so it raised the question of why not. Now, there is some money involved because even though it is stressing their facilities, it is generating some degree of revenue.
Some time ago -- and this cycled around more than once within the committee here and there came a point in which we took up very carefully the prospect of trying to initiate a study and the possibility of making a recommendation with respect to this matter and realized it did not make sense for us as a committee to go forward if the registries who were involved were not equally interested and we've posed the question to two of the registries, VeriSign and PIR who were both being affected by this.
And at the time, which was not now, but it was a while ago, they both came back and said, thank you very much, but we don't need your help right now.
Sort of like a lawyer left without a client so we didn't have anything left to do.
Subsequently, there has been some shift as we have been tracking it and as you saw, PIR sought and has gotten approval for a charge that will shift -- that will provide some revenue in exchange for the load that's being put on them or will damp out -- slow down the load that's being put on them one way or another and we will just have to see where all that goes and then that became part of this whole dialogue about whether it is right for them to do that and what the public issues are. There is a much, much broader set of issues about the stability within the name space, whether it uses up too many names and trademark infringement and a whole series of less crisp -- less engineering issues and more user impact kind of issues. I was in that session and listened very closely. There was a comment about aren't those security and stability issues just as much as the engineering issues? And my answer would be yes, in fact.
But they require a somewhat different way of approaching and so a different set of people and a different set of issues. To the extent they perturb or muddy the ability of users to get their primary work done, I would say, yes, that's an important factor. But the shape of that is not as clear as the shape of the problem of having registrars put 100 to 1,000-fold level of traffic on top of systems that were designed for a particular level. So, yeah, kind of a long answer. But I have had my -- it has been a personal interest because I have been watching this go back and forth and have often ridiculed the process -- the engineering design. If I were teaching a course in how to design systems and I said people out in the real world need a solution to this weighted lottery kind of effect, for example, for the deleted names, so please design me a system, and if a student came in and said, "I have got a great solution, we will have distributed systems all over the world pounding as hard as possible on this one poor system, and the proportional number of systems that you use is the weight that you have, and it is sort of like buying a ticket, and that will be a solution," I would ask that student to switch fields because he would be dangerous. I would never want to see such a system implemented. I would give him a failing grade and it is embarrassing that that's the solution we have out in the field. That's my -- and in another setting, I will tell you how I really feel about it when I can be more honest.
It is quite stupid. So we are paying attention. Not a whole lot we can do at the moment. But thank you for the question.
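Note: For readers unfamiliar with the term, the "weighted lottery kind of effect" mentioned above can be expressed directly, without any party generating load, as in the following Python sketch. This is only an illustration of the general technique and not a proposal made by the committee or by any registry. The function name, the participant names, and the weights are hypothetical.

```python
import random
from typing import Dict, Optional


def draw_winner(weights: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Pick one participant with probability proportional to its weight."""
    rng = rng or random.Random()
    participants = list(weights)
    return rng.choices(
        participants, weights=[weights[p] for p in participants], k=1
    )[0]


# Hypothetical weights: each participant holds a number of "tickets".
tickets = {"registrar-a": 3, "registrar-b": 1, "registrar-c": 0}
print(draw_winner(tickets, random.Random(2006)))
```

In a draw like this the weight is stated once as a number, so the outcome does not depend on how many machines a participant can point at the registry.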
So I see time is moving. Let me -- I was going to ask Dave Piscitello a question about any conclusions about the status of WHOIS with respect to the broader issues of stalking and identity theft and all the things that people worry about when they worry about the exposure of WHOIS data.
In the meantime, as you keep that short, I want to put up just one last piece of business related to SSAC.
And for that, I need the display. Where did it go to?
>>DAVE PISCITELLO: So Steve has asked me to really sort of explore something that I have intentionally not sought to explore because I didn't want to bring my personal perspective in while I was sitting here representing ICANN and SSAC.
So I am wearing a personal hat. And I actually had spent a fair amount of time yesterday in a very interesting WHOIS privacy debate with unfortunately most of the people who were privacy advocates and very few people who were not. And I think that sort of points out part of the problem that some of the people there said was institutional. I am not going to agree or disagree with that.
I do think that any time there is one in seven opportunities to make contact with an individual you have a fairly significant opportunity for mischief or for abuse.
Exactly how -- you know, how many different criminal activities can be perpetrated, you know, from the information that's gleaned is uncertain to me.
But I'd also like to point out that, you know, if you are a privacy advocate in the United States, you've got to be concerned with a ton of other databases that are available, you know, for free.
As an example, you can go to thefederalelectioncommittee.gov and you can FTP the entire database of everyone who has made a contribution of more than $500. In many cases, they are facsimiled copies of the submitted record, including the, you know, faxed signature. Okay?
You can go to searchsystems.net, which is a public records directory of over, what, 36,000 databases, and it's called a deep Web source. Meaning that it's databases that are not accessible through the -- directly through the -- you know, through a Web search, but it's -- they're accessible once you subscribe to this service.
You can get incredible amounts of information from these directories for a fee, including property records, licenses of all sorts -- nursing licenses, driver's licenses, and the like -- criminal records.
This is the kind of information that people go out and grab and put up and say, you know, "We have a stalker on our neighborhood," kind of relationship.
You can go to places like zoominfo.com, datatoknow.com, archived.org, guidestar.org, grantsmart.org, zabasearch.org. You can get Social Security numbers, death -- you know, death indices. In fact, you can still Google, you know, certain Social Security numbers and find that Social Security number is posted someplace on some Web.
So, you know, we -- we are just in a dreadful state in the United States, you know, with respect to what we've done, you know, to allow our privacy to erode.
And, you know, the WHOIS database is -- is one of but many holes that we have to close, you know, in this terrible game of, you know, Whack-A-Mole.
So, you know, that isn't short but that's sort of, you know, my emotional reaction to this is --
>>STEVE CROCKER: Yeah. I wondered what your long answer was going to be.
>>DAVE PISCITELLO: Well, the long answer is -- actually, the long answer is I can actually go through this with one individual that you pull out of a newspaper, and I can show -- we can lead back to the fact that he's a criminal. You know, so...
>>STEVE CROCKER: Right, right. Thank you very much. Let me move -- let me move on with just a couple slides of closing business here.
Let me tell you a little bit about what we're doing and what we're thinking about doing.
We're looking at trying to take a broad look at security across the set of issues that ICANN needs to be concerned with, what are those issues and what shape are they in, some level of assessment, and also examine both ourselves and the ICANN spectrum of activities to ask how well things are being addressed.
So not to focus attention on any particular thing, but just as an example, some time ago we wrote about domain name hijacking. I think as part of this assessment, we'll want to see did it have any impact and is anybody paying attention.
Not so much that it's our job to be in control or admonishing anybody, but it's useful to calibrate how well this whole system works, of which we're just a portion.
So that's one activity. Another is, we've been asked to take a look at the internationalized domain names from a security and stability point of view. This has been a kind of obvious question that we shied away from a couple times before because we have not had the right set of people and the reservoir of energy and manpower available. That's caused me to back up and say, "Okay. Let's go get the resources necessary to do that job right," and we're having that discussion. I don't know which way it will come out. I'm hopeful it will be positive.
Different question, which has gotten some attention this week, is the consumption of address space, of IPv4 address space. There have been various warnings raised in the past, but several this week, that there may be black markets forming, or gray markets or secondary markets and various other things.
It isn't clear whether or not, as we get close to using up the address space, whether the effect will be rather cataclysmic -- and I've chosen some attention-getting words here -- like a nuclear winter or ice age, or whether the effects will be somewhat more gradual but, nonetheless, pervasive. You know, global warming or even, you know, just sort of an annoyance level of smog or something where it just gets harder and harder to get addresses but nobody says that the world is coming to an end.
So those are some of the current and potential activities that we're talking about.
I'm not sure this is comprehensive. I forgot to go back and do a careful look. Did I leave something out that's on our plate at the moment?
>>DAVE PISCITELLO: Oh, probably lots of stuff. We have all sorts of little threads.
>>STEVE CROCKER: Yeah.
>>DAVE PISCITELLO: Yeah. Well, I think one of the interesting things, you know, as an observation, we are often -- you know, or more often a react -- or response committee as opposed to a proactive committee. People come to us with a variety of different issues relating to an individual hijacking instance of a domain name to something that sort of looks anomalous in the numbering plan or name space, and, you know, we look at it and if it's something that merits attention, sometimes we dismiss it before we have to write anything. It's just an e-mail response back, you know, from Steve.
You know, some of the other things that we are doing are obviously reviewing and contributing to the -- you know, to ICANN's strategic plan, and the operational plan, you know, so -- you know, so we get involved as any of the other constituencies and advisory committees in that regard.
And the board often asks us point questions on some issue that they feel, you know, is -- you know, ought to be directed to us. And again, sometimes we post our comment back to the board on our Web site. We've got a fair number of papers and reports and advisories that we've posted on the document -- in the documents section. In fact, if you have one of these brochures, you can see the list of them.
You know, and -- and we also are constantly trying to put up some resources that are offered or have been written by members of our committees on our -- you know, on our Web site. The Web site itself is something that's undergone a -- you know, a substantial change as well, so...
>>STEVE CROCKER: Yeah. Thank you for pointing that out. I was going to attempt to highlight that.
The back side of the brochure has a list of the documents.
Also, I did not put up a slide listing our members, all of whom are -- give very generously of their time, but they're listed in here, and poor Russ Mundy sitting down here with us has said not a word yet.
Do you want to -- is there anything that comes to mind that you want to chime in on? You've certainly paid your dues here.
[Laughter]
>>RUSS MUNDY: No, I don't have anything particular, other than we hope folks are able to come and participate in the activity tomorrow, where you'll see some work that's very closely related to the SSAC, and that's the DNS security aspects of the ccTLD workshop.
>>STEVE CROCKER: Exactly. And with that, let me bring things to a close here. Thank you all for coming, and if you have any suggestions, don't be bashful. We're easy to reach, and we'll look for any kind of feedback. Thank you.
[Applause]