ICANN Meetings in Marrakech, Morocco
Workshop on IDN, Part 2
27 June 2006
Note: The following is the output of the real-time captioning taken during the Workshop on IDN held on 27 June 2006 in Marrakech, Morocco. Although the captioning output is largely accurate, in some cases it is incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid to understanding the proceedings at the session, but should not be treated as an authoritative record.
>>VINT CERF: LADIES AND GENTLEMEN, I WOULD LIKE TO GET THIS SESSION STARTED,
SO THOSE WHO DIDN'T WANT DETAIL AS TO INTERNATIONALIZED DOMAIN NAMES SHOULD
PROBABLY LEAVE THE ROOM OR PLUG IN YOUR HEAD SETS.
THE REST OF YOU, IF YOU ARE INTERESTED IN PARTICIPATING, PERHAPS YOU WOULD
LIKE TO TAKE YOUR SEATS NOW.
GOOD AFTERNOON, I AM VINT CERF, CHAIRMAN OF THE BOARD OF ICANN. I AM
SOMETHING OF AN INAPPROPRIATE PARTY TO INTRODUCE THIS SESSION BECAUSE IT IS
INTENDED TO BE SUBSTANTIALLY TECHNICAL, AND I AM THE LAST GUY IN THE WORLD TO
CLAIM THAT I HAVE AS DEEP A KNOWLEDGE OF INTERNATIONALIZED DOMAIN NAME
TECHNOLOGY AND IMPLEMENTATION AS THOSE SPEAKING TO YOU TODAY, BUT I AM VERY
PLEASED TO OPEN THE SESSION.
I HAVE TO SAY, ONCE AGAIN, THAT INTERNATIONALIZED DOMAIN NAMES, ALTHOUGH THEY
SEEM ON THE SURFACE TO BE QUITE SIMPLE, WHY DON'T WE JUST WRITE IN A
DIFFERENT SCRIPT, WHAT'S THE PROBLEM, FROM THE USER'S POINT OF VIEW, THAT'S
PROBABLY EXACTLY RIGHT.
THIS OUGHT TO BE JUST AS SIMPLE AS THE EXISTING INTERNET IS TODAY WHEN IT
USES ROMAN CHARACTERS.
BUT I HAVE TO SAY THAT AS YOU UNCOVER THE DETAILS, YOU DISCOVER THAT THE
SIMPLE CHARACTER SETS THAT HAD BEEN PART OF THE INTERNET FOR MOST OF ITS
LIFETIME ACTUALLY MADE IMPLEMENTATIONS MUCH, MUCH EASIER THAN WE NOW FACE
WITH THE INTRODUCTION OF INTERNATIONALIZED DOMAIN NAMES, THE COMPLEX UNICODE
CHARACTER SET, THE CHALLENGES OF MATCHING IN THAT CHARACTER SET, THE PROBLEM
OF FONTS AND SCRIPTS BECOMING INCREASINGLY CONFUSING FOR USERS, THE
DISTINCTION BETWEEN LOCALIZATION AND GLOBALIZATION. THE ASSURANCE THAT
END-TO-END PARTIES WILL BE ABLE TO COMMUNICATE WITH EACH OTHER, THAT DOMAIN
NAMES OF EVERY KIND WILL RESOLVE IN THE SAME WAY NO MATTER WHERE YOU ARE ON
THE INTERNET.
ALL OF THOSE THINGS ARE DEEPLY CHALLENGED BY THE INTRODUCTION OF IDNS.
IT'S PROBABLY FAIR TO SAY THAT IF WE HAD STARTED WITH SOME OF THESE CONCEPTS
AT THE VERY BEGINNING, PERHAPS IT WOULD HAVE BEEN EASIER. BUT I WOULD ARGUE
THAT SOME OF THESE PROBLEMS ARE FUNDAMENTAL IN THEIR CHARACTER, AND IT
DOESN'T MATTER WHEN YOU START. IT HAS A GREAT DEAL TO DO WITH THE COMPLEXITY
OF INTRODUCING THE MORE COMPLEX CHARACTER SETS. IT'S ALSO, I THINK, FAIR TO
SAY THAT JUST AS IN THE CASE OF THE 3166-1 LIST THAT WE USE FOR OUR CCTLD
IDENTIFIERS, WE RE-PURPOSED THAT LIST TO DO SOMETHING THAT IT WASN'T
ORIGINALLY DESIGNED TO DO.
WE'VE CHOSEN TO RE-PURPOSE THE UNICODE CODING SYSTEM TO DO SOMETHING WHICH IT
WAS NOT ORIGINALLY DESIGNED TO DO. AND THE SIDE EFFECT OF THAT IS THAT
SOMETIMES IT DOESN'T WORK THE WAY THAT WE WISH IT COULD OR WISH IT WOULD IN
ORDER TO SATISFY SOME OF THE RESTRICTIONS THAT THE DOMAIN NAME SYSTEM PLACES
ON OUR USE OF THAT CODING SYSTEM.
AND SO IT'S NO SURPRISE THAT FROM TIME TO TIME, OUR REPURPOSING IS AT CROSS
PURPOSES WITH THE ORIGINAL USE AND CONTINUING USE OF THE UNICODE CODE TABLES.
WE HAVE WITH US TODAY A SET OF EXPERTS WHOM I WILL ALLOW TO INTRODUCE
THEMSELVES IN THE INTEREST OF TIME AS THEY COME UP TO SPEAK.
WE'RE GOING TO START IN THE FIRST SESSION WITH A DISCUSSION OF TECHNICAL
ACTIVITIES RELATED TO IDN, AND IN THE SECOND SESSION WE HAVE A MORE LOCAL
FOCUS -- THAT IS, ON AFRICAN AND MIDDLE EAST IDN ACTIVITIES, WHICH SEEMS
APPROPRIATE GIVEN WHERE WE ARE TODAY IN MARRAKECH.
SO WITH THAT RATHER BRIEF INTRODUCTION, LET ME TURN THIS OVER TO TINA DAM,
WHO IS ON THE ICANN STAFF AND RESPONSIBLE FOR ICANN'S IDN EFFORTS.
SO TINA, ARE YOU HIDING BEHIND ME HERE? I'LL TURN IT OVER TO YOU. THANK YOU.
>>TINA DAM: THANK YOU VERY MUCH, MR. CHAIRMAN. I'M VERY PLEASED TO BE ABLE
TO GIVE YOU A STATUS UPDATE ON THE IDN ACTIVITIES RELATED TO ICANN. I'M
ACTUALLY NOT ON SCREEN, I GUESS.
IS STEVE CONTE IN THE ROOM? STEVE, I NEED TO....
>>VINT CERF: THIS IS WHERE WE NEED THE CAVALRY MUSIC ARRIVING.
>>TINA DAM: I GUESS I COULD HAVE DONE THAT MYSELF.
THERE WE GO.
OKAY.
SO ON THE STATUS UPDATE, WE'LL FIRST TAKE A LOOK AT ICANN ACTIVITIES
SPECIFICALLY SINCE THE LAST ICANN MEETING IN NEW ZEALAND.
THAT IS FOCUSED -- THAT IS FOCUSED AROUND THE TECHNICAL AND OPERATIONAL TEST
PLAN AND NOT SO MUCH ON THE POLICY-RELATED AREA AS THAT WAS DISCUSSED ON THE
SUNDAY'S WORKSHOP FOR THIS MEETING IN MOROCCO.
THEN I'LL EXPLAIN A LITTLE BIT ABOUT THE IDN PROGRAM PLAN THAT LIES WITHIN
ICANN AND IS MANAGED BY ICANN STAFF IN DIFFERENT AREAS.
AND THEN WE'LL STEP DOWN AGAIN AND TAKE A CLOSER LOOK INTO A DRAFT REVISED
PLAN FOR TECHNICAL AND OPERATIONAL TESTS. AND THAT IS ACTUALLY A NEW
PROPOSAL. AND I'M GOING TO UNDERLINE IT AGAIN LATER IN THIS PRESENTATION,
BUT THIS IS A STAFF -- THIS IS A STAFF-BASED PROPOSAL THAT NEEDS TO GO
THROUGH SOME VERY SPECIFIC PROCESS BEFORE IT CAN BE FINALIZED ON PARTICULAR
DISCUSSIONS WITH THE PRESIDENT'S ADVISORY COMMITTEE ON IDNS AND SO FORTH.
BUT I THOUGHT IT WOULD BE APPROPRIATE TO AT LEAST PROVIDE AN UPDATE ON WHAT
WE HAVE SO FAR, EVEN THOUGH IT'S NOT FINAL.
SO A QUICK OR SHORT REVIEW OF THE ACTIVITIES SINCE THE LAST ICANN MEETING.
AND AGAIN, THIS IS ON THE TECHNICAL TEST AREA.
AS YOU MAY RECALL, IN MARCH WE RELEASED, VIA THE PRESIDENT'S ADVISORY
COMMITTEE FOR IDNS, A PROPOSED PLAN FOR HOW TO PERFORM TECHNICAL TESTS OF
INSERTING IDN TLD LABELS IN THE ROOT ZONE.
THE PROPOSED PLAN WAS DISCUSSED THROUGHOUT ACTUALLY BOTH APRIL AND MAY. WE
HAD SOME INITIAL RSSAC DISCUSSIONS TO GET SOME FEEDBACK ON THE PROPOSAL.
THERE WERE SEVERAL COMMUNITY AND CONSTITUENCY DISCUSSIONS DURING THE ICANN
MEETING IN WELLINGTON, ALTHOUGH WE DID NOT HAVE A SPECIFIC WORKSHOP ON IDNS
AT THAT MEETING. IT WAS MORE LOCATED IN, LIKE, SMALLER GROUPS WITHIN THE
DIFFERENT CONSTITUENCIES. THE THOUGHT WAS TO GET A MORE FOCUSED REVIEW OF
THE PROPOSED PLANNING AND ISSUES THAT RELATE SPECIFICALLY AND TARGETED AT
SPECIFIC CONSTITUENCIES INSTEAD OF A BIG FORUM AS WE HAVE HERE THIS TIME.
DURING MAY, ICANN STAFF DISCUSSED ALL OF THE INPUT THAT WAS RECEIVED.
AS YOU MIGHT RECALL, IF YOU FOLLOWED THE PROPOSED PLAN FOR TESTING, THE CORE
PART OF THE PROPOSAL WAS TO TEST DNAMES AND NS RECORDS. SOME OF THE FEEDBACK
THAT WE GOT HAD TO DO WITH ROOT OPERATORS NOT RUNNING DNAME SOFTWARE.
SEVERAL CONCERNS ABOUT THE MATURITY OF THE DNAME FUNCTIONALITY AND WHETHER
THAT SHOULD NOT BE ANALYZED MORE DEEPLY BEFORE WE INITIATE ANY TESTING OF IT.
AND MORE FEEDBACK.
SO ICANN STAFF GATHERED THE FEEDBACK AND STARTED LOOKING AT HOW CAN WE REVISE
THE PROPOSED PLAN AND CHANGE IT SO THAT IT WORKS FOR THE COMMUNITY AND THOSE
WHO NEED TO TAKE PART IN THE TEST.
WE ALSO SOUGHT CONSULTATION FROM SEVERAL EXPERTS.
AND IN JUNE, WE THEN ENDED UP WITH A REVISED PROPOSED PLAN TO PRESENT TO
DIFFERENT STAKEHOLDERS. AND I WILL GET BACK TO THAT A LITTLE BIT LATER IN
THE PRESENTATION.
WE ALSO HAVE A PROCESS FOR FINALIZING THAT TEST PLAN, AND I WILL GIVE YOU A
VIEW OF THAT AS WELL.
ON THE IDN PROGRAM PLAN, AND THAT INCLUDES SOME OF THE WORK THAT WE DID IN
REVISING THE TECHNICAL TEST, WHAT WE'VE DONE INTERNALLY IS TO DEVELOP THIS
PROGRAM THAT WE CALL THE IDN PROGRAM PLAN.
IT ENCOMPASSES ALL ACTIONS AND DELIVERABLES THAT ARE NECESSARY TO DEPLOY IDNS
AS WE SEE IT AT THIS STAGE.
CERTAINLY, SOME OF THE DEEPER ANALYSIS AND TESTING WILL MAKE US GO BACK AND
REVISE SOME OF THESE ITEMS IN THE PROGRAM PLAN EVERY ONCE IN A WHILE THROUGH
THIS PROCESS, AS IS DEEMED NECESSARY FROM THOSE RESULTS THAT WE GET FROM
ANALYSIS AND TESTS.
HOWEVER, AT THIS POINT IN TIME, THE PROGRAM PLAN IS COMPRISED OF A SET OF
PROJECTS. THE PROJECTS ARE SOMEWHAT PLANNED SEPARATELY BUT THERE IS
CORRELATION BETWEEN THE MILESTONES AND THE TASKS WITHIN THE DIFFERENT
PROJECTS THAT HAS TO BE MANAGED ACROSS THE PROJECTS.
SO THE LIST IS TECHNICAL AND OPERATIONAL TESTS. POLICY DEVELOPMENT, AS I
MENTIONED, THAT WAS DISCUSSED IN THE SUNDAY'S WORKSHOP.
IDN GUIDELINES, IANA PROCESSES, OUTREACH PLANNING, AND COMMUNICATION
PLANNING.
AND AS I STARTED OFF BY SAYING, THIS -- IN THIS STATUS REPORT FOR YOU, I'M
GOING TO TAKE A FOCUS ON THE TECHNICAL AND OPERATIONAL TEST PLAN.
JUST TO MAKE SURE THAT WE'RE ALL ON THE SAME PAGE AND VIEW THIS TEST, THE
GOAL WITH THE TECHNICAL AND OPERATIONAL TEST THE SAME WAY, IT IS TO
DEMONSTRATE THAT INSERTION OF IDN STRINGS INTO THE ROOT HAS NO APPRECIABLE
NEGATIVE IMPACT ON EXISTING RESOLUTION. SO WE WANT TO MAKE SURE THAT THE DNS
STAYS STABLE AND SECURE WHEN WE ENTER IDN STRINGS INTO THE ROOT.
NOW, IT'S DIFFICULT TO PROVE A NEGATIVE, BUT WE CAN TAKE STEPS THAT ALLOW US
TO SAY THAT WE ARE REASONABLY CERTAIN THAT THERE ARE NO ISSUES.
AND IN ORDER TO REACH THAT GOAL, WE NEED TO GO THROUGH A COMBINATION OF
ACTIVITIES, WHICH WE'LL SHOW YOU HERE IN THE NEXT SLIDE.
NOW, THIS IS A PROPOSED REVISION OF THE TECHNICAL AND OPERATIONAL TEST
PROPOSAL THAT WE RELEASED IN MARCH. AND AS I MENTIONED EARLIER, THIS IS
GOING TO BE -- THE DETAILS OF THIS IS WITHIN A BRIEFING TO THE PRESIDENT'S
ADVISORY COMMITTEE FOR IDNS THAT WILL BE MEETING TOMORROW LATE AFTERNOON.
SO BEFORE WE HAVE ACTUALLY HAD THE DISCUSSION WITH THAT GROUP OF EXPERTS,
THIS IS AS MUCH AS I'M GOING TO GO THROUGH IN JUST A LITTLE BIT, IS WHAT I
HAVE FOR YOU TODAY.
NOW, THE PRESIDENT'S ADVISORY COMMITTEE IS GOING TO DISCUSS THE NEW PLAN THAT
STAFF HAVE BEEN WORKING ON TOGETHER WITH EXPERTS IN TRYING TO DEFINE IT MORE
CLOSELY OR TO SEE IF THERE'S SOMETHING THAT WE HAVE MISSED THAT NEEDS TO BE
DONE DIFFERENTLY.
THEN THERE IS GOING TO BE AN IETF MEETING IN MONTREAL FROM THE 9TH THROUGH
THE 14TH OF JULY. AT THAT MEETING, WE'RE IN THE PROCESS OF SETTING UP
SEVERAL MEETINGS WITH THE RSSAC AND THE ROOT OPERATORS AND ADDITIONAL -- AND
ALSO OTHER ADDITIONAL WORKING GROUPS WITHIN THE IETF TO DISCUSS THE DETAILS
OF THE PROPOSED PLAN.
THESE ARE ALL INDIVIDUALS WHO NEED TO BE INVOLVED IN THE PLAN, AND SO THIS IS
THE CONSULTATION PHASE THAT WE ARE GOING TO GO THROUGH OVER THE NEXT PERIOD
OF TIME BEFORE ANYTHING CAN BE FINALIZED.
SO THE WAY THE PROPOSED PLAN LOOKS NOW IS NS RECORDS BASED ON PUNYCODE
STRINGS WILL BE PERFORMED FIRST AS A TEST IN A LABORATORY.
THE DEFINITION OF THE ISSUES THAT ARE GOING TO BE TESTED AGAIN, AS I
MENTIONED, IS GOING TO BE SOMETHING WHERE WE'LL BE ASKING FOR INPUT FROM DNS
TECHNICAL AND OPERATIONAL EXPERTS AND THAT WILL GO THROUGH BOTH THE
PRESIDENT'S ADVISORY COMMITTEE AND THE IETF AS I OUTLINED.
THEN THE SUGGESTION IS TO PERFORM AN OPERATIONAL PROCESS TEST.
THE GOAL OF THIS TEST IS TO VERIFY THAT ALL OF THE PROCESSES FOR INSERTING
IDN TLDS IN THE ROOT ARE FUNCTIONING AND ARE IN PLACE. THAT INCLUDES ANY
ICANN PROCESSES, APPROVAL PROCESSES WITHIN THE ICANN BOARD, THE IANA
PROCESSES FOR INSERTING THE STRINGS, AND DOC APPROVAL. AND WE SIMPLY WANT TO
RUN THROUGH -- WE ARE SUGGESTING WE WANT TO RUN THROUGH AND TEST THAT OUT.
AND FINALLY, AS A LAST RESULT, TO DO THE DNS ROOT NAME SERVER TEST.
SO THAT'S ON THE NS RECORD SIDE.
ON THE DNAME SIDE, WE WILL GO THROUGH SOME DEEP ANALYSIS OF HOW TO FUNCTION
-- OR HOW THE FUNCTIONALITY OF DNAME RESOURCE RECORDS STAND CURRENTLY. AND
ALSO TO DEFINE ANY PRACTICAL IMPLICATION THAT MIGHT BE WITH THAT METHOD.
THAT ANALYSIS IS INTENDED TO SHOW WHETHER WE SHOULD PROCEED WITH SOME TESTING
OR WHETHER THAT'S PREMATURE TO DO THAT AT THIS POINT IN TIME.
SO THOSE TWO THINGS ARE GOING TO RUN IN PARALLEL TRACKS.
AND THIS IS MY LAST SLIDE. I MENTIONED QUITE A BIT ABOUT THE PROCESS BECAUSE
IT'S IMPORTANT FOR ME TO STRESS THAT THE DETAILS OF THE NEW PROPOSED PLAN
NEED TO GO THROUGH A LOT OF CONSULTATION.
TOMORROW, WITH THE PRESIDENT'S ADVISORY COMMITTEE ON IDNS, THERE NEEDS TO BE,
NATURALLY, A REVIEW BY ICANN'S EXECUTIVE MANAGEMENT AND APPROVAL. AND MOVING
FORWARD -- SO THERE WILL BE SOME DISCUSSION HERE TODAY, AND THEN MOVING
FORWARD TO THE IETF MEETING IN MONTREAL, AFTER WHICH WE'LL PUBLISH IT FOR
PUBLIC COMMENT, AND AT THE END FINALIZE THE TEST PLAN AND MAKE IT AVAILABLE
FOR EVERYBODY.
AND THAT CONCLUDES MY STATUS REPORT FOR TODAY.
>>VINT CERF: THANK YOU VERY MUCH, TINA. IT STRIKES ME THAT THIS IS ONE OF
THOSE PROJECTS WHERE WE BETTER GET IT RIGHT, BECAUSE ONCE WE LAUNCH IT, IT
WILL BE REAL HARD TO UNDO.
SO ONE OF THE REASONS FOR DOING A LOT OF ANALYSIS AND TESTING UP FRONT IS
PRECISELY TO AVOID THE POSSIBILITY THAT WE CAN'T BACK OUT.
MR. KLENSIN -- I BEG YOUR PARDON. DR. KLENSIN, WOULD YOU CARE TO OFFER
YOUR VIEW ON TECHNICAL ISSUES WITH THE IDNS FOR THE "NTH" TIME?
>>JOHN KLENSIN: I THINK MY FIRST COMMENT, VINT, THANK YOU FOR THE
INTRODUCTION, IS THAT I HAD HOPED THAT THE TALK I AM GOING TO GIVE TODAY WE
WOULDN'T HAVE TO GIVE, EVER.
WHAT WE STARTED TO REALIZE ABOUT A YEAR AGO AFTER IDNS HAD BEEN DEPLOYED IN
SOME PLACES FOR ABOUT TWO YEARS, IS WE DID NOT EXACTLY GET IT RIGHT. AND
THERE WERE A NUMBER OF ISSUES THAT WERE NOT ADDRESSED THE FIRST TIME AROUND
THAT WE NEED TO COME BACK AND LOOK AT AGAIN, BOTH FROM THE TECHNICAL SIDE AND
THE POLICY SIDE. WE NEED TO
UNDERSTAND THE IMPLICATIONS OF LEAVING THINGS AS THEY ARE, WHICH IS A
POSSIBILITY, AND WE NEED TO UNDERSTAND THE IMPLICATIONS OF CHANGING THINGS,
WHICH MAY NOT BE A WHOLE LOT OF FUN.
BUT I THINK PART OF THE IMPORTANT MESSAGE THAT I'D LIKE YOU TO TAKE AWAY FROM
WHAT I'M GOING TO SAY TODAY IS THAT WE'RE AT A STATE OF IDN DEPLOYMENT NOW
THAT WE PROBABLY HAVE ONE CHANCE TO GET THE REST OF THE LOOSE DETAILS RIGHT,
BUT WE DON'T HAVE TWO CHANCES. IF WE GET THIS SERIOUSLY WRONG, WE ARE AT A
SERIOUS RISK OF ENDING UP WITH A DNS THAT WE CAN'T USE ANYMORE, AND THAT
WOULD BE PRETTY HARD ON THE INTERNET.
THIS PRESENTATION IS VERY MUCH MORE ABOUT ISSUE IDENTIFICATION THAN IT IS
ABOUT SOLUTIONS. I'M GOING TO BE TALKING A LITTLE BIT ABOUT PATHS TOWARDS
SOLUTIONS BUT NOT SOLUTIONS.
THESE ARE ISSUES THAT HAVE COME UP, THAT HAVE BEEN IDENTIFIED IN DISCUSSION,
AND THAT WE NEED TO LOOK AT.
THE IAB TRIED TO IDENTIFY WHO SHOULD LOOK AT THESE ISSUES.
SOME OF THESE DO NOT HAVE SOLUTIONS OTHER THAN EDUCATION AND AWARENESS OF
REGISTRIES, OF REGISTRARS, OF REGISTRANTS AND OF END USERS. AND SOME OF THEM
HAVE SOLUTIONS ONLY OF GETTING THE PROBLEM OUT OF THE DNS AND SOLVING IT
SOMEWHERE ELSE OR DECIDING WE DON'T CARE WHETHER OR NOT WE SOLVE THOSE
PROBLEMS.
FOR THOSE OF YOU WHO HAVE NOT BEEN EATING, BREATHING AND SLEEPING THESE
PROBLEMS, AN IDN IS AN INTERNATIONALIZED DOMAIN NAME. WE ARE SLOPPY ABOUT
THE TERMINOLOGY, BUT WHAT WE'RE USUALLY TALKING ABOUT WHEN WE TALK ABOUT AN
IDN IS ONE LABEL IN A DNS NAME WHICH CONSISTS OF MULTIPLE LABELS.
THERE IS AN ISO STANDARD CALLED 646. IT IS ROUGHLY EQUIVALENT TO WHAT WE
CALL ASCII IN THE U.S., AND WHAT THE ITU CALLS RECOMMENDATION T.50 AND USED
TO CALL INTERNATIONAL ALPHABET 5. IT CONSISTS OF UPPER AND LOWER CASE
UNDECORATED ROMAN-DERIVED ALPHABETIC CHARACTERS, SOME DIGITS AND SOME SPECIAL
CHARACTERS. THE ONLY SPECIAL CHARACTER WE HAVE HISTORICALLY PERMITTED IN DNS
IS THE HYPHEN, PLUS THE PERIOD WHICH YOU USUALLY SEE AS SEPARATING LABELS.
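As an illustration of the letters, digits and hyphen rule just described, here is a minimal Python sketch. The function names are hypothetical, and the extra limits it applies (at most 63 characters per label, no leading or trailing hyphen) are assumptions taken from the usual host name conventions, not from the remarks above.

```python
import re

# One label: letters, digits and hyphens only, 1 to 63 characters,
# not starting or ending with a hyphen (assumed host name convention).
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if a single label uses only the classic ASCII repertoire."""
    return _LDH_LABEL.fullmatch(label) is not None


def is_ldh_name(name: str) -> bool:
    """Split a name on the period separator and check every label."""
    labels = name.rstrip(".").split(".")
    return all(is_ldh_label(label) for label in labels)


if __name__ == "__main__":
    for candidate in ["www.example.com", "b\u00fccher.example", "-bad.example"]:
        print(candidate, is_ldh_name(candidate))
```

A name containing any character outside that repertoire fails the check, which is why an IDN needs an ASCII encoding before it can be stored.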
THERE IS ALSO SOMETHING CALLED THE UNIFORM RESOURCE LOCATOR. YOU HAVE
PROBABLY ALL SEEN THEM. THERE'S A MORE GENERAL FORM CALLED AN IRI WHICH CAN
ACCOMMODATE NON-ASCII CHARACTERS BUT THEY BOTH HAVE THE PROPERTY THAT THEY
HAVE GOT ASCII CHARACTERS IN STRUCTURE WHICH WON'T GO AWAY, AND NOTHING WE DO
WITH IDNS IS GOING TO CAUSE THOSE CHARACTERS TO GO AWAY.
THERE ARE WAYS OF GETTING THOSE CHARACTERS OUT OF THE SIGHT OF THE USERS BUT
THEY ARE NOT DNS SOLUTIONS.
EXAMPLES I'M GOING TO USE TODAY ARE MOSTLY BASED ON ROMAN CHARACTERS FOR
CONVENIENCE, BUT ONE OF THE THINGS WE HAVE DISCOVERED AS WE HAVE LOOKED AT A
WIDE VARIETY OF OTHER SCRIPTS IS THAT EVERY SCRIPT HAS ITS OWN UNIQUE AND
INTERESTING PROBLEMS.
DISCUSSION ABOUT IDNS AT THIS POINT HAS BECOME A MIX OF TECHNICAL ISSUES AND
IMPLEMENTATION, OF USER INTERFACE AND INTERNET NAVIGATION ISSUES, OF GETTING
TO WHERE WE WOULD HAVE GOTTEN WERE WE STARTING TODAY.
AGAIN, THEY GET INVOLVED WITH ISSUES ABOUT COMPETITION AND PROFITABILITY, AND
THEY GET INVOLVED WITH ISSUES ABOUT SOCIAL, NATIONAL AND POLITICAL SYMBOLS OF
THINGS WHICH MAY NOT ACTUALLY HAVE ANYTHING TO DO WITH USERS ON THE INTERNET
ACTUALLY USING THE NETWORK.
THIS TALK FOCUSES ONLY ON THE FIRST THREE. I AM NOT GOING TO TALK ABOUT THE
LAST TWO AT ALL, AND I WANT TO STRESS THIS IS NOT A TECHNICAL PRESENTATION.
WHERE THINGS START TO GET SERIOUSLY TECHNICAL, I HAVE HIDDEN THEM, GLOSSED
OVER THEM OR ENGAGED IN HAND WAVING. THERE IS ONE PARTICULARLY IMPORTANT
EXAMPLE OF THAT WHICH WE WILL GET TO.
THIS PRESENTATION IS DRAWN FROM A RECENT INTERNET ARCHITECTURE BOARD REPORT
CALLED REVIEW AND RECOMMENDATIONS FOR INTERNATIONALIZED DOMAIN NAMES.
IT WAS APPROVED FOR PUBLICATION WITHIN THE LAST COUPLE OF WEEKS. THE WORKING
DRAFT, WHICH IS THE BASIS FOR THE PUBLISHED FORM, IS AT THIS URL, IF YOU CAN
READ IT ON THE SCREEN.
IT'S ALSO DRAWN FROM SOME RELATED INTERNATIONALIZATION AND INTERNET
NAVIGATION WORK AND FROM SOME PERSONAL IMPRESSIONS. AND IF YOU DIDN'T KNOW
WHICH IS WHICH, ASK ME WHEN WE GET TO QUESTIONS.
IDNS ARE THE SOLUTION TO A PROBLEM. IF THEY WEREN'T PERCEIVED AS A SOLUTION
TO SOME PROBLEM, WE PRESUME WE WOULDN'T BE TALKING ABOUT THEM HERE.
THE PROBLEM TO WHICH THEY ARE THE SOLUTION IS BETTER MNEMONIC VALUE FOR NAMES
FOR PEOPLE USING SCRIPTS OTHER THAN THE BASIC ROMAN-BASED ONES THAN THEY GET
FROM USING ROMAN-BASED SCRIPTS.
IF YOUR NORMAL LANGUAGE IS CHINESE, YOU PROBABLY WANT YOUR DOMAIN NAMES TO BE
IN CHINESE SO THAT YOU CAN REMEMBER THEM MORE EASILY THAN REMEMBERING NAMES
DERIVED FROM ENGLISH OR FRENCH OR GERMAN OR SOME OTHER LANGUAGE WHICH USES
LATIN-BASED CHARACTERS. THEY MAY BE A SOLUTION TO NATIONAL PRIDE AND
RECOGNITION. I DON'T KNOW HOW TO TALK ABOUT THAT IN THE CONTEXT OF A
TECHNICAL PRESENTATION SO I'M NOT GOING TO SAY ANYTHING MORE ABOUT IT.
THERE ARE A NUMBER OF OTHER PROBLEMS TO WHICH THERE ARE NO IDN SOLUTIONS, AND
NO IDN SOLUTION IS GOING TO SOLVE THOSE PROBLEMS. IDNS DO NOT MAKE CONTENT
AVAILABLE ON THE INTERNET. IDNS DO NOT PROVIDE CONNECTIVITY AND ACCESS TO
THE INTERNET FOR ANYBODY.
IDNS DO NOT MAKE URLS USER FRIENDLY. URLS, FROM A USER STANDPOINT, ARE ONE OF
THE WORST IDEAS WE EVER CAME UP WITH.
FORTUNATELY, FROM A USER INTERFACE STANDPOINT, THEY'RE AN ACCIDENT AND NO ONE
IS TO BLAME.
AND IDNS DON'T HELP US UNDERSTAND EACH OTHER'S LANGUAGES, ALTHOUGH THAT WOULD
BE REALLY GREAT IF THEY DID.
ANYTHING WE DO WITH IDNS IS, TO SOME EXTENT, CONTROLLED BY CONSTRAINTS THE
DNS ITSELF IMPOSES.
THE DNS PERMITS ONLY EXACT MATCHING.
WE CANNOT ASK THE DNS A QUESTION AND GET BACK AN ANSWER OF "CLOSE ENOUGH," OR
AN ANSWER OF "DID YOU REALLY MEAN THIS OTHER THING?"
AND THAT MAKES THINGS HARD.
ANOTHER CONSTRAINT IS DNS WAS DESIGNED TO BE SORT OF CASE-INSENSITIVE.
IT'S SORT OF CASE-INSENSITIVE BECAUSE IN ISO 646, IN ASCII, THE DEFINITION OF
HOW YOU GET FROM UPPER CASE TO LOWER CASE OR LOWER CASE TO UPPER CASE IS VERY
EASY.
IT DOESN'T REQUIRE TABLES.
IT DOESN'T REQUIRE MATCHING, AND IT DOESN'T REQUIRE SUBJECTIVE DECISIONS.
TO BE ABSOLUTELY STRICT ABOUT IT, DNS IS CASE SENSITIVE IN WHAT IT STORES.
IF YOU PUT SOMETHING IN IN MIXED CASE, IT COMES BACK OUT IN MIXED CASE.
AND IT'S CASE-SENSITIVE IN ITS REPLIES TO QUERIES, BUT THE QUERIES THEMSELVES AND
THE MATCHING ARE CASE-INSENSITIVE.
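The point that ASCII case matching needs no tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and real DNS software works on wire-format labels, not on Python byte strings.

```python
def ascii_lower(label: bytes) -> bytes:
    """Lowercase only the bytes for 'A' to 'Z' by setting bit 0x20.

    No lookup table and no language knowledge is involved.
    """
    return bytes((b | 0x20) if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(stored: bytes, queried: bytes) -> bool:
    """Exact match after ASCII case folding; there is no 'close enough'."""
    return ascii_lower(stored) == ascii_lower(queried)


if __name__ == "__main__":
    stored = b"ExAmPlE"                        # stored and returned in mixed case
    print(labels_match(stored, b"example"))    # differs only in case
    print(labels_match(stored, b"examp1e"))    # one character differs
    print(stored)                              # the stored form is untouched
```

The comparison folds case on the fly and leaves the stored label as it was entered, which mirrors the store-as-given, match-without-case behavior described above.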
WE CANNOT FIGURE OUT A WAY TO REPLICATE THAT MODEL WITH IDNS WITHOUT SOLVING
TWO PROBLEMS, ONE OF WHICH IS TO ACTUALLY PUT THE IDNS INTO THE DNS RATHER
THAN CODING THEM.
AND HAD WE DONE THAT, WE WOULD BE A NUMBER OF YEARS FURTHER AWAY FROM HAVING
IDNS WORKING.
AND THE OTHER PROBLEM WE WOULD HAVE TO SOLVE IS TO FIGURE OUT IN ALL LANGUAGE
-- LANGUAGES WHICH HAVE CASE HOW TO MAKE THE CASE-MAPPING UNAMBIGUOUS AND
CLEAR.
AND WE DO NOT KNOW HOW TO DO THAT WITHOUT GETTING VERY SPECIFIC ABOUT
LANGUAGES RATHER THAN ABOUT CHARACTERS AND SCRIPTS.
LANGUAGES AND, IN SOME CASES, THE COUNTRIES AND AREAS WHERE THEY'RE USED.
ANOTHER DNS CONSTRAINT IS THE DNS WORKS IN TERMS OF CHARACTER STRINGS.
IT DOESN'T KNOW ANYTHING ABOUT LANGUAGES OR SCRIPTS IN ANY OF ITS STORAGE
ALGORITHMS, IN ANY OF ITS MATCHING ALGORITHMS.
THE OTHER SET OF CONSTRAINTS ARE MUCH HARDER TO UNDERSTAND FROM A
NON-TECHNICAL LEVEL, BUT NOT LESS IMPORTANT.
THERE ARE SOME EXTREMELY SUBTLE ISSUES HAVING TO DO WITH THE WAY THE DNS
OPERATES AS A STRICT ADMINISTRATIVE HIERARCHY.
THERE ARE THINGS YOU CAN'T DO IN TERMS OF TRANSFERRING INFORMATION ACROSS THE
DNS FROM ONE TREE TO ANOTHER TREE, FROM ONE TOP-LEVEL DOMAIN TO A
SECOND-LEVEL DOMAIN OR ANOTHER ONE.
THE ALIASING MECHANISM IS VERY, VERY INFLEXIBLE COMPARED TO WHAT WE'RE USED
TO IN, FOR EXAMPLE, A BIBLIOGRAPHIC SYSTEM.
WE CAN'T HAVE THE DNS COME BACK AND SAY, "IN ADDITION TO WHAT I TOLD YOU, SEE
ALSO THIS OTHER STUFF," AS PART OF AN ALIASING MECHANISM.
AND THEN THERE ARE A NUMBER OF ISSUES WHICH ARE TECHNICALLY COMPLEX AND SUBTLE
SO MUCH SO THAT WE GET INTO DEBATES AS TO WHETHER THE NUMBER OF PEOPLE IN THE
WORLD WHO COMPLETELY UNDERSTAND THEM NEED MORE THAN TWO HANDS TO COUNT THEM.
WHEN YOU START TALKING TO DNS EXPERTS ABOUT SOME OF THESE ISSUES IN ALIASING
AND SOME OF THE ISSUES ASSOCIATED WITH IDNS, SOMEBODY SAYS SOMETHING LIKE,
"RR SET CONSISTENCY."
I THINK IT IS A SAFE BET THAT THERE ARE NOT MORE THAN FIVE PEOPLE IN THIS
ROOM WHO REALLY UNDERSTAND WHAT THAT MEANS AND COULD EXPLAIN IT PRECISELY TO
OTHER PEOPLE.
I AM NOT ONE OF THEM.
WHAT HAPPENS WHEN WE START TALKING ABOUT IDNS IS WE START TALKING ABOUT NAMES.
AND AS I SAID EARLIER, THE DNS IS REALLY NOT VERY GOOD AT NAMES.
BUT WHEN WE TALK ABOUT NAMES, WE TALK ABOUT NAMES IN THE CONTEXT OF LANGUAGES
AND DIALECTS AND SCRIPTS.
AND THEY ARE A COMPLICATED BUSINESS, AND THEY HAVE BEEN A COMPLICATED BUSINESS
FOR CENTURIES.
THE RELATIONSHIPS, THE MATCHING RULES, WHAT FITS INTO ONE SCRIPT OR DOESN'T
FIT INTO ONE SCRIPT, HOW THINGS OVERLAP, WHEN SOMETHING STOPS BEING A
LANGUAGE AND STARTS BEING A DIALECT, WHEN TO USE THE OFFICIAL ORTHOGRAPHY AND
WHEN OTHER ORTHOGRAPHIES COUNT BETTER.
THAT'S A QUESTION ABOUT WHEN AS WELL AS WHERE.
AND THESE RELATIONSHIPS CAN BE DEBATED PASSIONATELY.
IN SOME COUNTRIES, THERE ARE OFFICIAL BODIES WHICH DETERMINE WHAT THE
LANGUAGE IS, AND EVERYBODY IGNORES THEM AND THEY GET UPSET.
AND IN OTHER COUNTRIES, THERE ARE NO OFFICIAL BODIES AND EVERYBODY IGNORES
THE NONEXISTENT RULES AND GETS UPSET.
IT'S IMPORTANT TO UNDERSTAND THAT THERE ARE OFTEN NO CLEAR ANSWERS TO THESE
QUESTIONS, AND EXPECTING IDNS TO RESOLVE THESE QUESTIONS IS NOT GOING TO GET
US ANYWHERE.
THESE THINGS INVOLVE SUBJECTIVE DECISIONS OFTEN, EVEN A QUESTION OF WHAT
CHARACTERS MATCH INVOLVES SUBJECTIVE DECISIONS.
WHAT WE KNOW ABOUT THESE KINDS OF DECISIONS IS THAT PEOPLE ARE MUCH BETTER AT
THEM THAN COMPUTERS.
THAT CONTEMPORARY RULE-BASED COMPUTER SYSTEMS THAT YOU CAN FEED A LOT OF
RULES AND A LOT OF CONTEXT ARE BETTER AT THEM THAN THE DNS.
AND THE DNS DOESN'T EVEN HAVE THE INFORMATION NECESSARY TO TRY MOST OF THE
APPROACHES ONE WOULD NEED IF ONE WANTED TO ADDRESS THESE QUESTIONS.
BUT WE REGULARLY GET INTO DISCUSSIONS ABOUT IDNS IN WHICH SOME OF THESE
QUESTIONS TURN OUT TO BE NECESSARY TO THE ANSWERS.
WE CAN'T FIX THAT.
AND IF LINGUISTIC CORRECTNESS IS THE QUESTION, THEN IDNS IN THE DNS ARE NOT
THE ANSWER.
SO WE'VE GOT A STANDARD CALLED IDNA.
IT TAKES IDNS, AND ENCODES THEM IN THE DNS.
IT DOES SO IN ESSENCE IN TWO STAGES.
THE FIRST OF THOSE STAGES IS CALLED NAMEPREP IN THE INTERNAL GEEK
TERMINOLOGY.
AND IT TAKES A UNICODE STRING AND TURNS IT INTO ANOTHER UNICODE STRING, WHICH
IS SIMPLIFIED ENOUGH THAT MAPPING IS POSSIBLE.
WITHOUT THAT STEP, MANY CHARACTERS LOOK LIKE MANY DIFFERENT CHARACTERS TO
UNICODE BUT LIKE THE SAME CHARACTER TO PEOPLE.
IF THIS -- IF THINGS WHICH LOOK LIKE THE SAME CHARACTER TO PEOPLE TO A
REASONABLE APPROXIMATION DO NOT LOOK LIKE THE SAME CHARACTER TO THE DNS, WE
GET LOTS AND LOTS OF UNPLEASANT SURPRISES.
AND THE SECOND STEP, ONCE THAT'S DONE, IS THE UNICODE STRING IS TURNED INTO A
STRING WHICH CAN BE STORED IN THE DNS.
AND THAT STRING WHICH IS STORED IN DNS, WHICH THE GEEK COMMUNITY CALLED
PUNYCODE, LOOKS TO THE DNS LIKE A CLASSICAL ASCII NAME WHICH CONTAINS ONLY
LETTERS, DIGITS, AND HYPHENS.
ITS IMPORTANT PROPERTY IS, SINCE THESE THINGS ARE USED IN THE DNS, WE HOPE,
ONLY TO REPRESENT IDNS, IT IS POSSIBLE TO NOTICE THAT ONE OF THEM IS AN IDN
AND MAP IT BACK INTO THE ORIGINAL CHARACTER FORM.
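The two stages described above can be sketched with Python's standard library, which ships an implementation of the original (2003) IDNA specification. The helper names `to_dns_form` and `from_dns_form` are hypothetical; `encodings.idna.nameprep` and the `punycode` codec are the standard-library pieces assumed here. The sketch handles one label at a time and skips the length and hyphen checks a full implementation performs.

```python
from encodings import idna

ACE_PREFIX = b"xn--"  # marks a label as an encoded IDN


def to_dns_form(label: str) -> bytes:
    """Stage 1: Nameprep (Unicode to simplified Unicode).
    Stage 2: Punycode (Unicode to letters, digits and hyphens)."""
    prepared = idna.nameprep(label)
    if prepared.isascii():
        return prepared.encode("ascii")   # nothing to encode
    return ACE_PREFIX + prepared.encode("punycode")


def from_dns_form(label: bytes) -> str:
    """Notice the prefix and map the label back to its character form."""
    if label[:4].lower() == ACE_PREFIX:
        return label[4:].decode("punycode")
    return label.decode("ascii")


if __name__ == "__main__":
    stored = to_dns_form("B\u00dcCHER")
    print(stored)                  # what the DNS actually stores
    print(from_dns_form(stored))   # what an IDN-aware application displays
```

The round trip does not return the original capitalization: Nameprep folds case before encoding, so the stored form carries only the simplified string.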
THE STANDARD IS A LITTLE BIT OVER THREE YEARS OLD.
THE STANDARD AND THE DEFINITION OF THE STANDARD AND THE EXPECTATIONS OF THE STANDARD
TURNED OUT TO BE A LITTLE NAIVE IN SEVERAL WAYS.
FROM THE PERSPECTIVE OF THE COMMUNITY WHICH HAS BEEN WORKING ON THIS
TECHNICALLY, THE FIRST-GENERATION ICANN POLICY STATEMENTS AND PLANS TURNED
OUT TO BE A WHOLE LOT MORE NAIVE THAN THE STANDARD ITSELF.
THIS IS A SITUATION WHICH WE NEED TO FIX ON BOTH SIDES OF THIS ARTIFICIAL
BARRIER WE'VE CREATED.
SO WE OFTEN GET ASKED WHAT THE REAL PROBLEMS ARE IN IMPLEMENTING IDNA. AND
IT'S A VERY SHORT LIST.
IF YOU IGNORE THE POLICY ISSUES AND MOST OF THE THINGS THAT I'VE BEEN TALKING
ABOUT FOR THE LAST FIVE MINUTES, THERE ARE NO PROBLEMS WITH IMPLEMENTING
IDNA, IT TURNS OUT TO BE VERY SIMPLE.
THE BIGGEST PROBLEM IN MOST CASES IS GETTING AROUND TO DOING IT.
RIGHT NOW, IN ALL OF THE MAJOR INTERNET WEB BROWSERS IN THE WORLD IN THE
PRODUCTION VERSIONS, BUT ONE, WE'VE GOT FULL IDNA IMPLEMENTATIONS.
THE ONE IS COMING ALONG.
IN OTHER APPLICATIONS, THERE'S VERY LITTLE SUPPORT SO FAR, BECAUSE OTHER WORK
COMES FIRST.
WE CAN SUPPORT IDNA FAIRLY EASILY IN A MAIL CLIENT.
BUT UNTIL WE'VE GOT INTERNATIONALIZED ADDRESSES, THERE'S VERY LITTLE POINT IN
DOING SO.
SO VERY FEW PEOPLE HAVE OPENED UP MAIL CLIENTS TO DO THAT WORK.
SAME ANSWER FOR A SERIES OF OTHER PROTOCOLS.
WHEN WE START TALKING ABOUT USING IDNA RATHER THAN IMPLEMENTING IDNA,
ESPECIALLY IF THE QUESTION OF USING IT MERELY IS ANSWERED IN TERMS OF
REGISTERING THINGS AND PUTTING THEM INTO ZONE FILES AND THEN TAKING THEM BACK
OUT AGAIN, WE RUN INTO A SERIES OF PROBLEMS MANY OF WHICH HAVE GOTTEN A LOT
OF DISCUSSION AND DEBATE OVER THE LAST YEAR TO 18 MONTHS.
WE'VE RUN INTO PROBLEMS WITH CHARACTER SPOOFING AND SIMILARITIES.
IN THE MOST GENERAL SENSE, THESE CANNOT BE FIXED TECHNICALLY, ALTHOUGH THERE
ARE WAYS OF HELPING WITH IT.
IT'S VERY HARD TO DESIGN POLICIES WHICH HELP FOR MANY CASES WITHOUT
PREVENTING THINGS WHICH USERS LEGITIMATELY WANT TO DO.
AND IT'S IMPOSSIBLE TO PREVENT ALL THE CASES.
IF WE WANT TO GO VERY FAR IN THIS DIRECTION, WE HAVE TO START ANSWERING THE
QUESTION AS TO WHETHER OR NOT SOMEBODY HAS THE RIGHT TO A PARTICULAR NAME IF
THEY LIKE IT AND USE IT IN EVERYDAY LIFE AND IT'S HARD ON THE DNS AND HARD ON
USERS.
IT'S A VERY, VERY DIFFICULT QUESTION TO ANSWER.
IT NEEDS TO BE ADDRESSED AT SOME POINT, EITHER GENERALLY OR ON A
REGISTRY-BY-REGISTRY BASIS.
WE HAVE PROBLEMS WITH TRANSCRIPTION FROM WRITTEN FORM.
IF YOU LOOK AT A STRING SOMEHOW ON A BILLBOARD OR ON THE SIDE OF A LORRY, HOW
TO GET THAT INTO DNS IN AN UNAMBIGUOUS WAY MAY REQUIRE MORE INFORMATION THAN
YOU HAVE.
OUR HUMAN EXPECTATIONS, WHICH IS WHAT I'VE JUST BEEN TALKING ABOUT, AND
EXPECTATIONS ABOUT THE DNS ARE DIFFERENT.
THE DNS IS MUCH, MUCH LESS FLEXIBLE.
WE NEED TO EDUCATE OURSELVES ABOUT WHAT'S POSSIBLE IN THIS IDN WORLD.
DOES THE CHARACTER O WITH A SLASH MATCH O WITH TWO DOTS ABOVE IT?
DO THOSE TWO CHARACTERS MATCH?
ANSWER IS, IN SOME LANGUAGES, IN SOME CONTEXTS IN SOME COUNTRIES, YES.
IN OTHER LANGUAGES, IN OTHER CONTEXTS, IN OTHER COUNTRIES, THEY DON'T LOOK
ALIKE, THEY DON'T ACT ALIKE, THEY DON'T SOUND ALIKE, THEY MAY NOT BOTH EXIST
IN THE LANGUAGE, AND, HENCE, NO.
BUT USERS IN THE COUNTRIES WHICH THINK THEY MATCH WILL EXPECT THEM TO MATCH.
AND WE'RE GOING TO NEED TO EDUCATE THEM THAT THAT DOESN'T WORK.
DOES THE WORD "MASS" IN GERMAN SPELLED WITH ESZETT MATCH THE WORD "MASS"
SPELLED WITH DOUBLE S?
SAME ANSWER.
THE USERS UNDER SOME CONTEXTS AND CIRCUMSTANCES WILL EXPECT IT TO MATCH.
IDNA CANNOT MAKE THEM MATCH.
DOES THE CHARACTER O FOLLOWED BY THE CHARACTER E MATCH THE CHARACTER OE?
IN SOME LANGUAGES, YES, IN SOME LANGUAGES AND CONTEXTS, NO.
WE CANNOT FIX THIS IN THE DNS.
THE ONLY SOLUTION TO THIS PROBLEM IS GOING TO BE EDUCATION ABOUT WHAT NOT TO
EXPECT.
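A short Python sketch can show where mechanical matching stops. It uses Unicode NFKC normalization from the standard `unicodedata` module as a stand-in for the normalization step inside Nameprep; that substitution is an assumption of this sketch, and the helper name is hypothetical. Normalization unifies two encodings of the same character, but it does not treat linguistically related characters as equal.

```python
import unicodedata


def same_after_normalization(a: str, b: str) -> bool:
    """Compare two strings after Unicode NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    cases = [
        ("o\u0308", "\u00f6"),   # o + combining dots vs. precomposed o with dots
        ("\u00f8", "\u00f6"),    # o with slash vs. o with two dots
        ("oe", "\u0153"),        # o followed by e vs. the single oe character
    ]
    for left, right in cases:
        print(ascii(left), ascii(right), same_after_normalization(left, right))
```

Only the first pair compares equal. The other two are separate characters as far as the encoding is concerned, whatever readers of a particular language expect.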
IT IS FAR EASIER TO MATCH THE U.S. SPELLING OF THE WORD COLOR WITH THE
BRITISH SPELLING OF THE WORD COLOUR THAN TO RESOLVE THE QUESTIONS THAT I HAVE
JUST BEEN ASKING, BUT THE DNS CAN'T DO THAT, EITHER.
ALTHOUGH IF THERE'S ENOUGH OF A BUSINESS OPPORTUNITY IN THAT MATCH, SOMEBODY
WILL BE OUT THERE SELLING EVERY POSSIBLE EUROPEAN SPELLING AS A NAME YOU HAVE
TO HAVE IF YOU'VE GOT THE U.S. SPELLING, LEST SOMEONE STEAL YOUR NAME OR
VIOLATE YOUR TRADEMARK OR CONFUSE YOUR CUSTOMERS.
AS SOON AS THE CHARACTERS GET MUCH MORE COMPLICATED THAN 646, CASE-MATCHING
BECOMES IMPRECISE AND REQUIRES TABLES.
WE'VE GOT SITUATIONS IN WHICH IF WE TAKE A CHARACTER AND MAP IT TO UPPER
CASE, IT MAPS TO ONE FORM IN ONE LANGUAGE AND ANOTHER FORM IN ANOTHER
LANGUAGE, OR TO DIFFERENT FORMS IN THE SAME LANGUAGES USED IN DIFFERENT
COUNTRIES.
DNS CANNOT FIGURE THIS STUFF OUT.
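One well-known instance of language-dependent case mapping, offered here as an example the speaker did not name, is the letter "i" in Turkish and Azerbaijani, whose upper-case form carries a dot. The sketch below uses a hypothetical one-entry table and a hypothetical `upper_for` helper to show that the result depends on a language tag the DNS never sees.

```python
# Hypothetical, deliberately tiny language-specific table:
# in Turkish and Azerbaijani, "i" upper-cases to dotted capital I (U+0130).
TURKIC_UPPER = {"i": "\u0130"}


def upper_for(text: str, lang: str) -> str:
    """Upper-case text, applying the language-specific table first."""
    if lang in ("tr", "az"):
        text = "".join(TURKIC_UPPER.get(ch, ch) for ch in text)
    # Python's str.upper() applies the default, language-neutral mapping.
    return text.upper()


if __name__ == "__main__":
    print(ascii(upper_for("istanbul", "en")))
    print(ascii(upper_for("istanbul", "tr")))
```

The same input string produces two different upper-case strings, so a case-insensitive match would need both the table and the language, and a DNS query carries neither.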
ONE OF THE ADVANTAGES WE HAD WHEN DEFINING ASCII IN 646 IS THE CODE SPACE IS
VERY SMALL.
WE PUT ALL THE CHARACTERS IN IT WE WERE EVER GOING TO GET IN IT AND THEN WE
STOPPED.
SO THERE WAS NEVER AN ISSUE, NEVER A POSSIBILITY OF ADDING THINGS TO THAT
TABLE.
SO WE DIDN'T HAVE TO WORRY ABOUT UPGRADING NEW VERSIONS, IN THEORY.
IN PRACTICE, WE DID, TOO, THERE.
BUT IT WASN'T MAJOR.
MATCHING NEW CHARACTERS AND OLD CHARACTERS AND NEW AND OLD TABLES WITH THE
WAY IDNA WORKS TODAY IS GOING TO BE VERSION SENSITIVE, AND VERSION SENSITIVE
IN BOTH IDNA AND UNICODE.
AND WHEN MATCHING IS IN THE EYE OF THE BEHOLDER, VERSION SENSITIVITY ITSELF
IS NOT THE ONLY PROBLEM WE NEED TO SOLVE.
I MENTIONED EARLIER, WE'VE GOT A PROBLEM WITH TRANSCRIBING URLS.
THERE HAVE BEEN SOME NOTORIOUS EXAMPLES.
LET ME GIVE YOU ONE SOME OF YOU MAY NOT HAVE SEEN.
IF WE HAVE A URL WHICH LOOKS LIKE THE FIRST LINE ON THIS -- LIKE THE FIRST
LINE ON THIS SLIDE, IT LOOKS LIKE A PERFECTLY GOOD URL IN ASCII IN A PARAGUAY
TOP-LEVEL DOMAIN.
IN FACT, THE CHARACTERS WHICH I WROTE IN RED ARE IN CYRILLIC, AND THE THING
WHICH LOOKS TO THOSE OF YOU WHO ARE NOT USED TO CYRILLIC LIKE DOT PY IS A
TRANSCRIPTION OF THE LETTERS "RU" FROM THE ROMAN CHARACTER SET -- FROM THE
LATIN CHARACTER SET INTO CYRILLIC CHARACTER SET.
IT'S BEEN SUGGESTED BY SOME PEOPLE THAT THIS MAKES A PERFECTLY GOOD TOP-LEVEL
DOMAIN NAME FOR RUSSIA.
AND, INDEED, IN ONE OF THE ALTERNATE ROOTS, THIS DOMAIN NAME TOP-LEVEL DOMAIN
FOR RUSSIA ALREADY EXISTS.
PRESUMABLY, IF THERE'S ANYONE FROM PARAGUAY HERE, THEY WEREN'T CONSULTED
BEFORE THIS WAS DONE. AND THIS KIND OF ISSUE RAISES GLOBAL ISSUES IN TERMS
OF THE USABILITY AND INTEROPERABILITY OF DNS AND TRANSCRIPTIONS OF DOMAIN
NAMES.
THE TOP LEVEL HERE IS IN A SINGLE SCRIPT, SO THE RULE WHICH SAYS, AT A
REGISTRY LEVEL, WE ONLY USE ONE SCRIPT PER LABEL, DOESN'T WORK.
AND THAT'S, INCIDENTALLY, THE OTHER ISSUE WHEN WE TRY TO USE IDNS IN TLDS, WHICH IS
ALL THE ISSUES TODAY REGISTRIES HAVE TO DEAL WITH IN TERMS OF WHAT THEY DO OR
DO NOT NEED TO PERMIT TO BE REGISTERED SUDDENLY BECOME ICANN PROBLEMS AND
NEED TO BE SETTLED GLOBALLY.
THE SECOND URL, ONCE YOU GET -- SECOND DOMAIN NAME ONCE YOU GET PAST THE WWW
IS ENTIRELY IN CYRILLIC.
AND THE QUESTION IS, DOES THE APPEARANCE OF THOSE CHARACTERS WARN THE USER
ENOUGH?
I WOULD SUGGEST THAT FOR SOMEBODY WHO IS NOT USED TO LOOKING AT BOTH -- AT
EITHER ROMAN CHARACTERS OR CYRILLIC CHARACTERS, IT DOES NOT.
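The look-alike problem above can be made concrete with a small Python sketch. It takes the first word of each character's Unicode name as a crude script tag; that shortcut and the function name `scripts_in` are assumptions of this sketch, not a proper script-property lookup.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Crude script detection: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        for ch in label
        if ch.isalpha()
    }


if __name__ == "__main__":
    latin_label = "py"                  # Latin p, Latin y
    cyrillic_label = "\u0440\u0443"     # Cyrillic er, Cyrillic u
    print(latin_label == cyrillic_label)   # different code points
    print(scripts_in(latin_label))
    print(scripts_in(cyrillic_label))
```

Each label passes a one-script-per-label rule on its own, which is why that rule alone does not separate the two strings that look alike on the slide.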
ONE OF THE SOLUTIONS WHICH HAS BEEN DEVELOPED FOR SOME OF THESE PROBLEMS IS
SOMETHING CALLED A VARIANT MODEL.
THE THEORY IS THAT WITHIN A GIVEN DOMAIN, YOU COLLECT ALL THE LABELS THAT
CONTAIN SIMILAR CHARACTERS, AND THEN YOU ADOPT A RULE.
AND THE RULE SAYS EITHER THAT YOU REGISTER ONE OF THEM AND BLOCK ALL THE
OTHERS, OR THAT ALL OF THEM BELONG TO THE SAME REGISTRANT AND THEY CAN EITHER
BE REGISTERED OR NOT, BUT NOT BY ANYBODY ELSE. THAT IS AN OVERSIMPLIFICATION
OF A COMPLICATED SYSTEM, BUT IT'S AN INTERESTING AND IMPORTANT IDEA.
THE NOTION OF WHAT CONSTITUTES SIMILAR IN THIS SITUATION IS REGISTRY DEFINED.
IT MIGHT BE APPEARANCE OR MEANING OR SOUND, OR SOMETHING ELSE.
STATUS OF THAT SYSTEM, IT'S BEEN VERY WELL AND STRONGLY DEVELOPED FOR THE
LANGUAGES BASED ON THE CHINESE CHARACTER SET.
IT'S GOT SOME OBVIOUS APPLICATIONS FOR FANCY DECORATED ROMAN-BASED
CHARACTERS. AND IT'S GOT SOME APPLICATIONS ACROSS SCRIPTS FOR OTHER SCRIPTS.
THE IAB HAS RECOMMENDED THAT PEOPLE NEED TO TAKE A MORE CAREFUL LOOK AT THIS
VARIANT SYSTEM IN A VARIETY OF CONTEXTS.
NOT A RECOMMENDATION FOR ANYBODY TO DO ANYTHING, BUT THAT A LOOK AT IT MAY BE
APPROPRIATE.
THE VARIANT SYSTEM HAS NO IMPACT ON QUERIES.
IT ONLY AFFECTS STORAGE AND REGISTRATION PROCESSES.
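As a rough illustration of the "register one, block the others" form of the variant model, here is a minimal Python sketch. The variant table, the class name `VariantRegistry`, and the registrant names are all hypothetical; a real registry would define its own table and its own notion of "similar".

```python
# Hypothetical variant table: each listed character is treated as a variant
# of the character it maps to. A real registry would publish its own table.
VARIANTS = {
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER resembles Latin p
    "\u0443": "y",  # CYRILLIC SMALL LETTER U resembles Latin y
    "\u0430": "a",  # CYRILLIC SMALL LETTER A resembles Latin a
}

class VariantRegistry:
    """Registration-time check only; lookups of stored labels are unchanged."""

    def __init__(self, variants):
        self.variants = variants
        self.bundles = {}  # bundle key -> (registered label, registrant)

    def bundle_key(self, label):
        # All labels that differ only by variant characters share one key.
        return "".join(self.variants.get(ch, ch) for ch in label)

    def register(self, label, registrant):
        key = self.bundle_key(label)
        if key in self.bundles:
            return False  # some variant of this label is already taken
        self.bundles[key] = (label, registrant)
        return True

registry = VariantRegistry(VARIANTS)
print(registry.register("py", "registrant-a"))            # first in the bundle
print(registry.register("\u0440\u0443", "registrant-b"))  # blocked as a variant
```

The check runs only when a name is registered, which matches the point that the variant approach affects registration and storage rather than queries.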
THERE'S A PERCEPTION IN THE USER COMMUNITY THE POLICIES WE HAVE AND THE
POLICIES WE ARE LIKELY TO EVOLVE ARE NOT PROTECTIVE ENOUGH OF USERS.
THIS HAS LED TO A REACTION FROM SOFTWARE WRITERS.
SO FAR, THE WEB BROWSER FOLKS, BUT IT WILL GENERALIZE.
AND THEIR REACTIONS ARE TO PROTECT THEIR USERS BY ATTACHING WARNINGS TO NAMES
THEY PERCEIVE AS RISKY, OR TO TAKE NAMES THEY CONSIDER RISKY AND RENDER THEM
IN PUNYCODE FORM RATHER THAN THEIR NATIVE SCRIPTS, WHICH DEFEATS THE WHOLE
VALUE OF IDNS.
OR TO DO OTHER CREATIVE THINGS.
THE DEFINITION OF WHAT CONSTITUTES RISK WILL ALMOST CERTAINLY EVOLVE IN WAYS
WHICH DIFFER BY VENDOR.
AND WHEN THEY DIFFER BY VENDOR, WE END UP WITH ASTONISHED USERS, BECAUSE
SWITCHING FROM ONE SOFTWARE PACKAGE TO ANOTHER SOFTWARE PACKAGE MEANS
DIFFERENT BEHAVIOR.
WE ALSO END UP WITH ASTONISHED REGISTRANTS, BECAUSE THE NAME THEY THINK THEY
HAVE REGISTERED AS AN IDN SOMETIMES LOOKS LIKE AN IDN, SOMETIMES LIKE
PUNYCODE.
PARTIALLY IT'S A PROBLEM TO FIX, PARTIALLY IT'S A PROBLEM OF EDUCATION.
IT'S GOING TO HAPPEN.
WE'VE HEARD A LOT OF CONVERSATION ABOUT SEPARATE MATCHING TREES, TWO SEPARATE
DNS TREES WITH SEPARATE DELEGATION RECORDS WHICH SOMEHOW MATCH BY VIRTUE OF
THE NAMES IN ONE OF THE TREES BEING TRANSLATIONS OF THE NAMES OF THE OTHER
ONE, FOR EXAMPLE.
WELL, IT DOESN'T HAPPEN WITH REAL TREES.
WE GET GENETIC VARIATION.
AND IT DOESN'T WORK IN THE DNS.
IF WE POPULATE ONE TREE WITH TRANSLATION OF ANOTHER TREE, IT MAY ALMOST WORK,
BUT MAY ALSO TURN OUT TO BE SPELLED, AS FAR AS THE USER IS CONCERNED, AS
UNPREDICTABLE.
AND IT CERTAINLY MEANS SEPARATE ZONE FILES AT THE THIRD LEVEL.
EXTREMELY DIFFICULT TO KEEP SYNCHRONIZED AND ADMINISTRATE.
WHEN WE MAKE DIFFERENT IMPLEMENTATION CHOICES ABOUT WHAT THE SUPPORT OF THE
DIFFERENT BEHAVIORS AS THE USER SEES IT, IF THE BEHAVIOR IS INCONSISTENT, THE
REGISTRANT AND THE USER WILL BE UNABLE TO PREDICT WHAT'S GOING TO HAPPEN AND
THEY WILL NOT BE HAPPY.
INDEED, THEY WILL RUN AROUND LOOKING FOR SOMEONE TO BLAME.
THIS VIOLATES AN OLD PRINCIPLE WHICH WE OFTEN CALL IN USER INTERFACE DESIGN
THE LAW OF LEAST ASTONISHMENT.
IF THE USER LOOKS AT SOMETHING AND IS SERIOUSLY SURPRISED, WE'VE DONE
SOMETHING WRONG.
IDNS HAVE WHAT WE CALL A LOT OF POTENTIAL.
THE MAIN PROTECTION AGAINST THE PROBLEMS OF MATCHING DIFFERENT WAYS OF CODING
CHARACTERS IS BUILT INTO UNICODE AND INVOLVES WHAT THEY CALL NORMALIZATION TO
A SINGLE FORM.
THE NORMALIZATION RULES HAVE BEEN CAREFULLY DESIGNED TO BE AS STABLE AS
POSSIBLE.
BUT BECAUSE OF SOME PROPERTIES OF IDNS AND THE DNS, THEY ARE PROBABLY NOT
STABLE ENOUGH FOR IDN USE.
PART OF THE PROBLEM IS THAT IDNS PERMIT UNNORMALIZED STRINGS AND, INDEED, WE
EXPECT TO SEE UNNORMALIZED STRINGS IN ANCHORS ON WEB PAGES AND THE LIKE, AND
THOSE STRINGS MAY PERSIST OVER LONG PERIODS OF TIME. SO WE END UP STORING
SOMETHING IN THE DNS AT REGISTRATION TIME IN ONE VERSION OF UNICODE WITH ONE
SET OF NORMALIZATION RULES AND THEN TAKING A DIFFERENT VERSION OF THAT STRING
LATER AND EVALUATING IT AGAINST A DIFFERENT VERSION OF UNICODE AND A
DIFFERENT SET OF NORMALIZATION RULES WHEN THE MATCHING -- WHEN THE QUERY AND
MATCHING IS DONE.
WE KNOW THAT THERE'S A PROBLEM THERE.
WE HAVE SOME EXAMPLES OF THE PROBLEM.
WE ARE STILL TRYING TO FIGURE OUT JOINTLY WITH THE UNICODE CONSORTIUM HOW
SERIOUS THE PROBLEM IS AND WHAT TO DO ABOUT IT.
IT HAS NOT BEEN AN EASY DISCUSSION.
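To make the normalization point concrete, here is a small Python sketch using the standard `unicodedata` module. The sample strings are illustrative choices, not examples from the talk. Nameprep normalizes with form KC (NFKC), so the second check shows the kind of compatibility mapping involved.

```python
import unicodedata

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT, two code points

def same_after(form, a, b):
    """Compare two strings after normalizing both to the given form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Without normalization the two spellings do not match.
print(composed == decomposed)
# After normalization to a single form they do.
print(same_after("NFC", composed, decomposed))
# NFKC also folds compatibility characters, e.g. the "fi" ligature U+FB01.
print(unicodedata.normalize("NFKC", "\ufb01") == "fi")
# The rules applied depend on the Unicode data version of the running system.
print(unicodedata.unidata_version)
```

The last line is the crux of the stability concern: the string stored at registration time and the string evaluated at query time may be processed under different data versions.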
MAKING NAMEPREP STABLE ACROSS UNICODE VERSIONS IS VERY IMPORTANT BECAUSE
NAMEPREP ISN'T STABLE.
STABLE IN THE SENSE OF BEING STRICTLY UPWARD COMPATIBLE.
WE HAVE PROBLEMS MIGRATING FROM ONE VERSION OF UNICODE TO THE NEXT.
SOME OF THE METHODS WE HAVE FIGURED OUT TO DO THAT REQUIRE VERSIONING IN THE
DNS SO WE CAN RECORD THE VERSION NUMBER UNDER WHICH AN IDN WAS CREATED.
WE KNOW HOW TO DO THAT.
BUT IT INVOLVES A CHANGE IN THE PREFIX.
IT ALSO INVOLVES MUCH MORE COMPLICATED PROCESSING.
IF WE CHANGE THE PREFIX, EVERY IDN WHICH IS STORED IN EVERY REGISTRY IN THE
WORLD TODAY BECOMES INVALID AND HAS TO BE CONVERTED.
IF WE CAN'T DO THAT MIGRATION, EITHER BECAUSE THE PREFIX CHANGE IS TOO
EXPENSIVE OR BECAUSE WE HAVE OTHER PROBLEMS, THEN NO SCRIPT WHICH HAS BEEN
CODED SINCE UNICODE 3.2 IS EVER GOING TO BE AN IDN.
MANY OF US CONSIDER THAT COMPLETELY UNACCEPTABLE.
BUT THIS IS A HARD PROBLEM FOR THAT REASON.
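The consequence of staying pinned to Unicode 3.2 can be seen in a short Python sketch. Python ships a frozen 3.2 database as `unicodedata.ucd_3_2_0` alongside its current one. The N'Ko letter used here is an illustrative choice of a script encoded after 3.2, not an example from the talk.

```python
import unicodedata

nko_letter = "\u07ca"  # NKO LETTER A, from a script encoded after Unicode 3.2

old = unicodedata.ucd_3_2_0  # frozen Unicode 3.2 database kept for IDNA use
print(old.unidata_version)

# Under 3.2 rules the code point is unassigned (general category "Cn"),
# so processing pinned to 3.2 cannot accept a label that contains it.
print(old.category(nko_letter))

# Under the current database the same code point is an ordinary letter ("Lo").
print(unicodedata.category(nko_letter))
```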
WE TRIED TO LOOK AT SOME THINGS THE IETF HAS TO DO.
THE IETF NEEDS TO LOOK AT IDNA.
WE NEED TO LOOK AT RESTRICTING NAMEPREP SO IT DOES LESS MAPPING.
SOME OF THE MAPPINGS WHICH ARE BEING DONE NOW ARE FRAGILE AND HAVE THE
PROPERTY THEY'RE VERY CONFUSING TO USERS.
IT APPEARS THAT WHAT WE SHOULD HAVE DONE RATHER THAN MAPPING MORE CHARACTERS
INTO -- INTO OTHER CHARACTERS WAS TO RESTRICT THEM.
SO PART OF THIS IS A CODE POINT REVIEW, PERMITTING FEWER CHARACTERS, PERIOD.
WE NEED TO LOOK AGAIN AND CONTINUE LOOKING AT UPGRADING TO MATCH NEW VERSIONS
OF UNICODE.
WE KNOW HOW TO GET FROM 3.2 TO 5.0, BUT UNLESS WE HAVE A PLAN WHICH SMOOTHLY
MIGRATES THE IDN SYSTEM INTO FUTURE VERSIONS OF UNICODE AS THEY'RE RELEASED,
WE WILL AT SOME POINT FIND OURSELVES IN A WORLD IN WHICH NEW LANGUAGES AND
NEW SCRIPTS AS THEY'RE CODED SIMPLY WILL NOT BE POSSIBLE TO REPRESENT AS
IDNS.
AS I SAID EARLIER, SOME OF US CONSIDER THAT COMPLETELY UNACCEPTABLE.
AND AS WE'RE DOING THESE THINGS, WE NEED TO KEEP REMEMBERING WHAT LIMITS THE
DNS IMPOSES ON US, BECAUSE WE CAN'T MAKE ARBITRARY DECISIONS WHICH KEEP THE
DNS FROM WORKING IN OTHER AREAS.
AMONG THOSE LIMITATIONS, TO REVIEW WHAT I SAID EARLIER IS THAT IDNS WILL NOT
SOLVE URL PROBLEMS, THEY WON'T SOLVE THE PROBLEMS WITH THE STRUCTURE OF THE
URLS.
THEY WON'T SOLVE THE PROBLEMS WITH THE ASCII KEYWORDS AND DELIMITERS AND
THEY WON'T SOLVE THE LONG TAIL COMPLEX SYNTAX.
A WARNING TO THOSE OF YOU WHO ARE BANKING ON IDNS, WE LOOK AT USERS IN MANY
PARTS OF THE WORLD, WE ARE DISCOVERING THAT DOMAIN NAMES ARE BEING USED LESS
AND LESS AND THAT SEARCH ENGINES AND PORTALS AND CROSS REFERENCES AND
DIRECTORIES ARE BEING USED MORE AND MORE.
WERE THAT TREND TO CONTINUE, WE COULD GUESS THAT IN SOME NUMBER OF YEARS,
USERS WOULDN'T CARE WHAT WAS IN THEIR DOMAIN NAMES.
IDNS IN THE DNS DO NOT ADDRESS THE NEAR MATCH ISSUE.
WE CANNOT COME BACK WITH A RESPONSE TO A QUERY AND SAY, "I DON'T HAVE THE
LABEL THAT YOU WANTED.
IS THIS ONE CLOSE ENOUGH?"
AND FOR THESE SIMILAR CHARACTER PROBLEMS, NEAR MATCH MAY BE IN SOME CASES THE
ONLY REAL ANSWER THAT MAKES USERS THINK THAT THE COMPUTER IS DOING WHAT THEY
NORMALLY EXPECT COMPUTERS TO DO, OR EXPECT TO BE DONE.
WE'VE GOT THIS RIGID ADMINISTRATIVE HIERARCHY PROBLEM AND IT LIMITS WHAT WE
CAN DO WITH SIMILAR TREES.
AND THE SOLUTION TO THE PROBLEMS ON THIS SCREEN ALL LIE IN GETTING AWAY FROM
AND ABOVE THE DNS, NOT IN TRYING TO TAMPER WITH THE DNS.
IT'S IMPORTANT TO MENTION AS A SIDE ISSUE THAT ANY ISSUE WHICH INVOLVES
CONFUSABLE CHARACTERS AND SURPRISING MATCHING IS A SECURITY PROBLEM.
IT PROBABLY HAS GREATER IMPACT ON SECURITY AND CERTIFICATES THAN THE DNS.
BUT WHEN DNS NAMES ARE USED TO ESTABLISH IDENTIFIER LOCALES IN THOSE SECURITY
CERTIFICATES, ANY PROBLEMS WE HAVE MULTIPLY.
WE NEED TO MAKE CHANGES HERE.
CHANGES HAVE CONSEQUENCES.
BIGGER CHANGES HAVE MORE CONSEQUENCES THAN SMALLER CHANGES, BUT FIX BIGGER
PROBLEMS.
IF WE HAVE TO MAKE BIG ENOUGH CHANGES, WE MAY INVALIDATE NAMES THAT ARE NOW
VALID.
PREFIX CHANGES ARE A BIG DEAL.
WE NEED TO ASK OURSELVES AS A COMMUNITY WHEN THE PRICE IS TOO HIGH.
WHEN A CHANGE WHICH WOULD IMPROVE THINGS ISN'T IMPORTANT ENOUGH TO PAY THE
PRICE OF THAT CHANGE.
AND IN DOING THAT, WE NEED TO TALK TO USERS WHO ARE HURT IF WE DO NOT MAKE
THE CHANGES, AS WELL AS THE USERS WHO ARE HURT IF WE DO.
THIS RAISES SOME IMPORTANT ISSUES FOR ICANN THAT CANNOT BE ANSWERED
TECHNICALLY OR ON THE IETF SIDE OF THE FENCE.
IT IS CLEAR TO US NOW THAT IDNS AND THESE MATCHING ISSUES ARE GOING TO RAISE
SOME ENTIRELY NEW KINDS OF DISPUTE RESOLUTION ISSUES.
I WOULD HOPE THAT ICANN HAS THOSE PROBLEMS UNDER CONTROL BEFORE IDNS ARE MUCH
MORE WIDELY DEPLOYED, ESPECIALLY IN THE ROOT.
WHEN REGISTRIES START MAKING DECISIONS OF WHAT TO REGISTER, THAT MAY INVOLVE
ASSUMING REGISTRY RESPONSIBILITY FOR THOSE DECISIONS.
WE'VE NEVER MADE THAT ASSUMPTION BEFORE.
TECHNICALLY, WE CAN HAVE A SITUATION IN WHICH EACH REGISTRY CAN HAVE
DIFFERENT POLICIES ABOUT WHAT NAMES THEY PERMIT WITHIN WHATEVER NAMING AND
IDENTIFICATION AND CHARACTER SCHEMES IDNA ITSELF PERMITS.
SOME RESTRICTIONS ABOUT THAT MIGHT MAKE THINGS EASIER FOR THE REGISTRIES,
MAKE THEM MUCH EASIER FOR REGISTRARS, WHO WOULD PREFER NOT TO HAVE TO ENFORCE
DIFFERENT POLICIES FOR DIFFERENT REGISTRIES AND LOTS EASIER FOR THE USERS.
BUT EVOLVING THOSE RESTRICTIONS MAY TURN OUT TO BE VERY HARD.
WE AT LEAST NEED TO DISCUSS WHETHER OR NOT WE WANT TO DISCUSS THEM.
THE IAB REPORT BRIEFLY DISCUSSES AND ADDRESSES THE IDN TLD ISSUE.
IT CLAIMS THAT THERE ARE REALLY THREE SEPARATE KINDS OF DECISIONS BEFORE YOU
GET TO THE POLITICS.
THE DECISIONS ABOUT NAMING AND DELEGATING, WHO GETS ONE OF THESE THINGS, HOW
THEY RELATE TO EXISTING DOMAINS, WHAT THE ASSUMPTIONS ARE ABOUT MATCHING AND
WHAT THE RESTRICTIONS ARE WHICH GO ALONG WITH THAT.
THOSE DECISIONS ARE NOT AS EASY AS MANY PEOPLE WHO ARE TALKING ABOUT
TOP-LEVEL DOMAINS SEEM TO BELIEVE.
AND IF SOME DECISIONS ARE MADE, OTHER DECISIONS MAY TURN OUT TO BE IMPOSSIBLE
IN PRACTICE, SO ONE NEEDS TO UNDERSTAND WHAT DECISION YOU ARE FORECLOSING BY
THE DECISIONS YOU MAKE.
THERE'S BEEN A GREAT DEAL OF DISCUSSION ABOUT MULTIPLE LABELS FOR THE SAME
TLD.
THOMAS WILL TALK A LITTLE BIT ABOUT THIS, I THINK.
BUT THERE ARE SOME ALIAS FACILITIES IN THE DNS WHICH MAY OR MAY NOT BE GOOD
ENOUGH WHEN WE START TALKING ABOUT ALIASING THINGS FOR IDN PURPOSES AT THE
TOP LEVEL.
IF THEY'RE NOT GOOD ENOUGH, IT MAY BE POSSIBLE TO DESIGN OTHER ALIASING
FACILITIES, BUT IT WON'T BE FAST.
AND ATTEMPTS AT DOING ALIASING BY SIMPLY REPLICATING OR TRANSLATING TREES, AS
I SAID EARLIER, TURNS OUT TO BE AN ADMINISTRATIVE AND OPERATIONAL NIGHTMARE.
IT WOULD NOT BE AN EXAGGERATION TO SAY IT WON'T WORK.
AND THEN THERE'S SOME QUESTIONS ABOUT THE CODING PRESENTATIONS OF THESE
NAMES.
THERE MAY NOT BE QUESTIONS THERE, BUT IF THERE ARE QUESTIONS THERE, THE IETF
AND ICANN ARE GOING TO HAVE TO WORK ON THEM TOGETHER.
NEXT STEPS.
WE NEED TO LOOK AT THE PERMITTED CHARACTER LIST.
WE NEED TO CONSIDER REMOVING NON-LANGUAGE CHARACTERS.
WE MAY NOT NEED TO DRAW PICTURES IN A DOMAIN NAME.
MAYBE THAT'S MORE TROUBLE THAN IT'S WORTH.
AND REMOVING WORD SEPARATOR CHARACTERS.
IT HAS ALWAYS BEEN POSSIBLE TO HAVE A DOMAIN NAME LABEL WITH A SPACE IN IT,
BUT THERE ARE REALLY GOOD REASONS WHY NOBODY'S DONE IT OR NOBODY DOES IT
REGULARLY.
WE NEED TO FIGURE OUT HOW TO UPDATE TO UNICODE 5.0, BUT PART OF THAT PROCESS
IS FIGURE OUT HOW TO UPDATE TO 5.1 AND 5.2 AND 6.0.
THE IAB HAS RECOMMENDED THAT WE BE EXTREMELY CAREFUL ABOUT UPGRADING TO 5.0
ON A ONE-TIME BASIS LEST WE GET STUCK THERE.
THE IAB HAS RECOMMENDED THAT WE GO BACK AND REEXAMINE THE NON-DNS AND
ABOVE-DNS AND EXTERNAL ALIASING APPROACHES TO MAKE CERTAIN THAT WE'RE SOLVING
THE RIGHT PROBLEM WITH THE RIGHT TOOLS RATHER THAN TRYING TO USE THE DNS FOR
THESE THINGS BECAUSE IT'S HANDY.
AND THERE ARE A WHOLE SERIES OF ISSUES WITH WHOIS OR WHATEVER DATABASE WE USE
TO RECORD NAMES THAT IDNS CHART -- CHARACTERIZE AND CHARGE THAT WE DO NOT
BELIEVE HAVE BEEN ADEQUATELY DEALT WITH YET.
THIS ISN'T EASY.
WE GOT IT A LITTLE BIT WRONG THE FIRST TIME AND MANAGED TO UPSET SOME USERS
AND SOME BROWSER OPERATORS. AND IF WE HAD BEEN LATER IN THE PROCESS, WE
WOULD HAVE UPSET A LOT MORE PEOPLE.
WE NEED TO TRY TO GET IT FIXED BEFORE DEPLOYMENT IS BROADER. AND FOR MANY OF
THESE ISSUES, "WE" IS GOING TO REQUIRE IETF AND ICANN TO WORK TOGETHER, AND
"WORKING TOGETHER" MEANS GETTING OUT OF THE HABIT OF TOSSING DOMAINS OVER
SOME -- DEMANDS OVER SOME HYPOTHETICAL WALL BASED UPON SOME ASSUMPTIONS ABOUT
HOW THINGS WORK ON THE OTHER SIDE.
AND LEST I AM ACCUSED OF BEING DISCRIMINATORY ABOUT THIS, THAT HAS OCCURRED
IN BOTH DIRECTIONS SO FAR, AND WE BOTH NEED TO STOP IT.
THANK YOU VERY MUCH.
[ APPLAUSE ]
>>VINT CERF: THANK YOU, JOHN. AND NOW FOR THE REALLY PESSIMISTIC NEWS.
I HAVE TO SAY THAT IT'S AWFULLY IMPORTANT THAT WE UNDERSTAND WHAT WE'RE ABOUT
HERE. AND IT'S IMPORTANT NOT ONLY FOR THE TECHNICAL PEOPLE BUT FOR PEOPLE
WHO ARE TRYING TO MAKE SENSIBLE POLICY TO HAVE A SUFFICIENT UNDERSTANDING OF
WHAT THE TECHNICAL PROBLEMS ARE TO KNOW WHAT IMPLICATIONS WILL FOLLOW FROM
CHOICES OF POLICY.
IT GOES THE OTHER WAY, TOO. A CHOICE OF TECHNICAL DIRECTION MAY, IN FACT,
FORECLOSE POLICIES THAT WE WISH WE COULD ADOPT.
SO THIS IS, TO USE THE PHYSICS TERM, THIS IS AN ENTANGLED PROBLEM.
THE POLICY AND THE TECHNOLOGY ARE VERY CLOSE TO EACH OTHER, AND THIS IS ONE
OF THE FEW TIMES, I THINK, IN THE INTERNET WORLD WHERE THAT'S BEEN SO
VISIBLE.
WELL, THOMAS NARTEN, WHO IS HIDING OR WAS HIDING DOWN ON THE FLOOR SHOULD BE
-- OH, HE'S BEHIND ME NOW, LURKING.
SO THOMAS IS OUR IETF LIAISON ON THE BOARD, AND WE'RE GOING TO FIND OUT WHAT
THE IETF HAS BEEN DOING IN THIS DOMAIN.
SO THOMAS, IT'S ALL YOURS.
>>THOMAS NARTEN: OKAY. THANKS, VINT. THIS WILL BE A FAIRLY SHORT
PRESENTATION. I AM GOING TO GO OVER A COUPLE THINGS GOING ON IN THE IETF
WITH RESPECT TO DNS IDN TOPICS THAT ARE OF LIKELY INTEREST TO THIS COMMUNITY
HERE.
FIRST I AM GOING TO FOLLOW-UP A LITTLE BIT ON THE DOCUMENT THAT JOHN SPENT A
FAIR AMOUNT OF TIME TALKING ABOUT AND SAY WHAT'S REALLY HAPPENING WITH THAT
DOCUMENT AND WHAT IT MEANS.
THE FIRST KEY THING IS THE DOCUMENT IS FINISHED FROM THE IETF PERSPECTIVE.
THE IAB HAS BEEN WORKING ON IT A LONG TIME AND THEY APPROVED IT LAST WEEK
FORMALLY. SO AT THIS POINT THEY ARE NOT REALLY SEEKING REVIEW COMMENTS AND
FURTHER COMMENTS. THEY ARE NOT PLANNING ON REVISING THE DOCUMENT. IT'S JUST
GOING TO GO TO THE RFC EDITOR FOR PUBLICATION.
BUT AS JOHN ALREADY SAID, THE DOCUMENT CONTAINS A LOT OF RECOMMENDATIONS THAT
ARE MORE LIKE HERE ARE SOME ISSUES THAT YOU NEED TO THINK MORE CAREFULLY
ABOUT AND DECIDE WHAT YOU ARE GOING TO DO NEXT.
SO IT DIDN'T MAKE RECOMMENDATIONS ON WHAT TO DO AS MUCH AS HERE ARE DECISION
POINTS.
AND AT THE MONTREAL IETF MEETING WHICH IS A COUPLE OF WEEKS FROM NOW,
THIS WILL BE A TOPIC DURING ONE OF THE APPLICATIONS AREA OPEN MEETINGS. AND
THE WHOLE PURPOSE THERE WILL BE FROM AN IETF PERSPECTIVE, WHAT DO WE NEED TO
DO. AND IT'S PARTLY LIKE TALKING ABOUT SOME OF THESE ISSUES BUT IT'S ALSO
WHAT IS THE ACTUAL PROCESS AND WHAT IS IT THE IETF NEEDS TO DO IN ORDER TO
MAKE PROGRESS ON THE INDIVIDUAL TOPICS.
THE SECOND THING I WANT TO TALK ABOUT BRIEFLY IS DNAME. BECAUSE DNAME HAS
BEEN TALKED ABOUT HERE IN ICANN FOR A COUPLE OF CYCLES NOW AS A POSSIBLE
ALIASING MECHANISM, AND I WANT TO GIVE YOU SOME BACKGROUND ON IT. IT'S BEEN
AROUND QUITE A LONG TIME. THE RFC 2672 WHICH IS THE DNAME SPEC ITSELF WAS
PUBLISHED WAY BACK IN 1999. AND AT THE TIME, IT WAS THOUGHT IT WAS GOING TO
BE HELPFUL FOR SOME TRANSITION PURPOSES RELATED TO IPV6, BUT AS PEOPLE
STARTED LOOKING AT IT AND THINKING ABOUT IT MORE THEY COOLED TO THE IDEA IN
TERMS OF HOW WELL IT ACTUALLY SOLVED THE PROBLEM IT WAS PURPORTED TO SOLVE IN
THE CONTEXT OF A SPECIFIC TRANSITIONING PROBLEM.
SO THE UPSHOT OF IT IS THAT TODAY, THERE IS SOME DEPLOYMENT EXPERIENCE BUT
IT'S FAIRLY LIMITED. AND THE KEY THING TO OBSERVE IS THAT NOTHING DEPENDS ON
IT OPERATIONALLY. AND WHAT THAT MEANS IS THAT THERE'S NO APPLICATION OUT
THERE THAT RELIES ON DNAME ACTUALLY WORKING. SO IT'S NOT LIKE SOMETHING
WOULD BREAK IN AN APPLICATION IF DNAME DIDN'T WORK. WHEN IT COMES TO
NETWORKING PROTOCOLS, THE WAY YOU KNOW SOMETHING ACTUALLY WORKS IS IF YOU
KNOW PEOPLE ARE USING IT, AND IF IT DIDN'T WORK FOR SOME REASON, IT WOULD
BREAK, THE END USER WOULD COMPLAIN. AND THERE'S A CYCLE, BECAUSE WHEN THINGS
ARE BROKEN, PEOPLE NOTICE IT AND THEY ARE FORCED TO GO FIX IT BECAUSE THEY
CAN'T LEAVE IT BROKEN. IN THE CASE OF SOMETHING FOR WHICH WE DON'T HAVE A
LOT OF OPERATIONAL EXPERIENCE, IT DOESN'T MEAN THAT IT DOESN'T WORK. IT JUST
MEANS THAT WE DON'T NECESSARILY KNOW HOW WELL IT'S TESTED AND WHETHER ALL OF
THE STRANGE CONDITIONS THAT CAN OCCUR IN PRACTICE HAVE BEEN TESTED AND THAT
WE KNOW ABOUT.
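For readers unfamiliar with DNAME, the basic substitution it performs can be sketched in a few lines of Python. This is a simplified illustration, not a resolver: the function name `apply_dname` and the `example.org` / `example.net` names are hypothetical, and the edge cases that the revision effort is meant to clarify are deliberately ignored.

```python
def apply_dname(qname, owner, target):
    """Rewrite qname if it falls strictly below a DNAME owner name.

    Returns the rewritten name, or None when the DNAME does not apply.
    """
    q = qname.lower().rstrip(".").split(".")
    o = owner.lower().rstrip(".").split(".")
    # DNAME redirects names underneath the owner, not the owner name itself.
    if len(q) <= len(o) or q[-len(o):] != o:
        return None
    prefix = q[:-len(o)]
    return ".".join(prefix + [target.lower().rstrip(".")]) + "."

# A DNAME at example.org pointing to example.net redirects the whole subtree.
print(apply_dname("www.example.org.", "example.org.", "example.net."))
print(apply_dname("example.org.", "example.org.", "example.net."))
```

Because one record redirects an entire subtree, it has been discussed as a way to alias one tree onto another without duplicating zone contents.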
AND SO AT THE SAME TIME THIS WAS GOING ON, THERE WAS A LOT OF WORK GOING ON
REVISING THE DNSSEC PROTOCOLS, AND AS PEOPLE WERE WORKING ON THAT, A LOT OF
TIMES THERE WOULD BE QUESTIONS OF THE FORM, WELL, WHAT HAPPENS WITH DNAME
AND HOW DOES DNAME PLAY INTO THIS PARTICULAR ASPECT OF DNSSEC? AND AS A
RESULT OF THAT, PEOPLE STARTED AGAIN THINKING ABOUT DNAME IN MORE DETAIL AND
A SET OF QUESTIONS WOULD COME UP ABOUT HOW IT WAS SUPPOSED TO BEHAVE, WHAT'S
SUPPOSED TO HAPPEN IN THIS SCENARIO. AND THE CONVENTIONAL WISDOM AT THIS
POINT IS DNAME NEEDS TO BE -- THERE ARE AREAS WHERE THE DETAILS NEED TO BE
SPECIFIED A LITTLE BIT BETTER AND SOME CLARIFICATIONS NEED TO BE MADE BECAUSE
PEOPLE COULD READ THE SPEC AND NOT NECESSARILY KNOW WHAT WAS SUPPOSED TO
HAPPEN, AND CONSEQUENTLY DIFFERENT IMPLEMENTERS COULD IMPLEMENT IT IN
DIFFERENT WAYS WITHOUT BEING IN CONFLICT WITH WHAT THE SPEC ACTUALLY SAID.
SO AS A RESULT, THE IETF IS ABOUT TO RE-OPEN THE DNAME DOCUMENT OR THE DNAME
SPECIFICATION.
AND SORT OF STEPPING UP, GOING TO A HIGHER LEVEL, A GENERAL IETF OBSERVATION
IS THAT IT'S PRETTY NORMAL FOR US TO REVISE OUR SPECIFICATIONS. IN SOME
SENSE THE WAY YOU GET A GOOD SPECIFICATION IS BY MAKING A FIRST CUT AT IT,
IMPLEMENTING IT, USING IT FOR A WHILE, YOU LEARN FROM IT, YOU ITERATE ON THE
SPEC AND YOU CLEAN UP THE THINGS THAT WEREN'T QUITE RIGHT OR WHERE THERE WAS
AMBIGUITY RELATED TO INTEROPERABILITY. AND THEN AFTER YOU HAVE DONE THIS ONCE
OR TWICE OR THREE TIMES YOU GET TO THE POINT WHERE THE PROTOCOL AND THE
IMPLEMENTATIONS ARE ALL STABLE AND THINGS WORK PRETTY WELL.
SO THE FACT THAT THE IETF IS RE-OPENING IT AND GOING TO DO SOME FURTHER WORK
ON IT DOESN'T MEAN, FOR EXAMPLE, THAT DNAME IS BROKEN AND DOESN'T WORK.
AND IN ANY CASE, ANY CHANGES THAT ARE MADE TO DNAME, ONE OF THE IETF MANTRAS
REALLY IS WE DON'T WANT TO BREAK ANYTHING THAT DOESN'T NEED TO BE BROKEN. SO
IN TERMS OF MAKING CHANGES TO THE DNAME THERE WILL ALWAYS BE CONCERN ABOUT
NOT HAVING A NEGATIVE IMPACT ON WHAT IS ALREADY DEPLOYED OR HAVING TO CHANGE
THE SPEC IN A NEGATIVE WAY THAT MIGHT CAUSE PROBLEMS FOR EXISTING
IMPLEMENTATIONS.
AND THE REALITY ALSO IS IT TYPICALLY WILL TAKE AT LEAST A YEAR, MAYBE TWO
YEARS BEFORE A REVISED DOCUMENT COMES OUT BECAUSE THESE THINGS TAKE TIME TO
DISCUSS, NEW ISSUES ARE BROUGHT UP, THEY GET DOCUMENTED AND YOU SORT OF
ITERATE ON THAT FOR A WHILE.
SO IN THE PARTICULAR CASE OF DNAME I SUSPECT THAT MANY OF THE ISSUES THAT ARE
GOING TO BE RAISED ARE FAIRLY STRAIGHTFORWARD. THERE WILL BE SOME ARGUMENTS
ABOUT WHAT SHOULD BE DONE AND WHAT THE RIGHT WORDING SHOULD BE TO CLARIFY IT,
BUT THERE ISN'T ANY KIND OF FUNDAMENTAL PROBLEM WITH DNAME THAT WOULD CAUSE
THEM TO CHANGE IT IN A WAY THAT IS INCOMPATIBLE WITH WHAT'S OUT THERE OR
CHANGES THE SEMANTICS IN KIND OF ANY MEANINGFUL WAY.
BUT IT'S ALSO THE CASE THAT IT MAY BE THERE ARE SOME CASES THAT COME UP THAT
REQUIRE CHANGES THAT ARE MORE SUBSTANTIVE, AND WE DON'T KNOW THAT AT THIS
POINT, AND WE WON'T KNOW THAT UNTIL WE ACTUALLY COME TO SUCH A QUESTION AND
THAT THE DNS EXPERTS SIT ON IT FOR A WHILE AND THINK ABOUT IT AND COME TO THE
CONCLUSION THAT, YEAH, THIS IS A REAL PROBLEM AND WE DON'T HAVE A GOOD WAY TO
GET AROUND IT OTHER THAN MAKING A SIGNIFICANT CHANGE.
I'M NOT SUGGESTING THAT'S GOING TO HAPPEN SO MUCH OTHER THAN THAT CAN ALWAYS
HAPPEN WHEN YOU ARE REVISING A SPECIFICATION.
AND THAT'S IT FOR ME. THAT'S ALL I WANTED TO SAY.
>>VINT CERF: THANK YOU VERY MUCH. I ACTUALLY HAVE A COUPLE OF QUESTIONS.
[ APPLAUSE ]
>>VINT CERF: IF I COULD -- FOR CLARIFICATION.
THERE ARE A NUMBER OF THINGS THAT WILL DEPEND ON THE PRESENCE OF IDNS IN THE
PROTOCOL SUITES THAT WE USE. I WAS GIVEN TO UNDERSTAND YESTERDAY THAT THE
WHOIS SYSTEM OR INTERFACE MIGHT NOT BE A FRIENDLY PLACE FOR UNICODE, EVEN IF
IT'S ENCODED IN UTF-8. IS THAT SOMETHING EITHER THOMAS OR JOHN COULD RESPOND
TO?
>>THOMAS NARTEN: THE QUICK ANSWER IS IF YOU LOOK AT WHAT THE SPEC SAYS, IT
SAYS ASCII, AND IT'S NOT INTERNATIONALIZED BEYOND THAT. BUT I THINK JOHN CAN
SPEAK A WHOLE LOT ABOUT THE DETAILS.
>>JOHN KLENSIN: YEAH, THE ORIGINAL SPEC, VINT, WAS DEFINED IN TERMS OF NVT
BECAUSE IT RAN OVER TELNET. FOR THOSE OF YOU WHO DON'T KNOW THAT
ABBREVIATION, IT'S A NETWORK VIRTUAL TERMINAL CONCEPT WHICH IS DEFINED
STRICTLY IN TERMS OF ASCII AND EITHER BANS OR GIVES SPECIAL INTERPRETATION
TO ANYTHING WITH A HIGH-ORDER BIT SET.
THE LATEST UPDATE TO WHOIS DIDN'T SAY THAT IN SO MANY WORDS SO THERE ARE
PEOPLE WHO ARE INTERPRETING THIS VERY LIBERALLY, BUT WHAT WE KNOW ABOUT OLD
APPLICATIONS AND THAT KIND OF LIBERAL INTERPRETATION IS SOONER OR LATER IT
WILL GET SOMEBODY.
AND THE USUAL FORM OF "GETTING" WITH THINGS LIKE THAT AND THE NVT DEFINITION,
SOMEBODY SIMPLY THROWS AWAY ALL HIGH ORDER BITS.
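What "throwing away all high order bits" does to UTF-8 data can be shown with a short Python sketch. The helper `strip_high_bits` is hypothetical and simply models a 7-bit-only component somewhere on the path. The sample string is an illustrative choice.

```python
def strip_high_bits(data: bytes) -> bytes:
    """Model a 7-bit-clean component by clearing the top bit of every byte."""
    return bytes(b & 0x7F for b in data)

original = "M\u00fcnchen"        # contains U+00FC, LATIN SMALL LETTER U WITH DIAERESIS
wire = original.encode("utf-8")  # the umlaut becomes the two bytes 0xC3 0xBC
damaged = strip_high_bits(wire)

print(wire)
print(damaged)  # 0xC3 -> 0x43 ("C"), 0xBC -> 0x3C ("<")
```

The result is still valid ASCII, so nothing downstream reports an error. The name is silently corrupted.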
>>VINT CERF: THE OTHER OBSERVATION I WOULD MAKE IS THAT WHEN WE GET TO
E-MAIL, OUR HABIT OF FORWARDING THINGS AROUND, OF ATTACHING THINGS AND THE
LIKE, COULD PRODUCE SOME PRETTY BIZARRE RESULTS IF WE HAVE EVEN UTF ENCODED
THINGS.
MAYBE YOU CAN HELP ME HERE, JOHN, BUT I'VE OFTEN OBSERVED WITH THE CURRENT
E-MAIL SYSTEMS THAT THE BODY OF THE MESSAGE IS FREQUENTLY ASSUMED TO BE
ASSOCIATED ONLY WITH ONE CHARACTER MAPPING. IN THE CASE OF VENI, FOR
EXAMPLE, IT'S CYRILLIC. IN THE CASE OF OTHERS, IT MIGHT BE THE GERMAN OR
LATIN VERSION THAT INCLUDES ALL THE APPROPRIATE SYMBOLS FOR GERMAN.
WHEN THOSE TWO ARE COMPOSED, SO TO SPEAK, I OFTEN GET MAPPINGS OF THE UMLAUT
CHARACTERS INTO CYRILLIC.
AS WE PASS URLS AROUND, AS WE PASS REFERENCES TO PEOPLE'S DOMAIN NAMES OR
PEOPLE'S E-MAIL ADDRESSES AROUND IN THE CURRENTLY EXISTING SYSTEM, ARE WE
LIKELY TO WIND UP WITH THAT SORT OF PROBLEM? IF WE DO, OF COURSE WHEN PEOPLE
TRY TO USE THINGS, THEY WON'T WORK RIGHT.
>>JOHN KLENSIN: I'M AFRAID THAT PART OF WHAT YOU'RE DESCRIBING IS EVIDENCE OF
THE FACT THAT WHEN IT COMES TO TAKING AN EXISTING PIECE OF CODE AND UPGRADING
IT, WE PROGRAMMERS TEND TO BE A RATHER LAZY AND SLOTHFUL BUNCH.
THE MIME STANDARD WAS DEFINED IN THE EARLY '90S TO HANDLE MAIL WITH MULTIPLE
CHARACTER SETS AND NON-ASCII CHARACTER SETS, AS WELL AS HANDLING MULTIMEDIA
THINGS. AND AT THAT TIME THE PREVALENT CHARACTER SETS IN USE WERE -- EXCUSE
ME -- THE VARIATIONS ON ISO 8859-1 WHICH IS SORT OF A PER-SCRIPT, PER-COUNTRY
ARRANGEMENT, AND NATIONAL VERSIONS OF THE OLDER ISO 646 AND SOME OTHER
THINGS.
SO MIME PROVIDES FOR LABELING THESE THINGS, BUT WHEN THAT LABELING PROCESS
WAS PUT IN THE OUTGOING SIDE INTO MAIL COMPOSERS, USER AGENTS WHICH COMPOSE
THE MAIL, THE DECISION WAS USUALLY MADE TO SAY, OKAY, THERE'S ONE DEFAULT
SCRIPT, BECAUSE WE SIMPLY DO NOT WANT TO ASK THE USER FOR EVERY MESSAGE, WHAT
SCRIPT IS BEING USED IN THAT MESSAGE.
AND THAT'S ONE OF THE SOURCES OF THIS PROBLEM. THOSE THINGS SHOULD HAVE,
ARGUABLY, BEEN COMPLETELY REWRITTEN.
THEY WEREN'T.
A SECOND SOURCE OF THAT PROBLEM IS WITHIN A GIVEN MIME BODY PART, YOU CAN
ONLY HAVE ONE SCRIPT. SO EVEN IF VENI SENDS YOU SOMETHING WHICH IS IN
CYRILLIC AND LABELED IN CYRILLIC, IF WHAT YOUR SYSTEM IS SET UP TO DO IS SEND
OUT WESTERN EUROPEAN 8859-1, LATIN 1, AND YOU TRY TO FORWARD VENI'S MESSAGE
IN THE BODY OF THAT MESSAGE, PROBABLY WHAT YOUR MAIL CLIENT SHOULD DO IS SAY
ABSOLUTELY NO WAY. YOU HAVE TO FORWARD THAT THING AS A SEPARATE BODY PART.
BUT IN FACT, EITHER DUE TO SLOPPINESS OR NOT UNDERSTANDING THESE OR SOME
OTHER REASON, IT DOESN'T.
SO IT TAKES ONLY ONE OF YOU TO COMMIT THIS SIN, BUT WHEN ONE GETS THROUGH,
IT'S GIBBERISH.
THE GOOD NEWS ABOUT THE WORK THAT'S GOING ON NOW IS THAT ADDRESSING OF MAIL
IS ENOUGH MORE SENSITIVE THAN THESE BODY PART THINGS WHICH WE HOPED PEOPLE
COULD STRAIGHTEN OUT, THAT TO USE THE INTERNATIONALIZED ADDRESSING MACHINERY
WHICH IS COMING ALONG, IT WILL BE NECESSARY FOR YOUR SMTP SERVER TO
SPECIFICALLY NEGOTIATE WITH THE SENDING CLIENT THAT IT'S GOING TO DO THE
RIGHT THING. AND IF IT DOESN'T, THEN THAT MESSAGE IS EITHER GOING TO GET
DOWNGRADED TO ASCII -- INTO TRADITIONAL FORMS OR BOUNCED.
THE OTHER PIECE OF GOOD NEWS IS THAT WE ARE REALLY IN MUCH BETTER STATE RIGHT
NOW WITH REGARD TO UNICODE AND UTF-8 CODING THAN WE WERE A DECADE AND A HALF
AGO.
AND IN PARTICULAR, WE DON'T HAVE TO MESS AROUND WITH 8859 DASH SOMETHING OR
OTHER, OR 646 NATIONAL VERSIONS OR 2022 SWITCHING IN ALL OF THESE THINGS.
WE'VE GOT, TO ALL INTENTS AND PURPOSES, ALL THE CHARACTERS WHICH THOSE THINGS
SUPPORT IN UNICODE.
SO THE RIGHT SOLUTION TODAY FOR ALL THESE LAZY MAIL CLIENT IMPLEMENTERS IS
NOT TO GET BETTER AT LABELING THINGS IN THE SPECIALIZED CHARACTER SETS, BUT
SIMPLY TO SUPPORT UNICODE IN UTF-8 PROPERLY AND THEN LABEL THAT.
SO MY UTF-8 COMPATIBLE MAIL CLIENT IS MUCH LESS SENSITIVE TO THESE KINDS OF
ISSUES THAN YOURS MAY BE, BECAUSE WHEN IT GETS CYRILLIC INPUT AND DECIDES TO
FORWARD IT IN LINE WITH SOMETHING WHICH IS BASICALLY ASCII TEXT, IT KNOWS
WHAT TO DO WITH THAT, WHICH IS CONVERT EVERYTHING TO UTF-8 RATHER THAN TO TRY
TO DECIDE HOW TO MAKE 8859-5 LOOK LIKE 8859-1.
SO THINGS WILL GET BETTER BUT IT'S GOING TO BE BUMPY FOR A WHILE.
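The umlaut-into-Cyrillic effect described in this exchange can be reproduced with Python's standard codecs. This is a minimal sketch with an illustrative sample string. It models a message body written as ISO 8859-1 but displayed under an ISO 8859-5 label, and then the UTF-8 alternative.

```python
import unicodedata

text = "M\u00fcnchen"

# Bytes produced by a composer whose default charset is Latin 1 ...
latin1_bytes = text.encode("iso-8859-1")
# ... interpreted by a reader that believes the body part is Cyrillic.
misread = latin1_bytes.decode("iso-8859-5")

print(ascii(misread))
print(unicodedata.name(misread[1]))  # the umlaut byte is shown as a Cyrillic letter

# With UTF-8 on both sides, Latin and Cyrillic text can share one body part.
mixed = text + " / " + "\u0440\u0443"
print(mixed.encode("utf-8").decode("utf-8") == mixed)
```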
>>VINT CERF: THE IMPLICATION OF ALL THAT IN MY VIEW IS NOT TO SAY WE BETTER
NOT DO ANYTHING UNTIL WE SOLVE ALL THOSE PROBLEMS. IT'S, RATHER, LOOKING FOR
A MORE SYSTEMATIC WAY OF UNCOVERING AND IDENTIFYING THE PROBLEMS SO WE CAN
BEGIN WORKING ON THEM AND WARNING PEOPLE THAT UNTIL THOSE SOLUTIONS ARE
FOUND, THERE WILL BE TIMES WHEN THE IDN APPEARANCE IN SOME OF THE EXISTING
SOFTWARE WILL CAUSE SOME PECULIAR RESULTS. AND SOMETIMES IT WON'T WORK THE
WAY YOU WANT IT TO AT ALL.
BUT HERE WE'RE BACK TO EDUCATION, AND MORE SYSTEMATIC APPROACHES TO GETTING
THE PROBLEMS IDENTIFIED SO WE CAN SOLVE THEM.
WELL, THANK YOU VERY MUCH, THOMAS.
>>JOHN KLENSIN: VINT, THERE'S ONE OTHER ASPECT OF THAT EDUCATION THAT'S
PROBABLY WORTH COMMENTING ON.
ALL OF THESE THINGS, THE INTERNATIONALIZED E-MAIL ADDRESSES, EVEN THE BODY
PARTS IN DIFFERENT LANGUAGES AND SCRIPTS, ARE GOING TO WORK REALLY WELL FOR
COMMUNICATING WITHIN A PARTICULAR LINGUISTIC CULTURAL WRITING SYSTEM
COMMUNITY AND LESS WELL ACROSS THE BOUNDARIES.
AND THE GOOD NEWS FOR THOSE OF US WHO ARE INTERESTED IN USING THESE THINGS TO
HELP MORE PEOPLE USE THE INTERNET EFFECTIVELY IS THAT WITHIN ANY ONE OF THOSE
COMMUNITIES, IT'S A SAFE BET THAT MOST OF THE COMMUNICATION MOST OF THE TIME
IS GOING TO BE INTRA-COMMUNITY RATHER THAN INTER-COMMUNITY.
OUR HARD PART IS MAKING CERTAIN WE DON'T BREAK INTER-COMMUNITY IN THE PROCESS
OF MAKING INTRA-COMMUNITY WORK WELL.
BUT FOR THE VAST NUMBER OF CASES, THE USERS WHO WILL BE LEAST SOPHISTICATED
ABOUT THIS WHO DON'T TRAVEL INTERNATIONALLY, WHO WORK IN ONLY ONE LANGUAGE
AND SCRIPT AREN'T GOING TO NOTICE THE PROBLEMS THAT THE REST OF US ARE GOING
TO GET REAL SENSITIVE TO REAL FAST.
>>VINT CERF: AND THAT'S BOTH A GOOD THING, A BLESSING, AND A PROBLEM.
THE BLESSING, OF COURSE, IS THAT FOR A LOT OF PEOPLE, THIS IS GOING TO MAKE A
POSITIVE AND GOOD DIFFERENCE IN THE USE OF THE NET.
THE DIFFICULTY IS YOU COULD COME AWAY THINKING THAT EVERYTHING IS GOING TO
WORK FINE BECAUSE IT HAS WORKED FINE IN THAT LIMITED CONTEXT, AND THEN WHEN
IT DOESN'T WORK IN THE MORE COMPLEX ONE, YOU WONDER WHY WE GOT IT WRONG, OR
AT LEAST WHY IT DIDN'T WORK.
THE ONLY OTHER THING I WANTED TO MENTION IS THAT DOMAIN NAMES AND URLS ARE
RECOGNIZED IN PLAIN TEXT TODAY BY VARIOUS PIECES OF SOFTWARE THAT HAVE BEEN
-- THEY HAVE BEEN WRITTEN TO DISCOVER THAT SOMETHING IS A DOMAIN NAME -- FOR
EXAMPLE, IF THEY SEE WWW DOT SOMETHING ELSE WITHOUT A SPACE, IT'S A SIGNAL
THAT IT MIGHT BE A DOMAIN NAME. BUT NOT ALL DOMAIN NAMES ARE SUCCESSFULLY
RECOGNIZED.
I WOULD GUESS AS WE INTRODUCE IDNS IT WILL GET HARDER FOR THE SOFTWARE TO
RECOGNIZE DOMAIN NAMES.
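A small Python sketch shows why plain-text recognition gets harder. Both patterns are hypothetical simplifications of what such software might do, and the sample sentence is invented for illustration.

```python
import re

# A typical ASCII-era heuristic: "www." followed by dot-separated LDH labels.
ASCII_ONLY = re.compile(r"www\.[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# A looser variant that accepts any Unicode word character in a label.
# It is over-permissive (underscores, any script) and only for illustration.
UNICODE_AWARE = re.compile(r"www\.[\w-]+(?:\.[\w-]+)+")

text = "See www.example.de and www.m\u00fcnchen.de for details."

print(ASCII_ONLY.findall(text))     # finds only the ASCII name
print(UNICODE_AWARE.findall(text))  # finds both names
```

Loosening the pattern finds the IDN, but it also admits many strings that are not valid names, which is part of why this needs to be worked through rather than patched.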
SO ONCE AGAIN, A SLOW PROCESS OF UNCOVERING PROBLEMS AND FINDING SOLUTIONS.
I SEE SABINE IS WAITING TO TELL US WHAT HER EXPERIENCES ARE, I HOPE, WITH THE
USE OF AN EXTENDED CHARACTER SET, EXTENDED SCRIPT IN DOT DE.
YES, SABINE, PLEASE GO AHEAD.
>>SABINE DOLDERER: ACTUALLY, YES, I DO WANT TO SHARE WITH YOU THE INFORMATION
WE ARE DOING AT DOT DE. I HAVE TO TELL YOU SOME OF THEM ARE NOT REALLY
COMPLICATED AND THEY ARE USED BY A LOT OF USERS.
WE ACTUALLY HAVE IMPLEMENTED WITHIN OUR WHOIS SERVICE, BECAUSE WE DON'T THINK
THE WHOIS IS PLAIN ASCII, THE WHOIS DEFINITION IS CURRENTLY WHAT CAME IN --
SOMETHING CAME IN AND SOMETHING CAME OUT. THERE IS NO REAL DEFINITION ABOUT
WHAT TYPE OF CHARACTERS WHOIS IMPLEMENTS OR DOESN'T IMPLEMENT AS FAR AS
WE HAVE READ THE RFC, AND THEREFORE WE DID SOME IMPLEMENTATION. WE HAVE
CONSULTED WITH THE COMMUNITY, AND WE ACTUALLY DID, AND WE SEE IT ALSO IN OUR
NAME SERVERS. EVEN IN OUR NAME SERVERS, YOU GET A LOT OF STRANGE REQUESTS
FROM DIFFERENT TYPE OF CHARACTER SETS.
SO WE START WITH THE IMPLEMENTATION, AND USUALLY WHAT CAME IN, WE GAVE OUT.
IF SOMEBODY CAME IN WITH AN ASCII REQUEST, WE GAVE HIM AN ASCII -- WE GAVE
HIM AN ASCII ANSWER, AND IF SOMEBODY CAME IN WITH AN ENCODED REQUEST, WE
ACTUALLY CAME OUT WITH AN ENCODED ANSWER.
AND THERE IS ALSO POSSIBILITY WE HAVE IMPLEMENTED IN THE WHOIS SERVER THAT
YOU CAN ACTUALLY CHOOSE THE CHARACTER SET THAT YOU WANT TO HAVE OR WANT TO
HAVE SEEN EXPLICITLY.
SO YOU FIND ALL THE INFORMATION ON OUR WEB SERVER, AND YOU CAN TEST IT
(INAUDIBLE) SERVER.
SO THERE ARE ALREADY POSSIBILITIES TO INTEGRATE IDNS AND WHOIS. IT IS NOT
ONLY DONE BY DOT DE. IT IS DONE BY OTHER REGISTRIES SUCH AS AUSTRIA. I'M
SURE IT IS ALSO DONE BY OTHERS. SO IT'S NOT SOMETHING WHERE WE ARE THE ONLY
ONE, WE HAVE INVENTED IT OR WE ARE PROUD OF IT. BUT IT'S A PRAGMATIC
SOLUTION. I THINK IT'S ALSO DISCUSSED FROM PEOPLE WITHIN THE IETF. I KNOW
MARCUS (INAUDIBLE) IS A MEMBER OF (INAUDIBLE) STAFF AND IS ALSO AUTHOR OF THE
CRISP RFC, HAS DONE A SIGNIFICANT TASK TO INTRODUCE IDN TO CRISP ALSO TO HAVE
A MUCH MORE STABLE AND MUCH MORE DEFINED POSSIBILITY TO ASK FOR ALL THE DATA.
BUT ON THE OTHER HAND SIDE, HE DID ALSO AN INCREDIBLE JOB IN TRYING TO
INCORPORATE ALL THESE POSSIBILITIES WITHIN IDN -- WITH IDN IN THE WHOIS AS WE
ARE SEEING.
WE HAVE DONE IT NOT BECAUSE WE WANT TO BE VERY FANCY OR BECAUSE WE WANT TO BE
VERY DIFFERENT, BUT AS WE INTRODUCED THE IDN AT THE SECOND LEVEL IN DOT DE, WE
ALSO INTRODUCED THE POSSIBILITY FOR OUR CUSTOMERS NOT ONLY TO REGISTER AN
IDN DOMAIN BUT ALSO TO REGISTER THEIR IDNS WITH THEIR REAL NAMES AND REAL
ADDRESSES. WHICH MEANS THAT IF SOMEBODY IS LIVING IN MÜNCHEN, WHICH IS A
TOWN IN GERMANY UNFORTUNATELY SPELLED WITH AN UMLAUT, HE IS REALLY ABLE TO
PUT THIS MÜNCHEN IN HIS ADDRESS RECORD. SIMILARLY HIS NAME.
BECAUSE IN GERMANY IT'S REALLY SEEN AS WE ARE SUPPORTING MOSTLY THE GERMAN
COMMUNITY, IT'S SEEN AS A PROBLEM THAT PEOPLE WHO ARE REGISTERING A DOMAIN
NAME ALWAYS HAVE TO REWRITE THEIR NAME IN FANCY FORMS BECAUSE THE REGISTRY,
AND IT WAS ALWAYS THE REGISTRIES WERE UNABLE TO PROVIDE THEM WITH CHARACTERS
WHICH ACTUALLY ARE USED WIDELY IN GERMANY BY ALL MEDIAS.
SO WE HAVE DONE -- WHEN WE IMPLEMENT IDN -- NOT ONLY THE IDN FOR THE DOMAIN
BUT FOR ALL THE OTHER RECORDS.
OBVIOUSLY FOR ALL THE OTHER ONES WHO WANT TO TRACK THE RECORD, THEY GET THE
INFORMATION IN ASCII, BUT WITH A -- BUT SOME INFORMATION GETS LOST BECAUSE
THE DOTS CANNOT BE ASSIGNED THERE.
YES, I THINK. SO THERE IS A LOT OF INFORMATION OUT THERE. I REALLY WANT TO
URGE YOU NOT TO INVENT THE WHEEL AGAIN. I THINK THERE'S DONE A LOT OF THINGS
ON THE SECOND LEVEL, AND IT'S SHOWN ON THE SECOND LEVEL THAT IT WORKS. I
DON'T SEE ANY TECHNICAL PROBLEM WHY IT SHOULDN'T WORK ON THE ROOT LEVEL. I
DON'T SEE THAT IT ADDS MUCH MORE COMPLEXITY. SO I WOULD REALLY URGE YOU TO
GO AHEAD AND JUST TRY TO FIND, AND SPEAK TO THE PEOPLE WHO HAVE ALREADY
IMPLEMENTED.
>>VINT CERF: THANK YOU.
[ APPLAUSE ]
>>VINT CERF: WE DON'T HAVE TIME TO GO THROUGH THE CASE ANALYSIS WHICH I WOULD
NORMALLY DO BECAUSE IT'S INAPPROPRIATE IN A MEETING LIKE THIS, BUT THE NEXT
OBVIOUS QUESTION WOULD BE VARIOUS WHOIS TOOLS, INTERROGATING YOUR WHOIS
SERVER TO FIND OUT WHAT THEY LOOK LIKE. WHAT HAPPENS WHEN THEY MAKE QUERIES
TO YOUR DATABASE. DO THE TOOLS THAT OTHER PEOPLE ARE USING TO LOOK AT THEM
WORK OKAY?
THIS IS NOT AN ARGUMENT AT ALL AGAINST OUR POINT, WHICH IS TO USE WHAT IS
AVAILABLE. THE POINT IS TO FIND OUT WHAT'S AVAILABLE SO THAT WE KNOW WHAT WE
DON'T HAVE TO REINVENT.
WELL, THANK YOU VERY MUCH, THOMAS, WHEREVER YOU ARE.
IT'S MICHEL SUIGNARD'S OPPORTUNITY NOW TO TAKE US INTO INTERNET EXPLORER 7
LAND, AND THE TREATMENT OF IDNS THERE.
SO MICHEL, IT'S ALL YOURS.
>>MICHEL SUIGNARD: I'M WORKING THE TECHNICAL DETAIL HERE. OKAY. GOOD.
OKAY. IN FACT, I'M NOT GOING TO TALK ONLY ABOUT IE 7. I'M GOING TO TALK,
YOU KNOW, IMPLEMENTATION NOTES FROM PLATFORM VENDORS. BEFORE WE SEE SOME
POINTS ABOUT THE BROWSER. BUT I WANT TO GO A BIT BEYOND THAT. SO EXPLORE, I
WOULD SAY, OR, YOU KNOW, EXPERIENCE AND SHARE WITH YOU WHAT WE DID TO
IMPLEMENT IDN IN THE PLATFORM ITSELF.
IN FACT, I AM ALSO A MEMBER OF THE UNICODE CONSORTIUM. IN FACT, I AM ONE OF
THE TECHNICAL DIRECTORS THERE, SO I AM VERY INVOLVED, IN FACT, IN CHARACTER
ENCODING, ADDING CHARACTERS AND MAKING UNICODE UNSTABLE, IF YOU WANT, TO SOME
DEGREE. BUT THAT'S THE WAY OF LIFE OF CHARACTER ENCODING THAT YOU KEEP
DISCOVERING ADDITIONAL NEEDS FOR COMMUNITIES.
IN FACT, AS A GAME, TO SOME DEGREE, ON THE FIRST SLIDE I ADDED ALL THE CC --
I TOOK THE CC FROM ISO 3166, THE ONES THAT EXIST, THAT ARE REGISTERED TODAY AT
IANA, AND I CREATED BASICALLY A COUNTRY NAME OR A TERRITORY NAME, BECAUSE
IT'S NOT JUST COUNTRIES YOU SEE IN CCS, AND BASICALLY WENT OUT THERE AND
SEARCHED. I MAY HAVE MADE SOME MISTAKES, BUT I HAVE MOST OF THEM.
BASICALLY YOU HAVE THE NAMES. ALL OF THOSE NAMES CAN BE DISPLAYED AS THEY
ARE BY THE PLATFORM. THERE IS NO -- YOU KNOW, WHEN YOU INSTALL THAT ON THE
MACHINE TODAY, YOU GET ALL THOSE NAMES RENDERING ORIGINALLY.
SO WE ARE GETTING THERE. ON THIS, ABSOLUTELY ALL THE CCS ARE IN OUR DATABASE.
THERE ARE ONLY TWO EXCEPTIONS BECAUSE I HAVE TO BE FAIR ON THIS, IS IN FACT
MYANMAR, BECAUSE I DIDN'T HAVE THAT FONT ON MY MACHINE. AND THERE'S FOR THE
MONGOLIAN, I HAD AN ISSUE WITH POWERPOINT 2003. BUT WE ARE GETTING THERE.
WHEN YOU CAN DISPLAY MONGOLIAN CORRECTLY ON THE MACHINE OUT OF THE BOX,
THAT'S PRETTY IMPRESSIVE.
IF YOU HAVE TO INPUT CHARACTERS, THAT'S A COMPLETELY DIFFERENT MATTER BECAUSE
YOU HAVE TO INSTALL THE APPROPRIATE KEYBOARDS AND YOU HAVE TO BE
KNOWLEDGEABLE OF THE LANGUAGE.
SO WHAT IS THE STATUS AT MICROSOFT. AS YOU SEE, WE RECOGNIZE THAT IDNA
PROVIDE, I WOULD SAY, A SOLUTION FOR BOTH CONTENT -- CONTENT I THINK HAS BEEN
SOLVED, OR EVEN MORE SO.
NOW YOU CAN BASICALLY, YOU KNOW, CREATE CONTENT ON MOST OF THE WRITING
SYSTEMS OF THE PLANET WITH MODERN PLATFORMS.
WE'RE NOT THE ONLY ONE.
MOST OF THE EXISTING PLATFORMS EITHER NOW OR IN THE VERY FEW NEXT MONTHS, OR
YEARS AND MONTHS FOR SOME PLATFORMS, SHOULD BE THERE.
WE DO REALLY SUPPORT, AS FAR AS THE CONTENT, MOST OF THE WRITING SYSTEMS THAT
CAN BE USED ON THE PLANET.
THERE ARE OBVIOUSLY SOME MINOR KEYS THAT ARE STILL GETTING THERE.
BUT WE HAVE PRETTY MUCH -- WE HAVE ACHIEVED PRETTY MUCH OVER 95% OF THAT.
THE NEXT FRONTIER WAS -- IT WAS HOW DO YOU POINT TO THE RESOURCE.
THAT'S WHERE WE SEE IDN AND IRI COMING INTO PLAY, AND HERE I SHOW
TWO EXAMPLES, EXAMPLES OF WHAT WILL RESOLVE TODAY, TWO NAMES THAT DO EXIST AND
DO RESOLVE IF YOU HAVE THE APPROPRIATE BROWSER SUPPORT OR PLATFORM
SUPPORT.
IN FACT, THE SECOND ONE IS AN INTERESTING ONE BECAUSE IT DOES INVOLVE
ARABIC AND A MIX OF RIGHT TO LEFT AND LEFT TO RIGHT LABELS.
WE DO PROVIDE LAYERS NOT IN THE BROWSER, BUT IN THE PLATFORM.
SO, YOU KNOW, WE EXPECT EVENTUALLY THAT OTHER BROWSERS WILL PROBABLY USE OUR
SERVICE FOR THE IDNA SERVICES.
WE DO PROVIDE THE (INAUDIBLE) SUPPORT, SO THE TOASCII AND TOUNICODE
FUNCTIONS ARE IMPLEMENTED BY LIBRARIES.
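Illustration (not part of the spoken remarks): a minimal sketch of what ToASCII and ToUnicode do, using Python's built-in "idna" codec, which implements the IDNA 2003 operations from RFC 3490. The helper names `to_ascii` and `to_unicode` and the sample name are assumptions chosen for the example; this is not the platform library the speaker describes.

```python
# ToASCII / ToUnicode sketched with Python's standard "idna" codec (IDNA 2003).

def to_ascii(name: str) -> str:
    """Convert each label of a domain name to its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Convert ASCII-compatible labels (xn--...) back to Unicode."""
    return name.encode("ascii").decode("idna")


print(to_ascii("münchen.de"))            # xn--mnchen-3ya.de
print(to_unicode("xn--mnchen-3ya.de"))   # münchen.de
```

Only the label that contains non-ASCII characters is changed; the ASCII label "de" passes through untouched.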
WE ALSO PROVIDE WHAT I CALL SCRIPT HELPERS, WHICH ARE BASICALLY FUNCTIONS TO
HELP YOU WITH DETERMINING THE MIXED SCRIPT CONCERN THAT YOU HAVE WHEN YOU
IMPLEMENT IDNS.
FOR EXAMPLE, DETECTING IF A GIVEN TEXT CONTAINS MULTIPLE SCRIPTS AND STUFF
LIKE THAT.
SO YOU CAN CREATE MITIGATION SOLUTION FOR MIXED SCRIPT USAGE.
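Illustration (not part of the spoken remarks): a rough sketch of a "script helper" that reports whether a label mixes scripts. Python's standard library does not expose the Unicode Script property, so this sketch assumes the first word of each character's Unicode name is a good enough stand-in. The function names are hypothetical and the heuristic is far cruder than a production implementation.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Return an approximate set of scripts used by the letters in a label.

    Heuristic (an assumption of this sketch): the first word of the
    Unicode character name, e.g. 'LATIN', 'CYRILLIC', 'CJK', 'HIRAGANA'.
    Digits, hyphens and other non-letters are ignored.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def is_mixed_script(label: str) -> bool:
    """True when the letters of the label come from more than one script."""
    return len(scripts_in(label)) > 1


# "paypal" with a Cyrillic 'а' (U+0430) in second position
print(is_mixed_script("p\u0430ypal"))   # True
print(is_mixed_script("paypal"))        # False
```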
IT'S USED, WE SEE, BY IE7 AND THE FORTHCOMING VERSION OF OUTLOOK.
OBVIOUSLY, IT'S NOT -- YOU COULD SAY IT'S NOT THERE YET, IT'S NOT A PRODUCT,
IT'S NOT IN PRODUCTION, BUT, OBVIOUSLY, THERE ARE ALREADY MILLIONS OF USERS
OUT THERE.
SO IDN TO SOME DEGREE, AT LEAST FROM OUR POINT OF VIEW, IS ALREADY BEING USED
QUITE A BIT.
WE DO USE THE USER LOCALE AND THE CUSTOMIZATION TO CUSTOMIZE THE USER
EXPERIENCE.
NOT EVERYBODY WANTS TO ACCESS EVERY LANGUAGE ON THE PLANET.
THEY -- THE MORE YOU EXPOSE YOURSELF TO MORE LANGUAGES, ALSO TO SOME DEGREE,
YOU EXPOSE YOURSELF MORE TO PHISHING ATTACKS.
SO THERE'S A TRADEOFF BETWEEN BEING COMPLETELY OPEN TO EVERY LANGUAGE AND
BEING BASICALLY ABUSED BY A PHISHING OR SPOOFING ATTACKS.
THE SCREEN IS PROBABLY A BIT SMALL, BUT I WILL DESCRIBE WHAT IS ON THAT.
IN FACT, THE -- YOU CAN -- THERE'S A LIST, IT'S PROBABLY BLURRY FROM WHERE
YOU ARE.
BUT YOU CAN BASICALLY DEFINE ON YOUR USER EXPERIENCE IN THE BROWSER WHICH
LANGUAGES YOU ARE FAMILIAR WITH, SO THAT WILL MAKE, BASICALLY, THE EXPERIENCE
WITH THOSE LANGUAGES, YOU KNOW, GOOD FOR YOU.
IF YOU ARE OUTSIDE OF THESE LANGUAGES, BASICALLY, THE LABEL WILL NOT BE SHOWN
AS A NATIVE NAME; IT WILL BE SHOWN AS PUNYCODE.
SO IN THIS MACHINE, FOR EXAMPLE, IT WILL SAY ENGLISH, JAPANESE, FRENCH,
KOREAN, BULGARIAN, AND THAI, WHICH, OBVIOUSLY, WAS A PRETTY WIDE SPECTRUM OF
LANGUAGES.
BUT IT'S LIMITED TO THOSE.
IF YOU GET ANOTHER LANGUAGE OUT OF THOSE -- OTHER THAN THOSE ON THAT LIST,
YOU WILL BASICALLY NOT SEE THE NAME IN ITS NATIVE FORM.
THEN WE CAN ALSO DO A FEW THINGS.
LIKE, YOU CAN ADD AUTOMATICALLY THE SUFFIX. AND THAT'S A JP ON THE DOWN.
SO, BASICALLY, SOMEBODY COULD BE IN JAPAN, ENTERING THE WEB SITE, ENTERING
JAPANESE, WITHOUT GETTING OUT OF THE JAPANESE INPUT METHOD, AND THE BROWSER
WILL AUTOMATICALLY ADD .JP AT THE END.
SO ALL THE ASCII PARTS WILL BE ADDED AUTOMATICALLY TO THE URL.
THAT MAKES IT KIND OF A CONVENIENCE FOR PEOPLE, TO AVOID GOING BACK AND FORTH
BETWEEN THEIR NATIVE LANGUAGE AND SOME ASCII BECAUSE THEY HAVE TO ADD SOME
ASCII TO THE WEB SITE.
IN THE SAME WAY, IN FACT, YOU COULD, OBVIOUSLY, ADD THE IDN TLD AT THE END,
THE NAME FOR CN IN CHINESE.
YOU COULD ALSO, OBVIOUSLY, CREATE A TLD SUFFIX AUTOMATICALLY IF YOU WANT TO.
SO THAT'S JUST, YOU KNOW, ONE JAPANESE SITE THAT WAS IN FACT THE RESULT OF
THE WEB SITE, THE URL WE SAW BEFORE.
OBVIOUSLY SHOWS -- I DON'T HAVE MY (INAUDIBLE) HERE.
BUT ALL THE URLS AND THE LINKS THAT ARE SHOWN ON THE STATUS BAR, EVERYTHING IS,
IN FACT, IN NATIVE ENCODING.
SO JUST, YOU KNOW, THE USERS SEE NORMAL, YOU KNOW, NATIVE CHARACTERS ON --
DOESN'T REALLY HAVE TO DEAL WITH ANY ASCII THERE.
SO PREVENTING HOMOGRAPH SPOOFING ATTACKS IS, TO SOME DEGREE, LIKE WE SAID, WE
PREVENT SCRIPT UNKNOWN TO USER.
YOU KNOW, YOU CAN ALWAYS, OBVIOUSLY, LIKE I SAID, IF YOU SEE ONE OF THEM, YOU
CAN -- YOU KNOW, YOU SEE SOMETHING IN PUNYCODE, IT WILL TELL YOU THAT, AND AS
A USER, IF YOU WANT TO ADD ANOTHER LANGUAGE THAT YOU THINK YOU SHOULD SEE,
YOU CAN ADD THAT TO YOUR LIST OF LANGUAGES THAT YOU WANT TO SUPPORT.
WE DO SUPPORT MIXED SCRIPT TO SOME DEGREE.
SO, FOR EXAMPLE, UNLIKE -- THE GUIDELINES IN ICANN SAY YOU SHOULD NOT DO
MIXED SCRIPT.
OBVIOUSLY, THERE IS SOME EXCEPTION TO THAT.
LIKE, YOU SHOULD SEE IN THE JAPANESE EXAMPLE, YOU HAVE TO SUPPORT, OBVIOUSLY,
THE (INAUDIBLE) SO YOU HAVE TO SUPPORT THESE KIND OF THINGS.
IN FACT, WE ALSO -- WE'VE BEEN WORKING IN SOME DETAILS, LIKE ALLOWING THE
ASCII SET, WITH SOME EAST ASIAN, YOU KNOW, LANGUAGES, BECAUSE IT'S FAIRLY
COMMON IN JAPAN, FOR EXAMPLE, OR EVEN IN MANY OTHER ASIAN COUNTRIES TO MIX
LATIN WITH THEIR SCRIPT.
IT'S KIND OF DANGEROUS.
WE HAVE TO BE CAREFUL ABOUT WHAT WE ARE DOING THERE, BECAUSE, OBVIOUSLY, YOU
DON'T WANT TO DO THAT BETWEEN, LET'S SAY, LATIN AND CYRILLIC, FOR THE REASON
THAT JOHN WAS EXPLAINING BEFORE.
SO, FOR EXAMPLE, WE ALLOW THE FIRST EXAMPLE, WHICH IS AN EXAMPLE WITH
JAPANESE.
AND THE SECOND EXAMPLE IS A MIX OF, IN FACT, CYRILLIC CHARACTERS FOLLOWED BY
SHOP, GAME SHOP THING IN CYRILLIC.
SO IN THAT CASE, WE ONLY SHOW THE SECOND PUNYCODE VALUE FOR THIS.
SO THAT'S AN EXAMPLE OF THAT SITE, FOR EXAMPLE, THAT AS WE SHOW TODAY, WHICH
IS A RESOLVING SITE.
IN THAT CASE, THE CIRCLE, THE RED CIRCLE, IN FACT, IS SHOWN IN PUNYCODE,
BECAUSE IT'S BASICALLY -- WE CONSIDER -- I WOULD SAY THAT -- I MEAN, RISKY,
TO USE JOHN'S TERM.
SO IDN ISSUES, YOU KNOW, WE HAVE, TO SOME DEGREE -- THIS IS MORE A USER
PERSPECTIVE.
IT'S NOT -- I MEAN, THE CLIENT PERSPECTIVE.
IT'S NOT NECESSARILY, LIKE, YOU KNOW PROTOCOL OR A MORE GENERIC ISSUES THAT
JOHN WAS SAYING.
WE HAVE SOME INTERSECTION, OBVIOUSLY, BETWEEN HIS ISSUES AND OUR ISSUES.
YOU KNOW, TODAY, IDN DOES NOT SUPPORT ANY IMPROVEMENT BEYOND UNICODE 3.2.
SINCE 3.2, A LOT OF THINGS WERE DONE, IMPROVING SITUATION OR ENCODING FOR
ETHIOPIC, GREEK, KHMER, LATIN, THERE ARE OBVIOUSLY NEW SCRIPTS THAT HAVE BEEN
ADDED SINCE 3.2.
THIS IS NOT AN EXHAUSTIVE LIST.
THERE ARE MANY MORE THAN THAT.
I DO MENTION TWO OF THEM HERE, OBVIOUSLY, BEING IN AFRICA, N'KO AND TAI LE,
TIFINAGH, WHICH IS USED IN MOROCCO.
THOSE ARE COMPLETELY NONREPRESENTED IN IDN.
YOU CANNOT USE THEM IN IDN.
OBVIOUSLY, THAT'S AN ISSUE FOR COMMUNITIES.
EVEN IF YOU GO TO 5.0, UNICODE 5.0, HAS BEEN MENTIONED BEFORE, WE DO HAVE
SOME NOT -- PRETTY MAJOR REVISION IN FACT IN MYANMAR THAT IS TAKING PLACE
TODAY.
MYANMAR, OR BURMA, IS BASICALLY SOMETHING WHERE YOU CANNOT REALLY DO
A COMPLETELY GOOD JOB AT REPRESENTING IT IN IDN.
YOU WILL NOT BE ABLE TO DO IT BEFORE 5.1 IS, IN FACT, DONE.
AND THEN THERE ARE, YOU KNOW, SOME ISSUES WITH THE WAY NAMEPREP WORKS: IN
FACT, YOU CANNOT REPRESENT SOME WRITING SYSTEMS CORRECTLY IN IDN, BECAUSE THEY
DO REMOVE THE ZERO WIDTH JOINER AND ZERO WIDTH NON-JOINER.
I DON'T WANT TO GET TECHNICAL, BUT JUST ON THE SCREEN YOU CAN SEE DIFFERENCES.
BY STRIPPING THOSE TWO CHARACTERS FROM THOSE NAMES, YOU GET, FOR EXAMPLE, IN
THE SINHALA CASE, THIS IS THE NAME OF SRI LANKA IN SINHALA.
IN FACT, THE NAME OF SRI LANKA AS A COUNTRY NAME CANNOT BE CORRECTLY
REPRESENTED IN IDN.
BECAUSE IN THE LEFT YOU HAVE THE NAME AS CORRECTLY REPRESENTED.
AND IN THE RIGHT, YOU HAVE HOW IT IS WHEN IT'S PROCESSED BY NAMEPREP.
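Illustration (not part of the spoken remarks): a small sketch of the Nameprep "map to nothing" step, using Python's `stringprep` module, which carries the RFC 3454 tables built on Unicode 3.2. The sample string is the commonly used encoding of the Sinhala word "Sri", which contains a zero width joiner (U+200D); treating that as the slide's example is an assumption, and the helper name is hypothetical.

```python
import stringprep

# Sinhala "Sri": SHA (U+0DC1) + AL-LAKUNA (U+0DCA) + ZWJ (U+200D)
#                + RA (U+0DBB) + vowel sign II (U+0DD3)
SRI = "\u0dc1\u0dca\u200d\u0dbb\u0dd3"


def drop_mapped_to_nothing(label: str) -> str:
    """Remove characters listed in RFC 3454 table B.1 (mapped to nothing)."""
    return "".join(ch for ch in label if not stringprep.in_table_b1(ch))


after = drop_mapped_to_nothing(SRI)
print([f"U+{ord(c):04X}" for c in SRI])
print([f"U+{ord(c):04X}" for c in after])   # same list without U+200D
```

The joiner is gone from the second list, so a renderer no longer receives the hint that selects the conjunct form, which is the visual difference described above.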
MYANMAR IS KIND OF A SIMILAR ISSUE, ALTHOUGH TO SOME DEGREE, WE ALREADY KNOW
THAT WE CAN'T SUPPORT MYANMAR FULLY IN IDNS.
AND PERSIAN, IN FACT, USES VERY OFTEN THE ZERO WIDTH NON-JOINER,
I GUESS, TO MAKE SEMANTIC DIFFERENCES BETWEEN NAMES.
SO YOU -- YOU KNOW, SOME DECISION WAS TAKEN THAT, IN FACT, DO CREATE ISSUES
FOR SOME WRITING SYSTEMS.
I NEVER SEE THOSE ISSUES, BUT, YOU KNOW, NO SERIOUS MITIGATION ABOUT THE
SECURITY ISSUES IN IDN.
JOHN MENTIONED, YOU KNOW, THINGS ABOUT SYMBOLS, HAVING MANY SYMBOLS ON --
ALSO HAVING SCRIPT THAT DIDN'T REALLY BELONG IN IDN.
SO, YOU KNOW, AGAIN, I MEAN, I DIDN'T SEE JOHN'S PRESENTATION BEFORE, SO,
OBVIOUSLY, SOME OF THE STUFF IS, YOU KNOW, INTERSECT.
WE THINK THAT WE HAVE TO FIND A MECHANISM AT SOME POINT, MAYBE NOT RIGHT
AWAY, BUT, YOU KNOW, WE HAVE AT LEAST TO THINK ON THE WAY TO EXTEND THE
REPERTOIRE BEYOND WHAT WE HAVE TODAY.
WE HAVE TO DEEMPHASIZE THE ROLE OF THE NAMEPREP.
I THINK NAMEPREP IS REALLY AN ISSUE FOR US, BECAUSE PEOPLE DON'T REALIZE THAT
WHAT REALLY MATTERS IN IDN IS WHAT I CALL THE OUTPUT SET, IS BASICALLY THE
RESULT OF THE NAMEPREP.
BECAUSE THAT'S REALLY WHAT YOU ARE MATCHING.
THIS IS WHAT YOU ARE REALLY -- YOU WANT TO STORE ON YOUR SYSTEM.
THIS IS WHAT YOU WANT TO COMPARE WITH IN YOUR SYSTEM.
SO IT'S -- IT'S VERY IMPORTANT TO REALIZE THAT IDN HAS THIS CONCEPT OF INPUT
SET AND OUTPUT SET WHICH GOES, YOU KNOW, THROUGH THE NAMEPREP TRANSFORM.
AND WE REALLY LIKE, YOU KNOW, WHEN WE WORK AN IMPLEMENTATION, WE REALLY SPEND
MUCH MORE TIME ON WHAT WE CALL THE OUTPUT SET THAN ANYTHING ELSE.
SO WE REALLY WANT THAT PEOPLE FOCUS ON THAT.
BECAUSE THAT'S HOW WE RESOLVE SOME OF THE ISSUES.
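Illustration (not part of the spoken remarks): a sketch of the input set versus output set idea. Different user inputs pass through Nameprep and collapse to one stored form, and it is that output form that gets compared. The sketch relies on Python's built-in "idna" codec (IDNA 2003); the function names are hypothetical.

```python
def output_form(label: str) -> bytes:
    """Run a single label through Nameprep and ToASCII (IDNA 2003)."""
    return label.encode("idna")


def same_name(a: str, b: str) -> bool:
    """Compare two labels on their output form, not on what was typed."""
    return output_form(a) == output_form(b)


print(same_name("MÜNCHEN", "münchen"))   # True: case is folded by Nameprep
print(same_name("münchen", "munchen"))   # False: different output forms
```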
WE SEE THE PROBLEMATIC CHARACTERS, IN OUR OPINION SHOULD BE REMOVED
COMPLETELY.
ON THIS FIRST POINT, WE MAY BE -- RESTRICT SOMEHOW THE USAGE OF CHARACTERS,
YOU KNOW, A BIT BELOW WHAT IS ALLOWED BY IDN.
WE -- SYMBOLS, WE DON'T LIKE SYMBOLS, BECAUSE THEY CREATE A LOT OF CONFUSION,
CONFUSABILITY IN PROTOCOLS, BECAUSE THEY DO INTERSECT -- VISUALLY -- WITH
THE RESERVED CHARACTERS. YOU KNOW, JUST IN A SINGLE IDN LABEL, THE
PUNCTUATION YOU HAVE IS A DOT.
IN FACT, GUESS WHAT, YOU HAVE MANY THINGS THAT LOOK LIKE DOT IN THE -- IN
UNICODE.
SO IT'S REALLY PROBLEMATIC NOW.
YOU COULD HAVE CONFUSION WITH DOTS, OR EVEN THE SLASH OR THIS KIND OF THING.
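Illustration (not part of the spoken remarks): a sketch of one way to flag symbols and punctuation in a label, using Unicode general categories from Python's `unicodedata`. Rejecting every S* and P* category except the hyphen is an assumption of this sketch, not a rule quoted from IDNA or from the speaker.

```python
import unicodedata


def symbol_like(label: str) -> list:
    """List characters whose general category is Symbol (S*) or
    Punctuation (P*), other than the ASCII hyphen."""
    return [
        ch for ch in label
        if ch != "-" and unicodedata.category(ch)[0] in ("S", "P")
    ]


# U+2024 ONE DOT LEADER looks like a full stop; U+2215 DIVISION SLASH like "/"
for sample in ("exam\u2024ple", "a\u2215b", "my-label"):
    hits = [unicodedata.name(ch) for ch in symbol_like(sample)]
    print(repr(sample), hits)
```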
SO BY REMOVING NOT IN MODERN USE SCRIPTS, YOU KNOW -- I THINK WE HAVE A
STRESS -- OLD ITALIC CHARACTERS IN 3.2, BUT WE DON'T NEED THAT. PROBABLY
MOST PEOPLE HERE DON'T EVEN KNOW WHAT THOSE CHARACTERS ARE.
IDN DOESN'T SAY ANYTHING ABOUT NOT USING COMBINING CHARACTERS AS THE FIRST
CHARACTER OF A LABEL.
I MEAN, THAT'S KIND OF SOMETHING THAT IS ALMOST SHOCKING.
I MEAN, BECAUSE YOU -- WHAT HAPPENS IF YOU HAVE A COMBINING CHARACTER?
BECAUSE IN UNICODE, YOU HAVE THIS POSSIBILITY TO COMPOSE CHARACTERS OR
LETTERS, I WOULD SAY, FROM MULTIPLE CHARACTERS, AND YOU HAVE WHAT WE CALL
A BASE CHARACTER THAT IS COMBINING WITH THE FOLLOWING CHARACTERS.
SO NOW IF YOU PUT THE COMBINING CHARACTER FIRST, IT'S NOT GOING TO
COMBINE WITH ANYTHING IN THE LABEL.
IT'S GOING TO START TO COMBINE EITHER WITH A DOT IN FRONT OF IT OR WITH THE HTTP
IN FRONT OF IT OR EVEN WITH THE PREVIOUS LABEL, IN THE WORST CASE.
SO, VISUALLY, IT'S PRETTY BAD.
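Illustration (not part of the spoken remarks): a sketch of the check the speaker says is missing, refusing a label whose first character is a combining mark. It uses the Unicode general category (M*) from Python's `unicodedata`; the function name is hypothetical.

```python
import unicodedata


def starts_with_combining_mark(label: str) -> bool:
    """True when the first character of the label is a combining mark
    (general category Mn, Mc or Me)."""
    return bool(label) and unicodedata.category(label[0]).startswith("M")


# U+0301 COMBINING ACUTE ACCENT would attach to the preceding dot or label
print(starts_with_combining_mark("\u0301abc"))   # True
print(starts_with_combining_mark("abc"))         # False
```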
ONE THING WE COULD DO AS WELL IS WE -- THIS OUTPUT SET, TO MAKE IT A BIT MORE
(INAUDIBLE) WE COULD MAKE IT AS WHAT WE CALL CHARACTER COLLECTIONS IN
ISO 10646.
WE HAVE, IN FACT, BEEN USING THAT MECHANISM FOR OTHER SIMILAR PURPOSES, LIKE,
FOR EXAMPLE, JAPAN IS IN THE PROCESS OF STANDARDIZING THEIR COLLECTIONS AS
(INAUDIBLE) COLLECTION FROM JAPAN, BASICALLY BEING MADE PART OF THE ISO 10646.
JUST FOR RECORD, ISO 10646 IS, IF YOU WANT, THE ISO SIBLING OF UNICODE.
THAT'S THE WAY WE -- YOU KNOW, WE HAVE BEEN ABLE TO HAVE THIS GREAT SUCCESS
BY HAVING, I WOULD SAY, INDUSTRIAL CONSORTIUM SUCH AS UNICODE, YOU KNOW,
WORKING VERY CLOSELY WITH THE ISO BODY AND COMING UP WITH COMMON SOLUTIONS.
IN FACT, BOTH OF THE -- MOST OF THE PEOPLE INVOLVED IN UNICODE ARE ALSO
INVOLVED ON THE ISO SIDE, INCLUDING MYSELF, AND (SAYING NAME) IN THE
AUDIENCE.
AND WE SEE -- I WOULD LIKE TO SEE THE SAME WAY THAT WE HAVE IN IANA TODAY A
LIST OF CHARACTERS ALLOWED FOR, YOU KNOW, A GIVEN REGISTRY -- I MEAN, THEIR
GIVEN TLD, I WOULD LIKE TO SEE THE SAME THING MORE OR LESS, LIKE, AS A -- YOU
KNOW, A GUIDELINE FOR ALL THE GTLDS.
THAT WOULD BE EASY, BECAUSE WE COULD, IN FACT, CREATE, YOU KNOW, A CHARACTER
SET THAT WOULD BE BASICALLY A SUPERSET THAT ALL THE GTLD WOULD LIKE TO USE.
WE COULD DO THE SAME FOR CCS, BUT THE CCS ARE A BIT MORE FREE TO DO THAT KIND
OF THING.
SO WE SEE NO -- ON THAT LIST, YOU WILL HAVE WORLDWIDE NAME SPACE, YOU KNOW,
YOU HAVE ALL THE SCRIPTS THAT MAKE SENSE IN THE IDN CONCEPT.
SO IT'S ALSO MULTISCRIPT.
ON -- YOU KNOW, WITH THE SAME CONSTRAINT THAT WE TALKED BEFORE, BUT MIXING
SCRIPT ON EQUAL ACCESS.
ON THE BASIC WORK, IN FACT, THAT WE HAVE DONE BY THE UNICODE CONSORTIUM WAS
TO CREATE WHAT WE CONSIDER A SAFER LIST.
IT'S NOT SAFE.
YOU CANNOT MAKE SOMETHING 100% SAFE.
BUT I WOULD SAY IT WAS A SAFER LIST.
UNDERSTAND, IT'S PRETTY BIG.
WE ARE VERY FAR FROM ASCII HERE.
WE'RE TALKING ABOUT 37,000 CHARACTERS.
NEVER MIND THAT, OF THOSE, THERE ARE OVER 10,000 JUST FOR HANGUL AND MORE FOR CJK.
PROBABLY MORE LIKE 25,000.
BUT SAY YOU HAVE 12,000 CHARACTERS.
THAT IS A LOT FOR MOST OF THE SCRIPTS THAT YOU HAVE OUT THERE.
SO ON IDN.IDN, I HAVE JUST A SINGLE SLIDE.
WE SEE FOR US, YOU KNOW, FROM OUR POINT OF VIEW, WE SEE IT'S VERY DIFFERENT
FROM ICANN POINT OF VIEW.
IT'S A TRIVIAL THING FOR US TO DO.
WE JUST HAVE TO MAKE SURE THAT WE ARE ABLE TO SUPPORT THE TLD NAME THAT GOES
MUCH BEYOND, YOU KNOW, THE THREE, FIVE-CHARACTER LENGTH, AND MAKE SURE THAT
WE CAN ACCOMMODATE UP TO 63 CHARACTERS.
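Illustration (not part of the spoken remarks): a sketch of the length check implied here. The DNS limit of 63 applies to the encoded (ASCII-compatible) form of a label, so a short native-script name can use noticeably more than its visible length. The sketch relies on Python's built-in "idna" codec, which raises an error for empty or over-long labels; the function name is hypothetical.

```python
def fits_in_dns_label(label: str) -> bool:
    """True when the ASCII-compatible form of a single label is 1..63 octets."""
    try:
        encoded = label.encode("idna")
    except UnicodeError:
        # The codec rejects labels that are too long or otherwise invalid.
        return False
    return 0 < len(encoded) <= 63


for candidate in ("jp", "日本", "a" * 63, "a" * 64):
    print(len(candidate), fits_in_dns_label(candidate))
```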
JOINT DOMAINS ARE KIND OF SCARY FOR US.
BY JOINT, I MEAN DOMAINS THAT BASICALLY ARE ADDRESSING PHYSICALLY THE SAME
RESOURCE.
IT'S FOR DNAME OR WHATEVER, OBVIOUSLY, I HEAR DNAME IS GOING FOR REVISION.
THAT'S NO GOOD FOR US.
WE DON'T LIKE THINGS THAT CHANGE ON US AT THIS POINT.
SO ANYTHING WHERE YOU WOULD HAVE BASICALLY TWO DOMAINS THAT ARE NOT REALLY
THE SAME PHYSICAL DOMAINS BUT THAT ARE SUPPOSED TO BEHAVE THE SAME WAY, THOSE
CREATE CONCERN FOR US FROM A SECURITY POINT OF VIEW.
BECAUSE TODAY YOU CAN DO A LOT OF THINGS WITHIN A SINGLE DOMAIN.
WE HAVE -- YOU DO ALLOW BASICALLY DATA TO FLOW PRETTY FREELY BETWEEN ZONES
INSIDE THE DOMAIN, YOU KNOW, SUBDOMAIN, BASICALLY.
SO IF YOU START TO HAVE, YOU KNOW, SUBDOMAINS THAT ARE BASICALLY THE SAME,
BUT HAVING A DIFFERENT TLD NAME, THAT'S KIND OF A CONCERN.
ANOTHER THING THAT WE SEE, WE HAVE ALSO TO RECOGNIZE THAT MANY NAMES FOR MANY
TLDS ARE VERY OBVIOUS.
YOU KNOW, IF I HAD TO DO JP, I KNOW WHAT JP WILL LOOK LIKE.
I LOOK -- I KNOW HOW MOROCCO WOULD LOOK LIKE.
NEPAL WOULD LOOK LIKE.
THEN YOU HAVE TO DEFINE WHAT YOU DO FOR COUNTRIES LIKE INDIA.
SO I PUT THE 12 REPRESENTATIONS I COULD FIND FOR INDIA, YOU KNOW, USING
DIFFERENT WRITING SYSTEMS.
AND THEN, OBVIOUSLY, GTLD, THAT'S BEYOND ME.
I'M NOT -- THAT WAS DISCUSSED BEFORE.
I'M NOT GOING TO GO THERE, ON OWNERSHIP.
I WAS SHOWING THE MUSEUM IN CHINESE, WHO DOES THAT FOR -- SO IS THAT SOMEBODY
ELSE -- SOMEBODY WILL HAVE TO DECIDE, BUT NOT ME.
SO THE RESOURCES ARE HERE.
YOU KNOW, YOU CAN OBVIOUSLY DOWNLOAD IE7 IF YOU WANT TO DO THAT, IT'S READY
AND IT'S PRETTY STABLE, IN BETA FORM.
A FEW THINGS I WANT TO SHOW QUICKLY.
THAT'S A RESOURCE.
JUST A SECOND.
NO, THAT'S NOT WHAT I WANT TO DO.
SORRY.
OKAY.
SO I DID CREATE THIS DOCUMENT, IN FACT, FOR -- BECAUSE I'M PART OF THE
COMMITTEE ON IDN PAC.
AND I WANTED TO SHOW SOME WORK THAT I DID THAT WAS KIND OF THE REPRESENTATION
OF WHAT WE DID BEFORE.
SO YOU HAVE HERE, YOU KNOW, IT'S, IN FACT, PRETTY REASONABLY EASY TO DO THAT
FOR THE CCTLD, TO COME UP WITH NAMES FOR ALL OF THEM.
THE ONES THAT ARE NON-LATIN REPRESENTATION ON -- I GUESS I'M RUNNING LOW ON
BATTERY HERE, BUT SHOULD BE ENOUGH.
OKAY.
SO YOU CAN SEE, YOU CAN DO PRETTY MUCH ALL OF THEM, YOU KNOW.
SO THAT'S -- AND THIS IS, YOU KNOW, SHOWN ON A REGULAR MACHINE WITHOUT
ANY ADDITIONAL FONTS.
SO IT'S ALL THERE.
ON INDIA, I WAS SHOWING BEFORE, YOU CAN EVEN INCREASE, BOTH HEBREW AND
ARABIC.
OBVIOUSLY, I'M NOT THE ONE TO DECIDE.
BUT FROM A TECHNICAL VIEW, IT'S PRETTY EASY TO SOLVE.
WE HAVE CAMBODIA, SRI LANKA, AS I SAID.
SO I DID SHOW BY THE WAY, THE INPUT AND OUTPUT, WHAT WOULD BE THE TYPICAL
INPUT, WITH SOME UPPER CASING ON -- THE OTHER SIDE IS BASICALLY AFTER
NAMEPREP.
SO THAT CONCLUDES, YOU KNOW, MY PRESENTATION ON -- OBVIOUSLY, I WILL TAKE
QUESTIONS IF THERE ARE ANY.
THANK YOU.
[ APPLAUSE ]
>>VINT CERF: THANK YOU VERY MUCH, MICHEL.
AFTER LOOKING AT THAT TABLE, IT OCCURRED TO ME THAT I WOULD HOPE THERE WOULD
BE AN ALTERNATIVE TO HAVING COMPLETE COUNTRY NAMES SHOWING UP AS TLDS.
I NOTICE HOW MUCH PEOPLE LOVE TO HAVE SHORT E-MAIL ADDRESSES, AND THEY
PROBABLY WOULD LIKE TO HAVE SHORT IDN TOP-LEVEL DOMAINS, TOO.
THE FINAL PAPER IN THIS, OR PRESENTATION IN THIS SECTION IS ON INTERNATIONAL
E-MAIL, OR I-EMAIL.
AND IT'S DR. LIANG, WHO IS STANDING RIGHT BEHIND ME.
>>M.C. LIANG: CHAIRMAN AND EVERYONE, I AM MING-CHENG LIANG, REPRESENTING
TWNIC. AND BEFORE I BEGIN THIS TALK, I THINK I JUST LET YOU KNOW THIS IS A
STATUS REPORT OF OUR I-EMAIL WORK.
AND LOTS OF THE TECHNICAL PARTS OF THIS WORK WERE LED BY JOHN, WHO HAPPENED
TO BE HELPING OUR GROUP TO DO THE JOB.
AND ALSO FOR THIS EAI STATUS, WE HAVE A JOINT EFFORT FROM JPRS, KRNIC AND
CNNIC TO DO THE WORK.
WHAT I AM DOING IS TO PRESENT WHAT WE HAVE DONE UP TO NOW, AND WHAT PROBLEMS
WE HAVE.
IN CASE THERE'S SOMETHING THAT I CANNOT ANSWER, MAYBE JOHN CAN HELP ME OUT.
THE OUTLINE HERE IS THAT I FIRST TALK ABOUT THE PROBLEM WE HAVE ENCOUNTERED,
AND THEN SOME OF THE HISTORICAL DEVELOPMENT OF HOW THIS I-EMAIL WORKS AND
SOME MILESTONES AND EAI ROAD MAP AND THE TWNIC, AND THEN, FINALLY, I WILL
PRESENT OUR TEST BED.
AND THIS IS STILL PRIMITIVE, TO SOME EXTENT, BUT I THINK WE HAVE TESTED NOW
SOMEBODY'S WORK AS WE PLAN TO.
WHY THIS IS NECESSARY IS THAT I THINK ONE OF THESE PROBLEMS IS
ENCOUNTERED BECAUSE THE FORMAT OF THE LOCAL PART OF THE E-MAIL IS NOT
NECESSARILY DEFINED CLEARLY IN MANY WAYS.
SO I THINK THAT CAUSE SOME PROBLEM.
AND IN OUR CASE, IN OUR CASE, WE WOULD NEED TO BE ABLE TO SUPPORT SOMETHING
LIKE CHINESE CHARACTER IN THE TRADITIONAL -- I MEAN, IN THE LOCAL PARTS, AND
THEN MAYBE IT'S IDN DOMAIN NAME AND THEN THE -- UNDER OUR .TW.
AND MAYBE WE HAVE TO -- WE WOULD NEED TO SUPPORT SOMETHING LIKE ASCII
CHARACTER, AND THE DOMAIN NAME, IDN DOMAIN NAME, AND OUR .TW. AND IN CASE
THE GLOBAL ROOT IDN IS -- I MEAN, IS APPLIED, THEN WE NEED TO BE ABLE TO
SUPPORT SOMETHING AT THE ROOT LEVEL ALSO ON THIS IDN ABBREVIATION.
SO IN THIS CASE, WE HAVE -- AT LEAST WE WILL NEED TO SUPPORT, DUE TO OUR
CIRCUMSTANCES, WE HAVE, IN ADDITION TO THE TRADITIONAL CHINESE IN OUR CASE,
THAT WE ALSO HAVE SOME OTHER, LIKE, ABORIGINAL LANGUAGE AND OTHER TYPE OF
LANGUAGE.
AND THEY MAY BE USED IN A DIFFERENT TYPE OF IDN THAN WE NEED TO SUPPORT IT.
SO THIS IS THE FORMAT WE NEED TO BE ABLE TO SUPPORT.
MAYBE WE CAN SUPPORT A CERTAIN IDN AND THEN THE DOMAIN NAME AND THEN ASCII.
AND MAYBE THE IDN HYBRID WITH ASCII, AND ADD THIS IDN DOMAIN NAME AND THEN
ASCII.
AND THEN IT'S POSSIBLE THAT WE WILL ALSO NEED -- IN OUR CIRCUMSTANCES, WE
NEED TO BE ABLE TO SUPPORT IDN OF A DIFFERENT TYPE.
AND SO THIS IS SOMETHING THAT WE HAVE TO SUPPORT NOW.
NOW, IN CASE THE IDN ROOT HAS COME INTO EFFECT, THEN OUR PROBLEM BECOMES EVEN
MORE COMPLICATED, AS WE WILL SHOW IN HERE.
SO WE MIGHT HAVE TO SUPPORT A DIFFERENT PART OF THE PROBLEM.
AND SO THE MOST IMPORTANT PART IS THAT DURING THE TRANSMISSION, IT'S POSSIBLE
THAT SOME OF THESE PARTS MIGHT BE MISINTERPRETED, BECAUSE THEY ARE SENDING IT
TO A DIFFERENT MACHINE FOR A SIMPLE PURPOSE.
MAYBE IT'S A SPAM FILTER, MAYBE IT'S A FIREWALL, AND MAYBE IT'S SOMETHING
ELSE.
AND SO WE CANNOT SEE -- IN ONE OF THESE EXAMPLES, WE CAN SEE A SCENARIO THAT
SOME MAIL MAYBE NEEDS TO IMPLEMENT RELAY FUNCTIONS USING A PERCENTAGE SIGN ON
LOCAL PARTS.
AND SO THIS KIND OF INFORMATION WHEN YOU SEND IT UP, IT IS MORE -- YOU NEED
TO BE ABLE TO IDENTIFY THAT PART IS ACTUALLY ANOTHER MACHINE.
AND THEN YOU CHANGE YOUR LOCAL PARTS TO RELAY THE MESSAGE TO ANOTHER PART.
AND IN CASE THIS INFORMATION WAS GARBLED OR SOMEHOW CHANGED DURING THE
PROCESS, THEN THAT MIGHT BE A PROBLEM.
AND THIS IS SOME CASE THAT YOU MIGHT NEED THAT RELAY ALSO.
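Illustration (not part of the spoken remarks): a sketch of the percent-sign relay convention mentioned above, in which a relay reads `user%final-host@relay-host` and forwards to `user@final-host`. The convention is a historical, relay-local practice rather than a protocol rule, and the function and host names here are hypothetical. It shows why the local part must arrive at the relay unaltered: the relay has to find the `%` separator in it.

```python
def relay_rewrite(address: str) -> str:
    """Rewrite 'user%next.example@relay.example' to 'user@next.example'.

    Addresses without '%' in the local part are returned unchanged.
    """
    local, _, _relay_domain = address.rpartition("@")
    if "%" not in local:
        return address
    # The rightmost '%' separates the user from the next-hop host.
    user, _, next_host = local.rpartition("%")
    return f"{user}@{next_host}"


print(relay_rewrite("user%final.example@relay.example"))   # user@final.example
print(relay_rewrite("user@relay.example"))                 # user@relay.example
```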
AND I'LL GIVE YOU A BRIEF HISTORY OF THIS WORK.
BEFORE THE EAI WORKING GROUP, WE HAVE A CDNC MEETING TALKING ABOUT THIS IDN
E-MAIL PROBLEM, AND THEN WE HAVE A COLLABORATE IN THE CDNC AND JET MEETINGS.
AND FINALLY WE WORK ON THIS EAI WORK GROUP AND ON THIS PROBLEM.
AND THIS WAS LED BY JOHN AND SOME OF THE MAJOR PLAYERS ARE THE TWNIC, JPRS,
KRNIC, AND CNNIC. AND MOST IMPORTANT WORK IN THIS IS TO DEFINE THE I-EMAIL
STRUCTURE, ENCODE THE FRAMEWORK, SMTP EXTENSION, UTF8 HEADER, AND SMTP
LANGUAGE.
AND SO THIS IS THE GOAL UP TO NOW, WE HAVE COMPLETED THE DRAFT ON THE SUPPORT
OF THIS I-EMAIL PARTS.
AND FOR US, THE MOST IMPORTANT PART THAT WE HAVE DONE IN THIS CASE IS ON THE
HEADER PART.
WE HAVE DONE THE HEADER FOR THIS WORK.
AND LET ME GIVE A BRIEF OVERVIEW OF THIS SOLUTION.
THE SMTP CLIENT WILL HANDSHAKE WITH AN SMTP
SERVER TO CHECK WHETHER THE EXTENSION OF
SMTP IS SUPPORTED BY THE SERVER.
IF IT IS, THEN THE INTERNATIONAL E-MAIL ADDRESS WILL BE SENT.
IF NOT, IT WILL BE DOWNGRADED TO AN ASCII E-MAIL ADDRESS, COMPATIBLE WITH
RFC 2821 AND 2822.
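Illustration (not part of the spoken remarks): a sketch of the negotiate-or-downgrade decision just described. The talk does not name the extension keyword, so the default `SMTPUTF8` (the keyword later standardized in RFC 6531) is an assumption, as are the function name and the idea of passing in a pre-arranged ASCII alternative address for non-ASCII local parts. The EAI drafts under discussion may differ in detail.

```python
def choose_envelope_address(server_keywords, address,
                            ascii_alternative=None, extension="SMTPUTF8"):
    """Pick the address to put on the SMTP envelope.

    server_keywords: extension keywords the server advertised after EHLO.
    address: the internationalized address, e.g. 'user@münchen.de'.
    ascii_alternative: optional all-ASCII fallback supplied by the sender.
    """
    advertised = {k.upper() for k in server_keywords}
    if extension.upper() in advertised:
        return address                      # server accepts it as-is

    local, _, domain = address.rpartition("@")
    if local.isascii():
        # Only the domain needs downgrading: convert it with ToASCII.
        return local + "@" + domain.encode("idna").decode("ascii")
    if ascii_alternative is not None:
        return ascii_alternative            # assumed fallback mechanism
    raise ValueError("non-ASCII local part and no ASCII alternative")


print(choose_envelope_address({"8BITMIME"}, "user@münchen.de"))
# user@xn--mnchen-3ya.de
```

The local part is never transformed by the sender in this sketch, which matches the point made later in the session that only the receiving server may interpret it.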
AND SO THE TWNIC HAS DRAFTED THESE HEADER DOCUMENTS, AND ALSO, MOST
IMPORTANTLY, WE ARE INVOLVED IN THIS TEST PLAN AND MODIFY THIS SENDMAIL
SOFTWARE TO IMPLEMENT EAI WG DRAFT.
AND TO DEVELOP A PLUG-IN FOR THE MUA TO SEND, RECEIVE, AND DISPLAY I-EMAIL
ADDRESSES.
IF YOU NEED MORE INFORMATION, YOU CAN CHECK ON THESE SITES.
I'LL JUST SHOW A BRIEF EAI TEST BED.
IN THIS CASE, WE NEED -- THE I-EMAIL USER WOULD NEED TO BE ABLE TO ENTER AND
DISPLAY THE CHARACTERS OF HIS LANGUAGE IN THE E-MAIL ADDRESS.
AND THE I-EMAIL USER IS ABLE TO STORE THE I-EMAIL ADDRESS IN THE ADDRESS BOOK AND
USE "REPLY" WITHOUT DESTROYING THE ADDRESS.
IF THE I-EMAIL SOLUTION REQUIRES KEEPING EXTRA INFORMATION AROUND FOR AN
ADDRESS IN SOME CASES, THE USER IS CAPABLE OF MANIPULATING THE INFORMATION,
INCLUDING STORING THE INFORMATION IN HIS OR HER ADDRESS BOOK.
AND SO WE NEED TO MODIFY THE SENDMAIL SOFTWARE TO SUPPORT THE I-EMAIL
SPECIFICATION, TO HAVE THE I-EMAIL SMTP EXTENSION, UTF8 HEADER AND DOWNGRADE
CAPABILITIES.
AND THE SCENARIOS WE TEST WILL BE TWO DIFFERENT CASES.
ONE IS TWO I-EMAIL USERS, ONE SENDING MAIL FROM ONE I-EMAIL ADDRESS TO ANOTHER.
AND THEN MAYBE THREE I-EMAIL USERS, ONE SENDING A MESSAGE TO BOTH PEOPLE AND
REPLYING TO ALL THE RECIPIENTS ON THE LIST.
AND SO THIS IS MAYBE THE SCENARIOS.
THE I-EMAIL USER SENDS TO ONE ASCII, AND SO THIS WILL BE HYBRID CASES.
AND THIS IS A CASE THAT WE SHOW THAT THE TEST BED DEMONSTRATIONS.
SUPPOSE THE FIRST ONE, THAT IF THE SERVER SUPPORTS THE SMTP EXTENSION AND
THE HEADER CORRECTLY, THEN THE MAIL IS SENT THROUGH WITH THE CORRECT
INTERNATIONALIZED IDN AND ADDRESS.
AND IF NOT, THEN THE DOWNGRADE WILL BE DONE, AND THEN A MESSAGE WILL BE SENT
USING THIS PUNYCODE.
AND THEN THIS WILL BE SOMETHING THAT WE HAVE FOUND, AND WITH THE HYBRID, IT
WAS CORRECTLY ENCODED IN THIS CASE.
THANK YOU FOR YOUR ATTENTION.
[ APPLAUSE ]
>>VINT CERF: THANK YOU VERY MUCH. I HOPE WE WILL TEST A LOT MORE COMPLEX
SCENARIOS THAN THE ONES THAT YOU JUST SHOWED US, LIKE FORWARDING MAIL AROUND
AND ATTACHMENTS AND SO FORTH.
BECAUSE ALL OF THOSE WILL BE CHALLENGING, I'M SURE.
>>JOHN KLENSIN: VINT, I HAVE SOME REALLY GOOD NEWS FOR YOU AND REALLY GOOD
NEWS FOR EVERYONE IN THE AUDIENCE. THIS ONE IS NOT AN ICANN PROBLEM
[ LAUGHTER ]
>>VINT CERF: MICHEL.
>>MICHEL SUIGNARD: YEAH, MY QUESTION WAS ABOUT THE REPERTOIRE. YOU MENTIONED
YOU ARE USING UTF-8, SO I WAS CURIOUS ABOUT THE REPERTOIRE YOU WERE GOING TO
ALLOW FOR THE LOCAL PART.
JUST TO BE CLEAR, THE UTF-8 IS JUST AN ENCODING SCHEME. IT DOESN'T DETERMINE
-- BASICALLY YOU COULD REPRESENT UNICODE 3.2, 5.0, WHATEVER, OR A SUBSET.
I AM JUST BASICALLY ASKING IS THIS 3.2, 4.0, 5.0, A SUBSET? WHATEVER.
BECAUSE WHY I AM ASKING THIS IS OBVIOUSLY FOR AGAIN SECURITY ISSUE ON
SPOOFING, ON ALL THOSE GOOD THINGS. MY HIDDEN AGENDA HERE IS TO GET THE
SUBSET AS SMALL AS POSSIBLE.
>>M.C. LIANG: RIGHT NOW I THINK WHAT WE HAVE IS FOR THE CHINESE CHARACTERS
-- NO? MAYBE JOHN CAN ANSWER.
>>JOHN KLENSIN: VINT, YOU SHOULD DECIDE HOW FAR YOU WANT THIS DISCUSSION TO
GO, BECAUSE IT ISN'T AN ICANN AGENDA ITEM AND I CAN TALK ABOUT IT FOR HOURS.
BUT THE TRADITIONAL MODEL FOR E-MAIL, AND HENCE ONE OF THE REASONS WHY SOME
OF THIS COMPLEXITY IS NECESSARY, IS THAT NOTHING IS PERMITTED TO INTERPRET A
LOCAL PART BESIDES THE RECEIVING SERVER.
AND E-MAIL TRANSPORTS HAVE TO BE COMPLETELY TRANSPARENT TO ALL EXPRESSIBLE
LOCAL PARTS UNTIL IT GETS TO THE RECEIVING SERVER.
SO WHILE THIS IS STILL VERY MUCH UNDER DISCUSSION, THE TREND IN THE IETF
DISCUSSIONS RIGHT NOW IS TO LEAVE THIS UNRESTRICTED AS FAR AS THE PROTOCOL IS
CONCERNED, PRESERVING THAT RECEIVING SERVER MODEL, BUT TO THEN TURN AROUND
AND GIVE ADVICE TO PEOPLE WHO CONFIGURE MAIL STORES ABOUT WHAT ADDRESSES THEY
PERMIT, JUST AS WE HAVE HAD ADVICE FOR YEARS BUT IT HAS BEEN ONLY ADVICE,
THAT CONFIGURING YOUR MAIL STORE TO BE CASE SENSITIVE IS A BAD IDEA, THAT
PUTTING SPACES IN E-MAIL ADDRESSES IS A BAD IDEA, AND A WHOLE LOT OF OTHER
THINGS ON THAT LIST.
SO MY SUSPICION, IF CURRENT TRENDS CONTINUE, IS THE WAY THIS WILL WORK OUT IS
THERE WILL BE NO REQUIREMENT IN THE PROTOCOL ABOUT WHAT CAN BE SENT, BUT THAT
THOSE WHO OPERATE MAIL STORES AND CHOOSE THE ADDRESSES THAT PEOPLE USING
THOSE MAIL STORES CAN USE, WOULD BE WELL ADVISED IN THE INTEREST OF
INTEROPERABILITY TO BE AS CONSERVATIVE AS THEY POSSIBLY CAN BE ABOUT THE
CHARACTERS THEY USE.
>>VINT CERF: JOHN, LET ME ASK IF THERE ARE QUESTIONS THAT PEOPLE WOULD LIKE
TO ASK OF THE THREE PRESENTERS. WE DON'T HAVE A LOT OF TIME FOR IT, BUT I
WANT TO AT LEAST FIND OUT IF WE HAVE ANY BURNING QUESTIONS.
I DON'T SEE ANYONE LEAPING SKYWARD, OH, I'M SORRY.
SEBASTIEN, DO YOU WANT TO TAKE THE MICROPHONE OVER THERE? I DIDN'T LOOK FAR
ENOUGH TO MY LEFT, MY APOLOGIES.
>>SEBASTIEN BACHOLLET: (SPEAKING FRENCH).
AS WE SPEAK ABOUT IDN, MAYBE IT WOULD BE INTERESTING TO BE ABLE TO SPEAK
DIFFERENT LANGUAGES.
MY QUESTION IS DON'T YOU THINK THAT ALL THAT IS A MESS? WE -- WE HAVE IN
FRENCH AN EXPRESSION. WE SAY WHY TO MAKE IT SIMPLE WHEN IT COULD BE DONE
VERY -- WITH A LOT OF DIFFICULTIES.
CAN'T WE TRY TO GO OUTSIDE OF THOSE BOX AND TRY TO SEE THERE ARE NO WAY -- NO
OTHER WAY TO GO? FOR EXAMPLE, WE HAVE TODAY'S -- I MAKE A SIMPLIFICATION, TWO
DOORS TO GO TO THE DOMAIN NAMES. WE HAVE THE CCTLD AND WE HAVE THE GTLDS.
WE CAN DISCUSS MAYBE THERE ARE MORE DOORS. THOSE DOORS COULD BE SPLIT IN
DIFFERENT.
CAN'T WE LEAVE THOSE DOORS AND OPEN A THIRD ONE WITH LINGUISTICAL TLDS, AND
JUST START WITH THAT KIND OF THINKING?
I AM NOT TECHNICAL AT ALL, AND I START REALLY -- OR AT THE END, I DON'T
UNDERSTAND ANYTHING OF ALL THE DISCUSSION AT THE TECHNICAL LEVEL, BUT MAYBE
IT COULD BE A GOOD IDEA TO BE THINKING OUT OF THE BOX FOR THAT SUBJECT, TOO.
THANK YOU.
>>VINT CERF: I SUSPECT THAT THERE ARE A LOT OF PEOPLE THAT WISH THIS COULD BE
SIMPLER.
THERE IS A VERY FAMOUS QUOTE FROM A MAN NAMED EINSTEIN WHO, WHEN ASKED ABOUT
THE COMPLEXITY OF HIS THEORY, SAID THAT THINGS SHOULD BE AS SIMPLE AS
POSSIBLE, BUT NO SIMPLER.
THERE WAS AN ATTEMPT, IN FACT THERE WERE SEVERAL ATTEMPTS, TO THINK OUTSIDE
OF THE BOX. IN FACT, I EVEN WROTE SOME NOTES HERE SAYING COULD WE FIND A WAY
TO DECOUPLE UNICODE FROM DNS IN ORDER TO HAVE -- THIS WOULD BE AN OVER-DNS
KIND OF CONSTRUCT.
THERE ARE LOTS OF POSSIBILITIES, BUT THEY HAVE THE UNFORTUNATE PROBLEM THAT
THEY DON'T INTERWORK VERY WELL WITH THE ALREADY EXISTING SYSTEMS. SO PART OF
THE PROBLEM WOULD BE TO ANALYZE THE OPTIONS AND DECIDE HOW MUCH BREAKAGE YOU
ARE WILLING TO ACCEPT IN THE COURSE OF TRYING TO SIMPLIFY.
I SEE WE HAVE ANOTHER PERSON. IS IT -- PLEASE GO AHEAD. I'M SORRY, IT'S
HOTTA, YES?
>>HIRO HOTTA: YES, THANK YOU. THANK YOU FOR REMEMBERING MY NAME. MY NAME IS
HIRO HOTTA, AND I HAVE A QUESTION TO MICHEL.
I'D LIKE TO KNOW THE MOST RECENT SITUATION ABOUT HOW UNICODE CONSORTIUM
THINKS AND HOW IE 7 IS IMPLEMENTED.
I UNDERSTAND WHY IE 7 DISPLAYS THE IDN IN XN DASH DASH FORM, WHEN THE STRING
CONSISTS OF CHARACTERS FROM TWO OR MORE SCRIPTS. IT'S REGARDED AS UNUSUAL AND
DANGEROUS, I THINK.
IT'S FROM UTR-36; RIGHT?
HOWEVER, FOR EXAMPLE, IN JAPAN, WORDS CONSISTING OF MIXED USE OF ENGLISH
LETTERS AND JAPANESE LOCAL CHARACTERS, SUCH AS KANJI, ARE VERY USUAL, AND
EVEN ALLOWED TO BE OFFICIALLY REGISTERED AS COMPANY NAMES TO THE GOVERNMENT
OFFICE.
CONSIDERING THIS, .JP ALLOWS REGISTRATION OF IDNS THAT CONSIST OF MIXED USE
OF ENGLISH ALPHABETS AND (INAUDIBLE) AND/OR KANJI.
HOW ARE SUCH STRINGS DISPLAYED IN COMING IE 7 WHEN THEY ARE USED AS DOMAIN
NAMES?
ARE THEY STILL DEALT AS DANGEROUS AND DISPLAYED IN XN DASH DASH FORM? I THINK
MIXED USE OF ENGLISH ALPHABETS AND LOCAL CHARACTERS WHICH ARE NOT LATIN BASED
ARE NOT SO DANGEROUS.
THANK YOU.
>>MICHEL SUIGNARD: TO SOME DEGREE, WE ARE FOLLOWING THE ICANN GUIDELINES THAT
SAY THAT YOU ARE NOT SUPPOSED TO MIX SCRIPTS IN LABELS. SO YOU SEE WE HAD
TO LOOK A BIT MORE IN DETAIL TO SOLVE THE ISSUE THAT IS CREATED BY THE FACT
THAT IT'S TRUE THAT IN MANY CULTURES, INCLUDING, AS WE SEE, JAPAN, YOU DO
FIND A MIX OF LATIN OR ROMAN CHARACTERS WITH THE LOCAL WRITING SYSTEM.
I MEAN, YOU CAN SEE, WE SEE A LOT OF JAPANESE MAJOR NAMES, YOU KNOW, IN FACT
WITHIN LATIN TEXT, SANYO, JVC, THESE KIND OF THINGS.
SO THEY DO MIX THEM IN (INAUDIBLE) CASE.
SO I CAN TELL YOU WE DID LOOK AT THE ISSUE. WE HAVE BEEN CHANGING. THAT'S
WHAT BETA PRODUCTS ARE FOR. YOU CAN TAKE, YOU KNOW, CONSTITUENCY AND END
USER FEEDBACK INTO ACCOUNT.
WHAT I DID, I DID WORK ON THE TABLE MYSELF, WHERE I CREATED A LIST OF
SCRIPTS THAT I FELT WERE SAFE WITH ASCII.
INSTEAD OF THE OPPOSITE, BY THE WAY. I DIDN'T SAY WHICH ONES WERE UNSAFE
WITH ASCII. I SAID WHICH ONES WERE SAFE. SO IF YOU ARE INSIDE THAT LIST,
YOU ARE OKAY.
THAT DOES INCLUDE, AS A -- I'M NATURALLY NOT THE FINAL DECISION, BUT THAT
INCLUDES CJK, SO IT INCLUDES THE JAPANESE WRITING SYSTEM. IT DOES INCLUDE
MOST OF THE EAST ASIAN WRITING SYSTEMS. ANYTHING IN INDIA, ARABIC, ANYTHING
LIKE THAT. IT DOES NOT INCLUDE CYRILLIC, GREEK. EVEN SOME YOU WOULDN'T
NECESSARILY THINK ARE CONFUSABLE, BUT I THINK THEY ARE, LIKE ARMENIAN IS
CONFUSABLE TO SOME DEGREE. GEORGIAN IS CONFUSABLE.
BECAUSE DEPENDING ON THE FONT YOU USE, THAT'S THE SORT OF ISSUE WE HAVE. YOU
CANNOT COMPLETELY CONTROL THE FONTS THAT ARE USED TO CONFUSE THE USER.
WE DO HAVE GOOD CONTROL OF WHAT IS USED FOR THE UI, BUT IT IS VERY DIFFICULT
FOR US TO CONTROL WHAT IS USED ON THE WEB PAGE, FOR EXAMPLE.
SO WE HAVE TO BE CAREFUL. THAT'S WHY IT'S NEVER 100% SAFE, BUT WE FELT
BASICALLY THERE WAS A BALANCE BETWEEN USABILITY, ESPECIALLY IN A CASE LIKE
JAPAN, SO WE HAD TO KIND OF OPEN THE GATE A LITTLE BIT.
SO YEAH, WE SHOULD ALLOW WHAT YOU SAID.
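Note: The display rule described above (show a label in its native form only when any non-ASCII script it mixes with ASCII is on a "safe with ASCII" list, otherwise fall back to the xn-- form) can be sketched roughly as follows. This is a minimal illustration, not the browser's actual logic: the contents of `SAFE_WITH_ASCII`, the helper names, and the trick of guessing a script from the first word of the Unicode character name are all assumptions made for the example, and the raw `punycode` codec is used without the full IDNA preparation steps.

```python
import unicodedata

# Hypothetical "safe with ASCII" list, loosely based on the scripts named
# above; the actual table used by any browser is not given in this session.
SAFE_WITH_ASCII = {"CJK", "HIRAGANA", "KATAKANA", "HANGUL", "ARABIC",
                   "DEVANAGARI", "TAMIL", "THAI"}


def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    if ch.isascii():
        return "ASCII"
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def display_form(label):
    """Return the label as typed, or its xn-- form if the mix looks unsafe."""
    scripts = {script_of(ch) for ch in label}
    others = scripts - {"ASCII"}
    mixes_ascii = "ASCII" in scripts and bool(others)
    if mixes_ascii and not others <= SAFE_WITH_ASCII:
        # Unsafe mix (for example Latin plus Cyrillic): show the ASCII form.
        return "xn--" + label.encode("punycode").decode("ascii")
    return label
```

With this sketch, a label mixing kanji and ASCII letters is shown as typed, while a label mixing ASCII with a Cyrillic look-alike falls back to the xn-- form. It only models mixing with ASCII, which is the case discussed in the question.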
>>VINT CERF: YES, GO AHEAD, TINA.
>>TINA DAM: I JUST WANTED TO MAKE A FOLLOW-UP ON THAT, MICHEL, FOR YOU. THE
IDN GUIDELINES JUST TO BE CLEAR ARE FOR THE GTLD OPERATORS AND NOT THE CCS.
SO FOLLOWING THE GUIDELINES FOR IMPLEMENTATION MAY CAUSE SOME DIFFICULTIES
FOR SOME OF THE CC OPERATORS.
IN ADDITION TO THAT, WHEN IT COMES TO MIXING OF SCRIPTS IN LABELS, THE
GUIDELINES WERE CAREFULLY WRITTEN UP TO MAKE SURE THAT CASES
SUCH AS WHAT HIRO HOTTA IS TALKING ABOUT ACTUALLY CAN BE ALLOWED BY FOLLOWING
THE IDN GUIDELINES AT THE SAME TIME.
>>VINT CERF: WELL, WE ARE GOING TO MOVE ON NOW TO THE SECOND PART OF THE
PROGRAM. ACCORDING TO MY SCHEDULE, MOUHAMET DIOP IS GOING TO MAKE AN
INTRODUCTION TO AFRICAN MULTILINGUAL ACTIVITIES.
I WAS LOOKING AROUND THE ROOM AND -- OH, THERE. HE IS HIDING IN THE BACK.
HELLO, MOUHAMET.
WHILE MOUHAMET IS MAKING HIS WAY UP HERE, JUST A REMINDER THAT THOSE OF YOU
WHO ARE GOING TO THE RECEPTION TONIGHT SHOULD BE PREPARED TO LEAVE AT 7:45 TO
TAKE THE BUS TO THE RECEPTION.
WE WILL TRY TO FINISH THIS SESSION UP BY 6:45. THAT WILL GIVE YOU AN HOUR OF
TIME TO PREPARE.
>>MOUHAMET DIOP: OKAY. THANK YOU VERY MUCH, VINT.
I HAVE BEEN ASKED TO TALK A LITTLE BIT ABOUT IDN AND THE DIFFERENT INITIATIVE
THAT WE HAVE IN AFRICA.
WE GOT A CHANCE TO HAVE A FIRST PRESENTATION THAT IS REALLY THE TECHNICAL ONE
THAT SHOW ALL THE DIFFICULTIES TO IMPLEMENT IT.
SO I'M GOING TO GO TO THE OTHER PARTS; THAT IS, TO JUST ASK MYSELF WHAT CAN
BE DONE, AND WHY WE HAVE TO MOVE ON ONE PART TO MAKE THESE THINGS HAPPEN.
IT'S REALLY IMPORTANT TO REMIND PEOPLE THAT WHAT ARE OUR CORE PRINCIPLE, THE
FOUNDING CORE PRINCIPLE OF ICANN, AND WHY THE WORLD IS NOT UNIQUE, AND WHY WE
HAVE A DIVERSE WORLD WITH DIFFERENT SENSIBILITY, DIFFERENT LANGUAGES,
DIFFERENT GROUP.
AND GIVE SOME IDEA ABOUT HOW THE WORLD IS STRUCTURED. I THINK IT'S VERY
IMPORTANT FOR PEOPLE TO GET THIS. AND AT THE END I WILL GIVE SOME EXAMPLE OF
THE IMPLEMENTATION AND WHAT HAPPENED IN THE AFRICAN CONTINENT. WE GOT A VERY
DIVERSE INITIATIVE AND IT'S VERY IMPORTANT THAT WE CAN SHARE IT.
IF WE REMIND THE FOUNDING PRINCIPLE FOR ICANN, WE FIND MANY WORD THAT CAN
HELP US JUST KEEP IN MIND THAT WE ARE REALLY CONCERNED ABOUT THE STABILITY OF
THAT INFRASTRUCTURE, THE DNS INFRASTRUCTURE. BUT WE REALLY CARE ABOUT
COMPETITION. AND THE THIRD BULLET IS ABOUT CHOICE, AND CHOICE IS, FOR ME, A
RELIGION, BECAUSE WE ARE NOT JUST ONE GROUP. WE ARE SEVERAL GROUP. WE ARE
DIFFERENT CONSTITUENCY, DIFFERENT STAKEHOLDERS. AND ONE OF THE CORE
PRINCIPLE IS TO ALLOW THEM TO THAT CHOICE.
WE HAVE A BOTTOM-UP COORDINATION AND REPRESENTATION, LIKE WHATEVER WE'RE
TRYING TO DO IS TO PUSH FOR AN INCLUSIVE APPROACH, NOT AN EXCLUSIVE APPROACH.
WE ARE 6.4 BILLION INHABITANTS WORLDWIDE, AND 1 BILLION INTERNET USERS. I
JUST RECALL WHAT VINT USUALLY SAYS, THAT 1 BILLION PEOPLE, IT MEANS WE HAVE
LEFT SOMETHING LIKE FIVE POINT (INAUDIBLE) BILLION USERS. AND IF YOU LOOK AT
THE DIFFERENT COUNTRIES AND WHAT THEY REPRESENT, AFRICA REPRESENTS 14%. IT'S
NOT PEANUTS. IT'S SOMETHING. IT'S ABOUT 800 MILLION PEOPLE.
AND IF YOU HAVE A LOOK AT LANGUAGES WORLDWIDE, WELL, THE NUMBERS CAN MAKE
PEOPLE REALLY FEEL AFRAID BECAUSE WE ARE TALKING ABOUT 6,000 LANGUAGES
WORLDWIDE. BUT WE KNOW THAT 97% OF THESE LANGUAGES ARE (INAUDIBLE) TO
DISAPPEAR. IT MEANS IT IS REALLY GOING TO BE A DISASTER AND A LOSS FOR THE
WHOLE COMMUNITY, AND I SEE SOME PEOPLE WHO HAVE DEVOTED TO MAKE THESE
LANGUAGES PART OF THE UNICODE DATABASE, AND MICHAEL EVERSON IS WORKING,
ALWAYS TRYING TO MAKE SOME FORGOTTEN LANGUAGES IN AFRICA TO BE REGISTERED IN
THE UNICODE CONSORTIUM.
MY CONCERN IS NOT NECESSARILY THESE FORGOTTEN LANGUAGES. WE ARE TALKING
ABOUT EXISTING LANGUAGES THAT PEOPLE ARE USING, BUT DUE TO SOME LACK OF WE'RE
LEFT BEHIND, AND WE'RE TRYING TO CATCH UP, AND WE'RE NOT PART OF IT.
WE TRY TO MAKE IT.
JUST TO GIVE YOU SOME IDEAS ABOUT THE DISTRIBUTION OF THE LANGUAGES, IT'S
REALLY IMPORTANT TO KEEP THAT IN MIND. I AM JUST GOING TO GIVE SOME CHARTS TO
SEE HOW THE LANGUAGES ARE DISTRIBUTED ALL AROUND THE WORLD.
SO THE FRENCH LANGUAGES IN THE WORLD ARE THAT WAY.
ENGLISH SPEAKING COUNTRIES.
ARABIC.
CHINESE.
SPANISH.
RUSSIAN.
WHAT DOES IT MEAN?
WE'VE GOT A VERY DIVERSE ENVIRONMENT. WE'VE GOT MANY LANGUAGES. VERY FEW
HAVE BEEN CODIFIED IN THE UNICODE.
WHY I'M TALKING ABOUT THE UNICODE CONSORTIUM, BECAUSE I AM FROM AFRICA AND I
KNOW THAT WE'RE REALLY LEFT BEHIND, AS USUAL. BUT MANY OF THE AFRICAN
LANGUAGES HAVE NOT BEEN IN THE UNICODE DATABASE.
SO WE ARE REALLY TRYING TO WORK HARD IN ORDER TO MAKE A PROMOTION OF THE
MULTILINGUALISM AND AN AFRICAN PRESENCE ON THE INTERNET.
I MEAN, PART OF THIS OBJECTIVE IS RELATED TO THE IDN THING, BUT THE OTHER
PART OF THAT OBJECTIVE HAVE NOTHING TO DO WITH ICANN RESPONSIBILITY BECAUSE
IT'S ABOUT AFRICAN CONTENT ON THE NET, AND THIS IS NOT NECESSARILY THE PLACE
WHERE WE HAVE TO DISCUSS ABOUT THAT ISSUE.
BUT ONE OF THE GOOD NEWS THAT I WANT TO SHARE WITH YOU IS THE AFRICAN UNION,
THAT HAS THE RESPONSIBILITY TO SHARE, I MEAN, THE BEHAVIOR OF OUR CONTINENT,
HAS DECIDED THAT 2006 IS THE YEAR OF AFRICAN LANGUAGES. IT MEANS THAT THEY
DECIDED TO DO SOMETHING REGARDING THE AFRICAN LANGUAGES. AND THEY HAVE
NOMINATED ONE PERSON WHO IS REALLY CLOSE TO ICANN, WHO IS A FRIEND OF ICANN,
WHO COMES TO SEVERAL MEETINGS OF ICANN AND WHO SHARE WITH US HIS VIEW AND HIS
BELIEF AND HIS STRONG COMMITMENT TO MAKE THE AFRICAN LANGUAGES PRESENT ON THE
NET. SO HE HAS BEEN NOMINATED AS THE PERMANENT SECRETARY OF THAT INITIATIVE.
THIS IS GOOD NEWS.
SO WHAT HAPPENED IN THE ADAPT? SO WE HAVE DIFFERENT INITIATIVES. SOME OF
THEM ARE UNDER THE AFRICAN UNION INITIATIVE. SOME OF THEM ARE FROM PRIVATE
SECTORS. AND SOME OTHERS ARE JUST SOME INDIVIDUALS WHO TRY TO MAKE THINGS
HAPPEN, LIKE BISHARAT, DON OSBORN, OR MICHAEL WHO IS FROM IRELAND, BUT
USUALLY CLOSER TO AFRICAN PEOPLE THAN EUROPEAN PEOPLE. WE GOT ALSO THE MINC,
WHICH IS REALLY INVOLVED.
SO I JUST GIVE SOME EXAMPLE OF MEETINGS WHERE PEOPLE TRIED TO DEFINE
SOMETHING, LIKE A WAY TO MOVE FORWARD.
AND I WANT TO SHARE WITH YOU WHAT HAPPENED AT SOME OF THESE MEETINGS. FOR
EXAMPLE, TWO OF THEM, ON THE ADDIS ABABA MEETING, IT WAS IN JUNE 2006. IT
WAS DEFINED THAT THIS YEAR IS THE 2006 YEAR FOR AFRICAN LANGUAGES.
AND LAST YEAR IN SEPTEMBER 2005, IN DAKAR, WE HAD MORE THAN 40 EXPERTS
COMING FROM ALL AROUND THE WORLD, COMING FROM AFRICA. WE HAD ICANN
REPRESENTATIVES. WE GOT MANY EXPERT LINGUISTS AND TECHNOLOGISTS. WE GOT ALSO
MANY FRIENDS WHO JUST HELPED TO MAKE THAT EVENT HAPPEN IN DAKAR.
AND THE INITIATIVE WAS JUST TO SEE WHAT IS THE DIFFERENT STEP THAT WE HAVE TO
ACHIEVE TO MAKE THE COMPUTERIZATION OF AFRICAN LANGUAGES BECOME A REALITY IN
AFRICA.
AND PART OF THE REALITY IS TO START TESTING IN TWO CCTLDS IN AFRICA TO HELP
THEM GET THE KNOWLEDGE ON HOW WE CAN IMPLEMENT IDN. AND I AM REALLY GLAD TO
HEAR, DURING THAT MEETING, THAT SOME EXPERIMENTATION HAVE GONE REALLY WELL.
SOME OF THEM ARE STILL LEARNING HOW TO MAKE IT. AND I THINK THAT WHAT WE
HAVE REALLY IN FRONT OF US AS AN ICANN ENVIRONMENT, WE HAVE TWO PATHS TO DEAL
WITH THE REALITY.
ONE PATH IS IF ICANN DID NOT DO IT OR IF ICANN IS REALLY CARING A LOT ABOUT
WHAT'S HAPPENING IF WE DID IT, PEOPLE WILL DO IT BUT IT WILL BE A MESS FOR
THE WHOLE COMMUNITY.
THE OTHER PATH IS IF ICANN SAY, OKAY, SINCE ALREADY HAVE BEEN IMPLEMENTED NOT
THE RIGHT WAY, WE TRY TO LEAD BECAUSE IT IS OUR ENVIRONMENT, IT IS OUR
RESPONSIBILITY TO TAKE THE EXPANSION OF THE DOMAIN NAME SPACE, EVEN IF YOU
KNOW THAT WE ARE GOING TO A COMPLEX PROCESS BUT IT'S OUR RESPONSIBILITY AND
WE NEED AND WE SHOULD AND WE HAVE TO MOVE THESE PEOPLE ALL ALONG THE PATH.
IT WOULD BE A DIFFERENT WAY, BECAUSE I HAVE HEARD THAT, WELL, SOME PEOPLE ARE
REALLY HAPPY THAT WE STICK THE SYSTEM AS IT IS RIGHT NOW. LIKE KEEP THE DNS
AS IT IS.
IT CAN BE A WAY TO DO IT. BUT WE HAVE SEEN THAT SOME OTHER PEOPLE ARE ALSO
TAKING THEIR OWN RESPONSIBILITY TO SAY IF THE DNS CANNOT INTEGRATE MY NEEDS,
SO I WILL GO OUT ON MY OWN PATH. AND THAT'S NOT REALLY WHAT MAKE THE ICANN
COMMUNITY AS A WHOLE.
SO BISHARAT.NET IS AN EXAMPLE. THEY ARE DOING A LOT OF THINGS IN AFRICA, IF
SOMEBODY WANTS TO GET SOME EXPERIENCE ABOUT WHAT THEY DID IN AFRICA IN
LANGUAGE LOCALIZATION.
YOU CAN ALSO MEET SOME OF THE EXPERIENCE REGARDING THE N'KO LANGUAGES THAT
DEAL MUCH MORE WITH COMMUNITIES THAN COUNTRIES BECAUSE AFRICAN LANGUAGES ARE
NOT BOUND TO COUNTRIES. AFRICAN LANGUAGES ARE BOUND TO COMMUNITIES, AND
COMMUNITIES ARE ACROSS BOUNDARIES.
SO SOMETIME WE TAKE A LANGUAGE, IT IS FOR FOUR COUNTRIES. SOMETIMES WE TAKE
A LANGUAGE, IT IS INSIDE A COUNTRY.
SO -- AND MY MESSAGE LAST DAY, MAYBE I WAS MISUNDERSTOOD, BUT ICANN IS
DEALING MORE WITH COMMUNITIES THAN WITH COUNTRIES. THAT'S -- IT MIGHT BE
SEEN AS JUST A RESTRICTION OF OUR MANDATE, BUT WE ARE MORE FOCUSED ON
COMMUNITIES. OUR CONSTITUENCY, OUR COMMUNITIES OF USERS WHO HAVE THE SAME
INTERESTS, THAT IS THE WAY I DESCRIBE ICANN.
EVEN THE CC, I MEAN THE CCTLD ENVIRONMENT IS ENVIRONMENT OF COMMUNITY USERS.
IT'S NOT THE COUNTRY, BECAUSE THE COUNTRY IS REPRESENTED BY A GOVERNMENT AND
THE GOVERNMENT HAVE THE MANDATE TO DECIDE WHATEVER HAPPEN IN THE COUNTRY.
AND SO THAT'S REALLY A CONTINUATION OF OUR MANDATE, IS TO REALLY DEAL WITH
COMMUNITY NEEDS AND SEE WHAT CAN BE DONE IN ORDER TO MAKE THINGS HAPPEN.
AND ONE OF THE CASABLANCA STATEMENT IS A MEETING THAT HAPPENED HERE IN
MOROCCO LAST YEAR. IT WAS ORGANIZED BY DIFFERENT COUNTRIES TO DEAL WITH THE
SAME ISSUE. MULTILINGUALISM, WHAT CAN BE DONE, WHAT WE ARE EXPECTING FOR
PEOPLE TO DO, WHAT RESPONSIBILITY WE HAVE TO TAKE TO MAKE THINGS HAPPEN.
AND IN THAT LANGUAGE, I THINK THAT OUR RESPONSIBILITY IS DEALING WITH THE
IDN.
THE MULTILINGUALISM ASPECT IS NOT OUR RESPONSIBILITY. MAKE THE CONTENT, MAKE
THE AFRICAN CONTENT, MAKE PEOPLE REALLY -- I MEAN, GET ACCESS ON IT. THAT'S
NOT OUR RESPONSIBILITY.
BUT IF WE DID OUR JOB WELL, I MEAN WE ARE WELL SET. AND ASK OTHER PEOPLE TO
DO THEIR JOB.
MICHAEL EVERSON WHO IS HERE AND REALLY RELATED TO ALL THE INITIATIVE WE ARE
DOING IN ICANN RELATING TO THIS IDN ISSUE ALSO HAVE MANY MATERIALS AVAILABLE
ON HIS WEB SITE. SO PEOPLE WHO WANT TO SEE WHAT HAVE BEEN DONE FOR AFRICA,
HE CAN GO THERE AND SEE THE INITIATIVE ON THAT SITE.
THE ETHIOPIAN INITIATIVE ALSO IS REALLY WELL SAID. THEY ARE DOING A LOT IN
THAT WAY. WE HAVE MANY INITIATIVE, NOT NECESSARILY WELL COORDINATED. AND
PART OF WHAT WE ARE TRYING TO DO IS -- I KNOW WE DON'T HAVE ENOUGH TIME TO GO
THROUGH ALL THESE THINGS BUT WHAT I AM TRYING TO DO IS SHARE WITH YOU THE
TIMETABLE THAT WE HAVE DEFINED IN AFRICA. BASED ON THE PRIVATE SECTOR
INITIATIVE, WHERE I CAN GIVE THE NAME OF THE (INAUDIBLE) TRYING TO COORDINATE
THE IMPLEMENTATION OF THIS ACTION UNDER THE AFRICAN UMBRELLA.
IT'S PIERRE OUEDRAOGO THAT MANY OF YOU HAVE KNOWN THAT WAS REALLY INVOLVED IN
THE CCNSO. PIERRE DANDJINOU, HE IS ALSO INVOLVED IN THE DIFFERENT ACTIVITIES
OF ICANN AND HE IS ALSO PRESENT ON THE PIR BOARD ALSO. HE'S AN AT-LARGE
BOARD MEMBER, I THINK, TOO.
SO MYSELF, HE IS TRYING TO -- I'M GOING TO SKIP QUICKLY ON THE PROJECT.
THIS IS JUST TO GIVE YOU ROUGHLY AN IDEA ABOUT WHAT HAPPENED IN SENEGAL LAST
YEAR AND THE DIFFERENT PEOPLE WHO WERE PRESENT THERE. JUST TO GIVE YOU AN
IDEA ABOUT HOW MANY COUNTRIES WE'RE ABLE TO GET FOR THAT MEETING.
WE GET LINGUISTS THERE. WE HAVE PEOPLE FROM MADAGASCAR, FROM FRANCE, WE'VE
GOT PEOPLE FROM IRELAND, PEOPLE FROM THE UNITED STATES, WE GOT PEOPLE FROM
ICANN.
SO MAURITANIA. JUST TO LET YOU KNOW IT WAS AN INITIATIVE. THE GOAL WAS WE
DON'T HAVE AN EXACT IDEA ABOUT HOW MANY LANGUAGES WE GET IN AFRICA. THIS IS
THE FIRST STEP.
(INAUDIBLE) THE LANGUAGE, THEY SAY, WE'RE SORRY, THIS SENTENCE HAS NOT BEEN
DONE WELL. WE SAY WE ARE GOING TO DO IT AND WE'RE GOING TO HELP YOU GET IT.
THE SECOND THING IS IN THESE LANGUAGES, HOW MANY HAVE ALREADY BEEN IN THE
UNICODE DATABASE.
IF YOU KNOW HOW MANY, WE KNOW, I MEAN, WHAT WE NEED TO DO IN ORDER TO
COMPLETE THE COMPUTERIZATION OF AFRICAN LANGUAGE.
AND WHEN THIS IS DONE, WE ARE GOING TO WORK CLOSELY WITH THE COMPANY LIKE
MICROSOFT OR OTHERS JUST TO MAKE -- HELP THESE LANGUAGES BE PART OF THE
LANGUAGE PRESENCE ON THE TOOLS THAT WE ARE USING IN ORDER TO ACCESS THE
INTERNET.
AND THE SECOND PART OF THE PROJECT IS RELATED TO THE IDN. WE SAY THAT IT'S
REALLY A COMPLICATED ISSUE. WE NEED TO LEARN FIRST. WE HAVE TO GO THROUGH
THAT PATH.
CAN WE HAVE TWO CCTLD AVAILABLE TO BE JUST LIKE A PILOT FOR US, AND WE'RE
GOING TO IMPLEMENT AT THE SECOND LEVEL IDN.
AND WE HAVE SENEGAL WHO SAY THEY VOLUNTEER FOR THAT.
WE'RE EXPECTING TO GET ANOTHER COUNTRY WHO WILL ACCEPT TO BE THE SECOND IN
ORDER TO GET AN IMPLEMENTATION DONE.
SO HERE IS THE AFRICAN PROJECT.
WE HAVE TO GET THE FIRST FUNDING PART OF THE PROJECT THAT HAS HELPED US DO
THE -- THE STEERING COMMITTEE THAT WAS THE TECHNICAL SESSION IN DAKAR.
WE'RE GOING TO START VERY SOON THE IMPLEMENTATION PHASE OF THAT PROJECT.
BUT IT TOOK US ONE YEAR, I MEAN, FROM THE BEGINNING OF THAT INITIATIVE.
WE ARE NOT ASHAMED, BECAUSE IT TOOK ICANN MORE THAN FOUR YEARS TO DISCUSS
ABOUT IDNS, SO WE SAY THAT'S FINE.
FOR THAT TIME, WE ARE NOT OVER SCHEDULE.
AND IF I THINK ABOUT HOW LONG IT HAS TAKEN AFRICA TO SET UP AFRINIC THAT IS
REALLY NOW VERY SUCCESSFUL, I DID NOT -- I'M NOT DISCOURAGED TO THINK THINGS
HAPPEN IN THE NEXT FUTURE.
SO OUR PARTNERS WHO HELP US MAKE THESE THINGS HAPPEN ARE ISOC, AFILIAS, AND
ALL THE OTHERS WHO ARE PART OF THE WHOLE COMMUNITY.
AND WE'RE EXPECTING FOR OTHER PEOPLE TO JOIN ALSO THAT INITIATIVE.
BECAUSE ALL THE INDIVIDUALS WHO ARE INVOLVED IN THE PROJECT ARE JUST, WE CAN
SAY, SPONSORED, BECAUSE THEY ARE NOT PAID FOR WHAT THEY ARE DOING.
AND I THINK WE ARE LOOKING FORWARD FOR AFRICA TO CATCH UP ON THAT IDN
REVOLUTION.
THANK YOU VERY MUCH.
[ APPLAUSE ]
>>VINT CERF: I DO HAVE ONE QUESTION FOR CLARIFICATION, MOUHAMET.
CAN YOU SAY JUST A LITTLE MORE ABOUT THE SCOPE OF THE PROJECTS?
ARE YOU FOCUSED ONLY ON GETTING THINGS INTO THE DOMAIN NAME SYSTEM?
OR ARE YOU ALSO TRYING TO WORK WITH THE APPLICATIONS THAT ARE GOING TO USE
THE IDNS TO MAKE SURE THAT THEY ARE COMPATIBLE?
>>MOUHAMET DIOP: IT WAS -- I MEAN, THESE ARE THE TWO OBJECTIVES, IS TO HELP
ON THE COMPUTERIZATION OF THE LANGUAGES, BECAUSE WE HAVE SEEN THAT THE
LANGUAGES THAT HAVE ALREADY DONE THE JOB, IN MY COUNTRY, WE HAVE 16 NATIONAL
LANGUAGES THAT HAVE BEEN DECLARED.
BUT THESE LANGUAGES ARE NOT IN THE UNICODE DATABASE.
IT MEANS YOU HAVE NO WAY TO FIND THEM USING YOUR COMPUTER, JUST STARTING
TYPING A MESSAGE.
SO THIS IS A PART OF THE PROJECT.
THE SECOND PART OF THE PROJECT IS, PEOPLE ARE TALKING A LOT ABOUT AFRICAN
PRESENCE ON THE NET.
BUT PART OF THE CONSTRAINT IS ALSO IF YOU ALLOW THEM, AS IN AFGHANISTAN,
WHERE THEY DO SOME LOCALIZATION, AND MY -- THE IDEA OF PUSHING FOR THE
PROJECT COME FROM THE AFGHANISTAN PROJECT THAT HAS BEEN LED BY THE UNDP.
AND MICHAEL EVERSON WAS THE ONE WHO DID THE JOB WITH THEM.
THEY HAVE THREE DIFFERENT LANGUAGES THAT ARE REALLY CLOSE TO MY ENVIRONMENT
LANGUAGES, AND THEY HAVE SOME LOCALIZATION TO DO.
AND HE DID IT WELL.
AND THEN THEY SHOW IN THE PAPER THAT IN THE AFGHANISTAN CYBER CAFE, WOMEN CAN
COME AND GET INTO THE CYBER WITHOUT KNOWING ANY ENGLISH WORDS AND START
READING THE NEWSPAPERS, SENDING MESSAGES.
AND I SAY, WOW, IT IS JUST EXACTLY WHAT WE ARE EXPECTING TO HAPPEN FOR THE
AFRICAN ENVIRONMENT.
BECAUSE WE'RE PUSHING FOR CYBERS, WE'RE PUSHING FOR INTERNET CONNECTIVITY.
BUT IF WE DID NOT DEAL WITH THE AFRICAN CONTENT, WITH THE AFRICAN REQUIREMENT
TO GET THE LANGUAGE INSIDE THE INTERNET, NOTHING WOULD HAPPEN.
SO THAT'S -- THAT'S WHY WE FOCUS ON THE FIRST PART OF THE PROJECT ON THAT
WAY.
THE SECOND ONE IS, WE KNOW THAT IDN IS IMPORTANT.
WE KNOW THAT MANY OTHER EXPERIMENTATION HAVE BEEN DONE IN OTHER PARTS OF THE
WORLD, LIKE IN ASIA-PACIFIC, WE SEE THE CJK WORK, HOW THEY WORK, HOW DID THEY
MAKE THE STANDARD HAPPEN, HOW THEY PARTICIPATE TO THE IETF WORK.
SO WE SAY, WE HAVE ENGINEERS IN AFRICA, BUT THEY DON'T HAVE A SPACE TO
COORDINATE THEIR TECHNICAL EFFORT.
SO WE'RE GOING TO GIVE THEM THAT SPACE.
SO THE AFTLD THAT IS THE NEW ORGANIZATION FOR THE AFRICAN CCTLD, THEY HAVE A
PROJECT AND THEY WILL JOIN TO WORK TOGETHER IN ORDER TO HAVE SOMETHING DONE
THAT WILL BE REQUIRED FOR THE ENVIRONMENT WE ARE HAVING.
SO THESE ARE THE TWO PARTS OF THE PROJECT, AND WE'RE REALLY EXPECTING TO GET
MORE PEOPLE INVOLVED IN THE JOB. BUT AS I SAID, WE, AS INDIVIDUALS, ARE TOO
MONO TASKED AND MONO PROCESSOR MACHINES.
SO WHENEVER WE TRY TO HANDLE SEVERAL THINGS AT THE SAME TIME, IT WON'T BE ON
THE BENEFIT OF THE WHOLE COMMUNITY.
SO WE NEED REALLY TO BE A MULTITHREAD AND MULTITASKING ENVIRONMENT AND TO GET
OTHER PEOPLE TO JOIN IN ORDER TO MAKE THIS HAPPEN.
>>VINT CERF: THANK YOU, MOUHAMET.
I THINK THE INITIATIVE TO GET CONTENT ONTO THE NET IN THESE LANGUAGES IS
VERY, VERY VALUABLE, BECAUSE WITHOUT THAT, WHY WOULD YOU BOTHER HAVING ALL OF
THE LANGUAGES IN PLACE?
SO GOOD IDEA.
THANK YOU, MOUHAMET.
WE MOVE ON NOW TO THE NEXT PRESENTATION, WHICH SPEAKS NOW TO THE ARABIC
EXPERIENCE IN A PILOT DOMAIN NAME PROJECT.
THIS IS ANAS MOHAMMED ASSIRI.
AND HE'S FROM THE SAUDINIC.
>>ANAS MOHAMMED ASSIRI: MR. CHAIRMAN, LADIES AND GENTLEMEN, GOOD AFTERNOON,
ALL.
IT'S A PLEASURE TO STAND IN FRONT OF YOU TODAY AND TRYING TO GIVE YOU AN IDEA
ABOUT THE ARABIC DOMAIN NAMES PILOT PROJECT.
MY NAME IS ANAS MOHAMMED ASSIRI, AND I AM REPRESENTING SAUDINIC.
FIRST OF ALL, I'LL GIVE YOU A QUICK REVIEW, THE MAJOR POINTS OF THIS
PRESENTATION.
I WILL BE GIVING AN INTRODUCTION ABOUT SAUDINIC AND AN INTRODUCTION ABOUT THE
PROJECT AND WHY WE NEED IT.
AND THEN I WILL MOVE ON TO THE PROJECT IN DETAIL, TRYING TO DESCRIBE THE
GOALS AND OBJECTIVES OF THAT PROJECT, ABOUT PARTICIPANTS AND DURATION, THE
PHASES AND DELIVERABLES, THE ACCOMPLISHMENTS AND WHAT NEEDS TO BE DONE.
THEN THE LESSONS LEARNED THROUGH THIS PROJECT, WHICH INCLUDES THE OBSTACLES
WE FACED AND THE COMMON PROBLEMS.
FINALLY, OUR RECOMMENDATIONS.
ABOUT SAUDINIC.
SAUDINIC IS A NONPROFIT ENTITY THAT IS OPERATED BY KING ABDULAZIZ CITY FOR
SCIENCE AND TECHNOLOGY IN SAUDI ARABIA.
SAUDINIC IS IN CHARGE OF ADMINISTERING THE DOMAIN NAME SPACE UNDER .SA.
SAUDINIC IS ALSO LEADING THE LOCAL COMMUNITY EFFORT TOWARD ADAPTING THE
ARABIC LANGUAGE IN THE INTERNET.
IT ALSO COORDINATES WITH REGIONAL AND INTERNATIONAL ENTITIES IN ORDER TO
REPRESENT THE LOCAL COMMUNITY'S NEEDS.
ABOUT THE PROJECT.
THE MAIN GOAL OF THE PROJECT IS TO IMPLEMENT A TEST BED FOR ARABIC DOMAIN
NAMES IN THE ARABIC WORLD.
THIS WILL ALLOW ALL ARAB COUNTRIES TO EARLY EXPERIENCE THE USE OF ARABIC
DOMAIN NAMES AND TO IDENTIFY THEIR NEEDS, AGREE ON STANDARDS, AND LOCATE
POSSIBLE PROBLEMS.
AND DEVELOP REQUIRED TOOLS AND POLICIES.
I AM EMPHASIZING HERE THAT THE PROJECT IS NOT COMMERCIAL AND IT HAS BEEN
INITIATED BY NONPROFIT ORGANIZATIONS.
WHY DO WE NEED ADN?
CURRENT ASCII-BASED DOMAIN NAMES ARE INCAPABLE OF REPRESENTING ARABIC
CHARACTERS.
IT IS DIFFICULT TO REACH ARABIC SITES USING ENGLISH DOMAIN NAMES.
THERE ARE PRONUNCIATION AND SPELLING PROBLEMS.
A FULLY ARABIC DOMAIN NAME WILL ENCOURAGE ARAB USERS TO WIDELY USE THE
INTERNET.
SOME STATISTICS ABOUT ARABIC USE OF INTERNET OR ARAB INTERNET USERS.
THE POPULATION OF THE ARAB WORLD IS ABOUT 5% OF THE WORLD POPULATION.
ARAB INTERNET USERS REPRESENT ABOUT 1% OF THE WORLD USERS.
LESS THAN 10% OF THEM CAN SPEAK ENGLISH.
SOME OBSTACLES FACING INTERNET USE IN THE ARAB WORLD.
THE MAJOR OBSTACLE IS THE LANGUAGE BARRIER, ESPECIALLY IN THE CONTENT, TOOLS
AND APPLICATIONS, AND IN DOMAIN NAMES.
ALONG WITH THAT, SOME OBSTACLES ARE LOW LEVEL OF TELECOMMUNICATION
INFRASTRUCTURE, THE LACK OF ADEQUATE REGULATIONS, THE HIGH COST OF INTERNET
CONNECTIVITY IN THE ARAB WORLD, AND COMPUTER ILLITERACY.
SOME FACTS ABOUT ARABIC LANGUAGE.
ARABIC LANGUAGE IS SPECIAL AND DIFFERENT THAN OTHER LANGUAGES.
IT CONSISTS OF 28 CHARACTERS.
THE WRITING DIRECTION IS FROM RIGHT TO LEFT.
DIACRITICS ARE USED FOR PRONUNCIATION, WHICH LEAD TO DIFFERENT MEANINGS AS
WELL.
THERE ARE TWO SETS OF NUMERALS USED IN ARABIC LANGUAGE, THE ARABIC AND THE
HINDI.
ABBREVIATION IS NOT COMMON IN ARABIC LANGUAGE.
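Note: Two of the facts above (diacritics that change pronunciation, and two sets of numerals) matter when comparing names, because the same word can be typed with or without diacritics and a number can be typed in either digit set. The sketch below folds such variants to one form before comparison. The presentation does not say whether the pilot project applied folding of this kind, so the function name `fold_label` and the policy it encodes are assumptions for illustration only.

```python
import unicodedata


def fold_label(label):
    """Drop combining marks and map any decimal digit to its ASCII digit."""
    folded = []
    for ch in label:
        if unicodedata.combining(ch):
            # Arabic diacritics such as fatha or sukun are combining marks.
            continue
        digit = unicodedata.decimal(ch, None)
        folded.append(str(digit) if digit is not None else ch)
    return "".join(folded)


# The same word typed with and without diacritics folds to one form,
# and Arabic-Indic digits fold to the digits 0-9.
vocalized = "\u0645\u064e\u062f\u0652\u0631\u064e\u0633\u0629"
plain = "\u0645\u062f\u0631\u0633\u0629"
same_name = fold_label(vocalized) == fold_label(plain)
```

A registry could apply a fold of this kind before checking whether a requested name is already taken. Which characters to fold or reject is a policy choice that a character-set definition would have to settle.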
ONE PROPOSED SOLUTION FOR ARABIC LANGUAGE IS USING MIXED SCRIPTS IN THE
DOMAIN NAME.
THIS DOES NOT SOLVE THE PROBLEM, SINCE THE USER WILL HAVE TO WRITE THE DOMAIN
NAME IN TWO DIFFERENT DIRECTIONS.
THE GOALS AND OBJECTIVES OF THE PROJECT IS TO ESTABLISH AND IMPLEMENT ARABIC
DOMAIN NAMES, TO INCREASE THE INTERNET USE IN THE ARAB WORLD BY MAKING THE
INTERNET EASIER TO USE FOR NATIVE ARABIC SPEAKERS, TO GAIN EXPERIENCE AND
KNOWLEDGE OF USING ARABIC DOMAIN NAMES AND SHARE IT WITH THE INTERNET
COMMUNITY, TEST THE IMPLEMENTATIONS OF ARABIC DOMAIN NAMES BASED ON THE
GUIDELINES DRAFTED BY ARABIC TEAM FOR DOMAIN NAMES, DEVELOP NECESSARY TOOLS
REQUIRED FOR ARABIC DOMAIN NAMES AND DNS, AND PRESENT THE RESULTS OF THIS
PROJECT TO THE INTERNET WORLD.
PARTICIPANTS.
ALL MEMBERS OF ARAB LEAGUE ARE INVITED TO PARTICIPATE IN THIS PROJECT.
DURATION.
THIS PROJECT IS OPEN AND WILL CONTINUE AS A TEST BED UNTIL THE RECOGNITION OF
ARABIC TLDS BY CONCERNED INTERNATIONAL BODIES.
THE MAJOR DELIVERABLES OF THIS PROJECT IS TO ESTABLISH AND ACTIVATE THE
STEERING AND TECHNICAL COMMITTEES; PREPARE AND MAINTAIN A WEB SITE FOR THE
PROJECT; PREPARE THE ARABIC DNS ROOT SERVERS; PREPARE THE ARABIC CCTLD
SERVERS FOR THE PARTICIPATING COUNTRIES, AND CONNECT THEM WITH THE ARABIC
ROOT SERVERS; REGISTER AND TEST ARABIC DOMAIN NAMES; TEST AND DEVELOP TOOLS
SUPPORTING THE USE OF ARABIC DOMAIN NAMES AND DNS.
ALSO TEST AND DEVELOP END USER APPLICATIONS OR BROWSERS, E-MAIL CLIENTS, TO
ENSURE SUPPORT FOR ARABIC DOMAIN NAMES.
ALSO DRAFTING TECHNICAL GUIDELINES AND DEFINE POLICIES AND REGULATIONS FOR
REGISTERING ARABIC DOMAIN NAMES.
AND ALSO PARTICIPATING IN LOCAL AND REGIONAL ACTIVITIES RELATED TO ARABIC
DOMAIN NAMES.
PHASES OF THE PROJECT.
PHASE 1 WAS THE LINGUISTIC ISSUES.
TO DEFINE THE ACCEPTED ARABIC CHARACTER SET TO BE USED FOR WRITING ARABIC
DOMAIN NAMES.
THIS HAD BEEN DONE MAINLY THROUGH LOCAL COMMUNITY EFFORT, THROUGH LINGUISTIC
COMMUNITIES, ARAB LINGUISTS, PUBLISHED PAPERS, AND THROUGH CONDUCTING WEB
SURVEYS.
THE MAJOR DELIVERABLE OF THIS PHASE DEFINED THE SET OF ACCEPTED ARABIC
CHARACTERS FOR ARABIC DOMAIN NAMES.
THE SECOND PHASE WAS CHOOSING THE TECHNICAL SOLUTION.
A TECHNICAL SOLUTION THAT SATISFIES THE PROJECT OBJECTIVES MUST BE FOUND AND
TESTED.
THE IDN.IDN WAS CHOSEN AS THE TECHNICAL SOLUTION.
THE THIRD PHASE, ARABIC TLDS.
TO DEFINE THE TOP-LEVEL ARABIC DOMAIN NAMES. AND THIS HAS BEEN DONE, AN
INTERNET DRAFT THAT DEFINES THE ARABIC CCTLDS FOR ALL THE ARAB COUNTRIES WAS
RELEASED.
PHASE NUMBER 4, THE IMPLEMENTATION OF THE PROJECT ON COUNTRY LEVEL.
THE PROJECT WAS IMPLEMENTED INDIVIDUALLY BY SOME OF THE ARAB COUNTRIES.
THOSE COUNTRIES THAT IMPLEMENTED THE PROJECT LOCALLY HAD ESTABLISHED CCTLD
ROOT SERVERS IN THEIR COUNTRIES.
PHASE NUMBER 5 IS THE IMPLEMENTATION ON THE GCC LEVEL.
DURING THE GCC CCTLD GROUP MEETING ON 7 OF MARCH 2004, A TECHNICAL PROPOSAL
FOR IMPLEMENTING ARABIC DOMAIN NAMES IN THE GCC COUNTRIES WAS PRESENTED AND
ACCEPTED.
A TECHNICAL TASK FORCE WAS FORMED AND ASSIGNED THE TASK TO IMPLEMENT THE
PROPOSAL.
THE MAJOR DELIVERABLES OF THIS PHASE, SETTING UP ARABIC GCC ROOT SERVERS,
RESOLVING ARABIC GCC DOMAIN NAMES, TESTING OTHER DNS SOFTWARE AND BROWSERS.
WRITING TECHNICAL DOCUMENTS ABOUT THE GAINED EXPERIENCE; STUDYING THE CURRENT
AVAILABLE POLICIES FOR DOMAIN REGISTRATION AND WRITING POLICIES AND
REGULATIONS FOR REGISTERING ARABIC DOMAIN NAMES; ALSO BUILDING A WEB SITE FOR
THE PROJECT AND PUBLISHING SOME TOOLS AND USEFUL DOCUMENTS THROUGH THAT WEB
SITE; ENCOURAGING THE ARAB COUNTRIES TO PARTICIPATE IN THE PROJECT; AND
REGISTERING SOME TEST DOMAIN NAMES.
PHASE NUMBER 6, THE IMPLEMENTATION ON THE LEVEL OF ARAB COUNTRIES.
ON MAY 2005, THE SECOND MEETING OF THE WORKING GROUP ON ARABIC DOMAIN NAMES,
WHICH HAS BEEN CONDUCTED IN CAIRO, RECOMMENDED THE EXTENSION OF THE GCC PILOT
PROJECT FOR THE ARABIC DOMAIN NAMES TO INCLUDE ALL MEMBERS OF THE ARAB
LEAGUE.
THE PROJECT HAD BEEN RENAMED TO BECOME ARABIC DOMAIN NAMES PILOT PROJECT.
AND IT WILL BE UNDER THE SUPERVISION OF THE ARAB LEAGUE.
AND THIS IS THE PHASE WE ARE IN NOW.
AND WE ARE HOPING THAT IT WILL CONTINUE TO -- WE HOPE ALL THE ARAB COUNTRIES
WILL PARTICIPATE IN THIS PHASE, AND EVEN SOME OTHER COUNTRIES OTHER THAN THE
ARAB LEAGUE.
ACCOMPLISHMENTS.
THE COUNTRIES THAT HAVE PARTICIPATED IN THE PROJECT SO FAR, ORDERED BY THE
DATE OF PARTICIPATION: UNITED ARAB EMIRATES, SAUDI ARABIA, QATAR, OMAN,
PALESTINE, EGYPT, TUNISIA, AND SYRIA.
SOME OTHER ACCOMPLISHMENTS OF THE PROJECT, THE STEERING COMMITTEE PRODUCED A
NUMBER OF POLICY DOCUMENTS.
GUIDELINES FOR AN ARAB DOMAIN NAME SYSTEM, TERMS AND CONDITIONS,
PARTICIPATION POLICY FOR ARABIC CCTLD MANAGERS, GUIDELINES FOR FORMING ARABIC
DOMAIN NAMES.
A TECHNICAL COMMITTEE PRODUCED A NUMBER OF TECHNICAL DOCUMENTS.
GENERAL TECHNICAL INTRODUCTION, HOW TO SET UP ARABIC ROOT SERVERS, AND HOW TO
SET UP ARABIC CCTLD SERVERS, HOW TO RESOLVE ARABIC DOMAIN NAMES, AND
REQUIREMENTS FOR RESOLVING ARABIC DOMAIN NAMES FOR END USERS.
ALSO, THE TECHNICAL COMMITTEE HAD PRODUCED A NUMBER OF TOOLS, AN IDN/ADN
CONVERTING TOOL THAT CONVERTS DOMAIN NAMES FROM IDN TO ASCII AND VICE VERSA.
THE DNS CHECKER FOR DOMAIN NAMES THAT CHECKS IF AN IDN DOMAIN NAME IS HOSTED
ON ANY SERVER.
HOST CHECKER FOR ARABIC DOMAIN NAMES THAT RESOLVES IDN DOMAINS TO
CORRESPONDING I.P. ADDRESSES AND VICE VERSA.
ZONE FILE EDITOR THAT CREATES AND MANAGES ARABIC ZONE FILES EASILY USING THE
ZONE EDITOR.
ARABIC DOMAIN NAME REGISTRATION SYSTEM THAT MANAGES ARABIC DOMAIN NAME
REGISTRATION.
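Note: The conversion performed by a tool like the IDN/ADN converter listed above (Unicode name to ASCII xn-- form and back) can be sketched with Python's built-in `idna` codec, which implements the 2003 IDNA rules. This is only an illustration of the kind of conversion described; it is not the project's tool, and the function names and the Arabic sample labels ("example" and "test") are assumptions chosen for the sketch.

```python
def idn_to_ascii(name):
    """Convert each label of a Unicode domain name to its ASCII (xn--) form."""
    return name.encode("idna").decode("ascii")


def ascii_to_idn(name):
    """Convert an ASCII (xn--) domain name back to its Unicode form."""
    return name.encode("ascii").decode("idna")


# Hypothetical all-Arabic name: the labels "example" and "test".
arabic_name = "\u0645\u062b\u0627\u0644.\u0627\u062e\u062a\u0628\u0627\u0631"
ascii_name = idn_to_ascii(arabic_name)
round_trip = ascii_to_idn(ascii_name)
```

Each label is converted on its own, so a fully Arabic name becomes a sequence of xn-- labels joined by dots, and converting back yields the original name.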
WHAT IS REMAINING?
EXPANDING THE PROJECT TO INCLUDE ALL ARAB COUNTRIES.
FIND SOLUTIONS FOR SOME OF THE TECHNICAL PROBLEMS ENCOUNTERED.
COORDINATING OUR EFFORTS WITH OTHER SIMILAR PROJECTS IN THE AREA.
AND WITH SIMILAR PROJECTS WORLDWIDE.
PRESENT OUR NEEDS TO INTERNATIONAL ORGANIZATIONS.
THE REGISTRATION PROCESS.
SAUDINIC HAS OPENED REGISTRATION FOR ARABIC DOMAIN NAMES UNDER DOT
(INAUDIBLE) FOR TEST PURPOSES ON THE 26TH OF SEPTEMBER 2005.
REGISTRATION IS FOLLOWING THE REGULATIONS OF THE ARABIC DOMAIN NAMES PILOT
PROJECT.
SAUDINIC HAS PUBLISHED A WEB SITE THAT PROVIDES FOR USERS THE ABILITY TO
REGISTER ARABIC DOMAIN NAMES.
IT ALSO PROVIDES SOME OTHER SERVICES, LIKE ARABIC DOMAIN NAMES WHOIS, WHICH
WE ARE SEEING NOW ON THIS SLIDE.
USERS CAN USE THIS FACILITY TO INQUIRE AN ARABIC DOMAIN NAME AND RECEIVE
INFORMATION.
THIS IS AN ONLINE FORM USED TO REGISTER ARABIC NAMES, ANOTHER FORM THAT IS
USED TO MODIFY INFORMATION OF ARABIC DOMAIN NAMES.
LESSONS LEARNED.
OBSTACLES.
NOT HAVING IDN ROOT SERVERS BY ITSELF IS A BIG OBSTACLE.
REGISTRARS ARE NOT ENCOURAGED TO PROMOTE THE REGISTRATION OF ADN SINCE THE
CONCERNED INTERNATIONAL BODIES HAVE NOT ADOPTED IDN.IDN.
NOT ALL BROWSER VENDORS HAVE IMPLEMENTED IDN.
COORDINATION WITH DNS RESOLVERS OPERATORS, FOR EXAMPLE, LOCAL ISPS TO SUPPORT
OUR PROJECT IS ONE OF THE MAJOR OBSTACLES.
SOME COMMON PROBLEMS WE ENCOUNTERED THROUGH THE PROCESS OF REGISTERING ARABIC
DOMAIN NAMES.
USERS SUBMIT MANY REQUESTS FOR ARABIC DOMAIN NAMES WITH ARABIC SPELLING
MISTAKES, WITH ENGLISH PRONUNCIATION IN ARABIC LETTERS, WITH POPULAR AND
GENERAL NAMES FOR FUTURE SELLING PURPOSES.
SOME OTHER PROBLEMS: IT IS DIFFICULT TO COORDINATE AND CONVINCE EVERY ISP TO
JOIN THE PROJECT SO USERS CAN REACH ADN.
MANY USERS ARE USING SOFTWARE LIKE IE VERSION 6 OR OLDER THAT IS NOT -- THAT
WAS NOT SUPPORTING IDN.
FINALLY, OUR RECOMMENDATIONS.
WE STRONGLY BELIEVE THAT USERS IN LOCAL COMMUNITIES HAVE THE RIGHT TO BE
SERVED IN THEIR OWN LANGUAGES.
INTERNATIONAL BODIES SHOULD START PLANNING FROM THE ACTUAL NEEDS OF THE
USERS.
THANK YOU FOR LISTENING.
[ APPLAUSE ]
>>VINT CERF: I WONDER IF I COULD GET ONE BIT OF CLARIFICATION.
ARE THE -- ARE THESE DOMAIN NAMES OPERATED IN A ROOT SERVER WHICH IS
COMPLETELY INDEPENDENT OF THE ICANN ROOT SERVERS?
AND IF THEY ARE, THEN I'M WONDERING ABOUT HOW ANYONE IN THE REST OF THE WORLD
WILL BE ABLE TO INTERACT WITH THE ROOT SERVERS THAT YOU'RE RUNNING.
>>ANAS MOHAMMED ASSIRI: THE SCOPE OF THIS PROJECT IS TO TEST IT THROUGH THE
PARTICIPATING COUNTRIES ONLY.
SO ONLY USERS IN THOSE COUNTRIES CAN REACH THESE ARABIC DOMAIN NAMES.
NOT THE REST OF THE WORLD, UNTIL THE IDN.IDN WILL BE SUPPORTED BY ICANN.
YEAH.
OKAY.
THANK YOU.
>>RAM MOHAN: I HAVE A QUESTION FOR YOU.
>>VINT CERF: RAM.
>>RAM MOHAN: COULD YOU DEFINE AN ARABIC DOMAIN NAME?
IS IT A NAME EQUIVALENT TO A GCC COUNTRY CCTLD?
OR WHAT EXACTLY IS THE DEFINITION OF AN ARABIC DOMAIN NAME?
>>ANAS MOHAMMED ASSIRI: OKAY.
AN ARABIC DOMAIN NAME IS SIMILAR TO ANY DOMAIN NAME.
IT CONSISTS OF TWO PARTS.
THE FIRST PART AND THE TOP-LEVEL DOMAIN NAME.
THE FIRST PART IS CHOSEN BY THE OWNER OF THE DOMAIN.
HE CAN CHOOSE HIS COMPANY NAME OR ORGANIZATION NAME, WHATEVER, IN ARABIC
LANGUAGE.
AND THE TLD OR THE TOP-LEVEL DOMAIN, IS PRESET THROUGH THE AGREED-UPON
DOCUMENTS.
>>RAM MOHAN: JUST A QUICK FOLLOW-UP.
COULD YOU DEFINE WHAT THE AGREED-UPON LABELS ARE.
>>ANAS MOHAMMED ASSIRI: YES, THE CCTLD LABELS, WHICH CORRESPOND TO THE
COUNTRIES OR COUNTRY NAMES, HAVE BEEN AGREED UPON.
AND THERE IS A DRAFT THAT HAS BEEN PUBLISHED THAT CONTAINS EACH COUNTRY AND
THE EQUIVALENT CCTLD IN ARABIC.
>>RAM MOHAN: THANK YOU.
>>ANAS MOHAMMED ASSIRI: OTHER QUESTIONS?
>> WILLIAM TAN: HI.
YES, WILLIAM TAN, NEUSTAR.
IN YOUR EXPERIENCE WITH THE TEST BED, DO YOU FIND THAT USERS HAVE TROUBLES
TYPING IN THE ASCII PROTOCOL SCHEME NAME FOR THE URL, LIKE HTTP, COLON,
SLASH, AND THE DOTS AND STUFF LIKE THAT.
>>ANAS MOHAMMED ASSIRI: YOU MEAN WRITING THE ARABIC NAME?
>> YEAH, SO YOU HAVE AN ARABIC NAME IN THE URL.
DO YOU FIND PEOPLE HAVE TROUBLE TYPING IT OUT?
>>ANAS MOHAMMED ASSIRI: THE WAY WE IMPLEMENTED THE PROJECT, THEY SHOULDN'T
HAVE ANY PROBLEM, SINCE THE WHOLE NAME IS IN ARABIC.
SO THEY SHOULDN'T HAVE ANY PROBLEM.
>> WHAT IF THEY NEED TO TYPE IN HTTP COLON SLASH.
>>ANAS MOHAMMED ASSIRI: YOU CAN TRY IT YOURSELF.
YOU CAN WRITE THE ARABIC DOMAIN NAME WITHOUT TYPING THAT.
IT WILL WORK.
>> WILLIAM TAN: OKAY.
BUT IF THEY HAD TO DO IT, WILL IT BE VERY CUMBERSOME FOR THEM TO DO IT?
>>ANAS MOHAMMED ASSIRI: ACTUALLY --
>> WILLIAM TAN: IN THE CURRENT KEYBOARDS IN YOUR OPERATING SYSTEMS?
>>ANAS MOHAMMED ASSIRI: YOU MEAN IF THEY USE --
>> WILLIAM TAN: CAN YOU SWITCH BACK AND FORTH BETWEEN ASCII --
>>ANAS MOHAMMED ASSIRI: THIS IS A PROBLEM, SWITCHING BETWEEN TWO DIFFERENT
SCRIPTS IS ACTUALLY A MAJOR PROBLEM.
ONE OF THE MAJOR OBJECTIVES OF THE PROJECT IS TO ELIMINATE THIS PROBLEM,
GIVING THE USER THE ABILITY TO WRITE THE FULL DOMAIN NAME IN ARABIC, IN HIS
OWN LANGUAGE.
BUT IT SHOULD ALSO SUPPORT WRITING THE DOMAIN NAME IN MORE THAN ONE
LANGUAGE.
HE CAN WRITE IN ARABIC WITH SOME OTHER LETTERS IN ENGLISH, FOR EXAMPLE.
>> WILLIAM TAN: OKAY.
THANK YOU.
>>RAM MOHAN: I HAD ONE MORE QUESTION.
THIS IS RAM MOHAN.
YOU WERE TALKING ABOUT THE DIFFICULTY WHERE SOME ISPS HAVE NOT IMPLEMENTED
THE CHANGES.
COULD YOU WALK THROUGH WHAT THE USER EXPERIENCE IS WHEN THE USER TYPES IN THE
DOMAIN AND THEY'RE USING AN ISP WHO DOES NOT SUPPORT IT.
>>ANAS MOHAMMED ASSIRI: UNLESS THE ISP SUPPORTS THE IDN PROJECT, THE USER
WILL NOT BE ABLE TO USE THE ARABIC DOMAIN NAMES.
THE PROBLEM WE FACED WITH ISPS IS THAT THEY ARE NOT CONVINCED TO JOIN THE
PROJECT SINCE THEY DON'T SEE ANY PROFIT IN REGISTERING ARABIC DOMAIN NAMES.
AND THEY ARGUE WITH US, WHY SHOULD WE REGISTER ARABIC DOMAIN NAMES IF
EVENTUALLY IT WILL NOT BE SUPPORTED.
>>RAM MOHAN: TECHNICALLY SPEAKING, THEY GET AN EXTRA (INAUDIBLE) BACK.
WHAT DO THEY GET BACK?
DO THEY GET -- ON A WEB SITE, DO THEY SEE A 404?
DO YOU HAVE TRANSLATIONS INTO -- AT THE HTTP AND OTHER PROTOCOLS, DO YOU HAVE
TRANSLATIONS?
OR IS IT JUST A -- AT THE DNS LEVEL, IF THE ISP DOES NOT SUPPORT THIS
ENCODING, THE -- WHAT DOES THE ISP SEND BACK, WHAT DO THE NAME SERVERS SEND
BACK?
>>ANAS MOHAMMED ASSIRI: ACTUALLY, IF THE ISP DOES NOT SUPPORT OR DID NOT JOIN
THE PROJECT, THE USER WILL NOT BE ABLE TO USE IDN.
I MEAN, THE REQUEST THAT IS TRANSLATED BY THE BROWSER INTO THE XN
DASH DASH FORMAT WILL NOT RESOLVE TO ANYTHING.
SO IT WON'T BE ABLE TO REACH ANYTHING.
(INAUDIBLE) WILL BE SOMETHING LIKE THAT.
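As an illustration of the "XN dash dash" translation described in this answer, the following is a minimal Python sketch using the standard library's `idna` codec, which implements the IDNA2003 ToASCII and ToUnicode operations. The sample name is an illustrative assumption and is not taken from the test bed.

```python
# Sketch: convert an internationalized name to its ASCII-compatible
# "xn--" form and back, using Python's built-in IDNA (2003) codec.

def to_ascii(name: str) -> str:
    """Apply ToASCII to every dot-separated label of the name."""
    return name.encode("idna").decode("ascii")

def to_unicode(ascii_name: str) -> str:
    """Reverse the conversion (ToUnicode), e.g. for display."""
    return ascii_name.encode("ascii").decode("idna")

# Hypothetical sample: the Arabic word for "example" used for both labels.
sample = "\u0645\u062B\u0627\u0644.\u0645\u062B\u0627\u0644"
ascii_form = to_ascii(sample)
print(ascii_form)                        # each label starts with "xn--"
print(to_unicode(ascii_form) == sample)  # the conversion round-trips
```

The ASCII form is what is actually sent to the DNS, so a name only resolves if some server reachable by the user holds a record for that "xn--" string.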
>>VINT CERF: BUT AN IMPORTANT POINT IS THAT THEY ARE USING IDNA TRANSLATIONS.
SO AT THE LOOKUP LEVEL, THEY WOULD BE COMPATIBLE ON THE PRESUMPTION THAT THEY
REACH A SERVER THAT CAN DO THE LOOKUP.
SO PART OF THE -- I THINK I HAVE TWO CONFUSIONS HERE.
IF YOU AVOIDED -- I'M SORRY, I CAN'T LOOK AT YOU WHILE I'M ASKING THE
QUESTION, BECAUSE I HAVE TO TALK INTO THIS DAMN MICROPHONE.
LET'S SEE.
NO, DOESN'T WORK.
SORRY.
>>ANAS MOHAMMED ASSIRI: IT'S OKAY.
>>VINT CERF: WHEN YOU -- WHEN YOU'RE RUNNING APPLICATIONS LIKE E-MAIL AND
BROWSERS, I CAN SORT OF UNDERSTAND MANAGING TO AVOID TYPING HTTP, COLON, AND
SO ON FOR A KIND OF CONVENTIONAL WEB LOOKUP.
BUT I'M NOT SURE WHAT HAPPENS WHEN YOU'RE DOING E-MAIL, SINCE NOW YOU NEED TO
HAVE "@" SIGNS, AT LEAST IN THE CONVENTIONAL WAY.
SO SEVERAL OF US, I THINK, ARE INTERESTED.
I HAVE AN IDEA.
INSTEAD OF GETTING AN ANSWER THIS WAY, MAYBE THERE'S AN OPPORTUNITY THIS WEEK
TO ACTUALLY LOOK AT WHAT'S BEEN DONE, IF YOU ARE ABLE TO SHOW IT TO US.
AND THAT WILL BE A BETTER WAY TO FIND OUT.
SO WHY DON'T WE PROPOSE TO DO THAT.
WOULD THAT BE OKAY?
>>ANAS MOHAMMED ASSIRI: YES, NO PROBLEM.
>>VINT CERF: TERRIFIC.
NO PROBLEM.
KHALED.
>>KHALED FATTAL: THANK YOU, VINT.
ACTUALLY, FIRST I WOULD LIKE TO CONGRATULATE THE ARABIC COMMUNITY FOR A GREAT
INITIATIVE, GREAT WORK.
AND MAYBE THE POINT IS A BIT LOST, BECAUSE I WANTED TO ANSWER A QUESTION THAT
CAME FROM WILLIAM EARLIER ON FROM NEUSTAR.
IT'S IMPORTANT TO KEEP IN PERSPECTIVE THE OBJECTIVE OF THE PROJECT.
THE OBJECTIVE OF THE PROJECT PERTAINS TO YOUR QUESTION, WHICH IS WHETHER
THERE IS CONFUSION WHEN YOU TRY TO SWITCH TO HTTP FROM BEING ABLE TO WRITE IT
FULLY IN ARABIC.
THE OBJECTIVE OF THE PROJECT IS TO MAKE IT POSSIBLE FOR THOSE WHO DO NOT
SPEAK ENGLISH TO BE ABLE TO ACCESS THE INTERNET.
I MEAN, THAT'S -- SO IN THAT SENSE, IF YOU ARE MAKING IT ACCESSIBLE FOR THOSE
PEOPLE WHO HAVE BEEN LEFT OUT, THESE PEOPLE DON'T NEED TO GO AND WORRY ABOUT
TYPING HTTP.
AND WE'VE SOLVED AT LEAST A HUGE OBSTACLE.
AND THEN WE MOVE ON TO THE NEXT OBSTACLE.
SO I HOPE THAT CLARIFIES IT.
THANK YOU.
>>ANAS MOHAMMED ASSIRI: OKAY.
THANK YOU.
>>VINT CERF: THANK YOU, KHALED.
OKAY.
AND THANK YOU VERY MUCH.
OUR LAST SPEAKER --
[ APPLAUSE ]
>>VINT CERF: OUR LAST SPEAKER IS GOING TO GIVE US AN UPDATE ON PERSIAN IDN
DEVELOPMENT. AND THIS WILL BE GIVEN BY ALIREZA SALEH.
AND I APOLOGIZE IF I'M NOT PRONOUNCING THAT RIGHT, FROM THE IR NIC.
>>ALIREZA SALEH: THANK YOU, MR. VINT CERF.
I'M GOING TO GIVE A PRESENTATION ON IDN DEVELOPMENT IN IRAN.
WE ALSO GAVE SOME OTHER PRESENTATIONS ABOUT OUR TECHNICAL PROCEDURE AND
STATISTICS AT THE PREVIOUS ICANN MEETINGS.
ONE OF THEM WAS AT THE BEGINNING OF
OUR TEST BED, AND THE OTHER ONE WAS AT THE END OF OUR SUNRISE PERIOD.
NOW THE REGISTRATION HAS BEEN OPEN FOR ABOUT SEVEN MONTHS, AND IT MIGHT BE
USEFUL TO TALK ABOUT IT AGAIN IN MARRAKECH, WITH MANY ARABIC SPEAKERS IN THE
AUDIENCE.
AS YOU KNOW, ARABIC AND PERSIAN HAVE THE SAME SCRIPT.
I WILL JUST GIVE A SHORT REPORT ABOUT THE REGISTRATION.
AS YOU SEE IN THIS SLIDE, IT'S ABOUT 20 DOMAINS A WEEK.
AND WE HAVE 2,322 IDN DOMAINS WITHOUT CALCULATING THE BUNDLES.
AND IF YOU JUST WANT TO CALCULATE THE BUNDLES, IT WOULD BE 3,632 WITH
BUNDLES.
AND ABOUT 80% OF THEM ARE NOT ACTIVE.
AND IF YOU TYPE IT IN, IT WILL NOT RESOLVE.
OKAY, I WILL EXPLAIN IT LATER.
>>VINT CERF: OKAY.
>>ALIREZA SALEH: BUT THE ACTUAL NUMBER OF DOMAINS NOW GETTING RESOLVED
IS FOUR TIMES MORE THAN THE ONE YOU SEE IN THIS SLIDE: TWO TIMES FOR
DOT IRAN, BECAUSE OUR TOP-LEVEL DOMAIN, .IRAN, HAS A CHARACTER THAT SHOULD
BE CALCULATED AS A BUNDLE,
AND TWO TIMES FOR APPENDING .IR AT THE END OF THE DOMAIN.
AND AS YOU SEE, WE JUST RECEIVED 750 QUERIES PER WEEK FOR THIS, THAT WOULD BE
LESS THAN ONE QUERY PER DOMAIN.
DURING THE TEST PERIOD, WE DEFINED 15 IDN TOP-LEVEL DOMAINS FOR REGISTRATION,
BUT WE DECIDED TO LIMIT THE REGISTRATION ON DOT IRAN AFTER ANALYZING THE
STATISTICS DURING THE TEST BED PERIOD.
ALSO, WE ASSUME IT MAY HELP US TO ASK ICANN TO
ASSIGN IT AS AN IDN CCTLD EQUIVALENT TO OUR CCTLD.
REGISTRATION IS NOW OPEN EXCEPT FOR THE RESTRICTED NAMES. WE ALSO HAD
TWO SUNRISE PERIODS, ONE FOR THOSE WHO HAVE OFFICIAL REGISTRATIONS, AND THE
LAST FOR THE PARTICIPANTS IN OUR TEST BED.
(INAUDIBLE) BECAUSE WE DIDN'T SEE SO MUCH INTEREST IN IDN DOMAINS. AS YOU
CAN SEE IN THIS SLIDE, WE JUST LISTED SOME OF THE REASONS THAT WE DISCOVERED
DURING THE RESEARCH.
ACTUALLY, USERS ARE USED TO USING ASCII LABELS BECAUSE I THINK THAT INTERNET
USERS IN IRAN ARE EDUCATED ENOUGH TO USE ENGLISH WORDS. SO THEY PREFER TO USE
ASCII LABELS.
ACTUALLY, TECHNICAL USERS EVEN PREFER TO USE IP ADDRESSES
INSTEAD OF DOMAIN NAMES.
ACTUALLY, IN IRAN WE DON'T HAVE ANY FULLY PERSIAN MICROSOFT PRODUCT, AND A
FULLY PERSIAN LINUX HAS BEEN RELEASED ON (INAUDIBLE) THAT HELPS TO USAGE OF
IDN.
MOST OF THE USERS, WHO ARE USING MICROSOFT PRODUCTS, NEED TO DO
SOME EXTRA WORK IN ORDER TO USE THE IDN DOMAINS. THEY ARE HAPPY WITH THE
CURRENT DOMAIN SYSTEM. THAT'S WHY THEY DON'T HAVE ANY INTEREST IN INSTALLING
PLUG-INS OR A NEW BROWSER. BUT I HAVE HEARD GOOD NEWS FROM MICHEL ABOUT
MICROSOFT. IT IS CONFUSING FOR DOMAIN ADMINS TO CONFIGURE AN IDN DOMAIN.
ESPECIALLY SOMETIMES THEY (INAUDIBLE) THE BOT TO HAVE DOMAIN (INAUDIBLE)
APPLICATION, INCLUDING BUNDLES.
IRANI, ARABIC AND PERSIAN VERSION OF TLD.
IT SEEMS THAT THE USERS WOULD PREFER DIRECT USE OF UNICODE INSTEAD OF
PUNYCODE, SO WORKING HARDER ON CHANGING THE PROTOCOL MAY BE A SOLUTION TO
THAT.
IDN.IDN WILL BE RESOLVED IF IT HAS RESOLVED (INAUDIBLE). SO WE APPEND THE IR
AT THE END OF EACH DOMAIN AND WE ARE LOOKING FOR ANY KIND OF IDN.IDN BEFORE
WE START PATCHING THE ISPS.
I WANT TO MOVE TO THE TECHNICAL ASPECT OF OUR SYSTEM. AS YOU SEE, ALL
RECOMMENDED RFCS WERE IMPLEMENTED, ESPECIALLY THE PARTS THAT ADDRESSED THE
BIDIRECTIONAL PROBLEMS.
A CHARACTER TABLE HAS ALSO BEEN DEFINED FOR PERSIAN REGISTRATION THAT HAS BEEN
WAITING FOR A LONG TIME IN A QUEUE TO BE PUBLISHED BY IANA.
HERE I JUST GIVE AN EXAMPLE OF THE BIDIRECTIONAL PROBLEM THAT MIXING
RIGHT TO LEFT AND LEFT TO RIGHT MAY CAUSE. AS YOU SEE IN THIS SLIDE, IT'S
EVEN HARD TO JUST FIND OUT THE TLD THAT IS DOT IRAN. IT
IS VERY HARD TO FIND.
AND THE DECODED VERSION, IT'S THAT ONE. BUT NUMBERS AND SIGNS WILL FOLLOW
THE CONTEXT, AND IT WILL BE ALL RIGHT TO BE USED IN A LABEL.
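To make the bidirectional issue more concrete, here is a minimal Python sketch that inspects the Unicode bidirectional class of each character in a label using the standard `unicodedata` module. The helper names and the sample labels are assumptions for illustration only; they do not reproduce the registry's actual checks.

```python
import unicodedata

def bidi_classes(label: str) -> list[str]:
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in label]

def is_mixed_direction(label: str) -> bool:
    """True if the label mixes right-to-left letters (R, AL) with
    left-to-right letters (L). Digits are not counted as letters."""
    classes = set(bidi_classes(label))
    return bool(classes & {"R", "AL"}) and "L" in classes

# Hypothetical labels: Persian letters plus a Latin letter, and
# Persian letters plus a Persian digit.
print(bidi_classes("\u0627\u06CCa"))        # ['AL', 'AL', 'L']
print(is_mixed_direction("\u0627\u06CCa"))  # True
print(is_mixed_direction("\u0627\u06F1"))   # False
```

Digits carry their own classes (`EN` or `AN`) rather than a strong direction, which is consistent with the remark that numbers follow the surrounding context.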
WHAT IS OUR BUNDLE? IF A LABEL HAS A NUMBER OR ONE OF THE CHARACTERS FROM THIS
TABLE, WE WILL CALCULATE THE BUNDLE FOR IT. THE TWO CHARACTERS KEH AND YEH
CAUSE A PROBLEM ESPECIALLY WHEN THEY ARE IN LOWER CASE AS YOU SEE IN THIS
EXAMPLE.
NUMBERS CAN ALSO CAUSE A PROBLEM BECAUSE THEY LOOK DIFFERENT IN DIFFERENT
FONTS OR ON A DIFFERENT SYSTEM.
AS YOU SEE, THE BUNDLE CAN BE UP TO SIX DOMAINS. IF THE APPLICANT USES BOTH
KEH AND YEH IN A LABEL, WE DON'T CALCULATE BUNDLE LABELS FOR THE COMBINATION
OF SOME ARABIC AND SOME PERSIAN CHARACTERS. IT MEANS THAT THE DOMAIN SHOULD
BE FULLY PERSIAN OR SHOULD BE FULLY ARABIC CHARACTERS.
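The bundle rule described here can be sketched in Python as follows. This is only an illustration under stated assumptions: the letter pairs are Persian KEHEH/FARSI YEH versus Arabic KAF/YEH, and the "up to six" figure is assumed to come from two letter forms (fully Persian, fully Arabic) times three digit families (ASCII, Arabic-Indic, Extended Arabic-Indic). The speaker's actual character table is not shown in the transcript.

```python
# Assumed variant tables (not the registry's published table).
TO_PERSIAN = {0x0643: "\u06A9", 0x064A: "\u06CC"}  # kaf -> keheh, yeh -> farsi yeh
TO_ARABIC = {0x06A9: "\u0643", 0x06CC: "\u064A"}   # the reverse mapping

DIGIT_FAMILIES = [
    "0123456789",                                # ASCII digits
    "".join(chr(0x0660 + i) for i in range(10)),  # Arabic-Indic digits
    "".join(chr(0x06F0 + i) for i in range(10)),  # Extended Arabic-Indic digits
]

def digits_to(label: str, family: str) -> str:
    """Rewrite every digit in the label using one digit family."""
    table = {ord(ch): family[i]
             for fam in DIGIT_FAMILIES for i, ch in enumerate(fam)}
    return label.translate(table)

def compute_bundle(label: str) -> list[str]:
    """All variants: fully Persian or fully Arabic letters (never a
    mixture), each combined with one consistent digit family."""
    letter_forms = {label.translate(TO_PERSIAN), label.translate(TO_ARABIC)}
    bundle = {digits_to(form, fam)
              for form in letter_forms for fam in DIGIT_FAMILIES}
    return sorted(bundle)

# Hypothetical label: keheh + farsi yeh + Persian digit one.
print(len(compute_bundle("\u06A9\u06CC\u06F1")))  # 6
```

A label with no variant letters and no digits yields a bundle of one, namely the label itself.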
WHEN A DOMAIN IS SUBMITTED, WE CHECK THE VALIDITY OF THE DOMAIN. WE CHECK IT
AGAINST THE RFCS AND OUR CHARACTER TABLE.
THEN WE MODIFY IT TO THE
FULLY PERSIAN FORMAT. THIS IS THE NORMALIZATION PROCESS.
WE REMOVE ALL ZERO-WIDTH JOINERS AND NON-JOINERS FROM THE NORMALIZED LABEL AND
THEN COMPUTE THE BUNDLE. AS YOU SEE IN THIS SLIDE, REMOVING THE ZERO-WIDTH
JOINER MAY COMPLETELY CHANGE THE MEANING OF THE WORD.
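The two steps just described (normalizing to the fully Persian form, then removing the joiner characters) might look like the following Python sketch. The mapping of Arabic KAF and YEH to their Persian counterparts is an assumption based on the two characters named in the talk; the function names are hypothetical.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER

# Assumed mapping: Arabic kaf/yeh to Persian keheh/farsi yeh.
ARABIC_TO_PERSIAN = {0x0643: "\u06A9", 0x064A: "\u06CC"}

def normalize_to_persian(label: str) -> str:
    """Replace Arabic variant letters with their Persian forms."""
    return label.translate(ARABIC_TO_PERSIAN)

def strip_joiners(label: str) -> str:
    """Remove zero-width joiner and non-joiner characters."""
    return label.replace(ZWNJ, "").replace(ZWJ, "")

# Hypothetical input: Arabic yeh, a non-joiner, then Arabic kaf.
raw = "\u064A" + ZWNJ + "\u0643"
normalized = normalize_to_persian(raw)
simplified = strip_joiners(normalized)
print([hex(ord(ch)) for ch in simplified])  # ['0x6cc', '0x6a9']
```

Because the joiners affect how letters connect when displayed, the stripped label can read as a different word, which is the caveat the speaker raises.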
THEN WE JUST COMPUTE THE ASCII EQUIVALENTS OF ALL THE BUNDLE, AND IF THE MAIN
REGISTERED LABEL IS TOO LONG AFTER TOASCII (INAUDIBLE), WE JUST THROW OUT
THE FAILED CASES.
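A minimal Python sketch of this filtering step, assuming the standard library's `idna` codec (IDNA2003 ToASCII) stands in for the registry's own conversion. The codec raises `UnicodeError` when a converted label is empty or longer than 63 octets, so such variants can be set aside. The function name and sample labels are hypothetical.

```python
def ascii_forms(bundle):
    """Convert each label to its ASCII ("xn--") form.
    Returns (converted, failed): a dict of label -> ASCII form, and a
    list of labels whose conversion failed, e.g. because it is too long."""
    converted, failed = {}, []
    for label in bundle:
        try:
            converted[label] = label.encode("idna").decode("ascii")
        except UnicodeError:
            failed.append(label)
    return converted, failed

# Hypothetical labels: a short Persian word and an overlong label.
short = "\u0627\u06CC\u0631\u0627\u0646"
too_long = "\u0631" * 70
ok, bad = ascii_forms([short, too_long])
print(ok[short].startswith("xn--"))  # True
print(bad == [too_long])             # True
```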
AT LAST, WE SHOW THE APPLICANT THE EXACT IMAGE THAT HE
IS GOING TO REGISTER. THIS MAY HELP HIM TO AVOID REGISTERING INCORRECT DOMAINS
BECAUSE OF THE USE OF NONSTANDARD KEYBOARDS OR FONTS.
IF YOU SEE IN THIS SLIDE, I JUST WANT TO SHOW THE PROCEDURE. THE INPUT LABEL
IS THIS ONE IN YEH, THAT'S THE YEH. THAT CHARACTER WOULD BE IN ARABIC. THAT
IS NOT CORRECT FOR PERSIAN REGISTRATION.
THEN WE NORMALIZE IT TO THIS ONE. AND AFTER THAT, WE GENERATE THE IMAGE. AND
THEN WE SIMPLIFY THE LABEL. BUT THE PROBLEM IS THAT WHEN WE SIMPLIFY THE
LABEL, THE MEANING OF THE LABEL IS TOTALLY CHANGED.
THEN WE JUST CREATE THE WHOLE BUNDLE, THAT WOULD BE SIX: TWO BECAUSE
OF THE PERSIAN AND ARABIC KEH AND YEH, AND ALSO BECAUSE OF THE
NUMBERS.
AND THEN WE JUST GENERATE THE PUNYCODE BUNDLE.
WE ARE TRYING TO HELP IDN USERS, SO WE SEND THE DETAILED CONFIGURATION FOR
(INAUDIBLE) WHEN A DOMAIN GETS ACTIVE. ALSO THERE IS A WEB INTERFACE THAT
GENERATES THIS CONFIGURATION ON THE E-MAIL.
WE ALSO PREPARED SOME GUIDES ABOUT IDN. WE HAVE SOFTWARE THAT HELPS HOSTING
COMPANIES TO HOST AN IDN DOMAIN AND ALL ITS BUNDLES.
AS WE ARE USING BUNDLES, WE CHOSE TWO SCENARIOS TO EXPORT REGISTERED DOMAINS
TO THE DNS SERVER. WE TESTED BOTH DNAMES AND NS RECORDS,
AND THIS IS JUST WHAT WE OBSERVED IN THE TEST; WE HAVE NOT
EXACTLY ANALYZED THE RESULTS AND THE STATISTICS ABOUT IT.
WITH NS RECORDS, IT MEANS HAVING LARGER FILES WITH MORE RECORDS, TAKING MORE
TIME TO RELOAD, AND ASSIGNING TWO TO 12 DIFFERENT DOMAINS TO AN APPLICANT, BUT
IT REDUCES THE BANDWIDTH USAGE AND REDUCES THE DELAY IN RESOLVING A DOMAIN.
AND THEN USING THE DNAMES DECREASES THE SIZE OF THE ZONE FILES AND
DECREASES THE TIME OF RELOADING THE ZONE, BUT INCREASES THE BANDWIDTH USAGE AND
DELAY, AND IT WOULD REALLY BE AN ISSUE IF YOU HAVE A VERY SLOW CONNECTION,
BECAUSE YOUR QUERY MAY TIME OUT.
BUT IT IS EASIER TO MANAGE AND MAINTAIN.
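The zone-size trade-off between the two scenarios can be illustrated with a small Python sketch that generates zone-file style lines. All names here (the `variantN.example` labels and the `ns1`/`ns2.example.net` servers) are hypothetical placeholders, and the record layout is an assumption, not the registry's actual export format.

```python
def ns_zone_lines(bundle, nameservers):
    """Scenario 1: delegate every bundle member with its own NS records."""
    return [f"{name}. IN NS {ns}." for name in bundle for ns in nameservers]

def dname_zone_lines(primary, variants, nameservers):
    """Scenario 2: delegate only the primary name and point each
    variant at it with a single DNAME record."""
    lines = [f"{primary}. IN NS {ns}." for ns in nameservers]
    lines += [f"{variant}. IN DNAME {primary}." for variant in variants]
    return lines

# Hypothetical six-member bundle and two name servers.
bundle = [f"variant{i}.example" for i in range(1, 7)]
servers = ["ns1.example.net", "ns2.example.net"]

print(len(ns_zone_lines(bundle, servers)))                    # 12
print(len(dname_zone_lines(bundle[0], bundle[1:], servers)))  # 7
```

Note that a DNAME record redirects names below its owner name rather than the owner name itself, and each redirected lookup involves an extra rewriting step, which is in line with the added delay mentioned above.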
AT LAST, I WANT TO THANK DR. SIAVASH SHAHSHAHANI AND ROOZBEH POURNADER, WHO
HELPED US TO MAKE THIS HAPPEN. AND I JUST WANT TO ADD SOMETHING ABOUT THE
WHOIS. AS SABINE SAID ABOUT DOT DE, WE HAVE ALSO IMPLEMENTED SOME CONDUCT
THAT SUPPORTS IDN. THANK YOU SO MUCH.
[ APPLAUSE ]
>>VINT CERF: FIRST OF ALL, WE'RE GOING TO WRAP UP IN A LITTLE WHILE, BUT I
WANTED TO THANK OUR THREE GUESTS ESPECIALLY IN THIS SECOND HALF.
IT'S CLEAR THAT A SUBSTANTIAL AMOUNT OF WORK IS GOING ON TO TRY TO COPE WITH
LANGUAGES AND PARTICULARLY SCRIPTS OTHER THAN ROMAN ONES. AND IT'S
TREMENDOUSLY ENCOURAGING TO KNOW THAT THERE IS THIS AMOUNT OF ENERGY GOING
INTO THE PROBLEMS.
I'D LIKE TO ENCOURAGE CONTINUED INTERACTION WITH THESE PROJECTS, AND
PARTICULARLY, TINA, IF THERE ARE SOME RESULTS FROM DNAMES, FOR INSTANCE, THAT
SHOULD FACTOR INTO ANY OF THE EVALUATIONS THAT WE MIGHT DO OF THAT TECHNIQUE.
IT'S PRETTY CLEAR, I HOPE ANYWAY, FROM BOTH THE FIRST AND SECOND SESSIONS
THAT THERE ARE A LOT OF COMPLEXITIES ASSOCIATED WITH INTRODUCING IDNS. AND I
HOPE THAT YOU ALSO DETECTED THE LEVEL OF PASSION FOR TRYING TO SUPPORT
NON-ROMAN CHARACTER SETS IN THE USE OF THE INTERNET.
I THINK THAT IT'S ENTIRELY UNDERSTANDABLE THAT PEOPLE WOULD LIKE TO GET TO
THAT POINT. I'M ONE OF THEM.
GETTING THERE IN A WAY THAT PROMOTES INTEROPERABILITY WHERE IT'S NEEDED IS, I
THINK, A VERY IMPORTANT ELEMENT IN -- AND CHALLENGE IN MAKING THIS WORK FOR
EVERYONE.
SOMETHING THAT JOHN KLENSIN MENTIONED EARLIER IS ALSO ENCOURAGING, AND THAT
IS THAT, IN FACT, SOME OF THE IMPLEMENTATIONS THAT SUPPORT COMMUNITIES WHO ARE
USING A COMMON SCRIPT AND LANGUAGE, EVEN IF THEY DON'T PERFECTLY SUPPORT
INTERCHANGE BETWEEN COMMUNITIES, MAY STILL PROVE TO BE USEFUL.
I THINK MOUHAMET'S OBSERVATION THAT THE INTERNET INVITES COMMUNITIES TO INTERACT
WITH EACH OTHER IS A PRETTY INSIGHTFUL OBSERVATION. IT'S GENERALLY A VERY
OPEN SYSTEM, AND IT DOESN'T NECESSARILY DEMAND THAT EVERY COMMUNITY INTERACT
WITH EVERY OTHER COMMUNITY. BUT WHAT WE WOULD LIKE TO ACCOMPLISH IN THE LONG
RUN IS THAT IF ANYONE IN ONE COMMUNITY WANTS TO COMMUNICATE WITH SOMEONE IN A
DIFFERENT COMMUNITY, THERE IS A PATH BY WHICH THE TWO ARE ABLE TO
INTERACT.
IF WE AREN'T ABLE TO MAINTAIN THAT FUNDAMENTAL COMMONALITY, THEN ONE OF THE
MOST IMPORTANT VALUES OF THE INTERNET WOULD BE LOST.
SO I HOPE AS WE MOVE FORWARD ON IDNS ON ALL THE VARIOUS FRONTS THAT WE HAVE
DISCUSSED TODAY THAT WE KEEP THOSE OBJECTIVES IN MIND.
BEFORE WE CLOSE, LET ME ASK IF THERE ARE ANY QUESTIONS OR COMMENTS, CLOSING
COMMENTS, THAT PEOPLE ON THE FLOOR WOULD LIKE TO MAKE.
IF NOT, LET ME THANK YOU ALL FOR SITTING THROUGH A PRETTY HEFTY SESSION.
I WANT TO THANK ALL OF THE PRESENTERS ESPECIALLY FOR PREPARING FOR ALL OF
THIS, AND TINA FOR PULLING THIS ALL TOGETHER ON OUR BEHALF.
A REMINDER, THERE IS A RECEPTION AND THE BUSES WILL LEAVE AT 7:45. IF YOU
WANT TO BE ON THE BUS TO JOIN THE PARTY, YOU WANT TO BE IN PLACE BY THAT
TIME.
SEE YOU ALL AT THE PARTY. BYE FOR NOW.
[ APPLAUSE ]