ICANN: Label Generation Rules for the Root Zone Version 3 (RZ-LGR-3)

Brief Overview:

Purpose: To determine valid top-level Internationalized Domain Name (IDN) labels and their variant labels, the community finalized the Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels (the Procedure). The Procedure requires community-based Generation Panels (GPs), organized by script, to convene and propose specific rules. The Integration Panel (IP) evaluates these rules and then integrates them into the Root Zone Label Generation Rules (RZ-LGR).

Current Status: The IP has successfully evaluated the Root Zone Label Generation Rules (LGR) proposals for 10 additional scripts: Devanagari, Gujarati, Gurmukhi, Hebrew, Kannada, Malayalam, Oriya, Sinhala, Tamil and Telugu. The respective GPs finalized and submitted these proposals after each individual proposal had been released for public comment. To develop the third version of the Root Zone LGR (RZ-LGR-3), the IP has integrated these proposals with the Arabic, Ethiopic, Georgian, Khmer, Lao and Thai scripts, which were already integrated into the second version of the Root Zone LGR (RZ-LGR-2).

Next Steps: As per the Procedure, RZ-LGR-3 is being released for public comments to gather community feedback for its finalization. Proposals for additional scripts will be integrated in future versions of the RZ-LGR.

Section I: Description and Explanation

Per the Procedure that guides this work, each GP starts its analysis from the current version of the Maximal Starting Repertoire (MSR-4) and develops a proposal for its script(s) based on the principles and additional considerations presented in the Procedure. RZ-LGR-3 is designed to be the third installment of an RZ-LGR that meets the requirement for a conservative set of label generation rules for stable and secure operation of the Internet’s Root Zone. RZ-LGR-3 contains rules for 16 scripts: Arabic, Devanagari, Ethiopic, Georgian, Gujarati, Gurmukhi, Hebrew, Kannada, Khmer, Lao, Malayalam, Oriya, Sinhala, Tamil, Telugu and Thai, based on the proposals submitted by the respective GPs. The IP also considered the Armenian and Cyrillic script proposals, but because they interact with the LGRs for the Greek and Latin scripts, which are still being developed, it was deemed prudent to delay their integration.

The RZ-LGR provides a specification to mechanically determine valid IDN Top-Level Domains (TLDs). It also determines, for each valid label, the corresponding set of blocked and allocatable variant labels. Additional mechanisms need to be developed to determine which, if any, of the allocatable variant labels generated by the RZ-LGR will be allocated to applicants.
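As a rough illustration of what "mechanically determine" can mean, the following Python sketch checks a label against a repertoire and enumerates its variant labels with a disposition. The repertoire, the variant table and the function names are hypothetical toy data invented for this example; they are not taken from RZ-LGR-3, and the disposition rule is a deliberate simplification.

```python
from itertools import product

# Hypothetical toy data, NOT taken from RZ-LGR-3: a tiny repertoire and a
# variant table mapping (source, target) code point pairs to a variant type.
REPERTOIRE = {"a", "b", "c"}
VARIANTS = {
    ("a", "b"): "allocatable",
    ("b", "a"): "blocked",
}


def is_valid(label):
    """A label is valid here if it is non-empty and every code point is in the repertoire."""
    return len(label) > 0 and all(ch in REPERTOIRE for ch in label)


def variant_labels(label):
    """Return {variant_label: disposition} for a valid label.

    Simplifying assumption: a variant label is allocatable only if every
    substituted position uses an allocatable mapping; otherwise it is blocked.
    """
    if not is_valid(label):
        raise ValueError(f"invalid label: {label!r}")
    # For each position, list (code point, mapping type); None means unchanged.
    choices = []
    for ch in label:
        options = [(ch, None)]
        options += [(dst, kind) for (src, dst), kind in VARIANTS.items() if src == ch]
        choices.append(options)
    result = {}
    for combo in product(*choices):
        candidate = "".join(cp for cp, _ in combo)
        if candidate == label:
            continue  # the original label is not its own variant here
        kinds = [kind for _, kind in combo if kind is not None]
        all_allocatable = all(kind == "allocatable" for kind in kinds)
        result[candidate] = "allocatable" if all_allocatable else "blocked"
    return result
```

Calling `variant_labels("ab")` on this toy table yields three variant labels, only one of which is allocatable. Deciding whether an allocatable variant is actually allocated is outside the scope of the sketch, as it is outside the scope of the RZ-LGR itself.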

The current version of the RZ-LGR will be followed by future versions that support additional scripts and writing systems as proposals from more GPs become available. These future additions must be upwardly compatible with the current version. In addition to the panels that have already completed their work, the Bangla, Chinese, Greek, Japanese, Korean, Latin and Myanmar panels are working on their proposals. GPs for additional scripts, including Thaana and Tibetan, are being formed.

Section II: Background

The Root Zone LGR development procedure requires three steps. Initially, the IP creates the Maximal Starting Repertoire (MSR) for the GPs to initiate their work. Based on the latest version of the MSR, the community-based GPs organize and develop proposals for the RZ-LGR for their respective scripts or writing systems. After public comments, these proposals are submitted to the IP for evaluation. Finally, the successfully evaluated proposals are integrated into the next version of RZ-LGR.

The current MSR-4 is based on Unicode version 6.3 and covers the following 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Thai, and Tibetan.

Successful development of the RZ-LGR depends on having a community-based GP for each script or writing system. A GP develops an LGR proposal that is used to generate valid TLD labels and their variant labels for the relevant script or writing system. Each proposal contains the valid code points, their variant code points and Whole Label Evaluation (WLE) rules. In developing its proposal, a GP may need to coordinate with other GPs whenever their repertoires overlap or are closely related. Each proposal is reviewed by the community through a public comment process before it is submitted to the IP for further consideration.
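The three components of a proposal (code points, variant code points, WLE rules) can be pictured as a small data structure. The Python sketch below is only an illustration under stated assumptions: the class `ScriptProposal`, the rule `mark_follows_base` and the ASCII stand-in code points are all hypothetical and do not come from any GP proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical stand-ins for a script's code points (not real proposal data).
BASES = {"k", "t"}   # plays the role of base letters
MARKS = {"a"}        # plays the role of a dependent mark


@dataclass
class ScriptProposal:
    """Toy container mirroring the three parts of a proposal."""
    code_points: Set[str]
    variants: Dict[str, Set[str]] = field(default_factory=dict)
    wle_rules: List[Callable[[str], bool]] = field(default_factory=list)

    def accepts(self, label: str) -> bool:
        # Step 1: every code point must be in the repertoire.
        in_repertoire = bool(label) and all(ch in self.code_points for ch in label)
        # Step 2: every whole-label rule must hold for the label as a whole.
        return in_repertoire and all(rule(label) for rule in self.wle_rules)


def mark_follows_base(label: str) -> bool:
    """Illustrative WLE rule: a mark must come directly after a base letter."""
    for i, ch in enumerate(label):
        if ch in MARKS and (i == 0 or label[i - 1] not in BASES):
            return False
    return True


proposal = ScriptProposal(
    code_points=BASES | MARKS,
    variants={"k": {"t"}},  # unused by accepts(); shown only for completeness
    wle_rules=[mark_follows_base],
)
```

With this toy proposal, `proposal.accepts("kata")` holds, while a label that begins with the mark or doubles it is rejected even though all of its code points are in the repertoire. That is the point of a whole-label rule: it constrains context, not just membership.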

The Procedure states that the IP creates a set of recommended label generation rules that integrates all the approved proposals from the GPs. Once the IP has created such a set, it is posted for public comment using the prevailing ICANN procedures. At the end of the public comment period, the IP reviews the comments received and finalizes the LGR. The resulting label generation rules become the next version of the RZ-LGR.

Section III: Relevant Resources

The following Root Zone Label Generation Rules version 3 (RZ-LGR-3) files are published for public comment. The files include LGRs for the Arabic, Ethiopic, Georgian, Khmer, Lao and Thai scripts, which were already integrated into the second version of the Root Zone LGR (RZ-LGR-2).

Summary Documents:

  1. Overview and Summary: https://www.icann.org/sites/default/files/lgr/lgr-3-overview-25apr19-en.pdf
  2. Repertoire Tables, non-CJK: https://www.icann.org/sites/default/files/lgr/lgr-3-non-cjk-25apr19-en.pdf

XML versions (normative):

  1. Common: https://www.icann.org/sites/default/files/lgr/lgr-3-common-25apr19-en.xml
  2. Arabic: https://www.icann.org/sites/default/files/lgr/lgr-3-arabic-script-25apr19-en.xml
  3. Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-3-devanagari-script-25apr19-en.xml
  4. Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-3-ethiopic-script-25apr19-en.xml
  5. Georgian: https://www.icann.org/sites/default/files/lgr/lgr-3-georgian-script-25apr19-en.xml
  6. Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-3-gujarati-script-25apr19-en.xml
  7. Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-3-gurmukhi-script-25apr19-en.xml
  8. Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-3-hebrew-script-25apr19-en.xml
  9. Kannada: https://www.icann.org/sites/default/files/lgr/lgr-3-kannada-script-25apr19-en.xml
  10. Khmer: https://www.icann.org/sites/default/files/lgr/lgr-3-khmer-script-25apr19-en.xml
  11. Lao: https://www.icann.org/sites/default/files/lgr/lgr-3-lao-script-25apr19-en.xml
  12. Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-3-malayalam-script-25apr19-en.xml
  13. Oriya: https://www.icann.org/sites/default/files/lgr/lgr-3-oriya-script-25apr19-en.xml
  14. Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-3-sinhala-script-25apr19-en.xml
  15. Tamil: https://www.icann.org/sites/default/files/lgr/lgr-3-tamil-script-25apr19-en.xml
  16. Telugu: https://www.icann.org/sites/default/files/lgr/lgr-3-telugu-script-25apr19-en.xml
  17. Thai: https://www.icann.org/sites/default/files/lgr/lgr-3-thai-script-25apr19-en.xml
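For readers who want to inspect the normative XML files programmatically, the following Python sketch reads code points and their variants with the standard library. It assumes the files follow the LGR XML format defined in RFC 7940 (namespace `urn:ietf:params:xml:ns:lgr-1.0`, `char` and `var` elements with hexadecimal `cp` attributes), which this announcement does not itself state. The embedded `SAMPLE` document is invented for the example and is not an excerpt from any RZ-LGR-3 file; `range` elements and rules are ignored.

```python
import xml.etree.ElementTree as ET

# Assumption: the RZ-LGR XML files use the RFC 7940 namespace.
NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Invented sample, NOT an excerpt from an RZ-LGR-3 file.
SAMPLE = """
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061">
      <var cp="0062" type="blocked"/>
    </char>
    <char cp="0062"/>
  </data>
</lgr>
"""


def cp_to_str(cp_attr):
    """Convert a space-separated list of hex code points to a string."""
    return "".join(chr(int(cp, 16)) for cp in cp_attr.split())


def read_repertoire(xml_text):
    """Return {code point: {variant: type}} for every char element under data."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.findall("lgr:data/lgr:char", NS):
        source = cp_to_str(char.get("cp"))
        table[source] = {
            cp_to_str(var.get("cp")): var.get("type")
            for var in char.findall("lgr:var", NS)
        }
    return table
```

To try this on a published file, download one of the XML files listed above and pass its contents to `read_repertoire`. A fuller reader would also need to handle ranges, contextual conditions and the rules section.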

HTML versions of the XML files (non-normative, for easier readability):

  1. Common: https://www.icann.org/sites/default/files/lgr/lgr-3-common-25apr19-en.html
  2. Arabic: https://www.icann.org/sites/default/files/lgr/lgr-3-arabic-script-25apr19-en.html
  3. Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-3-devanagari-script-25apr19-en.html
  4. Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-3-ethiopic-script-25apr19-en.html
  5. Georgian: https://www.icann.org/sites/default/files/lgr/lgr-3-georgian-script-25apr19-en.html
  6. Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-3-gujarati-script-25apr19-en.html
  7. Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-3-gurmukhi-script-25apr19-en.html
  8. Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-3-hebrew-script-25apr19-en.html
  9. Kannada: https://www.icann.org/sites/default/files/lgr/lgr-3-kannada-script-25apr19-en.html
  10. Khmer: https://www.icann.org/sites/default/files/lgr/lgr-3-khmer-script-25apr19-en.html
  11. Lao: https://www.icann.org/sites/default/files/lgr/lgr-3-lao-script-25apr19-en.html
  12. Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-3-malayalam-script-25apr19-en.html
  13. Oriya: https://www.icann.org/sites/default/files/lgr/lgr-3-oriya-script-25apr19-en.html
  14. Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-3-sinhala-script-25apr19-en.html
  15. Tamil: https://www.icann.org/sites/default/files/lgr/lgr-3-tamil-script-25apr19-en.html
  16. Telugu: https://www.icann.org/sites/default/files/lgr/lgr-3-telugu-script-25apr19-en.html
  17. Thai: https://www.icann.org/sites/default/files/lgr/lgr-3-thai-script-25apr19-en.html

Section IV: Additional Information

Open Date: 25 Apr 2019 23:59 UTC

Close Date: 4 Jun 2019 23:59 UTC

Staff Report Due: 19 Jun 2019 23:59 UTC

