| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" |
| "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <head> |
| <meta name="generator" content= |
| "HTML Tidy for HTML5 for Apple macOS version 5.6.0"> |
| <meta http-equiv="Content-Type" content= |
| "text/html; charset=utf-8"> |
| <meta http-equiv="Content-Language" content="en-us"> |
| <link rel="stylesheet" href= |
| "http://www.unicode.org/reports/reports.css" type="text/css"> |
| <title>UTS #35: Unicode LDML: Supplemental</title> |
| <style type="text/css"> |
| <!-- |
| .dtd { |
| font-family: monospace; |
| font-size: 90%; |
| background-color: #CCCCFF; |
| border-style: dotted; |
| border-width: 1px; |
| } |
| |
| .xmlExample { |
| font-family: monospace; |
| font-size: 80% |
| } |
| |
| .blockedInherited { |
| font-style: italic; |
| font-weight: bold; |
| border-style: dashed; |
| border-width: 1px; |
| background-color: #FF0000 |
| } |
| |
| .inherited { |
| font-weight: bold; |
| border-style: dashed; |
| border-width: 1px; |
| background-color: #00FF00 |
| } |
| |
| .element { |
| font-weight: bold; |
| color: red; |
| } |
| |
| .attribute { |
| font-weight: bold; |
| color: maroon; |
| } |
| |
| .attributeValue { |
| font-weight: bold; |
| color: blue; |
| } |
| |
| li, p { |
| margin-top: 0.5em; |
| margin-bottom: 0.5em |
| } |
| |
| h2, h3, h4, table { |
| margin-top: 1.5em; |
| margin-bottom: 0.5em; |
| } |
| --> |
| </style> |
| </head> |
| <body> |
| <table class="header" width="100%"> |
| <tr> |
| <td class="icon"><a href="http://unicode.org"><img alt= |
| "[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif" |
| width="34" height="33" style= |
| "vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a> |
| <a class="bar" href= |
| "http://www.unicode.org/reports/">Technical Reports</a></td> |
| </tr> |
| <tr> |
| <td class="gray"> </td> |
| </tr> |
| </table> |
| <div class="body"> |
| <h2 style="text-align: center">Unicode Technical Standard #35</h2> |
| <h1>Unicode Locale Data Markup Language (LDML)<br> |
| Part 6: Supplemental</h1> |
| <!-- At least the first row of this header table should be identical across the parts of this UTS. --> |
| <table border="1" cellpadding="2" cellspacing="0" class="wide"> |
| <tr> |
| <td>Version</td> |
| <td>36</td> |
| </tr> |
| <tr> |
| <td>Editors</td> |
| <td>Steven Loomis (<a href= |
| "mailto:[email protected]">[email protected]</a>) and |
| <a href="tr35.html#Acknowledgments">other CLDR committee |
| members</a></td> |
| </tr> |
| </table> |
| <p>For the full header, summary, and status, see <a href= |
| "tr35.html">Part 1: Core</a></p> |
| <h3><i>Summary</i></h3> |
| <p>This document describes parts of an XML format |
| (<i>vocabulary</i>) for the exchange of structured locale data. |
| This format is used in the <a href= |
| "http://cldr.unicode.org/">Unicode Common Locale Data |
| Repository</a>.</p> |
| <p>This is a partial document, describing only those parts of |
| the LDML that are relevant for supplemental data. For the other |
| parts of the LDML see the <a href="tr35.html">main LDML |
| document</a> and the links above.</p> |
| <h3><i>Status</i></h3> |
| |
| <!-- NOT YET APPROVED |
| <p> |
| <i class="changed">This is a<b><font color="#ff3333"> |
| draft </font></b>document which may be updated, replaced, or superseded by |
| other documents at any time. Publication does not imply endorsement |
| by the Unicode Consortium. This is not a stable document; it is |
| inappropriate to cite this document as other than a work in |
| progress. |
| </i> |
| </p> |
| END NOT YET APPROVED --> |
| <!-- APPROVED --> |
| <p><i>This document has been reviewed by Unicode members and |
| other interested parties, and has been approved for publication |
| by the Unicode Consortium. This is a stable document and may be |
| used as reference material or cited as a normative reference by |
| other specifications.</i></p> |
| <!-- END APPROVED --> |
| |
| <blockquote> |
| <p><i><b>A Unicode Technical Standard (UTS)</b> is an |
| independent specification. Conformance to the Unicode |
| Standard does not imply conformance to any UTS.</i></p> |
| </blockquote> |
| <p><i>Please submit corrigenda and other comments with the CLDR |
| bug reporting form [<a href="tr35.html#Bugs">Bugs</a>]. Related |
| information that is useful in understanding this document is |
| found in the <a href="tr35.html#References">References</a>. For |
| the latest version of the Unicode Standard see [<a href= |
| "tr35.html#Unicode">Unicode</a>]. For a list of current Unicode |
| Technical Reports see [<a href= |
| "tr35.html#Reports">Reports</a>]. For more information about |
| versions of the Unicode Standard, see [<a href= |
| "tr35.html#Versions">Versions</a>].</i></p> |
| <!-- This section of Parts should be identical in all of the parts of this UTS. --> |
| <h2><a name="Parts" href="#Parts" id="Parts">Parts</a></h2> |
| <p>The LDML specification is divided into the following |
| parts:</p> |
| <ul class="toc"> |
| <li>Part 1: <a href="tr35.html#Contents">Core</a> (languages, |
| locales, basic structure)</li> |
| <li>Part 2: <a href="tr35-general.html#Contents">General</a> |
| (display names & transforms, etc.)</li> |
| <li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a> |
| (number & currency formatting)</li> |
| <li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> |
| (date, time, time zone formatting)</li> |
| <li>Part 5: <a href= |
| "tr35-collation.html#Contents">Collation</a> (sorting, |
| searching, grouping)</li> |
| <li>Part 6: <a href= |
| "tr35-info.html#Contents">Supplemental</a> (supplemental |
| data)</li> |
| <li>Part 7: <a href= |
| "tr35-keyboards.html#Contents">Keyboards</a> (keyboard |
| mappings)</li> |
| </ul> |
| <h2><a name="Contents" href="#Contents" id="Contents">Contents |
| of Part 6, Supplemental</a></h2> |
| <!-- START Generated TOC: CheckHtmlFiles --> |
| <ul class="toc"> |
| <li>1 <a href="#Supplemental_Data">Introduction Supplemental |
| Data</a></li> |
| <li>2 <a href="#Territory_Data">Territory Data</a> |
| <ul class="toc"> |
| <li>2.1 <a href= |
| "#Supplemental_Territory_Containment">Supplemental |
| Territory Containment</a></li> |
| <li>2.2 <a href="#Subdivision_Containment">Subdivision |
| Containment</a></li> |
| <li>2.3 <a href= |
| "#Supplemental_Territory_Information">Supplemental |
| Territory Information</a></li> |
| <li>2.4 <a href= |
| "#Territory_Based_Preferences">Territory-Based |
| Preferences</a> |
| <ul class="toc"> |
| <li>2.4.1 <a href= |
| "#Preferred_Units_For_Usage">Preferred Units for |
| Specific Usages</a> |
| <ul class="toc"> |
| <li>Table: <a href= |
| "#Unit_Preference_Categories">Unit Preference |
| Categories</a></li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| <li>2.5 <a href="#rgScope"><rgScope>: Scope of the |
| “rg” Locale Key</a></li> |
| </ul> |
| </li> |
| <li>3 <a href="#Supplemental_Language_Data">Supplemental |
| Language Data</a> |
| <ul class="toc"> |
| <li>3.1 <a href= |
| "#Supplemental_Language_Grouping">Supplemental Language |
| Grouping</a></li> |
| </ul> |
| </li> |
| <li>4 <a href="#Supplemental_Code_Mapping">Supplemental Code |
| Mapping</a></li> |
| <li>5 <a href="#Telephone_Code_Data">Telephone Code Data</a> |
| (Deprecated)</li> |
| <li>6 <a href="#Postal_Code_Validation">Postal Code |
| Validation (Deprecated)</a></li> |
| <li>7 <a href= |
| "#Supplemental_Character_Fallback_Data">Supplemental |
| Character Fallback Data</a></li> |
| <li>8 <a href="#Coverage_Levels">Coverage Levels</a> |
| <ul class="toc"> |
| <li>8.1 <a href= |
| "#Coverage_Level_Definitions">Definitions</a></li> |
| <li>8.2 <a href="#Coverage_Level_Data_Requirements">Data |
| Requirements</a></li> |
| <li>8.3 <a href="#Coverage_Level_Default_Values">Default |
| Values</a></li> |
| </ul> |
| </li> |
| <li>9 <a href="#Appendix_Supplemental_Metadata">Supplemental |
| Metadata</a> |
| <ul class="toc"> |
| <li>9.1 <a href= |
| "#Supplemental_Alias_Information">Supplemental Alias |
| Information</a> |
| <ul class="toc"> |
| <li>Table: <a href="#Alias_Attribute_Values">Alias |
| Attribute Values</a></li> |
| </ul> |
| </li> |
| <li>9.2 <a href= |
| "#Supplemental_Deprecated_Information">Supplemental |
| Deprecated Information (Deprecated)</a></li> |
| <li>9.3 <a href="#Default_Content">Default |
| Content</a></li> |
| </ul> |
| </li> |
| <li>10 <a href="#Metadata_Elements">Locale Metadata |
| Elements</a></li> |
| <li>11 <a href="#Version_Information">Version |
| Information</a></li> |
| <li>12 <a href="#Parent_Locales">Parent Locales</a></li> |
| </ul><!-- END Generated TOC: CheckHtmlFiles --> |
| <h2>1 Introduction <a name="Supplemental_Data" href= |
| "#Supplemental_Data" id="Supplemental_Data">Supplemental |
| Data</a></h2> |
| <p>The following represents the format for additional |
| supplemental information. This is information that is important |
| for internationalization and proper use of CLDR, but is not |
| contained in the locale hierarchy. It is not localizable, nor |
| is it overridden by locale data. The current CLDR data can be |
| viewed in the <a href= |
| "http://www.unicode.org/cldr/data/charts/supplemental/index.html"> |
| Supplemental Charts</a>.</p> |
| <p class="dtd"> |
| <!ELEMENT supplementalData (version, generation?, |
| cldrVersion?, currencyData?, territoryContainment?, |
| subdivisionContainment?, languageData?, territoryInfo?, |
| postalCodeData?, calendarData?, calendarPreferenceData?, |
| weekData?, timeData?, measurementData?, unitPreferenceData?, |
| timezoneData?, characters?, transforms?, metadata?, |
| codeMappings?, parentLocales?, likelySubtags?, metazoneInfo?, |
| plurals?, telephoneCodeData?, numberingSystems?, |
| bcp47KeywordMappings?, gender?, references?, languageMatching?, |
| dayPeriodRuleSet*, metaZones?, primaryZones?, windowsZones?, |
| coverageLevels?, idValidity?, rgScope?) ></p> |
| <p>The data in CLDR is presently split into multiple files: |
| supplementalData.xml, supplementalMetadata.xml, characters.xml, |
| likelySubtags.xml, ordinals.xml, plurals.xml, |
| telephoneCodeData.xml, genderList.xml, plus transforms (see |
| <i>Part 2 Section 10 <a href= |
| "tr35-general.html#Transforms">Transforms</a></i> and <i>Part 2 |
| Section 10.3 <a href= |
| "tr35-general.html#Transform_Rules_Syntax">Transform Rule |
| Syntax</a></i>). The split is just for convenience: logically, |
| they are treated as though they were a single file. Future |
| versions of CLDR may split the data in a different fashion. Do |
| not depend on any specific XML filename or path for |
| supplemental data.</p> |
| <p>Note that <a href="#Metadata_Elements">Chapter 10</a> |
| presents information about metadata that is maintained on a |
| per-locale basis. It is included in this section because it is |
| not intended to be used as part of the locale itself.</p> |
| <h2>2 <a name="Territory_Data" href="#Territory_Data" id= |
| "Territory_Data">Territory Data</a></h2> |
| <h3>2.1 <a name="Supplemental_Territory_Containment" href= |
| "#Supplemental_Territory_Containment" id= |
| "Supplemental_Territory_Containment">Supplemental Territory |
| Containment</a></h3> |
| <p class="dtd"><!ELEMENT territoryContainment ( group* ) |
| ><br> |
| <!ELEMENT group EMPTY ><br> |
| <!ATTLIST group type NMTOKEN #REQUIRED ><br> |
| <!ATTLIST group contains NMTOKENS #IMPLIED ><br> |
| <!ATTLIST group grouping ( true | false ) #IMPLIED ><br> |
| <!ATTLIST group status ( deprecated | grouping ) #IMPLIED |
| ></p> |
| <p>The following data provides information that shows groupings |
| of countries (regions). The data is based on the [<a href= |
| "tr35.html#UNM49">UNM49</a>]. There is one special code, |
| <code>QO</code>, which is used for outlying areas of Oceania |
| that are typically uninhabited. The territory containment forms |
| a tree with the following levels:</p> |
| <p align="center">World</p> |
| <p align="center">Continent</p> |
| <p align="center">Subcontinent</p> |
| <p align="center">Country</p> |
| <p>Excluding groupings, in this tree:</p> |
| <ul> |
| <li>All non-overlapping regions form a strict tree rooted at |
| World</li> |
| <li>All leaf nodes (countries) are at depth 4. Some of |
| these “country” regions are actually parts of other |
| countries, such as Hong Kong (part of China). Such |
| relationships are not part of the containment data.</li> |
| </ul> |
| <p>For a chart showing the relationships (plus the included |
| timezones), see the <a href= |
| "http://www.unicode.org/cldr/charts/latest/supplemental/territory_containment_un_m_49.html"> |
| Territory Containment Chart</a>. The XML structure has the |
| following form.</p> |
| <pre><territoryContainment></pre> |
| <blockquote> |
| <pre> |
| <group type="001" contains="002 009 019 142 150"/> <!--World --> |
| <group type="011" contains="BF BJ CI CV GH GM GN GW LR ML MR NE NG SH SL SN TG"/> <!--Western Africa --> |
| <group type="013" contains="BZ CR GT HN MX NI PA SV"/> <!--Central America --> |
| <group type="014" contains="BI DJ ER ET KE KM MG MU MW MZ RE RW SC SO TZ UG YT ZM ZW"/> <!--Eastern Africa --> |
| <group type="142" contains="030 035 062 145"/> <!--Asia --> |
| <group type="145" contains="AE AM AZ BH CY GE IL IQ JO KW LB OM PS QA SA SY TR YE"/> <!--Western Asia --> |
| <group type="015" contains="DZ EG EH LY MA SD TN"/> <!--Northern Africa --> |
| ...</pre> |
| </blockquote> |
| <p>There are groupings that don't follow this regular |
| structure, such as:</p> |
| <pre> |
| <group type="003" contains="013 021 029" grouping="true"/> <!--North America --></pre> |
| <p>These are marked with the attribute <span class= |
| "attribute">grouping</span>="<span class= |
| "attributeValue">true</span>".</p> |
| <p>When groupings have been deprecated but kept around for |
| backwards compatibility, they are marked with the attribute |
| <span class="attribute">status</span>="<span class= |
| "attributeValue">deprecated</span>", like this:</p> |
| <pre> |
| <group type="029" contains="AN" status="deprecated"/> <!--Caribbean --></pre> |
| <p>When the containment relationship itself is a grouping, it |
| is marked with the attribute <span class= |
| "attribute">status</span>="<span class= |
| "attributeValue">grouping</span>", like this:</p> |
| <pre> |
| <group type="150" contains="EU" status="grouping"/> <!--Europe --></pre> |
| <p>That is, the type value isn’t a grouping, but if you filter |
| out groupings you can drop this containment. In the example |
| above, EU is a grouping, and contained in 150.</p> |
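| <p>The filtering described above can be sketched in Python. This is an illustrative sketch only, not part of the specification: the function names <code>load_containment</code> and <code>ancestors</code> are hypothetical, the sample data is abbreviated from the examples in this section, and skipping <code>status="deprecated"</code> entries is an assumption about what a consumer would typically want.</p> |

```python
import xml.etree.ElementTree as ET

# Abbreviated sample copied from the examples in this section.
SAMPLE = """
<territoryContainment>
  <group type="001" contains="002 009 019 142 150"/>
  <group type="142" contains="030 035 062 145"/>
  <group type="145" contains="AE AM AZ BH CY GE IL IQ JO KW LB OM PS QA SA SY TR YE"/>
  <group type="003" contains="013 021 029" grouping="true"/>
  <group type="029" contains="AN" status="deprecated"/>
  <group type="150" contains="EU" status="grouping"/>
</territoryContainment>
"""


def load_containment(xml_text, include_groupings=False):
    """Return {parent: [children]} (hypothetical helper).

    Deprecated entries are always skipped (an assumption); entries marked
    grouping="true" or status="grouping" are kept only on request.
    """
    tree = {}
    for group in ET.fromstring(xml_text).iter("group"):
        if group.get("status") == "deprecated":
            continue
        is_grouping = (group.get("grouping") == "true"
                       or group.get("status") == "grouping")
        if is_grouping and not include_groupings:
            continue
        children = group.get("contains", "").split()
        # The same type may appear in more than one <group>, so merge.
        tree.setdefault(group.get("type"), []).extend(children)
    return tree


def ancestors(tree, code):
    """Return the chain of containing regions, nearest first."""
    parent_of = {child: parent
                 for parent, kids in tree.items() for child in kids}
    chain = []
    while code in parent_of:
        code = parent_of[code]
        chain.append(code)
    return chain


strict = load_containment(SAMPLE)
print(ancestors(strict, "TR"))  # ['145', '142', '001']
```

| <p>With <code>include_groupings=True</code> the same loader also returns the <code>003</code> grouping and the <code>150</code> to <code>EU</code> relationship, which the strict tree leaves out.</p> |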
| <h3>2.2 <a name="Subdivision_Containment" href= |
| "#Subdivision_Containment" id= |
| "Subdivision_Containment">Subdivision Containment</a></h3> |
| <p class="dtd"><!ELEMENT subdivisionContainment ( subgroup* |
| ) ><br> |
| <br> |
| <!ELEMENT subgroup EMPTY ><br> |
| <!ATTLIST subgroup type NMTOKEN #REQUIRED ><br> |
| <!ATTLIST subgroup contains NMTOKENS #IMPLIED ></p> |
| <p>The subdivision containment data is similar to the territory |
| containment. It is based on ISO 3166-2 data, but may diverge |
| from it in the future.</p> |
| <p class="xmlExample"><subgroup type="BD" contains="bda bdb |
| bdc bdd bde bdf bdg bdh"/><br> |
| <subgroup type="bda" contains="bd02 bd06 bd07 bd25 bd50 |
| bd51"/></p> |
| <p>The <strong>type</strong> is a <code><a href= |
| "tr35.html#unicode_region_subtag">unicode_region_subtag</a></code> |
| (territory) identifier for the top level of containment, or a |
| <code><a href= |
| "tr35.html#unicode_subdivision_subtag">unicode_subdivision_id</a></code> |
| for lower levels of containment when there are multiple levels. |
| The <strong>contains</strong> value is a space-delimited list |
| of one or more <code><a href= |
| "tr35.html#unicode_subdivision_subtag">unicode_subdivision_id</a></code> |
| values. In the example above, subdivision bda contains other |
| subdivisions bd02, bd06, bd07, bd25, bd50, bd51.</p> |
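| <p>As a rough illustration of how multi-level containment can be walked, the following Python sketch expands a region into the subdivisions that have no further children listed. The helper names <code>load_subdivisions</code> and <code>leaves</code> are hypothetical, and the sample contains only the two <code>subgroup</code> elements shown above.</p> |

```python
import xml.etree.ElementTree as ET

# Only the two subgroup elements from the example above.
SAMPLE = """
<subdivisionContainment>
  <subgroup type="BD" contains="bda bdb bdc bdd bde bdf bdg bdh"/>
  <subgroup type="bda" contains="bd02 bd06 bd07 bd25 bd50 bd51"/>
</subdivisionContainment>
"""


def load_subdivisions(xml_text):
    """Return {type: [contained subdivision ids]} (hypothetical helper)."""
    return {sg.get("type"): sg.get("contains", "").split()
            for sg in ET.fromstring(xml_text).iter("subgroup")}


def leaves(containment, code):
    """Recursively expand code to ids that have no subgroup of their own."""
    children = containment.get(code)
    if not children:
        return [code]
    result = []
    for child in children:
        result.extend(leaves(containment, child))
    return result


containment = load_subdivisions(SAMPLE)
print(leaves(containment, "BD"))
```

| <p>In this sample, <code>bda</code> is replaced by its six contained subdivisions, while <code>bdb</code> through <code>bdh</code> are returned unchanged because the sample lists no children for them.</p> |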
| <p>Note: Formerly (in CLDR 28 through 30):</p> |
| <ul> |
| <li>The <strong>type</strong> attribute could only contain a |
| <code>unicode_region_subtag</code>;</li> |
| <li>The <strong>contains</strong> attribute contained |
| <code>unicode_subdivision_suffix</code> values; these are not |
| unique across multiple territories, so...</li> |
| <li>For lower containment levels, a now-deprecated |
| <strong>subtype</strong> attribute was used to specify the parent |
| <code>unicode_subdivision_suffix</code>.</li> |
| </ul> |
| <h3>2.3 <a name="Supplemental_Territory_Information" href= |
| "#Supplemental_Territory_Information" id= |
| "Supplemental_Territory_Information">Supplemental Territory |
| Information</a></h3> |
| <p class="dtd"><!ELEMENT territory ( languagePopulation* ) |
| ><br> |
| <!ATTLIST territory type NMTOKEN #REQUIRED ><br> |
| <!ATTLIST territory gdp NMTOKEN #REQUIRED ><br> |
| <!ATTLIST territory literacyPercent NMTOKEN #REQUIRED |
| ><br> |
| <!ATTLIST territory population NMTOKEN #REQUIRED ><br> |
| <br> |
| <!ELEMENT languagePopulation EMPTY ><br> |
| <!ATTLIST languagePopulation type NMTOKEN #REQUIRED ><br> |
| <!ATTLIST languagePopulation literacyPercent NMTOKEN |
| #IMPLIED ><br> |
| <!ATTLIST languagePopulation writingPercent NMTOKEN #IMPLIED |
| ><br> |
| <!ATTLIST languagePopulation populationPercent NMTOKEN |
| #REQUIRED ><br> |
| <!ATTLIST languagePopulation officialStatus |
| (de_facto_official | official | official_regional | |
| official_minority) #IMPLIED ></p> |
| <p>This data provides testing information for language and |
| territory populations. The main goal is to provide approximate |
| figures for the literate, functional population for each |
| language in each territory: that is, the population that is |
| able to read and write each language, and is comfortable enough |
| to use it with computers. For a chart of this data, see |
| <a href='http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html'> |
| Territory-Language Information</a>.</p> |
| <p><em>Example</em></p> |
| <pre style='font-size: 70%'> |
| <territory type="AO" gdp="175500000000" literacyPercent="70.4" population="19088100"> <!--Angola--> |
| <languagePopulation type="pt" populationPercent="67" officialStatus="official"/> <!--Portuguese--> |
| <languagePopulation type="umb" populationPercent="29"/> <!--Umbundu--> |
| <languagePopulation type="kmb" writingPercent="10" populationPercent="25" references="R1034"/> <!--Kimbundu--> |
| <languagePopulation type="ln" populationPercent="0.67" references="R1010"/> <!--Lingala--> |
| </territory></pre> |
| <p>Note that reliable information is difficult to obtain; the |
| information in CLDR is an estimate culled from different |
| sources, including the World Bank, CIA Factbook, and others. |
| The GDP and country literacy figures are taken from the World |
| Bank where available, otherwise supplemented by FactBook data |
| and other sources. The GDP figures are “PPP (constant 2000 |
| international $)”. Much of the per-language data is taken from |
| the Ethnologue, but is supplemented and processed using many |
| other sources, including per-country census data. (The focus of |
| the Ethnologue is native speakers, which includes people who |
| are not literate, and excludes people who are functional |
| second-language users.) Some references are marked in the XML |
| files, with attributes such as <code>references="R1010"</code>.</p> |
| <p>The percentages may add up to more than 100% due to |
| multilingual populations, or may be less than 100% due to |
| illiteracy or because the data has not yet been gathered or |
| processed. Languages with smaller populations might not be |
| included.</p> |
| <p>The following describes the meaning of some of these |
| terms—as used in CLDR—in more detail.</p> |
| <p><a name="literacy_percent" href="#literacy_percent" id= |
| "literacy_percent">literacy percent for the |
| territory</a> — an estimate of the percentage of the |
| country’s population that is functionally literate.</p> |
| <p><a name="language_population_percent" href= |
| "#language_population_percent" id= |
| "language_population_percent">language population |
| percent</a> — an estimate of the number of people who are |
| functional in that language in that country, including both |
| first and second language speakers. The level of fluency is |
| that necessary to use a UI on a computer, smartphone, or |
| similar devices, rather than complete fluency.</p> |
| <p><a name="literacy_percent_for_langPop" href= |
| "#literacy_percent_for_langPop" id= |
| "literacy_percent_for_langPop">literacy percent for language |
| population</a> — Within the set of people who are |
| functional in the corresponding language (as specified by |
| <a href="#language_population_percent">language population |
| percent</a>), this is an estimate of the percentage of those |
| people who are functionally literate in that language, that is, |
| who are <em>capable</em> of reading or writing in that |
| language, even if they do not regularly use it for reading or |
| writing. If not specified, this defaults to the <a href= |
| "#literacy_percent">literacy percent for the territory</a>.</p> |
| <p><a name="writing_percent" href="#writing_percent" id= |
| "writing_percent">writing percent</a> — Within the set of |
| people who are functional in the corresponding language (as |
| specified by <a href="#language_population_percent">language |
| population percent</a>), this is an estimate of the percentage |
| of those people who regularly read or write a significant |
| amount in that language. Ideally, the regularity would be |
| measured as “7-day actives”. If it is known that the language |
| is not widely or commonly written, but there are no solid |
| figures, the value is typically given as 1%-5%.</p> |
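| <p>The relationship between these percentages can be sketched in Python using the Angola example above. This is a minimal sketch, not a normative algorithm: the function name <code>language_estimates</code> and the output keys are hypothetical, and returning <code>None</code> when <code>writingPercent</code> is absent is an assumption, since no default is defined for it here. The fallback of the per-language literacy percent to the territory value follows the definition above.</p> |

```python
import xml.etree.ElementTree as ET

# The Angola example from this section.
SAMPLE = """
<territory type="AO" gdp="175500000000" literacyPercent="70.4" population="19088100">
  <languagePopulation type="pt" populationPercent="67" officialStatus="official"/>
  <languagePopulation type="umb" populationPercent="29"/>
  <languagePopulation type="kmb" writingPercent="10" populationPercent="25" references="R1034"/>
  <languagePopulation type="ln" populationPercent="0.67" references="R1010"/>
</territory>
"""


def language_estimates(xml_text):
    """Return rough head counts per language (hypothetical helper)."""
    territory = ET.fromstring(xml_text)
    population = float(territory.get("population"))
    territory_literacy = float(territory.get("literacyPercent"))
    estimates = {}
    for lp in territory.iter("languagePopulation"):
        # People functional in the language, first and second language.
        users = population * float(lp.get("populationPercent")) / 100
        # Per-language literacy defaults to the territory literacy percent.
        literacy = float(lp.get("literacyPercent", territory_literacy))
        writing = lp.get("writingPercent")
        estimates[lp.get("type")] = {
            "users": round(users),
            "literate": round(users * literacy / 100),
            # Assumption: no default is applied when writingPercent is absent.
            "writing": (round(users * float(writing) / 100)
                        if writing is not None else None),
        }
    return estimates


for language, figures in language_estimates(SAMPLE).items():
    print(language, figures)
```

| <p>Because the underlying figures are estimates, results like these are best treated as orders of magnitude rather than exact counts.</p> |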
| <p>For a language such as Swiss German, which is typically not |
| written, even though nearly the whole native Germanophone |
| population <em>could</em> write in Swiss German, the |
| <a href="#literacy_percent_for_langPop">literacy percent for |
| language population</a> is high, but the <a href= |
| "#writing_percent">writing percent</a> is low.</p> |
| <p><a name="official_language" href="#official_language" id= |
| "official_language">official language</a> — as used in |
| CLDR, a language that can generally be used in all |
| communications with a central government. That is, people can |
| expect that essentially all communication from the government |
| is available in that language (ballots, information pamphlets, |
| legal documents, …) and that they can use that language in any |
| communication to the central government (petitions, forms, |
| filing lawsuits,…).</p> |
| <p>Official languages for a country in this sense are not |
| necessarily the same as those with official legal status in the |
| country. For example, Irish is declared to be an official |
| language in Ireland, but English has no such formal status in |
| the United States. Languages such as the latter are |
| called <em>de facto</em> official languages. As |
| another example, German has legal status in Italy, but cannot |
| be used in all communications with the central government, and |
| is thus not an official language <em>of Italy</em> for CLDR |
| purposes. It is, however, an <em>official regional |
| language</em>. Other languages are declared to be official, but |
| can’t actually be used for all communication with any major |
| governmental entity in the country. There is no intention to |
| mark such nominally official languages as “official” in the |
| CLDR data.</p> |
| <p><a name="official_regional_language" href= |
| "#official_regional_language" id= |
| "official_regional_language">official regional |
| language</a> — a language that is official (<em>de |
| jure</em> or <em>de facto</em>) in a major region within a |
| country, but does not qualify as an official language of the |
| country as a whole. For example, it can be used in an official |
| petition to a provincial government, but not the central |
| government. The term “major” is meant to distinguish from |
| smaller-scale usage, such as for a town or village.</p> |
| <h3>2.4 <a name="Territory_Based_Preferences" href= |
| "#Territory_Based_Preferences" id= |
| "Territory_Based_Preferences">Territory-Based |
| Preferences</a></h3> |
| <p>The default preference for several locale items is based |
| solely on a <a href= |
| "tr35.html#unicode_region_subtag">unicode_region_subtag</a>, |
| which may either be specified as part of a <a href= |
| "tr35.html#unicode_language_id">unicode_language_id</a>, |
| inferred from other locale ID elements using the <a href= |
| "tr35.html#Likely_Subtags">Likely Subtags</a> mechanism, or |
| provided explicitly using an “rg” <a href= |
| "tr35.html#RegionOverride">Region Override</a> locale key. For |
| more information on this process see <a href= |
| "tr35.html#Locale_Inheritance">Locale Inheritance and |
| Matching</a>. The specific items that are handled in this way |
| are:</p> |
| <ul> |
| <li>Default calendar (see <a href= |
| "tr35-dates.html#Calendar_Preference_Data">Calendar |
| Preference Data</a>)</li> |
| <li>Default week conventions (first day of week and weekend |
| days; see <a href="tr35-dates.html#Week_Data">Week |
| Data</a>)</li> |
| <li>Default hour cycle (see <a href= |
| "tr35-dates.html#Time_Data">Time Data</a>)</li> |
| <li>Default currency (see <a href= |
| "tr35-numbers.html#Supplemental_Currency_Data">Supplemental |
| Currency Data</a>)</li> |
| <li>Default measurement system and paper size (see <a href= |
| "tr35-general.html#Measurement_System_Data">Measurement |
| System Data</a>)</li> |
| <li>Default units for specific usage (see <a href= |
| "#Preferred_Units_For_Usage">Preferred Units for Specific |
| Usages</a>, below)</li> |
| </ul> |
| <h4>2.4.1 <a name="Preferred_Units_For_Usage" href= |
| "#Preferred_Units_For_Usage" id= |
| "Preferred_Units_For_Usage">Preferred Units for Specific |
| Usages</a></h4> |
| <p>This data is intended to map from a particular usage — e.g. |
| measuring the height of a person or the fuel consumption of an |
| automobile — to the unit or combination of units typically used |
| for that usage in a given region. Considerations for such a |
| mapping include:</p> |
| <ul> |
| <li>The list of possible usages is large and open-ended. The |
| intent here is to start with a small set for which there is |
| an urgent need, and expand as necessary.</li> |
| <li>Even for a given usage such as measuring a road distance, |
| there are multiple ranges in use. For example, one set of |
| units may be used for indicating the distance to the next |
| city (kilometers or miles), while another may be used for |
| indicating the distance to the next exit (meters, yards, or |
| feet).</li> |
| <li>There are also differences between more formal usage |
| (official signage, medical records) and more informal usage |
| (conversation, texting).</li> |
| <li>For some usages, the measurement may be expressed using a |
| sequence of units, such as “1 meter, 78 centimeters” or “12 |
| stone, 2 pounds”.</li> |
| </ul> |
| <p>The DTD structure is as follows:</p> |
| <p class="dtd"><!ELEMENT unitPreferenceData ( |
| unitPreferences* ) ><br> |
| <br> |
| <!ELEMENT unitPreferences ( unitPreference* ) ><br> |
| <!ATTLIST unitPreferences category NMTOKEN #REQUIRED |
| ><br> |
| <!ATTLIST unitPreferences usage NMTOKENS #REQUIRED ><br> |
| <!ATTLIST unitPreferences scope (small) #IMPLIED ><br> |
| <br> |
| <!ELEMENT unitPreference ( #PCDATA ) ><br> |
| <!ATTLIST unitPreference regions NMTOKENS #REQUIRED |
| ><br></p> |
| <p>An example of data using this structure is as follows:</p> |
| <pre> |
| <unitPreferenceData> |
| ... |
| <unitPreferences category="length" usage="person"> |
| <unitPreference regions="001">centimeter</unitPreference> |
| <unitPreference regions="BR CN DE DK MX NL NO PL PT RU" alt="informal">meter centimeter</unitPreference> |
| <unitPreference regions="AT BE DZ EG ES FR HK ID IL IT JO MY SA SE TR VN">meter centimeter</unitPreference> |
| <unitPreference regions="CA GB IN US" alt="informal">foot inch</unitPreference> |
| <unitPreference regions="US">inch</unitPreference> |
| </unitPreferences> |
| <unitPreferences category="length" usage="person" scope="small"> |
| <unitPreference regions="001">centimeter</unitPreference> |
| <unitPreference regions="CA GB IN" alt="informal">inch</unitPreference> |
| <unitPreference regions="US">inch</unitPreference> |
| </unitPreferences> |
| ... |
| </unitPreferenceData> |
| </pre> |
| <p>There are several things to note:</p> |
| <ul> |
| <li>The <unitPreferences> <em>category</em> attribute |
| values match a <unit> element <em>type</em> attribute |
| value, as listed in <a href= |
| "tr35-general.html#Unit_Elements">Unit Elements</a>.</li> |
| <li>The <unitPreferences> <em>usage</em> attribute |
| values are specific to this data; current values are listed |
| in a table at the end of this section.</li> |
| <li>The <unitPreferences> element may have a |
| <em>scope="small"</em> attribute to indicate that it is |
| intended for the smaller range of values for that usage, such as |
| measuring the height or weight of an infant versus that of an |
| adult, or measuring the road distance to the next exit versus |
| that to the next city.</li> |
| <li>Each <unitPreferences> element must contain one |
| <unitPreference> element with attribute |
| <em>regions="001"</em>; this specifies the worldwide default |
| unit or unit sequence for the usage and scope specified by |
| the <unitPreferences> element. There may be additional |
| <unitPreference> elements which specify a different |
| unit or unit sequence for specific regions and possibly for a |
| different degree of formality.</li> |
| <li>The <unitPreference> element may have an |
| <em>alt="informal"</em> attribute to indicate that the |
| specified unit or unit sequence is preferred in more informal |
| usage.</li> |
| <li>The value of the <unitPreference> element is a |
| sequence of one or more space-separated unit names from the |
| <unit> element <em>unit</em> attribute values for the |
| relevant type, as listed in <a href= |
| "tr35-general.html#Unit_Elements">Unit Elements</a>.</li> |
| </ul> |
| <p>For a given combination of category, usage, scope and |
| formality, the intended procedure for looking up the unit or |
| unit combination to use for a given region is as follows:</p> |
| <ul> |
| <li>Get the appropriate <unitPreferences> element for |
| the desired <em>category</em> and <em>usage</em>: If |
| scope=small is desired and a <unitPreferences> element |
| with <em>scope="small"</em> exists for the desired |
| <em>category</em> and <em>usage</em>, use it. Otherwise, use |
| a <unitPreferences> element for the desired |
| <em>category</em> and <em>usage</em> that has no |
| <em>scope</em> attribute. In the selected |
| <unitPreferences> element, pick a |
| <unitPreference> element using the following |
| steps.</li> |
| <li>If informal usage is preferred, look for a |
| <unitPreference> element with <em>alt="informal"</em> |
| whose <em>regions</em> attribute includes the given region. |
| If found, use the specified unit [sequence].</li> |
| <li>Look for a <unitPreference> element whose |
| <em>regions</em> attribute includes the given region. If |
| found, use the specified unit [sequence].</li> |
| <li>Look for a <unitPreference> element with |
| <em>alt="informal"</em> whose <em>regions</em> attribute is |
| "001". If found, use the specified unit [sequence].</li> |
| <li>Look for a <unitPreference> element whose |
| <em>regions</em> attribute is "001". If found, use the |
| specified unit [sequence].</li> |
| </ul> |
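| <p>The lookup steps above can be sketched as follows. This is |
| a minimal illustration, not a reference implementation: the |
| class and function names are hypothetical, the sample values |
| are invented for the example and are not CLDR data, and the |
| sketch assumes that the two <em>alt="informal"</em> steps apply |
| only when informal usage is preferred.</p> |

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitPreference:
    """One <unitPreference> element (hypothetical in-memory form)."""
    regions: frozenset      # region codes from the regions attribute
    units: str              # element value: space-separated unit names
    informal: bool = False  # True when alt="informal" is present


def select_preferences(table, category, usage, small=False):
    """Pick the <unitPreferences> entry for a category and usage.

    `table` maps (category, usage, scope) to a list of UnitPreference;
    scope is None when the element has no scope attribute.
    """
    if small and (category, usage, "small") in table:
        return table[(category, usage, "small")]
    return table[(category, usage, None)]


def lookup_units(preferences, region, informal=False):
    """Return the unit or unit sequence for a region, or None."""
    def find(target_region, want_informal):
        for pref in preferences:
            if pref.informal == want_informal and target_region in pref.regions:
                return pref.units
        return None

    # Steps in the order given by the text; the informal steps are
    # skipped unless informal usage is preferred (an assumption).
    steps = []
    if informal:
        steps.append((region, True))
    steps.append((region, False))
    if informal:
        steps.append(("001", True))
    steps.append(("001", False))

    for target_region, want_informal in steps:
        units = find(target_region, want_informal)
        if units is not None:
            return units
    return None


# Invented sample data, for illustration only.
SAMPLE = {
    ("length", "road", None): [
        UnitPreference(frozenset({"001"}), "kilometer"),
        UnitPreference(frozenset({"US", "GB"}), "mile"),
    ],
    ("length", "road", "small"): [
        UnitPreference(frozenset({"001"}), "meter"),
    ],
    ("length", "person", None): [
        UnitPreference(frozenset({"001"}), "centimeter"),
        UnitPreference(frozenset({"US"}), "foot inch", informal=True),
    ],
}
```

| <p>With this sample, a request for small-scope road length in |
| a region that has no entry of its own falls through to the |
| "001" entry of the <em>scope="small"</em> element.</p> |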
| <p>CLDR 29 contains usage mapping data for the following |
| combinations of category, usage, and scope:</p> |
| <table border="1" cellpadding="4" cellspacing="0"> |
| <caption> |
| <a name="Unit_Preference_Categories" href= |
| "#Unit_Preference_Categories" id= |
| "Unit_Preference_Categories">Unit Preference Categories</a> |
| </caption> |
| <tr> |
| <td><strong>Category</strong></td> |
| <td><strong>Usage</strong></td> |
| <td><strong>Sample Value</strong></td> |
| </tr> |
| <tr> |
| <td><em>area</em></td> |
| <td>land-agricult</td> |
| <td>hectare</td> |
| </tr> |
| <tr> |
| <td><em>area</em></td> |
| <td>land-commercl</td> |
| <td>hectare</td> |
| </tr> |
| <tr> |
| <td><em>area</em></td> |
| <td>land-residntl</td> |
| <td>hectare</td> |
| </tr> |
| <tr> |
| <td><em>concentr</em></td> |
| <td>blood-glucose</td> |
| <td>milligram-per-deciliter</td> |
| </tr> |
| <tr> |
| <td><em>consumption</em></td> |
| <td>vehicle-fuel</td> |
| <td>liter-per-100kilometers</td> |
| </tr> |
| <tr> |
| <td><em>duration</em></td> |
| <td>music-track</td> |
| <td>minute second</td> |
| </tr> |
| <tr> |
| <td><em>duration</em></td> |
| <td>person-age</td> |
| <td>year-person month-person</td> |
| </tr> |
| <tr> |
| <td><em>duration</em></td> |
| <td>tv-program</td> |
| <td>minute second</td> |
| </tr> |
| <tr> |
| <td><em>energy</em></td> |
| <td>food</td> |
| <td>foodcalorie</td> |
| </tr> |
| <tr> |
| <td><em>energy</em></td> |
| <td>person-usage</td> |
| <td>kilocalorie</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>person</td> |
| <td>centimeter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>person, scope=small</td> |
| <td>centimeter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>rainfall</td> |
| <td>millimeter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>road</td> |
| <td>kilometer</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>road, scope=small</td> |
| <td>meter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>snowfall</td> |
| <td>centimeter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>vehicle</td> |
| <td>meter</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>visiblty</td> |
| <td>kilometer</td> |
| </tr> |
| <tr> |
| <td><em>length</em></td> |
| <td>visiblty, scope=small</td> |
| <td>meter</td> |
| </tr> |
| <tr> |
| <td><em>mass</em></td> |
| <td>person</td> |
| <td>kilogram</td> |
| </tr> |
| <tr> |
| <td><em>mass</em></td> |
| <td>person, scope=small</td> |
| <td>gram</td> |
| </tr> |
| <tr> |
| <td><em>pressure</em></td> |
| <td>baromtrc</td> |
| <td>hectopascal</td> |
| </tr> |
| <tr> |
| <td><em>speed</em></td> |
| <td>road-travel</td> |
| <td>kilometer-per-hour</td> |
| </tr> |
| <tr> |
| <td><em>speed</em></td> |
| <td>wind</td> |
| <td>kilometer-per-hour</td> |
| </tr> |
| <tr> |
| <td><em>temperature</em></td> |
| <td>person</td> |
| <td>celsius</td> |
| </tr> |
| <tr> |
| <td><em>temperature</em></td> |
| <td>weather</td> |
| <td>celsius</td> |
| </tr> |
| <tr> |
| <td><em>volume</em></td> |
| <td>vehicle-fuel</td> |
| <td>liter</td> |
| </tr> |
| </table> |
| <h3>2.5 <a name="rgScope" href="#rgScope" id= |
| "rgScope"><rgScope>: Scope of the “rg” Locale |
| Key</a></h3> |
| <p>The supplemental <rgScope> element specifies the data |
| paths for which the region used for data lookup is determined |
| by the value of any “rg” key present in the locale identifier |
| (see <a href="tr35.html#RegionOverride">Region Override</a>). |
| If no “rg” key is present, the region used for lookup is |
| determined as usual: from the unicode_region_subtag if present, |
| else inferred from the unicode_language_subtag. The DTD |
| structure is as follows:</p> |
| <p class="dtd"><!ELEMENT rgScope ( rgPath* ) ><br> |
| <br> |
| <!ELEMENT rgPath EMPTY ><br> |
| <!ATTLIST rgPath path CDATA #REQUIRED ><br></p> |
| <p>The <rgScope> element contains a list of |
| <rgPath> elements, each of which specifies a data path for |
| which any “rg” key determines the region for lookup. For |
| example:</p> |
| <pre> |
| <rgScope> |
| <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*'][@cashDigits='*'][@cashRounding='*']" draft="provisional" /> |
| <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*'][@cashRounding='*']" draft="provisional" /> |
| <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*']" draft="provisional" /> |
| <rgPath path="//supplementalData/calendarPreferenceData/calendarPreference[@territories='#'][@ordering='*']" draft="provisional" /> |
| ... |
| <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*'][@scope='*']/unitPreference[@regions='#'][@alt='*']" draft="provisional" /> |
| <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*'][@scope='*']/unitPreference[@regions='#']" draft="provisional" /> |
| <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*']/unitPreference[@regions='#'][@alt='*']" draft="provisional" /> |
| <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*']/unitPreference[@regions='#']" draft="provisional" /> |
| </rgScope> |
| </pre> |
| <p>The exact format of the path is provisional in CLDR 29. As |
| currently defined:</p> |
| <ul> |
| <li>An attribute value of '*' indicates that the path applies |
| regardless of the value of the attribute.</li> |
| <li>Each path must have exactly one attribute whose value is |
| marked here as '#'; in actual data items with this path, the |
| corresponding value is a list of region codes. It is the |
| region codes in this list that are compared with the region |
| specified by the “rg” key to determine which data item to use |
| for this path.</li> |
| </ul> |
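| <p>One way to apply these two rules is to turn each |
| <rgPath> pattern into a regular expression. The sketch |
| below is only an illustration under stated assumptions: the |
| function names are hypothetical, the region has already been |
| extracted from the “rg” key value, the region list in a data |
| item is space-separated, and the concrete data path in the |
| usage check is invented for the example.</p> |

```python
import re


def compile_rg_path(pattern):
    """Turn an rgPath pattern into a compiled regular expression."""
    if pattern.count("'#'") != 1:
        raise ValueError("an rgPath must contain exactly one '#' value")
    regex = re.escape(pattern)
    # '*' matches any attribute value.
    regex = regex.replace(re.escape("'*'"), "'[^']*'")
    # '#' marks the attribute holding the list of region codes.
    regex = regex.replace(re.escape("'#'"), "'(?P<regions>[^']*)'")
    return re.compile(regex)


def applies_to_region(pattern, data_path, rg_region):
    """True if data_path has this rgPath shape and lists rg_region."""
    match = compile_rg_path(pattern).fullmatch(data_path)
    if match is None:
        return False
    return rg_region in match.group("regions").split()


PATTERN = ("//supplementalData/calendarPreferenceData/"
           "calendarPreference[@territories='#'][@ordering='*']")

# Invented data item, for illustration only.
ITEM = ("//supplementalData/calendarPreferenceData/"
        "calendarPreference[@territories='AE BH'][@ordering='gregorian islamic']")
```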
| <h2>3 <a name="Supplemental_Language_Data" href= |
| "#Supplemental_Language_Data" id= |
| "Supplemental_Language_Data">Supplemental Language |
| Data</a></h2> |
| <p class="dtd"><!ELEMENT languageData ( language* ) ><br> |
| <!ELEMENT language EMPTY ><br> |
| <!ATTLIST language type NMTOKEN #REQUIRED ><br> |
| <!ATTLIST language scripts NMTOKENS #IMPLIED ><br> |
| <!ATTLIST language territories NMTOKENS #IMPLIED ><br> |
| <!ATTLIST language variants NMTOKENS #IMPLIED ><br> |
| <!ATTLIST language alt NMTOKENS #IMPLIED ><br> |
| </p> |
| <p>The language data is used for consistency checking and |
| testing. It provides a list of which languages are used with |
| which scripts and in which countries. To a large extent, |
| however, the territory list has been superseded by the data in |
| <em>Section 2.2 <a href= |
| "#Supplemental_Territory_Information">Supplemental Territory |
| Information</a></em>.</p> |
| <pre> <languageData> |
| <language type="af" scripts="Latn" territories="ZA"/> |
| <language type="am" scripts="Ethi" territories="ET"/> |
| <language type="ar" scripts="Arab" territories="AE BH DZ EG IN IQ JO KW LB |
| LY MA OM PS QA SA SD SY TN YE"/> |
| ...</pre> |
| <p>If the language is not a modern language, or the script is |
| not a modern script, or the language is not a major language of |
| the territory, then the alt attribute is set to secondary.</p> |
| <pre> |
| <language type="fr" scripts="Latn" territories="IT US" alt="secondary" /> |
| ...</pre> |
| <h3>3.1 <a name="Supplemental_Language_Grouping" href= |
| "#Supplemental_Language_Grouping" id= |
| "Supplemental_Language_Grouping">Supplemental Language |
| Grouping</a></h3> |
| <p class="dtd"><!ELEMENT languageGroups ( languageGroup* ) ><br> |
| <!ELEMENT languageGroup ( #PCDATA ) ><br> |
| <!ATTLIST languageGroup parent NMTOKEN #REQUIRED ></p> |
| <p>The language groups supply language containment. For |
| example, the following indicates that fiu is the Unicode |
| language code for a language group that contains chm, et, fi, |
| etc.</p><code><languageGroup |
| parent="<strong>fiu</strong>">chm et <strong>fi</strong> fit |
| fkv hu izh kca koi krl kv liv mdf mns mrj myv smi udm vep vot |
| vro</languageGroup></code> |
| <p>The vast majority of the languageGroup data is extracted |
| from Wikidata, but may be overridden in some cases. The |
| Wikidata information is more fine-grained, but makes use of |
| language groups that don't have ISO or Unicode language codes. |
| Those language groups are omitted from the data. For example, |
| Wikidata has the following child-parent chain, of which only |
| the first and last elements are present in the language |
| groups.</p> |
| <table> |
| <tr> |
| <td>Name</td> |
| <td>Wikidata Code</td> |
| <td>Language Code</td> |
| </tr> |
| <tr> |
| <td>Finnish</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q1412">Q1412</a></td> |
| <td>fi</td> |
| </tr> |
| <tr> |
| <td>Finnic languages</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q33328">Q33328</a></td> |
| </tr> |
| <tr> |
| <td>Finno-Samic languages</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q163652">Q163652</a></td> |
| </tr> |
| <tr> |
| <td>Finno-Volgaic languages</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q161236">Q161236</a></td> |
| </tr> |
| <tr> |
| <td>Finno-Permic languages</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q161240">Q161240</a></td> |
| </tr> |
| <tr> |
| <td>Finno-Ugric languages</td> |
| <td><a href= |
| "https://www.wikidata.org/wiki/Q79890">Q79890</a></td> |
| <td>fiu</td> |
| </tr> |
| </table><br> |
| <h2>4 <a name="Supplemental_Code_Mapping" href= |
| "#Supplemental_Code_Mapping" id= |
| "Supplemental_Code_Mapping">Supplemental Code Mapping</a></h2> |
| <p class="dtd"><!ELEMENT codeMappings (languageCodes*, |
| territoryCodes*, currencyCodes*) ></p> |
| <p class="dtd"><!ELEMENT languageCodes EMPTY ><br> |
| <!ATTLIST languageCodes type NMTOKEN #REQUIRED><br> |
| <!ATTLIST languageCodes alpha3 NMTOKEN #REQUIRED></p> |
| <p class="dtd"><!ELEMENT territoryCodes EMPTY ><br> |
| <!ATTLIST territoryCodes type NMTOKEN #REQUIRED><br> |
| <!ATTLIST territoryCodes numeric NMTOKEN #REQUIRED><br> |
| <!ATTLIST territoryCodes alpha3 NMTOKEN #REQUIRED><br> |
| <!ATTLIST territoryCodes fips10 NMTOKEN #IMPLIED><br> |
| <!ATTLIST territoryCodes internet NMTOKENS #IMPLIED> |
| [deprecated]</p> |
| <p class="dtd"><!ELEMENT currencyCodes EMPTY ><br> |
| <!ATTLIST currencyCodes type NMTOKEN #REQUIRED><br> |
| <!ATTLIST currencyCodes numeric NMTOKEN #REQUIRED></p> |
| <p>The code mapping information provides mappings between the |
| subtags used in the CLDR locale IDs (from BCP 47) and other |
| coding systems or related information. The language code |
| mappings are provided only for codes that have two letters in |
| BCP 47, and map them to their ISO three-letter equivalents. The |
| territory codes provide mappings to numeric (UN M.49 [<a href= |
| "tr35.html#UNM49">UNM49</a>] codes, equivalent to ISO numeric |
| codes), ISO three-letter codes, FIPS 10 codes, and the internet |
| top-level domain codes.</p> |
| <p>The alphabetic codes are only provided where different from |
| the type. For example:</p> |
| <pre> |
| <territoryCodes type="AA" numeric="958" alpha3="AAA"/> |
| <territoryCodes type="AD" numeric="020" alpha3="AND" fips10="AN"/> |
| <territoryCodes type="AE" numeric="784" alpha3="ARE"/> |
| ... |
| <territoryCodes type="GB" numeric="826" alpha3="GBR" fips10="UK"/> |
| ... |
| <territoryCodes type="QU" numeric="967" alpha3="QUU" internet="EU"/> |
| ... |
| <territoryCodes type="XK" numeric="983" alpha3="XKK"/> |
| ...</pre> |
| <p>Where there is no corresponding code, sometimes private use |
| codes are used, such as the numeric code for XK.</p> |
| <p>The currencyCodes are mappings from three-letter currency |
| codes to numeric values (ISO 4217 <a href= |
| "http://www.currency-iso.org/en/home/tables/table-a1.html">Current |
| currency & funds code list</a>). The mapping currently |
| covers only current codes and does not include historic |
| currencies. For example:</p> |
| <pre> |
| <currencyCodes type="AED" numeric="784"/> |
| <currencyCodes type="AFN" numeric="971"/> |
| ... |
| <currencyCodes type="EUR" numeric="978"/> |
| ... |
| <currencyCodes type="ZAR" numeric="710"/> |
| <currencyCodes type="ZMW" numeric="967"/> |
| </pre> |
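| <p>As an illustration of how the code mappings might be |
| consumed, the following sketch loads a few of the example |
| elements shown above into lookup tables. The function names are |
| hypothetical, and the fallback to the <em>type</em> value when |
| an alphabetic code is absent is an assumption based on the |
| statement that alphabetic codes are only provided where they |
| differ from the type.</p> |

```python
import xml.etree.ElementTree as ET

# A few of the example elements from this section, wrapped in a root element.
SAMPLE_XML = """<codeMappings>
  <territoryCodes type="AD" numeric="020" alpha3="AND" fips10="AN"/>
  <territoryCodes type="AE" numeric="784" alpha3="ARE"/>
  <territoryCodes type="GB" numeric="826" alpha3="GBR" fips10="UK"/>
  <currencyCodes type="EUR" numeric="978"/>
  <currencyCodes type="ZAR" numeric="710"/>
</codeMappings>"""


def load_code_mappings(xml_text):
    """Return (territories, currencies) lookup tables keyed by type."""
    root = ET.fromstring(xml_text)
    territories = {e.get("type"): dict(e.attrib)
                   for e in root.iter("territoryCodes")}
    currencies = {e.get("type"): e.get("numeric")
                  for e in root.iter("currencyCodes")}
    return territories, currencies


def fips10_for(territories, region):
    """FIPS 10 code for a region, or None if the region is unknown.

    Assumption: a missing fips10 attribute means the code is the same
    as the type value.
    """
    entry = territories.get(region)
    if entry is None:
        return None
    return entry.get("fips10", region)


TERRITORIES, CURRENCIES = load_code_mappings(SAMPLE_XML)
```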
| <h2>5 <a name="Telephone_Code_Data" href="#Telephone_Code_Data" |
| id="Telephone_Code_Data">Telephone Code Data</a> |
| (Deprecated)</h2> |
| <p>Deprecated in CLDR v34, and data removed.</p> |
| <p class="dtd"><!ELEMENT telephoneCodeData ( |
| codesByTerritory* ) ><br> |
| <br> |
| <!ELEMENT codesByTerritory ( telephoneCountryCode+ ) |
| ><br> |
| <!ATTLIST codesByTerritory territory NMTOKEN #REQUIRED |
| ><br> |
| <br> |
| <!ELEMENT telephoneCountryCode EMPTY ><br> |
| <!ATTLIST telephoneCountryCode code NMTOKEN #REQUIRED |
| ><br> |
| <!ATTLIST telephoneCountryCode from NMTOKEN #IMPLIED |
| ><br> |
| <!ATTLIST telephoneCountryCode to NMTOKEN #IMPLIED ></p> |
| <p>This data specifies the mapping between ITU telephone |
| country codes [<a href="tr35.html#ITUE164">ITUE164</a>] and |
| CLDR-style territory codes (ISO 3166 2-letter codes or |
| non-corresponding UN M.49 [<a href="tr35.html#UNM49">UNM49</a>] |
| 3-digit codes). There are several things to note:</p> |
| <ul> |
| <li>A given telephone country code may map to multiple CLDR |
| territory codes; +1 (North American Numbering Plan) covers the |
| US and Canada, as well as many islands in the Caribbean and |
| some in the Pacific.</li> |
| <li>Some telephone country codes are for global services (for |
| example, some satellite services), and thus correspond to |
| territory code 001.</li> |
| <li>The mappings change over time (territories move from one |
| telephone code to another). These changes are usually planned |
| several years in advance, and there may be a period during |
| which either telephone code can be used to reach the |
| territory. While the CLDR telephone code data is not intended |
| to include past changes, it is intended to incorporate known |
| information on planned future changes, using "from" and "to" |
| date attributes to indicate when mappings are valid.</li> |
| </ul> |
| <p>A subset of the telephone code data might look like the |
| following (showing a past mapping change to illustrate the from |
| and to attributes):</p> |
| <pre><codesByTerritory territory="001"> |
| <telephoneCountryCode code="800"/> <!-- International Freephone Service --> |
| <telephoneCountryCode code="808"/> <!-- International Shared Cost Services (ISCS) --> |
| <telephoneCountryCode code="870"/> <!-- Inmarsat Single Number Access Service (SNAC) --> |
| </codesByTerritory> |
| <codesByTerritory territory="AS"> <!-- American Samoa --> |
| <telephoneCountryCode code="1" from="2004-10-02"/> <!-- +1 684 in North America Numbering Plan --> |
| <telephoneCountryCode code="684" to="2005-04-02"/> <!-- +684 now a spare code --> |
| </codesByTerritory> |
| <codesByTerritory territory="CA"> |
| <telephoneCountryCode code="1"/> <!-- North America Numbering Plan --> |
| </codesByTerritory></pre> |
| <h2>6 <a name="Postal_Code_Validation" href= |
| "#Postal_Code_Validation" id="Postal_Code_Validation">Postal |
| Code Validation (Deprecated)</a></h2> |
| <p>Deprecated in v27. Please see other services that are kept |
| up to date, such as:</p> |
| <ul> |
| <li><a href= |
| "http://i18napis.appspot.com/address/data/US">http://i18napis.appspot.com/address/data/US</a></li> |
| <li><a href= |
| "http://i18napis.appspot.com/address/data/CH">http://i18napis.appspot.com/address/data/CH</a></li> |
| <li>...<br></li> |
| </ul> |
| <p class="dtd"><!ELEMENT postalCodeData (postCodeRegex*) |
| ><br> |
| <!ELEMENT postCodeRegex (#PCDATA) ><br> |
| <!ATTLIST postCodeRegex territoryId NMTOKEN |
| #REQUIRED><br></p> |
| <p>The Postal Code regex information can be used to validate |
| postal codes used in different countries. In some cases, the |
| regex is quite simple, such as for Germany:</p> |
| <pre> |
| <postCodeRegex territoryId="DE" >\d{5}</postCodeRegex></pre> |
| <p>The US code is slightly more complicated, since there is an |
| optional portion:</p> |
| <pre> |
| <postCodeRegex territoryId="US" >\d{5}([ \-]\d{4})?</postCodeRegex></pre> |
| <p>The most complicated regex is currently the one for the UK.</p> |
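| <p>A validator built on this (deprecated) data could look like |
| the sketch below. It uses only the two expressions quoted above; |
| the table and function names are hypothetical, and the sketch |
| assumes that a regex must match the whole postal code rather |
| than a substring.</p> |

```python
import re

# The two expressions quoted in this section, keyed by territoryId.
POST_CODE_REGEX = {
    "DE": r"\d{5}",
    "US": r"\d{5}([ \-]\d{4})?",
}


def is_valid_postal_code(territory, code):
    """True or False for known territories, None when no regex is available."""
    pattern = POST_CODE_REGEX.get(territory)
    if pattern is None:
        return None
    # fullmatch: the entire code must match (an assumption of this sketch).
    return re.fullmatch(pattern, code) is not None
```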
| <h2>7 <a name="Supplemental_Character_Fallback_Data" href= |
| "#Supplemental_Character_Fallback_Data" id= |
| "Supplemental_Character_Fallback_Data">Supplemental Character |
| Fallback Data</a></h2> |
| <p class="dtd"><!ELEMENT characters ( character-fallback*) |
| ><br> |
| <br> |
| <!ELEMENT character-fallback ( character* ) ><br> |
| <!ELEMENT character (substitute*) ><br> |
| <!ATTLIST character value CDATA #REQUIRED ><br> |
| <br> |
| <!ELEMENT substitute (#PCDATA) ></p> |
| <p>The characters element provides a way for non-Unicode |
| systems, or systems that only support a subset of Unicode |
| characters, to transform CLDR data. It gives a list of |
| characters with alternative values that can be used if the main |
| value is not available. For example:</p> |
| <pre><characters> |
| <character-fallback> |
| <character value = "ß"> |
| <substitute>ss</substitute> |
| </character> |
| <character value = "Ø"> |
| <substitute>Ö</substitute> |
| <substitute>O</substitute> |
| </character> |
| <character value = "<span style= |
| "font-size: 150%">₧</span>"> |
| <substitute>Pts</substitute> |
| </character> |
| <character value = "<span style= |
| "font-size: 150%">₣</span>"> |
| <substitute>Fr.</substitute> |
| </character> |
| </character-fallback> |
| </characters></pre> |
| <p>The ordering of the substitute elements indicates the |
| preference among them.</p> |
| <p>That is, this data provides recommended fallbacks for use |
| when a charset or supported repertoire does not contain a |
| desired character. There may be more than one possible |
| fallback. When a character <i>value</i> is not in the desired |
| repertoire, the recommended process is to try the following |
| candidates in order and use the first one that is wholly in |
| the desired repertoire:</p> |
| <ul> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em"> |
| <code>toNFC</code>(<i>value</i>)</li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">other |
| canonically equivalent sequences, if there are any</li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">the |
| explicit <i>substitutes</i> value (in order)</li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em"> |
| <code>toNFKC</code>(<i>value</i>)</li> |
| </ul> |
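| <p>The fallback process can be sketched as follows. The |
| function name and the <code>in_repertoire</code> callback are |
| hypothetical. As a simplification, the NFD form stands in for |
| "other canonically equivalent sequences"; a fuller |
| implementation could try more of them.</p> |

```python
import unicodedata


def fallback(value, substitutes, in_repertoire):
    """Return the first candidate wholly inside the repertoire, or None.

    value:         the character (or string) to be represented
    substitutes:   explicit <substitute> values, in document order
    in_repertoire: callable taking one character and returning bool
    """
    candidates = [unicodedata.normalize("NFC", value)]
    # Simplification: NFD is the only other canonically equivalent
    # sequence that this sketch tries.
    candidates.append(unicodedata.normalize("NFD", value))
    candidates.extend(substitutes)
    candidates.append(unicodedata.normalize("NFKC", value))

    for candidate in candidates:
        if all(in_repertoire(ch) for ch in candidate):
            return candidate
    return None


def is_ascii(ch):
    """Example repertoire: plain ASCII."""
    return ord(ch) < 128
```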
| <h2>8 <a name="Coverage_Levels" href="#Coverage_Levels" id= |
| "Coverage_Levels">Coverage Levels</a></h2> |
| <p>The following describes the coverage levels used for the |
| current version of CLDR. This list will change between releases |
| of CLDR. Each level adds to what is in the lower level.</p> |
| <table border="1" cellpadding="0" cellspacing="1"> |
| <!-- nocaption --> |
| <tr> |
| <th nowrap> |
| <div align="right"> |
| Level |
| </div> |
| </th> |
| <th colspan="2">Description</th> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 0 |
| </div> |
| </td> |
| <td>undetermined</td> |
| <td>Does not meet any of the following levels.</td> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 10 |
| </div> |
| </td> |
| <td>core</td> |
| <td>The CLDR "core" data, which is defined as the basic |
| information about the language and writing system that is |
| required before other information can be added using the |
| CLDR survey tool. See <a href= |
| "http://cldr.unicode.org/index/cldr-spec/minimaldata">http://cldr.unicode.org/index/cldr-spec/minimaldata</a></td> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 40 |
| </div> |
| </td> |
| <td>basic</td> |
| <td>The minimum amount of locale data deemed necessary to |
| create a "viable" locale in CLDR. Contains names for the |
| languages, scripts, and territories associated with the |
| language, numbering systems used in those languages, date |
| and number formats, plus a few key values such as the |
| values in Section 3.1 <a href= |
| "tr35.html#Unknown_or_Invalid_Identifiers">Unknown or |
| Invalid Identifiers</a>. Also contains data associated with |
| the most prominent languages and countries.</td> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 60 |
| </div> |
| </td> |
| <td>moderate</td> |
| <td>Contains more types of data and more language and |
| territory names than the basic level. If the language is |
| associated with an EU country, then the moderate level |
| attempts to complete the data as it pertains to all EU |
| member countries.</td> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 80 |
| </div> |
| </td> |
| <td>modern</td> |
| <td>Contains all fields in normal modern use, including all |
| country names, and currencies in use.</td> |
| </tr> |
| <tr> |
| <td nowrap> |
| <div align="right"> |
| 100 |
| </div> |
| </td> |
| <td>comprehensive</td> |
| <td>Contains complete localizations (or valid inheritance) |
| for every possible field.</td> |
| </tr> |
| </table> |
| <p>Levels 40 through 80 are based on the definitions and |
| specifications listed in <strong>8.1-8.3</strong>. However, |
| these principles are continually being refined by the CLDR |
| technical committee, and so do not completely reflect the data |
| that is actually used for coverage determination, which is |
| under the XPath |
| <strong>//supplementalData/coverageLevels</strong>. For a view |
| of the trunk version of this data, see |
| <a href= |
| "https://github.com/unicode-org/cldr/releases/tag/latest/common/supplemental/coverageLevels.xml"> |
| coverageLevels.xml</a>. (As described in the <a href= |
| "tr35-info.html#Supplemental_Data">introduction to Supplemental |
| Data</a>, the specific XML filename may change.)</p> |
| <p class="dtd"><!ELEMENT coverageLevels ( |
| approvalRequirements, coverageVariable*, coverageLevel* ) |
| ><br> |
| <!ELEMENT coverageLevel EMPTY ><br> |
| <!ATTLIST coverageLevel inLanguage CDATA #IMPLIED ><br> |
| <!ATTLIST coverageLevel inScript CDATA #IMPLIED ><br> |
| <!ATTLIST coverageLevel inTerritory CDATA #IMPLIED ><br> |
| <!ATTLIST coverageLevel value CDATA #REQUIRED ><br> |
| <!ATTLIST coverageLevel match CDATA #REQUIRED ></p> |
| <p>Here is an example coverageLevel line:</p> |
| <pre><coverageLevel<br> value="30" |
| inLanguage="(de|fi)" <br> match="localeDisplayNames/types/type[@type='phonebook'][@key='collation']"/></pre> |
| <p>The coverageLevel elements are read in order, and the first |
| match results in a coverage level value. The element matches |
| based on the <span class="attribute">inLanguage</span>, |
| <span class="attribute">inScript</span>, <span class= |
| "attribute">inTerritory</span>, and <span class= |
| "attribute">match</span> attribute values, which are regular |
| expressions. In the example above, a match occurs |
| if the language is de or fi, and if the path is a locale |
| display name for collation=phonebook.</p> |
| <p>The <span class="attribute">match</span> attribute value |
| logically has "//ldml/" prefixed before it is applied. In |
| addition, the "[@" is automatically quoted (escaped), so that |
| it is matched literally. Otherwise, standard Perl/Java style |
| regular expression syntax is used.</p> |
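| <p>The matching rules can be sketched as follows. The function |
| names and the dictionary form of a rule are hypothetical. The |
| second rule in the sample list is invented to show that the |
| first match wins, and the sketch assumes that an absent |
| inLanguage, inScript, or inTerritory attribute places no |
| constraint and that each regular expression must match the |
| whole value.</p> |

```python
import re


def compile_match(match_value):
    """Prefix "//ldml/" and escape "[@" so that it is matched literally."""
    return re.compile(re.escape("//ldml/") + match_value.replace("[@", r"\[@"))


def coverage_level(rules, path, language=None, script=None, territory=None):
    """Return the value of the first matching rule, or None."""
    locale_parts = {
        "inLanguage": language,
        "inScript": script,
        "inTerritory": territory,
    }
    for rule in rules:
        constraints_ok = True
        for attribute, actual in locale_parts.items():
            expected = rule.get(attribute)
            if expected is None:
                continue  # assumption: absent attribute means no constraint
            if actual is None or re.fullmatch(expected, actual) is None:
                constraints_ok = False
                break
        if constraints_ok and compile_match(rule["match"]).fullmatch(path):
            return int(rule["value"])
    return None


PHONEBOOK = "localeDisplayNames/types/type[@type='phonebook'][@key='collation']"

RULES = [
    # The example from this section.
    {"value": "30", "inLanguage": "(de|fi)", "match": PHONEBOOK},
    # Invented second rule, to illustrate ordering only.
    {"value": "80", "match": PHONEBOOK},
]

PATH = "//ldml/" + PHONEBOOK
```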
| <p class="dtd"><!ELEMENT coverageVariable EMPTY ><br> |
| <!ATTLIST coverageVariable key CDATA #REQUIRED ><br> |
| <!ATTLIST coverageVariable value CDATA #REQUIRED ></p> |
| <p>The coverageVariable element allows us to create variables |
| for certain regular expressions that are used frequently in the |
| coverageLevel definitions above. Each coverage variable must |
| contain a key / value pair of attributes, which can then be |
| substituted into a coverageLevel definition.</p> |
| <p>Here is an example coverageLevel line using |
| coverageVariable substitution:</p> |
| <pre> |
| <coverageVariable key="%dayTypes" value="(sun|mon|tue|wed|thu|fri|sat)"/><br> |
| <coverageVariable key="%wideAbbr" value="(wide|abbreviated)"/><br> |
| <coverageLevel value="20" match="dates/calendars/calendar[@type='gregorian']/days/dayContext[@type='format']/dayWidth[@type='%wideAbbr']/day[@type='%dayTypes']"/></pre> |
| <p>In this example, the coverage variables %dayTypes and |
| %wideAbbr are used to substitute their respective values into |
| the match expression. This allows us to reuse the same variable |
| for other coverageLevel matches that use the same regular |
| expression fragment.</p> |
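| <p>Variable substitution can be sketched as plain text |
| replacement performed before the match expression is compiled. |
| The function name is hypothetical, and replacing longer keys |
| first is a precaution taken by this sketch, not something the |
| text above requires.</p> |

```python
import re


def substitute_variables(match_value, variables):
    """Replace each coverage variable key with its value."""
    # Longest keys first, so that a key which is a prefix of another
    # key is not substituted prematurely (a precaution of this sketch).
    for key in sorted(variables, key=len, reverse=True):
        match_value = match_value.replace(key, variables[key])
    return match_value


VARIABLES = {
    "%dayTypes": "(sun|mon|tue|wed|thu|fri|sat)",
    "%wideAbbr": "(wide|abbreviated)",
}

MATCH = ("dates/calendars/calendar[@type='gregorian']/days/"
         "dayContext[@type='format']/dayWidth[@type='%wideAbbr']/"
         "day[@type='%dayTypes']")

EXPANDED = substitute_variables(MATCH, VARIABLES)

# Escape "[@" before compiling, as described for the match attribute.
PATTERN = re.compile(EXPANDED.replace("[@", r"\[@"))
```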
| <p class="dtd"><br> |
| <!ELEMENT approvalRequirements ( approvalRequirement* ) |
| ><br> |
| <!ELEMENT approvalRequirement EMPTY ><br> |
| <!ATTLIST approvalRequirement votes CDATA #REQUIRED><br> |
| <!ATTLIST approvalRequirement locales CDATA |
| #REQUIRED><br> |
| <!ATTLIST approvalRequirement paths CDATA |
| #REQUIRED><br></p> |
| <p>The approvalRequirements element specifies the number of |
| survey tool votes required for approval, either based on |
| locale, or path, or both. Certain locales require a higher |
| voting threshold (usually 8 votes instead of 4), in order to |
| promote greater stability in the data. Furthermore, certain |
| very high visibility fields, such as number |
| formats, require a CLDR TC member's vote for |
| approval.</p> |
| <p>Here is an example of the approvalRequirements section.</p> |
| <pre> |
| <approvalRequirements><br> <!-- "high bar" items --> |
| <approvalRequirement votes="20" locales="*" paths="//ldml/numbers/symbols[^/]++/(decimal|group)"/> |
| <!-- established locales - http://cldr.unicode.org/index/process#TOC-Draft-Status-of-Optimal-Field-Value --> |
| <approvalRequirement votes="8" locales="ar ca cs da de el es fi fr he hi hr hu it ja ko nb nl pl pt pt_PT ro ru sk sl sr sv th tr uk vi zh zh_Hant" paths=""/> |
| <!-- all other items --> |
| <approvalRequirement votes="4" locales="*" paths=""/><br></approvalRequirements> </pre> |
| <p>This section specifies that a TC vote (20 votes) is required |
| for decimal and grouping separators. Furthermore, it specifies |
| that any field in the established locales list (i.e. ar, ca, |
| cs, etc.) requires 8 votes, and that all other locales require |
| only 4 votes.</p> |
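| <p>One plausible reading of the example is sketched below. The |
| function name is hypothetical, and several points are |
| assumptions rather than statements of the specification: the |
| requirements are checked in order with the first match winning, |
| locales="*" matches any locale, and an empty paths value matches |
| any path. The locale list is shortened, and the possessive |
| quantifier "++" in the original path expression is replaced with |
| "*", which gives the same result for these paths and works in |
| Python versions that lack possessive quantifiers.</p> |

```python
import re

# (votes, locales, paths), adapted from the example above.
REQUIREMENTS = [
    (20, "*", r"//ldml/numbers/symbols[^/]*/(decimal|group)"),
    (8, "ar ca cs da de", ""),  # shortened list of established locales
    (4, "*", ""),
]


def required_votes(requirements, locale, path):
    """Return the vote count of the first matching requirement, or None."""
    for votes, locales, paths in requirements:
        locale_ok = locales == "*" or locale in locales.split()
        path_ok = paths == "" or re.fullmatch(paths, path) is not None
        if locale_ok and path_ok:
            return votes
    return None
```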
| <p>For more information on the CLDR voting process, see |
| <a href="http://cldr.unicode.org/index/process">http://cldr.unicode.org/index/process</a>.</p> |
| <h3>8.1 <a name="Coverage_Level_Definitions" href= |
| "#Coverage_Level_Definitions" id= |
| "Coverage_Level_Definitions">Definitions</a></h3> |
| <ul> |
| <li><i>Target-Language</i> is the language under |
| consideration.</li> |
| <li><i>Target-Territories</i> is the list of territories |
| found by looking up <i>Target-Language</i> in the |
| <languageData> elements in <a href= |
| "tr35-info.html#Supplemental_Language_Data">Supplemental |
| Language Data</a>.</li> |
| <li> |
| <i>Language-List</i> is <i>Target-Language</i>, plus |
| <ul> |
| <li><b>basic:</b> Chinese, English, French, German, |
| Italian, Japanese, Portuguese, Russian, Spanish, Unknown |
| (de, en, es, fr, it, ja, pt, ru, zh, und)</li> |
| <li><b>moderate:</b> basic + Arabic, Hindi, Korean, |
| Indonesian, Dutch, Bengali, Turkish, Thai, Polish (ar, |
| hi, ko, in, nl, bn, tr, th, pl). If an EU language, add |
| the remaining official EU languages, currently: Danish, |
| Greek, Finnish, Swedish, Czech, Estonian, Latvian, |
| Lithuanian, Hungarian, Maltese, Slovak, Slovene (da, el, |
| fi, sv, cs, et, lv, lt, hu, mt, sk, sl)</li> |
| <li><b>modern:</b> all languages that are official or |
| major commercial languages of modern territories</li> |
| </ul> |
| </li> |
| <li><i>Target-Scripts</i> is the list of scripts in which |
| <i>Target-Language</i> can be customarily written (found by |
| looking up <i>Target-Language</i> in the <languageData> |
| elements in <a href= |
| "tr35-info.html#Supplemental_Language_Data">Supplemental |
| Language Data</a>), plus Unknown (Zzzz).</li> |
| <li> |
| <i>Script-List</i> is the <i>Target-Scripts</i> plus the |
| major scripts used for multiple languages |
| <ul> |
| <li>Latin, Simplified Chinese, Traditional Chinese, |
| Cyrillic, Arabic (Latn, Hans, Hant, Cyrl, Arab)</li> |
| </ul> |
| </li> |
| <li> |
| <i>Territory-List</i> is the list of territories formed by |
| taking the <i>Target-Territories</i> and adding: |
| <ul> |
| <li><b>basic:</b> Brazil, China, France, Germany, India, |
| Italy, Japan, Russia, United Kingdom, United States, |
| Unknown (BR, CN, DE, GB, FR, IN, IT, JP, RU, US, ZZ)</li> |
| <li><b>moderate:</b> basic + Spain, Canada, Korea, |
| Mexico, Australia, Netherlands, Switzerland, Belgium, |
| Sweden, Turkey, Austria, Indonesia, Saudi Arabia, Norway, |
| Denmark, Poland, South Africa, Greece, Finland, Ireland, |
| Portugal, Thailand, Hong Kong SAR China, Taiwan (ES, CA, KR, |
| MX, AU, NL, CH, BE, SE, TR, AT, ID, SA, NO, DK, PL, ZA, GR, |
| FI, IE, PT, TH, HK, TW). If an EU language, add the |
| remaining EU member countries: Luxembourg, Czech Republic, |
| Hungary, Estonia, Lithuania, Latvia, Slovenia, Slovakia, |
| Malta (LU, CZ, HU, EE, LT, LV, SI, SK, MT).</li> |
| <li><b>modern:</b> all current ISO 3166 territories, plus |
| the UN M.49 [<a href="tr35.html#UNM49">UNM49</a>] regions |
| in <a href= |
| "tr35-info.html#Supplemental_Territory_Containment">Supplemental |
| Territory Containment</a>.</li> |
| </ul> |
| </li> |
| <li><i>Currency-List</i> is the list of current official |
| currencies used in any of the territories in |
| <i>Territory-List</i>, found by looking at the region |
| elements in <a href= |
| "tr35-info.html#Supplemental_Territory_Containment">Supplemental |
| Territory Containment</a>, plus Unknown (XXX).</li> |
| <li><i>Calendar-List</i> is the set of calendars in customary |
| use in any of <i>Target-Territories</i>, plus Gregorian.</li> |
| <li><em>Number-System-List</em> is the set of number systems |
| in customary use in the language.</li> |
| </ul> |
| <h3>8.2 <a name="Coverage_Level_Data_Requirements" href= |
| "#Coverage_Level_Data_Requirements" id= |
| "Coverage_Level_Data_Requirements">Data Requirements</a></h3> |
| <p>The required data to qualify for the level is then the |
| following.</p> |
| <ol> |
| <li>localeDisplayNames |
| <ol> |
| <li><i>languages:</i> localized names for all languages |
| in <i>Language-List.</i></li> |
| <li><i>scripts:</i> localized names for all scripts in |
| <i>Script-List</i>.</li> |
| <li><i>territories:</i> localized names for all |
| territories in <i>Territory-List</i>.</li> |
| <li><i>variants, keys, types:</i> localized names for any |
| in use in <i>Target-Territories</i>; for example, a |
| translation for PHONEBOOK in a German locale.</li> |
| </ol> |
| </li> |
| <li>dates: all of the following for each calendar in |
| <i>Calendar-List</i>. |
| <ol> |
| <li>calendars: localized names</li> |
| <li>month names, day names, era names, and quarter names |
| <ul> |
| <li>context=format and width=narrow, wide, & |
| abbreviated</li> |
| <li>plus context=standAlone and width=narrow, wide, |
| & abbreviated, <i>if the grammatical forms of |
| these are different than for context=format.</i></li> |
| </ul> |
| </li> |
| <li>week: minDays, firstDay, weekendStart, weekendEnd |
| <ul> |
| <li>if some of these vary in territories in |
| <i>Territory-List</i>, include territory locales for |
| those that do.</li> |
| </ul> |
| </li> |
| <li>am, pm, eraNames, eraAbbr</li> |
| <li>dateFormat, timeFormat: full, long, medium, |
| short</li> |
| <li> |
| <p>intervalFormatFallback</p> |
| </li> |
| </ol> |
| </li> |
| <li>numbers: symbols, decimalFormats, scientificFormats, |
| percentFormats, currencyFormats for each number system in |
| <em>Number-System-List</em>.</li> |
| <li>currencies: displayNames and symbol for all currencies in |
| <i>Currency-List</i>, for all plural forms</li> |
| <li>transforms: (moderate and above) transliteration between |
| Latin and each other script in <i>Target-Scripts.</i></li> |
| </ol> |
| <h3>8.3 <a name="Coverage_Level_Default_Values" href= |
| "#Coverage_Level_Default_Values" id= |
| "Coverage_Level_Default_Values">Default Values</a></h3> |
| <p>Items should <i>only</i> be included if they are not the |
| same as the default, which is:</p> |
| <ul> |
| <li>what is in root, if there is something defined |
| there.</li> |
| <li>for timezone IDs: the name computed according to |
| <i><a href="tr35.html#Time_Zone_Fallback">Appendix J: Time |
| Zone Display Names</a></i></li> |
| <li>for collation sequence, the UCA DUCET (Default Unicode |
| Collation Element Table), as modified by CLDR. |
| <ul> |
| <li>however, in that case the locale must be added to the |
| validSubLocale list in <a href= |
| "http://unicode.org/cldr/data/common/collation/root.xml">collation/root.xml</a>.</li> |
| </ul> |
| </li> |
| <li>for currency symbol, language, territory, script names, |
| variants, keys, types, the internal code identifiers, for |
| example, |
| <ul> |
| <li>currencies: EUR, USD, JPY, ...</li> |
| <li>languages: en, ja, ru, ...</li> |
| <li>territories: GB, JP, FR, ...</li> |
| <li>scripts: Latn, Thai, ...</li> |
| <li>variants: PHONEBOOK,...</li> |
| </ul> |
| </li> |
| </ul><!-- end section 8 --> |
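| <p>The rule for code-like items can be sketched as follows. This |
| is a minimal illustration in Python, not part of CLDR tooling: |
| the function name <code>should_include</code> and its parameters |
| are hypothetical, and it covers only items whose default is |
| either the root value or the internal code identifier (not |
| timezone IDs or collation).</p> |

```python
def should_include(code, value, root_value=None):
    """Decide whether a localized item needs to be present in a locale.

    The default is the root value when root defines one; otherwise it is
    the internal code identifier itself (for example "EUR" or "Latn").
    An item is only worth including when it differs from that default.
    """
    default = root_value if root_value is not None else code
    return value != default
```

| <p>Under these assumptions, a currency symbol of "EUR" for the |
| code EUR would be omitted, while a localized name that differs |
| from the code would be kept.</p> |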
| <!-- begin section 9 supplemental metadata --> |
| <h2>9 <a name="Appendix_Supplemental_Metadata" href= |
| "#Appendix_Supplemental_Metadata" id= |
| "Appendix_Supplemental_Metadata">Supplemental Metadata</a></h2> |
| <p>Note that this section discusses the |
| <code><metadata></code> element within the |
| <code><supplementalData></code> element. For the |
| per-locale metadata used in tests and the Survey Tool, see |
| <a href="#Metadata_Elements">10: Locale Metadata |
| Element</a>.</p> |
| <p>The supplemental metadata contains information about the |
| CLDR file itself, used to test validity and provide information |
| for locale inheritance. A number of these elements are |
| described in</p> |
| <ul class="toc"> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix |
| I: <a href="tr35.html#Inheritance_and_Validity">Inheritance |
| and Validity</a></li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix |
| K: <a href="tr35.html#Valid_Attribute_Values">Valid Attribute |
| Values</a></li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix |
| L: <a href="tr35.html#Canonical_Form">Canonical Form</a></li> |
| <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix |
| M: <a href="#Coverage_Levels">Coverage Levels</a></li> |
| </ul> |
| <h3>9.1 <a name="Supplemental_Alias_Information" href= |
| "#Supplemental_Alias_Information" id= |
| "Supplemental_Alias_Information">Supplemental Alias |
| Information</a></h3> |
| <p class="dtd"><!ELEMENT alias |
| (languageAlias*,scriptAlias*,territoryAlias*,subdivisionAlias*,variantAlias*,zoneAlias*) |
| ><br> |
| <br> |
| <em>The following are common attributes for subelements of |
| <alias>:</em><br> |
| <!ELEMENT *Alias EMPTY ><br> |
| <!ATTLIST *Alias type NMTOKEN #IMPLIED ><br> |
| <!ATTLIST *Alias replacement NMTOKEN #IMPLIED ><br> |
| <!ATTLIST *Alias reason ( deprecated | overlong ) |
| #IMPLIED><br> |
| <br> |
| <em>The languageAlias has additional reasons</em><br> |
| <!ATTLIST languageAlias reason ( deprecated | overlong | |
| macrolanguage | legacy | bibliographic ) #IMPLIED></p> |
| <p>This element identifies parts of locale IDs |
| that should be substituted when accessing CLDR data. This |
| logical substitution should be applied both to the locale ID and |
| to any lookup for display names of languages, territories, and |
| so on. The replacement for the language and territory types is |
| more complicated: see <em>Part 1: <a href= |
| "tr35.html#Contents">Core</a>, Section 3.3.1 <a href= |
| "tr35.html#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag |
| Conversion</a></em> for details.</p> |
| <pre><alias> |
| <languageAlias type="in" replacement="id"/> |
| <languageAlias type="sh" replacement="sr"/> |
| <languageAlias type="sh_YU" replacement="sr_Latn_YU"/> |
| ... |
| <territoryAlias type="BU" replacement="MM"/> |
| ... |
| </alias></pre> |
| <p>Attribute values for the *Alias values include the |
| following:</p> |
| <table> |
| <caption> |
| <a name="Alias_Attribute_Values" href= |
| "#Alias_Attribute_Values" id="Alias_Attribute_Values">Alias |
| Attribute Values</a> |
| </caption> |
| <tr> |
| <th scope="col">Attribute</th> |
| <th scope="col">Value</th> |
| <th scope="col">Description</th> |
| </tr> |
| <tr> |
| <td>type</td> |
| <td>NMTOKEN</td> |
| <td>The code to be replaced</td> |
| </tr> |
| <tr> |
| <td>replacement</td> |
| <td>NMTOKEN</td> |
| <td>The code(s) to replace it, space-delimited.</td> |
| </tr> |
| <tr> |
| <td rowspan="5">reason</td> |
| <td>deprecated</td> |
| <td>The code in type is deprecated, such as 'iw' by 'he', |
| or 'CS' by 'RS ME'.</td> |
| </tr> |
| <tr> |
| <td>overlong</td> |
| <td>The code in type is too long, such as 'eng' by 'en', or |
| 'USA' or '840' by 'US'.</td> |
| </tr> |
| <tr> |
| <td>macrolanguage</td> |
| <td>The code in type is an encompassed language that is |
| replaced by a macrolanguage, such as '<a href= |
| "http://www-01.sil.org/iso639-3/documentation.asp?id=arb">arb</a>' |
| by 'ar'.</td> |
| </tr> |
| <tr> |
| <td>legacy</td> |
| <td>The code in type is a legacy code that is replaced by |
| another code for compatibility with established legacy |
| usage, such as 'sh' by 'sr_Latn'.</td> |
| </tr> |
| <tr> |
| <td>bibliographic</td> |
| <td>The code in type is a <a href= |
| "http://www.loc.gov/standards/iso639-2/langhome.html">bibliographic |
| code</a>, which is replaced by a terminology code, such as |
| 'alb' by 'sq'.</td> |
| </tr> |
| </table> |
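| <p>As a minimal sketch of the substitution described above, the |
| following Python function replaces aliased language and |
| territory subtags using a small hand-written table. The table |
| entries are copied from the examples in this section; the |
| function name <code>apply_aliases</code>, the underscore-separated |
| ID form, and the whole-subtag matching are assumptions made for |
| illustration. The full rules in BCP 47 Language Tag Conversion |
| are more involved.</p> |

```python
# Hypothetical, hand-copied subset of <alias> data taken from the examples
# in this section; the real data set is much larger.
LANGUAGE_ALIASES = {"in": "id", "iw": "he", "eng": "en", "alb": "sq"}
TERRITORY_ALIASES = {"BU": ["MM"], "CS": ["RS", "ME"]}


def apply_aliases(locale_id):
    """Replace aliased language and territory subtags in a simple locale ID.

    Assumes IDs of the form language[_Script][_TERRITORY]. Multi-subtag
    alias types such as "sh_YU" are not handled in this sketch.
    """
    parts = locale_id.split("_")
    parts[0] = LANGUAGE_ALIASES.get(parts[0], parts[0])
    for i in range(1, len(parts)):
        replacements = TERRITORY_ALIASES.get(parts[i])
        if replacements:
            # When several replacements are listed, this sketch simply takes
            # the first; the core specification describes how to choose.
            parts[i] = replacements[0]
    return "_".join(parts)
```

| <p>With this table, <code>apply_aliases("in_ID")</code> returns |
| <code>id_ID</code>, and an ID with no aliased subtags is returned |
| unchanged.</p> |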
| <h3>9.2 <a name="Supplemental_Deprecated_Information" href= |
| "#Supplemental_Deprecated_Information" id= |
| "Supplemental_Deprecated_Information">Supplemental Deprecated |
| Information (Deprecated)</a></h3> |
| <pre class="dtd"> |
| <!ELEMENT deprecated ( deprecatedItems* ) > |
| <!ATTLIST deprecated draft ( approved | contributed | provisional | unconfirmed | true | false ) #IMPLIED > <!-- true and false are deprecated. --> |
| |
| <!ELEMENT deprecatedItems EMPTY > |
| <!ATTLIST deprecatedItems type ( standard | supplemental | ldml | supplementalData | ldmlBCP47 ) #IMPLIED > <!-- standard | supplemental are deprecated --> |
| <!ATTLIST deprecatedItems elements NMTOKENS #IMPLIED > |
| <!ATTLIST deprecatedItems attributes NMTOKENS #IMPLIED > |
| <!ATTLIST deprecatedItems values CDATA #IMPLIED ></pre> |
| <p>The deprecated items element was used to indicate elements, |
| attributes, and attribute values that are deprecated. This |
| means that the items are valid, but that their usage is |
| strongly discouraged. This element and its subelements have |
| been deprecated in favor of <a href= |
| "tr35.html#DTD_Annotations">DTD Annotations</a>.</p> |
| <p>Where particular values are deprecated (such as territory |
| codes like SU for Soviet Union), the names for such codes may |
| be removed from the common/main translated data after some |
| period of time. However, typically supplemental information for |
| deprecated codes is retained, such as containment, likely |
| subtags, older currency codes usage, etc. The English name may |
| also be retained, for debugging purposes.</p> |
| <h3>9.3 <a name="Default_Content" href="#Default_Content" id= |
| "Default_Content">Default Content</a></h3> |
| <pre class="dtd"><!ELEMENT defaultContent EMPTY > |
| <!ATTLIST defaultContent locales NMTOKENS #IMPLIED ></pre> |
| <p>In CLDR, locales without territory information (or where |
| needed, script information) provide data appropriate for what |
| is called the <i>default content locale</i>. For example, the |
| <i>en</i> locale contains data appropriate for <i>en-US</i>, |
| while the <i>zh</i> locale contains content for |
| <i>zh-Hans-CN</i>, and the <i>zh-Hant</i> locale contains |
| content for <i>zh-Hant-TW</i>. The default content locales |
| themselves thus inherit all of their contents, and are |
| empty.</p> |
| <p>The choice of content is typically based on the largest |
| literate population of the possible choices. Thus if an |
| implementation only provides the base language (such as |
| <i>en</i>), it will still get a complete and consistent set of |
| data appropriate for a locale which is reasonably likely to be |
| the one meant. Where other information is available, such as |
| independent country information, that information can always be |
| used to pick a different locale (such as <i>en-CA</i> for a |
| website targeted at Canadian users).</p> |
| <p>If an implementation is to use a different default locale, |
| then the data needs to be <i>pivoted</i>: all of the CLDR data |
| for the current default locale is pushed out to the |
| locales that inherit from it, and then the new default content |
| locale's data is moved into the base. There are tools in CLDR to |
| perform this operation.</p> |
| <p>For the relationship between <span>Inheritance, |
| DefaultContent, LikelySubtags, and LocaleMatching, see |
| <strong><em>Section 4.2.6 <a href= |
| "tr35.html#Inheritance_vs_Related">Inheritance vs Related |
| Information</a></em></strong>.</span></p> |
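| <p>The effect of default content on lookup can be sketched as |
| follows. This Python fragment is illustrative only: the set |
| below is a hypothetical subset of the <code>locales</code> |
| attribute built from the examples in this section (with |
| <code>zh_Hans</code> assumed to be listed as well), IDs are |
| written with underscores, and <code>data_locale</code> is not a |
| CLDR API.</p> |

```python
# Hypothetical subset of the defaultContent "locales" attribute, based on
# the examples in this section. zh_Hans is an assumption of this sketch.
DEFAULT_CONTENT = {"en_US", "zh_Hans", "zh_Hans_CN", "zh_Hant_TW"}


def data_locale(locale_id):
    """Return the locale that actually holds the data for locale_id.

    A default content locale is empty and inherits all of its contents,
    so the data is found by truncating the last subtag until the result
    is no longer a default content locale.
    """
    while locale_id in DEFAULT_CONTENT and "_" in locale_id:
        locale_id = locale_id.rsplit("_", 1)[0]
    return locale_id
```

| <p>Here <code>data_locale("zh_Hans_CN")</code> resolves to |
| <code>zh</code>, while a locale that is not default content, such |
| as <code>en_CA</code>, resolves to itself.</p> |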
| <!-- end section 9 supp metadata --> |
| <!-- begin section 10 the metadata element --> |
| <h2>10 <a name="Metadata_Elements" href="#Metadata_Elements" |
| id="Metadata_Elements">Locale Metadata |
| Element</a></h2> |
| <p>Note: This section refers to the per-locale |
| <code><metadata></code> element, containing metadata |
| about a particular locale. This is in contrast to the <a href= |
| "#Appendix_Supplemental_Metadata"><em>Supplemental</em> |
| Metadata</a>, which is in the supplemental tree and is not |
| specific to a locale.</p> |
| <p class="dtd"><!ELEMENT metadata ( alias | ( casingData?, |
| special* ) ) ><br> |
| <!ELEMENT casingData ( alias | ( casingItem*, special* ) ) |
| ><br> |
| <!ELEMENT casingItem ( #PCDATA ) ><br> |
| <!ATTLIST casingItem type CDATA #REQUIRED ><br> |
| <!ATTLIST casingItem override (true | false) #IMPLIED |
| ><br> |
| <!ATTLIST casingItem forceError (true | false) #IMPLIED |
| ><br></p> |
| <p>The <metadata> element contains metadata about the |
| locale for use by the Survey Tool or other tools in checking |
| locale data; this data is not intended for export as part of |
| the locale itself.</p> |
| <p>The <casingItem> element specifies the capitalization |
| intended for the majority of the data in a given category within |
| the locale, so that translators can be warned |
| that anything deviating from that capitalization |
| should be carefully reviewed. Its type attribute has one of the |
| values used for the <contextTransformUsage> element |
| above, with the exception of the special value "all"; its value |
| is one of the following:</p> |
| <ul> |
| <li>lowercase</li> |
| <li>titlecase</li> |
| </ul> |
| <p>The <casingItem> data is generated by a tool based on |
| the data available in CLDR. In cases where the generated casing |
| information is incorrect and needs to be manually edited, the |
| override attribute is set to "true" so that the tool will not |
| override the manual edits. When the casing information is known |
| to be both correct and something that should apply to all |
| elements of the specified type in a given locale, the forceError |
| attribute may be set to "true" to force an error instead of a |
| warning for items that do not match the casing information.</p> |
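| <p>How a checking tool might apply this data can be sketched as |
| follows. This is an assumption-laden illustration in Python, not |
| the Survey Tool's actual check: the function |
| <code>check_casing</code> is hypothetical, and testing only the |
| first character is a simplification chosen for this sketch.</p> |

```python
def check_casing(value, expected, force_error=False):
    """Return None if value matches the expected casing, else a severity.

    expected is "lowercase" or "titlecase", the two casingItem values
    listed above. Only the first character is inspected, so values that
    start with an uncased character (such as a digit) always pass.
    force_error mirrors the forceError attribute: a mismatch is reported
    as "error" instead of "warning".
    """
    first = value[:1]
    if expected == "lowercase":
        ok = not first.isupper()
    elif expected == "titlecase":
        ok = not first.islower()
    else:
        raise ValueError("unknown casing: " + expected)
    if ok:
        return None
    return "error" if force_error else "warning"
```

| <p>In this sketch a capitalized value checked against |
| <code>lowercase</code> yields a warning, or an error when |
| <code>force_error</code> is set.</p> |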
| <!-- end section Info-A metadta element --> |
| <!-- begin section 11 Version Information --> |
| <h2>11 <a name="Version_Information" href= |
| "#Version_Information" id="Version_Information">Version |
| Information</a></h2> |
| <p class="dtd"><!ELEMENT version EMPTY ><br> |
| <!ATTLIST version cldrVersion CDATA #FIXED "27" ><br> |
| <!ATTLIST version unicodeVersion CDATA #FIXED "7.0.0" |
| ><br></p> |
| <p>The cldrVersion attribute defines the CLDR version |
| for this data, as published on <a href= |
| "http://cldr.unicode.org/index/downloads">CLDR |
| Releases/Downloads</a>.</p> |
| <p>The unicodeVersion attribute defines the version of |
| the Unicode standard that is used to interpret data. |
| Specifically, some data elements such as exemplar characters |
| are expressed in terms of UnicodeSets. Since UnicodeSets can be |
| expressed in terms of Unicode properties, their meaning depends |
| on the Unicode version from which property values are |
| derived.</p><!-- end section Version Information metadta element --> |
| <h2>12 <a name="Parent_Locales" href="#Parent_Locales" id= |
| "Parent_Locales">Parent Locales</a></h2> |
| <p>The parentLocales data is supplemental data, but is |
| described in detail in the <a href= |
| "tr35.html#Parent_Locales">core specification section |
| 4.1.3.</a></p> |
| <hr> |
| <p class="copyright">Copyright © 2001–2019 Unicode, Inc. All |
| Rights Reserved. The Unicode Consortium makes no expressed or |
| implied warranty of any kind, and assumes no liability for |
| errors or omissions. No liability is assumed for incidental and |
| consequential damages in connection with or arising out of the |
| use of the information or programs contained or accompanying |
| this technical report. The Unicode <a href= |
| "http://unicode.org/copyright.html">Terms of Use</a> apply.</p> |
| <p class="copyright">Unicode and the Unicode logo are |
| trademarks of Unicode, Inc., and are registered in some |
| jurisdictions.</p> |
| </div> |
| </body> |
| </html> |