| Note | The values for this attribute are language
    ‘tags’ as defined in BCP 47. Currently
    BCP 47 comprises RFC 4646 and RFC 4647; over time, other IETF
    documents may succeed these as the best current practice. A ‘language tag’, per BCP 47, is assembled
    from a sequence of components or subtags separated by
    the hyphen character (-, U+002D). The tag
    is made up of the following subtags, in the following order. Every
    subtag except the first is optional. If present, each occurs only
    once, except the fourth and fifth components (variant and
    extension), which are repeatable.
 - language
 - The IANA-registered code for the language. This is almost
      always the same as the ISO 639 2-letter language code if there
      is one. The list of available registered language subtags can be
      found at http://www.iana.org/assignments/language-subtag-registry.
      It is recommended that this code be written in lower
      case.
 - script
 - The ISO 15924 code for the script. These codes consist of
      4 letters, and it is recommended they be written with an initial
      capital, the other three letters in lower case. The canonical
      list of codes is maintained by the Unicode Consortium, and is
      available at http://unicode.org/iso15924/iso15924-codes.html. The
      IETF recommends that this code be omitted unless it is needed to
      make a necessary distinction.
 - region
 - Either an ISO 3166 country code or a UN M.49 region code
      that is registered with IANA (not all such codes are registered:
      for example, UN codes for economic groupings, and UN codes for
      countries that already have an ISO 3166 2-letter code, are
      not). The former consist of 2 letters, and it is
      recommended they be written in upper case. The list of codes can
      be found at http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/index.html.
      The latter consist of 3 digits; the list of codes can be found
      at http://unstats.un.org/unsd/methods/m49/m49.htm.
 - variant
 - An IANA-registered variation. These codes ‘are used to indicate additional, well-recognized
      variations that define a language or its dialects that are not
      covered by other available subtags’.
 - extension
 - An extension has the format of a single letter followed by
      a hyphen followed by additional subtags. These exist to allow
      for future extension to BCP 47, but as of this writing no such
      extensions are in use.
 - private use
 - An extension that uses the initial subtag of the single
      letter x (i.e., starts with
      x-) has no meaning except as negotiated among the
      parties involved. These should be used with great care, since
      they interfere with the interoperability that use of RFC 4646 is
      intended to promote. In order for a document that makes use of
      these subtags to be TEI conformant, a corresponding
      language element must be present in the TEI
      header.
 
 There are two exceptions to the above format. First, there are
 language tags in the IANA
 registry that do not match the above syntax, but are present
 because they have been ‘grandfathered’ from
 previous specifications. Second, an entire language tag can consist of only a private use
 subtag. These tags start with x-, and do not need to
 follow any further rules established by the IETF and endorsed by
 these Guidelines. As with all language tags that make use of private use
 subtags, the language in question must be documented in a
 corresponding language element in the TEI header.
 
 Examples of language tags include:
 - sn
 - Shona
 - zh-TW
 - Chinese as used in Taiwan
 - zh-Hant-HK
 - Chinese written in traditional script as used in Hong Kong
 - en-SL
 - English as spoken in Sierra Leone
 - pl
 - Polish
 - es-MX
 - Spanish as spoken in Mexico
 - es-419
 - Spanish as spoken in Latin America
 
  |
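
The subtag order described in the note above can be illustrated with a short Python sketch. This is only a simplified illustration, not a conformant BCP 47 implementation: the names `LanguageTag` and `parse_language_tag` are hypothetical, the subtag shapes are checked by length and character class only (nothing is looked up in the IANA registry), and neither grandfathered tags nor the extended language subtags reserved by RFC 4646 are handled. Case is normalized following the recommendations given above.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageTag:
    """Hypothetical container for the subtags of one language tag."""
    language: Optional[str] = None
    script: Optional[str] = None
    region: Optional[str] = None
    variants: List[str] = field(default_factory=list)
    extensions: List[str] = field(default_factory=list)
    private_use: Optional[str] = None


def parse_language_tag(tag: str) -> LanguageTag:
    """Split a tag into subtags by shape only; raise ValueError if it does not fit."""
    subtags = tag.split("-")
    # Every subtag is 1 to 8 ASCII letters or digits.
    if not all(re.fullmatch(r"[A-Za-z0-9]{1,8}", s) for s in subtags):
        raise ValueError(f"malformed subtag in {tag!r}")

    result = LanguageTag()
    i = 0
    # A tag may consist of a private use subtag only (it then starts with x-).
    if subtags[0].lower() != "x":
        # language: the only mandatory subtag, written in lower case
        if not re.fullmatch(r"[A-Za-z]{2,8}", subtags[0]):
            raise ValueError(f"bad language subtag in {tag!r}")
        result.language = subtags[0].lower()
        i = 1
        # script: 4 letters, initial capital
        if i < len(subtags) and re.fullmatch(r"[A-Za-z]{4}", subtags[i]):
            result.script = subtags[i].title()
            i += 1
        # region: 2 letters (ISO 3166) or 3 digits (UN M.49), upper case
        if i < len(subtags) and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", subtags[i]):
            result.region = subtags[i].upper()
            i += 1
        # variant: repeatable; 5 to 8 characters, or 4 starting with a digit
        while i < len(subtags) and re.fullmatch(
            r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", subtags[i]
        ):
            result.variants.append(subtags[i].lower())
            i += 1
        # extension: repeatable; a single character other than x,
        # followed by at least one subtag of 2 to 8 characters
        while i < len(subtags) and re.fullmatch(r"[A-WY-Za-wy-z0-9]", subtags[i]):
            start = i
            i += 1
            while i < len(subtags) and len(subtags[i]) >= 2:
                i += 1
            if i == start + 1:
                raise ValueError(f"empty extension in {tag!r}")
            result.extensions.append("-".join(subtags[start:i]).lower())

    # private use: x followed by at least one subtag; consumes the rest of the tag
    if i < len(subtags):
        if subtags[i].lower() != "x" or i + 1 == len(subtags):
            raise ValueError(f"unexpected subtag {subtags[i]!r} in {tag!r}")
        result.private_use = "-".join(subtags[i:]).lower()
    return result
```

Under these assumptions, `parse_language_tag("zh-Hant-HK")` yields the language `zh`, the script `Hant` and the region `HK`, while `parse_language_tag("x-klingon")` (an invented private use tag) yields only a private use value and no language. A tag that passes this check is merely well-formed in shape; whether its subtags are actually registered still has to be verified against the IANA registry.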