diff --git a/peps/pep-0545.rst b/peps/pep-0545.rst
index 633d497b44a..066bd5d9714 100644
--- a/peps/pep-0545.rst
+++ b/peps/pep-0545.rst
@@ -187,15 +187,25 @@ Language Tag
 ''''''''''''
 
 A common notation for language tags is the :rfc:`IETF Language Tag <5646>`
-[4]_ based on ISO 639, although gettext uses ISO 639 tags with
-underscores (ex: ``pt_BR``) instead of dashes to join tags [5]_
-(ex: ``pt-BR``). Examples of IETF Language Tags: ``fr`` (French),
-``ja`` (Japanese), ``pt-BR`` (Orthographic formulation of 1943 -
-Official in Brazil).
+[4]_ (BCP 47, RFC 5646), which is based on ISO 639 for language codes,
+ISO 15924 for script codes, and ISO 3166 for region codes. Gettext uses
+ISO 639 tags joined with underscores (e.g. ``pt_BR``) [5]_, whereas IETF
+tags use hyphens as separators (e.g. ``pt-BR``).
 
-It is more common to see dashes instead of underscores in URLs [6]_,
-so we should use IETF language tags, even if sphinx uses gettext
-internally: URLs are not meant to leak the underlying implementation.
+Examples of IETF Language Tags:
+
+* ``fr`` (French),
+* ``ja`` (Japanese),
+* ``pt-br`` (Portuguese as spoken in Brazil),
+* ``pa-guru`` (Punjabi written in Gurmukhi script)
+
+The ``script`` subtag is used when a language can be written in multiple
+writing systems. For example, Punjabi can be written in Gurmukhi (``pa-guru``)
+or Shahmukhi (``pa-arab``).
+
+It is more common to see hyphens instead of underscores in URLs [6]_,
+so we should use IETF language tags in URL paths, even if Sphinx or Gettext use
+different internal conventions. URLs should not leak implementation details.
 
 It's uncommon to see capitalized letters in URLs, and docs.python.org
 doesn't use any, so it may hurt readability by attracting the eye on it,
@@ -206,10 +216,10 @@ states that tags are not case sensitive. As the RFC allows lower case,
 and it enhances readability, we should use lowercased tags like
 ``pt-br``.
 
-We may drop the region subtag when it does not add distinguishing
+We may drop the subtag when it does not add distinguishing
 information, for example: "de-DE" or "fr-FR". (Although it might
 make sense, respectively meaning "German as spoken in Germany"
-and "French as spoken in France"). But when the region subtag
+and "French as spoken in France"). But when the subtag
 actually adds information, for example "pt-BR" or "Portuguese as
 spoken in Brazil", it should be kept.