A fast, simple ISO 639 library.

iso639-lang

iso639-lang handles ISO 639 codes for individual languages and language groups.

>>> from iso639 import Lang
>>> Lang("French")
Lang(name='French', pt1='fr', pt2b='fre', pt2t='fra', pt3='fra', pt5='')

Installation

$ pip install iso639-lang

iso639-lang supports Python 3.8+.

Usage

Begin by importing the Lang class.

>>> from iso639 import Lang

Let's try with the identifier of an individual language.

>>> lg = Lang("deu")
>>> lg.name # 639-3 reference name
'German'
>>> lg.pt1 # 639-1 identifier
'de'
>>> lg.pt2b # 639-2/B bibliographic identifier
'ger'
>>> lg.pt2t # 639-2/T terminological identifier
'deu'
>>> lg.pt3 # 639-3 identifier
'deu'

And now with the identifier of a group of languages.

>>> lg = Lang("cel")
>>> lg.name # 639-5 English name
'Celtic languages'
>>> lg.pt2b # 639-2/B bibliographic identifier
'cel'
>>> lg.pt2t # 639-2/T terminological identifier
'cel'
>>> lg.pt5 # 639-5 identifier
'cel'

Lang is instantiable with any ISO 639 identifier or reference name.

>>> Lang("German") == Lang("de") == Lang("deu") == Lang("ger")
True

Lang also recognizes all non-reference English names associated with a language identifier in ISO 639.

>>> Lang("Chinese, Mandarin") # 639-3 inverted name
Lang(name='Mandarin Chinese', pt1='', pt2b='', pt2t='', pt3='cmn', pt5='')
>>> Lang("Uyghur") # other 639-3 printed name
Lang(name='Uighur', pt1='ug', pt2b='uig', pt2t='uig', pt3='uig', pt5='')
>>> Lang("Valencian") # other 639-2 English name
Lang(name='Catalan', pt1='ca', pt2b='cat', pt2t='cat', pt3='cat', pt5='')

Please note that Lang is case-sensitive.

>>> Lang("ak")
Lang(name='Akan', pt1='ak', pt2b='aka', pt2t='aka', pt3='aka', pt5='')
>>> Lang("Ak")
Lang(name='Ak', pt1='', pt2b='', pt2t='', pt3='akq', pt5='')

You can use the asdict method to return ISO 639 values as a Python dictionary.

>>> Lang("fra").asdict()
{'name': 'French', 'pt1': 'fr', 'pt2b': 'fre', 'pt2t': 'fra', 'pt3': 'fra', 'pt5': ''}
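
Because asdict returns a plain dictionary, the result can be handed to standard tools such as json. The sketch below starts from the dictionary shown above, copied as a literal so that it runs without the package installed, and drops the parts that are empty for this language. The names record and compact are illustrative only.

```python
import json

# Literal copy of the Lang("fra").asdict() output shown above.
record = {'name': 'French', 'pt1': 'fr', 'pt2b': 'fre',
          'pt2t': 'fra', 'pt3': 'fra', 'pt5': ''}

# Keep only the parts that are set; unset parts are empty strings.
compact = {key: value for key, value in record.items() if value}

print(json.dumps(compact))
# prints {"name": "French", "pt1": "fr", "pt2b": "fre", "pt2t": "fra", "pt3": "fra"}
```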

Other Language Names

In addition to their reference name, some language identifiers may be associated with other names. You can list them using the other_names method.

>>> lg = Lang("ast")
>>> lg.name
'Asturian'
>>> lg.other_names()
['Asturleonese', 'Bable', 'Leonese']

Language Types

The type method returns the type of a language.

>>> lg = Lang("Latin")
>>> lg.type()
'Historical'

Macrolanguages

Use the scope method to determine whether a language is a macrolanguage or an individual language.

>>> lg = Lang("Arabic")
>>> lg.scope()
'Macrolanguage'

Use the macro method to get the macrolanguage of an individual language.

>>> lg = Lang("Wu Chinese")
>>> lg.macro()
Lang(name='Chinese', pt1='zh', pt2b='chi', pt2t='zho', pt3='zho', pt5='')

Conversely, you can also list all the individual languages that share a common macrolanguage.

>>> lg = Lang("Persian")
>>> lg.individuals()
[Lang(name='Iranian Persian', pt1='', pt2b='', pt2t='', pt3='pes', pt5=''), 
Lang(name='Dari', pt1='', pt2b='', pt2t='', pt3='prs', pt5='')]
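
A common use of macro is to roll individual languages up to their macrolanguage before comparing or counting them. The helper below is a minimal sketch, not part of iso639-lang: to_macrolanguage is a hypothetical name, and it assumes that macro() returns None for a language that has no macrolanguage, which the text above does not confirm. It accepts any object that exposes a macro method.

```python
def to_macrolanguage(lg):
    """Return lg's macrolanguage if it has one, otherwise lg itself."""
    # Assumption: macro() returns None when lg has no macrolanguage.
    macro = lg.macro()
    return lg if macro is None else macro
```

With the package installed, to_macrolanguage(Lang("Wu Chinese")) would return the Chinese entry shown above, provided the assumption holds.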

In Data Structures

As Lang is hashable, Lang instances can be added to a set or used as dictionary keys.

>>> {Lang("de"): "foo", Lang("fr"): "bar"}
{Lang(name='German', pt1='de', pt2b='ger', pt2t='deu', pt3='deu', pt5=''): 'foo', Lang(name='French', pt1='fr', pt2b='fre', pt2t='fra', pt3='fra', pt5=''): 'bar'}

Lists of Lang instances are sortable by name.

>>> [lg.name for lg in sorted([Lang("deu"), Lang("rus"), Lang("eng")])]
['English', 'German', 'Russian']
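
Hashing and ordering combine naturally when input mixes several identifiers for the same language. The following sketch uses a hypothetical helper name, unique_names, and works on any hashable, sortable objects with a name attribute, which is what the text above states for Lang.

```python
def unique_names(langs):
    """Return the names of the distinct languages in langs, sorted."""
    # set() relies on the objects being hashable;
    # sorted() relies on their ordering (by name for Lang).
    return [lg.name for lg in sorted(set(langs))]
```

For example, unique_names(Lang(v) for v in ("de", "deu", "German", "fr")) should yield ['French', 'German'], since the first three values compare equal.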

Iterator

iter_langs() iterates through all possible Lang instances, ordered alphabetically by name.

>>> from iso639 import iter_langs
>>> [lg.name for lg in iter_langs()]
["'Are'are", "'Auhelawa", "A'ou", ... , 'ǂHua', 'ǂUngkue', 'ǃXóõ']
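
The iterator is handy for building lookup tables. As a rough sketch, the hypothetical helper index_by_pt1 below maps each 639-1 identifier to its language name. It relies only on the pt1 and name attributes and on unset parts being empty strings, as in the outputs shown earlier.

```python
def index_by_pt1(langs):
    """Map each 639-1 identifier to the name of its language."""
    # Languages without a 639-1 identifier expose pt1 as an empty string
    # and are skipped.
    return {lg.pt1: lg.name for lg in langs if lg.pt1}
```

Calling index_by_pt1(iter_langs()) would then give a dictionary keyed by two-letter codes.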

Exceptions

When an invalid language value is passed to Lang, an InvalidLanguageValue exception is raised.

>>> from iso639.exceptions import InvalidLanguageValue
>>> try:
...     Lang("foobar")
... except InvalidLanguageValue as e:
...     e.msg
... 
"'foobar' is not a valid Lang argument."

When a deprecated language value is passed to Lang, a DeprecatedLanguageValue exception is raised.

>>> from iso639.exceptions import DeprecatedLanguageValue
>>> try:
...     Lang("gsc")
... except DeprecatedLanguageValue as e:
...     lg = Lang(e.change_to)
...     f"{e.name} replaced by {lg.name}."
...
'Gascon replaced by Occitan (post 1500).'
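
The two exceptions can be combined into a single lookup helper. The following is a minimal sketch, not part of iso639-lang: resolve_language and max_hops are hypothetical names, and it assumes that change_to may be empty when a retired identifier has no single replacement. The imports sit inside the function only so that the definition loads where the package is absent.

```python
def resolve_language(value, max_hops=5):
    """Return a Lang for value, following deprecations; None if unresolved."""
    # Deferred imports: the sketch can be defined without the package.
    from iso639 import Lang
    from iso639.exceptions import (
        DeprecatedLanguageValue,
        InvalidLanguageValue,
    )

    # max_hops bounds the number of replacements followed.
    for _ in range(max_hops):
        try:
            return Lang(value)
        except DeprecatedLanguageValue as e:
            # Assumption: change_to is empty when there is no replacement.
            if not e.change_to:
                return None
            value = e.change_to
        except InvalidLanguageValue:
            return None
    return None
```

Under these assumptions, resolve_language("gsc") would return the same instance as Lang("oci"), and resolve_language("foobar") would return None.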

If you would rather not handle exceptions, use the is_language checker instead.

Checker

The is_language function checks if a language value exists according to ISO 639.

>>> from iso639 import is_language
>>> is_language("fr")
True
>>> is_language("French")
True

You can restrict the check to certain identifiers or names by passing an additional argument.

>>> is_language("fr", "pt3") # only 639-3
False
>>> is_language("fre", ("pt2b", "pt2t")) # only 639-2/B or 639-2/T
True
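
The restricted check is useful for validating input that must follow one specific part of the standard. The sketch below uses a hypothetical helper name, split_valid, and passes the restriction to is_language positionally, as in the calls above. The import is deferred so that the definition loads where the package is absent.

```python
def split_valid(values, identifiers):
    """Split values into (valid, invalid) lists for the given identifiers."""
    # Deferred import: the sketch can be defined without the package.
    from iso639 import is_language

    valid, invalid = [], []
    for value in values:
        # identifiers is a single name such as "pt1" or a tuple of names.
        if is_language(value, identifiers):
            valid.append(value)
        else:
            invalid.append(value)
    return valid, invalid
```

For instance, split_valid(["fr", "fra"], "pt1") should place "fr" in the first list and "fra" in the second.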

Speed

iso639-lang loads its mappings into memory, which lets it process calls much faster than Python libraries that rely on an embedded database.
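
To check lookup speed on your own machine, a small timing harness is enough. The sketch below is generic and makes no claim about actual figures: mean_call_seconds is a hypothetical helper that times any single-argument callable with the standard timeit module.

```python
import timeit


def mean_call_seconds(func, values, repeats=1000):
    """Return the mean wall-clock time of one func(value) call, in seconds."""
    if not values:
        raise ValueError("values must not be empty")

    def run():
        # One pass calls func once per value.
        for value in values:
            func(value)

    total = timeit.timeit(run, number=repeats)
    return total / (repeats * len(values))
```

With the package installed, mean_call_seconds(Lang, ["fr", "deu", "German"]) gives a per-lookup estimate that can be compared with the same measurement for another library.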

Sources

As of November 11, 2024, iso639-lang is based on the latest tables provided by the ISO 639 registration authorities. Please open a new issue if you find that this library uses out-of-date data files.

Set | Description | Registration Authority | Last Modified
Set 1 | two-letter language identifiers for major, mostly national individual languages | Infoterm | 2009-09-01
Set 2 | three-letter language identifiers for a larger number of widely known individual languages and a number of language groups | Library of Congress | 2017-12-21
Set 3 | three-letter language identifiers covering all individual languages, including living, extinct and ancient languages | SIL International | 2024-10-10
Set 5 | three-letter language identifiers covering a larger set of language groups, living and extinct | Library of Congress | 2013-02-11

To learn more about how the source tables are processed by the iso639-lang library, read the generate.py script.

Contributing

We welcome contributions from the community to help improve this Python library. If you're interested in contributing, please follow the guidelines here.
