ISO 639 Tools

About

A set of utilities to provide information about languages, by their <a href="https://en.wikipedia.org/wiki/ISO_639">ISO 639</a> code.

Usage

The ISOTools class provides tools for the normalisation of ISO 639 codes.

There are multiple different standards for language codes in software:

<a href="https://en.wikipedia.org/wiki/ISO_639-3">ISO 639-3</a>: There are 6000+ ISO to language codes in this standard, developed in collaboration with SIL international. Provides a very granular classification of languages, however does not allow for
<a href="https://en.wikipedia.org/wiki/ISO_639-2">ISO 639-2</a>:
<a href="https://en.wikipedia.org/wiki/ISO_639-1">ISO 639-1</a>:
<a href="https://en.wikipedia.org/wiki/Locale_(computer_software)">Local codes</a>: in format [language[_territory][.codeset][@modifier]]` or `[language[-territory][.codeset][@modifier]].

When writing software such as LangLynx, I found myself needing to provide information about languages and dialects which weren't possible with any of these standards:

ISO code: supports normalisation
Script code:
Variant code:
Territory code:

I didn't add the codeset/character set, as I think it can be reasonable to expect the use of Unicode across all of my applications.

from iso_tools.ISOTools import ISOTools

print(ISOTools.verify_iso('en'))

# Remove specific properties
print(ISOTools.get_L_removed('en_Latn', [SCRIPT]))
print(ISOTools.removed('ja_Jpan', LANG | SCRIPT))

# Miscellaneous
print(ISOTools.get_lang_props('en'))
print(ISOTools.guess_omitted_info('en'))

# Locale->ISO
print(ISOTools.locale_to_iso('en-US'))
print(ISOTools.locale_to_iso('en-GB'))

# Remove implied/unecessary information (or recover it)
print(ISOTools.pack_iso('ja_Jpan'))
print(ISOTools.remove_unneeded_info('ja_Jpan'))
print(ISOTools.unpack_iso('ja'))

# Split/join ISO code information
print(ISOTools.join(part3='en', script='Latn'))
print(ISOTools.split('ja'))
print(ISOTools.split_multiple('ja_-_en'))

# Fileame escape/de-escape
print(ISOTools.filename_escape('en'))
print(ISOTools.filename_split('en'))

# URL escape/de-escape
print(ISOTools.url_join('ja', 'en'))
print(ISOTools.url_split('ja_-_en'))
print(ISOTools.url_escape('ja_Jpan'))
print(ISOTools.url_unescape('ja_Jpan'))

There is also access to the original ISO 639-3 language data indexed by three-character codes, allowing both conversions

from iso_tools.iso_codes.ISOCodes import ISOCodes

# Convert ISO 639-1/ISO 639-2 codes to ISO 639-3
ISOCodes.to_part3('en')

# Get information about a specific ISO code
ISOCodes.get_D_iso('eng')

# Get other names for an ISO code
ISOCodes.get_D_alternate_names('eng')

# ???
ISOCodes.get_D_iso_639('eng')

# ???
ISOCodes.get_D_lang_codes('eng')

# Get whether there are macro codes (ADD AN EXPLANATION/EXAMPLES!)
ISOCodes.get_L_macros('eng')
ISOCodes.get_L_rev_macros('eng')

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
iso_tools		iso_tools
.gitignore		.gitignore
README.rst		README.rst
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ISO 639 Tools

About

Usage

About

Releases

Packages

Languages

mcyph/iso_tools

Folders and files

Latest commit

History

Repository files navigation

ISO 639 Tools

About

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages