Skip to content

Latest commit

 

History

History
99 lines (67 loc) · 3.12 KB

README.rst

File metadata and controls

99 lines (67 loc) · 3.12 KB

ISO 639 Tools

About

A set of utilities to provide information about languages, by their <a href="https://en.wikipedia.org/wiki/ISO_639">ISO 639</a> code.

Usage

The ISOTools class provides tools for the normalisation of ISO 639 codes.

There are multiple different standards for language codes in software:

When writing software such as LangLynx, I found myself needing to provide information about languages and dialects which weren't possible with any of these standards:

  • ISO code: supports normalisation
  • Script code:
  • Variant code:
  • Territory code:

I didn't add the codeset/character set, as I think it can be reasonable to expect the use of Unicode across all of my applications.

from iso_tools.ISOTools import ISOTools

print(ISOTools.verify_iso('en'))

# Remove specific properties
print(ISOTools.get_L_removed('en_Latn', [SCRIPT]))
print(ISOTools.removed('ja_Jpan', LANG | SCRIPT))

# Miscellaneous
print(ISOTools.get_lang_props('en'))
print(ISOTools.guess_omitted_info('en'))

# Locale->ISO
print(ISOTools.locale_to_iso('en-US'))
print(ISOTools.locale_to_iso('en-GB'))

# Remove implied/unecessary information (or recover it)
print(ISOTools.pack_iso('ja_Jpan'))
print(ISOTools.remove_unneeded_info('ja_Jpan'))
print(ISOTools.unpack_iso('ja'))

# Split/join ISO code information
print(ISOTools.join(part3='en', script='Latn'))
print(ISOTools.split('ja'))
print(ISOTools.split_multiple('ja_-_en'))

# Fileame escape/de-escape
print(ISOTools.filename_escape('en'))
print(ISOTools.filename_split('en'))

# URL escape/de-escape
print(ISOTools.url_join('ja', 'en'))
print(ISOTools.url_split('ja_-_en'))
print(ISOTools.url_escape('ja_Jpan'))
print(ISOTools.url_unescape('ja_Jpan'))

There is also access to the original ISO 639-3 language data indexed by three-character codes, allowing both conversions

from iso_tools.iso_codes.ISOCodes import ISOCodes

# Convert ISO 639-1/ISO 639-2 codes to ISO 639-3
ISOCodes.to_part3('en')

# Get information about a specific ISO code
ISOCodes.get_D_iso('eng')

# Get other names for an ISO code
ISOCodes.get_D_alternate_names('eng')

# ???
ISOCodes.get_D_iso_639('eng')

# ???
ISOCodes.get_D_lang_codes('eng')

# Get whether there are macro codes (ADD AN EXPLANATION/EXAMPLES!)
ISOCodes.get_L_macros('eng')
ISOCodes.get_L_rev_macros('eng')