Split text containing ligatures and combining characters, such as Thai and Khmer letters and complex emoji, into an array of graphemes.
You can use this library in place of Array.from to get graphemes.
$ npm install split-graphemes
// The family emoji '👨‍👩‍👦‍👦' consists of four person emoji joined by Zero Width Joiners (ZWJ).
// Array.from splits it into its individual code points.
const codePoints = Array.from('👨‍👩‍👦‍👦') // ['👨', ZWJ, '👩', ZWJ, '👦', ZWJ, '👦']
// splitGraphemes treats it as exactly one character.
const graphemes = splitGraphemes('👨‍👩‍👦‍👦') // ['👨‍👩‍👦‍👦']
Array.from('ប៉ុស្ដិ៍') // ['ប', '៉', 'ុ', 'ស', '្', 'ដ', 'ិ', '៍']
splitGraphemes('ប៉ុស្ដិ៍') // ['ប៉ុ', 'ស្ដិ៍']
splitGraphemes('ごん゙に゙ぢば') // ['ご', 'ん゙', 'に゙', 'ぢ', 'ば']
splitGraphemes('パピプペポ') // ['パ', 'ピ', 'プ', 'ペ', 'ポ']
splitGraphemes('Hello') // ['H', 'e', 'l', 'l', 'o']
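To see why Array.from falls short, it can help to look at the code points it iterates over. The following is a minimal sketch that uses only built-in JavaScript, not the split-graphemes API. The names `ZWJ`, `family`, and `describeCodePoints` are hypothetical and exist only for this illustration. The emoji is written with explicit escapes so that the invisible joiners are visible in the source.

```javascript
// Illustration only: built-in JavaScript, no call into split-graphemes.
// The family emoji is assembled from escapes so the ZWJs are explicit.
const ZWJ = '\u200D'
const family = ['\u{1F468}', '\u{1F469}', '\u{1F466}', '\u{1F466}'].join(ZWJ)

// Hypothetical helper: label every code point in a string as 'U+XXXX'.
function describeCodePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  )
}

console.log(family.length)             // 11 UTF-16 code units
console.log(Array.from(family).length) // 7 code points
console.log(describeCodePoints(family))
// ['U+1F468', 'U+200D', 'U+1F469', 'U+200D', 'U+1F466', 'U+200D', 'U+1F466']
```

Neither count matches what a reader perceives as one character, which is the gap that grapheme splitting is meant to close.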
The list of characters is available here.