'liga' should not support only latin words #697

@TonyJR

Description

'liga' works well for Latin words, but it does not work for non-Latin characters.
For example, https://learn.microsoft.com/en-us/typography/opentype/otspec183/gsub#example-7-contextsubstformat1-subtable-and-substlookuprecord
image
NewFont-Regular.ttf.zip

Expected Behavior

image

Current Behavior

import opentype from 'opentype.js';
import { readFileSync } from 'fs';
import path from 'path';

const { parse } = opentype;

const loadSync = (url, opt) => parse(readFileSync(url), opt);

let newFont = loadSync('./fonts/NewFont-Regular.ttf');
console.log(newFont.stringToGlyphIndexes(' – ')); // [ 1, 88, 1 ]  wrong, should be [86,88,86]
console.log(newFont.stringToGlyphIndexes('ABA')); // [ 25, 3, 25 ] right

Possible Solution

script latn;
language dflt;

lookup SUB_11 {
	sub A by X;
	sub space by thinspace;
} SUB_11;

These two rules are very similar; the only difference is that the first one substitutes a Latin character and the second does not. I think the problem lies in https://github.com/opentypejs/opentype.js/blob/master/src/features/latn/contextCheck/latinWord.js

function latinWordStartCheck(contextParams) {
    const char = contextParams.current;
    const prevChar = contextParams.get(-1);
    return (
        // ? latin first char
        (prevChar === null && isLatinChar(char)) ||
        // ? latin char preceded with a non latin char
        (!isLatinChar(prevChar) && isLatinChar(char))
    );
}

function latinWordEndCheck(contextParams) {
    const nextChar = contextParams.get(1);
    return (
        // ? last latin char
        (nextChar === null) ||
        // ? next char is not latin
        (!isLatinChar(nextChar))
    );
}
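
To illustrate why these checks would skip ' – ', here is a minimal standalone sketch. It assumes a simplified `isLatinChar` (an ASCII-letter test standing in for the library's helper) and a hypothetical `findLatinWordRanges` function that is not part of opentype.js; it only mirrors the start/end logic quoted above.

```javascript
// Assumption: simplified stand-in for the library's isLatinChar helper.
function isLatinChar(c) {
    return typeof c === 'string' && /[A-Za-z]/.test(c);
}

// Hypothetical helper (not an opentype.js API): returns [start, end]
// index pairs for each run of Latin characters in the text.
function findLatinWordRanges(text) {
    const chars = Array.from(text);
    const ranges = [];
    let start = -1;
    chars.forEach((char, i) => {
        const prev = i > 0 ? chars[i - 1] : null;
        const next = i < chars.length - 1 ? chars[i + 1] : null;
        // Mirrors latinWordStartCheck: a Latin char at the start of the
        // text, or one preceded by a non-Latin char, opens a range.
        if (start === -1 && isLatinChar(char) && (prev === null || !isLatinChar(prev))) {
            start = i;
        }
        // Mirrors latinWordEndCheck: the range closes at the last char,
        // or when the next char is not Latin.
        if (start !== -1 && (next === null || !isLatinChar(next))) {
            ranges.push([start, i]);
            start = -1;
        }
    });
    return ranges;
}

console.log(findLatinWordRanges('ABA')); // one range covering the whole string
console.log(findLatinWordRanges(' – ')); // no ranges at all
```

If substitutions are only applied inside such ranges, as this issue suspects, then `sub space by thinspace` never gets a chance to run on ' – ', because no range is ever opened.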

I think we mixed up ‘script’ and ‘language’.

  • 'script' is more like a preference.
    image
    https://learn.microsoft.com/en-us/typography/script-development/standard#features
    If you choose 'latn', that is equivalent to choosing ['ccmp','liga','clig'] by default. You can also manually enable or disable individual features if you want. When the program queries these features, it should look under 'latn' first; if they cannot be found there, it needs to fall back to 'DFLT'.

  • 'language' is more like an optimization for certain characters.
    It means selective processing based on Unicode, and I think we don't need to support it for now.
    image
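
The fallback described in the first bullet could be sketched as follows. This is only an illustration under stated assumptions: `scriptFeatures` is a made-up, simplified model of a font's script list, and `resolveFeatureScript` is a hypothetical helper, not an existing opentype.js function.

```javascript
// Hypothetical, simplified model of a GSUB script list:
// script tag -> feature tags registered under that script.
const scriptFeatures = {
    DFLT: ['ccmp', 'liga'],
    latn: ['ccmp', 'clig'],
};

// Hypothetical helper: find which script provides a feature, trying the
// requested script first and falling back to 'DFLT' if it is missing there.
function resolveFeatureScript(scripts, featureTag, scriptTag = 'latn') {
    for (const tag of [scriptTag, 'DFLT']) {
        const features = scripts[tag];
        if (features && features.includes(featureTag)) {
            return tag;
        }
    }
    return null; // feature is not available under either script
}

console.log(resolveFeatureScript(scriptFeatures, 'clig')); // found under 'latn'
console.log(resolveFeatureScript(scriptFeatures, 'liga')); // falls back to 'DFLT'
```

With this kind of lookup, the script tag selects which features are queried, and nothing in it depends on whether the characters in the text are Latin.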
