TL;DR: This post introduces konoha (formerly tiny_tokenizer), a library for sentence tokenization! You can use it like the snippet below. Thanks in advance!

```python
from konoha import WordTokenizer

sentence = '自然言語処理を勉強しています'

tokenizer = WordTokenizer('MeCab')
print(tokenizer.tokenize(sentence))
# -> [自然, 言語, 処理, を, 勉強, し, て, い, ます]

tokenizer = WordTokenizer('Kytea')
print(tokenizer.tokenize(sentence))
# -> [自然, 言語, 処理, を, 勉強, し, て, い, ま, す]

# The original snippet is cut off here ('Sentencepie...'); presumably the
# Sentencepiece backend, which requires a trained model file to be supplied.
tokenizer = WordTokenizer('Sentencepiece', model_path='model.spm')  # path shown is illustrative
print(tokenizer.tokenize(sentence))
```
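To see how the backends compare side by side, here is a minimal sketch (assuming konoha plus the MeCab and KyTea backends are installed locally) that loops over the backend names shown above and prints each tokenization:

```python
# Minimal sketch: compare tokenizer backends via konoha's WordTokenizer.
# Assumes konoha and the MeCab / KyTea backends are installed;
# adjust the backend list to whatever is available on your machine.
from konoha import WordTokenizer

sentence = '自然言語処理を勉強しています'

for backend in ['MeCab', 'Kytea']:
    tokenizer = WordTokenizer(backend)
    tokens = tokenizer.tokenize(sentence)
    print(f'{backend}: {tokens}')
```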
