GitHub - mcyph/pos_tagger: A multi-engine part-of-speech tagging system


A multi-engine part-of-speech tagging system.

Brings together the following engines:

Not ready for general use - built for use at http://langlynx.com

TODO!

apt install natto mecab-ko mecab-ko-dic natto-py

TODO!

Provide tests to make sure things are set up correctly, turning off engines which aren't functioning (machine learning setups can be complex, and components can conflict or be difficult to set up, so better partially working than not at all!)
Allow downloading of models (either on-demand or explicitly).
Add support for MeCab (Japanese).
Add support for selecting models by license (LGPL/CC-BY/CC-BY-NC etc.).
Add serialization/deserialization with "pretty-printed" HTML in a (somewhat) standard format, allowing things like JavaScript identification of dependencies on mouseover, and showing the lemmatized forms of words.
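
As a rough illustration of the first TODO item (checking the setup and turning off engines that don't work), something along these lines might do. This is only a sketch and not code from this repository: the names `probe_engines`, `import_check` and `enabled_engines` are hypothetical, and the check assumes that "can the engine's Python module be imported?" is a good enough first test.

```python
import importlib
from typing import Callable, Dict, List


def import_check(module_name: str) -> Callable[[], object]:
    """Build a check that passes only if `module_name` can be imported."""
    return lambda: importlib.import_module(module_name)


def probe_engines(checks: Dict[str, Callable[[], object]]) -> Dict[str, bool]:
    """Run every check; an engine counts as working if its check raises nothing."""
    status = {}
    for name, check in checks.items():
        try:
            check()
            status[name] = True
        except Exception:
            # Any failure (missing package, broken native library, ...)
            # disables just this engine instead of the whole system.
            status[name] = False
    return status


def enabled_engines(status: Dict[str, bool]) -> List[str]:
    """Names of the engines that passed their check, in insertion order."""
    return [name for name, ok in status.items() if ok]


if __name__ == "__main__":
    # Hypothetical engine names; "json" stands in for an installed dependency.
    checks = {
        "demo_engine": import_check("json"),
        "missing_engine": import_check("no_such_module_for_pos_tagger_demo"),
    }
    print(enabled_engines(probe_engines(checks)))
```

A real check would probably also load a model and tag a short sentence, since a package can import fine and still fail at runtime.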

Please report any bugs/feature requests at GitHub: https://github.com/mcyph/pos_tagger
