API

Download and use pre-trained models or datasets.

api_info() api_load()

API

Data

Helper functions and dataset for demonstration purposes.

authors

Authors

corpus

Corpus

common_texts() common_corpus() common_dictionary()

Preprocessed Text

datapath()

Data

Dependencies

Install Python dependencies.

install_dependencies() install_gensim() install_sklearn() install_ldavis()

Install Dependencies

Corpora

Transform and clean corpora.

doc2bow()

Document to Bag of Words

auth2doc()

Author Document Preprocess

corpora_dictionary()

Create a dictionary

serialize_mmcorpus() as_serialized_mmcorpus() delete_mmcorpus()

Serialise Matrix Market Corpus

read_serialized_mmcorpus()

Read Serialized Matrix Market

similarity()

Similarity

strip_multiple_spaces()

Strip Multiple Spaces

strip_non_alphanum()

Strip Non Alphanumerics

strip_numeric()

Strip Numerics

strip_punctuation()

Strip Punctuation

strip_short()

Strip Short Words

strip_tags()

Strip Tags

split_alphanum()

Split Alphanumerics

porter_stemmer()

Porter Stemmer

prepare_documents()

Prepare Documents

preprocess()

Preprocess Text

remove_stopwords()

Remove Stopwords

stem_text()

Stem

text8corpus()

Line Sentence

filter_rare()

Filter Rarely
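
The cleaning and bag-of-words steps indexed in this section can be illustrated with a short, library-free sketch. It is written in Python, the language of the gensim dependency, and uses only the standard library. The helper names `clean`, `build_dictionary`, and `to_bow` are hypothetical and say nothing about the signatures or defaults of the functions listed above.

```python
import re
import string
from collections import Counter


def clean(text, min_len=3):
    """Lowercase, replace punctuation and digits with spaces,
    collapse repeated spaces, and drop tokens shorter than min_len."""
    text = text.lower()
    # Replace every punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    # Remove digits, then collapse runs of whitespace.
    text = re.sub(r"\d+", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return [tok for tok in text.split(" ") if len(tok) >= min_len]


def build_dictionary(docs):
    """Map each distinct token to an integer id, in order of first appearance."""
    token2id = {}
    for doc in docs:
        for tok in doc:
            token2id.setdefault(tok, len(token2id))
    return token2id


def to_bow(doc, token2id):
    """Return sorted (token_id, count) pairs; tokens missing from the
    dictionary are skipped."""
    counts = Counter(token2id[tok] for tok in doc if tok in token2id)
    return sorted(counts.items())


# Hypothetical input texts, for illustration only.
raw = ["Human   machine interface, v2!", "A survey of user opinion: machine time"]
docs = [clean(t) for t in raw]
token2id = build_dictionary(docs)
bows = [to_bow(d, token2id) for d in docs]
```

Each document ends up as a sparse list of `(token_id, count)` pairs, which is the general shape of input that bag-of-words models consume.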

Models

Available models and utilities.

wrap()

Wrap

get_author_topics()

Get Author Topics

map_model() get_perplexity_data()

Map Models

model_at() load_at()

Author-topic Model

model_fasttext() load_fasttext()

Fasttext Model

model_hdp() load_hdp()

Hierarchical Dirichlet Process Model

model_lda() load_lda() model_ldamc() load_ldamc()

Latent Dirichlet Allocation Model

model_logentropy() load_logentropy()

Log Entropy Model

model_lsi() load_lsi()

Latent Semantic Indexing Model

model_norm() load_norm()

Normalization Model

model_poincare()

Poincaré Model

model_rp() load_rp()

Random Projections Model

model_tfidf() load_tfidf()

Tf-idf Model

model_word2vec() load_word2vec()

Word2Vec Model

model_coherence()

Topic Coherence

get_docs_topics()

Get Document Topics
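
As a rough illustration of what a weighting model such as tf-idf does to a bag-of-words corpus, the sketch below implements one common variant: raw counts multiplied by a base-2 log inverse document frequency, followed by L2 normalisation. It is plain Python with a hypothetical function name `tfidf`. The weighting scheme and defaults used by `model_tfidf()` are not stated in this index and may differ.

```python
import math


def tfidf(bows):
    """Weight (token_id, count) pairs by log2(N / document frequency),
    drop zero weights, and L2-normalise each document.
    Assumes each token id appears at most once per document."""
    n_docs = len(bows)
    # Document frequency: number of documents containing each token id.
    df = {}
    for bow in bows:
        for token_id, _ in bow:
            df[token_id] = df.get(token_id, 0) + 1

    weighted = []
    for bow in bows:
        vec = [(tid, cnt * math.log2(n_docs / df[tid])) for tid, cnt in bow]
        # Tokens present in every document get weight 0 and are dropped.
        vec = [(tid, w) for tid, w in vec if w > 0.0]
        norm = math.sqrt(sum(w * w for _, w in vec))
        weighted.append([(tid, w / norm) for tid, w in vec] if norm else [])
    return weighted


# Hypothetical corpus: token 1 occurs in every document.
bows = [[(0, 1), (1, 1)], [(1, 2), (2, 1)], [(1, 1), (3, 3)]]
weights = tfidf(bows)
```

Token `1` appears in all three documents, so it carries no discriminating information and disappears from every weighted vector.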

Visualisation

Functions to visualise model outputs, embeddings, etc.

prepare_ldavis() show_ldavis() plot_ldavis() ldavis_as_html()

Visualise Latent Dirichlet Allocation Models

save_ldavis_html() save_ldavis_json()

Save Visualisation

Sklearn

Scikit-learn API.

install_dependencies() install_gensim() install_sklearn() install_ldavis()

Install Dependencies

sklearn_at()

Author Topic Model

sklearn_doc2bow() sklearn_text2bow()

Word ID Mapper

sklearn_doc2vec()

Doc2vec Model

sklearn_hdp()

Hierarchical Dirichlet Process Model

sklearn_lda()

Latent Dirichlet Allocation Model

sklearn_logistic()

Scikit-learn Logistic Regression

sklearn_lsi()

Latent Semantic Indexing Model

sklearn_pipeline()

Scikit-learn Pipeline

sklearn_pt()

Phrase (Collocation) Detection

sklearn_rp()

Random Projections Model

sklearn_tfidf()

Tf-idf Model

sklearn_word2vec()

Word2vec Model

Document Similarity

Document similarity-related functions.

similarity_matrix()

Similarity Matrix

get_similarity()

Get Similarity

similarity()

Similarity
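
Document similarity over sparse vectors is often computed as cosine similarity. The sketch below shows that calculation in plain Python for vectors stored as `(token_id, weight)` pairs. The names `cosine` and `pairwise_cosine` are hypothetical, and this is an assumption about the general technique, not a description of how `similarity()`, `get_similarity()`, or `similarity_matrix()` are implemented.

```python
import math


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse vectors given as
    lists of (token_id, weight) pairs. Returns 0.0 if either is empty."""
    a, b = dict(vec_a), dict(vec_b)
    dot = sum(w * b[tid] for tid, w in a.items() if tid in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_cosine(vectors):
    """Dense table of cosine similarities between all pairs of documents."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Hypothetical weighted documents.
vectors = [[(0, 1.0), (1, 1.0)], [(0, 1.0)], [(2, 2.0)]]
table = pairwise_cosine(vectors)
```

Documents that share no tokens score 0, and a document compared with itself scores 1 (up to floating-point rounding).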

Summarize

Text summarization-related functions.

summarize()

Summarize

keywords()

Keywords

get_bm25_weights()

BM25

phrases() phraser()

Phrases