corpus-data

Here are 163 public repositories matching this topic...

For a corpus linguistics project, I created an information retrieval program called "You Are Not Alone". Its phrase_finder() function searches for a self-identifying phrase in four long classic texts (The Souls of Black Folk, Jane Eyre, The Strange Case of Dr. Jekyll & Mr. Hyde, and Frankenstein). Standpoint: "So Matilda’s strong young mind continu…

  • Updated Mar 16, 2023
  • Python
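The repository's own phrase_finder() is not reproduced on this page, so the following is only a minimal sketch of the kind of search the description suggests. The function name find_phrase, its signature, the naive sentence splitter, and the placeholder sample texts are all assumptions for illustration, not the author's code or excerpts from the novels.

```python
import re


def find_phrase(phrase, texts):
    """Return {title: [sentences containing phrase]}, matched case-insensitively.

    `texts` is assumed to be a dict mapping a title to the full text as a string.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = {}
    for title, text in texts.items():
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", text)
        matches = [s.strip() for s in sentences if pattern.search(s)]
        if matches:
            hits[title] = matches
    return hits


# Placeholder strings standing in for the real corpora (hypothetical data).
sample_texts = {
    "Text A": "I am alone. The night was long! Am I a monster?",
    "Text B": "She walked on. Nobody spoke.",
}

result = find_phrase("i am", sample_texts)
print(result)
```

In practice the values of sample_texts would be the full text of each book loaded from disk, and a proper sentence tokenizer would likely replace the regular-expression split.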

A word vector is a multi-dimensional vector representation of a word. Similarity between vectors often accompanies a semantic relation between the corresponding words. But exploring the vector space further, we can find more interesting and surprising relations. I will shed some light on the mathematical meaning of the word vectors using an interactive…

  • Updated Oct 7, 2021
  • Jupyter Notebook
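To make the idea of vector similarity concrete, here is a small sketch using cosine similarity, one common way to compare word vectors. The three-dimensional toy_vectors below are invented for illustration and do not come from the notebook or from any trained model; the "king - man + woman" arithmetic is a widely cited example of a relation in vector space and is assumed here, not taken from the repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def nearest(target, vectors, exclude=()):
    """Return the word whose vector is most similar to `target`."""
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))


# Vector arithmetic: king - man + woman, computed component by component.
analogy = [
    k - m + w
    for k, m, w in zip(toy_vectors["king"], toy_vectors["man"], toy_vectors["woman"])
]
best = nearest(analogy, toy_vectors, exclude=("king", "man", "woman"))
print(best)
```

With these made-up numbers the nearest remaining word is "queen"; with real embeddings the result depends on the model and the training corpus.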
