This week you will start applying your new machine learning skills to real-world problems, beginning with text analysis. First, we review how textual data is converted into numerical data that a computer can process. This conversion brings with it several new concepts for manipulating the data to improve machine learning predictions. Finally, we will apply machine learning techniques, including classification, dimensionality reduction, and clustering, to text data.
- Understand the basic concepts of tokenizing a text document, including word counts, bag of words, stop words, TF-IDF, stemming, and n-grams.
- Understand the basic principles of text classification, including its use in sentiment analysis.
- Understand how to apply advanced text mining approaches such as dimensionality reduction, clustering, and parameter grid search by using Python.
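As a preview of the tokenizing concepts listed above, the following is a minimal sketch in plain Python of stop-word removal, bag-of-words counts, n-grams, and TF-IDF. The corpus, the stop-word list, and the function names `tokenize`, `ngrams`, and `tf_idf` are hypothetical and chosen only for illustration. The weighting used here is the simple term frequency times `log(N / df)` form; libraries such as scikit-learn apply a smoothed variant, so their numbers will differ. Stemming is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, for illustration only.
DOCS = [
    "The movie was great and the acting was great",
    "The movie was terrible",
    "Great acting in a great film",
]
STOP_WORDS = {"the", "was", "and", "in", "a"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the contiguous n-token sequences, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / tokens in doc) * log(N / df),
    where df is the number of documents that contain the term.
    """
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    weights = []
    for tokens in token_lists:
        counts = Counter(tokens)  # bag of words for this document
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


tokens = tokenize(DOCS[0])      # stop words removed
bag = Counter(tokens)           # bag-of-words counts
bigrams = ngrams(tokens, 2)     # 2-grams over the remaining tokens
weights = tf_idf(DOCS)          # per-document TF-IDF weights
```

In the second document, "terrible" appears in only one of the three documents while "movie" appears in two, so "terrible" receives the larger TF-IDF weight even though both words occur once.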
Activities and Assignments | Time Estimate | Deadline* | Points |
---|---|---|---|
Week 7 Introduction Video | 10 Minutes | Tuesday | N/A |
Week 7 Lesson 1: Introduction to Text Analysis | 2 Hours | Thursday | 20 |
Week 7 Lesson 2: Introduction to Text Classification | 2 Hours | Thursday | 20 |
Week 7 Lesson 3: Introduction to Text Mining | 2 Hours | Thursday | 20 |
Week 7 Quiz | 45 Minutes | Friday | 70 |
Week 7 Assignment Submission | 3 Hours | The following Monday | 125 Instructor, 10 Peer |
Week 7 Completion of Peer Review | 2 Hours | The following Saturday | 15 |
*Unless otherwise noted, all deadlines are at 6 PM Central Time.