Date | Subject | Readings | Assignments |
---|---|---|---|
8/30 | Overview | ||
9/2 | Labor Day | ||
9/4 - 9/6 | Character Encodings + Philosophy | Technical Reading: Bits to Characters. Discussion: | Programming 1: Character encodings |
9/9 - 9/13 | Tokenization and Counting | Technical: Discussion: | Programming 2: Tokenization, Counting |
9/16 - 9/20 | Sentiment analysis | Technical: Discussion: Summarize and comment on these two approaches to Vonnegut's theory. | Programming 3: Evaluate two sentiment lexicons; manually create a dictionary-based lexicon for an emotion. |
9/23 - 9/27 | Classification: | Technical: | Programming 4: What distinguishes History, Tragedy, and Comedy in Shakespeare's plays? |
9/30 - 10/4 | Measurements of uncertainty; Similarity and Divergence | Technical: | Programming 5: Identifying change and variation over time by comparing documents in a sequence. |
10/7 - 10/11 | Similarity, Clustering, and Authorship | Technical: | Shared data project: Construct a collection of The Federalist Papers. Programming 6: We will apply similarity functions to the Federalist Papers, and see how they imply different clusterings. |
10/14 | Break | ||
10/16 - 10/18 | Similarity, Clustering, and Authorship | Technical: No reading. Discussion: No discussion! | Complete the data project on the Federalist Papers. Programming 6 cont'd: Examine differences in keyword use; add clustering and tf-idf. |
10/21 - 10/25 | Corpus-building, clustering | Discussion: | Programming 7: IDF weighting. Build a collection from Project Gutenberg texts. Extending similarity to clustering. Distinguish Horror from non-Horror. |
10/28 - 11/1 | Corpus-building, clustering continued | Discussion: Choose TWO (2) of the following articles about constructing a collection: | Programming 7 cont'd: IDF weighting. Build a collection from Project Gutenberg texts. Extending similarity to clustering. Distinguish Horror from non-Horror. |
11/4 - 11/8 | Topic modeling | Technical: | Programming 8: Training, analyzing, and evaluating topic models. |
11/11 - 11/15 | Word embeddings | Technical: | Mini-project is due Monday. Programming 9: Word embeddings, keywords in context, distance functions. |
11/18 - 11/22 | Tools for Hypothesis Testing | Technical: | How do we make the connection between text analytics and persuasive arguments? What methods can help us convince ourselves that we're not reporting random values, and how can we explain to others what we've done? |
11/25 | Multiple hypotheses | Technical: | Do we need to think about significance differently when we're running many experiments than when we're running just one? |
11/27 - 11/29 | Thanksgiving, no class | ||