Day 1 - Sept. 30th - 10am to 5pm (roughly) - Overview of Text Analysis

The first day will combine a comprehensive overview of quantitative text analysis with an introduction to the structure, logic, and syntax of the quanteda text analysis package. Topics will include:

  • an introduction to quantitative text analysis and its workflow
  • a comprehensive map of quantitative text analysis including the main techniques used in political and social science
  • an overview of quanteda and how to use it to create core objects for textual analysis.
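
For orientation, the idea behind a core object such as a document-feature matrix can be sketched in a few lines. This is a minimal illustration in plain Python, not quanteda code (quanteda is an R package); the two example texts and the helper names `tokenize` and `build_dfm` are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_dfm(docs):
    """Build a small document-feature matrix.

    docs maps a document name to its raw text. Returns the sorted
    feature list and, per document, a row of counts in that order.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    features = sorted(set().union(*counts.values()))
    matrix = {name: [c[f] for f in features] for name, c in counts.items()}
    return features, matrix

# Made-up example texts, for illustration only.
docs = {
    "doc1": "Taxes fund public services.",
    "doc2": "Lower taxes, lower spending.",
}
features, matrix = build_dfm(docs)
for name, row in matrix.items():
    print(name, dict(zip(features, row)))
```

Each row is one document and each column one feature (here, a lowercased word), which is the shape most of the later techniques operate on.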

Day 2 - Oct. 1st - 10am to 5pm (roughly) - Describing, comparing, and scaling texts

The second day will use a few running examples to cover topics including:

  • identifying key words based on statistical association measures
  • identifying multi-word expressions via collocation analysis
  • computing and interpreting textual statistics for comparing texts
  • using bootstrapping to estimate the uncertainty of text-based measures
  • computing similarities and distances for identifying clusters of similar texts.
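
As a rough sketch of the last item, one widely used similarity measure is cosine similarity between word-count vectors. The plain-Python example below is only an illustration under that assumption; the function name `cosine_similarity` and the example token lists are hypothetical, and the workshop may use other measures.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two token-count mappings."""
    shared = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

# Made-up token lists, for illustration only.
speech_a = Counter(["tax", "cut", "tax", "growth"])
speech_b = Counter(["tax", "cut", "growth", "growth"])
speech_c = Counter(["health", "care", "reform"])

print(cosine_similarity(speech_a, speech_b))
print(cosine_similarity(speech_a, speech_c))
```

Documents that share no tokens score 0, and documents with proportional counts score 1; distances for clustering can be derived from such a score, for example as one minus the similarity.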


Day 3 - Oct. 2nd - 10am to 2:30pm (must end before Colonialism workshop) - Advanced text analysis

The final day will cover advanced methods including:

  • feature weighting and feature selection for textual matrices
  • supervised and unsupervised document scaling for measuring ideological positions
  • supervised machine learning for document classification
  • topic modelling.
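
To give a flavour of feature weighting, the sketch below applies one common tf-idf variant (raw term count times log10 of the number of documents over the document frequency). It is plain Python for illustration only; the function name `tfidf` and the token lists are hypothetical, and other weighting schemes exist.

```python
import math
from collections import Counter

def tfidf(docs):
    """Weight token counts by inverse document frequency.

    docs maps a document name to a list of tokens. Returns, per
    document, a mapping from token to count * log10(N / df).
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # count each token once per document
    weights = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        weights[name] = {t: c * math.log10(n_docs / df[t]) for t, c in tf.items()}
    return weights

# Made-up token lists, for illustration only.
docs = {
    "a": ["tax", "tax", "budget"],
    "b": ["tax", "health"],
}
weights = tfidf(docs)
for name, w in weights.items():
    print(name, w)
```

A token that appears in every document gets weight zero under this variant, so the weighting pushes attention toward features that distinguish documents from one another.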