2024-2025
Text as Data
This course explores methods for extracting and analyzing text as data. Automated text analysis has become widely used in the social sciences following recent innovations in machine learning and the increased digitalization of political texts. The course covers the theoretical foundations of text analysis and discusses the promises and challenges of analyzing text as data. Students will become familiar with different techniques for systematically analyzing text, with applications to political science topics. The focus of the course is on practical applications that allow students to apply cutting-edge statistical and computational techniques to their own research.
We begin the course by exploring the conceptual foundations of quantitative text analysis and proceed to consider how we can extract, pre-process, and describe social text data. We then explore supervised and unsupervised machine learning methods that can be used to measure and analyze social science concepts using text as data. The course has a discussion and lab format in which students are expected to complete the weekly readings, provide insights on the topic, and complete in-class programming activities. In addition to developing a general understanding of the text as data literature, students are required to focus in depth on one particular method by developing an independent research data paper. We conclude the course with research presentations.
Graduate syllabus
Previous
Human Rights
Graduate syllabus
Undergraduate syllabus
Civil Conflict
Graduate syllabus
Undergraduate syllabus
Terrorism and Political Violence
Graduate syllabus
Undergraduate syllabus