Course Info

The central question of our course is how computational linguistics and information visualization can be put in the service of text-based research in the humanities.

This course will bring together students from the humanities who want to learn how technology can change how they do research, and students from information and computer science who want to help design and build the next generation of tools for humanities scholars, with a focus on analysis of written literature.

Students from each discipline will be expected to be open to learning from the other. The course will consist of readings and discussion of research papers as well as analysis and evaluation of existing tools. Students will be expected to contribute to the design, analysis, and/or evaluation of a new software tool for scholarly literature analysis.

Students interested in machine learning and natural language processing (NLP) will have the opportunity to apply those tools to literary analysis problems. Students interested in human-computer interaction (HCI) will have the opportunity to engage with problems of information design and visualization for analysis, search, and navigation.

Humanities students should have an open mind and a passion to learn about new techniques.

Units: 3


Text Similarity and the Vocabulary Problem

Literature scholars and historians are often interested in passages of text that share a theme or concept. Currently, the only way for them to find passages of interest is to read their texts closely and mark the passages individually. This process is unreliable and time-consuming: researchers' moods, recent experiences, and states of mind affect what they notice when they read. How can we apply NLP and HCI to make this process faster and more reliable?
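As one possible starting point, the minimal sketch below ranks passages by bag-of-words cosine similarity to a query. It is an illustration under stated assumptions, not a method prescribed by the course: the function names (tokenize, cosine_similarity, rank_passages) and the sample passages are hypothetical and invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty passage shares nothing with anything
    return dot / (norm_a * norm_b)


def rank_passages(query, passages):
    """Return (score, passage) pairs sorted from most to least similar."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(p))), p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical passages, invented for illustration only.
passages = [
    "The sea was calm and the ship sailed on.",
    "The ocean lay still beneath the vessel.",
    "He paid his debts and left the city.",
]

for score, passage in rank_passages("the calm sea", passages):
    print(f"{score:.2f}  {passage}")
```

The second passage is thematically close to the query, yet it overlaps only on the word "the", because "ocean" and "sea" are different tokens. That mismatch between shared meaning and shared wording is the vocabulary problem in miniature, and it is why pure word matching is unlikely to be enough on its own.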

Visualization and Analysis

Humanities researchers working on different problems need different kinds of information about their texts. Visualizations of text-based data can provide overviews and perspectives unavailable from reading. However, it's often not enough to simply extract and visualize the information: when building tools, it's important to think about interactions, as well as how the visualization fits into the rest of the scholar's analysis process.

These considerations are central to the design of a helpful visual analytics tool.


Information and computer science students should have general programming skills and experience in some subset of the following: database programming, XML design, graphic design, user interface design, information visualization, natural language processing, machine learning, data mining, and statistical analysis.