CS 4999: Computing and the Humanities
Prof. Kathleen R. McKeown
Spring 1998; Tues & Thurs 4:10-5:25
Newsgroup: columbia.spring.cs4999
Office Hours
About the Course
Prerequisites
Assignments and Grading
Syllabus
Readings
Class Notes
Class URLs
Submitting Assignments
Office Hours
Prof. Kathleen R. McKeown
Tuesday 5:30 - 6:30 in 720 CEPSR
Thursday 11:00 - 12:00 in 450 CS Building
TA: Carl Sable
Monday 1:30 - 2:30 in 724 CEPSR
Wednesday 1:30 - 2:30 in 724 CEPSR
About the course
This course provides a broad introduction to the information
age. We will cover principles, software, tools, and analytic methods
for humanities computing. This includes research using electronic
texts of all types, ranging from on-line newspaper text to transcripts
and recordings of speech. We will cover the encoding of electronic
texts using various forms of markup languages, the fundamentals of
information retrieval and tools for statistical analysis of electronic
texts.
In addition to electronic texts, we will also look at the use of
other forms of data in humanities computing, including images,
video, and databases.
We will study the use and analysis of these different media in a
variety of humanities applications, including:
- identification of authorship and style
- computer-aided instruction and software for education
- history and law: recording and reasoning about facts and cases
- online art: modelling, verifying, and accessing online images
Prerequisites
For CS students: CS 3139 (Data Structures)
For Humanities majors: CS 1001 (Intro to Computers)
There will be different expectations for CS students and Humanities students
in terms of the programming projects (see below). However, Humanities students
will be expected to know how to use the computer as a tool. While instruction
will be given on how to use the software packages given for assignments,
students are expected to be familiar with general use of computers,
including email, editors, word processors, internet access and basic
unix commands.
Assignments and Grading
To accommodate students with different types of background and
experience, there will be 3 types of assignments. Homework
assignments, final projects, and exams will typically (not always)
include the 3 categories listed below. Students will have some
freedom in choosing topics and types of assignments, but all students
will be required to turn in assignments of at least 2 of the 3 types
listed here. Thus, humanities students may choose not to do
programming assignments, but would have to do both data analyses and
essays; CS students may choose not to do essays, but would have to
do data analyses in addition to programming tasks. CS students will be
required to do at least one programming project and Humanities students will
be required to do at least one extensive essay.
- essays on general topics pertaining to humanities computing
- analysis of data, e.g., using a software tool covered in class
- programming project, e.g., building a piece of software for
use in humanities computing, or modifying existing software to facilitate
humanities computing
There will be:
- 4 homework assignments: to see topics and due dates, click here.
- a midterm
- a class project consisting of an initial proposal 1/3 into the course,
a progress report 2/3 into the course, and a ten-minute presentation of
the complete project at the end of the course
Class project. Each student will be responsible for
designing and completing a research project that demonstrates the
ability to use concepts from the class in addressing a practical
problem for humanities computing. A significant part of the final
grade will depend on the three project assignments. Students will
need to submit a project proposal, a progress report, and the project
itself. Students can elect to do a project on an assigned topic, or
to select a topic of their own.
The final version of the project will be put on the World Wide Web, and
will be defended in front of the class at the end of the semester. In
some cases, students may be allowed to work in pairs, e.g., a CS
student might pair with a humanities student to collaborate on a
larger project.
Click here to get to the class resources
directory (not available yet).
Syllabus
Introduction
Jan. 20th
- What is Computing and the Humanities?
- Computing and Humanities applications
- Introduction to electronic texts
Creating/using electronic text databases
Jan. 22nd
- Collections
- Online books
- Text access
Fundamentals of information retrieval
Jan. 27 - 29
- indexing, search, regular expression languages, retrieval
- evaluation
Online encoding of words and phrases
Feb. 3 - 5
- written representations: SGML, HTML, the TEI
- morphology, collocations, etc.
Literary analysis
Feb. 10 - Mar. 5
- ETS, text processing software
- Concordance analyzers
- WordNet and other online dictionaries
- Statistics for Humanities computing: simple tests, packages
- Use of statistics for lexicography, stylistics, content analysis
- Authorship and style
- Other applications: machine translation
Analyzing Speech
Mar. 10 - Mar. 12
- spoken representations
- Analyzing and tagging speech
Midterm
Mar. 24
Computer-Aided Instruction
Mar. 26 - Apr. 2
- Goal-oriented learning: museum curator, weather prediction, medical
curriculum, foreign language learning
- Methodology: interaction with psychology
- User interfaces for CAI
Images and art
Apr. 7 - 16
- Applications: art humanities, architecture
- Searching for images
- User interfaces for a library of images
- Statistical analysis of images
History, Law, and Legal Reasoning
Apr. 21 - 23
- Lexis/Nexis
- Legal and historical databases
- Representation and reasoning by cases
Issues for Digital Libraries
Apr. 28 - 30
Student Presentations
May 4
Readings
Readings will consist of various book chapters,
journal articles, WWW pages, and so on. Materials that are not on
line or in library reserve will be distributed in class, or can be
picked up during office hours. Click here
to see a list of readings, which will be updated as the class progresses.