Instructor:
Prof. John Kender
Lectures:
Monday and Wednesday 2:40pm - 3:55pm, 386 Engineering Terrace
Prerequisites:
The only formal prerequisite for this class is either CS3131 or CS3139. Courses such as Artificial Intelligence (CS4701) and Computer Vision (CS4731) are helpful but not required.
Textbook:
None, but a $45 fee for reprints will be charged.
Course Structure:
The course is a survey of existing research systems that allow
computers to accept and manipulate visual input, usually from real
time cameras, in order to direct further computer processing. These
systems presently enable machines to: recognize and interpret human
hand and body gestures; analyze imagery such as fingerprint or iris
patterns for security data; generate natural language descriptions of
medical or map imagery; index into a database of pictures to retrieve
related images; visually summarize long video sequences like news
reports or comedy shows; steer automobiles automatically; recover
CAD/CAM models by inspecting physical examples; monitor large outdoor
areas for types of activity; and execute other and more exotic tasks.
Since the course is a topics course, a survey-of-research course, and
a course being offered only for the second time, the schedule is
fluid. However, most or all of the topics above will be covered, and
additionally, some of the underlying theoretic work in artificial
intelligence, computer vision, robotics, and psychology also will be
covered.
Course Assignments:
There will likely be one or two short assignments exercising some of
the concepts on actual visual data and/or on theoretic issues. The
length and nature of these assignments depend on the hardware
and software systems that are available in the department's
facilities. If there is one assignment, it will be worth 20% of the
course grade; if there are two, they will together be worth 35% of the
course grade. There will be no exams.
At approximately midterm, a five-page proposal for a course paper or
project is due, complete with description, proposed methods, and
references. This proposal is worth 15% of the course grade. At
approximately finals, the final deliverable is due: either a 30-page
research paper surveying some aspect of visual interfaces, or a
demonstrable working project documented with a 15-page write-up (not
including supplementary documentation such as code listings). Those
electing to produce a
system need to be able to demonstrate its abilities somewhere on the
main campus during finals week, although there are no other
restrictions on machine or language. The final paper or project is
worth the remainder of the course grade, namely 50%, 65%, or 85%,
depending on whether there are two, one, or no assignments.
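The grade arithmetic above can be checked with a short sketch. The
function name and dictionary layout are illustrative only; the weights
are the ones stated in this section, and the no-assignment case is an
assumption inferred from the 85% figure.

```python
def grade_weights(num_assignments: int) -> dict[str, int]:
    """Return percentage weights for each graded component.

    Hypothetical helper: weights are taken from the syllabus text;
    the zero-assignment case is inferred from the stated 85% figure.
    """
    # Combined weight of the short assignments, by how many are given.
    assignment_weight = {0: 0, 1: 20, 2: 35}
    if num_assignments not in assignment_weight:
        raise ValueError("the syllabus only describes 0, 1, or 2 assignments")

    assignments = assignment_weight[num_assignments]
    proposal = 15  # five-page proposal due at approximately midterm
    final = 100 - assignments - proposal  # paper or project gets the remainder
    return {"assignments": assignments, "proposal": proposal, "final": final}


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, grade_weights(n))
```

In every case the three components sum to 100%, and there are no exams
to account for.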
Project Teams:
Teams of two (and in special cases with explicit instructor approval,
teams of three) are permitted, in which case three ground rules
apply. First, the amount of work expected is proportional to team
size, and the paper or project will be graded relative to this
expectation. Second, identical grades will be given to each member of
the team, and the instructor will not entertain any appeals concerning
individuals' relative contributions. Third, proposed teaming
arrangements become final five weeks before deliverables are due;
prior to this time a proposed team can dissolve (or a new one can
form) on the approval of a revised proposal.
Course Materials:
Course materials will consist mainly of reprints of research articles
from various sources. Some of the systems are available for
exploration on the Web. Generally, reprints will be made available
before the lectures in which they are presented. Throughout the
course, the course web page will be updated with links to those pages
which demonstrate some of the various concepts covered in the course;
see http://www.columbia.edu/~cs4735