Machine Learning (W4771)
Homework 1
Due Date:
Tuesday, February 1.
Reading:
Chapters 1 and 2 ("Machine Learning" by Mitchell)
Homework problems. Answers to problems 1 and 3 are fairly open-ended
-- their purpose is to stimulate thinking about machine learning.
However, the other questions have correct and incorrect answers.
- Exercise 1.2 from the text. Please restrict yourself to 1 page max.
- Why use machine learning methods, when are they necessary? To answer
this questions, give an example hypothesis
space H1 for which an exhaustive search would suffice to find a good hypothesis, and a contrasting
one H2 for which exhaustive search is not possible.
- Consider the application of machine learning to the game Tritris (a
variation of Tetris) as described in class on day 1.
- a) By what means could we determine that the results of these
experiments are in any way "good"? That is, can we verify that the system
has actually learned something non-trivial about the problem domain (Tritris)
that could not be stumbled upon randomly? For
example, are there simple or purely random approaches that may perform just
as well? Answer in one paragraph, please.
- b) A general-to-specific ordering of hypotheses, as described in
Chapter 2, provides a systematic principle on which to base a search for a
good hypothesis. In general, what principles does the genetic algorithm
seem to be based on? Please provide another, alternative principle, such as an
ordering, on which a search over the Tritris hypotheses (function trees)
could be based. Do you think your principle would help or hurt in
comparison to the genetic algorithm?
- c) The learning system only finds out a hypothesis' score after
a complete game has been played; there is no direct feedback after each game piece
is dropped onto the game board. Please provide a method to
address this problem without incorporating any information or
hints about the Tritris domain.
- Consider a learning problem where each instance is described by a
conjunction of n boolean features, and each hypothesis is a
DISJUNCTION of literals over these features. Propose an algorithm that
produces a consistent hypothesis if one exists. Your algorithm should run
in time that is polynomial in n and in the number of training
examples.
- Exercise 2.4 (parts a, b and c only) from the text.
email: evs at cs dot columbia dot edu