Programming Project 1,
Part 2
(worth 25% of overall Project 1 grade)
Your team and "project mentor"
You will do Part 2 of Project 1 with the same team as for Part
1. If your team partner dropped the class and you did not
submit a contingency plan for this with your Part 1
submission, then unfortunately you will still have to complete
the whole project by yourself. If you team partner dropped the
class and you did submit a contingency plan for this with your
Part 1 submission, then you are welcome to switch to this reduced
version of your project.
TAs for this will
be your "project mentor" for the remainder of your
project, even if you had your Part 1 meeting with the instructor. Needless to say, you are welcome to
contact the
instructor about your project, but the TAs should be
your main contact, for continuity, since they will be grading all parts of the
project.
As mentioned earlier, both students in a team will receive the
same grade for Project 1. Team partners are expected to fully
collaborate with each other on solving the project. However,
communication about project details with somebody other than your
partner is not permitted, and is considered cheating. If in doubt
about what kinds of consultations are allowed, check with the
instructor, or see the Computer Science Department's policies and procedures on academic honesty.
Questions of a general nature that may be of interest to the whole class should
be posted to the CourseWorks discussion board.
Computer accounts
As we mentioned in Part 1, you will need a CS account to do
Part 2 (and Part 3) of Project 1. Consequently, at least one of the team members
should have a CS account. If neither of you have a CS account, please open one
from https://www.cs.columbia.edu/~crf/accounts/cs.html.
Choose the appropriate "student" category as the "account type." There
is a $50 charge to open a CS account. Please refer to CRF's homepage for
details on infrastructure and policies of the CS department.
Grading of Part 1
Teams with only in-class students can pick up their graded Part 1 in class
On Monday Feb 18.
Once you have received your graded Part 1, please modify your database according
to your mentor's instructions. Meanwhile, please start getting
familiarized with PostgreSQL and implement your database.
Overview of Part 2 of Project 1
For Part 2 of this project, which you should
submit electronically, you should implement your
relations of Part 1 over PostgreSQL
and load your tables with some real or realistic data, as you
outlined in Part 1. (In Part 3, you will get to build your
application, including a simple web front-end, on top of this
database.)
What you need to do for Part 2
- Familiarize yourself with the PostgreSQL DBMS by reading
the materials available at
http://www.postgresql.org/docs/.
Our servers are running PostgreSQL 8.1 For a PostgreSQL reference manual, check
http://www.postgresql.org/docs/8.1/interactive/index.html.
You can also find all sorts of documentation on PostgreSQL from
http://www.postgresql.org/docs/techdocs.
Please check these materials carefully before
sending email to the class staff with questions
on syntax or supported features. You don't need to turn
anything in for this part. Suggestion: The first time you run
psql (see below), type "\?" to see a list of the "slash commands"
available. You may want to pay close attention to the "\i" command.
- Set up your PostgreSQL database and change your
PostgreSQL account password (see below). You
only need to set up one PostgreSQL account/database per
team. You don't need to turn anything in for this part.
- Make all suggested changes to the overall design in
general, and to the SQL schema in particular, once you receive your graded Part 1.
You are likely to have extensive comments from your
project mentor in your graded Part 1. Your Part 2 grade
will be based in part on how well you have incorporated
your project mentor's comments. If you have any questions
about these comments once you received them, please contact your project mentor
as soon as possible to clarify.
- Create all the SQL tables in your
revised SQL schema on your PostgreSQL account, including
all constraints that you could specify in the
table declarations.
- Add any additional attribute-based CHECK
constraints and tuple-based CHECK constraints (as
discussed in class) that you need so as to express any
real-world constraints of your application that are
missing from your SQL schema. Note that PostgreSQL does
not support general assertions, and
that triggers are slightly different from our class description.
You do not need to use triggers for this project, and you
can ignore any real-world constraints that you could not
capture with either good-style (as
discussed in class) attribute- or tuple-based CHECK
constraints, or with PRIMARY KEY, UNIQUE, FOREIGN KEY,
and NOT NULL constructs. (Hint: "Good-style"
attribute- or tuple-based CHECK constraints tend to refer
only to the table in which they are defined, not to other
tables.)
- Insert into each table in your database at least
10 tuples of real or "realistic" data,
as you described in your Part 1 project description. This
data will help you test and play with your database. Of
course, issue queries of your choice to make sure that
everything works as you intend.
What to submit and when
You will submit this part of the project electronically
on Courseworks directly. The deadline
is Wednesday February 27 Here is how you should proceed:
- Create a directory named <your-UNI>-proj1part2, where you should
replace <your-UNI> with the Columbia UNI of one teammate (for
example, if the teammate's UNI is abc123, then the directory should
be named abc123-proj1part2).
- Generate the following two plain-text files inside the <your-UNI>-proj1part2
directory:
- File pgdump.txt: This plain-text file has the result of running the
pg_dump
command from your CS account shell (i.e., not
from the PostgreSQL client). pg_dump
"extracts" your PostgreSQL database
into a script that could be used to "reconstruct"
the database, and which includes the schema of
your database plus the data that you entered. (For
more details, run man pg_dump or
check the PostgreSQL documentation on the web.)
Specifically, run:
pg_dump -h w4111b.cs.columbia.edu -u -f
pgdump.txt -d
<p_user>
to get the content of your database dumped onto
file pgdump.txt. w4111b.cs.columbia.edu
and <p_user> are defined
below in the "Setting up..." section.)
You
should of course run pg_dump after
you have satisfactorily created all tables and constraints, and loaded
the tables with data as specified above.
- File queries.txt: This plain-text file has three "interesting"
SQL queries over your database, with a sentence
or two per query explaining what the query is
supposed to compute. You will not be graded on
these queries. Their goal is simply to help us
better understand your application.
- Tar and gzip the <your-UNI>-proj1part2
directory, to generate a single file <your-UNI>-proj1part2.tar.gz
(containing the pgdump.txt and queries.txt files), which is the file that
you will submit.
- Login to Courseworks at
https://courseworks.columbia.edu/ and select the site for our class.
- Select "Class Files" and then "Post File."
- Enter "Project 1, Part 2" in the "Title" field and select
your <your-UNI>-proj1part2.tar.gz file in the "File" field using the
"Browse" button.
- Select folder "Project 1, Part 2" (under "Shared Folders") in the
"Post File To" field.
- Enter the names of the members of the team and their UNI in the
"Comments" field.
- Hit the "Submit" button.
Late policy for projects
Please check the class webpage for the lateness policy.
Grading for Part 2
Your grade for Part 2 of Project 1 will be split as follows:
- Quality of final SQL schema and implementation on
PostgreSQL: 10 points.
We will evaluate the overall quality of your final SQL
schema on PostgreSQL, especially in terms of how
thoroughly you incorporated any revisions suggested by
your project mentor in the grading of Part 1 of your
project.
- Quality of constraint handling: 10 points.
We will evaluate how well you managed to capture real-world
constraints through primary key, foreign key, unique, and
attribute- and tuple-based CHECK constraints.
- Quality of the real-world (or at least realistic)
data that you loaded into the database: 5 points.
Setting up your PostgreSQL database
- Get a CS account if neither you nor your team-mate
has one. (See above.)
- Use ssh to clic.cs.columbia.edu
to access Linux machines. (See below for special
instructions for Solaris machines.)
- You should receive an email from the course staff with your
database username and password. This email will be sent to your
@columbia.edu email address. If you have not received such an email, contact
Nirav at nns2108@columbia.edu immediately.
In the steps below you must substitute your database username
for <p_user>.
Also, note that you only need to set up one single PostgreSQL
account/database per team, even though your team
was assigned one PostgreSQL account per team member. Pick
just one of the two accounts arbitrarily and work on it
together with your team-mate.
- For this class we have following database and
web server:
w4111b.cs.columbia.edu
CRF has
asked us to request that students in cs4111 not call or send
emergency tickets to CRF regarding a crashed database server (or any other
such problem) outside of CRF's working hours.. Therefore, use the pg_dump command
frequently so you can restore your data on the other machine if your
development machine crashes.
- Login to PostgreSQL:
psql -h w4111b.cs.columbia.edu -U <p_user> -d <db_name>
Your <p_user> username and
<db_name> database names are provided above. The database name should be the same as your username.
At the prompt, enter your password. You should now be logged into your database through
psql. Your prompt should now look like:
<db_name>=#
- You may change your password at the PostgreSQL prompt by typing:
ALTER USER <p_user> WITH PASSWORD '<your new password>';
where <your new password> is of course your new
password of choice. Note that SQL commands issued from within
the psql client need to end in ";" to be sent to the
server.
- You can now enter any SQL commands at the psql prompt. You may exit
psql by entering \q and pressing <enter>.
SQL NOTE: As mentioned in Step 6, all SQL commands
must terminate with a semicolon. Pressing <enter> without a semicolon
will continue the same command on the next line.
SOLARIS NOTE: If you are using one of the Computer Science Department's
Solaris machines, add the following line to your .bashrc file:
export PATH=$PATH:/opt/postgresql-7.4.5/bin
OTHER INFORMATION:
You can omit the -h <hostname> from you psql command by adding the following to your
.bashrc file:
export PGHOST=<hostname>
where <hostname> is the server for this class.
If you are having trouble connecting, you may need to add:
export PGPORT=5432
to your .bashrc file.
After you have modified your .bashrc file, either execute:
source .bashrc
or logout and then log back in so that the new variables take effect.
If you are still having problems connecting, then create a .bash_login
file with line:
export PATH=$PATH:/opt/postgresql-7.4.5/bin
(or, if you already have a .bash_login file, just add the above line to
it).
Important note on backups
To avoid losing any data, we strongly suggest that you periodically use the
pg_dump command described above to generate a file with all your schema
definitions and with the data that you uploaded to your database. You should
then save this file in a safe place. For a variety of technical reasons that are
not under our control, we cannot provide backups of the contents of your
PostgreSQL account, so do use the pg_dump command frequently.
- Q: Why use PostgreSQL for Parts
2 and 3 of Project 1? Can I use my favorite DBMS instead?
A: As much as we would like to be more flexible, we
just don't have the staff to handle several diverse systems and platforms.
Unfortunately, you cannot use any other DBMS.
- Q: Why are we using PostgreSQL instead of MySQL?
A: PostgreSQL generally supports more
advanced functionality than MySQL does.
- Q: Can I run a PostgreSQL client
on my Windows machine? How do I do that?
A: You might be able to do this, but we
will not support it in any way. Again, we just don't have
the staff to do so. We have installed clients for both
Linux and Solaris on the CS machines, so you don't have
to install any software whatsoever as long as you run a
client on a CS machine running Linux or Solaris.
- Q: How do I learn more about the psql client and
pg_dump?
A: Run man psql and man
pg_dump on your CS account.
- Q: How do I learn more about PostgreSQL? Is
<feature X> supported on PostgreSQL?
A: Please check
http://www.postgresql.org/docs/8.1/static/index.html and
http://www.postgresql.org/docs/techdocs.
- Q: I can't log in on the machines running
PostgreSQL servers using the userid and password as specified above. How do I fix this?
A: You do not have
accounts on the machines running the
PostgreSQL servers; your accounts are for PostgreSQL only.
This means that you cannot ssh to the machine that
runs the PostgreSQL server. Rather, you should
login to one of the CLIC machines using your regular CS
account userid, and follow the instructions above on starting a client
locally to connect to the remote
PostgreSQL server.
- Q: I can't find my .bashrc file. How do I
proceed?
A: You should simply create a plain-text file
called .bashrc in your home directory
and paste the commands on the project description into
the file.
Kenneth Ross
kar@cs.columbia.edu