COMS W4111-Database Systems
Spring 2008

Programming Project 1, Part 2
(worth 25% of overall Project 1 grade)

FAQs

Your team and "project mentor"

You will do Part 2 of Project 1 with the same team as for Part 1. If your team partner dropped the class and you did not submit a contingency plan for this with your Part 1 submission, then unfortunately you will still have to complete the whole project by yourself. If you team partner dropped the class and you did submit a contingency plan for this with your Part 1 submission, then you are welcome to switch to this reduced version of your project.

TAs for this will be your "project mentor" for the remainder of your project, even if you had your Part 1 meeting with the instructor. Needless to say, you are welcome to contact the instructor about your project, but the TAs should be your main contact, for continuity, since they will be grading all parts of the project.

As mentioned earlier, both students in a team will receive the same grade for Project 1. Team partners are expected to fully collaborate with each other on solving the project. However, communication about project details with somebody other than your partner is not permitted, and is considered cheating. If in doubt about what kinds of consultations are allowed, check with the instructor, or see the Computer Science Department's policies and procedures on academic honesty. Questions of a general nature that may be of interest to the whole class should be posted to the CourseWorks discussion board.

Computer accounts

As we mentioned in Part 1, you will need a CS account to do Part 2 (and Part 3) of Project 1. Consequently, at least one of the team members should have a CS account. If neither of you have a CS account, please open one from https://www.cs.columbia.edu/~crf/accounts/cs.html. Choose the appropriate "student" category as the "account type." There is a $50 charge to open a CS account. Please refer to CRF's homepage for details on infrastructure and policies of the CS department.

Grading of Part 1

Teams with only in-class students can pick up their graded Part 1 in class On Monday Feb 18. Once you have received your graded Part 1, please modify your database according to your mentor's instructions. Meanwhile, please start getting familiarized with PostgreSQL and implement your database.

Overview of Part 2 of Project 1

For Part 2 of this project, which you should submit electronically, you should implement your relations of Part 1 over PostgreSQL and load your tables with some real or realistic data, as you outlined in Part 1. (In Part 3, you will get to build your application, including a simple web front-end, on top of this database.)

What you need to do for Part 2

  1. Familiarize yourself with the PostgreSQL DBMS by reading the materials available at http://www.postgresql.org/docs/. Our servers are running PostgreSQL 8.1 For a PostgreSQL reference manual, check http://www.postgresql.org/docs/8.1/interactive/index.html. You can also find all sorts of documentation on PostgreSQL from http://www.postgresql.org/docs/techdocs. Please check these materials carefully before sending email to the class staff with questions on syntax or supported features. You don't need to turn anything in for this part. Suggestion: The first time you run psql (see below), type "\?" to see a list of the "slash commands" available. You may want to pay close attention to the "\i" command.
  2. Set up your PostgreSQL database and change your PostgreSQL account password (see below). You only need to set up one PostgreSQL account/database per team. You don't need to turn anything in for this part.
  3. Make all suggested changes to the overall design in general, and to the SQL schema in particular, once you receive your graded Part 1. You are likely to have extensive comments from your project mentor in your graded Part 1. Your Part 2 grade will be based in part on how well you have incorporated your project mentor's comments. If you have any questions about these comments once you received them, please contact your project mentor as soon as possible to clarify.
  4. Create all the SQL tables in your revised SQL schema on your PostgreSQL account, including all constraints that you could specify in the table declarations.
  5. Add any additional attribute-based CHECK constraints and tuple-based CHECK constraints (as discussed in class) that you need so as to express any real-world constraints of your application that are missing from your SQL schema. Note that PostgreSQL does not support general assertions, and that triggers are slightly different from our class description. You do not need to use triggers for this project, and you can ignore any real-world constraints that you could not capture with either good-style (as discussed in class) attribute- or tuple-based CHECK constraints, or with PRIMARY KEY, UNIQUE, FOREIGN KEY, and NOT NULL constructs. (Hint: "Good-style" attribute- or tuple-based CHECK constraints tend to refer only to the table in which they are defined, not to other tables.)
  6. Insert into each table in your database at least 10 tuples of real or "realistic" data, as you described in your Part 1 project description. This data will help you test and play with your database. Of course, issue queries of your choice to make sure that everything works as you intend.

What to submit and when

You will submit this part of the project electronically on Courseworks directly. The deadline is Wednesday February 27 Here is how you should proceed:

  1. Create a directory named <your-UNI>-proj1part2, where you should replace <your-UNI> with the Columbia UNI of one teammate (for example, if the teammate's UNI is abc123, then the directory should be named abc123-proj1part2).
  2. Generate the following two plain-text files inside the <your-UNI>-proj1part2 directory:
  1. File pgdump.txt: This plain-text file has the result of running the pg_dump command from your CS account shell (i.e., not from the PostgreSQL client). pg_dump "extracts" your PostgreSQL database into a script that could be used to "reconstruct" the database, and which includes the schema of your database plus the data that you entered. (For more details, run man pg_dump or check the PostgreSQL documentation on the web.) Specifically, run:

    pg_dump -h w4111b.cs.columbia.edu -u -f pgdump.txt -d <p_user>

    to get the content of your database dumped onto file pgdump.txt. w4111b.cs.columbia.edu and <p_user> are defined below in the "Setting up..." section.) You should of course run pg_dump after you have satisfactorily created all tables and constraints, and loaded the tables with data as specified above.
  2. File queries.txt: This plain-text file has three "interesting" SQL queries over your database, with a sentence or two per query explaining what the query is supposed to compute. You will not be graded on these queries. Their goal is simply to help us better understand your application.
  1. Tar and gzip the <your-UNI>-proj1part2 directory, to generate a single file <your-UNI>-proj1part2.tar.gz (containing the pgdump.txt and queries.txt files), which is the file that you will submit.
  2. Login to Courseworks at https://courseworks.columbia.edu/ and select the site for our class.
  3. Select "Class Files" and then "Post File."
  4. Enter "Project 1, Part 2" in the "Title" field and select your <your-UNI>-proj1part2.tar.gz file in the "File" field using the "Browse" button.
  5. Select folder "Project 1, Part 2" (under "Shared Folders") in the "Post File To" field.
  6. Enter the names of the members of the team and their UNI in the "Comments" field.
  7. Hit the "Submit" button.

Late policy for projects

Please check the class webpage for the lateness policy.

Grading for Part 2

Your grade for Part 2 of Project 1 will be split as follows:

  1. Quality of final SQL schema and implementation on PostgreSQL: 10 points.
    We will evaluate the overall quality of your final SQL schema on PostgreSQL, especially in terms of how thoroughly you incorporated any revisions suggested by your project mentor in the grading of Part 1 of your project.
  2. Quality of constraint handling: 10 points.
    We will evaluate how well you managed to capture real-world constraints through primary key, foreign key, unique, and attribute- and tuple-based CHECK constraints.
  3. Quality of the real-world (or at least realistic) data that you loaded into the database: 5 points.

Setting up your PostgreSQL database

  1. Get a CS account if neither you nor your team-mate has one. (See above.)
  2. Use ssh to clic.cs.columbia.edu to access Linux machines. (See below for special instructions for Solaris machines.)
  3. You should receive an email from the course staff with your database username and password. This email will be sent to your @columbia.edu email address. If you have not received such an email, contact Nirav at nns2108@columbia.edu immediately.

  4. In the steps below you must substitute your database username for <p_user>. Also, note that you only need to set up one single PostgreSQL account/database per team, even though your team was assigned one PostgreSQL account per team member. Pick just one of the two accounts arbitrarily and work on it together with your team-mate.
     
  5. For this class we have following database and web server:

    w4111b.cs.columbia.edu

    CRF has asked us to request that students in cs4111 not call or send emergency tickets to CRF regarding a crashed database server (or any other such problem) outside of CRF's working hours.. Therefore, use the pg_dump command frequently so you can restore your data on the other machine if your development machine crashes.
  6. Login to PostgreSQL:

    psql -h w4111b.cs.columbia.edu -U <p_user> -d <db_name>

    Your <p_user> username and <db_name> database names are provided above. The database name should be the same as your username. At the prompt, enter your password. You should now be logged into your database through psql. Your prompt should now look like:

    <db_name>=#

  7. You may change your password at the PostgreSQL prompt by typing:

    ALTER USER <p_user> WITH PASSWORD '<your new password>';

    where <your new password> is of course your new password of choice. Note that SQL commands issued from within the psql client need to end in ";" to be sent to the server.
  8. You can now enter any SQL commands at the psql prompt. You may exit psql by entering \q and pressing <enter>.

SQL NOTE: As mentioned in Step 6, all SQL commands must terminate with a semicolon. Pressing <enter> without a semicolon will continue the same command on the next line.

SOLARIS NOTE: If you are using one of the Computer Science Department's Solaris machines, add the following line to your .bashrc file:

export PATH=$PATH:/opt/postgresql-7.4.5/bin

OTHER INFORMATION:
You can omit the -h <hostname> from you psql command by adding the following to your .bashrc file:

export PGHOST=<hostname>

where <hostname> is the server for this class.

If you are having trouble connecting, you may need to add:

export PGPORT=5432

to your .bashrc file.

After you have modified your .bashrc file, either execute:

source .bashrc

or logout and then log back in so that the new variables take effect.

If you are still having problems connecting, then create a .bash_login file with line:

export PATH=$PATH:/opt/postgresql-7.4.5/bin

(or, if you already have a .bash_login file, just add the above line to it).

 


Important note on backups

To avoid losing any data, we strongly suggest that you periodically use the pg_dump command described above to generate a file with all your schema definitions and with the data that you uploaded to your database. You should then save this file in a safe place. For a variety of technical reasons that are not under our control, we cannot provide backups of the contents of your PostgreSQL account, so do use the pg_dump command frequently.

Frequently-asked questions


Kenneth Ross
kar@cs.columbia.edu