The following procedure has been tested on a Red Hat Linux system.
Please make any necessary modifications for your platform.

PART I: Data Load

NOTE: The order of execution of these commands is important.

1) Set up the database schema
  (a) Make sure that Oracle is installed and that you are able to execute
      SQL*Plus and SQL*Loader.

  shell> which sqlplus sqlldr
   /opt/oracle/product/10.2.0/db_1/bin/sqlplus
   /opt/oracle/product/10.2.0/db_1/bin/sqlldr
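
  The check in (a) can also be scripted.  The following is a minimal
  sketch; check_tools is a hypothetical helper name and is not part of
  this distribution.

```shell
# Hypothetical helper (not part of the distribution): report which of the
# named tools are missing from PATH; returns non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_tools sqlplus sqlldr
```

  If either tool is reported missing, add the Oracle bin directory to
  your PATH before continuing.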

  (b) Create an Oracle user (e.g. snp), and execute Tables.sql to create
      the user's schema.
    
   shell> sqlplus snp/pass@SID
   SQL> @Tables.sql

  (c) Load data into auxiliary tables

  shell> sqlldr userid=snp/pass@SID control=ctl/population.ctl
  shell> sqlldr userid=snp/pass@SID control=ctl/technology.ctl
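
  Because the order of the loads matters, it may help to run them from
  a small wrapper that stops at the first failure.  This is a sketch
  only; load_ctl, SQLLDR and DB_CONN are assumed names, not part of
  this distribution.

```shell
# Hypothetical helper: run SQL*Loader for each control file, in the order
# given, and stop at the first failure.
# SQLLDR and DB_CONN are assumed variable names; override them as needed.
SQLLDR="${SQLLDR:-sqlldr}"
DB_CONN="${DB_CONN:-snp/pass@SID}"

load_ctl() {
    for ctl in "$@"; do
        echo "Loading $ctl"
        "$SQLLDR" userid="$DB_CONN" control="$ctl" || {
            echo "load failed on $ctl" >&2
            return 1
        }
    done
}

# Usage: load_ctl ctl/population.ctl ctl/technology.ctl
```

  Note that sqlldr is expected to return a non-zero exit code for
  warnings (such as rejected rows) as well as for errors, so check the
  .log file when the wrapper stops.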

2) Loading HapMap data

  (a) Download frequency data from HapMap into a new directory. 

  shell> ftp hapmap.org (username = anonymous, password = email address)

  ftp> cd frequencies/latest/rs_strand/non-redundant
  ftp> mget allele_freqs_*CEU*
  ftp> mget allele_freqs_*YRI*
  ftp> mget allele_freqs_*JPT+CHB*
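
  The download can also be run without an interactive session.  The
  sketch below only prints the ftp commands; hapmap_ftp_cmds is a
  hypothetical helper, and the usage line assumes a classic ftp client
  that accepts -i (no mget prompts) and -n (no auto-login).

```shell
# Hypothetical helper: print the ftp commands that fetch the allele
# frequency files for the given populations.
# $1 = email address used as the anonymous password, rest = populations.
hapmap_ftp_cmds() {
    email="$1"; shift
    echo "user anonymous $email"
    echo "cd frequencies/latest/rs_strand/non-redundant"
    for pop in "$@"; do
        echo "mget allele_freqs_*${pop}*"
    done
    echo "bye"
}

# Usage (assumes your ftp client supports -i and -n):
#   hapmap_ftp_cmds you@example.org CEU YRI JPT+CHB | ftp -i -n hapmap.org
```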

  (b) Extract and pre-process HapMap data.

  shell> prepHapMap.sh <data directory>

  (c) Load HapMap data into the database; check the file
      ctl/hapmap.ctl to make sure that it is pointing to the
      right location for hapmap.csv. 

  shell> sqlldr userid=snp/pass@SID control=ctl/hapmap.ctl
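
  The path check in (c) can be automated for any control file.  This
  sketch assumes that the control file names its data file in a
  single-line clause of the form INFILE 'path'; ctl_infile and
  check_ctl are hypothetical helper names.

```shell
# Hypothetical helper: print the data file named in the first INFILE clause
# of a control file.  Assumes INFILE 'path' or INFILE "path" on one line.
ctl_infile() {
    sed -n "s/.*[Ii][Nn][Ff][Ii][Ll][Ee][[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$1" | head -n 1
}

# Hypothetical helper: succeed only if that data file exists and is readable.
# Relative paths are resolved against the current directory.
check_ctl() {
    data=$(ctl_infile "$1")
    if [ -n "$data" ] && [ -r "$data" ]; then
        echo "$1: OK ($data)"
    else
        echo "$1: data file '$data' not found or not readable" >&2
        return 1
    fi
}

# Usage: check_ctl ctl/hapmap.ctl
```

  Run the check from the same directory in which you run sqlldr, so
  that relative paths resolve the same way.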

  (d) Optional: remove the directory with downloaded and processed
      files to save space.

3) Loading Marker Association data

  (a) Create a new directory and download the following files from 
      www.cs.columbia.edu/~itsik/projects/Tagging/20060929/

      Affymetrix*both* 
      Illumina*

   You may want to disregard the Illumina*250k* files, 
   because they are only available for the JPT+CHB population.

   (b) Make sure that geneMarkers.jar is in your CLASSPATH and execute
       a script that processes marker association data.

    shell> prepMarker.sh <data directory>
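
    Before running the script, you can confirm that CLASSPATH mentions
    the jar.  A minimal sketch follows; jar_in_classpath is a
    hypothetical helper, and it only inspects the CLASSPATH string (it
    does not verify that the file exists).

```shell
# Hypothetical helper: succeed if some CLASSPATH entry is the named jar,
# either as a bare file name or as the last component of a path.
jar_in_classpath() {
    case ":${CLASSPATH:-}:" in
        *"/$1:"* | *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   jar_in_classpath geneMarkers.jar || echo "geneMarkers.jar is not in CLASSPATH" >&2
```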

   (c) Load marker association data into the database; check the control
       files ctl/single_marker.ctl and ctl/double_marker.ctl to make sure
       that they are pointing to the right locations for their input files.

   shell> sqlldr userid=snp/pass@SID control=ctl/single_marker.ctl
   shell> sqlldr userid=snp/pass@SID control=ctl/double_marker.ctl

   (d) Optional: remove the directory with downloaded and processed
      files to save space.

4) Loading OMIM data

   (a) Download the OMIM dataset from
       ftp://ftp.ncbi.nih.gov/repository/OMIM/omim.txt.Z.

   (b) Unzip omim.txt.Z

       shell> gunzip omim.txt.Z

   (c) Make sure that geneMarkers.jar is in your CLASSPATH and execute
       a program that parses omim.txt.

       shell> prepOMIM.sh <directory where omim.txt is located>

   (d) Load omim_snp_map.txt and omim_title.txt into the database.  Make sure
       that paths to these files are correct in the corresponding SQL*Loader
       control files.

   shell> sqlldr userid=snp/pass@SID control=ctl/omim.ctl
   shell> sqlldr userid=snp/pass@SID control=ctl/snp_omim_map.ctl

   (e) Optional: remove omim.txt.Z, omim.txt, omim_title.csv and
       snp_omim_map.csv to save space.

5) Creating materialized views (repeat (a) and (b) when new SNP,
   association or OMIM data is loaded into the database)

   (a) Gather schema statistics, so that materialized view creation 
       is as fast as possible

   shell> sqlplus snp/pass@SID
   SQL> exec dbms_stats.gather_schema_stats('SNP');

   (b) Execute Views.sql to build materialized views

   SQL> @Views.sql;

   (c) Register the Markers PL/SQL package with the database.

   SQL> @Markers.sql;
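
   Since (a) and (b) are repeated after every data load, they can be
   kept in a small script.  The sketch below only prints the SQL*Plus
   commands; views_sql is a hypothetical helper name, and the WHENEVER
   line is an addition that makes sqlplus exit with a failure status on
   the first SQL error.

```shell
# Hypothetical helper: print the SQL*Plus commands for steps (a) and (b).
views_sql() {
    cat <<'EOF'
WHENEVER SQLERROR EXIT FAILURE
exec dbms_stats.gather_schema_stats('SNP');
@Views.sql
exit
EOF
}

# Usage (run from the directory that contains Views.sql):
#   views_sql | sqlplus -s snp/pass@SID
```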

PART II: Using the System for SNP-based Diagnostics

This section demonstrates the use of the Markers PL/SQL package
from the command line. All functions may also be invoked
from a program, using JDBC or a similar interface.

6) Test suite: randomly generating a genotype
   
    Log on to the database and execute the function Markers.Random_Individual,
    specifying gender (F/M), population (CEU, YRI or JPT+CHB), and number of haplotypes.
   
    shell> sqlplus snp/pass@SID
    SQL> select Markers.Random_Individual('F', 'JPT+CHB', 2) from dual;

    The call returns the unique id of the generated individual.  Two tables
    are populated: Individual and Haplotype.

7) Loading IntraGenDB genotype data for querying

   (a) Generate a random individual with 0 haplotypes (as in (6)), and note
       the individual's id.
   
   (b) Reserve two haplotype ids by executing the following command TWICE:

       SQL> select HapId_Seq.nextval from dual;

   (c) Call prepIntraGenDB.sh <data dir> <indId> <hapId 1> <hapId 2> 
 
   (d) Load the generated hap.csv with sqlldr, using haplotype.ctl
      
   shell> sqlldr userid=snp/pass@SID control=ctl/haplotype.ctl
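
   Steps (a) through (c) need values returned by the database.  One way
   to capture them in a shell script is sketched below.  sql_scalar,
   SQLPLUS and DB_CONN are assumed names, not part of this
   distribution, and the sketch assumes that the query returns a single
   value.

```shell
# Hypothetical helper: run one query through SQL*Plus in silent mode and
# print its single result with all whitespace removed.
# SQLPLUS and DB_CONN are assumed variable names; override them as needed.
SQLPLUS="${SQLPLUS:-sqlplus}"
DB_CONN="${DB_CONN:-snp/pass@SID}"

sql_scalar() {
    printf 'set heading off feedback off pagesize 0 verify off\n%s\nexit\n' "$1" |
        "$SQLPLUS" -s "$DB_CONN" | tr -d '[:space:]'
}

# Usage:
#   ind_id=$(sql_scalar "select Markers.Random_Individual('F', 'CEU', 0) from dual;")
#   hap1=$(sql_scalar "select HapId_Seq.nextval from dual;")
#   hap2=$(sql_scalar "select HapId_Seq.nextval from dual;")
#   prepIntraGenDB.sh <data dir> "$ind_id" "$hap1" "$hap2"
```

   Check that each captured value is a number before passing it on; an
   error message from SQL*Plus would otherwise end up in the variable.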

8) Logging and executing a diagnostic request

   (a) Log on to the database and execute the Markers.Log_Request function.

   Arguments: <individual id>    number (from Individual.id)
              <haplotype id>     number (from Haplotype.id)
              <population id>    CEU|YRI|JPT+CHB
              <technology id>    0 for Affymetrix, 1 for Illumina
              <resolution>       number
              <min confidence>   number

   Either individual id or haplotype id must be specified.  If individual id
   is specified, the request looks for associations for all haplotypes of the
   individual.  If both are specified and conflict (e.g. haplotype id is
   invalid for the given individual id), haplotype id takes precedence.
   
   All other arguments are optional.  The call returns a request id.

    shell> sqlplus snp/pass@SID
    SQL> select Markers.Log_Request(<indId>, null, 'JPT+CHB', null, null, 0.7) from dual;

  (b) Initiate the association request by calling Markers.Associate.

    SQL> select Markers.Associate(<reqId>) from dual;

    This call populates the Results_Full and Results_Summary tables.
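
    When requests are logged from a script, the optional arguments have
    to be passed as null.  The sketch below builds the Log_Request
    statement from shell arguments; log_request_sql and sql_arg are
    hypothetical helper names, and the argument order follows the list
    in (a).

```shell
# Hypothetical helper: print the argument, or the SQL keyword null if empty.
sql_arg() {
    if [ -n "$1" ]; then echo "$1"; else echo null; fi
}

# Hypothetical helper: print a Log_Request query.
# $1 individual id, $2 haplotype id, $3 population, $4 technology,
# $5 resolution, $6 min confidence; pass "" to omit an argument.
log_request_sql() {
    if [ -z "$1" ] && [ -z "$2" ]; then
        echo "individual id or haplotype id must be specified" >&2
        return 1
    fi
    pop=null
    if [ -n "$3" ]; then pop="'$3'"; fi
    echo "select Markers.Log_Request($(sql_arg "$1"), $(sql_arg "$2"), $pop, $(sql_arg "$4"), $(sql_arg "$5"), $(sql_arg "$6")) from dual;"
}

# Usage: log_request_sql 17 "" "JPT+CHB" "" "" 0.7
```

    The values are inserted into the statement as given, so pass only
    trusted input.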

9) Generating XML output

   The Markers package implements several XML output routines.  To output XML
   to the file system, log on to the database and execute one of the
   functions listed below (all arguments are required), spooling the
   output to a file.  For example:

    shell> sqlplus snp/pass@SID

    SQL> set lines 120 pages 9999 long 1000000
    SQL> set verify off termout off head off feedback off echo off
    SQL> spool haplotype.xml

    SQL> select Markers.Haplotype_XML(<indId>) from dual;

    SQL> spool off
    SQL> set verify on termout on head on feedback on echo on

   (a) Individual_XML(<indId>) -- output individual's data (from the Individual
                                  table) in XML format

   (b) Haplotype_XML(<indId>)  -- output haplotype data for the individual  

   (c) Result_Full_XML(<reqId>) 

   (d) Result_Summary_XML(<reqId>)

   (e) Succeptible_XML(<reqId>, <string>) -- string is matched against OMIM
                                             titles (as a sub-string)
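
   The spool sequence shown above can be generated for any of the
   listed functions.  The following is a sketch only; xml_spool_sql is
   a hypothetical helper name.  The commands are written to a script
   file because "set termout off" suppresses screen output only for
   commands that SQL*Plus runs from a script.

```shell
# Hypothetical helper: print a SQL*Plus script that spools the output of
# one Markers XML function to a file.
# $1 = function name, $2 = output file, $3 = argument list as SQL text.
xml_spool_sql() {
    cat <<EOF
set lines 120 pages 9999 long 1000000
set verify off termout off head off feedback off echo off
spool $2
select Markers.$1($3) from dual;
spool off
exit
EOF
}

# Usage:
#   xml_spool_sql Haplotype_XML haplotype.xml 17 > haplotype_xml.sql
#   sqlplus -s snp/pass@SID @haplotype_xml.sql
```

   For functions that take a string argument, include the SQL quotes in
   the argument list, e.g. "17, 'diabetes'".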

