The following procedure has been tested on a Red Hat Linux system.
Please make any necessary modifications for your platform.

PART I: Data Load

NOTE: The order of execution of these commands is important.

(a) Make sure that Oracle is installed and that you are able to execute
    SQL*Plus and SQL*Loader.

    shell> which sqlplus sqlldr
    /opt/oracle/product/10.2.0/db_1/bin/sqlplus
    /opt/oracle/product/10.2.0/db_1/bin/sqlldr

(b) Create an Oracle user (e.g. snp), and execute SNP_Tables_102507.sql
    and SNP_Package_102507.sql to create the user's schema.

    shell> cd sql
    shell> sqlplus snp/pass@SID
    SQL> @SNP_Tables_102507.sql
    SQL> @SNP_Package_102507.sql
    SQL> exit;
    shell> cd ..

(c) Load the data.

    shell> cd ctl
    shell> sqlldr userid=snp/pass@SID control=population.ctl
    shell> sqlldr userid=snp/pass@SID control=technology.ctl
    shell> sqlldr userid=snp/pass@SID control=omim.ctl
    shell> sqlldr userid=snp/pass@SID control=snp_omim_map.ctl
    shell> sqlldr userid=snp/pass@SID control=single_omim_map.ctl
    shell> sqlldr userid=snp/pass@SID control=double_omim_map.ctl

(d) Check that the data was loaded properly. Log on to the database and
    check the number of rows in each of the 6 tables below. For example:

    shell> sqlplus snp/pass@SID
    SQL> select count(*) from Population;

    Commands and table names are NOT case-sensitive.

    Row counts should be as follows:

    1) Population            3
    2) Technology            6
    3) OMIM              18003
    4) SNP_OMIM_Map        187
    5) Single_OMIM_Map     905
    6) Double_OMIM_Map     396

    If any table has fewer rows than listed, check the file data/.log
    for errors.

PART II: Using the System

We provide a Java program that demonstrates how an individual's genetic
data may be loaded into the database, and how to invoke MutaGeneSys on
this data to generate disease susceptibility hypotheses. You may choose
to write your own load mechanism; this program is just an example.

The program accepts the following arguments, in this order. Please see
http://www.cs.columbia.edu/~jds1/MutaGeneSys/ for format information.

    db connect string -- user/pass@database, e.g. snp/snp@SID
    input_file        -- name of a file that stores snp ids and alleles
    output_file       -- file into which MutaGeneSys will write its output
    population        -- can be one of: 'any', 'CEU', 'JPT+CHB', 'YRI'
    technology        -- genotyping technology, can be one of:
                         'any', 'Illumina', 'Affymetrix'
    coefficient       -- coefficient of determination, a number between
                         0 and 1

In order to execute the program, you must have a JDBC library in your
classpath, or supply the path to this library to java when you invoke
the program. JDBC drivers are available for download from Oracle:
http://www.oracle.com/technology/software/tech/java/sqlj_jdbc/index.html
They should already be installed in $ORACLE_HOME/jdbc/lib.

For example:

    java MutaGeneSys snp/pass@SID snpSample.txt snpSample.out JPT+CHB any 0.3
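
If you write your own load mechanism, it may help to validate the six
arguments before opening a database connection. The class below is a
minimal sketch of such a check. It is NOT part of the MutaGeneSys
distribution: the class name MutaGeneSysArgs and its fields are
hypothetical, and it assumes that population and technology values are
matched exactly as spelled above and that the coefficient range includes
both 0 and 1.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical argument checker; not part of the MutaGeneSys distribution. */
final class MutaGeneSysArgs {
    // Allowed values, copied from the argument list in this README.
    static final List<String> POPULATIONS = Arrays.asList("any", "CEU", "JPT+CHB", "YRI");
    static final List<String> TECHNOLOGIES = Arrays.asList("any", "Illumina", "Affymetrix");

    final String user;
    final String password;
    final String database;
    final String inputFile;
    final String outputFile;
    final String population;
    final String technology;
    final double coefficient;

    private MutaGeneSysArgs(String user, String password, String database,
                            String inputFile, String outputFile,
                            String population, String technology, double coefficient) {
        this.user = user;
        this.password = password;
        this.database = database;
        this.inputFile = inputFile;
        this.outputFile = outputFile;
        this.population = population;
        this.technology = technology;
        this.coefficient = coefficient;
    }

    /** Parses and validates the arguments; throws IllegalArgumentException on bad input. */
    static MutaGeneSysArgs parse(String[] args) {
        if (args.length != 6) {
            throw new IllegalArgumentException("expected 6 arguments, got " + args.length);
        }
        // Split user/pass@database at the first '/' and the last '@'.
        String connect = args[0];
        int slash = connect.indexOf('/');
        int at = connect.lastIndexOf('@');
        if (slash <= 0 || at <= slash + 1 || at == connect.length() - 1) {
            throw new IllegalArgumentException("connect string must look like user/pass@database");
        }
        // Assumption: values must match the documented spelling exactly.
        if (!POPULATIONS.contains(args[3])) {
            throw new IllegalArgumentException("population must be one of " + POPULATIONS);
        }
        if (!TECHNOLOGIES.contains(args[4])) {
            throw new IllegalArgumentException("technology must be one of " + TECHNOLOGIES);
        }
        double coefficient;
        try {
            coefficient = Double.parseDouble(args[5]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("coefficient is not a number: " + args[5]);
        }
        // Assumption: the range is inclusive. The negated form also rejects NaN.
        if (!(coefficient >= 0.0 && coefficient <= 1.0)) {
            throw new IllegalArgumentException("coefficient must be between 0 and 1");
        }
        return new MutaGeneSysArgs(connect.substring(0, slash), connect.substring(slash + 1, at),
                connect.substring(at + 1), args[1], args[2], args[3], args[4], coefficient);
    }

    public static void main(String[] args) {
        try {
            MutaGeneSysArgs a = parse(args);
            // The password is deliberately left out of the summary.
            System.out.println("user=" + a.user + " database=" + a.database
                    + " population=" + a.population + " technology=" + a.technology
                    + " coefficient=" + a.coefficient);
        } catch (IllegalArgumentException e) {
            System.err.println("Invalid arguments: " + e.getMessage());
            System.exit(1);
        }
    }
}
```

The checker takes the same six arguments as the example invocation above,
so it can be run on a command line before the real program to catch a
misspelled population or an out-of-range coefficient early.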