The JAM Project


Configuring JAM 2.11


How to Configure Configuration Manager and Data Site(s)

  1. Get list of hostnames of all machines
    Write down the hostnames of all the machines you plan to use, whether for the configuration manager (server) or for data sites (clients). You will need these names when setting up the server's and clients' configuration files.

  2. Decide which machine will run the server (Configuration Manager)
    Note: It is permitted to run a client on the same machine which runs the server.

  3. On the server (Configuration Manager)...
    Nothing to be done.

  4. On each client (Data Site)...

    1. Create a directory.
      It doesn't matter what you name the directory. This directory will hold data set files, the configuration file, attribute files etc.

    2. Put the data set files into this directory.
      There should be:

      • an attributes description file, e.g., `hypo.attr'
      • a data set file, e.g., `hypo.dta'
      • an attributes description file for the meta-learning data; its name is currently hardcoded as tmpcom.attr
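
Before moving on, it can help to confirm that the directory really contains the three files listed above. The following is a minimal sketch, not part of JAM: the function name `missing_data_site_files` is hypothetical, and it assumes the data set files are named after the data set (for example `hypo`) with the `attr` and `dta` extensions shown in the examples.

```python
from pathlib import Path


def missing_data_site_files(directory, dataname):
    """Return the expected file names that are absent from `directory`.

    Assumption: files are named <dataname>.attr and <dataname>.dta,
    following the `hypo.attr' / `hypo.dta' examples in this guide.
    """
    expected = [
        f"{dataname}.attr",  # attributes description file
        f"{dataname}.dta",   # data set file
        "tmpcom.attr",       # meta-learning attributes file (hardcoded name)
    ]
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]
```

Calling `missing_data_site_files("/path/to/site", "hypo")` returns an empty list when all three files are present, and otherwise lists the names that still need to be copied in.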

    3. Create the client's configuration file.
      For a template, look at the demo DataSite.config file.
      The file is assumed to be named DataSite.config.

      Suppose the designated server machine is dynamo.cs.columbia.edu. Insert the following lines in the configuration file where you see comparable text in the demo config file.

      # The host for Configuration File Manager
      ##############################################
      CONFIGFILE_HOST=dynamo.cs.columbia.edu

      # The port number for the host of the Configuration File Manager
      ##############################################
      CONFIGFILE_PORT=8175

      The nickname of the machine running this data site (it need not be the same as the host name):

      # A unique nickname for this data site
      ##############################################
      NICKNAME=Mango

      IMAGE_DIR_URL is the directory containing the images for the standard parts of the animation.
      NUM_DATA_IMAGES is the number of images used in the "key" panel.

      # The directory containing images for
      # standard parts of the animation.
      #
      # Note: Expected gifs in this directory include:
      # data.gif ...representation of data
      # local.gif
      # engine.gif
      # default.gif
      # metac.gif
      ##############################################
      IMAGE_DIR_URL=file:/u/boat/andreas/JAM-2.11/demo/JAMimages/mango
      NUM_DATA_IMAGES=4

      Insert the lines which define where the relevant data files are located.

      # Pathnames to dataset files
      # (These names can be relative or absolute.)
      # Dataset filename extensions.
      ##############################################
      # name of the dataset
      DATANAME=thyroid

      #filename extension for the classifier
      CLASSIFIER_EXT=cls

      #filename extension for the data set
      CLASSIFIER_EXT=dta

      #filename extension for the training data
      TRAINDATA_EXT=bld

      #filename extension for the data dictionary
      DICT_EXT=attr

      #filename extension for the decision file -- the file that contains
      #the output of the classifier
      DEC_EXT=dec

      #filename extension for the classifying data file, which is sometimes
      #called the testing data file.
      CLASSIFYDATA_EXT=tst

      #filename extension for files associated with meta-learning
      METALEARNING_EXT=m-c

      # The name of the class which is this datasite's local learning
      #algorithm
      CLASSNAME_LEARNER=jam.algs.id3.ID3Learner

      # The name of the class which is this datasite's local meta-learning
      #algorithm
      CLASSNAME_METALEARNER=jam.algs.id3.ID3Learner

      # Default values for the Cross Validation fold, meta-learning fold
      #and level of Meta-Learning
      CVFOLD=1
      MLFOLD=1
      MLLEVEL=1

      # Default values for splitting the data set into training, validation
      #and testing files
      TRAIN_SPLIT_PERC=75
      LOCAL_SPLIT_PERC=75

      # Post Processing script (optional: if additional processing on
      #resulting data is desired)
      POST_PROCESSING_SCRIPT=report.script

      ##############################################
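
A finished DataSite.config can be sanity-checked before use. The sketch below is an illustration only and is not JAM's own reader: it assumes the file consists of `KEY=VALUE` lines with `#` comments, as in the listing above, and the helper names `parse_config` and `missing_keys` are hypothetical. It also reports keys that are assigned more than once, since a repeated key (such as CLASSIFIER_EXT appearing twice) would silently replace the earlier value in any reader where the last assignment wins; how JAM itself resolves duplicates is not stated here.

```python
def parse_config(text):
    """Parse KEY=VALUE lines and return (settings, duplicate_keys)."""
    settings = {}
    duplicates = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignored in this sketch
        key = key.strip()
        if key in settings and key not in duplicates:
            duplicates.append(key)
        # Assumption: the last assignment wins.
        settings[key] = value.strip()
    return settings, duplicates


def missing_keys(settings, required):
    """Return the names in `required` that have no value in `settings`."""
    return [key for key in required if key not in settings]


SAMPLE = """\
# The host for Configuration File Manager
CONFIGFILE_HOST=dynamo.cs.columbia.edu
CONFIGFILE_PORT=8175
NICKNAME=Mango
DATANAME=thyroid
CLASSIFIER_EXT=cls
CLASSIFIER_EXT=dta
"""

settings, duplicates = parse_config(SAMPLE)
print(duplicates)  # ['CLASSIFIER_EXT']
# The list of required keys here is illustrative, not an official list.
print(missing_keys(settings, ["CONFIGFILE_HOST", "CONFIGFILE_PORT",
                              "NICKNAME", "DATANAME"]))  # []
```

To check a real file, read it with `open("DataSite.config").read()` and pass the text to `parse_config`.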

  5. The configuration is finished.
    Now you can run the program.
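
If a data site cannot reach the server, a quick check of the CONFIGFILE_HOST and CONFIGFILE_PORT values from step 4 may narrow down the problem. The sketch below assumes that the Configuration Manager accepts plain TCP connections on that port, which this guide does not state explicitly; the function name `can_connect` is hypothetical and not part of JAM.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Assumption: the Configuration Manager listens on a TCP port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

With the example values used earlier, the call would be `can_connect("dynamo.cs.columbia.edu", 8175)`. A result of False only means the port could not be reached from this machine; it does not say why.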


    Columbia University, September 1997. Last Modified: June 5, 1998
    andreas@cs.columbia.edu