 
   IN ORDER TO BECOME FAMILIAR WITH THE Algorithm::DecisionTree MODULE:


(1) First run the scripts

        construct_dt_and_classify_one_sample_case1.pl

        construct_dt_and_classify_one_sample_case2.pl

        construct_dt_and_classify_one_sample_case3.pl

        construct_dt_and_classify_one_sample_case4.pl

    as they are.  The first script is for the purely symbolic case, the
    second for a case that involves both numeric and symbolic features, the
    third for the case of purely numeric features, and the last for the
    case when the training data is synthetically generated by the script
    generate_training_data_numeric.pl.

    Next, try to modify the test sample in these scripts and see what
    classification results you get for the new test samples.


(2) The second and the third scripts listed above use the training file
    `stage3cancer.csv' for the training data.  This datafile includes both
    numeric and symbolic features.  The first script named above uses the
    training file `training.dat', which is for the purely symbolic case.
    Study these training datafiles carefully and make sure that your own
    training data conforms to the format of these files.


(3) So far we have talked about classifying one test data record at a time.
    You can also place multiple test data records in a disk file and
    classify them all in one go.  To see how that can be done for the
    purely symbolic case, execute the following command line in the
    Examples directory:

      classify_test_data_in_a_file.pl  training.dat  testdata.dat  out.txt

    The script classify_test_data_in_a_file.pl constructs the decision tree
    from the data in the first argument file and then uses it to classify
    the data in the second argument file.  The computed class labels are
    deposited in the third argument file.  Note that the only difference
    between the files `training.dat' and `testdata.dat' is that the latter
    does not mention the class labels for the data records.

    You can create a similar script for classifying an arbitrary number of
    numeric data records placed in a file.  In this case, your training
    datafile must be a CSV file.  However, your test datafile can be a
    plain text file.


>   AS A REMINDER, IF YOUR TRAINING DATA USES JUST NUMERIC FEATURES OR A
>   MIXTURE OF NUMERIC AND SYMBOLIC FEATURES, YOU MUST USE A CSV FILE FOR
>   THE TRAINING DATA.



=========================================================================

             FOR USING A DECISION TREE CLASSIFIER INTERACTIVELY

    Starting with Version 1.6 of the module, you can use the DecisionTree
    classifier in an interactive mode.  In this mode, after the decision
    tree has been constructed, the user is prompted for answers to the
    questions regarding the feature tests at the nodes of the tree.
    Depending on the answer supplied by the user at a node, the classifier
    takes the path corresponding to that answer down the tree to the next
    node, and so on.  To get a feel for using a decision tree in this
    mode, examine the script

        classify_by_asking_questions.pl

    Execute the script as it is and see what happens.
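
    The question-and-answer descent described above can be pictured with
    the toy sketch below.  It is only an illustration of the idea, not the
    module's implementation: the nested-hash tree, the feature names, the
    class labels, and the subroutine classify_interactively() are all
    hypothetical.  The answers are taken from a list so that the sketch
    runs unattended; an interactive version would read them from STDIN.

```perl
use strict;
use warnings;

# Hypothetical toy tree, NOT the module's internal node structure.
# An internal node names the feature it tests and maps each answer to a
# child; a leaf is a plain string holding the class label.
my $tree = {
    feature  => 'smoking',
    branches => {
        heavy => 'malignant',
        never => {
            feature  => 'exercising',
            branches => { regularly => 'benign', never => 'malignant' },
        },
    },
};

# Walk from the root to a leaf.  $ask is a code reference that returns
# the user's answer for a given feature name.
sub classify_interactively {
    my ($node, $ask) = @_;
    while (ref $node) {
        my $answer = $ask->($node->{feature});
        die "No branch for answer '$answer' at feature '$node->{feature}'\n"
            unless exists $node->{branches}{$answer};
        $node = $node->{branches}{$answer};
    }
    return $node;
}

# Canned answers stand in for what a user would type at each prompt.
my @answers = qw(never regularly);
my $label = classify_interactively($tree, sub { shift @answers });
print "Class: $label\n";
```

    Each answer selects one branch, so the number of questions asked is
    at most the depth of the tree.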


=========================================================================

                 GENERATING SYNTHETIC TRAINING AND TEST DATA


    Starting with Version 1.6, you can use the module itself to generate
    synthetic training and test data.  See the scripts

        generate_training_data_numeric.pl

        generate_training_data_symbolic.pl

    for how to generate training data for the decision-tree classifier for
    the purely numeric case and for the purely symbolic case.  The data is
    generated according to the information placed in a parameter file in
    each case.  These files must follow certain rules regarding the
    declaration of the classes, the features, the possible values for the
    features, etc.  An example of such a parameter file for the numeric
    case is:

        param_numeric.txt

    and for the symbolic case:

        param_symbolic.txt

    A test datafile looks very much like a training data file, except that
    the former does not contain the class labels for the different data
    records.  See the script

        generate_test_data_symbolic.pl

    for an example of how you can generate test data for the purely
    symbolic case.  Note that the class labels for the test data are placed
    in a separate file whose name is supplied in the script named above.
    By comparing the classification labels obtained for each of the data
    records with their true labels you can assess the accuracy of the
    decision-tree classifier.
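
    The comparison mentioned above amounts to counting how many predicted
    labels agree with the true ones.  Here is a minimal sketch, assuming
    you have already read both sets of labels into hashes keyed by record
    name.  The subroutine classification_accuracy() and the sample data
    are hypothetical and not part of the module; how you parse the label
    files depends on their actual layout.

```perl
use strict;
use warnings;

# Fraction of records whose predicted label equals the true label.
# Both arguments are hash references keyed by record name.  A record
# with no prediction counts as misclassified.
sub classification_accuracy {
    my ($predicted, $truth) = @_;
    my @names = keys %$truth;
    return 0 unless @names;
    my $correct = grep {
        defined $predicted->{$_} && $predicted->{$_} eq $truth->{$_}
    } @names;
    return $correct / @names;
}

# Hypothetical labels, made up purely for illustration.
my %truth     = (sample_0 => 'benign',    sample_1 => 'malignant',
                 sample_2 => 'benign',    sample_3 => 'malignant');
my %predicted = (sample_0 => 'benign',    sample_1 => 'malignant',
                 sample_2 => 'malignant', sample_3 => 'malignant');

my $accuracy = classification_accuracy(\%predicted, \%truth);
printf "Accuracy: %.2f\n", $accuracy;
```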

