
Changes made in SenseClusters between version 0.93 and version 0.95

Ted Pedersen 	 tpederse@d.umn.edu
Anagha Kulkarni  kulka020@d.umn.edu
Mahesh Joshi	 joshi031@d.umn.edu

1. Updated Toolkit/clusterstop/clusterstopping.pl :			-Anagha
   - changed the default cluster-stopping measure from PK2 to PK3
   - changed the default crfun from h2 to i2
   - formatted and added details to the error messages
   - added a check to catch "NaN" values generated by the crfuns
     with the Expected / reference data (Gap Statistic)
   - added a check for negative delta values
   - updated and reorganized the documentation.
   - now generates PREFIX.gap file that contains crfun values, 
     delta values and the predicted k.
   - updated the logic for setting the default delta value.
   - modified the redirection from >& to > for the vcluster and 
     scluster calls.

2. Updated discriminate.pl :						-Anagha
   - changed the default #clusters from 10 to 2
   - modified the program logic to catch the exit status of
     clusterstopping.pl; if it has failed, output the reason for
     failure from the *.predictions file (if present) and use the
     default #clusters (2) to proceed.
   - changed the calls to vcluster and scluster such that now 
     the --showtree option is used only if the #clusters > 1.
     (NOTE: The -showtree option provides an ASCII representation
     of the clustering solution; however, if the #clusters is 1 then
     this option generates quite a few error messages which are
     not related to SenseClusters functionality. Thus we are
     currently not using this option when #clusters = 1. If Cluto
     fixes this problem in the future then we can go back to using
     the -showtree option consistently.)
   - now dendrograms are generated by vcluster or scluster only
     if #clusters > 1
   - updated and reorganized the documentation.
   - added an error check to verify that the number of bigram 
     features is not 0 before proceeding with generation of 
     co-occurrence features.
   - removed the error check that disallowed --scope_train when
     neither the --training option nor the --split option is used.
   - modified messages: added angle brackets to the filenames
     and removed periods following filenames or parameters.
   - added an error check to discriminate.pl to verify that the 
     specified training file exists.
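
The fallback described in item 2 (catch the exit status of the
cluster-stopping step, report the reason from the predictions file if it
exists, then proceed with the default of 2 clusters) can be sketched
roughly as follows. This is only an illustration in Python, not the Perl
code in discriminate.pl; the names choose_cluster_count, command,
predictions_path and parse_k are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

DEFAULT_CLUSTERS = 2  # default #clusters named in this changelog


def choose_cluster_count(command, predictions_path, parse_k):
    """Run a cluster-stopping command; fall back to the default on failure.

    command, predictions_path and parse_k are hypothetical stand-ins and
    do not reflect the real discriminate.pl internals.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        # Success: let the caller-supplied parser extract the predicted k.
        return parse_k(result.stdout)

    # Failure: report the reason from the predictions file, if present.
    predictions = Path(predictions_path)
    if predictions.is_file():
        print(predictions.read_text().strip(), file=sys.stderr)
    return DEFAULT_CLUSTERS
```

Passing the parser in as an argument keeps the sketch independent of
whatever output format the real cluster-stopping step produces.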

3. Updated Web/SC-cgi/callwrap.pl :					-Anagha
   - Now displays the message about SVD not being performed or 
     cluster-stopping failing and thus using the default #clusters.

4. Updated Demos directory : 		                                -Ted
   - reorganized files and directories somewhat, and added new options
     to demo scripts, to reflect new functionality in the package that
     has been introduced since the demos were last updated 2 years ago.

5. Updated Toolkit/preprocess/sval2/maketarget.pl :			-Anagha
   - added enclosing head tags to the regex generated by this script
     via --head option.

6. Updated default stoplist :                                           -Ted
   - the former stoplist only removed lower case words. The new list
     includes stop words that begin with either an upper case or a
     lower case letter. This affects the web interface, Demos, and Docs.
   
7. Updated discriminate.pl :						-Mahesh
   - Added support for LSA context clustering using the 
     "--context o2 --lsa" option combination
   - Modified error messages
   - Updated POD and command line help with respect to LSA context
     clustering
   - Incremented internal version
   - Updated to invoke nsp2regex.pl after wordvec.pl in SC native
     order2 context clustering mode

8. Updated Toolkit/vector/order1vec.pl :				-Mahesh
   - Modified output of --clabel option to discard features that were
     not found even once in the test data
   - Added --transpose option to support output in the form of a
     feature-by-context matrix similar to Latent Semantic Analysis
     (LSA) representation
   - Added --testregex TEST_REGEX option, which outputs only those
     regular expressions from the input FEATURE_REGEX file that
     matched at least once in the input SVAL2 file. This file
     is required as input to order2vec.pl in LSA context clustering
     mode.
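
The two order1vec.pl additions in item 8 amount to a filter and a
transposition. A minimal Python sketch of both ideas follows, assuming
the regular expressions are plain strings and the matrix is a list of
rows. The function names are hypothetical, and the real FEATURE_REGEX
and SVAL2 file formats are not modeled here.

```python
import re


def regexes_matched_in_text(patterns, text):
    """Return the patterns that match `text` at least once, in input order."""
    return [p for p in patterns if re.search(p, text)]


def transpose(matrix):
    """Turn a context-by-feature matrix (list of rows) into feature-by-context."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical feature regexes and a one-line stand-in for the test data.
features = [r"\bbank\b", r"\briver\b", r"\bloan\b"]
kept = regexes_matched_in_text(features, "the bank approved the loan")

# Two contexts (rows) by three features (columns), then transposed.
by_feature = transpose([[1, 0, 2], [0, 3, 0]])
```

Here kept holds only the bank and loan patterns, since the river pattern
never matches the sample text.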

9. Updated Toolkit/vector/order2vec.pl :				-Mahesh
   - Dropped the --token TOKEN_REGEX option and the FEATURES file at
     the command line; order2vec.pl now requires a command line of
     the form:
       order2vec.pl [options] SVAL2 WORDVEC FEATURE_REGEX
   - Modified the regex that reads features from the features file to
     accept general ngrams, rather than just unigrams
   - Updated POD and command line help
 
10. Added new test cases in Testing/vector/order2vec/			-Mahesh
   - Added four test cases for four types of features, testing the LSA
     context clustering scenario, in binary and non-binary mode

11. Updated web interface files in Web/SC-cgi				-Mahesh
   - Modified index.cgi, first.cgi, second.cgi and callwrap.pl to support
     LSA context clustering

12. Updated Docs/HTML/discriminate.html					-Mahesh
   - Updated with respect to POD update of discriminate.pl

13. Updated Docs/HTML/Toolkit_Docs/vector/order2vec.html		-Mahesh
   - Updated with respect to POD update of order2vec.pl

14. Updated Docs/Flows/SenseClusters-ContextClustering.ai/pdf		-Mahesh
   - Added LSA context clustering flow

15. Updated Docs/Flows/SenseClusters-WordClustering.ai/pdf		-Mahesh
   - Removed obsolete kocos.pl call from the flow

16. Updated SC/Toolkit/clusterlabel/clusterlabeling.pl to create	-Anagha
    the temporary files with time-stamp in their names.

(Changelog-v0.93to0.95 Last Updated on August 7, 2006 by Anagha)
