History of CRUISE release

CRUISE 3.6.4
---------------
September 13, 2014 update
1. A bug in the exhaustive search with missing values was fixed.
2. An R graphics option is provided for the bivariate node model. The Splus option is dropped.

CRUISE 3.6.3
---------------
December 11, 2008 update
1. The '\textit' command in LaTeX is replaced with '\texttt' to properly write class labels that contain "<" or "<=".

CRUISE 3.6.2
---------------
November 27, 2008 update
1. An option to include the node number in the LaTeX tree is added.

CRUISE 3.6.1
---------------
October 21, 2008 update
1. An error in the LaTeX tree file is fixed so that it runs under Linux.
2. The colors of the classes are rearranged to be consistent with the GUIDE tree.

CRUISE 3.6.0
---------------
September 28, 2008 update
1. The LaTeX tree generation file is modified. It can now color each terminal node. The node misclassification costs are also printed beneath each terminal node.

CRUISE 3.5.9
---------------
February 1, 2008 update
1. Batch file creation is modified. The batch input file now works for all types of data, whether or not there are missing values.

CRUISE 3.5.8
---------------
January 27, 2008 update
1. A bug in the class distribution split was fixed. A category value is now assigned to a node even if it has no associated observations; it goes to the majority node.
2. If one class is dominant in a node, the linear split stops.

CRUISE 3.5.7
---------------
January 24, 2008 update
1. EPSILON is used instead of TINY in the linear discriminant algorithm.

CRUISE 3.5.6
---------------
January 23, 2008 update
1. The routine to calculate chi-square is modified again to get an even more precise value.
2. The coefficients (Coef) are initialized in linear combination splits.
3. The default mindat value for the 2V model has changed.

CRUISE 3.5.5
---------------
January 22, 2008 update
1. A bug in the output step for the Levene test was fixed.
2. The routine to calculate chi-square is slightly modified to get a more precise value.

CRUISE 3.5.4
---------------
January 21, 2008 update
1. A bug in a mean/mode imputation algorithm is fixed.
2. When all values in a node are missing, tree growing stops.
3. Bugs in the pouring step of the pruning in the 2V tree are fixed.

CRUISE 3.5.3
---------------
January 20, 2008 update
1. The algorithm for class assignment in a node with tied scores is updated to avoid numerical inaccuracies.

CRUISE 3.5.2
---------------
January 20, 2008 update
1. A bug in the class distribution split was fixed. If all the observations go to the same node, tree growing stops.
2. A bug in MANOVA pair variable selection was fixed. Pairs with a determinant less than EPSILON are ignored.

CRUISE 3.5.1
---------------
January 19, 2008 update
1. EPSILON is used instead of TINY in one algorithm that generates the pruning alpha sequence.
2. The default mindat for the 2V model has changed. If the number of observations is small, the default mindat for the 2V model is larger than that for the 1V model.

CRUISE 3.5.0
---------------
January 17, 2008 update
1. A bug that occurred when the node models fail to fit is fixed. The misclassification cost of the majority rule is used if the node models fail.
2. Empty terminal nodes are not counted in computing the alpha sequence.

CRUISE 3.4.9
---------------
January 16, 2008 update
1. Some bugs are fixed for the class distribution method in the 2V model. Crimcoords are not modified by the class distribution method.

CRUISE 3.4.8
---------------
January 15, 2008 update
1. When the 2V ldf fails in a node, the 1V ldf is fitted instead of the majority rule.
2. The dialogs of the program are slightly changed.

CRUISE 3.4.7
---------------
January 11, 2008 update
1. A variable importance ranking is produced in the output file.

CRUISE 3.4.6
---------------
January 10, 2008 update
1. A bug in the Box-Cox transformation is fixed. As a result, the cut point is calculated more precisely.

CRUISE 3.4.5
---------------
January 8, 2008 update
1. The interaction test is modified. It may now tabulate up to a 3 by 3 table.

CRUISE 3.4.4
---------------
January 7, 2008 update
1. Missing value treatments are added to the class distribution split method. The program now offers the same missing value treatment options for all split methods.
2. The default of the s.e. rule is set to 0.5.

CRUISE 3.4.3
---------------
January 6, 2008 update
1. Quantiles are now computed using a uniform distribution. The number of quantiles considered depends on the node size. It can be as large as 16 quantiles and as small as four quantiles. The sample quantile by index program in 3.4.2 turned out to be slow.

CRUISE 3.4.2
---------------
January 5, 2008 update
1. Sample quantiles are used again with the aid of a quick index program. The number of quantiles considered depends on the node size. It can be as large as 16 quantiles and as small as four quantiles.

CRUISE 3.4.1
---------------
January 4, 2008 update
1. In two-class problems, when the class distribution method for categorical variables is used, an option for exhaustive search on numerical variables is added. This option is the default for two-class problems.

CRUISE 3.4.0
---------------
January 3, 2008 update
1. The split method that uses the class distribution for categorical variables is modified. It now incorporates the misclassification costs and any type of priors.

CRUISE 3.3
---------------
January 1, 2008 update
1. The way the quartiles are obtained is modified to reduce the computation time. It uses a normal distribution.

CRUISE 3.2
---------------
December 31, 2007 update
1. If the test data has unseen class labels that do not exist in the training data, the corresponding observations are counted as misclassified.
2. If the test data has unseen category values that do not exist in the training data, the corresponding observations follow the right-most node down the tree.
3. A bug that occurred when the mean square error is zero in the Box-Cox transformation is fixed.
4. Give the user the option to generate a file that contains prediction results (node id, actual class, predicted class) of an external data file.

CRUISE 3.1
---------------
December 30, 2007 update
1. The 2D variable selection algorithm is modified. Univariate tests are performed first, and the interaction tests are performed only when the univariate test is insignificant. If the interaction tests are not significant either, the univariate test result is used instead.
2. The 2D variable selection algorithm is further modified. For numerical variables only, the variable space is partitioned into 8 pieces for the tabulation. However, this is done only when the node has enough observations that each cell count is larger than 5. If the condition is not met, the usual quartile tabulation is performed.

CRUISE 3.0
---------------
August 4, 2007 update
1. In a LaTeX tree diagram, a fraction "m/n" is printed beside each node, where m is the number misclassified and n is the sample size.
2. Modify the LaTeX code for two-class problems. The split condition is positioned beside the splitting node.

CRUISE 2.9
---------------
August 2, 2007 update
1. Modify the LaTeX code for better viewing of the tree structure.
2. A new split method is added for categorical variables. Each category takes one subnode, and the subnodes having the same class are merged.

CRUISE 2.8
---------------
October 27, 2005 update
1. Modify the code for the R or Splus plot so that it can show a different color for each class.
2. Correct the mismatch problem of allowable character length.
3. Make the program run even if some variables consist entirely of missing values.

CRUISE 2.7
---------------
July 14, 2005 update
1. When a singularity occurs in a node under the 2V tree option, the split continues with a univariate split for that node. The hope is that it may find a good two-variable plot after the univariate split.

CRUISE 2.6
---------------
July 11, 2005 update
1. Modify the LaTeX code generating program for better viewing of the tree structure. If the number of terminal nodes is 10 or more, landscape mode is turned on. Otherwise portrait mode is the default.
2. Increase the limit on the character length of each value to 21, since some statistical software prints values 21 characters long.
3. Add a trap to stop the program if there is no 'd' type variable in the description file.
4. The name of the program is added to the caption of the LaTeX tree.
5. Increase the length of the variable descriptor in 'list()' of the 'scan' command of R or Splus.
6. Fix the ldf function calculation to remove numerical inaccuracies.
7. Correct typos in the caption of the LaTeX tree.

CRUISE 2.5
---------------
May 11, 2005 update
1. Correct the mismatch problem between the default option and the detailed option in determining minimum node size.

CRUISE 2.4
---------------
May 2, 2005 update
1. Correct the mismatch problem between the default option and the detailed option in selecting missing value treatment.

CRUISE 2.3
---------------
February 15, 2005 update
1. Modify the code generating program for R, since an R command became defunct in the newer version (R-2.0.1).

CRUISE 2.2
---------------
June 5, 2004 update
1. Correct a minor I/O descriptor problem.

CRUISE 2.1
---------------
February 16, 2004 update
1. Incorporate the "2V Tree" program. The algorithm is described in Kim & Loh (2003), published in JCGS. Users can choose this method by selecting "univariate split with bivariate node models" while running the program.

CRUISE 1.11
---------------
May 8, 2003 update
1. Correct an I/O descriptor problem. If there are too many categories in a split variable, the output may exceed the preset length (160 or 300) of the descriptor variable. If this happens, the program writes all the categories on one line.
2. Correct the maximum allowable depth of the tree for a large number of classes.

CRUISE 1.10
---------------
December 12, 2002 update
1. Correct the error in linear combination splits when there are no missing values in the learning sample but some missing values in the test sample.

CRUISE 1.9
---------------
March 1, 2002 update
1. Correct the WRITE statement to accept a very large number of categories in a split.
2. Check whether the description file ends with a carriage return. If not, the program gives error messages and then stops.

CRUISE 1.8
---------------
December 10, 2001 update
1. Generate an error message for an empty character string in a data file.
2. If the test data has missing values in the class variable, prediction results are printed along with the true class and the missing value code. Previously, the program generated error messages and stopped.

CRUISE 1.7
---------------
October 30, 2001 update
1. Make the 1-SE rule the default for pruning.
2. Make exhaustive search the default split point selection method when the number of classes = 2.
3. Change the arguments in the psset line of the pstricks file to:
   \psset{tnsep=2pt,tnheight=2pt,treesep=.1cm,levelsep=40pt,radius=5pt}

CRUISE 1.6
---------------
August 15, 2001 update
1. Print out the percentage of observations with one or more missing values.
2. Print out the percentage of missing values in the data.
3. Fix the bug in the input statement when a user types out-of-range values.

CRUISE release 1
---------------
November 7, 2000