June 10, 2005.
----------------------------------------------------------------
covarionTest.pl is a Perl script that performs the heterogeneity
or the covarion test.

Citation:
Covarion structure in the plastid genome evolution: a new
statistical test.
Cecile Ane, J. Gordon Burleigh, Michelle M. McMahon,
Michael J. Sanderson.
Mol. Biol. Evol., 22(4):914-924. 2005
----------------------------------------------------------------
Please email me (ane at stat.wisc.edu) if you have any questions.
----------------------------------------------------------------
The script needs one argument: a file name. The file should contain
info about the data matrix, tree, clades, the type of test to be
done, the characters to exclude...
Results will be located in a directory named "results" unless
otherwise specified in the input file.

Output:
- a conclusion file
- the estimated tree, in case estimation has been performed
  (not recommended).

Example of use:
perl covarionTest.pl inputfile
perl covarionTest.pl < inputfile

The script calls the following other programs:
paup
resolvetree.pl    for resolving trees with 0-length branches
seq-gen
oneTestStatistic  (C program) for computing the value of the
                  test statistic W on one data set.

Example of input file (or standard input):
excludeChar=3
NBoot=10
ras=1
treeFile=MLtree.nex
treeMethod=fast
matrixFile=myMatrix_nexus_or_phylip_format
partition=Amborella,Calycanthus, Marchantia,Physcomitrella ,Anthoceros,Psilotum
model=REV
ncat=4
directory=myDirName

- matrixFile: must be in phylip or nexus format.
- treeFile: must be in Nexus format. (not needed if estimation
  is performed)
- treeMethod: can be "parsimony" (or "par"), "likelihood" (or "like").
  If "par", the MP tree is computed and used for bootstrapping.
  If "like", it will be the ML tree.
  In all other cases, the tree is read from a file (which is
  recommended).
- model: can be one of "HKY", "F84" or "REV". default=HKY.
  Indicates what substitution model will be used for both
  estimation of parameters and simulation. Simple models (like JC
  or equal base frequencies) are not implemented in the script.
  Sorry!
- ncat: integer value. default=4.
  Number of gamma rate categories, used both in estimation and
  simulation.
- nboot: number of bootstrap replicates. default=1000.
- ras: type of test. 0: homogeneity test.
       1: RAS vs RAS+COV test. default=1.
- directory: name of directory for output. default="results"
- partition: list of one clade's taxon names, separated by commas.
- excludeChar: if =3 then 3rd positions are excluded.
  if =12 then 1st and 2nd positions are excluded.
  Otherwise, all positions are included.

When the tree is estimated, the estimation uses all of the
specified characters. However, the test is performed on sites with
no gaps only, in order to avoid bias. Bootstrap replicates have a
number of characters equal to the number of no-gap characters of
the matrix.

Ooops! the C program "oneTestStatistic" would need to be adjusted
in order to treat ambiguous characters adequately.
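
The option list above can be illustrated with a small Python sketch
that reads a key=value input file and fills in the defaults stated
in this README. This is not the parsing code of covarionTest.pl
(which is written in Perl). The function names parse_input and
excluded_codon_positions are hypothetical, and two behaviours are
assumptions made only for this sketch: option names are matched
without regard to case, and spaces around taxon names in
"partition" are ignored.

```python
# Hypothetical sketch: NOT the parsing code used by covarionTest.pl.
# Defaults are the ones stated in this README; options without a
# stated default are left as None.
DEFAULTS = {
    "model": "HKY",
    "ncat": 4,
    "nboot": 1000,
    "ras": 1,
    "directory": "results",
    "excludechar": None,
    "treefile": None,
    "treemethod": None,
    "matrixfile": None,
}

INT_KEYS = {"ncat", "nboot", "ras"}


def parse_input(text):
    """Parse key=value lines into a dict of options."""
    options = dict(DEFAULTS)
    options["partition"] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Assumption: keys are matched case-insensitively.
        key = key.strip().lower()
        value = value.strip()
        if key == "partition":
            # Assumption: spaces around taxon names are not significant.
            options[key] = [t.strip() for t in value.split(",") if t.strip()]
        elif key in INT_KEYS:
            options[key] = int(value)
        else:
            options[key] = value
    return options


def excluded_codon_positions(exclude_char):
    """Map the excludeChar value to the codon positions to drop."""
    if exclude_char == "3":
        return {3}
    if exclude_char == "12":
        return {1, 2}
    return set()  # any other value: all positions are included


if __name__ == "__main__":
    example = "excludeChar=3\nNBoot=10\nmodel=REV\npartition=Amborella, Psilotum\n"
    opts = parse_input(example)
    print(opts)
    print(excluded_codon_positions(opts["excludechar"]))
```

Options that are absent from the input keep their README defaults,
so an input file only needs to list the settings that differ.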