Binary abstraction Markov models: example data & software

provided by Samuel S. Shepard, Andrew McSweeny, Gursel Serpen, & Alexei Fedorov


Steps

  1. Software available.
    1. Perl script for BAMM.
      Usage:
      	perl convertTrainTestMM SAMPLE_DIR OUTPUT_FILE ABSTRACTION_SCHEME_FILE HMM_ORDER ABSTRACTION_LEVEL [frame]
      
    2. Perl script for CDBAMM Duplication model.
      Usage:
      	perl indelDynamicMM SAMPLE_DIR OUTPUT_FILE JUMP_STEP MM_ORDER WINDOW_SIZE [frame]
      
    3. Perl script for CDBAMM Purine-pyrimidine model.
      Usage:
      	perl YRdynamicMM SAMPLE_DIR OUTPUT_FILE JUMP_STEP MM_ORDER WINDOW_SIZE [frame]
      
    4. Perl script for homogeneous, nucleotide Markov model.
      Usage:
      	perl mcClassifier SAMPLE_DIR OUTPUT_FILE MM_ORDER [frame]
      
    5. Perl script for average mutual information function on nucleotides.
      Usage:
      	perl AvMutInfo INPUT_FILE.fa OUTPUT_FILE
      	where INPUT_FILE.fa is the name of a file containing the sequences to be analyzed (in FASTA format), and
      	OUTPUT_FILE is the name of the file that will hold the output, a two-column table of AMI values.
      

  2. Additional files used for BAMM.
    1. Abstraction scheme files for a priori 3 (AP3), BA1-best, BA2-best, BA3-best, BA4-best, the GT-rich scheme, and the positive splicing potential scheme.
    2. Training and test data sets for CDS exons and introns.
    3. Equivalent training and test data sets, used to train the SVM meta classifier.
    4. 5' UTR test set, CDS exon training set, and training/test intron sets.

  3. Support vector machine meta classification.
    1. Example SVM data (*.dat files).
    2. Example Octave/MATLAB script (requires Shogun to be installed).
    3. Example result file.
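
Illustration: average mutual information (AMI) on nucleotides

For readers who want to see what an average mutual information function on nucleotides computes (item 1.5 in the software list), the following is a minimal Python sketch. It is not the AvMutInfo script and may differ from it in detail. The function names (read_fasta, average_mutual_information, ami_table) are hypothetical, and several choices are assumptions made for illustration: the logarithm is base 2 (bits), marginal frequencies are estimated from the same pairs used for the joint frequencies, characters other than A, C, G, T are skipped, and the maximum distance is a caller-supplied parameter.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: only unambiguous nucleotides are counted


def read_fasta(lines):
    """Yield upper-case sequences from an iterable of FASTA-formatted lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if chunks:
                yield "".join(chunks).upper()
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def average_mutual_information(seqs, k):
    """Mutual information (in bits) between nucleotides k positions apart.

    I(k) = sum over a, b of p_ab(k) * log2(p_ab(k) / (p_a * p_b)),
    with all probabilities estimated from pairs pooled over the sequences.
    """
    pair_counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - k):
            a, b = seq[i], seq[i + k]
            if a in ALPHABET and b in ALPHABET:
                pair_counts[a, b] += 1

    total = sum(pair_counts.values())
    if total == 0:
        return 0.0

    # Marginals of the first and second member of each pair.
    left, right = Counter(), Counter()
    for (a, b), n in pair_counts.items():
        left[a] += n
        right[b] += n

    ami = 0.0
    for (a, b), n in pair_counts.items():
        p_ab = n / total
        ami += p_ab * math.log2(p_ab / ((left[a] / total) * (right[b] / total)))
    return ami


def ami_table(seqs, max_k):
    """Return [(k, I(k)), ...] for k = 1 .. max_k."""
    seqs = list(seqs)  # allow generators such as read_fasta(...)
    return [(k, average_mutual_information(seqs, k)) for k in range(1, max_k + 1)]
```

A possible use is ami_table(read_fasta(open("INPUT_FILE.fa")), 20), writing each (k, I(k)) pair as one row of a two-column table. Whether the distance k is the first column of the AvMutInfo output is not stated above, so that layout is also an assumption.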


Last updated: 9.2011  |  Author: Samuel S. Shepard, Ph.D.  |  Contact:  sammysheep@gmail.com