Available topics.
An example-based tutorial on using Datamonkey. An excellent 'Getting Started' resource!
Data files.
Preparing data files: general guidelines and common mistakes to avoid.
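As a rough illustration of a pre-upload check, the sketch below scans a codon alignment for a few problems. The specific checks (equal sequence lengths, lengths divisible by three, no in-frame stop codons under the universal genetic code) are assumptions about what codon-based analyses usually require and are not taken from the Datamonkey guidelines. The function name `check_codon_alignment` is hypothetical.

```python
# Stop codons of the universal genetic code (an assumption; other codes differ).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_codon_alignment(seqs):
    """Return a list of problems found in a dict of name -> aligned sequence."""
    problems = []

    # Aligned sequences should all have the same length.
    if len({len(s) for s in seqs.values()}) > 1:
        problems.append("sequences differ in length (not aligned)")

    for name, seq in seqs.items():
        seq = seq.upper()
        # A codon alignment should contain whole codons only.
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
        # Walk the sequence codon by codon and flag in-frame stop codons.
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                problems.append(f"{name}: stop codon {codon} at codon {i // 3 + 1}")
    return problems


if __name__ == "__main__":
    for problem in check_codon_alignment({"seq1": "ATGAAA", "seq2": "ATGTAA"}):
        print(problem)
```

An empty list means none of these particular checks failed. It does not guarantee that a file will be accepted.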
Site-by-site selection.
There are six available methods to test selection at a single codon site:
SLAC [all methods are subject to maximum size restrictions]
The fastest and most conservative method. Use it for large datasets (50 sequences or more) and to obtain substitution maps at each site - a useful feature for visualizing the evolutionary process.
FEL/IFEL
The best overall method in terms of the trade-off between statistical performance and computational expense. Use it for intermediate to large datasets (50 sequences or more) and if you wish to obtain good site-by-site substitution rate estimates. Use IFEL to test for site-wise selection on internal branches of the tree.
REL is an extension of the familiar codon-based selection analyses pioneered by Nielsen and Yang and implemented in PAML. Importantly, REL allows synonymous rate variation. It is often the only method that can infer selection from small (5-15 sequences) or low-divergence alignments, but it is also the method that makes the most assumptions and is susceptible to high rates of false positives in extreme cases.
TOGGLE analysis evaluates selection associated with host-immune response. This model was developed to identify sites which toggle between a wild-type and an escaped amino acid state. Typically, these sites have lower levels of amino acid diversity and are not detected by standard tests of diversifying selection. However, the analysis is computationally expensive, since site-wise tests of escape from wild-type amino acid residues are evaluated for each of the 20 potential wild types. Note that alignments with more than 50 sites can be uploaded, but only 50 sites will be tested for toggling. Indeed, we recommend that alignments of more than 50 sites be used for the estimation of branch lengths (the first phase of TOGGLE).
DEPS (Directional Evolution of Protein Sequences) uses amino acid sequences to identify directional evolution towards particular residues at sites. Useful for the detection of selective sweeps.
MEME (Mixed Effects Model of Evolution) combines FIXED effects at the level of a site with RANDOM effects at the level of branches. This model is an extension of FEL, where the ω values are allowed to vary along branches according to a 2-bin distribution, i.e. some branches may be under positive selection while others are under negative selection. This method is most appropriate for detecting episodic diversifying selection affecting individual codon sites.
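To see why a 2-bin distribution of ω across branches matters, the sketch below contrasts the branch-averaged ω at a site with the ω of its "positive" bin. It is only an illustration of the idea described above for the mixed effects model, not the actual estimation procedure. The class name, field names, and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TwoBinOmega:
    """Hypothetical per-site summary of a two-bin distribution of omega over branches."""
    omega_minus: float   # omega for branches in the negative/neutral bin
    omega_plus: float    # omega for branches in the other bin
    weight_minus: float  # proportion of branches assigned to the first bin

    def mean_omega(self):
        # Average omega over branches, i.e. what a single per-site omega summarizes.
        return (self.weight_minus * self.omega_minus
                + (1.0 - self.weight_minus) * self.omega_plus)

    def has_episodic_signal(self):
        # Some branches (non-zero weight) have omega above 1.
        return self.weight_minus < 1.0 and self.omega_plus > 1.0


if __name__ == "__main__":
    # Illustrative values: 90% of branches strongly conserved, 10% with omega = 5.
    site = TwoBinOmega(omega_minus=0.1, omega_plus=5.0, weight_minus=0.9)
    print(f"mean omega: {site.mean_omega():.2f}")
    print(f"episodic signal: {site.has_episodic_signal()}")
```

With these made-up values the average ω is below 1 even though a tenth of the branches have ω well above 1. A test based on one ω per site would miss this kind of signal.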
Overall signature of selection. PARRIS
An extension of standard likelihood ratio tests to deal with recombinant data. Useful for answering the question: is there evidence of positive selection anywhere in my alignment?
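For readers unfamiliar with likelihood ratio tests, the sketch below shows the generic calculation: twice the log-likelihood difference between nested models, compared against a chi-squared distribution. It is a textbook LRT, not the PARRIS procedure. The models PARRIS compares and its degrees of freedom are not described here, so `df` and the log-likelihood values are placeholders. The function name `lrt_p_value` is hypothetical.

```python
import math


def lrt_p_value(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for a generic likelihood ratio test.

    lnl_null and lnl_alt are maximized log-likelihoods of nested models.
    Only df = 1 or df = 2 is supported, using closed-form chi-squared tails.
    """
    # The alternative model nests the null, so the statistic cannot be negative.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))  # chi-squared tail, 1 d.f.
    elif df == 2:
        p = math.exp(-stat / 2.0)             # chi-squared tail, 2 d.f.
    else:
        raise ValueError("this sketch only handles 1 or 2 degrees of freedom")
    return stat, p


if __name__ == "__main__":
    # Made-up log-likelihoods for a model without and with a positively selected class.
    stat, p = lrt_p_value(lnl_null=-1000.0, lnl_alt=-997.0, df=2)
    print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value would be read as evidence that the richer model fits better, which is the sense in which such a test can indicate positive selection somewhere in the alignment.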
Evolutionary Fingerprinting. ESD
This method fits a versatile general discrete bivariate model of site-to-site variation in selection. The evolutionary fingerprint comprises a description of the number of selective classes, the dN/dS rates for each class and the assignment of sites to classes.
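The three components of a fingerprint listed above (number of classes, rates per class, site assignments) can be pictured as a small data structure. The sketch below is only one way to hold such a result and is not the ESD output format. The class names, field names, and numbers are hypothetical, and it assumes each class carries a synonymous rate, a non-synonymous rate, and a weight.

```python
from dataclasses import dataclass


@dataclass
class RateClass:
    """One selective class: a (dS, dN) pair plus the share of sites in it."""
    ds: float      # synonymous rate
    dn: float      # non-synonymous rate
    weight: float  # proportion of sites in this class

    @property
    def omega(self):
        # dN/dS ratio; infinite when the synonymous rate is zero.
        return self.dn / self.ds if self.ds > 0 else float("inf")


@dataclass
class Fingerprint:
    classes: list        # list of RateClass; its length is the number of classes
    site_to_class: dict  # codon site index -> index into classes

    def sites_with_omega_above(self, threshold=1.0):
        # Sites assigned to a class whose dN/dS exceeds the threshold.
        return sorted(site for site, cls in self.site_to_class.items()
                      if self.classes[cls].omega > threshold)


if __name__ == "__main__":
    fp = Fingerprint(
        classes=[RateClass(ds=1.0, dn=0.2, weight=0.8),
                 RateClass(ds=0.5, dn=1.5, weight=0.2)],
        site_to_class={1: 0, 2: 0, 3: 1},
    )
    print(len(fp.classes), fp.sites_with_omega_above())
```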
Diversifying selection along lineages (branch- and branch-site type models)
There are two available methods to test for diversifying selection along individual lineages:
GA Branch
A genetic algorithm based data mining procedure which automatically partitions all branches in the tree into several selective regimes (and infers the most appropriate regimes), and performs multi-model inference for increased robustness.
Branch Site REL
A principled approach to so-called "branch-site" models, i.e. models that allow evolutionary rate variation along both branches and sites simultaneously. The resulting model allows us to detect lineages on which a subset of sites have evolved under positive selection, without requiring prior knowledge about which lineages are of interest. This approach has better power and accuracy than the popular branch-site model of Yang and Nielsen in nearly all cases.
Lineage specific selection.
Evolutionary interactions between sites. Spidermonkey
Use a Bayesian Graphical Model (BGM) applied to reconstructed evolutionary histories of individual sites to find evidence of co-evolution between sites in an alignment.
Codon Model Selection. CMS
Use a Genetic Algorithm to identify the best model of codon evolution which allows for multiple non-synonymous substitution rates.
Recombination detection. SBP/GARD
Determine whether recombination has acted on your alignment, and identify recombination breakpoints using a Genetic Algorithm.
HIV-1 subtype assignment. SCUEAL
Assign HIV-1 subtypes based on HIV-1 pol alignments.
Ancestral Sequence Reconstruction. ASR
Reconstruct ancestral sequences using three likelihood-based methods.
Other topics.
Post your questions on our user assistance message boards if none of the above topics match your query.
UCSD Viral Evolution Group 2004-2017  