
How does FEL/IFEL infer selection?

Complete details can be found in our MBE paper (FEL) and PLoS Comp Bio paper (IFEL).
Phase 1: Nucleotide model maximum likelihood (ML) fit
A nucleotide model (any model from the time-reversible class can be chosen) is fitted to the data and tree (either neighbor-joining (NJ) or user-supplied) using maximum likelihood to obtain branch lengths and substitution rates. If the input alignment contains multiple segments, base frequencies and substitution rates are inferred jointly from the entire alignment, while branch lengths are fitted to each segment separately. The "best-fitting" model can be determined automatically by a model selection procedure or chosen by the user.
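As an illustration of what "time-reversible" means here, the sketch below builds a general time-reversible (GTR) nucleotide rate matrix from symmetric exchangeabilities and base frequencies. The function name gtr_rate_matrix and all numeric values are hypothetical, chosen only for the example; in Phase 1 these quantities are estimated by maximum likelihood, not set by hand.

```python
from itertools import combinations

NUCS = "ACGT"

def gtr_rate_matrix(exch, freqs):
    """Return Q with q_ij = theta_ij * pi_j (i != j) and rows summing to zero."""
    q = {i: {j: 0.0 for j in NUCS} for i in NUCS}
    for i, j in combinations(NUCS, 2):
        theta = exch[i + j]           # symmetric exchangeability theta_ij = theta_ji
        q[i][j] = theta * freqs[j]    # rate i -> j is weighted by the target frequency
        q[j][i] = theta * freqs[i]
    for i in NUCS:
        q[i][i] = -sum(q[i][j] for j in NUCS if j != i)
    return q

# Hypothetical parameter values, for illustration only.
exch = {"AC": 0.5, "AG": 2.0, "AT": 0.4, "CG": 0.6, "CT": 2.2, "GT": 1.0}
freqs = {"A": 0.35, "C": 0.2, "G": 0.25, "T": 0.2}
Q = gtr_rate_matrix(exch, freqs)
```

Reversibility shows up as detailed balance: freqs[i] * Q[i][j] equals freqs[j] * Q[j][i] for every pair of nucleotides. Simpler reversible models correspond to constraining some of the exchangeabilities to be equal.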
Phase 2: Codon model ML fit
Holding branch lengths proportional to, and substitution rate parameters constant at, the values estimated in Phase 1, a codon model obtained by crossing MG94 with the nucleotide model of Phase 1 is fitted to the data. This yields codon branch lengths, which are used to scale the dN and dS values estimated subsequently at each site.
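For orientation, "crossing" MG94 with a nucleotide model is commonly written as the codon rate matrix below. This is a sketch in assumed notation, not a statement of the exact parameterization used by the software: θ_mn stands for the nucleotide exchangeability from Phase 1, π_n for the frequency of the target nucleotide, and α and β for the synonymous and non-synonymous rates.

```latex
q_{ij} =
\begin{cases}
\alpha \, \theta_{mn} \, \pi_n, & i \to j \text{ is a synonymous change of one nucleotide } m \to n, \\
\beta \, \theta_{mn} \, \pi_n,  & i \to j \text{ is a non-synonymous change of one nucleotide } m \to n, \\
0, & i \text{ and } j \text{ differ at more than one nucleotide position.}
\end{cases}
```

Under this form, the θ_mn values are the ones held constant from Phase 1, so the codon-level fit has few free parameters.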
Phase 3: Site by site likelihood ratio test
For every site, using the parameter estimates from Phases 1 and 2, the MG94-based codon model from Phase 2 is fitted twice: first with two independent parameters, α (instantaneous synonymous site rate) and β (instantaneous non-synonymous site rate), and then under the constraint α=β. Next, a one-degree-of-freedom likelihood ratio test is performed to infer whether α differs from β, and a p-value is derived. If the p-value is significant, the site is classified based on whether α>β (negative selection) or α<β (positive selection).
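The per-site decision described above can be sketched as follows. This is a minimal illustration, assuming the two log-likelihoods and the rate estimates for a site have already been obtained; the function names, the 0.1 default threshold, and the input numbers are hypothetical. The p-value uses the identity that the upper tail of a chi-squared distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

def classify_site(alpha, beta, lnl_alt, lnl_null, threshold=0.1):
    """Classify one site from its unconstrained and constrained (alpha = beta) fits.

    lnl_alt  : log-likelihood with alpha and beta fitted independently
    lnl_null : log-likelihood under the constraint alpha = beta
    threshold: significance level (hypothetical default, chosen by the user)
    """
    lrt = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_1df_pvalue(lrt)
    if p_value > threshold:
        return "neutral", p_value
    return ("negative" if alpha > beta else "positive"), p_value

# Hypothetical numbers for a single site.
label, p = classify_site(alpha=1.0, beta=0.2, lnl_alt=-10.0, lnl_null=-14.0)
```

A site is reported as selected only when the test is significant; the sign of α−β then determines the direction of selection.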
IFEL is essentially the same as FEL, except that selection is tested only along internal branches of the phylogeny. Each site has three parameters: α (synonymous rate), β_I (non-synonymous rate for internal branches) and β_L (non-synonymous rate for terminal branches). The null model now assumes that α=β_I (β_L is unconstrained in both models). This test is appropriate when 'population level' effects are sought.
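Written out, the IFEL comparison at a site takes the form below. The symbols ℓ_0 and ℓ_A for the maximized log-likelihoods are introduced here for illustration, and the one-degree-of-freedom reference distribution is carried over from the FEL description as an assumption, since the null model imposes a single constraint.

```latex
H_0:\ \alpha = \beta_I \quad (\beta_L \text{ free}), \qquad
H_A:\ \alpha,\ \beta_I,\ \beta_L \text{ all free}, \qquad
\mathrm{LR} = 2\,(\ell_A - \ell_0) \ \text{ compared with } \chi^2_1 .
```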
UCSD Viral Evolution Group 2004-2017  