Complete method details can be found in our MBE paper.

Phase 1: Nucleotide model maximum likelihood (ML) fit

A nucleotide model (any model from the time-reversible class can be chosen) is fitted to the data and tree
(either an NJ tree or a user-supplied one) using maximum likelihood **to obtain branch lengths and substitution rates**. If the input alignment contains multiple segments,
base frequencies and substitution rates are inferred **jointly** from the entire alignment, while branch lengths are fitted to each segment separately.
The "best-fitting" model can be determined automatically by a model selection procedure or chosen by the user.

Phase 2: Codon model ML fit

Holding branch lengths proportional to, and substitution rate parameters constant at, the values estimated in Phase 1, a codon model
obtained by crossing MG94 and the nucleotide model of Phase 1 is fitted to the data to obtain **independent rate distributions** for dN
and dS. This method allows for rate heterogeneity in both synonymous and non-synonymous rates by fitting a 3-bin general discrete distribution to dS
and another 3-bin general discrete distribution to dN, yielding 9 possible values for the ratio dN/dS.
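
To make the 3 x 3 construction concrete, here is a minimal Python sketch of how two independent 3-bin discrete distributions combine into 9 rate classes. The rate values and weights are made up for illustration (in the analysis they are estimated by maximum likelihood), and the function name `joint_rate_grid` is hypothetical, not part of any existing tool.

```python
from itertools import product

# Hypothetical (rate, weight) pairs for each 3-bin distribution.
# In a real analysis these are ML estimates, not fixed constants.
ds_classes = [(0.5, 0.3), (1.0, 0.5), (2.0, 0.2)]
dn_classes = [(0.1, 0.6), (0.8, 0.3), (3.0, 0.1)]


def joint_rate_grid(ds_classes, dn_classes):
    """Cross two independent discrete distributions into joint rate classes.

    Because the two distributions are independent, the weight of each
    (dS, dN) cell is the product of the two marginal weights.
    """
    grid = []
    for (ds, p_ds), (dn, p_dn) in product(ds_classes, dn_classes):
        grid.append({"dS": ds, "dN": dn, "omega": dn / ds, "weight": p_ds * p_dn})
    return grid


grid = joint_rate_grid(ds_classes, dn_classes)
for cell in grid:
    print(cell)
```

Each of the 9 cells carries its own dN/dS ratio and a prior weight; the weights sum to 1 because each marginal distribution does.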

Phase 3: Empirical Bayes analysis

For every site, using the parameter estimates from Phases 1 and 2, we compute two Bayes Factors: one for the event that {dN<dS} at that site (negative selection),
and another for the event that {dN>dS} (positive selection). When one of these Bayes Factors is sufficiently large (say 50 or more), we call the site selected.
Note that Bayes Factors cannot, in general, be easily related to statistical significance, although our simulation studies showed respectable power even for
small datasets and reasonable false positive rates. As a rule of thumb, 1/Bayes Factor is analogous to the p-value of the other two tests in this setting.
This method tends to be less conservative and slower than SLAC and FEL.
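
As an illustration of the Bayes Factor step, the sketch below assumes the usual definition: the posterior odds of an event divided by its prior odds, where the prior is the fitted weight of each rate class and the posterior comes from the site likelihood conditional on each class. The function name `bayes_factor`, the three-class setup, and all numbers are hypothetical and chosen only to keep the arithmetic easy to follow; this is not the implementation used by the method.

```python
def bayes_factor(site_likelihoods, weights, in_event):
    """Posterior odds / prior odds for the rate classes flagged in in_event.

    site_likelihoods[k]: likelihood of the site given rate class k
    weights[k]:          prior (fitted) weight of rate class k
    in_event[k]:         True if class k belongs to the event, e.g. dN > dS
    Assumes the prior and posterior of the event are strictly between 0 and 1.
    """
    prior = sum(w for w, e in zip(weights, in_event) if e)
    joint = [lik * w for lik, w in zip(site_likelihoods, weights)]
    posterior = sum(j for j, e in zip(joint, in_event) if e) / sum(joint)
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds


# Hypothetical site with three rate classes; only the last one has dN > dS.
weights = [0.5, 0.4, 0.1]
site_likelihoods = [1e-5, 2e-5, 9e-4]
is_positive = [False, False, True]

bf = bayes_factor(site_likelihoods, weights, is_positive)
print(bf, bf >= 50)
```

The same function gives the Bayes Factor for negative selection when `in_event` flags the classes with dN<dS instead.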

Note

This method is a generalization of the site-by-site positive selection analyses implemented in Ziheng Yang's PAML. The main differences are:

- More general nucleotide bias models
- Modeling of synonymous rate variation as well as non-synonymous rate variation
- Use of Bayes factors for empirical Bayes result processing (although the Bayes Empirical Bayes procedure in recent versions of PAML is better suited to smaller and 'noisier' datasets).