Estimating the Frequency of Horizontal Gene Transfer Using Phylogenetic Models of Gene Gain and Loss
We analyze patterns of gene presence and absence in a maximum likelihood framework with rate parameters for gene gain and loss. Standard methods allow independent gains and losses in different parts of a tree. While losses of the same gene are likely to be frequent, multiple gains need to be considered carefully. A gene gain could occur by horizontal transfer or by origin of a gene within the lineage being studied. If a gene is gained more than once, then at least one of these gains must be a horizontal transfer. A key parameter is the ratio of gain to loss rates, a/v. We consider the limiting case known as the infinitely many genes model, where a/v tends to zero and a gene cannot be gained more than once. The infinitely many genes model is used as a null model in comparison to models that allow multiple gains. Using genome data from cyanobacteria and archaea, we find that the likelihood is significantly improved by allowing for multiple gains, but the average a/v is very small. The fraction of genes whose presence/absence pattern is best explained by multiple gains is only 15% in the cyanobacteria and 20% and 39% in two data sets of archaea. The distribution of rates of gene loss is very broad, which explains why many genes follow a treelike pattern of vertical inheritance, despite the presence of a significant minority of genes that undergo horizontal transfer.
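To make the role of the ratio a/v concrete, the following minimal sketch assumes a standard two-state continuous-time Markov model along a single branch, in which a gene is gained at rate a and lost at rate v. The abstract does not state the exact form of the model, so the function name `transition_probs` and the closed-form expressions are illustrative assumptions, not the authors' implementation.

```python
import math

def transition_probs(a, v, t):
    """Presence/absence transition probabilities over a branch of length t.

    Assumes a two-state Markov model (0 = gene absent, 1 = gene present)
    with gain rate a (0 -> 1) and loss rate v (1 -> 0). This is a
    hypothetical illustration; the source does not give these formulas.
    Returns (p00, p01, p10, p11).
    """
    total = a + v
    if total == 0:
        # No gain and no loss: the state never changes.
        return (1.0, 0.0, 0.0, 1.0)
    decay = math.exp(-total * t)
    p01 = (a / total) * (1.0 - decay)  # gene gained along the branch
    p10 = (v / total) * (1.0 - decay)  # gene lost along the branch
    return (1.0 - p01, p01, p10, 1.0 - p10)

# As a/v shrinks toward zero, the probability of a gain on any branch
# vanishes, while the probability of a loss is almost unaffected.
for ratio in (0.1, 0.01, 0.0):
    p00, p01, p10, p11 = transition_probs(a=ratio * 1.0, v=1.0, t=1.0)
    print(f"a/v = {ratio}: P(gain) = {p01:.5f}, P(loss) = {p10:.5f}")
```

Under these assumptions, setting a to zero means a gene that is absent at the start of a branch stays absent. This is the per-branch behaviour one would expect in the limit where a gene cannot be gained more than once.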