PUBLICATION

Dealing with saturation at the amino acid level: a case study based on anciently duplicated zebrafish genes

Authors
van de Peer, Y., Frickey, T., Taylor, J., and Meyer, A.
ID
ZDB-PUB-021015-17
Date
2002
Source
Gene 295(2): 205-211 (Journal)
Registered Authors
Meyer, Axel, Taylor, John
Keywords
duplicate fish genes; mutational saturation; amino acid sequences
MeSH Terms
  • Amino Acids/genetics*
  • Animals
  • Evolution, Molecular
  • Gene Duplication*
  • Genome
  • Models, Genetic
  • Phylogeny*
  • Zebrafish/genetics*
PubMed
12354655
Abstract
The ray-finned fishes (Actinopterygii) seem to have two copies of many tetrapod (Sarcopterygii) genes. The origin of these duplicate fish genes is the subject of some controversy. One explanation for the existence of these extra fish genes could be an increase in the rate of independent gene duplications in fishes. Alternatively, gene duplicates in fish may have been formed in the ancestor of all or most Actinopterygii during a complete genome duplication event. A third possibility is that tetrapods have lost more genes than fish after gene or genome duplication events in the common ancestor of both lineages. These three hypotheses can be tested by phylogenetic reconstruction. Previously, we found that a large number of anciently duplicated genes of zebrafish are sister sequences in evolutionary trees, suggesting that they were produced in Actinopterygii after the divergence of Sarcopterygii [Phil. Trans. R. Soc. Lond. B 356 (2001) 119]. On the other hand, several well-supported trees showed one of the two fish genes as the sister sequence to a monophyletic clade that included the second fish gene and genes from frog, chicken, mouse and human. These so-called outgroup topologies suggest that the origin of many fish duplicates predates the divergence of the Sarcopterygii and Actinopterygii and support the hypothesis that tetrapods have lost duplicates that have been retained in fish. Here we show that many of these 'outgroup' tree topologies are erroneous and can be corrected when mutational saturation is taken into account. To this end, a Java-based application has been developed to visualize the amount of saturation in amino acid sequences. The program graphically displays the number of observed frequent and rare amino acid replacements between pairs of sequences against their overall evolutionary distance. Discrimination between frequent and rare amino acid replacements is based on substitution probability matrices (e.g. PAM and BLOSUM). Evolutionary distances between sequences can be computed from the fraction of unsaturated sites only, and evolutionary trees can then be inferred by pairwise distance methods. When trees are computed by omitting the saturated fraction of sites, most fish duplicates are sister sequences.
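The pairwise counting described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Java application. The function names, the score threshold of 0, the default score for unlisted pairs, and the use of the uncorrected p-distance as the "overall evolutionary distance" are all assumptions made for illustration. The small score table is a hypothetical toy, not real PAM or BLOSUM values; a real analysis would load a published matrix instead.

```python
from itertools import combinations

# Toy, hypothetical pair scores (NOT real PAM/BLOSUM values).
# Higher score = replacement assumed to be observed more often.
TOY_SCORES = {
    frozenset("IL"): 2,
    frozenset("IV"): 3,
    frozenset("DE"): 2,
    frozenset("KR"): 2,
    frozenset("WG"): -2,
    frozenset("CD"): -3,
}


def classify_replacements(seq_a, seq_b, scores, threshold=0, default=-1):
    """Count frequent and rare replacements between two aligned sequences.

    Returns (frequent, rare, p_distance). A replacement is called
    'frequent' when its score is >= threshold (an assumed cut-off);
    pairs missing from `scores` get `default`. Gap columns are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    frequent = rare = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a == b:
            continue  # identical residues are not replacements
        if scores.get(frozenset((a, b)), default) >= threshold:
            frequent += 1
        else:
            rare += 1
    # Uncorrected p-distance: fraction of compared sites that differ.
    p_distance = (frequent + rare) / compared if compared else 0.0
    return frequent, rare, p_distance


def saturation_points(alignment, scores):
    """Return one (name1, name2, frequent, rare, p_distance) per sequence pair."""
    return [
        (n1, n2, *classify_replacements(s1, s2, scores))
        for (n1, s1), (n2, s2) in combinations(alignment.items(), 2)
    ]
```

Plotting the frequent and rare counts from `saturation_points` against the distance for every pair gives a scatter of the kind the abstract describes. A curve that flattens as distance increases would suggest that this class of replacements is saturated.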