PUBLICATION

Comparative genomics in cyprinids: common carp ESTs help the annotation of the zebrafish genome

Authors
Christoffels, A., Bartfai, R., Srinivasan, H., Komen, H., and Orban, L.
ID
ZDB-PUB-070210-35
Date
2006
Source
BMC Bioinformatics 7 (Suppl 5): S2 (Journal)
Registered Authors
Bartfai, Richard, Orban, Laszlo
Keywords
none
MeSH Terms
  • Algorithms
  • Animals
  • Carps/genetics*
  • Chromosome Mapping/methods
  • Cyprinidae/genetics*
  • Databases, Nucleic Acid
  • Expressed Sequence Tags*
  • Gene Library
  • Genome
  • Genomics/methods*
  • Male
  • Sequence Homology, Nucleic Acid
  • Testis/metabolism
  • Untranslated Regions/genetics
  • Zebrafish/genetics*
PubMed
17254304
Abstract
BACKGROUND : Automatic annotation of sequenced eukaryotic genomes integrates a combination of methodologies such as ab-initio methods and alignment of homologous genes and/or proteins. For example, annotation of the zebrafish genome within Ensembl relies heavily on available cDNA and protein sequences from two distantly related fish species and other vertebrates that have diverged several hundred million years ago. The scarcity of genomic information from other cyprinids provides the impetus to leverage EST collections to understand gene structures in this diverse teleost group.
RESULTS : We have generated 6,050 ESTs from the differentiating testis of common carp (Cyprinus carpio) and clustered them with 9,303 non-gonadal ESTs from CarpBase as well as 1,317 ESTs and 652 common carp mRNAs from GenBank. Over 28% of the resulting 8,663 unique transcripts are exclusively testis-derived ESTs. Moreover, 974 of these transcripts did not match any sequence in the zebrafish or fathead minnow EST collection. A total of 1,843 unique common carp sequences could be stringently mapped to the zebrafish genome (version 5), of which 1,752 matched coding sequences of zebrafish genes with or without potential splice variants. We show that 91 common carp transcripts map to intergenic and intronic regions on the zebrafish genome assembly and regions annotated with non-teleost sequences. Interestingly, an additional 42 common carp transcripts indicate the potential presence of new splicing variants not found in zebrafish databases so far. The fact that common carp transcripts help the identification or confirmation of these coding regions in zebrafish exemplifies the usefulness of sequences from closely related species for the annotation of model genomes. We also demonstrate that 5' UTR sequences of common carp and zebrafish orthologs share a significant level of similarity based on preservation of motif arrangements for as many as 10 ab-initio motifs.
CONCLUSION : Our data show that there is sufficient homology between the transcribed sequences of common carp and zebrafish to warrant an even deeper cyprinid transcriptome comparison. On the other hand, the comparative analysis illustrates the value in utilizing partially sequenced transcriptomes to understand gene structure in this diverse teleost group. We highlight the need for integrated resources to leverage the wealth of fragmented genomic data.