Iowa State University Brigham Young University University of Georgia

Developing an enhanced EST resource for cotton

Here we present a greatly expanded cotton EST assembly containing approximately 4.4 million Sanger and next-generation (454) transcripts. Like previous assemblies [1], it incorporates ESTs from both the A- and D-genome diploid progenitors, along with ESTs from two allopolyploid cotton species, G. barbadense and G. hirsutum. The 56,373 contigs extracted from this assembly substantially broaden the representation of the genic content of cotton. We describe this collection and document its utility for genome-specific transcriptome analysis in allopolyploid cotton. We also characterize the functional properties of the cotton transcriptome and analyze molecular evolution following the most recent whole genome duplication, which accompanied allopolyploid formation 1-2 million years ago [2, 3].

To add depth to the assembly, we also generated ~152 million 82 bp Illumina reads representing the fiber transcriptome of diploid A- and D-genome cotton as well as the allopolyploids G. barbadense and G. hirsutum. Together these resources allow us to detect 259,192 genome-specific SNPs, which in turn can be used to distinguish the A- and D-genome homoeologs in the allopolyploid cotton genome. At the time of writing, allopolyploid cotton is among the most important crops lacking a whole genome sequence; as progress is made in this regard, the EST assembly and genome-specific SNP resources presented here will be of use in assembling and annotating the cotton genome.
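The idea of using genome-specific SNPs to tell homoeologs apart can be illustrated with a minimal Python sketch. It is not the pipeline used for this assembly: the SNP table, the positions, and the `classify_read` function are hypothetical, and the read is assumed to be already aligned to a contig without gaps. A read is assigned to a genome only when every diagnostic site it covers agrees.

```python
from collections import Counter

# Hypothetical diagnostic SNP table for one contig (illustration only):
# 0-based contig position -> (A-genome base, D-genome base).
DIAGNOSTIC_SNPS = {12: ("A", "G"), 40: ("C", "T"), 57: ("T", "C")}


def classify_read(read_start, read_seq, snps=DIAGNOSTIC_SNPS):
    """Assign an ungapped, aligned read to 'A', 'D', or 'unassigned'.

    read_start is the 0-based contig position of the first base of the read.
    """
    votes = Counter()
    for pos, (a_base, d_base) in snps.items():
        offset = pos - read_start
        if 0 <= offset < len(read_seq):  # the read covers this SNP site
            base = read_seq[offset]
            if base == a_base:
                votes["A"] += 1
            elif base == d_base:
                votes["D"] += 1
    # Require unanimous support; conflicting or missing evidence is left unassigned.
    if votes["A"] > 0 and votes["D"] == 0:
        return "A"
    if votes["D"] > 0 and votes["A"] == 0:
        return "D"
    return "unassigned"
```

Reads that cover no diagnostic site, or that carry bases matching both genomes, stay unassigned in this sketch; a real analysis would also have to handle alignment gaps, base quality, and sequencing error.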

Plant material and EST library construction and sequencing

454-FLX and Titanium ESTs were derived from various Gossypium species and tissue types. RNA was independently extracted from each tissue source using a modified hot-borate method (Wilkins and Smart, 1996) and checked for integrity on a Bioanalyzer (Agilent Technologies, Santa Clara, CA). Equimolar amounts of RNA from each extraction were combined into a single sample for cDNA library construction. cDNA libraries were constructed using the SMART method (Clontech, Mountain View, CA), and the resulting amplified, double-stranded libraries were normalized using a double-strand nuclease (Trimmer, Evrogen, Moscow, Russia). To prevent poly-A (or poly-T) homopolymers in the 454 reads, we employed two strategies. The first strategy, applied to the FLX reads, used a Type IIS endonuclease to cleave 18-20 bp of transcript from a modified 3' SMART adapter (K. Delehaunty, personal communication). The second strategy, used for the 454 Titanium reads, employed PCR-suppression oligos to target particular regions in the transcript (5', internal, or 3'). 5', internal, and 3' transcript segments were pooled for cDNA sequencing of the G. raimondii sample. Only 5' and internal segments were pooled for Titanium sequencing of G. hirsutum (Tx2094) and G. barbadense (K101 and S6). DNA sequencing was performed using 454 sequencing (454 Life Sciences, Branford, CT) at the Brigham Young University DNA sequencing center (FLX and Titanium) and Washington University (FLX). The reads have been made publicly available through NCBI's Sequence Read Archive (Study #SRP001603). All publicly available Sanger reads were downloaded from GenBank (Feb. 2009) and filtered for duplicate, vector, and low-complexity sequences. Short ESTs (< 30 bp) and low-quality sequence (quality limit = 0.5) were removed from the 454 reads.
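Two of the read-cleaning steps described above, dropping ESTs shorter than 30 bp and discarding duplicate sequences, can be sketched in a few lines of Python. This is an illustrative assumption about how such a filter might look, not the software actually used; the `filter_reads` function and its input format (name, sequence pairs) are hypothetical, and vector, low-complexity, and quality filtering are not modeled.

```python
MIN_LENGTH = 30  # ESTs shorter than 30 bp are removed, as stated in the text


def filter_reads(reads, min_length=MIN_LENGTH):
    """Drop short reads and exact duplicate sequences, keeping first occurrences.

    reads is an iterable of (name, sequence) pairs; this input format is an
    assumption made for the example.
    """
    seen = set()
    kept = []
    for name, seq in reads:
        seq = seq.upper()  # compare sequences case-insensitively
        if len(seq) < min_length:
            continue  # too short
        if seq in seen:
            continue  # exact duplicate of an earlier read
        seen.add(seq)
        kept.append((name, seq))
    return kept
```

For example, given two identical 40 bp reads, one 20 bp read, and one 30 bp read, the sketch keeps the first 40 bp read and the 30 bp read.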

  • 1) Udall JA, Swanson JM, Haller K, Rapp RA, Sparks ME, Hatfield J, Yu Y, Wu Y, Dowd C, Arpat AB, Sickler BA, Wilkins TA, Guo JY, Chen XY, Scheffler J, Taliercio E, Turley R, McFadden H, Payton P, Klueva N, Allen R, Zhang D, Haigler C, Wilkerson C, Suo J, Schulze SR, Pierce ML, Essenberg M, Kim H, Llewellyn DJ et al.: A global assembly of cotton ESTs. Genome Res 2006, 16:441-450.
  • 2) Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25:1754-1760.
  • 3) Wendel JF, Brubaker CL, Alvarez I, Cronn RC, Stewart JM: Evolution and natural history of the cotton genus. In: Genomics of cotton, plant genetics and genomics; crops and models 3. Edited by Paterson AH. New York: Springer; 2009: 3-22.
