Iowa State University Brigham Young University University of Georgia








Sequencing Results:

An Evaluation of Sequence Capture Technologies with Polyploid Cotton (SRP009879)

Sequence capture is a revolutionary method for resequencing targeted portions of a genome, but it has not been tested with a polyploid plant. We constructed two capture platforms: 1) a novel Nimblegen sequence capture microarray and 2) Mycroarray beads containing RNA probes. Both platforms targeted the same 534 genes. On each platform, we hybridized two different accessions to assess capture efficiency and the potential of combining samples in a single sequencing run using DNA multiplex identifiers (MIDs).

Sequence capture of 532 selected genes from G. hirsutum (SRP009870)

Cotton (G. hirsutum) is a polyploid species native to Central America. It was domesticated by the ancient inhabitants of Central America, and it was quickly adopted by Europeans with colonial agricultural technology. Initially, it was grown around the Caribbean Sea. Feral cultivars from these historical growing regions and native populations have been investigated for genetic diversity with the aim of improving modern cultivated cotton. Here, we use sequence capture to investigate nucleotide diversity within and flanking 532 selected genes from the cotton genome.

Evolutionary Genomics of Cotton - cDNA (SRP001603)

This study includes the generation of a pan-transcriptome from domesticated and wild cotton accessions from both Gossypium hirsutum and G. barbadense.
Description: RNA was extracted from whole seedlings, leaves, roots, floral organs, and fiber of each accession (Acala Maxxa and TX2094, G. hirsutum; K101 and S6, G. barbadense). RNA was pooled in equimolar amounts for each sample. cDNA was generated using a poly-T primer and amplified using the Clontech SMART technology. Samples were normalized using the Evrogen Trimmer Kit (DSN). Some samples were sequenced after cutting off the poly-A tails using MmeI nuclease. For other samples, the 5' and interior portions of the transcripts were preferentially amplified using ligation-mediated PCR suppression. Samples were sequenced on both FLX and Titanium 454 technologies. Sequences of expressed genes were also generated by Illumina sequencing from fiber at 10 and 20 dpa (days post-anthesis). From each stage, sequence was generated from A2 (G. arboreum), D5 (G. raimondii), Acala Maxxa (G. hirsutum), TX2094 (G. hirsutum), Pima-S6 (G. barbadense), and K101 (G. barbadense). Sequence was generated using Illumina's recommended protocols, including cDNA synthesis, cluster generation, and sequencing. These reads were used to identify and validate SNPs between the A and D genomes of diploid cotton and between the A and D genomes of tetraploid cotton.

Additional information and data

Whole genome DNA samples were extracted using a modified CTAB protocol. Twelve samples were tagged with multiplex identifiers (MIDs), each with a different MID. They were independently hybridized to a custom Nimblegen Sequence Capture Array (12-plex). After washing, the samples were all eluted into a single tube and subjected to ligation-mediated PCR (i.e. PCR using the 454 sequence adapters) to amplify the captured fragments to a high enough concentration for sequencing. Subsequently, the libraries were prepared for 454 sequencing (size selection on a Caliper XL, emPCR, bead preparation, etc.). The twelve captured, MID-tagged samples were run on a single 454 plate (2 large regions). The image files from the sequencing run were processed with 'less stringent' settings than the defaults of the 454 software pipeline. Namely, BadFlowThreshold was changed from 4 (default) to 8; LastFlowToTest from 320 (default) to 240; TrimBackScaleFactor from 0.7 (default) to 1; errorQscoreWindowTrim from 0.01 (default) to 0.02; and QScoreTrimBackScaleFactor from 0.9 (default) to 1.0. The samples were prepared using 'traditional' WGS libraries from Roche. Below is a list of the samples and MIDs included in these runs (Capture 1 - 4).

Download links for capture (e.g., fasta and ace) and sample specific (e.g., fasta, fastq and sff) data files are also included in the table.

MID  Forward Sequence  Capture 1  Capture 2  Capture 3  Capture 4
1    ACGAGTGCGT        TX_704     TX_1182    TX_2002    Coker_315
2    ACGCTCGACA        TX_786     TX_44      TX_2090    Cascot_L7
3    AGACGCACTC        TX_953     TX_959     TX_2091    Lkt_511
4    AGCACTGTAG        TX_1009    TX_1110    TX_2092    PM_145
5    ATCAGACACG        TX_1055    TX_1120    TX_2093    FM_958
6    ATATCGCGAG        TX_665     TX_1226    TX_2094    BR_110
7    CGTGTCTCTA        TX_2089    TX_1228    TX_2095    Maxxa
8    CTCGCGTGTC        TX_1037    TX_674     TX_2096    D5
9    TAGTATCAGC        TX_672     TX_1046    ARK_2402   AD3
10   TCTCTATGCG        TX_1107    TX_1982    ST825      GK
11   TGATACGTCT        TX_480     TX_1236    TX_1988    TAM
12   TACTGAGCTA        TX_1748    TX_1996    DP90       A2
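
Because the twelve samples of each capture were pooled before sequencing, reads have to be assigned back to samples by their MID. The following is a minimal sketch of that idea in Python, not the procedure actually used for these data (the Roche software handles MID splitting itself). It assumes that reads are plain strings with the 454 key sequence already removed, that the MID sits at the very start of the read, and that only exact matches count. The names `assign_mid` and `demultiplex` are hypothetical, and only the Capture 4 column of the table is filled in as an example.

```python
# MID barcodes as listed in the table above, keyed by MID number.
MIDS = {
    1: "ACGAGTGCGT", 2: "ACGCTCGACA", 3: "AGACGCACTC", 4: "AGCACTGTAG",
    5: "ATCAGACACG", 6: "ATATCGCGAG", 7: "CGTGTCTCTA", 8: "CTCGCGTGTC",
    9: "TAGTATCAGC", 10: "TCTCTATGCG", 11: "TGATACGTCT", 12: "TACTGAGCTA",
}

# Sample names for one capture, keyed by MID number (Capture 4 column).
CAPTURE_4 = {
    1: "Coker_315", 2: "Cascot_L7", 3: "Lkt_511", 4: "PM_145",
    5: "FM_958", 6: "BR_110", 7: "Maxxa", 8: "D5",
    9: "AD3", 10: "GK", 11: "TAM", 12: "A2",
}


def assign_mid(read, mids=MIDS):
    """Return (mid_number, read_without_mid), or (None, read) if no MID matches.

    Assumption: exact prefix match only, no mismatches tolerated.
    """
    for number, barcode in mids.items():
        if read.startswith(barcode):
            return number, read[len(barcode):]
    return None, read


def demultiplex(reads, samples, mids=MIDS):
    """Group reads by sample name; reads without a known MID go to 'unassigned'."""
    bins = {}
    for read in reads:
        number, trimmed = assign_mid(read.upper(), mids)
        name = samples.get(number, "unassigned")
        bins.setdefault(name, []).append(trimmed)
    return bins


if __name__ == "__main__":
    # Made-up reads for illustration only; they are not from the capture data.
    example_reads = ["ACGAGTGCGTAAAA", "TACTGAGCTAGGGG", "NNNNNNNNNNCCCC"]
    for sample, seqs in demultiplex(example_reads, CAPTURE_4).items():
        print(sample, seqs)
```

Since all twelve MIDs have the same length and are distinct, an exact prefix match can never be ambiguous. Real reads contain sequencing errors, so a production workflow would normally tolerate a small number of mismatches in the MID.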

