Iowa State University Brigham Young University University of Georgia


Further Information

SNP detection

SNPs were extracted in parallel from two sources: the 454/Sanger contig assembly, and the Illumina mRNA-Seq alignments to the 454/Sanger reference contigs. For the 454/Sanger contig assembly, SNPs were extracted using BioPerl and custom Perl scripts. Within each contig, a consensus sequence for each diploid genome was created from overlapping reads. Individual tetraploid reads were categorized as either At or Dt based on comparisons to the diploid consensus sequences. Consensus At and consensus Dt sequences were then created for each category, and SNPs were identified as differences between these two consensus sequences. For all consensus sequences, the major allele had a frequency greater than 90%. SNP calls were qualified by read counts and quality values, and only qualified SNPs were used in subsequent analyses.
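The consensus-based procedure above can be illustrated with a minimal Python sketch. This is not the project's BioPerl pipeline: the function names, the representation of reads as equal-length gapped strings, and the tie handling are all assumptions made for illustration, and the read-count and quality-value qualification step is omitted. Only the 90% major-allele threshold comes from the text.

```python
from collections import Counter

# From the text: the major allele of a consensus must exceed 90% frequency.
MAJOR_ALLELE_MIN_FREQ = 0.90


def consensus(reads, min_freq=MAJOR_ALLELE_MIN_FREQ):
    """Column-wise consensus of aligned, equal-length reads ('-' = no base).

    A column becomes 'N' when it has no coverage or when its major allele
    does not exceed min_freq. (Hypothetical helper, not the original code.)
    """
    if not reads:
        return ""
    result = []
    for column in zip(*reads):
        bases = [b for b in column if b != "-"]
        if not bases:
            result.append("N")
            continue
        base, count = Counter(bases).most_common(1)[0]
        result.append(base if count / len(bases) > min_freq else "N")
    return "".join(result)


def assign_subgenome(read, a_cons, d_cons):
    """Label a tetraploid read 'At' or 'Dt' by the diploid consensus it
    matches at more diagnostic (A != D) sites; return None on a tie."""
    a_score = d_score = 0
    for base, a, d in zip(read, a_cons, d_cons):
        if base == "-" or "N" in (a, d) or a == d:
            continue  # site carries no information about subgenome
        if base == a:
            a_score += 1
        elif base == d:
            d_score += 1
    if a_score == d_score:
        return None
    return "At" if a_score > d_score else "Dt"


def homoeolog_snps(tetraploid_reads, a_cons, d_cons):
    """Return (position, At base, Dt base) where the two tetraploid
    consensus sequences differ. Positions are 0-based."""
    groups = {"At": [], "Dt": []}
    for read in tetraploid_reads:
        label = assign_subgenome(read, a_cons, d_cons)
        if label is not None:
            groups[label].append(read)
    at_cons = consensus(groups["At"])
    dt_cons = consensus(groups["Dt"])
    return [
        (i, a, d)
        for i, (a, d) in enumerate(zip(at_cons, dt_cons))
        if "N" not in (a, d) and a != d
    ]


# Toy data: one contig, six aligned positions.
a_cons = consensus(["ACGTAC", "ACGTAC"])   # diploid A-genome reads
d_cons = consensus(["ACTTAC", "ACTTAC"])   # diploid D-genome reads
tetraploid = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC"]
snps = homoeolog_snps(tetraploid, a_cons, d_cons)
```

In this toy example the only difference between the At and Dt consensus sequences is at position 2 (G versus T), so that is the single SNP reported. A real pipeline would also have to handle reads of differing length and offset within the contig.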

In parallel, SNPs were extracted from the Illumina mRNA-Seq alignments using the pileup program from the SAMtools package. All Illumina SNP calls were required to pass a pileup SNP-quality score cutoff of 20 (the score is Phred-scaled, so higher values indicate more confident calls). SNPs specific to the diploid A and D genomes (G. arboreum and G. raimondii) and gene losses specific to the allopolyploids can confound homoeolog detection within allopolyploid cotton (see Figure 1 in Salmon et al. [1] for additional description). To overcome this problem, we checked whether both the A and D parental SNP alleles were present among the allopolyploid reads. In so doing, we created two categories: Illumina SNPs with confirmed presence in the allopolyploids and Illumina SNPs with unconfirmed presence. The confirmed Illumina SNPs are likely the result of shared ancestral polymorphisms within the cotton A- and D-genome lineages, and therefore have the highest reliability. The unconfirmed class may include many bona fide SNPs shared by both the parental diploids and the allopolyploids, but we lack the evidence necessary to prove that they are not the result of diploid lineage-specific mutations or allopolyploid gene loss. For this reason we report only the confirmed class of Illumina SNPs.
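The confirmed/unconfirmed split described above amounts to a simple presence test, sketched here in Python. The data layout (dictionaries keyed by position, with the allopolyploid bases at each position given as a string) and the function names are assumptions for illustration only, not the format produced by SAMtools or used by the project.

```python
def classify_snp(a_allele, d_allele, polyploid_bases):
    """Return 'confirmed' when both parental alleles are observed among
    the allopolyploid reads at this position, else 'unconfirmed'."""
    observed = set(polyploid_bases)
    if a_allele in observed and d_allele in observed:
        return "confirmed"
    return "unconfirmed"


def confirmed_snps(diploid_snps, polyploid_pileup):
    """Keep only SNPs whose A and D alleles both occur in the allopolyploid.

    diploid_snps:     {position: (a_allele, d_allele)}   (hypothetical layout)
    polyploid_pileup: {position: bases seen in allopolyploid reads}
    Positions without allopolyploid coverage are treated as unconfirmed.
    """
    return {
        pos: alleles
        for pos, alleles in diploid_snps.items()
        if classify_snp(*alleles, polyploid_pileup.get(pos, "")) == "confirmed"
    }


# Toy data: two diploid A/D SNPs and the allopolyploid bases at each site.
diploid_snps = {10: ("A", "G"), 42: ("C", "T")}
polyploid_pileup = {10: "AAGGA", 42: "CCCC"}
reported = confirmed_snps(diploid_snps, polyploid_pileup)
```

Here position 10 is confirmed because both A and G appear among the allopolyploid reads. Position 42 is unconfirmed because only the A-genome allele C is seen, which could reflect either a diploid lineage-specific mutation or loss of the Dt copy.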

[1] Salmon A, Flagel L, Ying B, Udall JA, Wendel JF (2010). Homoeologous non-reciprocal recombination in polyploid cotton. New Phytologist 186: 123–134.
