Detailed analysis of the indels occurring in the CesA and AdhA regions of AT, DT, D, and A
genomes for all branches
A reanalysis of both regions, with the addition of the newly available outgroup resources, provided the ability to
address genome size evolution on the ancestral branches (i.e. before diploid-polyploid divergence), as well as on
the tips (A, D, AT, and DT alone). By addressing the evolution on the ancestral branches
separately from the tips, we were able to calculate the rates at which these genomes have expanded or contracted
over time, both overall and by specific mechanism. Using those calculated rates, we began to map genome growth and
reduction onto the phylogeny of Gossypium to shed light on the history of genome size change in the genus.
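The rate calculation described above can be illustrated with a minimal sketch. It assumes, as a simplification, that the net rate for a branch is the total nucleotides inserted minus the total nucleotides deleted, divided by the time spent on that branch. The function name and the indel totals are hypothetical and are not values from the analysis.

```python
def net_rate_nt_per_year(inserted_nt, deleted_nt, branch_years):
    """Net genome size change per year along one branch.

    Positive values indicate expansion; negative values indicate contraction.
    """
    if branch_years <= 0:
        raise ValueError("branch duration must be positive")
    return (inserted_nt - deleted_nt) / branch_years


# Hypothetical indel totals, for illustration only (not data from the study).
example_rate = net_rate_nt_per_year(
    inserted_nt=12_000, deleted_nt=4_000, branch_years=2_000_000
)
print(example_rate)  # 0.004 nt gained per year
```

The same function could be applied per mechanism (for example, to TE insertions only) by passing only the indels assigned to that mechanism.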
Mechanisms affecting genome size in Gossypium: From the regions analyzed, we see that transposable elements
have affected genome size to varying degrees along four of the six branches analyzed (pictured below), while the
removal of transposable elements via intra-strand homologous recombination was only detected once. Illegitimate
recombination was present on all six branches, and was biased either toward gain or loss in each genome.
In addition, for all six of the branches analyzed, some insertions and deletions could not be assigned a
category, leaving them in a lumped section ("unknown") that also could be biased toward gain or loss in each genome.
(Note: this "unknown" category could represent some of the other mechanisms noted that have lost their hallmarks
due to subsequent evolution and are now unrecognizable, as well as mechanisms that do not leave hallmarks that we
are aware of). In examining these mechanisms with respect to relative impact along each branch, we saw that there
was no single dominant mechanism of change for the genus. Some branches had clear "winners" that accounted for a
vast majority of the change (e.g. A/AT ancestral branch), while other branches were more evenly spread
(e.g. D/DT ancestral branch); however, all mechanisms of genome size change currently implicated in
other systems have operated in shaping the genomes of Gossypium, although to varying degrees.
Mechanisms affecting genome size during the evolution of Gossypium species are listed along each branch.
The percentages after the mechanism names indicate the proportion of genomic turnover attributed to that
mechanism for that branch.
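As a rough sketch of how such percentages can be derived, the example below assumes that "turnover" for a branch is the sum of the absolute number of nucleotides gained or lost through each mechanism; that definition, the function name, and all numbers are assumptions made for illustration, not values from the figure.

```python
def turnover_shares(nt_by_mechanism):
    """Percentage of total turnover attributed to each mechanism.

    nt_by_mechanism maps a mechanism name to net nucleotides
    (positive for gain, negative for loss) on a single branch.
    """
    total = sum(abs(nt) for nt in nt_by_mechanism.values())
    if total == 0:
        raise ValueError("no turnover recorded for this branch")
    return {name: 100.0 * abs(nt) / total for name, nt in nt_by_mechanism.items()}


# Hypothetical branch, for illustration only.
shares = turnover_shares({
    "TE insertion": 6_000,
    "illegitimate recombination": -1_500,
    "intra-strand homologous recombination": -500,
    "unknown": 2_000,
})
for mechanism, percent in shares.items():
    print(f"{mechanism}: {percent:.0f}%")
```

Using absolute values means that a mechanism that removes sequence counts toward turnover in the same way as one that adds it.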
Rates of genome size evolution in Gossypium: From our earlier data we realized that genome size evolution
in Gossypium was subject to regional phenomena; however, data concerning the rates of evolution for each region
and genome allowed us to better quantify this observation. As shown in the figure below, not only do many
of the rates differ by an order of magnitude between regions within a genome, but for half of the branches, there
is also a difference in direction (expansion or contraction) between regions that was not consistent across genomes
(i.e. neither AdhA nor CesA consistently expanded or contracted across all branches). This indicates that not
only are there regional differences within a genome, allowing some regions to expand while others contract, but those
differences or biases do not need to remain static across evolution (i.e. a region that is expanding along one
branch may end up contracting on a later branch).
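The comparison of direction between regions can be sketched as a simple sign check. The branch labels follow the text, but the rates below are hypothetical placeholders (in nt per year) and the function name is an assumption made for illustration.

```python
def opposite_direction_branches(rates_by_branch):
    """Return the branches on which the two regions changed in opposite directions.

    rates_by_branch maps a branch name to a (CesA rate, AdhA rate) pair,
    where a positive rate means expansion and a negative rate contraction.
    """
    flagged = []
    for branch, (cesa_rate, adha_rate) in rates_by_branch.items():
        # A negative product means one region expanded while the other contracted.
        if cesa_rate * adha_rate < 0:
            flagged.append(branch)
    return flagged


# Hypothetical regional rates, for illustration only.
example_rates = {
    "A": (0.002, 0.0003),
    "AT": (-0.001, 0.0004),
    "D": (0.0005, -0.00002),
    "DT": (0.0001, 0.0002),
}
print(opposite_direction_branches(example_rates))  # ['AT', 'D']
```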
The trend in genome size for Gossypium is typically toward growth, with the notable exception of the
AT genome. The diploid branches all experienced growth, as did the DT genome, although
notably more slowly. Also notable were the rate changes between the ancestral and extant branches; in both cases, the
rate of change on the ancestral branch is faster than the rate of change in the tips of the tree. The greatest
amount of genome size change took place during the ancestral genome evolution, which was not
unexpected given the relative amount of time spent on that branch.
Further information concerning the analysis and the conclusions drawn from these data (including rates partitioned
by region, genome, and mechanism) can be found in
Grover, MBE, 2008; however, the main conclusions from
these data are as follows. For Gossypium, the diploid genomes appear to have achieved their difference in sizes
due primarily to growth. All diploid branches experienced growth, while the polyploid (as a whole) experienced
contraction, which is consistent with its less than additive genome size. The rate of growth itself (in nt per year)
slowed (or reversed) in all cases after diploid-polyploid divergence. This may be due in part to the episodic
nature of TEs, whose proliferation in the A/AT ancestor accounts for most of the genome size differences
between the four extant genomes; however, there is no mechanism in Gossypium that consistently operates to effect
the most change. Intra-strand homologous recombination, for example, had the greatest effect on the AT
genome and was responsible, in large part, for the contraction of the polyploid sequence, although increased
illegitimate recombination in the AT and DT genomes (relative to the ancestor) contributed
The rates of genome loss and gain as inferred by the combined indel data for the CesA and AdhA regions. The
evolutionary relationship and times of divergence between the model diploid progenitors (G. arboreum (A) and
G. raimondii (D)), and the true parents to the polyploid, as well as their subsequent reunion in the polyploid (AD)
are shown. Branch lengths reflect time, and branch thickness indicates change in genome size (filled denotes
sequence gain; open indicates sequence loss). Gossypium diverged from the outgroup
(Gossypioides kirkii, 1C = 590 Mb) approximately 10-15 mya, and A-genome and D-genome cottons diverged from each
other approximately 6.8 mya. The genome groups evolved independently for 5.2 and 4.2 my, respectively, before the
model diploid progenitors diverged from the actual (and extinct) parents of the polyploid 1.6 and 2.6 mya for the
A and D genomes, respectively. Approximately 1.3 mya, the A and D genomes were reunited in a polyploid nucleus,
whose genome size is slightly less than the sum of the 2 model parents. Overall rates of genome size change are
represented by the first line in the green boxes, whereas the individual regional rates are listed independently
underneath. Rates of deletion (d), non-TE insertions (i), and TE insertions (TE) are also listed in the gray