United States Patent Application 20110258743
Kind Code A1
Sebastian; Scott; et al.
October 20, 2011
GENETIC LOCI ASSOCIATED WITH IRON DEFICIENCY TOLERANCE IN SOYBEAN
Abstract
The invention relates to methods and compositions for identifying soybean
plants that are tolerant, have improved tolerance or are susceptible to
iron deficient growth conditions. The methods use molecular genetic
markers to identify, select and/or construct low iron-tolerant plants or
identify and counterselect low iron-susceptible plants. Soybean plants
that display tolerance or improved tolerance to low iron growth
conditions that are generated by the methods of the invention are also a
feature of the invention.
Inventors:
Sebastian; Scott (Polk City, IA)
Lu; Hong (Johnston, IA)
Han; Feng (Johnston, IA)
Fabrizius; Martin (Redwood Falls, MN)
Streit; Leon (Johnston, IA)
Serial No.: 162634
Series Code: 13
Filed: June 17, 2011
Current U.S. Class: 800/312; 111/14; 47/58.1SE; 800/298
Class at Publication: 800/312; 800/298; 111/14; 47/58.1SE
International Class: A01H 5/00 20060101 A01H005/00; A01C 7/00 20060101 A01C007/00; A01G 1/00 20060101 A01G001/00; A01H 5/10 20060101 A01H005/10
Claims
1-22. (canceled)
23. A plant comprising in its genome one or more locus related to
tolerance or susceptibility to iron deficiency, wherein the one or more
locus is within a chromosome interval flanked by and including SATT334
and SATT510 or a chromosome interval flanked by and including SATT277 and
SATT433.
24. A plant comprising in its genome one or more locus related to
tolerance or susceptibility to iron deficiency, wherein the one or more
locus is closely linked to a marker selected from the group consisting of
SATT334, SCT_033, SAT_120, SATT510, SAC1724, SATT319,
SAT_142-DB, SATT708-TB, SATT460, P13073A-1, and SATT307.
25. The plant of claim 23, wherein the plant is a soybean line or soybean
variety.
26. The plant of claim 24, wherein the plant is a soybean line or soybean
variety.
27. The plant of claim 23, wherein the tolerance or susceptibility is
assayed in a population of soybean in a stand that is known to produce
chlorotic soybean plants.
28. The plant of claim 24, wherein the tolerance or susceptibility is
assayed in a population of soybean in a stand that is known to produce
chlorotic soybean plants.
29. The plant of claim 23, wherein the plant comprises an elite soybean
strain or an exotic soybean strain.
30. The plant of claim 24, wherein the plant comprises an elite soybean
strain or an exotic soybean strain.
31. A field comprising a plurality of plants of claim 23.
32. A field comprising a plurality of plants of claim 24.
33. A seed of the plant of claim 23.
34. A seed of the plant of claim 24.
35. A method of producing a plant with tolerance or susceptibility to
iron deficiency, the method comprising planting the seed of claim 33 in
soil and growing a plant therefrom.
36. A method of producing a plant with tolerance or susceptibility to
iron deficiency, the method comprising planting the seed of claim 34 in
soil and growing a plant therefrom.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Divisional Application of U.S. Ser. No.
11/200,539 filed Aug. 8, 2005 which claims priority to and benefit of
U.S. Provisional Patent Application Ser. No. 60/599,497, filed on Aug. 6,
2004, and U.S. Provisional Patent Application Ser. No. 60/599,379, filed
on Aug. 6, 2004, which are hereby incorporated by reference in their
entirety.
FIELD OF THE INVENTION
[0002] Iron deficiency pathology in soybean is manifested as iron
deficiency chlorosis (FEC or IDC). The invention relates to compositions
and methods for identifying soybean plants that are tolerant, have
improved tolerance or are susceptible to iron-deficient growth
conditions, where the methods use molecular genetic markers to identify,
select and/or construct low iron-tolerant plants. The invention also
relates to the soybean plants that display tolerance or improved
tolerance to low iron growth conditions that are generated by the methods
of the invention.
BACKGROUND OF THE INVENTION
[0003] Soybean, a legume, has become the world's primary source of seed
oil and seed protein. In addition, its utilization is being expanded to
the industrial, manufacturing and pharmaceutical sectors. Soybean
productivity is a vital agricultural and economic consideration.
Improving soybean tolerance to diverse and/or adverse growth conditions
is crucial for maximizing yields.
Iron Deficiency Chlorosis
[0004] Iron-deficiency chlorosis (IDC; alternatively, FEC) reduces
soybean yields, particularly on calcareous or other high pH soils. IDC
develops in soybean due to a lack of chlorophyll in the leaves of
affected plants, manifesting as yellowing of the leaves. Iron is required
for the synthesis of chlorophyll and, although iron is sufficiently
present in most soils, it is often in an insoluble form that cannot be
used by the plant. Iron deficiency occurs in soils due to high pH, high
salt content, cool temperatures or other environmental factors that
decrease iron solubility. Studies have shown that even mild IDC symptoms
are an indication that yield is being negatively affected (Fehr (1982)
Journal of Plant Nutrition, 611-621.)
[0005] Iron is found in soil mainly as oxyhydroxide polymers (FeOOH)
that are extremely insoluble (10^-17 M) at neutral pH. Since the optimal
concentration of soluble Fe for plant growth is approximately 10^-6 M,
plants have evolved two different strategies to mine the iron they need
from soil (Fox and Guerinot (1998) "Molecular biology of cation
transport in plants," Annu. Rev. Plant Physiol. Plant Mol. Biol.
49:669-96).
[0006] So-called "Strategy I" is used by all plants except grasses. This
strategy involves a two-step process. In the first step, the oxidized
iron Fe(III) is reduced to the more soluble Fe(II) by a membrane-bound
ferric chelate reductase located in root epidermal cells. This reductase
activity is inducible and necessary for iron uptake under iron deficient
conditions (Yi and Guerinot (1996) Plant Journal 10:835-844). A gene FRO2
that encodes such a ferric chelate reductase enzyme has been identified
and sequenced in Arabidopsis (Robinson et al, Nature 397:694-697, 1999).
Following the reduction step, a separate transport protein is required to
move the reduced iron across the root plasma membrane. A gene IRT1 (iron
regulated transporter) which codes for the transport protein has also
been found in Arabidopsis (Eide et al., PNAS 93:5624-5628). This same
transport protein has been shown to transport manganese, zinc, and cobalt
as well (Korshunova et al., Plant Mol. Biology 40:37-44, 1999). In
addition to this two-step process, Strategy I plants also acidify the
soil by exuding protons from the roots via the conversion of ATP to ADP
within the roots. This lowers the pH in the rhizosphere and makes the
iron oxides more soluble.
[0007] While iron availability can, to an extent, be modulated
environmentally (e.g., by modifying soil pH or adding soluble iron,
applying foliar iron treatments, or applying iron to seed), these
approaches can cause unwanted side effects in the soybean or the
environment and also add to soybean production costs. Some treatments,
such as iron treatment of seed, display inconsistent results in different
cultivars or field environments. Despite these difficulties, most
producers currently rely on the use of seed, foliar, or soil treatments
to reduce IDC (Wiersma (2002) "Iron Deficiency Chlorosis (IDC) In
Soybean," Cropping Issues in Northwest Minnesota 1(7):1-2; Goos and
Germain (2001) "Solubility of Twelve Iron Fertilizer Products in Alkaline
Soils," Communications in Soil Science and Plant Analysis 32:2317-2323).
[0008] For some time, soybean producers have sought to develop IDC
tolerant plants as a cost-effective alternative or supplement to standard
foliar, soil and/or seed treatments (e.g., Hintz et al. (1987)
"Population development for the selection of high-yielding soybean
cultivars with resistance to iron deficiency chlorosis," Crop Sci.
28:369-370). Recent studies also suggest that cultivar selection is more
reliable and universally applicable than foliar sprays or iron seed
treatment methods, though environmental and cultivar selection methods
can also be used effectively in combination. See also, Goos and Johnson
(2000) "A Comparison of Three Methods for Reducing Iron-Deficiency
Chlorosis in Soybean" Agronomy Journal 92:1135-1139; and Goos and Johnson
"Seed Treatment, Seeding Rate, and Cultivar Effects on Iron Deficiency
Chlorosis of Soybean" Journal of Plant Nutrition 24 (8) 1255-1268.
[0009] The advent of molecular genetic markers has facilitated mapping and
selection of agriculturally important traits in soybean. Markers tightly
linked to disease tolerance genes are an asset in the rapid
identification of tolerant soybean lines on the basis of genotype by the
use of marker assisted selection (MAS). Introgressing disease tolerance
genes into a desired cultivar would also be facilitated by using suitable
DNA markers.
[0010] Soybean cultivar improvement for IDC tolerance can be performed
using classical breeding methods, or, more preferably, using marker
assisted selection (MAS). Genetic markers for IDC
tolerance/susceptibility have been identified (e.g., Lin et al. (2000)
"Molecular characterization of iron deficiency chlorosis in soybean"
Journal of Plant Nutrition 23:1929-1939). Recent work suggests that
marker assisted selection is particularly beneficial when selecting
plants for IDC tolerance, because the strength of environmental effects
on chlorosis expression impedes progress in improving IDC resistance. See
also, Charlson et al., "Associating SSR Markers with Soybean Resistance
to Iron Chlorosis," Journal of Plant Nutrition, vol. 26, nos. 10 & 11;
2267-2276 (2003).
Molecular Markers and Marker Assisted Selection
[0011] A genetic map is a graphical representation of a genome (or a
portion of a genome such as a single chromosome) where the distances
between landmarks on the chromosome are measured by the recombination
frequencies between the landmarks. A genetic landmark can be any of a
variety of known polymorphic markers, for example but not limited to,
molecular markers such as SSR markers, RFLP markers, or SNP markers.
Furthermore, SSR markers can be derived from genomic or expressed nucleic
acids (e.g., ESTs). The nature of these physical landmarks and the
methods used to detect them vary, but all of these markers are physically
distinguishable from each other (as well as from the plurality of alleles
of any one particular marker) on the basis of polynucleotide length
and/or sequence.
[0012] Although specific DNA sequences which encode proteins are generally
well-conserved across a species, other regions of DNA (typically
non-coding) tend to accumulate polymorphism, and therefore, can be
variable between individuals of the same species. Such regions provide
the basis for numerous molecular genetic markers. In general, any
differentially inherited polymorphic trait (including nucleic acid
polymorphism) that segregates among progeny is a potential marker. The
genomic variability can be of any origin, for example, insertions,
deletions, duplications, repetitive elements, point mutations,
recombination events, or the presence and sequence of transposable
elements. A large number of soybean molecular markers are known in the
art, and are published or available from various sources, such as the
SOYBASE internet resource. Similarly, numerous methods for detecting
molecular markers are also well-established.
[0013] The primary motivation for developing molecular marker technologies
from the point of view of plant breeders has been the possibility to
increase breeding efficiency through marker assisted selection (MAS). A
molecular marker allele that demonstrates linkage disequilibrium with a
desired phenotypic trait (e.g., a quantitative trait locus, or QTL, such
as resistance to a particular disease) provides a useful tool for the
selection of a desired trait in a plant population. The key components to
the implementation of this approach are: (i) the creation of a dense
genetic map of molecular markers, (ii) the detection of QTL based on
statistical associations between marker and phenotypic variability, (iii)
the definition of a set of desirable marker alleles based on the results
of the QTL analysis, and (iv) the use and/or extrapolation of this
information to the current set of breeding germplasm to enable
marker-based selection decisions to be made.
[0014] The availability of integrated linkage maps of the soybean genome
containing increasing densities of public soybean markers has facilitated
soybean genetic mapping and MAS. See, e.g., Cregan et al. (1999) "An
Integrated Genetic Linkage Map of the Soybean Genome" Crop Sci.
39:1464-1490; Song et al., "A New Integrated Genetic Linkage Map of the
Soybean," Theor. Appl. Genet., 109:122-128 (2004); Diwan and Cregan
(1997) "Automated sizing of fluorescent-labeled simple sequence repeat
(SSR) markers to assay genetic variation in Soybean," Theor. Appl.
Genet., 95:220-225; the SOYBASE resources on the world wide web,
including the Shoemaker Lab Home Page and other resources that can be
accessed through SOYBASE; and see the Soybean Genomics and Improvements
Laboratory (SGIL) on the world wide web.
[0015] Two types of markers are frequently used in marker assisted
selection protocols, namely simple sequence repeat (SSR, also known as
microsatellite) markers, and single nucleotide polymorphism (SNP)
markers. The term SSR refers generally to any type of molecular
heterogeneity that results in length variability, and most typically is a
short (up to several hundred base pairs) segment of DNA that consists of
multiple tandem repeats of a two or three base-pair sequence. These
repeated sequences result in highly polymorphic DNA regions of variable
length due to poor replication fidelity, e.g., caused by polymerase
slippage. SSRs appear to be randomly dispersed through the genome and are
generally flanked by conserved regions. SSR markers can also be derived
from RNA sequences (in the form of a cDNA, a partial cDNA or an EST) as
well as genomic material.
[0016] The characteristics of SSR heterogeneity make them well suited for
use as molecular genetic markers; namely, SSR genomic variability is
inherited, is multiallelic, codominant and is reproducibly detectable.
The proliferation of increasingly sophisticated amplification-based
detection techniques (e.g., PCR-based) provides a variety of sensitive
methods for the detection of nucleotide sequence heterogeneity. Primers
(or other types of probes) are designed to hybridize to conserved regions
that flank the SSR domain, resulting in the amplification of the variable
SSR region. The different sized amplicons generated from an SSR region
have characteristic and reproducible sizes. The different sized SSR
amplicons observed from two homologous chromosomes in an individual, or
from different individuals in the plant population are generally termed
"marker alleles." As long as there exist at least two SSR alleles that
produce PCR products with at least two different sizes, the SSRs can be
employed as a marker.
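As a rough illustration of how SSR amplicon sizes translate into marker alleles, the following Python sketch bins measured fragment lengths into allele calls and labels a plant as homozygous or heterozygous at one locus. The allele sizes, the 1 bp matching tolerance and all function names are hypothetical and are not taken from this application; it is a sketch under those assumptions, not a description of any particular genotyping platform.

```python
def call_alleles(fragment_sizes, known_alleles, tolerance=1.0):
    """Map measured amplicon sizes (bp) to the nearest known allele size.

    Returns a tuple of the distinct allele calls. A measurement that is not
    within `tolerance` bp of any known allele is reported as None.
    """
    calls = set()
    for size in fragment_sizes:
        nearest = min(known_alleles, key=lambda a: abs(a - size))
        calls.add(nearest if abs(nearest - size) <= tolerance else None)
    # Sort numerically, placing any None (uncallable) entry last.
    return tuple(sorted(calls, key=lambda a: (a is None, a or 0)))


def zygosity(calls):
    """Classify a codominant marker: one allele size or two."""
    if not calls or None in calls or len(calls) > 2:
        return "uncallable"
    return "homozygous" if len(calls) == 1 else "heterozygous"


# Hypothetical allele sizes (bp) at a single SSR locus.
KNOWN = [148, 152, 160]
plant_a = call_alleles([148.3, 147.9], KNOWN)  # both chromosomes give ~148 bp
plant_b = call_alleles([152.2, 159.6], KNOWN)  # two different amplicon sizes
plant_c = call_alleles([155.0], KNOWN)         # matches no known allele
```

Because SSR markers are codominant, both amplicon sizes of a heterozygous plant are visible, which is why the sketch can distinguish the two cases from fragment sizes alone.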
[0017] Soybean markers that rely on single nucleotide polymorphisms (SNPs)
are also well known in the art. Various techniques have been developed
for the detection of SNPs, including allele specific hybridization (ASH;
see, e.g., Coryell et al., (1999) "Allele specific hybridization markers
for soybean," Theor. Appl. Genet., 98:690-696). Additional types of
molecular markers are also widely used, including but not limited to
expressed sequence tags (ESTs) and SSR markers derived from EST
sequences, restriction fragment length polymorphism (RFLP), amplified
fragment length polymorphism (AFLP), randomly amplified polymorphic DNA
(RAPD) and isozyme markers. A wide range of protocols are known to one of
skill in the art for detecting this variability, and these protocols are
frequently specific for the type of polymorphism they are designed to
detect. Examples include PCR amplification, single-strand conformation
polymorphisms (SSCP) and self-sustained sequence replication (3SR; see
Chan and Fox, "NASBA and other transcription-based amplification methods
for research and diagnostic microbiology," Reviews in Medical
Microbiology 10:185-196 [1999]).
[0018] Linkage of one molecular marker to another molecular marker is
measured as a recombination frequency. In general, the closer two loci
(e.g., two SSR markers) are on the genetic map, the closer they lie to
each other on the physical map. A relative genetic distance (determined
by crossing over frequencies, measured in centimorgans; cM) is generally
proportional to the physical distance (measured in base pairs, e.g.,
kilobase pairs [kb] or megabasepairs [Mbp]) that two linked loci are
separated from each other on a chromosome. A lack of precise
proportionality between cM and physical distance can result from
variation in recombination frequencies for different chromosomal regions,
e.g., some chromosomal regions are recombinational "hot spots," while
other regions do not show any recombination, or only demonstrate rare
recombination events. In general, the closer one marker is to another
marker, whether measured in terms of recombination or physical distance,
the more strongly they are linked. In some aspects, the closer a
molecular marker is to a gene that encodes a polypeptide that imparts a
particular phenotype (e.g., disease tolerance), whether measured in terms of
recombination or physical distance, the better that marker serves to tag
the desired phenotypic trait.
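The relationship between recombination frequency and map distance in centimorgans can be sketched numerically. The Python example below uses the Haldane and Kosambi mapping functions, which are standard in genetic mapping but are not named in this application, so their use here is an assumption made for illustration. The function names are likewise hypothetical.

```python
import math


def _check(r):
    """Recombination fractions between linked loci lie in [0, 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r):
    """Haldane map distance (cM); assumes crossovers occur independently."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance (cM); allows for crossover interference."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# For tightly linked loci, 1% recombination is close to 1 cM under either
# function; the two functions diverge as the loci move farther apart.
near = haldane_cm(0.01)
far_haldane = haldane_cm(0.10)
far_kosambi = kosambi_cm(0.10)
```

Neither function converts centimorgans to base pairs; as the paragraph above notes, that proportionality varies between chromosomal regions.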
[0019] Genetic mapping variability can also be observed between different
populations of the same crop species, including soybean. In spite of this
variability in the genetic map that may occur between populations,
genetic map and marker information derived from one population generally
remains useful across multiple populations in identification of plants
with desired traits, counter-selection of plants with undesirable traits
and in guiding MAS.
QTL Mapping
[0020] It is the goal of the plant breeder to select plants and enrich the
plant population for individuals that have desired traits, for example,
pathogen tolerance, leading ultimately to increased agricultural
productivity. It has been recognized for quite some time that specific
chromosomal loci (or intervals) can be mapped in an organism's genome
that correlate with particular quantitative phenotypes. Such loci are
termed quantitative trait loci, or QTL. The plant breeder can
advantageously use molecular markers to identify desired individuals by
identifying marker alleles that show a statistically significant
probability of co-segregation with a desired phenotype (e.g., pathogenic
infection tolerance), manifested as linkage disequilibrium. By
identifying a molecular marker or clusters of molecular markers that
co-segregate with a quantitative trait, the breeder is thus identifying a
QTL. By identifying and selecting a marker allele (or desired alleles
from multiple markers) that associates with the desired phenotype, the
plant breeder is able to rapidly select a desired phenotype by selecting
for the proper molecular marker allele (a process called marker-assisted
selection, or MAS). The more molecular markers that are placed on the
genetic map, the more potentially useful that map becomes for conducting
MAS.
[0021] Multiple experimental paradigms have been developed to identify and
analyze QTL (see, e.g., Jansen (1996) Trends Plant Sci 1:89). The
majority of published reports on QTL mapping in crop species have been
based on the use of the bi-parental cross (Lynch and Walsh (1997)
Genetics and Analysis of Quantitative Traits, Sinauer Associates,
Sunderland). Typically, these paradigms involve crossing one or more
parental pairs, which can be, for example, a single pair derived from two
inbred strains, or multiple related or unrelated parents of different
inbred strains or lines, which each exhibit different characteristics
relative to the phenotypic trait of interest. Typically, this
experimental protocol involves deriving 100 to 300 segregating progeny
from a single cross of two divergent inbred lines (e.g., selected to
maximize phenotypic and molecular marker differences between the lines).
The parents and segregating progeny are genotyped for multiple marker
loci and evaluated for one to several quantitative traits (e.g., disease
resistance). QTL are then identified as significant statistical
associations between genotypic values and phenotypic variability among
the segregating progeny. The strength of this experimental protocol comes
from the utilization of the inbred cross, because the resulting F1
parents all have the same linkage phase. Thus, after selfing of the F1
plants, all segregating progeny (F2) are informative and linkage
disequilibrium is maximized, the linkage phase is known, there are only
two QTL alleles, and, except for backcross progeny, the frequency of each
QTL allele is 0.5.
[0022] Numerous statistical methods for determining whether markers are
genetically linked to a QTL (or to another marker) are known to those of
skill in the art and include, e.g., standard linear models, such as ANOVA
or regression mapping (Haley and Knott (1992) Heredity 69:315), maximum
likelihood methods such as expectation-maximization algorithms, (e.g.,
Lander and Botstein (1989) "Mapping Mendelian factors underlying
quantitative traits using RFLP linkage maps," Genetics 121:185-199;
Jansen (1992) "A general mixture model for mapping quantitative trait
loci by using molecular markers," Theor. Appl. Genet., 85:252-260; Jansen
(1993) "Maximum likelihood in a generalized linear finite mixture model
by using the EM algorithm," Biometrics 49:227-231; Jansen (1994) "Mapping
of quantitative trait loci by using genetic markers: an overview of
biometrical models," In J. W. van Ooijen and J. Jansen (eds.), Biometrics
in Plant breeding: applications of molecular markers, pp. 116-124,
CPRO-DLO Netherlands; Jansen (1996) "A general Monte Carlo method for
mapping multiple quantitative trait loci," Genetics 142:305-311; and
Jansen and Stam (1994) "High resolution of quantitative traits into
multiple loci via interval mapping," Genetics 136:1447-1455). Exemplary
statistical methods include single point marker analysis, interval
mapping (Lander and Botstein (1989) Genetics 121:185), composite interval
mapping, penalized regression analysis, complex pedigree analysis, MCMC
analysis, MQM analysis (Jansen (1994) Genetics 138:871), HAPLO-IM+
analysis, HAPLO-MQM analysis, and HAPLO-MQM+ analysis, Bayesian MCMC,
ridge regression, identity-by-descent analysis, Haseman-Elston
regression, any of which are suitable in the context of the present
invention. Additional details regarding alternative
statistical methods applicable to complex breeding populations which can
be used to identify and localize QTLs are described in: U.S. Ser. No.
09/216,089 by Beavis et al. "QTL MAPPING IN PLANT BREEDING POPULATIONS"
and PCT/US00/34971 by Jansen et al. "MQM MAPPING USING HAPLOTYPED
PUTATIVE QTLS ALLELES: A SIMPLE APPROACH FOR MAPPING QTLS IN PLANT
BREEDING POPULATIONS." All of these approaches are computationally
intensive and are usually performed with the assistance of a
computer-based system and specialized software. Appropriate statistical packages
are available from a variety of public and commercial sources, and are
known to those of skill in the art.
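Of the methods listed above, single point marker analysis with a standard linear model is the simplest to illustrate. The Python sketch below computes a one-way ANOVA F statistic and the fraction of phenotypic variance explained by genotype class at one marker. The genotype labels, the chlorosis scores and the function name are hypothetical illustration data, not results from this application, and the sketch assumes at least two genotype classes with some within-class variation.

```python
from collections import defaultdict


def single_marker_anova(genotypes, phenotypes):
    """One-way ANOVA of phenotype on marker genotype class.

    Returns (F statistic, R^2), where R^2 is the share of phenotypic
    variance explained by the marker.
    """
    groups = defaultdict(list)
    for genotype, score in zip(genotypes, phenotypes):
        groups[genotype].append(score)

    n, k = len(phenotypes), len(groups)
    grand_mean = sum(phenotypes) / n
    means = {g: sum(v) / len(v) for g, v in groups.items()}

    ss_between = sum(len(v) * (means[g] - grand_mean) ** 2
                     for g, v in groups.items())
    ss_within = sum((y - means[g]) ** 2
                    for g, v in groups.items() for y in v)

    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    r_squared = ss_between / (ss_between + ss_within)
    return f_stat, r_squared


# Hypothetical segregating progeny: marker genotype and chlorosis score.
GENOTYPES = ["AA"] * 4 + ["AB"] * 4 + ["BB"] * 4
SCORES = [2, 3, 2, 3, 4, 5, 4, 5, 7, 6, 7, 6]
f_stat, r_squared = single_marker_anova(GENOTYPES, SCORES)
```

A large F statistic at a marker suggests that a QTL is linked to it. Interval mapping and the other methods listed refine this idea by testing positions between markers, and are better handled with the specialized software the text mentions.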
[0023] There is a need in the art for improved soybean strains that are
tolerant to iron-deficient growth conditions. There is a need in the art
for methods that identify soybean plants or populations (germplasm) that
display tolerance to iron-deficient growth conditions. What is needed in
the art is to identify molecular genetic markers that co-segregate with
low-iron tolerance loci (e.g., tolerance QTL) in order to facilitate
MAS, and also to facilitate gene discovery and cloning of gene alleles
that impart tolerance to low iron growth conditions. Such markers can be
used to select individual plants and plant populations that show
favorable marker alleles in soybean populations and then employed to
select the tolerant phenotype, or alternatively, be used to counterselect
plants or plant populations that show a low-iron susceptibility
phenotype. The present invention provides these and other advantages.
SUMMARY OF THE INVENTION
[0024] Compositions and methods for identifying soybean plants or
germplasm with tolerance to low iron growth conditions are provided.
Methods of making soybean plants or germplasm that are tolerant to low
iron growth conditions, e.g., through introgression of desired tolerance
marker alleles and/or by transgenic production methods, as well as plants
and germplasm made by these methods, are also provided. Systems and kits
for selecting tolerant plants and germplasm are also a feature of the
invention.
[0025] Low iron growth conditions can produce plant pathology termed iron
deficiency chlorosis (IDC) or iron chlorosis (FEC). The identification
and selection of soybean plants that show tolerance to low iron growth
conditions using MAS can provide an effective and environmentally
friendly approach to overcoming losses caused by this disease. The
present invention provides a number of soybean marker loci and QTL
chromosome intervals that demonstrate statistically significant
co-segregation with tolerance to low iron growth conditions. Detection of
these QTL markers or additional loci linked to the QTL markers can be
used in marker-assisted soybean breeding programs to produce tolerant
plants, or plants with improved tolerance.
[0026] In some aspects, the invention provides methods for identifying a
first soybean plant or germplasm (e.g., a line or variety) that has
tolerance, improved tolerance or susceptibility to low iron growth
conditions. In the methods, at least one allele of one or more marker
loci (e.g., a plurality of marker loci) that is associated with the
tolerance, improved tolerance or susceptibility is detected in the first
soybean plant or germplasm. The marker loci can be selected from the loci
provided in FIG. 1, including: S60210-TB, SAC1006, SATT391, SAC1724,
SATT307, P13073A-1, P10598A-1, SATT334, SATT510, SATT335, P5219A-1,
P7659A-2, SAT_117, SATT191, S60143-TB, SATT451, SATT367, SATT495,
P10649C-3, SATT613, SATT257, SATT581 and SATT153, as well as any other
marker that is closely linked to these QTL markers (e.g., within about 10
cM of these loci). The invention also provides chromosomal QTL intervals
that correlate with low iron tolerance. These intervals are located on
linkage groups C1, F, G, I, L and M. Any marker located within these
intervals also finds use as a marker for low iron tolerance. These
intervals include any marker locus localizing within a chromosome
interval flanked by and including:
[0027] (a) S60210-TB and SATT391 (LG-C1);
[0028] (b) P10598A-1 and SATT334 (LG-F);
[0029] (c) SATT510 and SATT335 (LG-F);
[0030] (d) P5219A-1 and P7659A-2 (LG-G);
[0031] (e) SAT_117 and S60143-TB (LG-G);
[0032] (f) SATT451 and SATT367 (LG-I);
[0033] (g) SATT495 and P10649C-3 (LG-L); and
[0034] (h) SATT250 and SATT346 (LG-M).
[0035] A plurality of marker loci can be selected in the same plant. Which
QTL markers are selected in combination is not particularly limited. The
QTL markers used in combinations can be any of the markers listed in FIG.
1, any other marker that is closely linked to the markers in FIG. 1
(e.g., the closely linked markers as determined from FIG. 4 and FIG. 5,
or determined from the SOYBASE resource), or any marker within the QTL
intervals described herein.
[0036] The markers that are linked to the QTL markers of the invention
(e.g., those markers provided in FIG. 1) are closely linked, for example,
within about 10 cM from the QTL markers. In desirable embodiments, the
linked locus displays a genetic recombination distance of 9, 8, 7, 6, 5,
4, 3, 2, 1, 0.75, 0.5 or 0.25 centiMorgans, or less, from the QTL marker.
In some embodiments, the closely linked locus is selected from the list
of marker loci determined from FIG. 6 or FIG. 6.
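The two ways of choosing additional markers described above (any marker inside an interval "flanked by and including" two markers, and any marker within about 10 cM of a QTL marker) can be sketched as simple lookups on a linkage map. In the Python example below, the marker names, the linkage group and every map position are invented placeholders; they are not positions of the markers named in this application.

```python
def markers_in_interval(positions, left, right):
    """Markers lying between two flanking markers, flanks included."""
    low, high = sorted((positions[left], positions[right]))
    return sorted(m for m, cm in positions.items() if low <= cm <= high)


def closely_linked(positions, qtl_marker, max_cm=10.0):
    """Other markers within `max_cm` centimorgans of the given QTL marker."""
    center = positions[qtl_marker]
    return sorted(m for m, cm in positions.items()
                  if m != qtl_marker and abs(cm - center) <= max_cm)


# Hypothetical linkage group: marker name -> map position in cM.
LG_X = {"mk_A": 12.0, "mk_B": 18.5, "mk_C": 27.0, "mk_D": 41.3, "mk_E": 44.0}

interval = markers_in_interval(LG_X, "mk_B", "mk_D")
linked_10 = closely_linked(LG_X, "mk_C")            # default 10 cM window
linked_5 = closely_linked(LG_X, "mk_D", max_cm=5.0)  # tighter 5 cM window
```

Tightening `max_cm` corresponds to the progressively smaller recombination distances listed in the paragraph above.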
[0037] In some embodiments, preferred QTL markers are selected from
SAC1724, SATT307, P13073A-1, P10598A-1, SATT334, SATT495, P10649C-3,
SATT613 and SATT257.
[0038] In some embodiments, the germplasm is a soybean line or variety. In
some aspects, the tolerance or improved tolerance is a non-race specific
tolerance or a non-race specific improved tolerance. In some aspects, the
tolerance, improved tolerance or susceptibility of a soybean plant to low
iron growth conditions can be quantitated using any suitable means, for
example, by assaying soybean pathology in a field where disease is known
to occur naturally.
[0039] Any of a variety of techniques can be used to identify a marker
allele. It is not intended that the method of allele detection be limited
in any way. Methods for allele detection typically include molecular
identification methods such as amplification and detection of the marker
amplicon. For example, an allelic form of a polymorphic simple sequence
repeat (SSR), or of a single nucleotide polymorphism (SNP) can be
detected, e.g., by an amplification based technology. In these and other
amplification based detection methods, the marker locus or a portion of
the marker locus is amplified (e.g., via PCR, LCR or transcription using
a nucleic acid isolated from a soybean plant of interest as a template)
and the resulting amplified marker amplicon is detected. In one example
of such an approach, an amplification primer or amplification primer pair
is admixed with genomic nucleic acid isolated from the first soybean
plant or germplasm, wherein the primer or primer pair is complementary or
partially complementary to at least a portion of the marker locus, and is
capable of initiating DNA polymerization by a DNA polymerase using the
soybean genomic nucleic acid as a template. The primer or primer pair
(e.g., a primer pair provided in FIG. 2 or 3) is extended in a DNA
polymerization reaction having a DNA polymerase and a template genomic
nucleic acid to generate at least one amplicon. In any case, data
representing the detected allele(s) can be transmitted (e.g.,
electronically or via infrared, wireless or optical transmission) to a
computer or computer readable medium for analysis or storage. In some
embodiments, plant RNA is the template for the amplification reaction. In
other embodiments, plant genomic DNA is the template for the
amplification reaction. In some embodiments, the QTL marker is a SNP type
marker, and the detected allele is a SNP allele, and the method of
detection is allele specific hybridization (ASH).
[0040] In some embodiments, the allele that is detected is a favorable
allele that positively correlates with tolerance or improved tolerance.
In the case where more than one marker is selected, an allele is selected
for each of the markers; thus, two or more alleles are selected. In some
embodiments, it can be the case that a marker locus will have more than
one advantageous allele, and in that case, either allele can be selected.
[0041] It will be appreciated that the ability to identify QTL marker loci
alleles that correlate with tolerance, improved tolerance or
susceptibility of a soybean plant to low iron growth conditions provides
a method for selecting plants that have favorable marker loci as well.
That is, any plant that is identified as comprising a desired marker
locus (e.g., a marker allele that positively correlates with tolerance)
can be selected for, while plants that lack the locus, or that have a
locus that negatively correlates with tolerance, can be selected against.
Thus, in one method, subsequent to identification of a marker locus, the
methods include selecting (e.g., isolating) the first soybean plant or
germplasm, or selecting a progeny of the first plant or germplasm. In
some embodiments, the resulting selected first soybean plant or germplasm
can be crossed with a second soybean plant or germplasm (e.g., an elite
or exotic soybean, depending on characteristics that are desired in the
progeny).
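By way of a non-limiting illustration, the selection step described above can be sketched as a simple filter over genotype calls. The plant identifiers, allele labels ("A" and "B") and the function name select_plants are hypothetical and are not taken from the specification; the sketch assumes allele calls are already available for each plant.

```python
def select_plants(genotypes, marker, favorable_allele):
    """Return IDs of plants carrying at least one copy of the favorable allele.

    genotypes maps a plant ID to a dict of marker name -> tuple of allele calls.
    """
    selected = []
    for plant_id, calls in genotypes.items():
        alleles = calls.get(marker, ())
        if favorable_allele in alleles:
            selected.append(plant_id)
    return selected


# Hypothetical allele calls at marker SATT334 (allele labels are invented).
genotypes = {
    "plant_1": {"SATT334": ("A", "A")},
    "plant_2": {"SATT334": ("A", "B")},
    "plant_3": {"SATT334": ("B", "B")},
}

keep = select_plants(genotypes, "SATT334", "A")     # selected for
discard = [p for p in genotypes if p not in keep]   # selected against
```

Plants in keep could then be advanced or crossed, while plants in discard would be counterselected.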
[0042] Similarly, in other embodiments, if an allele is correlated with
tolerance or improved tolerance to low iron growth conditions, the method
can include introgressing the allele into a second soybean plant or
germplasm to produce an introgressed soybean plant or germplasm. In some
embodiments, the second soybean plant or germplasm will typically display
reduced tolerance to low iron growth conditions as compared to the first
soybean plant or germplasm, while the introgressed soybean plant or
germplasm will display an increased tolerance to low iron growth
conditions as compared to the second plant or germplasm. An introgressed
soybean plant or germplasm produced by these methods is also a feature
of the invention.
[0043] In other aspects, various mapping populations are used to determine
the linked markers of the invention. In one embodiment, the mapping
population used is the population derived from the cross UP1C6-43/90B73.
In other embodiments, other mapping populations can be used. In other
aspects, various software is used in determining linked marker loci. For
example, TASSEL, GeneFlow and MapManager all find use with the invention.
In some embodiments, such as when software is used in the linkage
analysis, the detected allele information (i.e., the data) is
electronically transmitted or electronically stored, for example, in a
computer readable medium.
[0044] In addition to introgressing selected marker alleles into desired
genetic backgrounds, transgenic approaches can also be used to produce
plants or germplasm that are tolerant to low iron growth conditions. For
example, in some aspects, the invention provides methods of producing a
soybean plant having tolerance or improved tolerance to low iron growth
conditions, the methods comprising introducing an exogenous nucleic acid
into a target soybean plant or progeny thereof, wherein the exogenous
nucleic acid is derived from a nucleotide sequence that is linked to at
least one favorable allele of one or more marker locus that is associated
with tolerance or improved tolerance to low iron growth conditions. In
some embodiments, the marker locus can be selected from: S60210-TB,
SAC1006, SATT391, SAC1724, SATT307, P13073A-1, P10598A-1, SATT334,
SATT510, SATT335, P5219A-1, P7659A-2, SAT.sub.--117, SATT191, S60143-TB,
SATT451, SATT367, SATT495, P10649C-3, SATT613, SATT257, SATT581 and
SATT153, as well as any other marker that is closely linked (e.g.,
demonstrating not more than 10% recombination frequency) to these QTL
markers; and furthermore, any marker locus that is located within the
chromosomal QTL intervals flanked by and including:
[0045] (a) S60210-TB and SATT391 (LG-C1);
[0046] (b) P10598A-1 and SATT334 (LG-F);
[0047] (c) SATT510 and SATT335 (LG-F);
[0048] (d) P5219A-1 and P7659A-2 (LG-G);
[0049] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0050] (f) SATT451 and SATT367 (LG-I);
[0051] (g) SATT495 and P10649C-3 (LG-L); and
[0052] (h) SATT250 and SATT346 (LG-M).
[0053] In some embodiments, preferred QTL markers used in these transgenic
plant methods are selected from SAC1724, SATT307, P13073A-1, P10598A-1,
SATT334, SATT495, P10649C-3, SATT613 and SATT257.
[0054] In some embodiments, a plurality of marker loci can be used to
construct the transgenic plant. Which QTL markers are used in combination
is not particularly limited. The QTL markers used in combinations can be
any of the markers listed in FIG. 1, any other marker that is linked to
the markers in FIG. 1 (e.g., the linked markers as determined from FIGS.
5 and 6, or determined from the SOYBASE resource), or any markers
selected from the QTL intervals described herein.
[0055] Any of a variety of methods can be used to provide the exogenous
nucleic acid to the soybean plant. In one method, the nucleotide sequence
is isolated by positional cloning, and is identified by linkage to the
favorable allele. The precise composition of the exogenous nucleic acid
can vary; in one embodiment, the exogenous nucleic acid corresponds to an
open reading frame (ORF) that encodes a polypeptide that, when expressed
in a soybean plant, results in the soybean plant having tolerance or
improved tolerance to iron-deficient growth conditions. The exogenous
nucleic acid optionally comprises an expression vector to provide for
expression of the exogenous nucleic acid in the plant.
[0056] In other aspects, various mapping populations are used to determine
the linked markers that find use in constructing the transgenic plant. In
one embodiment, the mapping population used is the population derived
from the cross UP1C6-43/90B73. In other embodiments, other populations
can be used. In other aspects, various software is used in determining
linked marker loci used to construct the transgenic plant. For example,
TASSEL, GeneFlow and MapManager-QTX all find use with the invention.
[0057] Systems for identifying a soybean plant predicted to have tolerance
or improved tolerance to iron-deficient growth conditions are also a
feature of the invention. Typically, the system can include a set of
marker primers and/or probes configured to detect at least one favorable
allele of one or more marker locus associated with tolerance or improved
tolerance to iron-deficient growth conditions, wherein the marker locus
or loci are selected from: S60210-TB, SAC1006, SATT391, SAC1724, SATT307,
P13073A-1, P10598A-1, SATT334, SATT510, SATT335, P5219A-1, P7659A-2,
SAT.sub.--117, SATT191, S60143-TB, SATT451, SATT367, SATT495, P10649C-3,
SATT613, SATT257, SATT581 and SATT153, as well as any other marker that
is closely linked (e.g., demonstrating not more than 10% recombination
frequency) to these QTL markers; and furthermore, any marker locus that
is located within the chromosomal QTL intervals flanked by and including:
[0058] (a) S60210-TB and SATT391 (LG-C1);
[0059] (b) P10598A-1 and SATT334 (LG-F);
[0060] (c) SATT510 and SATT335 (LG-F);
[0061] (d) P5219A-1 and P7659A-2 (LG-G);
[0062] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0063] (f) SATT451 and SATT367 (LG-I);
[0064] (g) SATT495 and P10649C-3 (LG-L); and
[0065] (h) SATT250 and SATT346 (LG-M).
In some embodiments, preferred QTL markers used in these transgenic plant
methods are selected from SAC1724, SATT307, P13073A-1, P10598A-1,
SATT334, SATT495, P10649C-3, SATT613 and SATT257.
[0066] Where a system that performs marker detection or correlation is
desired, the system can also include a detector that is configured to
detect one or more signal outputs from the set of marker probes or
primers, or amplicon thereof, thereby identifying the presence or absence
of the allele; and/or system instructions that correlate the presence or
absence of the favorable allele with the predicted tolerance. The precise
configuration of the detector will depend on the type of label used to
detect the marker allele. Typical embodiments include light detectors,
radioactivity detectors, and the like. Detection of the light emission or
other probe label is indicative of the presence or absence of a marker
allele. Similarly, the precise form of the instructions can vary
depending on the components of the system, e.g., they can be present as
system software in one or more integrated unit of the system, or can be
present in one or more computers or computer readable media operably
coupled to the detector. In one typical embodiment, the system
instructions include at least one look-up table that includes a
correlation between the presence or absence of the favorable allele and
predicted tolerance, improved tolerance or susceptibility.
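As one illustrative sketch of such a look-up table, the system instructions might map a detected marker allele to a predicted phenotype. The table entries, the allele labels and the function name predict are assumptions made only for illustration; the specification does not disclose a particular table layout.

```python
# Hypothetical look-up table: (marker, allele) -> predicted phenotype.
# Allele labels and predictions are placeholders, not data from the specification.
LOOKUP = {
    ("SATT334", "A"): "tolerant",
    ("SATT334", "B"): "susceptible",
}


def predict(marker, detected_alleles, table=LOOKUP):
    """Return the set of predictions for the alleles detected at one marker.

    Alleles that are absent from the table are reported as "unknown".
    """
    return {table.get((marker, allele), "unknown") for allele in detected_alleles}
```

A heterozygous sample returns both predictions, which leaves any dominance assumption to the caller rather than encoding one in the table.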
[0067] In some embodiments, the system can be comprised of separate
elements or can be integrated into a single unit for convenient detection
of marker alleles and for performing marker-tolerance trait
correlations. In some embodiments, the system can also include a sample,
for example, genomic DNA, amplified genomic DNA, cDNA, amplified cDNA,
RNA, or amplified RNA from soybean or from a selected soybean plant
tissue.
[0068] Kits are also a feature of the invention. For example, a kit can
include appropriate primers or probes for detecting tolerance associated
marker loci and instructions for using the primers or probes for detecting
the marker loci and correlating the loci with predicted low iron
tolerance. The kits can further include packaging materials for packaging
the probes, primers or instructions, controls such as control
amplification reactions that include probes, primers or template nucleic
acids for amplifications, molecular size markers, or the like.
[0069] In other aspects, the invention provides nucleic acid compositions
that are the novel EST-derived SSR QTL markers of the invention. For
example, the invention provides compositions comprising an amplification
primer pair capable of initiating DNA polymerization by a DNA polymerase
on a soybean nucleic acid template to generate a soybean marker amplicon,
where the marker amplicon corresponds to a soybean marker selected from
S60210-TB, S60143-TB and S60392-TB, and further where the composition
comprises a primer pair that is specific for the marker.
DEFINITIONS
[0070] Before describing the present invention in detail, it is to be
understood that this invention is not limited to particular embodiments,
which can, of course, vary. It is also to be understood that the
terminology used herein is for the purpose of describing particular
embodiments only, and is not intended to be limiting. As used in this
specification and the appended claims, terms in the singular and the
singular forms "a," "an" and "the," for example, include plural referents
unless the content clearly dictates otherwise. Thus, for example,
reference to "plant," "the plant" or "a plant" also includes a plurality
of plants; also, depending on the context, use of the term "plant" can
also include genetically similar or identical progeny of that plant; use
of the term "a nucleic acid" optionally includes, as a practical matter,
many copies of that nucleic acid molecule; similarly, the term "probe"
optionally (and typically) encompasses many similar or identical probe
molecules.
[0071] Unless otherwise indicated, nucleic acids are written left to right
in 5' to 3' orientation. Numeric ranges recited within the specification
are inclusive of the numbers defining the range and include each integer
or any non-integer fraction within the defined range. Unless defined
otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to
which the invention pertains. Although any methods and materials similar
or equivalent to those described herein can be used in the practice or
testing of the present invention, the preferred materials and methods are
described herein. In describing and claiming the present invention, the
following terminology will be used in accordance with the definitions set
out below.
[0072] A "plant" can be a whole plant, any part thereof, or a cell or
tissue culture derived from a plant. Thus, the term "plant" can refer to
any of whole plants, plant components or organs (e.g., leaves, stems,
roots, etc.), plant tissues, seeds, plant cells, and/or progeny of the
same. A plant cell is a cell of a plant, taken from a plant, or derived
through culture from a cell taken from a plant. Thus, the term "soybean
plant" includes whole soybean plants, soybean plant cells, soybean plant
protoplast, soybean plant cell or soybean tissue culture from which
soybean plants can be regenerated, soybean plant calli, soybean plant
clumps and soybean plant cells that are intact in soybean plants or parts
of soybean plants, such as soybean seeds, soybean pods, soybean flowers,
soybean cotyledons, soybean leaves, soybean stems, soybean buds, soybean
roots, soybean root tips and the like.
[0073] "Germplasm" refers to genetic material of or from an individual
(e.g., a plant), a group of individuals (e.g., a plant line, variety or
family), or a clone derived from a line, variety, species, or culture.
The germplasm can be part of an organism or cell, or can be separate from
the organism or cell. In general, germplasm provides genetic material
with a specific molecular makeup that provides a physical foundation for
some or all of the hereditary qualities of an organism or cell culture.
As used herein, germplasm includes cells, seed or tissues from which new
plants may be grown, or plant parts, such as leaves, stems, pollen, or
cells, that can be cultured into a whole plant.
[0074] The term "allele" refers to one of two or more different nucleotide
sequences that occur at a specific locus. For example, a first allele can
occur on one chromosome, while a second allele occurs on a second
homologous chromosome, e.g., as occurs for different chromosomes of a
heterozygous individual, or between different homozygous or heterozygous
individuals in a population. A "favorable allele" is the allele at a
particular locus that confers, or contributes to, an agronomically
desirable phenotype, e.g., tolerance to Phytophthora infection, or
alternatively, is an allele that allows the identification of susceptible
plants that can be removed from a breeding program or planting. A
favorable allele of a marker is a marker allele that segregates with the
favorable phenotype, or alternatively, segregates with susceptible plant
phenotype, therefore providing the benefit of identifying disease-prone
plants. A favorable allelic form of a chromosome segment is a chromosome
segment that includes a nucleotide sequence that contributes to superior
agronomic performance at one or more genetic loci physically located on
the chromosome segment. "Allele frequency" refers to the frequency
(proportion or percentage) at which an allele is present at a locus
within an individual, within a line, or within a population of lines. For
example, for an allele "A," diploid individuals of genotype "AA," "Aa,"
or "aa" have allele frequencies of 1.0, 0.5, or 0.0, respectively. One
can estimate the allele frequency within a line by averaging the allele
frequencies of a sample of individuals from that line. Similarly, one can
calculate the allele frequency within a population of lines by averaging
the allele frequencies of lines that make up the population. For a
population with a finite number of individuals or lines, an allele
frequency can be expressed as a count of individuals or lines (or any
other specified grouping) containing the allele.
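The allele frequency arithmetic described above can be sketched as follows. The function names are hypothetical, and genotypes are assumed to be written as two-character strings such as "Aa".

```python
def individual_frequency(genotype, allele):
    """Frequency of `allele` within one diploid genotype such as "Aa"."""
    return genotype.count(allele) / len(genotype)


def line_frequency(genotypes, allele):
    """Estimate the frequency within a line by averaging over sampled individuals."""
    return sum(individual_frequency(g, allele) for g in genotypes) / len(genotypes)


def population_frequency(line_frequencies):
    """Average the per-line frequencies of the lines that make up a population."""
    return sum(line_frequencies) / len(line_frequencies)
```

For example, a sample of four individuals with genotypes "AA", "Aa", "aa" and "Aa" yields a line frequency of 0.5 for allele "A".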
[0075] An allele "positively" correlates with a trait when it is linked to
it and when presence of the allele is an indicator that the desired trait
or trait form will occur in a plant comprising the allele. An allele
negatively correlates with a trait when it is linked to it and when
presence of the allele is an indicator that a desired trait or trait form
will not occur in a plant comprising the allele.
[0076] An individual is "homozygous" if the individual has only one type
of allele at a given locus (e.g., a diploid individual has a copy of the
same allele at a locus for each of two homologous chromosomes). An
individual is "heterozygous" if more than one allele type is present at a
given locus (e.g., a diploid individual with one copy each of two
different alleles). The term "homogeneity" indicates that members of a
group have the same genotype at one or more specific loci. In contrast,
the term "heterogeneity" is used to indicate that individuals within the
group differ in genotype at one or more specific loci.
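These definitions can be expressed, in one hedged sketch, as simple predicates over allele calls. The function names and the representation of a genotype as a tuple of allele labels are assumptions made for illustration.

```python
def zygosity(alleles):
    """Classify one locus of one individual from its allele calls."""
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


def is_homogeneous(group):
    """True when every member of the group has the same genotype at the locus.

    Allele order within a genotype is ignored, so ("A", "a") equals ("a", "A").
    """
    return len({tuple(sorted(genotype)) for genotype in group}) <= 1
```

A group for which is_homogeneous returns False is heterogeneous at that locus in the sense defined above.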
[0077] A "locus" is a chromosomal region where a polymorphic nucleic acid,
trait determinant, gene or marker is located. Thus, for example, a "gene
locus" is a specific chromosome location in the genome of a species where
a specific gene can be found.
[0078] The term "quantitative trait locus" or "QTL" refers to a
polymorphic genetic locus with at least two alleles that differentially
affect the expression of a phenotypic trait in at least one genetic
background, e.g., in at least one breeding population or progeny.
[0079] The terms "marker," "molecular marker," "marker nucleic acid," and
"marker locus" refer to a nucleotide sequence or encoded product thereof
(e.g., a protein) used as a point of reference when identifying a linked
locus. A marker can be derived from genomic nucleotide sequence or from
expressed nucleotide sequences (e.g., from a spliced RNA, a cDNA, etc.),
or from an encoded polypeptide. The term also refers to nucleic acid
sequences complementary to or flanking the marker sequences, such as
nucleic acids used as probes or primer pairs capable of amplifying the
marker sequence. A "marker probe" is a nucleic acid sequence or molecule
that can be used to identify the presence of a marker locus, e.g., a
nucleic acid probe that is complementary to a marker locus sequence.
Alternatively, in some aspects, a marker probe refers to a probe of any
type that is able to distinguish (i.e., genotype) the particular allele
that is present at a marker locus. Nucleic acids are "complementary" when
they specifically hybridize in solution, e.g., according to Watson-Crick
base pairing rules. A "marker locus" is a locus that can be used to track
the presence of a second linked locus, e.g., a linked locus that encodes
or contributes to expression of a phenotypic trait. For example, a marker
locus can be used to monitor segregation of alleles at a locus, such as a
QTL, that are genetically or physically linked to the marker locus. Thus,
a "marker allele," alternatively an "allele of a marker locus" is one of
a plurality of polymorphic nucleotide sequences found at a marker locus
in a population that is polymorphic for the marker locus. In some
aspects, the present invention provides marker loci correlating with
tolerance to Phytophthora infection in soybean. Each of the identified
markers is expected to be in close physical and genetic proximity
(resulting in physical and/or genetic linkage) to a genetic element,
e.g., a QTL, that contributes to tolerance.
[0080] "Genetic markers" are nucleic acids that are polymorphic in a
population and where the alleles of which can be detected and
distinguished by one or more analytic methods, e.g., RFLP, AFLP, isozyme,
SNP, SSR, and the like. The terms "genetic marker" and "molecular marker"
refer to a genetic locus (a "marker locus") that can be used as a point
of reference when identifying a genetically linked locus such as a QTL.
Such a marker is also referred to as a QTL marker. The term also refers
to nucleic acid sequences complementary to the genomic sequences, such as
nucleic acids used as probes.
[0081] Markers corresponding to genetic polymorphisms between members of a
population can be detected by methods well-established in the art. These
include, e.g., PCR-based sequence specific amplification methods,
detection of restriction fragment length polymorphisms (RFLP), detection
of isozyme markers, detection of polynucleotide polymorphisms by allele
specific hybridization (ASH), detection of amplified variable sequences
of the plant genome, detection of self-sustained sequence replication,
detection of simple sequence repeats (SSRs), detection of single
nucleotide polymorphisms (SNPs), or detection of amplified fragment
length polymorphisms (AFLPs). Well-established methods are also known for
the detection of expressed sequence tags (ESTs) and SSR markers derived
from EST sequences and randomly amplified polymorphic DNA (RAPD).
[0082] A "genetic map" is a description of genetic linkage relationships
among loci on one or more chromosomes (or linkage groups) within a given
species, generally depicted in a diagrammatic or tabular form. "Genetic
mapping" is the process of defining the linkage relationships of loci
through the use of genetic markers, populations segregating for the
markers, and standard genetic principles of recombination frequency. A
"genetic map location" is a location on a genetic map relative to
surrounding genetic markers on the same linkage group where a specified
marker can be found within a given species. In contrast, a physical map
of the genome refers to absolute distances (for example, measured in base
pairs or isolated and overlapping contiguous genetic fragments, e.g.,
contigs). A physical map of the genome does not take into account the
genetic behavior (e.g., recombination frequencies) between different
points on the physical map.
[0083] A "genetic recombination frequency" is the frequency of a crossing
over event (recombination) between two genetic loci. Recombination
frequency can be observed by following the segregation of markers and/or
traits following meiosis. A genetic recombination frequency can be
expressed in centimorgans (cM), where one cM is the distance between two
genetic markers that show a 1% recombination frequency (i.e., a
crossing-over event occurs between those two markers once in every 100
meiotic cell divisions).
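As a minimal sketch, a recombination frequency can be estimated from counts of parental-type and recombinant-type progeny, and then converted to an approximate map distance. The function names and the progeny counts are hypothetical. The direct conversion of 1% recombination to 1 cM is an approximation that holds for short distances; over longer distances a mapping function (e.g., Haldane or Kosambi) is ordinarily applied, which this sketch does not do.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of scored progeny that are recombinant between two loci."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one progeny must be scored")
    return recombinant / total


def approx_map_distance_cm(rf):
    """Approximate map distance in cM, assuming 1% recombination equals 1 cM."""
    return 100.0 * rf


# Hypothetical counts: 95 parental-type and 5 recombinant-type progeny.
rf = recombination_frequency(95, 5)
distance = approx_map_distance_cm(rf)
```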
[0084] As used herein, the term "linkage" is used to describe the degree
with which one marker locus is "associated with" another marker locus or
some other locus (for example, a tolerance locus).
[0085] As used herein, linkage equilibrium describes a situation where two
markers independently segregate, i.e., sort among progeny randomly.
Markers that show linkage equilibrium are considered unlinked (whether or
not they lie on the same chromosome).
[0086] As used herein, linkage disequilibrium describes a situation where
two markers segregate in a non-random manner, i.e., have a recombination
frequency of less than 50% (and by definition, are separated by less than
50 cM on the same linkage group). Markers that show linkage
disequilibrium are considered linked. Linkage occurs when the marker
locus and a linked locus are found together in progeny plants more
frequently than not together in the progeny plants. As used herein,
linkage can be between two markers, or alternatively between a marker and
a phenotype. A marker locus can be associated with (linked to) a trait,
e.g., a marker locus can be associated with tolerance or improved
tolerance to a plant pathogen when the marker locus is in linkage
disequilibrium with the tolerance trait. The degree of linkage of a
molecular marker to a phenotypic trait (e.g., a QTL) is measured, e.g.,
as a statistical probability of co-segregation of that molecular marker
with the phenotype.
[0087] As used herein, the linkage relationship between a molecular marker
and a phenotype is given as a "probability" or "adjusted probability."
The probability value is the statistical likelihood that the particular
combination of a phenotype and the presence or absence of a particular
marker allele is random. Thus, the lower the probability score, the
greater the likelihood that a phenotype and a particular marker will
co-segregate. In some aspects, the probability score is considered
"significant" or "nonsignificant." In some embodiments, a probability
score of 0.05 (p=0.05, or a 5% probability) of random assortment is
considered a significant indication of co-segregation. However, the
present invention is not limited to this particular standard, and an
acceptable probability can be any probability of less than 50% (p=0.5).
For example, a significant probability can be less than 0.25, less than
0.20, less than 0.15, or less than 0.1.
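The specification does not name a particular statistical test. Purely as an illustration, the following sketch uses a Pearson chi-square test (one degree of freedom, no continuity correction) on a 2x2 table of marker allele presence versus phenotype. The function names and the counts are hypothetical; the 0.05 threshold is the one given in the text above.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    Rows: allele present / absent. Columns: tolerant / susceptible.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: 40 tolerant and 10 susceptible plants carry the allele;
# 10 tolerant and 40 susceptible plants lack it.
chi2 = chi_square_2x2(40, 10, 10, 40)
p = p_value_1df(chi2)
significant = p < 0.05
```

A low p value indicates that the observed association between allele and phenotype is unlikely to be random, consistent with the interpretation of the probability score given above.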
[0088] The term "linkage disequilibrium" refers to a non-random
segregation of genetic loci or traits (or both). In either case, linkage
disequilibrium implies that the relevant loci are within sufficient
physical proximity along a length of a chromosome so that they segregate
together with greater than random (i.e., non-random) frequency (in the
case of co-segregating traits, the loci that underlie the traits are in
sufficient proximity to each other). Linked loci co-segregate more than
50% of the time, e.g., from about 51% to about 100% of the time. The term
"physically linked" is sometimes used to indicate that two loci, e.g.,
two marker loci, are physically present on the same chromosome.
[0089] Advantageously, the two linked loci are located in close proximity
such that recombination between homologous chromosome pairs does not
occur between the two loci during meiosis with high frequency, e.g., such
that linked loci co-segregate at least about 90% of the time, e.g., 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.75%, or more of the
time.
[0090] The phrase "closely linked," in the present application, means that
recombination between two linked loci occurs with a frequency of equal to
or less than about 10% (i.e., are separated on a genetic map by not more
than 10 cM). Put another way, the closely linked loci co-segregate at
least 90% of the time. Marker loci are especially useful in the present
invention when they demonstrate a significant probability of
co-segregation (linkage) with a desired trait (e.g., pathogenic
tolerance). For example, in some aspects, these markers can be termed
linked QTL markers. In other aspects, especially useful molecular markers
are those markers that are linked or closely linked to QTL markers.
[0091] In some aspects, linkage can be expressed as any desired limit or
range. For example, in some embodiments, two linked loci are two loci
that are separated by less than 50 cM map units. In other embodiments,
linked loci are two loci that are separated by less than 40 cM. In other
embodiments, two linked loci are two loci that are separated by less than
30 cM. In other embodiments, two linked loci are two loci that are
separated by less than 25 cM. In other embodiments, two linked loci are
two loci that are separated by less than 20 cM. In other embodiments, two
linked loci are two loci that are separated by less than 15 cM. In some
aspects, it is advantageous to define a bracketed range of linkage, for
example, between 10 and 20 cM, or between 10 and 30 cM, or between 10 and
40 cM.
[0092] The more closely a marker is linked to a second locus, the better
an indicator for the second locus that marker becomes. Thus, in one
embodiment, closely linked loci such as a marker locus and a second locus
(e.g., a QTL marker) display an inter-locus recombination frequency of
10% or less, preferably about 9% or less, still more preferably about 8%
or less, yet more preferably about 7% or less, still more preferably
about 6% or less, yet more preferably about 5% or less, still more
preferably about 4% or less, yet more preferably about 3% or less, and
still more preferably about 2% or less. In highly preferred embodiments,
the relevant loci (e.g., a marker locus and a QTL marker) display a
recombination frequency of about 1% or less, e.g., about 0.75% or less,
more preferably about 0.5% or less, or yet more preferably about 0.25% or
less. Two loci that are localized to the same chromosome, and at such a
distance that recombination between the two loci occurs at a frequency of
less than 10% (e.g., about 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.75%,
0.5%, 0.25%, or less) are also said to be "proximal to" each other. In
some cases, two different markers can have the same genetic map
coordinates. In that case, the two markers are in such close proximity to
each other that recombination occurs between them with such low frequency
that it is undetectable.
[0093] When referring to the relationship between two genetic elements,
such as a genetic element contributing to tolerance and a proximal
marker, "coupling" phase linkage indicates the state where the
"favorable" allele at the tolerance locus is physically associated on the
same chromosome strand as the "favorable" allele of the respective linked
marker locus. In coupling phase, both favorable alleles are inherited
together by progeny that inherit that chromosome strand. In "repulsion"
phase linkage, the "favorable" allele at the locus of interest (e.g., a
QTL for tolerance) is physically linked with an "unfavorable" allele at
the proximal marker locus, and the two "favorable" alleles are not
inherited together (i.e., the two loci are "out of phase" with each
other).
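The distinction between coupling and repulsion phase can be sketched as a check on the alleles carried by a single chromosome strand. The function name, the allele labels and the representation of a strand as a (tolerance locus allele, marker allele) pair are assumptions made for illustration.

```python
def linkage_phase(strand, favorable_qtl, favorable_marker):
    """Classify the phase of one chromosome strand.

    strand is a (tolerance_locus_allele, marker_allele) pair found on the
    same strand. Strands lacking the favorable tolerance allele return None.
    """
    qtl_allele, marker_allele = strand
    if qtl_allele != favorable_qtl:
        return None
    return "coupling" if marker_allele == favorable_marker else "repulsion"
```

In coupling phase, selecting for the favorable marker allele also selects for the favorable tolerance allele; in repulsion phase it selects against it.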
[0094] As used herein, the terms "chromosome interval" or "chromosome
segment" designate a contiguous linear span of genomic DNA that resides
in planta on a single chromosome. The genetic elements or genes located
on a single chromosome interval are physically linked. The size of a
chromosome interval is not particularly limited.
[0095] In some aspects, for example in the context of the present
invention, generally the genetic elements located within a single
chromosome interval are also genetically linked, typically within a
genetic recombination distance of, for example, less than or equal to 20
centimorgan (cM), or alternatively, less than or equal to 10 cM. That is,
two genetic elements within a single chromosome interval undergo
recombination at a frequency of less than or equal to 20% or 10%.
[0096] In one aspect, any marker of the invention is linked (genetically
and physically) to any other marker that is at or less than 50 cM
distant. In another aspect, any marker of the invention is closely linked
(genetically and physically) to any other marker that is in close
proximity, e.g., at or less than 10 cM distant. Two closely linked
markers on the same chromosome can be positioned 9, 8, 7, 6, 5, 4, 3, 2,
1, 0.75, 0.5 or 0.25 cM or less from each other.
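The thresholds in this paragraph can be summarized in a short classification sketch. The function name and the label "not linked" for distances above 50 cM are assumptions; the 10 cM and 50 cM cutoffs are taken from the paragraph above.

```python
def classify_linkage(distance_cm):
    """Classify two markers on the same linkage group by map distance in cM."""
    if distance_cm < 0:
        raise ValueError("map distance cannot be negative")
    if distance_cm <= 10:
        return "closely linked"
    if distance_cm <= 50:
        return "linked"
    return "not linked"
```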
[0097] The phrases "low iron," "low-available iron," "low soluble iron,"
"low iron conditions," "low iron growth conditions," "iron shortage" or
"iron deficiency" or the like refer to conditions where iron availability
is less than optimal for soybean growth, and can cause plant pathology,
e.g., IDC, due to the lack of metabolically-available iron. It is
recognized that under "iron deficient" conditions, the absolute
concentration of atomic iron may be sufficient, but the form of the iron
(e.g., its incorporation into various molecular structures) and other
environmental factors may make the iron unavailable for plant use. For
example, high carbonate levels, high pH, high salt content, herbicide
applications, cool temperatures, saturated soils or other environmental
factors can decrease iron solubility, and reduce the solubilized forms of
iron that the plant requires for uptake. One of skill in the art is
familiar with assays to measure iron content of soil, as well as those
concentrations of iron that are optimal or sub-optimal for plant growth.
[0098] "Tolerance" or "improved tolerance" in a soybean plant to
low-available iron growth conditions is an indication that the soybean
plant is less affected by low-available iron conditions with respect to
yield, survivability and/or other relevant agronomic measures, compared
to a less tolerant, more "susceptible" plant. Tolerance is a relative
term, indicating that a "tolerant" plant survives and/or produces better
yield of soybean in low-available iron growth conditions compared to a
different (less tolerant) plant (e.g., a different soybean strain) grown
in similar low-available iron conditions. That is, the low-available iron
growth conditions cause a reduced decrease in soybean survival and/or
yield in a tolerant soybean plant, as compared to a susceptible soybean
plant. As used in the art, iron-deficiency "tolerance" is sometimes used
interchangeably with iron-deficiency "resistance."
[0099] One of skill will appreciate that soybean plant tolerance to
low-available iron conditions varies widely, and can represent a spectrum
of more-tolerant or less-tolerant phenotypes. However, by simple
observation, one of skill can generally determine the relative tolerance
or susceptibility of different plants, plant lines or plant families
under low-available iron conditions, and furthermore, will also recognize
the phenotypic gradations of "tolerant."
[0100] In one example, a plant's tolerance can be approximately
quantitated using a chlorosis scoring system. In such a system, a plant
is grown in a known iron-deficient area, or in low-available iron
experimental conditions, and is assigned a tolerance rating from 1
(highly susceptible; most or all plants dead; those that live are stunted
and have little living tissue) to 9 (highly tolerant; yield and
survivability not significantly affected; all plants normal green color).
See also, Dahiya and Singh (1979) "Effect of salinity, alkalinity and
iron sources on availability of iron," Plant and Soil 51:13-18.
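By way of illustration only, the 1 to 9 chlorosis scale described above can be summarized for each line by averaging the scores of its plots and ranking the lines. The following Python sketch assumes that scores are recorded as numbers per line; the function names and any example data are hypothetical and are not part of the disclosure.

```python
from statistics import mean

# Bounds of the chlorosis scale described in the text:
# 1 = highly susceptible, 9 = highly tolerant.
MIN_SCORE, MAX_SCORE = 1, 9


def mean_chlorosis_score(scores):
    """Return the mean of the 1-9 chlorosis scores recorded for one line."""
    if not scores:
        raise ValueError("at least one score is required")
    for score in scores:
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"score {score} is outside the 1-9 scale")
    return mean(scores)


def rank_lines(scores_by_line):
    """Return line names ordered from most tolerant to least tolerant."""
    return sorted(
        scores_by_line,
        key=lambda name: mean_chlorosis_score(scores_by_line[name]),
        reverse=True,
    )
```

Because tolerance is a relative term, the ranking is only meaningful for lines scored under similar low-available iron conditions.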
[0101] The term "crossed" or "cross" in the context of this invention
means the fusion of gametes via pollination to produce progeny (e.g.,
cells, seeds or plants). The term encompasses both sexual crosses (the
pollination of one plant by another) and selfing (self-pollination, e.g.,
when the pollen and ovule are from the same plant).
[0102] The term "introgression" refers to the transmission of a desired
allele of a genetic locus from one genetic background to another. For
example, introgression of a desired allele at a specified locus can be
transmitted to at least one progeny via a sexual cross between two
parents of the same species, where at least one of the parents has the
desired allele in its genome. Alternatively, for example, transmission of
an allele can occur by recombination between two donor genomes, e.g., in
a fused protoplast, where at least one of the donor protoplasts has the
desired allele in its genome. The desired allele can be, e.g., a selected
allele of a marker, a QTL, a transgene, or the like. In any case,
offspring comprising the desired allele can be repeatedly backcrossed to
a line having a desired genetic background and selected for the desired
allele, to result in the allele becoming fixed in a selected genetic
background.
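As a rough guide to the repeated backcrossing described above, standard breeding texts give the expected proportion of the recurrent (desired background) genome after n backcrosses as 1 - (1/2)^(n+1), assuming no selection and no linkage drag. The Python sketch below simply evaluates that textbook expectation; the function name is hypothetical, and the formula is a general approximation rather than a result of the present disclosure.

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected fraction of the recurrent-parent genome after n backcrosses.

    Assumes no selection and no linkage drag; n = 0 corresponds to the F1.
    """
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)
```

In practice the region around the introgressed allele tends to retain donor sequence for longer than this average suggests, which is one reason marker data are used to follow the selected allele.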
[0103] A "line" or "strain" is a group of individuals of identical
parentage that are generally inbred to some degree and that are generally
homozygous and homogeneous at most loci (isogenic or near isogenic). A
"subline" refers to an inbred subset of descendants that are genetically
distinct from other similarly inbred subsets descended from the same
progenitor. Traditionally, a "subline" has been derived by inbreeding the
seed from an individual soybean plant selected at the F3 to F5 generation
until the residual segregating loci are "fixed" or homozygous across most
or all loci. Commercial soybean varieties (or lines) are typically
produced by aggregating ("bulking") the self-pollinated progeny of a
single F3 to F5 plant from a controlled cross between 2 genetically
different parents. While the variety typically appears uniform, the
self-pollinating variety derived from the selected plant eventually
(e.g., F8) becomes a mixture of homozygous plants that can vary in
genotype at any locus that was heterozygous in the originally selected F3
to F5 plant. In the context of the invention, marker-based sublines, that
differ from each other based on qualitative polymorphism at the DNA level
at one or more specific marker loci, are derived by genotyping a sample
of seed derived from individual self-pollinated progeny derived from a
selected F3-F5 plant. The seed sample can be genotyped directly as seed,
or as plant tissue grown from such a seed sample. Optionally, seed
sharing a common genotype at the specified locus (or loci) are bulked
providing a subline that is genetically homogenous at identified loci
important for a trait of interest (yield, tolerance, etc.).
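The bulking of seed that shares a common genotype at specified loci, as described above, can be sketched as a simple grouping step. The Python example below is illustrative only: the function name, the seed identifiers and the genotype labels are hypothetical, and the genotype calls are assumed to be available already as one label per marker.

```python
from collections import defaultdict


def bulk_by_genotype(seed_genotypes, loci):
    """Group seed identifiers by their genotype at the specified marker loci.

    seed_genotypes maps a seed identifier to a dict of marker -> genotype call.
    Returns a dict mapping each observed genotype tuple to a list of seeds.
    """
    sublines = defaultdict(list)
    for seed_id, genotype in seed_genotypes.items():
        key = tuple(genotype[locus] for locus in loci)
        sublines[key].append(seed_id)
    return dict(sublines)
```

Each resulting group corresponds to a candidate marker-based subline that is homogeneous at the loci used for grouping.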
[0104] An "ancestral line" is a parent line used as a source of genes
e.g., for the development of elite lines. An "ancestral population" is a
group of ancestors that have contributed the bulk of the genetic
variation that was used to develop elite lines. "Descendants" are the
progeny of ancestors, and may be separated from their ancestors by many
generations of breeding. For example, elite lines are the descendants of
their ancestors. A "pedigree structure" defines the relationship between
a descendant and each ancestor that gave rise to that descendant. A
pedigree structure can span one or more generations, describing
relationships between the descendant and its parents, grandparents,
great-grandparents, etc.
[0105] An "elite line" or "elite strain" is an agronomically superior line
that has resulted from many cycles of breeding and selection for superior
agronomic performance. Numerous elite lines are available and known to
those of skill in the art of soybean breeding. An "elite population" is
an assortment of elite individuals or lines that can be used to represent
the state of the art in terms of agronomically superior genotypes of a
given crop species, such as soybean. Similarly, an "elite germplasm" or
elite strain of germplasm is an agronomically superior germplasm,
typically derived from and/or capable of giving rise to a plant with
superior agronomic performance, such as an existing or newly developed
elite line of soybean.
[0106] In contrast, an "exotic soybean strain" or an "exotic soybean
germplasm" is a strain or germplasm derived from a soybean not belonging
to an available elite soybean line or strain of germplasm. In the context
of a cross between two soybean plants or strains of germplasm, an exotic
germplasm is not closely related by descent to the elite germplasm with
which it is crossed. Most commonly, the exotic germplasm is not derived
from any known elite line of soybean, but rather is selected to introduce
novel genetic elements (typically novel alleles) into a breeding program.
[0107] The term "amplifying" in the context of nucleic acid amplification
is any process whereby additional copies of a selected nucleic acid (or a
transcribed form thereof) are produced. Typical amplification methods
include various polymerase based replication methods, including the
polymerase chain reaction (PCR), ligase mediated methods such as the
ligase chain reaction (LCR) and RNA polymerase based amplification (e.g.,
by transcription) methods. An "amplicon" is an amplified nucleic acid,
e.g., a nucleic acid that is produced by amplifying a template nucleic
acid by any available amplification method (e.g., PCR, LCR,
transcription, or the like).
[0108] A "genomic nucleic acid" is a nucleic acid that corresponds in
sequence to a heritable nucleic acid in a cell. Common examples include
nuclear genomic DNA and amplicons thereof. A genomic nucleic acid is, in
some cases, different from a spliced RNA, or a corresponding cDNA, in
that the spliced RNA or cDNA is processed, e.g., by the splicing
machinery, to remove introns. Genomic nucleic acids optionally comprise
non-transcribed (e.g., chromosome structural sequences, promoter regions,
enhancer regions, etc.) and/or non-translated sequences (e.g., introns),
whereas spliced RNA/cDNA typically do not have non-transcribed sequences
or introns. A "template nucleic acid" is a nucleic acid that serves as a
template in an amplification reaction (e.g., a polymerase based
amplification reaction such as PCR, a ligase mediated amplification
reaction such as LCR, a transcription reaction, or the like). A template
nucleic acid can be genomic in origin, or alternatively, can be derived
from expressed sequences, e.g., a cDNA or an EST.
[0109] An "exogenous nucleic acid" is a nucleic acid that is not native to
a specified system (e.g., a germplasm, plant, variety, etc.), with
respect to sequence, genomic position, or both. As used herein, the terms
"exogenous" or "heterologous" as applied to polynucleotides or
polypeptides typically refers to molecules that have been artificially
supplied to a biological system (e.g., a plant cell, a plant gene, a
particular plant species or variety or a plant chromosome under study)
and are not native to that particular biological system. The terms can
indicate that the relevant material originated from a source other than a
naturally occurring source, or can refer to molecules having a
non-natural configuration, genetic location or arrangement of parts.
[0110] In contrast, for example, a "native" or "endogenous" gene is a gene
that does not contain nucleic acid elements encoded by sources other than
the chromosome or other genetic element on which it is normally found in
nature. An endogenous gene, transcript or polypeptide is encoded by its
natural chromosomal locus, and not artificially supplied to the cell.
[0111] The term "recombinant" in reference to a nucleic acid or
polypeptide indicates that the material (e.g., a recombinant nucleic
acid, gene, polynucleotide, polypeptide, etc.) has been altered by human
intervention. Generally, the arrangement of parts of a recombinant
molecule is not a native configuration, or the primary sequence of the
recombinant polynucleotide or polypeptide has in some way been
manipulated. The alteration to yield the recombinant material can be
performed on the material within or removed from its natural environment
or state. For example, a naturally occurring nucleic acid becomes a
recombinant nucleic acid if it is altered, or if it is transcribed from
DNA which has been altered, by means of human intervention performed
within the cell from which it originates. A gene sequence open reading
frame is recombinant if that nucleotide sequence has been removed from its
natural context and cloned into any type of artificial nucleic acid
vector. Protocols and reagents to produce recombinant molecules,
especially recombinant nucleic acids, are common and routine in the art.
The term recombinant can also refer to an organism that harbors
recombinant material, e.g., a plant that comprises a recombinant nucleic
acid is considered a recombinant plant. In some embodiments, a
recombinant organism is a transgenic organism.
[0112] The term "introduced" when referring to translocating a
heterologous or exogenous nucleic acid into a cell refers to the
incorporation of the nucleic acid into the cell using any methodology.
The term encompasses such nucleic acid introduction methods as
"transfection," "transformation" and "transduction."
[0113] As used herein, the term "vector" is used in reference to
polynucleotide or other molecules that transfer nucleic acid segment(s)
into a cell. The term "vehicle" is sometimes used interchangeably with
"vector." A vector optionally comprises parts which mediate vector
maintenance and enable its intended use (e.g., sequences necessary for
replication, genes imparting drug or antibiotic resistance, a multiple
cloning site, operably linked promoter/enhancer elements which enable the
expression of a cloned gene, etc.). Vectors are often derived from
plasmids, bacteriophages, or plant or animal viruses. A "cloning vector"
or "shuttle vector" or "subcloning vector" contains operably linked parts
that facilitate subcloning steps (e.g., a multiple cloning site
containing multiple restriction endonuclease sites).
[0114] The term "expression vector" as used herein refers to a vector
comprising operably linked polynucleotide sequences that facilitate
expression of a coding sequence in a particular host organism (e.g., a
bacterial expression vector or a plant expression vector). Polynucleotide
sequences that facilitate expression in prokaryotes typically include,
e.g., a promoter, an operator (optional), and a ribosome binding site,
often along with other sequences. Eukaryotic cells can use promoters,
enhancers, termination and polyadenylation signals and other sequences
that are generally different from those used by prokaryotes.
[0115] The term "transgenic plant" refers to a plant that comprises within
its cells a heterologous polynucleotide. Generally, the heterologous
polynucleotide is stably integrated within the genome such that the
polynucleotide is passed on to successive generations. The heterologous
polynucleotide may be integrated into the genome alone or as part of a
recombinant expression cassette. "Transgenic" is used herein to refer to
any cell, cell line, callus, tissue, plant part or plant, the genotype of
which has been altered by the presence of heterologous nucleic acid
including those transgenic organisms or cells initially so altered, as
well as those created by crosses or asexual propagation from the initial
transgenic organism or cell. The term "transgenic" as used herein does
not encompass the alteration of the genome (chromosomal or
extra-chromosomal) by conventional plant breeding methods (e.g., crosses)
or by naturally occurring events such as random cross-fertilization,
non-recombinant viral infection, non-recombinant bacterial
transformation, non-recombinant transposition, or spontaneous mutation.
[0116] "Positional cloning" is a cloning procedure in which a target
nucleic acid is identified and isolated by its genomic proximity to
a marker nucleic acid. For example, a genomic nucleic acid clone can
include part or all of two or more chromosomal regions that are proximal to
one another. If a marker can be used to identify the genomic nucleic acid
clone from a genomic library, standard methods such as sub-cloning or
sequencing can be used to identify and/or isolate subsequences of the
clone that are located near the marker.
[0117] A specified nucleic acid is "derived from" a given nucleic acid
when it is constructed using the given nucleic acid's sequence, or when
the specified nucleic acid is constructed using the given nucleic acid.
For example, a cDNA or EST is derived from an expressed mRNA.
[0118] The term "genetic element" or "gene" refers to a heritable sequence
of DNA, i.e., a genomic sequence, with functional significance. The term
"gene" can also be used to refer to, e.g., a cDNA and/or a mRNA encoded
by a genomic sequence, as well as to that genomic sequence.
[0119] The term "genotype" is the genetic constitution of an individual
(or group of individuals) at one or more genetic loci, as contrasted with
the observable trait (the phenotype). Genotype is defined by the
allele(s) of one or more known loci that the individual has inherited
from its parents. The term genotype can be used to refer to an
individual's genetic constitution at a single locus, at multiple loci,
or, more generally, the term genotype can be used to refer to an
individual's genetic make-up for all the genes in its genome. A
"haplotype" is the genotype of an individual at a plurality of genetic
loci. Typically, the genetic loci described by a haplotype are physically
and genetically linked, i.e., on the same chromosome segment.
[0120] The terms "phenotype," or "phenotypic trait" or "trait" refer to
one or more traits of an organism. The phenotype can be observable to the
naked eye, or by any other means of evaluation known in the art, e.g.,
microscopy, biochemical analysis, genomic analysis, an assay for a
particular disease resistance, etc. In some cases, a phenotype is
directly controlled by a single gene or genetic locus, i.e., a "single
gene trait." In other cases, a phenotype is the result of several genes.
A "quantitative trait locus" (QTL) is a genetic domain that is polymorphic
and affects a phenotype that can be described in quantitative terms,
e.g., height, weight, oil content, days to germination, disease
resistance, etc, and, therefore, can be assigned a "phenotypic value"
which corresponds to a quantitative value for the phenotypic trait. A QTL
can act through a single gene mechanism or by a polygenic mechanism.
[0121] A "molecular phenotype" is a phenotype detectable at the level of a
population of (one or more) molecules. Such molecules can be nucleic
acids such as genomic DNA or RNA, proteins, or metabolites. For example,
a molecular phenotype can be an expression profile for one or more gene
products, e.g., at a specific stage of plant development, in response to
an environmental condition or stress, etc. Expression profiles are
typically evaluated at the level of RNA or protein, e.g., on a nucleic
acid array or "chip" or using antibodies or other binding proteins.
[0122] The term "yield" refers to the productivity per unit area of a
particular plant product of commercial value. For example, yield of
soybean is commonly measured in bushels of seed per acre or metric tons
of seed per hectare per season. Yield is affected by both genetic and
environmental factors. "Agronomics," "agronomic traits," and "agronomic
performance" refer to the traits (and underlying genetic elements) of a
given plant variety that contribute to yield over the course of a growing
season. Individual agronomic traits include emergence vigor, vegetative
vigor, stress tolerance, disease resistance or tolerance, herbicide
resistance, branching, flowering, seed set, seed size, seed density,
standability, threshability and the like. Yield is, therefore, the final
culmination of all agronomic traits.
[0123] A "set" of markers or probes refers to a collection or group of
markers or probes, or the data derived therefrom, used for a common
purpose, e.g., identifying soybean plants with a desired trait (e.g.,
tolerance to Phytophthora infection). Frequently, data corresponding to
the markers or probes, or data derived from their use, is stored in an
electronic medium. While each of the members of a set possesses utility
with respect to the specified purpose, individual markers selected from
the set as well as subsets including some, but not all of the markers,
are also effective in achieving the specified purpose.
[0124] A "look up table" is a table that correlates one form of data to
another, or one or more forms of data with a predicted outcome that the
data is relevant to. For example, a look up table can include a
correlation between allele data and a predicted trait that a plant
comprising a given allele is likely to display. These tables can be, and
typically are, multidimensional, e.g., taking multiple alleles into
account simultaneously, and, optionally, taking other factors into
account as well, such as genetic background, e.g., in making a trait
prediction.
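A minimal sketch of such a multidimensional look up table is given below in Python. It is illustrative only: the allele labels ("a1", "b1", etc.) and the predicted outcomes are invented placeholders, and the helper names `make_key` and `predict_trait` are hypothetical. The marker names are used merely as example keys.

```python
def make_key(alleles):
    """Build an order-independent key from a dict of marker -> allele."""
    return tuple(sorted(alleles.items()))


# Hypothetical table: each combination of alleles maps to a predicted trait.
# The allele labels and predictions are placeholders, not data from FIG. 1.
LOOKUP_TABLE = {
    make_key({"SATT334": "a1", "SATT510": "b1"}): "tolerant",
    make_key({"SATT334": "a1", "SATT510": "b2"}): "intermediate",
    make_key({"SATT334": "a2", "SATT510": "b2"}): "susceptible",
}


def predict_trait(alleles, table=LOOKUP_TABLE):
    """Return the predicted trait for an allele combination, or 'unknown'."""
    return table.get(make_key(alleles), "unknown")
```

Other factors, such as genetic background, could be taken into account by adding them to the key in the same way.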
[0125] A "computer readable medium" is an information storage medium that
can be accessed by a computer using an available or custom interface.
Examples include memory (e.g., ROM or RAM, flash memory, etc.), optical
storage media (e.g., CD-ROM), magnetic storage media (computer hard
drives, floppy disks, etc.), punch cards, and many others that are
commercially available. Information can be transmitted between a system
of interest and the computer, or to or from the computer to or from the
computer readable medium for storage or access of stored information.
This transmission can be an electrical transmission, or can be made by
other available methods, such as an IR link, a wireless connection, or
the like.
[0126] "System instructions" are instruction sets that can be partially or
fully executed by the system. Typically, the instruction sets are present
as system software.
BRIEF DESCRIPTION OF THE FIGURES
[0127] FIG. 1 provides a table listing soybean markers demonstrating
linkage disequilibrium with low available iron tolerance phenotype as
determined by intergroup allele frequency distribution analysis,
association mapping analysis and QTL interval mapping (including marker
regression analysis) methods. The table indicates the marker type SSR
(simple sequence repeat; genomic or EST) or SNP (single nucleotide
polymorphism), the chromosome on which the marker is located and its
approximate genetic map position relative to other known markers, given
in cM, with position zero being the first (most distal) marker on the
chromosome, as provided in the integrated genetic map in FIG. 6. Also
shown are the soybean populations used in the analysis and the
statistical probability of random segregation of the marker and the
tolerance phenotype given as an adjusted probability taking into account
the variability and false positives of multiple tests. Results from QTL
interval mapping are provided, with the significance values given as a
likelihood ratio statistic (LRS).
[0128] FIG. 2 provides a table listing the genomic and EST SSR markers
that demonstrated linkage disequilibrium with the low iron tolerance
phenotype and the sequences of the left and right PCR primers used in the
SSR marker locus genotyping analysis. Also shown is the pigtail sequence
used on the 5' end of the right primer, and the number of nucleotides in
the tandem repeating element in the SSR.
[0129] FIG. 3 provides a table listing the SNP markers that demonstrated
linkage disequilibrium with the low iron tolerance phenotype. The table
provides the sequences of the PCR primers used to generate a
SNP-containing amplicon, and the allele-specific probes that were used to
identify the SNP allele in an allele-specific hybridization assay (ASH
assay).
[0130] FIG. 4 provides an allele dictionary of the characterized alleles
of the SSR markers that demonstrated linkage disequilibrium with the low
iron tolerance phenotype. Each allele is defined by the size of a PCR
amplicon generated from soybean genomic DNA or mRNA using the primers
listed in FIG. 2. Sizes of the PCR amplicons are indicated in base pairs
(bp).
[0131] FIG. 5 provides a table listing genetic markers that are closely
linked to the low iron tolerance markers identified by the present
invention.
[0132] FIG. 6 provides an integrated genetic map for approximately 750
soybean markers, including both SSR-type and SNP-type markers. These
markers are distributed over each soybean chromosome. The chromosome
number and the equivalent historical chromosome name are
indicated. The genetic map positions of the markers are indicated in
centiMorgans (cM), typically with position zero being the first (most
distal) marker on the chromosome.
DETAILED DESCRIPTION
[0133] Iron deficiency chlorosis (IDC or FEC) is a soybean disease causing
severe losses in viability and reductions in yield. The disease is caused
by poor iron availability in soil, and is strongly influenced by
environmental factors that control iron availability (e.g., environmental
factors that reduce iron solubility result in reduced iron availability
in the soil). Yield losses can be minimized by the field application of
iron-rich fertilizers such as livestock manure or making foliar
applications of iron-containing materials. However, one of the most
effective and most environmentally friendly approaches to overcoming this
disease is through the selection of soybean varieties that are tolerant
to the iron-deficient growth conditions.
[0134] The identification and selection of soybean plants that show
tolerance to iron-deficient growth conditions using MAS can provide an
effective and environmentally friendly approach to overcoming losses
caused by this disease. The present invention provides soybean marker
loci that demonstrate statistically significant co-segregation with
tolerance to iron-deficient growth conditions. Detection of these loci or
additional linked loci can be used in marker assisted soybean breeding
programs to produce tolerant plants, or plants with improved tolerance.
The linked SSR and SNP markers identified herein are provided in FIG. 1.
These markers include S60210-TB, SAC1006, SATT391, SAC1724, SATT307,
P13073A-1, P10598A-1, SATT334, SATT510, SATT335, P5219A-1, P7659A-2,
SAT.sub.--117, SATT191, S60143-TB, SATT451, SATT367, SATT495, P10649C-3,
SATT613, SATT257, SATT581 and SATT153.
[0135] Each of the SSR-type markers displays a plurality of alleles that
can be visualized as different sized PCR amplicons, as summarized in the
SSR allele dictionary in FIG. 4. The PCR primers that are used to
generate the SSR-marker amplicons are provided in FIG. 2. The alleles of
SNP-type markers are determined using an allele-specific hybridization
protocol, as known in the art. The PCR primers used to amplify the SNP
domain, and the allele-specific probes used to genotype the locus are
provided in FIG. 3.
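Because each SSR allele is defined by the size of its PCR amplicon, allele calling against an allele dictionary can be sketched as a nearest-size match. The Python example below is a sketch under stated assumptions: the function name, the allele names, the sizes and the sizing tolerance of 1 bp are all hypothetical placeholders and are not taken from FIG. 4.

```python
def call_allele(observed_bp, allele_sizes, tolerance_bp=1.0):
    """Return the allele whose expected amplicon size is nearest the observed
    size, provided the difference is within tolerance_bp; otherwise None.

    allele_sizes maps an allele name to its expected amplicon size in bp.
    """
    best_name = None
    best_diff = None
    for name, expected_bp in allele_sizes.items():
        diff = abs(observed_bp - expected_bp)
        if diff <= tolerance_bp and (best_diff is None or diff < best_diff):
            best_name, best_diff = name, diff
    return best_name
```

An observed size that falls outside the tolerance for every entry is left uncalled rather than forced onto the nearest allele.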
[0136] As recognized in the art, any other marker that is linked to a QTL
marker (e.g., a disease tolerance marker) also finds use for that same
purpose. Examples of additional markers that are linked to the disease
tolerance markers recited herein are provided. For example, a linked
marker can be determined from the soybean consensus genetic map provided
in FIG. 6. Additional closely linked markers are further provided in FIG.
5. It is not intended, however, that linked markers finding use with the
invention be limited to those recited in FIG. 5 or 6.
[0137] The invention also provides chromosomal QTL intervals that
correlate with tolerance to low-iron conditions. Any marker located
within these intervals finds use as a marker for iron-deficiency
tolerance. These intervals include:
[0138] (a) S60210-TB and SATT391 (LG-C1);
[0139] (b) P10598A-1 and SATT334 (LG-F);
[0140] (c) SATT510 and SATT335 (LG-F);
[0141] (d) P5219A-1 and P7659A-2 (LG-G);
[0142] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0143] (f) SATT451 and SATT367 (LG-I);
[0144] (g) SATT495 and P10649C-3 (LG-L); and
[0145] (h) SATT250 and SATT346 (LG-M).
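Whether a given marker falls within one of the intervals listed above can be checked from genetic map positions. The following Python sketch assumes that all markers passed in lie on the same linkage group and that their positions in cM are known; the function name is hypothetical, and any positions supplied to it in an example are placeholders rather than values from FIG. 6.

```python
def markers_in_interval(map_positions, flank_a, flank_b):
    """Return markers located within the interval flanked by, and including,
    the two named flanking markers.

    map_positions maps a marker name to its position in cM on one linkage group.
    """
    low, high = sorted((map_positions[flank_a], map_positions[flank_b]))
    return [
        marker
        for marker, position in map_positions.items()
        if low <= position <= high
    ]
```

The flanking markers may be given in either order, since only the span between their positions matters.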
[0146] Methods for identifying soybean plants or germplasm that carry
preferred alleles of tolerance marker loci are a feature of the
invention. In these methods, any of a variety of marker detection
protocols are used to identify marker loci, depending on the type of
marker loci. Typical methods for marker detection include amplification
and detection of the resulting amplified markers, e.g., by PCR, LCR,
transcription based amplification methods, or the like. These include
ASH, SSR detection, RFLP analysis and many others.
[0147] Although particular marker alleles can show co-segregation with a
disease tolerance or susceptibility phenotype, it is important to note
that the marker locus is not necessarily part of the QTL locus
responsible for the tolerance or susceptibility. For example, it is not a
requirement that the marker polynucleotide sequence be part of a gene
that imparts disease resistance (for example, be part of the gene open
reading frame). The association between a specific marker allele with the
tolerance or susceptibility phenotype is due to the original "coupling"
linkage phase between the marker allele and the QTL tolerance or
susceptibility allele in the ancestral soybean line from which the
tolerance or susceptibility allele originated. Eventually, with repeated
recombination, crossing over events between the marker and QTL locus can
change this orientation. For this reason, the favorable marker allele may
change depending on the linkage phase that exists within the tolerant
parent used to create segregating populations. This does not change the
fact that the genetic marker can be used to monitor segregation of the
phenotype. It only changes which marker allele is considered favorable in
a given segregating population.
[0148] Identification of soybean plants or germplasm that include a marker
locus or marker loci linked to a tolerance trait or traits provides a
basis for performing marker assisted selection of soybean. Soybean plants
that comprise favorable markers or favorable alleles are selected for,
while soybean plants that comprise markers or alleles that are negatively
correlated with tolerance can be selected against. Desired markers and/or
alleles can be introgressed into soybean having a desired (e.g., elite or
exotic) genetic background to produce an introgressed tolerant soybean
plant or germplasm. In some aspects, it is contemplated that a plurality
of tolerance markers are sequentially or simultaneously selected and/or
introgressed. The combination of tolerance markers that is selected for
in a single plant is not limited, and can include any combination of
markers recited in FIG. 1, any markers linked to the markers recited in
FIG. 1, or any markers located within the QTL intervals defined herein.
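The selection step described above, in which plants carrying favorable alleles are kept and plants carrying negatively correlated alleles are discarded, can be sketched as a filter over genotype calls. The Python example below is illustrative only; the function name, plant identifiers and allele labels are hypothetical, and which allele is favorable is assumed to have been established for the population in question.

```python
def select_plants(genotypes, favorable, unfavorable=None):
    """Return plants carrying every favorable allele and no unfavorable allele.

    genotypes maps a plant identifier to a dict of marker -> tuple of alleles.
    favorable and unfavorable each map a marker to a single allele label.
    """
    unfavorable = unfavorable or {}
    selected = []
    for plant_id, calls in genotypes.items():
        has_all_favorable = all(
            allele in calls.get(marker, ())
            for marker, allele in favorable.items()
        )
        has_any_unfavorable = any(
            allele in calls.get(marker, ())
            for marker, allele in unfavorable.items()
        )
        if has_all_favorable and not has_any_unfavorable:
            selected.append(plant_id)
    return selected
```

Any number of markers can be supplied, so sequential or simultaneous selection on several tolerance markers uses the same routine.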
[0149] As an alternative to standard breeding methods of introducing
traits of interest into soybean (e.g., introgression), transgenic
approaches can also be used. In these methods, exogenous nucleic acids
that encode traits linked to markers are introduced into target plants or
germplasm. For example, a nucleic acid that codes for a tolerance trait
is cloned, e.g., via positional cloning and introduced into a target
plant or germplasm.
[0150] Verification of iron-deficiency tolerance can be performed by
available tolerance assay protocols, as known in the art and discussed in
more detail below. For example, see Dahiya and Singh (1979) "Effect of
salinity, alkalinity and iron sources on availability of iron," Plant and
Soil 51:13-18. Tolerance assays are useful to verify that the tolerance
trait still segregates with the marker in any particular plant or
population, and, of course, to measure the degree of tolerance
improvement achieved by introgressing or recombinantly introducing the
trait into a desired background.
[0151] Systems, including automated systems for selecting plants that
comprise a marker of interest and/or for correlating presence of the
marker with tolerance are also a feature of the invention. These systems
can include probes relevant to marker locus detection, detectors for
detecting labels on the probes, appropriate fluid handling elements and
temperature controllers that mix probes and templates and/or amplify
templates, and system instructions that correlate label detection to the
presence of a particular marker locus or allele.
[0152] Kits are also a feature of the invention. For example, a kit can
include appropriate primers or probes for detecting tolerance associated
marker loci and instructions in using the primers or probes for detecting
the marker loci and correlating the loci with predicted low iron
tolerance. The kits can further include packaging materials for packaging
the probes, primers or instructions, controls such as control
amplification reactions that include probes, primers or template nucleic
acids for amplifications, molecular size markers, or the like.
Tolerance Markers and Favorable Alleles
[0153] In traditional linkage analysis, no direct knowledge of the
physical relationship of genes on a chromosome is required. Mendel's
first law is that factors of pairs of characters are segregated, meaning
that alleles of a diploid trait separate into two gametes and then into
different offspring. Classical linkage analysis can be thought of as a
statistical description of the relative frequencies of cosegregation of
different traits. Linkage analysis is the well characterized descriptive
framework of how traits are grouped together based upon the frequency
with which they segregate together. That is, if two non-allelic traits
are inherited together with a greater than random frequency, they are
said to be "linked." The frequency with which the traits are inherited
together is the primary measure of how tightly the traits are linked,
i.e., traits which are inherited together with a higher frequency are
more closely linked than traits which are inherited together with lower
(but still above random) frequency. Traits are linked because the genes
which underlie the traits reside on the same chromosome. The further
apart on a chromosome the genes reside, the less likely they are to
segregate together, because homologous chromosomes recombine during
meiosis. Thus, the further apart on a chromosome the genes reside, the
more likely it is that there will be a crossing over event during meiosis
that will result in two genes segregating separately into progeny.
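The frequency with which two loci are separated by crossing over can be estimated from progeny counts. The Python sketch below is a simplified illustration that assumes progeny have already been classified as parental or recombinant types; the function name and any counts used with it are hypothetical.

```python
def recombination_frequency(parental, recombinant):
    """Estimate the recombination frequency between two loci as the fraction
    of progeny that are recombinant types."""
    if parental < 0 or recombinant < 0:
        raise ValueError("counts must be non-negative")
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one progeny is required")
    return recombinant / total
```

A lower estimate indicates tighter linkage; unlinked loci approach a value of 0.5.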
[0154] A common measure of linkage is the frequency with which traits
cosegregate. This can be expressed as a percentage of cosegregation
(recombination frequency) or, also commonly, in centiMorgans (cM). The cM
is named after the pioneering geneticist Thomas Hunt Morgan and is a unit
of measure of genetic recombination frequency. One cM is equal to a 1%
chance that a trait at one genetic locus will be separated from a trait
at another locus due to crossing over in a single generation (meaning the
traits segregate together 99% of the time). Because chromosomal distance
is approximately proportional to the frequency of crossing over events
between traits, there is an approximate physical distance that correlates
with recombination frequency. For example, in soybean, 1 cM correlates,
on average, to about 400,000 base pairs (400 Kb).
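By way of illustration only, the relationship between centiMorgans, cosegregation frequency and approximate physical distance described above can be sketched as follows. The function names are hypothetical, the 400 Kb per cM figure is the soybean average stated above (local recombination rates vary), and the simple "1 cM = 1% recombination" rule is assumed to hold only for short distances.

```python
# Average soybean conversion stated in the text; real rates vary by region.
KB_PER_CM_SOYBEAN = 400.0


def cosegregation_percent(distance_cm: float) -> float:
    """Approximate percent of meioses in which two loci stay together.

    Uses the short-distance approximation 1 cM = 1% recombination.
    """
    if distance_cm < 0:
        raise ValueError("genetic distance cannot be negative")
    return 100.0 - distance_cm


def approx_physical_kb(distance_cm: float,
                       kb_per_cm: float = KB_PER_CM_SOYBEAN) -> float:
    """Rough physical distance (in Kb) corresponding to a genetic distance."""
    if distance_cm < 0:
        raise ValueError("genetic distance cannot be negative")
    return distance_cm * kb_per_cm


if __name__ == "__main__":
    print(cosegregation_percent(1.0))  # loci 1 cM apart
    print(approx_physical_kb(2.5))     # rough Kb spanned by 2.5 cM
```

The conversion is an average only; it is not intended to predict the physical distance between any particular pair of loci.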
[0155] Marker loci are themselves traits and can be assessed according to
standard linkage analysis by tracking the marker loci during segregation.
Thus, in the context of the present invention, one cM is equal to a 1%
chance that a marker locus will be separated from another locus (which
can be any other trait, e.g., another marker locus, or another trait
locus that encodes a QTL), due to crossing over in a single generation.
The markers herein, as described in FIG. 1, e.g., S60210-TB, SAC1006,
SATT391, SAC1724, SATT307, P13073A-1, P10598A-1, SATT334, SATT510,
SATT335, P5219A-1, P7659A-2, SAT_117, SATT191, S60143-TB, SATT451,
SATT367, SATT495, P10649C-3, SATT613, SATT257, SATT581 and SATT153, as
well as any of the chromosome intervals
[0156] (a) S60210-TB and SATT391 (LG-C1);
[0157] (b) P10598A-1 and SATT334 (LG-F);
[0158] (c) SATT510 and SATT335 (LG-F);
[0159] (d) P5219A-1 and P7659A-2 (LG-G);
[0160] (e) SAT_117 and S60143-TB (LG-G);
[0161] (f) SATT451 and SATT367 (LG-I);
[0162] (g) SATT495 and P10649C-3 (LG-L); and
[0163] (h) SATT250 and SATT346 (LG-M),
have been found to correlate with tolerance, improved tolerance or
susceptibility to low iron growth conditions in soybean. This means that
the markers are sufficiently proximal to a tolerance QTL that they can be
used as a predictor for the tolerance trait. This is extremely useful in
the context of marker assisted selection (MAS), discussed in more detail
herein. In brief, soybean plants or germplasm can be selected for markers
or marker alleles that positively correlate with tolerance, without
actually raising soybean and measuring for tolerance or improved
tolerance (or, contrariwise, soybean plants can be selected against if they
possess markers that negatively correlate with tolerance or improved
tolerance). MAS is a powerful shortcut to selecting for desired
phenotypes and for introgressing desired traits into cultivars of soybean
(e.g., introgressing desired traits into elite lines). MAS is easily
adapted to high throughput molecular analysis methods that can quickly
screen large numbers of plant or germplasm genetic material for the
markers of interest and is much more cost effective than raising and
observing plants for visible traits.
[0164] In some embodiments, the most preferred QTL markers are a subset of
the markers provided in FIG. 1. For example, the most preferred markers
can be selected from SAC1724, SATT307, P13073A-1, P10598A-1, SATT334,
SATT495, P10649C-3, SATT613 and SATT257.
[0165] When referring to the relationship between two genetic elements,
such as a genetic element contributing to tolerance and a proximal
marker, "coupling" phase linkage indicates the state where the
"favorable" allele at the tolerance locus is physically associated on the
same chromosome strand as the "favorable" allele of the respective linked
marker locus. In coupling phase, both favorable alleles are inherited
together by progeny that inherit that chromosome strand. In "repulsion"
phase linkage, the "favorable" allele at the locus of interest (e.g., a
QTL for tolerance) is physically linked with an "unfavorable" allele at
the proximal marker locus, and the two "favorable" alleles are not
inherited together (i.e., the two loci are "out of phase" with each
other).
[0166] A favorable allele of a marker is that allele of the marker that
co-segregates with a desired phenotype (e.g., disease tolerance). As used
herein, a QTL marker has a minimum of one favorable allele, although it
is possible that the marker might have two or more favorable alleles
found in the population. Any favorable allele of that marker can be used
advantageously for the identification and construction of tolerant
soybean lines. Optionally, one, two, three or more favorable allele(s) of
different markers are identified in, or introgressed into a plant, and
can be selected for or against during MAS. Desirably, plants or germplasm
are identified that have at least one such favorable allele that
positively correlates with tolerance or improved tolerance.
[0167] Alternatively, a marker allele that co-segregates with disease
susceptibility also finds use with the invention, since that allele can
be used to identify and counter select disease-susceptible plants. Such
an allele can be used for exclusionary purposes during breeding to
identify alleles that negatively correlate with tolerance, to eliminate
susceptible plants or germplasm from subsequent rounds of breeding.
[0168] In some embodiments of the invention, a plurality of marker alleles
are simultaneously selected for in a single plant or a population of
plants. In these methods, plants are selected that contain favorable
alleles from more than one tolerance marker, or alternatively, favorable
alleles from more than one tolerance marker are introgressed into a
desired soybean germplasm. One of skill in the art recognizes that the
simultaneous selection of favorable alleles from more than one disease
tolerance marker in the same plant is likely to result in an additive (or
even synergistic) protective effect for the plant.
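As a non-limiting sketch of the simultaneous selection described above, plants can be scored by the number of marker loci at which they carry at least one favorable allele. The allele sizes, plant identifiers and function names below are hypothetical and are used for illustration only; as noted herein, the favorable alleles must be determined for the particular germplasm under study.

```python
from typing import Dict, List, Set

# Hypothetical favorable alleles (fragment sizes in bp) for two markers.
# These values are illustrative assumptions, not data from the disclosure.
FAVORABLE: Dict[str, Set[int]] = {
    "SATT334": {210},
    "SATT495": {248, 251},
}

Genotype = Dict[str, Set[int]]


def count_favorable_loci(genotype: Genotype,
                         favorable: Dict[str, Set[int]] = FAVORABLE) -> int:
    """Count marker loci where the plant carries >= 1 favorable allele."""
    return sum(
        1
        for marker, alleles in favorable.items()
        if genotype.get(marker, set()) & alleles
    )


def select_plants(population: Dict[str, Genotype],
                  min_loci: int = 2,
                  favorable: Dict[str, Set[int]] = FAVORABLE) -> List[str]:
    """Return plants with favorable alleles at min_loci or more markers."""
    return [
        plant_id
        for plant_id, genotype in population.items()
        if count_favorable_loci(genotype, favorable) >= min_loci
    ]


# Hypothetical genotypes: marker name -> set of observed allele sizes.
PLANTS: Dict[str, Genotype] = {
    "plant_A": {"SATT334": {204, 210}, "SATT495": {251}},
    "plant_B": {"SATT334": {204}, "SATT495": {248}},
    "plant_C": {"SATT334": {204}, "SATT495": {245}},
}

if __name__ == "__main__":
    print(select_plants(PLANTS))              # both loci favorable
    print(select_plants(PLANTS, min_loci=1))  # at least one locus favorable
```

Lowering `min_loci` relaxes the selection; counter-selection against unfavorable alleles could be sketched in the same manner.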
[0169] One of skill recognizes that the identification of favorable marker
alleles is germplasm-specific. The determination of which marker alleles
correlate with tolerance (or susceptibility) is determined for the
particular germplasm under study. One of skill recognizes that methods
for identifying the favorable alleles are routine and well known in the
art, and furthermore, that the identification and use of such favorable
alleles is well within the scope of the invention. Furthermore still,
identification of favorable marker alleles in soybean populations other
than the populations used or described herein is well within the scope of
the invention.
[0170] Amplification primers for amplifying SSR-type marker loci are a
feature of the invention. Other features of the invention are primers
specific for the amplification of SNP domains (SNP markers), and the
probes that are used to genotype the SNP sequences. FIGS. 2 and 3 provide
specific primers for locus amplification and probes for detecting
amplified marker loci. However, one of skill will
immediately recognize that other sequences to either side of the given
primers can be used in place of the given primers, so long as the primers
can amplify a region that includes the allele to be detected. Further, it
will be appreciated that the precise probe to be used for detection can
vary, e.g., any probe that can identify the region of a marker amplicon
to be detected can be substituted for those examples provided herein.
Further, the configuration of the amplification primers and detection
probes can, of course, vary. Thus, the invention is not limited to the
primers and probes specifically recited herein.
[0171] In some aspects, methods of the invention utilize an amplification
step to detect/genotype a marker locus. However, it will be appreciated
that amplification is not a requirement for marker detection--for
example, one can directly detect unamplified genomic DNA simply by
performing a Southern blot on a sample of genomic DNA. Procedures for
performing Southern blotting, amplification (PCR, LCR, or the like) and
many other nucleic acid detection methods are well established and are
taught, e.g., in Sambrook et al., Molecular Cloning--A Laboratory Manual
(3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor,
N.Y., 2000 ("Sambrook"); Current Protocols in Molecular Biology, F. M.
Ausubel et al., eds., Current Protocols, a joint venture between Greene
Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented
through 2002) ("Ausubel"); and PCR Protocols: A Guide to Methods and
Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif.
(1990) ("Innis"). Additional details regarding detection of nucleic acids
in plants can also be found, e.g., in Plant Molecular Biology (1993) Croy
(ed.) BIOS Scientific Publishers, Inc.
[0172] Separate detection probes can also be omitted in
amplification/detection methods, e.g., by performing a real time
amplification reaction that detects product formation by modification of
the relevant amplification primer upon incorporation into a product,
incorporation of labeled nucleotides into an amplicon, or by monitoring
changes in molecular rotation properties of amplicons as compared to
unamplified precursors (e.g., by fluorescence polarization).
[0173] Typically, molecular markers are detected by any established method
available in the art, including, without limitation, allele specific
hybridization (ASH) or other methods for detecting single nucleotide
polymorphisms (SNP), amplified fragment length polymorphism (AFLP)
detection, amplified variable sequence detection, randomly amplified
polymorphic DNA (RAPD) detection, restriction fragment length
polymorphism (RFLP) detection, self-sustained sequence replication
detection, simple sequence repeat (SSR) detection, single-strand
conformation polymorphisms (SSCP) detection, isozyme markers detection,
or the like. While the exemplary markers provided in the figures and
tables herein are either SSR or SNP (ASH) markers, any of the
aforementioned marker types can be employed in the context of the
invention to identify chromosome segments encompassing genetic elements
that contribute to superior agronomic performance (e.g., tolerance or
improved tolerance).
QTL Chromosome Intervals
[0174] In some aspects, the invention provides QTL chromosome intervals,
where a QTL (or multiple QTLs) that segregate with low iron tolerance are
contained in those intervals. A variety of methods well known in the art
are available for identifying chromosome intervals (also as described in
detail in EXAMPLE 3). The boundaries of such chromosome intervals are
drawn to encompass markers that will be linked to one or more QTL. In
other words, the chromosome interval is drawn such that any marker that
lies within that interval (including the terminal markers that define the
boundaries of the interval) can be used as markers for disease tolerance.
Each interval comprises at least one QTL, and furthermore, may indeed
comprise more than one QTL. Close proximity of multiple QTL in the same
interval may obfuscate the correlation of a particular marker with a
particular QTL, as one marker may demonstrate linkage to more than one
QTL. Conversely, if two markers in close proximity show
co-segregation with the desired phenotypic trait, it is sometimes unclear
whether those markers identify the same QTL or two different QTL.
Regardless, knowledge of how many QTL are in a particular interval is not
necessary to make or practice the invention.
[0175] The present invention provides soybean chromosome intervals, where
the markers within those intervals demonstrate co-segregation with
tolerance to low iron environmental conditions. Thus, each of these
intervals comprises at least one low iron tolerance QTL. These intervals
are:
TABLE-US-00001
Linkage Group   Flanking Markers          Method(s) of Identification
C1              S60210-TB and SATT391     Marker Clustering
F               P10598A-1 and SATT334     QTL Interval Mapping and Marker Clustering
F               SATT510 and SATT335       QTL Interval Mapping and Marker Clustering
G               P5219A-1 and P7659A-2     Marker Clustering
G               SAT_117 and S60143-TB     Marker Clustering
I               SATT451 and SATT367       Marker Clustering
L               SATT495 and P10649C-3     Marker Clustering
M               SATT250 and SATT346       QTL Interval Mapping
[0176] Each of the intervals described above shows a clustering of markers
that co-segregate with iron-deficiency tolerance. This clustering of
markers occurs in relatively small domains on the linkage groups,
indicating the presence of one or more QTL in those chromosome regions.
QTL intervals were drawn to encompass the markers that co-segregate with
environmental low iron tolerance. The intervals are defined by the
markers on their termini, where the interval encompasses all the markers
that map within the interval as well as the markers that define the
termini.
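For illustration only, the inclusive definition of an interval given above (all markers mapping within the interval, together with the markers that define its termini) can be sketched as a simple position test. The class and method names are hypothetical. The example uses the linkage group L interval flanked by SATT495 and P10649C-3, with the map positions for those two markers (4.5 cM and 12.5 cM) taken from the linkage group L marker listing provided elsewhere in this disclosure; positions on other maps may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A chromosome interval defined by its two terminal markers."""
    linkage_group: str
    left_marker: str
    right_marker: str
    left_cm: float
    right_cm: float

    def contains(self, linkage_group: str, position_cm: float) -> bool:
        """True if a locus lies within the interval, termini included."""
        return (
            linkage_group == self.linkage_group
            and self.left_cm <= position_cm <= self.right_cm
        )


# Interval on linkage group L; positions as listed for SATT495 and
# P10649C-3 in this disclosure's linkage group L marker table.
LG_L_INTERVAL = Interval("L", "SATT495", "P10649C-3", 4.5, 12.5)

if __name__ == "__main__":
    print(LG_L_INTERVAL.contains("L", 7.4))   # SATT232, inside
    print(LG_L_INTERVAL.contains("L", 13.4))  # SATT182, outside
```

A marker on a different linkage group is never within the interval, regardless of its numerical map position.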
[0177] In two cases, intervals that were drawn on LG-F by a marker
clustering effect were further independently confirmed by a QTL mapping
analysis, as described in detail in EXAMPLE 3. In those experiments,
markers in that domain of LG-F show a significant likelihood ratio
statistic (LRS) for the presence of one or more QTL responsible for the
low-iron tolerance trait. Optionally, because the two intervals on LG-F
are relatively close together, these two intervals can be viewed as a
single interval that contains one (or more) tolerance QTL.
Genetic Maps
[0178] As one of skill in the art will recognize, recombination
frequencies (and as a result, genetic map positions) in any particular
population are not static. The genetic distances separating two markers
(or a marker and a QTL) can vary depending on how the map positions are
determined. For example, variables such as the parental mapping
populations used, the software used in the marker mapping or QTL mapping,
and the parameters input by the user of the mapping software can
contribute to the QTL/marker genetic map relationships. However, it is
not intended that the invention be limited to any particular mapping
populations, use of any particular software, or any particular set of
software parameters to determine linkage of a particular marker or
chromosome interval with the low iron tolerance phenotype. It is well
within the ability of one of ordinary skill in the art to extrapolate the
novel features described herein to any soybean gene pool or population of
interest, and using any particular software and software parameters.
Indeed, observations regarding tolerance markers and chromosome intervals
in populations in addition to those described herein are readily made
using the teaching of the present disclosure.
[0179] Mapping Populations
[0180] Any suitable soybean strains can be used to generate mapping data
or for marker association studies. A large number of commonly used
soybean lines (e.g., commercial varieties) and mapping populations are
known in the art. Additional strains finding use with the invention are
also described in the present disclosure. Useful soybean mapping
populations and lines include but are not limited to:
TABLE-US-00002
Mapping Population
    Description/Reference

UP1C6-43 × 90B73
    UP1C6-43 is a public line from the University of Nebraska. 90B73 is
    notably susceptible to FEC, and is described in Plant Variety
    Protection Act, Certificate No. 200000152 for Soybean `90B73,`
    issued May 8, 2001; see also, U.S. Pat. No. 6,316,700, issued
    Nov. 13, 2001, to Hedges.

P1082 × 90B73
    P1082 is described in Plant Variety Protection Act, Certificate No.
    8200115, issued May 26, 1982. 90B73 is described in Plant Variety
    Protection Act, Certificate No. 200000152 for Soybean `90B73`
    issued May 8, 2001; see also, U.S. Pat. No. 6,316,700, issued
    Nov. 13, 2001, to Hedges.

Minsoy × Noir 1
    Recombinant Inbred Line (RIL) population derived by single seed
    descent, consisting of 240 F7-derived RILs. Described in Lark et
    al., (1993) "A genetic map of soybean (Glycine max L.) and using
    an intraspecific cross of two cultivars: Minosy and Noir 1," Theor.
    Appl. Genet., 86: 901-906; Mansur and Orf (1995) "Evaluation of
    soybean recombinant inbreds for agronomic performance in northern
    USA and Chile," Crop Sci., 35: 422-425; Mansur et al., (1996)
    "Genetic mapping of agronomic traits using recombinant inbred
    lines of soybean," Crop Sci., 36: 1327-1336. Developed at the
    University of Utah. See also Intl. Patent Appl. No. WO 98/49887,
    filed May 1, 1998.

Minsoy × Archer
    RIL population derived by single seed descent, consisting of 233
    F7-derived RILs. Described in Mansur and Orf (1995) "Evaluation of
    soybean recombinant inbreds for agronomic performance in northern
    USA and Chile," Crop Sci., 35: 422-425; Mansur et al., (1996)
    "Genetic mapping of agronomic traits using recombinant inbred
    lines of soybean," Crop Sci., 36: 1327-1336. Developed at the
    University of Utah. See also Intl. Patent Appl. No. WO 98/49887,
    filed May 1, 1998.

Noir 1 × Archer
    RIL population derived by single seed descent, consisting of 240
    F7-derived RILs. Described in Mansur and Orf (1995) "Evaluation of
    soybean recombinant inbreds for agronomic performance in northern
    USA and Chile," Crop Sci., 35: 422-425; Mansur et al., (1996)
    "Genetic mapping of agronomic traits using recombinant inbred
    lines of soybean," Crop Sci., 36: 1327-1336. Developed at the
    University of Utah. See also Intl. Patent Appl. No. WO 98/49887,
    filed May 1, 1998.

Clark × Harosoy
    Population derived from the cross of near isogenic lines (NILs) of
    the cultivars Clark and Harosoy. The population consists of
    derivatives of 57 F2 plants (see, Shoemaker and Specht (1995)
    "Integration of the soybean molecular and classical genetic linkage
    groups," Crop Sci., 35: 436-446). Developed at the University of
    Nebraska.

A81-356022 × PI468916
    This is an F2-derived mapping population from the interspecific
    cross of A81-356022 (Glycine max) and PI468916 (G. soja).
    The population consists of 59 F2 plant derivatives and has been
    described in detail (Keim et al., (1990) "RFLP mapping in soybean:
    association between marker loci and variation in quantitative
    traits," Genetics 126: 735-742; Shoemaker and Specht (1995)
    "Integration of the soybean molecular and classical genetic linkage
    groups," Crop Sci., 35: 436-446; Shoemaker and Olson (1993)
    Molecular linkage map of soybean (Glycine max L. Merr.).
    p.6.131-6.138, in Genetic maps: Locus maps of complex genomes
    [O'Brien (ed.)] Cold Spring Harbor Laboratory Press, New York).
    Commonly referred to as the USDA/Iowa State University Population
    (MS).

OX715 × P9242
    OX715 is a public variety. P9242 is described in Plant Variety
    Protection Act, Certificate No. 9300238 for Soybean `9242` issued
    May 30, 1997.

Sloan, Williams, Harosoy and Conrad
(various RILs derived from crosses of the above cultivars)
    See, Burnham et al., Crop Sci., "Quantitative Trait Loci for
    Partial Resistance to Phytophthora sojae in Soybean," 43: 1610-1617
    (2003); Weiss and Stevenson, Agron. J., 47: 541-543 (1955);
    Bernard and Lindahl, Crop Sci., 43: 101-105 (1972); Bahrenfus and
    Fehr, Crop Sci., 20: 673 (1980); Fehr et al., Crop Sci., 29: 830
    (1989).

Bert, Marcus, Corsoy, A92-627030, Simpson, OT92-1, Hendricks,
Freeborn, Surge, Kenwood 94
(various RILs derived from crosses of the above cultivars)
    See, Glover and Scott, "Heritability and Phenotypic Variation of
    Tolerance to Phytophthora Root Rot of Soybean," Crop Sci.,
    38: 1495-1500 (1998); and additional references made therein.

Essex × Forrest
Flyer × Hartwig
    See, Yuan et al., "Quantitative trait loci in two soybean
    recombinant inbred line populations segregating for yield and
    disease resistance," Crop Sci., 42: 271-277 (2002).

Williams × PI399073
S 19-90 × PI399073
    US Patent Appl. No. 2004/0034890, published Feb. 19, 2004; U.S.
    Pat. Appl. No. 2004/0261144, published Dec. 23, 2004.

9163 × 92B05
    P9163 is a commercially available Pioneer variety described in Plant
    Variety Protection Act, Certificate No. 9600053. 92B05 is a
    commercially available Pioneer variety described in Plant Variety
    Protection Act, Certificate No. 9900092 for Soybean `92B05` issued
    Sep. 21, 2000; see also, U.S. Pat. No. 5,942,668, issued Aug.
    24, 1999 to Grace et al.

9362 × 93B41
    P9362 is a commercially available Pioneer variety described in Plant
    Variety Protection Act, Certificate No. 9400098. 93B41 is a
    commercially available Pioneer variety described in Plant Variety
    Protection Act, Certificate No. 9800068; see also, U.S. Pat. No.
    5,750,853, issued May 12, 1998 to Fuller et al.

93B35
    Described in Plant Variety Protection Act, Certificate No.
    200000035, issued Apr. 24, 2001. See also, U.S. Pat. No. 6,153,818,
    issued Nov. 28, 2000.

93B53
    Described in Plant Variety Protection Act, Certificate No. 9900101,
    issued Oct. 27, 2000. See also, U.S. Pat. No. 6,075,182, issued
    Jun. 13, 2000.

93M11
    Described in Plant Variety Protection Act, Certificate No.
    200400080, issued Aug. 16, 2004. See also, U.S. Pat. No. 6,855,875,
    issued Feb. 15, 2005.

93B68
    Described in Plant Variety Protection Act, Certificate No.
    200200084, issued Jun. 10, 2002. See also, US Patent Appl. Serial
    No. 10/271,115.

93B72
    Described in Plant Variety Protection Act, Certificate No.
    200100071, issued May 8, 2001. See also, U.S. Pat. No. 6,566,589,
    issued May 20, 2003.

94B53
    Described in Plant Variety Protection Act, Certificate No.
    200000031, issued May 8, 2001. See also, U.S. Pat. No. 6,235,976,
    issued May 22, 2001.

94M80
    Described in pending Plant Variety Protection Act, Certificate No.
    200500084, filed Jan. 18, 2005. See also, pending US Patent Appl.
    Serial No. 10/768,275, filed Jan. 30, 2005.

9492
    Described in Plant Variety Protection Act, Certificate No. 9800077,
    issued Sep. 12, 2001. See also, U.S. Pat. No. 5,792,907, issued
    Aug. 11, 1998.
[0181] Mapping Software
[0182] A variety of commercial software is available for genetic mapping
and marker association studies (e.g., QTL mapping). This software
includes but is not limited to:
TABLE-US-00003
Software
    Description/References

JoinMap®
    Van Ooijen and Voorrips (2001) "JoinMap 3.0 software for the
    calculation of genetic linkage maps," Plant Research
    International, Wageningen, the Netherlands; and, Stam
    "Construction of integrated genetic linkage maps by means of a new
    computer package: JoinMap," The Plant Journal 3(5): 739-744 (1993)

MapQTL®
    J. W. van Ooijen, "Software for the mapping of quantitative trait
    loci in experimental populations," Kyazma B. V., Wageningen,
    Netherlands

MapManager QT
    Manly and Olson, "Overview of QTL mapping software and
    introduction to Map Manager QT," Mamm. Genome 10: 327-334 (1999)

MapManager QTX
    Manly, Cudmore and Meer, "MapManager QTX, cross-platform software
    for genetic mapping," Mamm. Genome 12: 930-932 (2001)

GeneFlow® and QTLocate™
    GENEFLOW, Inc. (Alexandria, VA)

TASSEL
    (Trait Analysis by aSSociation, Evolution, and Linkage) by
    Edward Buckler, and information about the program can be found on
    the Buckler Lab web page at the Institute for Genomic Diversity at
    Cornell University.
[0183] Unified Genetic Maps
[0184] "Unified," "consensus" or "integrated" genetic maps have been
created that incorporate mapping data from two or more sources, including
sources that used different mapping populations and different modes of
statistical analysis. The merging of genetic map information increases
the marker density on the map, as well as improving map resolution. These
improved maps can be advantageously used in marker assisted selection and
map-based cloning, provide an improved framework for positioning newly
identified molecular markers, and aid in the identification of QTL
chromosome intervals and clusters of advantageously-linked markers.
[0185] In some aspects, a consensus map is derived by simply overlaying
one map on top of another. In other aspects, various algorithms, e.g.,
JoinMap® analysis, allow the combination of genetic mapping data
from multiple sources, and reconcile discrepancies between mapping data
from the original sources. See, Van Ooijen, and Voorrips (2001) "JoinMap
3.0 software for the calculation of genetic linkage maps," Plant Research
International, Wageningen, the Netherlands; and, Stam (1993)
"Construction of integrated genetic linkage maps by means of a new
computer package: JoinMap," The Plant Journal 3(5):739-744.
[0186] FIG. 6 provides a composite genetic map that incorporates mapping
information from various sources. This map was derived using the
USDA/Iowa State University mapping population data (as described in
Cregan et al., "An Integrated Genetic Linkage Map of the Soybean Genome"
Crop Science 39:1464-1490 [1999]; and see references therein) as a
framework. Additional markers, as they became known, have been
continuously added to that map, including public SSR markers, EST-derived
markers, and SNP markers. This map contains approximately 750 soybean
markers that are distributed over each of the soybean chromosomes. The
markers that are on this map are known in the art (i.e., have been
previously described; see, e.g., the SOYBASE on-line resource for
extensive listings of these markers and descriptions of the individual
markers) or are described herein.
[0187] Additional integrated maps are known in the art. See, e.g., Cregan
et al., "An Integrated Genetic Linkage Map of the Soybean Genome" Crop
Science 39:1464-1490 (1999); and also International Application No.
PCT/US2004/024919 by Sebastian, filed Jul. 27, 2004, entitled "Soybean
Plants Having Superior Agronomic Performance and Methods for their
Production."
[0188] Song et al. provides another integrated soybean genetic map that
incorporates mapping information from five different mapping populations
(Song et al., "A New Integrated Genetic Linkage Map of the Soybean,"
Theor. Appl. Genet., 109:122-128 [2004]). This integrated map contains
approximately 1,800 soybean markers, including SSR and SNP-type markers,
as well as EST markers, RFLP markers, AFLP, RAPD, isozyme and classical
markers (e.g., seed coat color). The markers that are on this map are
known in the art and have been previously characterized. This information
is also available at the website for the Soybean Genomics and Improvement
Laboratory (SGIL) at the USDA Beltsville Agricultural Research Center
(BARC). See, specifically, the description of projects in the Cregan
Laboratory on that website.
[0189] The soybean integrated linkage map provided in Song et al. (2004)
is based on the principle described by Stam (1993) "Construction of
integrated genetic linkage maps by means of a new computer package:
JoinMap," The Plant Journal 3(5):739-744; and Van Ooijen and Voorrips
(2001) "JoinMap 3.0 software for the calculation of genetic linkage
maps," Plant Research International, Wageningen, the Netherlands. Mapping
information from five soybean populations was used in the map
integration, and also used to place recently identified SSR markers onto
the soybean genome. These mapping populations were Minsoy × Noir 1
(MN), Minsoy × Archer (MA), Noir 1 × Archer (NA),
Clark × Harosoy (CH) and A81-356022 × PI468916 (MS). The
JoinMap® analysis resulted in a map with 20 linkage groups containing
a total of 1849 markers, including 1015 SSRs, 709 RFLPs, 73 RAPDs, 24
classical traits, six AFLPs, ten isozymes and 12 others. Among the mapped
SSR markers were 417 previously uncharacterized SSRs.
[0190] Initially, LOD scores and pairwise recombination frequencies
between markers were calculated. A LOD of 5.0 was used to create groups
in the MS, MA, NA populations and LOD 4.0 in the MN and CH populations.
The map of each linkage group was then integrated. Recombination values
were converted to genetic distances using the Kosambi mapping function.
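The two calculations mentioned above can be sketched as follows, for illustration only. The LOD function shown is the textbook two-point LOD score for phase-known data (the likelihood at the estimated recombination fraction compared with the likelihood at free recombination, r = 0.5); it is offered as an assumption for explanatory purposes and is not asserted to be the exact statistic computed by JoinMap® or used by Song et al. The Kosambi mapping function is the standard formula d = 25 ln[(1 + 2r)/(1 - 2r)] cM. Function names are hypothetical.

```python
import math


def lod_score(recombinants: int, total: int) -> float:
    """Two-point LOD: log10 of L(r_hat) / L(r = 0.5), phase-known data."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    # Maximum-likelihood estimate of r, capped at free recombination.
    r_hat = min(recombinants / total, 0.5)
    non_recombinants = total - recombinants
    log_l = 0.0
    if recombinants > 0:
        log_l += recombinants * math.log10(r_hat)
    if non_recombinants > 0:
        log_l += non_recombinants * math.log10(1.0 - r_hat)
    return log_l - total * math.log10(0.5)


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction to centiMorgans (Kosambi)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # 10 recombinants among 100 informative meioses (illustrative counts).
    print(round(lod_score(10, 100), 2))
    print(round(kosambi_cm(0.10), 2))
```

Under this sketch, a pair of markers would be placed in the same linkage group when its LOD meets the chosen threshold (e.g., 4.0 or 5.0 as described above).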
Linked Markers
[0191] From the present disclosure and widely recognized in the art, it is
clear that any genetic marker that has a significant probability of
co-segregation with a phenotypic trait of interest (e.g., in the present
case, a pathogen tolerance or improved tolerance trait) can be used as a
marker for that trait. A list of useful QTL markers provided by the
present invention is given in FIG. 1.
[0192] In addition to the QTL markers noted in FIG. 1, additional markers
linked to (showing linkage disequilibrium with) the QTL markers can also
be used to predict the tolerance or improved tolerance trait in a soybean
plant. In other words, any other marker showing less than 50%
recombination frequency (separated by a genetic distance less than 50 cM)
with a QTL marker of the invention (e.g., the markers provided in FIG. 1)
is also a feature of the invention. Any marker that is linked to a QTL
marker can also be used advantageously in marker-assisted selection for
the particular trait.
[0193] Genetic markers that are linked to QTL markers (e.g., QTL markers
provided in FIG. 1) are particularly useful when they are sufficiently
proximal (e.g., closely linked) to a given QTL marker so that the genetic
marker and the QTL marker display a low recombination frequency. In the
present invention, such closely linked markers are a feature of the
invention. As defined herein, closely linked markers display a
recombination frequency of about 10% or less (e.g., the given marker is
within 10 cM of the QTL). Put another way, these closely linked loci
co-segregate at least 90% of the time. Indeed, the closer a marker is to
a QTL marker, the more effective and advantageous that marker becomes as
an indicator for the desired trait.
[0194] Thus, in other embodiments, closely linked loci such as a QTL
marker locus and a second locus display an inter-locus cross-over
frequency of about 10% or less, preferably about 9% or less, still more
preferably about 8% or less, yet more preferably about 7% or less, still
more preferably about 6% or less, yet more preferably about 5% or less,
still more preferably about 4% or less, yet more preferably about 3% or
less, and still more preferably about 2% or less. In highly preferred
embodiments, the relevant loci (e.g., a marker locus and a target locus
such as a QTL) display a recombination frequency of about 1% or less,
e.g., about 0.75% or less, more preferably about 0.5% or less, or yet
more preferably about 0.25% or less. Thus, the loci are about 10 cM, 9
cM, 8 cM, 7 cM, 6 cM, 5 cM, 4 cM, 3 cM, 2 cM, 1 cM, 0.75 cM, 0.5 cM or
0.25 cM or less apart. Put another way, two loci that are localized to
the same chromosome, and at such a distance that recombination between
the two loci occurs at a frequency of less than 10% (e.g., about 9%, 8%,
7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.75%, 0.5%, 0.25%, or less) are said to be
"proximal to" each other.
[0195] In some aspects, linked markers (including closely linked markers)
of the invention are determined by review of a genetic map, for example,
the integrated genetic map shown in FIG. 6. For example, it is shown
herein that the linkage group L marker SATT613 correlates with at least
one low-iron tolerance QTL. Markers that are linked to SATT613 (e.g.,
within 50 cM) can be determined from the map provided in FIG. 6. For
example, markers on linkage group L that are linked to SATT613 include:
TABLE-US-00004
Marker Name   Chrom.   Position (cM)
SATT495 L 4.5
SATT232 L 7.4
SATT446 L 9.2
P10649C-3 L 12.5
SATT182 L 13.4
SAT_301 L 15.6
SAT_071 L 19.7
SATT238 L 19.7
SATT388 L 22.2
SATT143 L 31.8
SAT_134 L 32.4
SATT523 L 32.4
SATT278 L 33.2
SATT418 L 33.9
SATT711 L 34.0
SATT398 L 34.6
SATT497 L 42.3
SATT313 L 43.9
SATT613 L 45.1
SATT284 L 47.7
SATT462 L 49.3
SAT_340 L 65.2
SATT156 L 65.8
SATT481 L 65.8
SCT_010 L 68.5
S60392-TB L 70.0
SATT076 L 72.3
SATT265 L 75.0
SATT527 L 75.7
SATT561 L 75.7
SATT166 L 77.1
SATT448 L 78.0
SATT678 L 80.9
SAT_099 L 89.4
SAG1055 L 95.0
[0196] In other aspects, closely linked markers of the invention can be
determined by review of this same genetic map. For example, markers that
are closely linked (e.g., separated by not more than 10 cM) to SATT613 on
linkage group L include:
TABLE-US-00005
Marker Name   Chrom.   Position (cM)
SATT497 L 42.3
SATT313 L 43.9
SATT613 L 45.1
SATT284 L 47.7
SATT462 L 49.3
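By way of example only, the review of a genetic map for linked and closely linked markers can be sketched as a distance filter over the linkage group L positions listed above. The function name is hypothetical; the marker names and positions are those given in the linkage group L listing above, and the 50 cM and 10 cM windows follow the definitions of linked and closely linked markers used herein.

```python
from typing import List, Tuple

# Linkage group L marker positions (cM), as listed in this disclosure.
LG_L_MAP: List[Tuple[str, float]] = [
    ("SATT495", 4.5), ("SATT232", 7.4), ("SATT446", 9.2),
    ("P10649C-3", 12.5), ("SATT182", 13.4), ("SAT_301", 15.6),
    ("SAT_071", 19.7), ("SATT238", 19.7), ("SATT388", 22.2),
    ("SATT143", 31.8), ("SAT_134", 32.4), ("SATT523", 32.4),
    ("SATT278", 33.2), ("SATT418", 33.9), ("SATT711", 34.0),
    ("SATT398", 34.6), ("SATT497", 42.3), ("SATT313", 43.9),
    ("SATT613", 45.1), ("SATT284", 47.7), ("SATT462", 49.3),
    ("SAT_340", 65.2), ("SATT156", 65.8), ("SATT481", 65.8),
    ("SCT_010", 68.5), ("S60392-TB", 70.0), ("SATT076", 72.3),
    ("SATT265", 75.0), ("SATT527", 75.7), ("SATT561", 75.7),
    ("SATT166", 77.1), ("SATT448", 78.0), ("SATT678", 80.9),
    ("SAT_099", 89.4), ("SAG1055", 95.0),
]


def markers_within(genetic_map: List[Tuple[str, float]],
                   anchor: str,
                   max_cm: float) -> List[str]:
    """Markers (anchor included) no more than max_cm from the anchor."""
    positions = dict(genetic_map)
    if anchor not in positions:
        raise KeyError(f"{anchor} is not on this map")
    anchor_cm = positions[anchor]
    return [name for name, cm in genetic_map
            if abs(cm - anchor_cm) <= max_cm]


if __name__ == "__main__":
    print(markers_within(LG_L_MAP, "SATT613", 10.0))      # closely linked
    print(len(markers_within(LG_L_MAP, "SATT613", 50.0)))  # linked
```

The same filter can be applied to any other map on which the marker of interest has been placed, with the understanding that map positions differ between maps.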
[0197] Similarly, linked markers (including closely linked markers) of the
invention can be determined by review of any suitable soybean genetic
map. For example, the integrated genetic map described in Song et al.
(2004) also provides a means to identify linked (including closely
linked) markers. See, Song et al., "A New Integrated Genetic Linkage Map
of the Soybean," Theor. Appl. Genet., 109:122-128 [2004]; see also the
website for the Soybean Genomics and Improvement Laboratory (SGIL) at the
USDA Beltsville Agricultural Research Center (BARC), and see specifically
the description of projects in the Cregan Laboratory on that website.
That genetic map incorporates a variety of genetic markers that are known
in the art or alternatively are described in that reference. Detailed
descriptions of numerous markers, including many of those described in
Song et al. (2004) can be found at the SOYBASE website resource.
[0198] For example, according to the Song et al. (2004) integrated genetic
map, markers on linkage group L that are closely linked to SATT613
include:
A264.sub.--1, RGA.sub.--7, Satt523, Sat.sub.--134, i8.sub.--2,
A450.sub.--2, A106.sub.--1, Sat.sub.--405, Satt143, B124.sub.--2,
A459.sub.--1, Satt398, Satt694, Sat.sub.--195, Sat.sub.--388, Satt652,
Satt711, Sat.sub.--187, Satt418, Satt278, Sat.sub.--397, Sat.sub.--191,
Sat.sub.--320, A204.sub.--2, Satt497, G214.sub.--17, Satt313,
B164.sub.--1, G214.sub.--16, Satt613, A023.sub.--1, Satt284, AW508247,
Satt462, L050.sub.--7, E014.sub.--1 and A071.sub.--5.
[0199] It is not intended that the determination of linked or closely
linked markers be limited to the use of any particular soybean genetic
map. Indeed, a large number of soybean genetic maps are available and are
well known to one of skill in the art. Another map that finds use with
the invention in this respect is the integrated soybean genetic map
found on the SOYBASE website resource. Alternatively still, the
determination of linked and closely linked markers can be made by the
generation of an experimental dataset and linkage analysis.
[0200] It is not intended that the identification of markers that are
linked (e.g., within about 50 cM or within about 10 cM) to the low iron
tolerance QTL markers identified herein be limited to any particular map
or methodology. The integrated genetic map provided in FIG. 6 serves only
as an example for identifying linked markers. Indeed, linked markers as
defined herein can be determined from any genetic map known in the art
(an experimental map or an integrated map), or alternatively, can be
determined from any new mapping dataset.
[0201] It is noted that lists of linked and closely linked markers may
vary between maps and methodologies due to various factors. First, the
markers that are placed on any two maps may not be identical, and
furthermore, some maps may have a greater marker density than others.
Also, the mapping populations, methodologies and algorithms used to
construct genetic maps can differ. One of skill in the art recognizes
that one genetic map is not necessarily more or less accurate than
another, and furthermore, recognizes that any soybean genetic map can be
used to determine markers that are linked and closely linked to the QTL
markers of the present invention.
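As one illustration of how such lists can differ, the following Python sketch compares the closely linked marker list read from the FIG. 6 map with the list recited above for the Song et al. (2004) map. The function names are hypothetical, marker names are compared without regard to letter case as an assumption of this sketch, and underscores stand in for the subscript notation used in the text.

```python
# Closely linked to SATT613 according to the FIG. 6 map (from the text above).
FIG6_CLOSE = ["SATT497", "SATT313", "SATT613", "SATT284", "SATT462"]

# Closely linked to SATT613 according to Song et al. (2004), as recited above.
SONG_CLOSE = [
    "A264_1", "RGA_7", "Satt523", "Sat_134", "i8_2", "A450_2", "A106_1",
    "Sat_405", "Satt143", "B124_2", "A459_1", "Satt398", "Satt694",
    "Sat_195", "Sat_388", "Satt652", "Satt711", "Sat_187", "Satt418",
    "Satt278", "Sat_397", "Sat_191", "Sat_320", "A204_2", "Satt497",
    "G214_17", "Satt313", "B164_1", "G214_16", "Satt613", "A023_1",
    "Satt284", "AW508247", "Satt462", "L050_7", "E014_1", "A071_5",
]


def normalize(name):
    # Assumption: names differing only in case (Satt613 vs SATT613) are the same marker.
    return name.upper()


def compare_marker_lists(list_a, list_b):
    """Return markers shared by both lists and markers unique to each list."""
    a = {normalize(n) for n in list_a}
    b = {normalize(n) for n in list_b}
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


comparison = compare_marker_lists(FIG6_CLOSE, SONG_CLOSE)
```

In this comparison every marker in the FIG. 6 list also appears in the Song et al. list, while the Song et al. list contains additional markers, consistent with differences in marker content and density between maps.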
Techniques for Marker Detection
[0202] The invention provides molecular markers that have a significant
probability of co-segregation with QTL that impart a low iron tolerance
phenotype. These QTL markers find use in marker assisted selection for
desired traits (tolerance or improved tolerance), and also have other
uses. It is not intended that the invention be limited to any particular
method for the detection of these markers.
[0203] Markers corresponding to genetic polymorphisms between members of a
population can be detected by numerous methods well-established in the
art (e.g., PCR-based sequence specific amplification, restriction
fragment length polymorphisms (RFLPs), isozyme markers, allele specific
hybridization (ASH), amplified variable sequences of the plant genome,
self-sustained sequence replication, simple sequence repeat (SSR), single
nucleotide polymorphism (SNP), random amplified polymorphic DNA ("RAPD")
or amplified fragment length polymorphisms (AFLP)). In one additional
embodiment, the presence or absence of a molecular marker is determined
simply through nucleotide sequencing of the polymorphic marker region.
This method is readily adapted to high throughput analysis as are the
other methods noted above, e.g., using available high throughput
sequencing methods such as sequencing by hybridization.
[0204] In general, the majority of genetic markers rely on one or more
property of nucleic acids for their detection. For example, some
techniques for detecting genetic markers utilize hybridization of a probe
nucleic acid to nucleic acids corresponding to the genetic marker (e.g.,
amplified nucleic acids produced using genomic soybean DNA as a
template). Hybridization formats, including but not limited to solution
phase, solid phase, mixed phase, or in situ hybridization assays are
useful for allele detection. An extensive guide to the hybridization of
nucleic acids is found in Tijssen (1993) Laboratory Techniques in
Biochemistry and Molecular Biology--Hybridization with Nucleic Acid
Probes Elsevier, N.Y., as well as in Sambrook, Berger and Ausubel
(herein).
[0205] For example, markers that comprise restriction fragment length
polymorphisms (RFLP) are detected, e.g., by hybridizing a probe which is
typically a sub-fragment (or a synthetic oligonucleotide corresponding to
a sub-fragment) of the nucleic acid to be detected to restriction
digested genomic DNA. The restriction enzyme is selected to provide
restriction fragments of at least two alternative (or polymorphic)
lengths in different individuals or populations. Determining one or more
restriction enzyme that produces informative fragments for each cross is
a simple procedure, well known in the art. After separation by length in
an appropriate matrix (e.g., agarose or polyacrylamide) and transfer to a
membrane (e.g., nitrocellulose, nylon, etc.), the labeled probe is
hybridized under conditions which result in equilibrium binding of the
probe to the target followed by removal of excess probe by washing.
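The principle that a restriction enzyme yields fragments of alternative lengths in different individuals can be sketched in silico. The following Python example is illustrative only: the two allele sequences are invented, and EcoRI (recognition site GAATTC, cutting after the first G) is chosen merely as a familiar example enzyme.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC on the strand shown).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele B has a point change that destroys the second site.
ALLELE_A = "ATGC" * 5 + "GAATTC" + "ATGC" * 10 + "GAATTC" + "ATGC" * 3
ALLELE_B = "ATGC" * 5 + "GAATTC" + "ATGC" * 10 + "GAGTTC" + "ATGC" * 3

fragments_a = digest(ALLELE_A)
fragments_b = digest(ALLELE_B)
```

The two alleles produce different fragment length patterns, which is the polymorphism that a labeled probe would reveal after separation by length and transfer to a membrane.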
[0206] Nucleic acid probes to the marker loci can be cloned and/or
synthesized. Any suitable label can be used with a probe of the
invention. Detectable labels suitable for use with nucleic acid probes
include, for example, any composition detectable by spectroscopic,
radioisotopic, photochemical, biochemical, immunochemical, electrical,
optical or chemical means. Useful labels include biotin for staining with
labeled streptavidin conjugate, magnetic beads, fluorescent dyes,
radiolabels, enzymes, and colorimetric labels. Other labels include
ligands which bind to antibodies labeled with fluorophores,
chemiluminescent agents, and enzymes. A probe can also constitute
radiolabelled PCR primers that are used to generate a radiolabelled
amplicon. Labeling strategies for labeling nucleic acids and
corresponding detection strategies can be found, e.g., in Haugland (1996)
Handbook of Fluorescent Probes and Research Chemicals Sixth Edition by
Molecular Probes, Inc. (Eugene Oreg.); or Haugland (2001) Handbook of
Fluorescent Probes and Research Chemicals Eighth Edition by Molecular
Probes, Inc. (Eugene Oreg.) (Available on CD ROM).
[0207] Amplification-Based Detection Methods
[0208] PCR, RT-PCR and LCR are in particularly broad use as amplification
and amplification-detection methods for amplifying nucleic acids of
interest (e.g., those comprising marker loci), facilitating detection of
the markers. Details regarding the use of these and other amplification
methods can be found in any of a variety of standard texts, including,
e.g., Sambrook, Ausubel, Berger and Croy, herein. Many available biology
texts also have extended discussions regarding PCR and related
amplification methods. One of skill will appreciate that essentially any
RNA can be converted into a double stranded DNA suitable for restriction
digestion, PCR expansion and sequencing using reverse transcriptase and a
polymerase ("Reverse Transcription-PCR," or "RT-PCR"). See also, Ausubel,
Sambrook and Berger, above.
[0209] Real Time Amplification/Detection Methods
[0210] In one aspect, real time PCR or LCR is performed on the
amplification mixtures described herein, e.g., using molecular beacons or
TaqMan.TM. probes. A molecular beacon (MB) is an oligonucleotide or PNA
which, under appropriate hybridization conditions, self-hybridizes to
form a stem and loop structure. The MB has a label and a quencher at the
termini of the oligonucleotide or PNA; thus, under conditions that permit
intra-molecular hybridization, the label is typically quenched (or at
least altered in its fluorescence) by the quencher. Under conditions
where the MB does not display intra-molecular hybridization (e.g., when
bound to a target nucleic acid, e.g., to a region of an amplicon during
amplification), the MB label is unquenched. Details regarding standard
methods of making and using MBs are well established in the literature
and MBs are available from a number of commercial reagent sources. See
also, e.g., Leone et al. (1998) "Molecular beacon probes combined with
amplification by NASBA enable homogenous real-time detection of RNA."
Nucleic Acids Res. 26:2150-2155; Tyagi and Kramer (1996) "Molecular
beacons: probes that fluoresce upon hybridization" Nature Biotechnology
14:303-308; Blok and Kramer (1997) "Amplifiable hybridization probes
containing a molecular switch" Mol Cell Probes 11:187-194; Hsuih et al.
(1996) "Novel, ligation-dependent PCR assay for detection of hepatitis C
in serum" J Clin Microbiol 34:501-507; Kostrikis et al. (1998) "Molecular
beacons: spectral genotyping of human alleles" Science 279:1228-1229;
Sokol et al. (1998) "Real time detection of DNA:RNA hybridization in
living cells" Proc. Natl. Acad. Sci. U.S.A. 95:11538-11543; Tyagi et al.
(1998) "Multicolor molecular beacons for allele discrimination" Nature
Biotechnology 16:49-53; Bonnet et al. (1999) "Thermodynamic basis of the
chemical specificity of structured DNA probes" Proc. Natl. Acad. Sci.
U.S.A. 96:6171-6176; Fang et al. (1999) "Designing a novel molecular
beacon for surface-immobilized DNA hybridization studies" J. Am. Chem.
Soc. 121:2921-2922; Marras et al. (1999) "Multiplex detection of
single-nucleotide variation using molecular beacons" Genet. Anal. Biomol.
Eng. 14:151-156; and Vet et al. (1999) "Multiplex detection of four
pathogenic retroviruses using molecular beacons" Proc. Natl. Acad. Sci.
U.S.A. 96:6394-6399. Additional details regarding MB construction and use
are found in the patent literature, e.g., U.S. Pat. No. 5,925,517 (Jul.
20, 1999) to Tyagi et al. entitled "Detectably labeled dual conformation
oligonucleotide probes, assays and kits;" U.S. Pat. No. 6,150,097 to
Tyagi et al (Nov. 21, 2000) entitled "Nucleic acid detection probes
having non-FRET fluorescence quenching and kits and assays including such
probes" and U.S. Pat. No. 6,037,130 to Tyagi et al (Mar. 14, 2000),
entitled "Wavelength-shifting probes and primers and their use in assays
and kits."
[0211] PCR detection and quantification using dual-labeled fluorogenic
oligonucleotide probes, commonly referred to as "TaqMan.TM." probes, can
also be performed according to the present invention. These probes are
composed of short (e.g., 20-25 base) oligodeoxynucleotides that are
labeled with two different fluorescent dyes. On the 5' terminus of each
probe is a reporter dye, and on the 3' terminus of each probe a quenching
dye is found. The oligonucleotide probe sequence is complementary to an
internal target sequence present in a PCR amplicon. When the probe is
intact, energy transfer occurs between the two fluorophores and emission
from the reporter is quenched by the quencher by FRET. During the
extension phase of PCR, the probe is cleaved by 5' nuclease activity of
the polymerase used in the reaction, thereby releasing the reporter from
the oligonucleotide-quencher and producing an increase in reporter
emission intensity. Accordingly, TaqMan.TM. probes are oligonucleotides
that have a label and a quencher, where the label is released during
amplification by the exonuclease action of the polymerase used in
amplification. This provides a real time measure of amplification during
synthesis. A variety of TaqMan.TM. reagents are commercially available,
e.g., from Applied Biosystems (Division Headquarters in Foster City,
Calif.) as well as from a variety of specialty vendors such as Biosearch
Technologies (e.g., black hole quencher probes).
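The relationship between probe cleavage and the real time signal can be illustrated with a highly idealized model. In the following Python sketch it is assumed, purely for illustration, that every template is copied once per cycle and that each newly synthesized strand releases exactly one reporter; the function name and the threshold value are hypothetical.

```python
def threshold_cycle(initial_copies, threshold, max_cycles=45):
    """Return the first cycle at which cumulative released reporter reaches threshold.

    Idealized assumptions: perfect doubling each cycle, and one probe cleaved
    (one reporter released) per newly synthesized strand.
    """
    copies = initial_copies
    released = 0
    for cycle in range(1, max_cycles + 1):
        new_strands = copies       # every existing template is copied once
        released += new_strands    # 5' nuclease cleavage frees one reporter each
        copies += new_strands
        if released >= threshold:
            return cycle
    return None                    # threshold never reached within max_cycles


ct_high_input = threshold_cycle(1000, 10**9)
ct_low_input = threshold_cycle(10, 10**9)
```

Under these assumptions, a sample with fewer starting copies crosses the threshold at a later cycle. Real reactions amplify with less than perfect efficiency, so the numbers here are not predictive.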
[0212] Additional Details Regarding Amplified Variable Sequences, SSR,
AFLP, ASH, SNPs and Isozyme Markers
[0213] Amplified variable sequences refer to amplified sequences of the
plant genome which exhibit high nucleic acid residue variability between
members of the same species. All organisms have variable genomic
sequences and each organism (with the exception of a clone) has a
different set of variable sequences. Once identified, the presence of
a specific variable sequence can be used to predict phenotypic traits.
Preferably, DNA from the plant serves as a template for amplification
with primers that flank a variable sequence of DNA. The variable sequence
is amplified and then sequenced.
[0214] Alternatively, self-sustained sequence replication can be used to
identify genetic markers. Self-sustained sequence replication refers to a
method of nucleic acid amplification using target nucleic acid sequences
which are replicated exponentially in vitro under substantially
isothermal conditions by using three enzymatic activities involved in
retroviral replication: (1) reverse transcriptase, (2) RNase H, and (3) a
DNA-dependent RNA polymerase (Guatelli et al. (1990) Proc Natl Acad Sci
USA 87:1874). By mimicking the retroviral strategy of RNA replication by
means of cDNA intermediates, this reaction accumulates cDNA and RNA
copies of the original target.
[0215] Amplified fragment length polymorphisms (AFLP) can also be used as
genetic markers (Vos et al. (1995) Nucl Acids Res 23:4407). The phrase
"amplified fragment length polymorphism" refers to selected restriction
fragments which are amplified before or after cleavage by a restriction
endonuclease. The amplification step allows easier detection of specific
restriction fragments. AFLP allows the detection of large numbers of
polymorphic markers and has been used for genetic mapping of plants
(Becker et al. (1995) Mol Gen Genet 249:65; and Meksem et al. (1995) Mol
Gen Genet 249:74).
[0216] Allele-specific hybridization (ASH) can be used to identify the
genetic markers of the invention. ASH technology is based on the stable
annealing of a short, single-stranded, oligonucleotide probe to a
completely complementary single-strand target nucleic acid. Detection is
via an isotopic or non-isotopic label attached to the probe.
[0217] For each polymorphism, two or more different ASH probes are
designed to have identical DNA sequences except at the polymorphic
nucleotides. Each probe will have exact homology with one allele sequence
so that the range of probes can distinguish all the known alternative
allele sequences. Each probe is hybridized to the target DNA. With
appropriate probe design and hybridization conditions, a single-base
mismatch between the probe and target DNA will prevent hybridization. In
this manner, only one of the alternative probes will hybridize to a
target sample that is homozygous or homogenous for an allele. Samples
that are heterozygous or heterogeneous for two alleles will hybridize to
both of two alternative probes.
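The genotype logic described above (one probe hybridizing for a homozygous sample, both probes for a heterozygous sample) can be sketched as follows. This Python example is illustrative only: the probe and allele sequences are invented, hybridization is modeled as an exact match, and probes are written in the same sense as the target for readability, whereas a real probe would be complementary to its target strand.

```python
# Hypothetical probes, identical except at the polymorphic nucleotide (T vs C).
PROBES = {"A": "CCGATAGGT", "B": "CCGACAGGT"}

# Hypothetical target sequences carrying each allele.
ALLELE_A = "TTAGCCGATAGGTCAAT"
ALLELE_B = "TTAGCCGACAGGTCAAT"


def hybridizing_probes(sample_alleles, probes):
    """Return names of probes that perfectly match at least one allele in the sample.

    A single-base mismatch is treated as preventing hybridization.
    """
    return sorted(name for name, seq in probes.items()
                  if any(seq in allele for allele in sample_alleles))


def call_genotype(sample_alleles, probes):
    """Call a genotype from the set of probes that hybridize."""
    hits = hybridizing_probes(sample_alleles, probes)
    if len(hits) == 1:
        return "homozygous " + hits[0]
    if len(hits) > 1:
        return "heterozygous " + "/".join(hits)
    return "no call"
```

For example, call_genotype([ALLELE_A, ALLELE_B], PROBES) reports a heterozygous sample because both probes find a perfect match.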
[0218] ASH markers are used as dominant markers where the presence or
absence of only one allele is determined from hybridization or lack of
hybridization by only one probe. The alternative allele may be inferred
from the lack of hybridization. ASH probe and target molecules are
optionally RNA or DNA; the target molecules are any length of nucleotides
beyond the sequence that is complementary to the probe; the probe is
designed to hybridize with either strand of a DNA target; the probe
ranges in size to conform to variously stringent hybridization
conditions, etc.
[0219] PCR allows the target sequence for ASH to be amplified from low
concentrations of nucleic acid in relatively small volumes. Otherwise,
the target sequence from genomic DNA is digested with a restriction
endonuclease and size separated by gel electrophoresis. Hybridizations
typically occur with the target sequence bound to the surface of a
membrane or, as described in U.S. Pat. No. 5,468,613, the ASH probe
sequence may be bound to a membrane.
[0220] In one embodiment, ASH data are typically obtained by amplifying
nucleic acid fragments (amplicons) from genomic DNA using PCR,
transferring the amplicon target DNA to a membrane in a dot-blot format,
hybridizing a labeled oligonucleotide probe to the amplicon target, and
observing the hybridization dots by autoradiography.
[0221] Single nucleotide polymorphisms (SNP) are markers that consist of a
shared sequence differentiated on the basis of a single nucleotide.
Typically, this distinction is detected by differential migration
patterns of an amplicon comprising the SNP on e.g., an acrylamide gel.
However, alternative modes of detection, such as hybridization, e.g.,
ASH, or RFLP analysis are also appropriate.
[0222] Isozyme markers can be employed as genetic markers, e.g., to track
markers other than the tolerance markers herein, or to track isozyme
markers linked to the markers herein. Isozymes are multiple forms of
enzymes that differ from one another in their amino acid, and therefore
their nucleic acid, sequences. Some isozymes are multimeric enzymes
containing slightly different subunits. Other isozymes are either
multimeric or monomeric but have been cleaved from the proenzyme at
different sites in the amino acid sequence. Isozymes can be characterized
and analyzed at the protein level, or alternatively, isozymes which
differ at the nucleic acid level can be determined. In such cases any of
the nucleic acid based methods described herein can be used to analyze
isozyme markers.
[0223] Additional Details Regarding Nucleic Acid Amplification
[0224] As noted, nucleic acid amplification techniques such as PCR and LCR
are well known in the art and can be applied to the present invention to
amplify and/or detect nucleic acids of interest, such as nucleic acids
comprising marker loci. Examples of techniques sufficient to direct
persons of skill through such in vitro methods, including the polymerase
chain reaction (PCR), the ligase chain reaction (LCR), Q.beta.-replicase
amplification and other RNA polymerase mediated techniques (e.g., NASBA),
are found in the references noted above, e.g., Innis, Sambrook, Ausubel,
Berger and Croy. Additional details are found in Mullis et al. (1987)
U.S. Pat. No. 4,683,202; Arnheim & Levinson (Oct. 1, 1990) C&EN 36-47;
The Journal Of NIH Research (1991) 3, 81-94; Kwoh et al. (1989) Proc.
Natl. Acad. Sci. USA 86, 1173; Guatelli et al. (1990) Proc. Natl. Acad.
Sci. USA 87, 1874; Lomell et al. (1989) J. Clin. Chem 35, 1826; Landegren
et al., (1988) Science 241, 1077-1080; Van Brunt (1990) Biotechnology 8,
291-294; Wu and Wallace, (1989) Gene 4, 560; Barringer et al. (1990) Gene
89, 117, and Sooknanan and Malek (1995) Biotechnology 13: 563-564.
Improved methods of amplifying large nucleic acids by PCR, which are
useful in the context of positional cloning, are further summarized in
Cheng et al. (1994) Nature 369: 684, and the references therein, in which
PCR amplicons of up to 40 kb are generated.
[0225] Detection of Markers for Positional Cloning
[0226] In some embodiments, a nucleic acid probe is used to detect a
nucleic acid that comprises a marker sequence. Such probes can be used,
for example, in positional cloning to isolate nucleotide sequences linked
to the marker nucleotide sequence. It is not intended that the nucleic
acid probes of the invention be limited to any particular size. In some
embodiments, the nucleic acid probe is at least 20 nucleotides in length, or
alternatively, at least 50 nucleotides in length, or alternatively, at
least 100 nucleotides in length, or alternatively, at least 200
nucleotides in length.
[0227] A hybridized probe is detected using autoradiography, fluorography
or other similar detection techniques depending on the label to be
detected. Examples of specific hybridization protocols are widely
available in the art, see, e.g., Berger, Sambrook, and Ausubel, all
herein.
[0228] Probe/Primer Synthesis Methods
[0229] In general, synthetic methods for making oligonucleotides,
including probes, primers, molecular beacons, PNAs, LNAs (locked nucleic
acids), etc., are well known. For example, oligonucleotides can be
synthesized chemically according to the solid phase phosphoramidite
triester method described by Beaucage and Caruthers (1981), Tetrahedron
Letts., 22(20):1859-1862, e.g., using a commercially available automated
synthesizer, e.g., as described in Needham-VanDevanter et al. (1984)
Nucleic Acids Res., 12:6159-6168. Oligonucleotides, including modified
oligonucleotides can also be ordered from a variety of commercial sources
known to persons of skill. There are many commercial providers of oligo
synthesis services, and thus this is a broadly accessible technology. Any
nucleic acid can be custom ordered from any of a variety of commercial
sources, such as The Midland Certified Reagent Company, The Great
American Gene Company, ExpressGen Inc., Operon Technologies Inc.
(Alameda, Calif.) and many others. Similarly, PNAs can be custom ordered
from any of a variety of sources, such as PeptidoGenic, HTI Bio-products,
Inc., BMA Biomedicals Ltd (U.K.), Bio-Synthesis, Inc., and many others.
[0230] In Silico Marker Detection
[0231] In alternative embodiments, in silico methods can be used to detect
the marker loci of interest. For example, the sequence of a nucleic acid
comprising the marker locus of interest can be stored in a computer. The
desired marker locus sequence or its homolog can be identified using an
appropriate nucleic acid search algorithm as provided, for example, in
such readily available programs as BLAST, or even simple word processors.
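At its simplest, such an in silico search is an exact text search of the stored sequence, comparable to the word processor search mentioned above. The following Python sketch is illustrative only: the function names and sequences are hypothetical, only exact matches are found, and a similarity search program such as BLAST would be needed to identify homologs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_marker(stored_sequence, marker):
    """Return sorted (position, strand) pairs for exact matches of the marker.

    Positions are 0-based on the stored (forward) strand; "-" means the match
    was found with the reverse complement of the marker sequence.
    """
    stored = stored_sequence.upper()
    query = marker.upper()
    hits = []
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        pos = stored.find(q)
        while pos != -1:
            hits.append((pos, strand))
            pos = stored.find(q, pos + 1)
    return sorted(hits)


# Hypothetical stored sequence containing the marker once on each strand.
example_hits = find_marker("ggacgttgttttcaacgtcc", "ACGTTG")
```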
Amplification Primers for Marker Detection
[0232] In some preferred embodiments, the molecular markers of the
invention are detected using a suitable PCR-based detection method, where
the size or sequence of the PCR amplicon is indicative of the absence or
presence of the marker (e.g., a particular marker allele). In these types
of methods, PCR primers are hybridized to the conserved regions flanking
the polymorphic marker region. As used in the art, PCR primers used to
amplify a molecular marker are sometimes termed "PCR markers" or simply
"markers."
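The way amplicon size reports on the marker allele can be sketched as an in silico amplification. In the following Python example the primer sequences and templates are invented for illustration (they are not the primers of FIGS. 2 and 3), and the polymorphic region is modeled as a simple dinucleotide repeat whose copy number differs between two alleles.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Return the length of the product bounded by the two primers, or None.

    The forward primer must match the template strand as written; the reverse
    primer anneals to the same strand, so its reverse complement is searched
    for downstream of the forward primer site.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical primers annealing to conserved flanks of a repeat region.
FORWARD = "ACGTTGCA"
REVERSE = "TTGACCGA"


def make_template(repeat_count):
    """Build a hypothetical template with repeat_count copies of an AT repeat."""
    return "GG" + FORWARD + "AT" * repeat_count + reverse_complement(REVERSE) + "CC"


allele_1_size = amplicon_length(make_template(10), FORWARD, REVERSE)
allele_2_size = amplicon_length(make_template(13), FORWARD, REVERSE)
```

The two alleles give amplicons that differ by six nucleotides, a difference that size resolution of the amplicons would distinguish.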
[0233] It will be appreciated that, although many specific examples of
primers are provided herein (see, FIGS. 2 and 3), suitable primers to be
used with the invention can be designed using any suitable method. It is
not intended that the invention be limited to any particular primer or
primer pair. For example, primers can be designed using any suitable
software program, such as LASERGENE.RTM..
[0234] In some embodiments, the primers of the invention are
radiolabelled, or labeled by any suitable means (e.g., using a
non-radioactive fluorescent tag), to allow for rapid visualization of the
different size amplicons following an amplification reaction without any
additional labeling step or visualization step. In some embodiments, the
primers are not labeled, and the amplicons are visualized following their
size resolution, e.g., following agarose gel electrophoresis. In some
embodiments, ethidium bromide staining of the PCR amplicons following
size resolution allows visualization of the different size amplicons.
[0235] It is not intended that the primers of the invention be limited to
generating an amplicon of any particular size. For example, the primers
used to amplify the marker loci and alleles herein are not limited to
amplifying the entire region of the relevant locus. The primers can
generate an amplicon of any suitable length that is longer or shorter
than those given in the allele definitions in FIG. 4. In some
embodiments, marker amplification produces an amplicon at least 20
nucleotides in length, or alternatively, at least 50 nucleotides in
length, or alternatively, at least 100 nucleotides in length, or
alternatively, at least 200 nucleotides in length. Marker alleles in
addition to those recited in FIG. 4 also find use with the present
invention.
Marker Assisted Selection and Breeding of Plants
[0236] A primary motivation for development of molecular markers in crop
species is the potential for increased efficiency in plant breeding
through marker assisted selection (MAS). Genetic markers are used to
identify plants that contain a desired genotype at one or more loci, and
that are expected to transfer the desired genotype, along with a desired
phenotype to their progeny. Genetic markers can be used to identify
plants that contain a desired genotype at one locus, or at several
unlinked or linked loci (e.g., a haplotype), and that would be expected
to transfer the desired genotype, along with a desired phenotype to their
progeny. The present invention provides the means to identify plants,
particularly soybean plants, that are tolerant, exhibit improved
tolerance or are susceptible to low iron growth conditions by identifying
plants having a specified allele at one of those loci, e.g., S60210-TB,
SAC1006, SATT391, SAC1724, SATT307, P13073A-1, P10598A-1, SATT334,
SATT510, SATT335, P5219A-1, P7659A-2, SAT.sub.--117, SATT191, S60143-TB,
SATT451, SATT367, SATT495, P10649C-3, SATT613, SATT257, SATT581 and/or
SATT153. Similarly, by identifying plants lacking the desired marker
locus, susceptible or less tolerant plants can be identified, and, e.g.,
eliminated from subsequent crosses. Similarly, these marker loci can be
introgressed into any desired genomic background, germplasm, plant, line,
variety, etc., as part of an overall MAS breeding program designed to
enhance soybean yield.
[0237] The invention also provides chromosome QTL intervals that find
equal use in MAS to select plants that demonstrate low iron tolerance or
improved tolerance. Similarly, the QTL intervals can also be used to
counter-select plants that are susceptible or have reduced tolerance to
low iron growth conditions. Any marker that maps within the QTL interval
(including the termini of the intervals) finds use with the invention.
These intervals are defined by the following pairs of markers:
[0238] (a) S60210-TB and SATT391 (LG-C1);
[0239] (b) P10598A-1 and SATT334 (LG-F);
[0240] (c) SATT510 and SATT335 (LG-F);
[0241] (d) P5219A-1 and P7659A-2 (LG-G);
[0242] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0243] (f) SATT451 and SATT367 (LG-I);
[0244] (g) SATT495 and P10649C-3 (LG-L); and
[0245] (h) SATT250 and SATT346 (LG-M).
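Determining which markers map within an interval, termini included, is a range check on map position. The following Python sketch illustrates this for interval (g) using linkage group L positions taken from the FIG. 6 map table provided earlier in this description; the function name is hypothetical.

```python
# Positions (cM) for part of linkage group L, from the FIG. 6 map table.
LG_L_POSITIONS = {
    "SATT495": 4.5,
    "SATT232": 7.4,
    "SATT446": 9.2,
    "P10649C-3": 12.5,
    "SATT182": 13.4,
    "SAT_301": 15.6,
}


def markers_in_interval(genetic_map, flank_a, flank_b):
    """Return markers mapping between two flanking markers, termini included."""
    low, high = sorted((genetic_map[flank_a], genetic_map[flank_b]))
    inside = [(pos, name) for name, pos in genetic_map.items()
              if low <= pos <= high]
    return [name for _pos, name in sorted(inside)]


interval_g = markers_in_interval(LG_L_POSITIONS, "SATT495", "P10649C-3")
```

The flanking markers may be given in either order, since the positions are sorted before the comparison.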
[0246] In general, MAS uses polymorphic markers that have been identified
as having a significant likelihood of co-segregation with a tolerance
trait. Such markers are presumed to map near a gene or genes that give
the plant its tolerance phenotype, and are considered indicators for the
desired trait, and are termed QTL markers. Plants are tested for the
presence of a desired allele in the QTL marker. The most preferred
markers (or marker alleles) are those that have the strongest association
with the tolerance trait.
[0247] Linkage analysis is used to determine which polymorphic marker
allele demonstrates a statistical likelihood of co-segregation with the
tolerance phenotype (thus, a "tolerance marker allele"). Following
identification of a marker allele for co-segregation with the tolerance
phenotype, it is possible to use this marker for rapid, accurate
screening of plant lines for the tolerance allele without the need to
grow the plants through their life cycle and await phenotypic
evaluations, and furthermore, permits genetic selection for the
particular tolerance allele even when the molecular identity of the
actual tolerance QTL is unknown. Tissue samples can be taken, for
example, from the first leaf of the plant and screened with the
appropriate molecular marker, and it is rapidly determined which progeny
will advance. Linked markers also remove the impact of environmental
factors that can often influence phenotypic expression.
[0248] A polymorphic QTL marker locus can be used to select plants that
contain the marker allele (or alleles) that correlate with the desired
tolerance phenotype, typically called marker-assisted selection (MAS). In
brief, a nucleic acid corresponding to the marker nucleic acid allele is
detected in a biological sample from a plant to be selected. This
detection can take the form of hybridization of a probe nucleic acid to a
marker allele or amplicon thereof, e.g., using allele-specific
hybridization, Southern analysis, northern analysis, in situ
hybridization, hybridization of primers followed by PCR amplification of
a region of the marker, or the like. A variety of procedures for
detecting markers are described herein, e.g., in the section entitled
"TECHNIQUES FOR MARKER DETECTION." After the presence (or absence) of a
particular marker allele in the biological sample is verified, the plant
is selected, e.g., used to make progeny plants by selective breeding.
[0249] Soybean plant breeders desire combinations of tolerance loci with
genes for high yield and other desirable traits to develop improved
soybean varieties. Screening large numbers of samples by non-molecular
methods (e.g., trait evaluation in soybean plants) can be expensive, time
consuming, and unreliable. Use of the polymorphic markers described
herein, when genetically linked to tolerance loci, provides an effective
method for selecting tolerant varieties in breeding programs. For
example, one advantage of marker-assisted selection over field
evaluations for tolerance is that MAS can be done at any time
of year, regardless of the growing season. Moreover, environmental
effects are largely irrelevant to marker-assisted selection.
[0250] When a population is segregating for multiple loci affecting one or
multiple traits, e.g., multiple loci involved in tolerance, or multiple
loci each involved in tolerance or resistance to different diseases, the
efficiency of MAS compared to phenotypic screening becomes even greater,
because all of the loci can be evaluated in the lab together from a
single sample of DNA. In the present instance, the S60210-TB, SAC1006,
SATT391, SAC1724, SATT307, P13073A-1, P10598A-1, SATT334, SATT510,
SATT335, P5219A-1, P7659A-2, SAT.sub.--117, SATT191, S60143-TB, SATT451,
SATT367, SATT495, P10649C-3, SATT613, SATT257, SATT581 and SATT153
markers, as well as any of the chromosome intervals:
[0251] (a) S60210-TB and SATT391 (LG-C1);
[0252] (b) P10598A-1 and SATT334 (LG-F);
[0253] (c) SATT510 and SATT335 (LG-F);
[0254] (d) P5219A-1 and P7659A-2 (LG-G);
[0255] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0256] (f) SATT451 and SATT367 (LG-I);
[0257] (g) SATT495 and P10649C-3 (LG-L); and
[0258] (h) SATT250 and SATT346 (LG-M),
can be assayed simultaneously or sequentially in a single sample or
population of samples.
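Evaluating several loci from one sample lends itself to a simple scoring scheme. The following Python sketch is illustrative only: the marker names are taken from the list above, but the allele labels, the choice of favorable alleles, and the plant genotypes are invented, and the ranking rule (count of favorable alleles) is an assumption of this sketch rather than a method prescribed herein.

```python
# Hypothetical favorable allele at each of three marker loci.
FAVORABLE = {"SATT613": "allele_1", "SATT334": "allele_2", "SATT451": "allele_1"}

# Hypothetical genotype calls, one DNA sample per plant.
GENOTYPES = {
    "plant_1": {"SATT613": "allele_1", "SATT334": "allele_2", "SATT451": "allele_1"},
    "plant_2": {"SATT613": "allele_2", "SATT334": "allele_2", "SATT451": "allele_2"},
    "plant_3": {"SATT613": "allele_1", "SATT334": "allele_1", "SATT451": "allele_1"},
}


def count_favorable(genotype, favorable):
    """Count loci at which the plant carries the favorable allele."""
    return sum(1 for marker, allele in favorable.items()
               if genotype.get(marker) == allele)


def rank_plants(genotypes, favorable):
    """Rank plants by number of favorable alleles, ties broken by plant name."""
    scored = [(count_favorable(g, favorable), plant)
              for plant, g in genotypes.items()]
    return [plant for _score, plant in
            sorted(scored, key=lambda item: (-item[0], item[1]))]


ranking = rank_plants(GENOTYPES, FAVORABLE)
```

A missing call at a locus is simply counted as not favorable, which keeps incomplete samples in the ranking rather than discarding them.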
[0259] Another use of MAS in plant breeding is to assist the recovery of
the recurrent parent genotype by backcross breeding. Backcross breeding
is the process of crossing a progeny back to one of its parents or parent
lines. Backcrossing is usually done for the purpose of introgressing one
or a few loci from a donor parent (e.g., a parent comprising desirable
tolerance marker loci) into an otherwise desirable genetic background
from the recurrent parent (e.g., an otherwise high yielding soybean
line). The more cycles of backcrossing that are done, the greater the
genetic contribution of the recurrent parent to the resulting
introgressed variety. This is often necessary, because tolerant plants
may be otherwise undesirable, e.g., due to low yield, low fecundity, or
the like. In contrast, strains which are the result of intensive breeding
programs may have excellent yield, fecundity or the like, merely being
deficient in one desired trait such as tolerance to low iron growth
conditions.
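The relationship between backcross cycles and recurrent parent contribution can be illustrated with the standard textbook expectation that each backcross halves the remaining donor share. The sketch below assumes no selection and ignores linkage around the introgressed locus; the function name is hypothetical and the formula is a general genetic expectation, not a result reported in this disclosure.

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected share of the genome derived from the recurrent parent.

    The F1 (zero backcrosses) carries one half from each parent, and every
    backcross to the recurrent parent halves the remaining donor share.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


for n in range(5):
    print(f"after {n} backcross(es): {expected_recurrent_fraction(n):.2%}")
```

Under these assumptions the expected recurrent parent share is 75% after one backcross and 87.5% after two; marker-based selection of progeny can recover the recurrent background faster than this average.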
[0260] Determination of the presence and/or absence of a particular genetic marker or
allele, e.g., S60210-TB, SAC1006, SATT391, SAC1724, SATT307, P13073A-1,
P10598A-1, SATT334, SATT510, SATT335, P5219A-1, P7659A-2, SAT_117,
SATT191, S60143-TB, SATT451, SATT367, SATT495, P10649C-3, SATT613,
SATT257, SATT581 and/or SATT153 markers, as well as any of the chromosome
intervals
[0261] (a) S60210-TB and SATT391 (LG-C1);
[0262] (b) P10598A-1 and SATT334 (LG-F);
[0263] (c) SATT510 and SATT335 (LG-F);
[0264] (d) P5219A-1 and P7659A-2 (LG-G);
[0265] (e) SAT_117 and S60143-TB (LG-G);
[0266] (f) SATT451 and SATT367 (LG-I);
[0267] (g) SATT495 and P10649C-3 (LG-L); and
[0268] (h) SATT250 and SATT346 (LG-M),
in the genome of a plant is made by any method noted herein. If the
nucleic acids from the plant are positive for a desired genetic marker
allele, the plant can be self fertilized to create a true breeding line
with the same genotype, or it can be crossed with a plant with the same
marker or with other desired characteristics to create a sexually crossed
hybrid generation.
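Where a marker-positive plant is heterozygous at the locus, self fertilization moves its descendants toward a true breeding state over several generations. A minimal sketch of the expected genotype fractions at a single locus follows; it assumes simple Mendelian segregation without selection, and the allele symbols `A` and `a` and the function name are hypothetical.

```python
def genotype_fractions_after_selfing(generations: int) -> dict:
    """Expected genotype fractions at one locus after repeatedly selfing a
    heterozygous (A/a) plant: heterozygosity halves in each generation."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    het = 0.5 ** generations
    hom_each = (1.0 - het) / 2.0
    return {"AA": hom_each, "Aa": het, "aa": hom_each}


for g in range(4):
    print(g, genotype_fractions_after_selfing(g))
```

In practice, genotyping the selfed progeny with the linked marker allows homozygous plants to be picked directly rather than waiting for heterozygosity to decay.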
[0269] Introgression Of Favorable Alleles--Efficient Backcrossing of
Tolerance Markers into Elite Lines
[0270] One application of MAS, in the context of the present invention, is
to use the tolerance or improved tolerance markers to increase the
efficiency of an introgression or backcrossing effort aimed at
introducing a tolerance QTL into a desired (typically high yielding)
background. In marker assisted backcrossing of specific markers (and
associated QTL) from a donor source, e.g., to an elite or exotic genetic
background, one selects among backcross progeny for the donor trait and
then uses repeated backcrossing to the elite or exotic line to
reconstitute as much of the elite/exotic background's genome as possible.
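The selection step described above can be sketched as a two-stage filter: keep the backcross progeny that inherited the donor allele at the target marker, then rank them by how much of the recurrent (elite) background they have recovered. The genotype codes ("RR" for homozygous recurrent, "DR" for heterozygous donor/recurrent), the background marker names `bg1` to `bg4`, the progeny names and the function name are all hypothetical.

```python
def rank_backcross_progeny(progeny, target_marker, background_markers):
    """Keep progeny carrying the donor allele at the target marker, then rank
    them by the share of background markers homozygous for the recurrent parent."""
    ranked = []
    for name, genotype in progeny.items():
        if "D" not in genotype[target_marker]:
            continue  # donor tolerance allele was not inherited
        recovered = sum(genotype[m] == "RR" for m in background_markers)
        ranked.append((name, recovered / len(background_markers)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical BC1 genotypes (illustration only).
progeny = {
    "BC1-01": {"SATT510": "DR", "bg1": "RR", "bg2": "DR", "bg3": "RR", "bg4": "RR"},
    "BC1-02": {"SATT510": "RR", "bg1": "RR", "bg2": "RR", "bg3": "RR", "bg4": "RR"},
    "BC1-03": {"SATT510": "DR", "bg1": "DR", "bg2": "DR", "bg3": "RR", "bg4": "DR"},
}
background = ["bg1", "bg2", "bg3", "bg4"]

for name, share in rank_backcross_progeny(progeny, "SATT510", background):
    print(name, f"{share:.0%} of background markers recovered")
```

Here the plant lacking the donor allele is discarded even though its background is fully recurrent, because it no longer carries the locus being introgressed.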
[0271] Thus, the markers and methods of the present invention can be
utilized to guide marker assisted selection or breeding of soybean
varieties with the desired complement (set) of allelic forms of
chromosome segments associated with superior agronomic performance
(tolerance, along with any other available markers for yield, disease
resistance, etc.). Any of the disclosed marker alleles can be introduced
into a soybean line via introgression, by traditional breeding (or
introduced via transformation, or both) to yield a soybean plant with
superior agronomic performance. The number of alleles associated with
tolerance that can be introduced or be present in a soybean plant of the
present invention ranges from 1 to the number of alleles disclosed
herein, each integer of which is incorporated herein as if explicitly
recited.
[0272] The present invention also extends to a method of making a progeny
soybean plant and these progeny soybean plants, per se. The method
comprises crossing a first parent soybean plant with a second soybean
plant and growing the female soybean plant under plant growth conditions
to yield soybean plant progeny. Methods of crossing and growing soybean
plants are well within the ability of those of ordinary skill in the art.
Such soybean plant progeny can be assayed for alleles associated with
tolerance and, thereby, the desired progeny selected. Such progeny plants
or seed can be sold commercially for soybean production, used for food,
processed to obtain a desired constituent of the soybean, or further
utilized in subsequent rounds of breeding. At least one of the first or
second soybean plants is a soybean plant of the present invention in that
it comprises at least one of the allelic forms of the markers of the
present invention, such that the progeny are capable of inheriting the
allele.
[0273] Often, a method of the present invention is applied to at least one
related soybean plant such as from progenitor or descendant lines in the
subject soybean plant's pedigree such that inheritance of the desired
tolerance allele can be traced. The number of generations separating the
soybean plants being subject to the methods of the present invention will
generally be from 1 to 20, commonly 1 to 5, and typically 1, 2, or 3
generations of separation, and quite often a direct descendant or parent
of the soybean plant will be subject to the method (i.e., one generation
of separation).
[0274] Introgression of Favorable Alleles--Incorporation of "Exotic"
Germplasm while Maintaining Breeding Progress
[0275] Genetic diversity is important for long term genetic gain in any
breeding program. With limited diversity, genetic gain will eventually
plateau when all of the favorable alleles have been fixed within the
elite population. One objective is to incorporate diversity into an elite
pool without losing the genetic gain that has already been made and with
the minimum possible investment. MAS provides an indication of which
genomic regions and which favorable alleles from the original ancestors
have been selected for and conserved over time, facilitating efforts to
incorporate favorable variation from exotic germplasm sources (parents
that are unrelated to the elite gene pool) in the hopes of finding
favorable alleles that do not currently exist in the elite gene pool.
[0276] For example, the markers of the present invention can be used for
MAS in crosses involving elite x exotic soybean lines by subjecting the
segregating progeny to MAS to maintain major yield alleles, along with
the tolerance marker alleles herein.
Positional Cloning
[0277] The molecular marker loci and alleles of the present invention,
e.g., S60210-TB, SAC1006, SATT391, SAC1724, SATT307, P13073A-1,
P10598A-1, SATT334, SATT510, SATT335, P5219A-1, P7659A-2, SAT_117,
SATT191, S60143-TB, SATT451, SATT367, SATT495, P10649C-3, SATT613,
SATT257, SATT581 and SATT153 markers, as well as any of the chromosome
intervals
[0278] (a) S60210-TB and SATT391 (LG-C1);
[0279] (b) P10598A-1 and SATT334 (LG-F);
[0280] (c) SATT510 and SATT335 (LG-F);
[0281] (d) P5219A-1 and P7659A-2 (LG-G);
[0282] (e) SAT_117 and S60143-TB (LG-G);
[0283] (f) SATT451 and SATT367 (LG-I);
[0284] (g) SATT495 and P10649C-3 (LG-L); and
[0285] (h) SATT250 and SATT346 (LG-M),
can be used, as indicated previously, to identify a tolerance QTL, which
can be cloned by well established procedures, e.g., as described in
detail in Ausubel, Berger and Sambrook, herein.
[0286] These tolerance clones are first identified by their genetic
linkage to markers of the present invention. Isolation of a nucleic acid
of interest is achieved by any number of methods as discussed in detail
in such references as Ausubel, Berger and Sambrook, herein, and Clark,
Ed. (1997) Plant Molecular Biology: A Laboratory Manual Springer-Verlag,
Berlin.
[0287] For example, "positional gene cloning" uses the proximity of a
tolerance marker to physically define an isolated chromosomal fragment
containing a tolerance QTL gene. The isolated chromosomal fragment can be
produced by such well known methods as digesting chromosomal DNA with one
or more restriction enzymes, or by amplifying a chromosomal region in a
polymerase chain reaction (PCR), or any suitable alternative
amplification reaction. The digested or amplified fragment is typically
ligated into a vector suitable for replication, and, e.g., expression, of
the inserted fragment. Markers that are adjacent to an open reading frame
(ORF) associated with a phenotypic trait can hybridize to a DNA clone
(e.g., a clone from a genomic DNA library), thereby identifying a clone
on which an ORF (or a fragment of an ORF) is located. If the marker is
more distant, a fragment containing the open reading frame is identified
by successive rounds of screening and isolation of clones which together
comprise a contiguous sequence of DNA, a process termed "chromosome
walking", resulting in a "contig" or "contig map." Protocols sufficient
to guide one of skill through the isolation of clones associated with
linked markers are found in, e.g., Berger, Sambrook and Ausubel, all
herein.
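The logic of chromosome walking can be sketched computationally as a greedy walk over overlapping clones. The sketch below assumes, purely for illustration, that each clone's start and end coordinates on the chromosome are already known and that the target lies to the right of the marker; in the laboratory, overlap is instead detected experimentally, e.g., by hybridization of clone ends to the library. The clone names, the coordinates and the function name are hypothetical.

```python
def chromosome_walk(clones, marker_pos, target_pos):
    """Return the names of clones forming a contig from the marker to the target.

    clones: list of (name, start, end) tuples; assumes target_pos >= marker_pos.
    Returns an empty list when no clone contains the marker or a gap is hit.
    """
    starts = [c for c in clones if c[1] <= marker_pos <= c[2]]
    if not starts:
        return []
    current = max(starts, key=lambda c: c[2])
    path = [current]
    while current[2] < target_pos:
        # Clones that overlap the current clone and extend further right.
        candidates = [c for c in clones if c[1] <= current[2] and c[2] > current[2]]
        if not candidates:
            return []  # gap in the library: the walk cannot continue
        current = max(candidates, key=lambda c: c[2])
        path.append(current)
    return [c[0] for c in path]


# Hypothetical clone coordinates in kilobases (illustration only).
library = [("c1", 0, 100), ("c2", 80, 190), ("c3", 150, 260),
           ("c4", 90, 120), ("c5", 300, 400)]
print(chromosome_walk(library, marker_pos=50, target_pos=250))
```

Each step corresponds to one round of screening and isolation; choosing the clone that reaches furthest keeps the number of rounds small.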
Generation of Transgenic Cells and Plants
[0288] The present invention also relates to host cells and organisms
which are transformed with nucleic acids corresponding to tolerance QTL
identified according to the invention. For example, such nucleic acids
include chromosome intervals (e.g., genomic fragments), ORFs and/or cDNAs
that encode a tolerance or improved tolerance trait. Additionally, the
invention provides for the production of polypeptides that provide
tolerance or improved tolerance by recombinant techniques.
[0289] General texts which describe molecular biological techniques for
the cloning and manipulation of nucleic acids and production of encoded
polypeptides include Berger and Kimmel, Guide to Molecular Cloning
Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San
Diego, Calif. (Berger); Sambrook et al., Molecular Cloning--A Laboratory
Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y., 2001 ("Sambrook") and Current Protocols in Molecular
Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture
between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.,
(supplemented through 2004 or later) ("Ausubel"). These texts describe
mutagenesis, the use of vectors, promoters and many other relevant topics
related to, e.g., the generation of clones that comprise nucleic acids of
interest, e.g., marker loci, marker probes, QTL that segregate with
marker loci, etc.
[0290] Host cells are genetically engineered (e.g., transduced,
transfected, transformed, etc.) with the vectors of this invention (e.g.,
vectors, such as expression vectors which comprise an ORF derived from or
related to a tolerance QTL) which can be, for example, a cloning vector,
a shuttle vector or an expression vector. Such vectors are, for example,
in the form of a plasmid, a phagemid, an Agrobacterium, a virus, a naked
polynucleotide (linear or circular), or a conjugated polynucleotide.
Vectors can be introduced into bacteria, especially for the purpose of
propagation and expansion. The vectors are also introduced into plant
tissues, cultured plant cells or plant protoplasts by a variety of
standard methods known in the art, including but not limited to
electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82:5824),
infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn
et al. (1982) Molecular Biology of Plant Tumors (Academic Press, New
York), pp. 549-560; Howell U.S. Pat. No. 4,407,956), high velocity
ballistic penetration by small particles with the nucleic acid either
within the matrix of small beads or particles, or on the surface (Klein
et al. (1987) Nature 327:70), use of pollen as vector (WO 85/01856), or
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA
plasmid in which DNA fragments are cloned. The T-DNA plasmid is
transmitted to plant cells upon infection by Agrobacterium tumefaciens,
and a portion is stably integrated into the plant genome (Horsch et al.
(1984) Science 233:496; Fraley et al. (1983) Proc. Natl. Acad. Sci. USA
80:4803). Additional details regarding nucleic acid introduction methods
are found in Sambrook, Berger and Ausubel, infra. The method of
introducing a nucleic acid of the present invention into a host cell is
not critical to the instant invention, and it is not intended that the
invention be limited to any particular method for introducing exogenous
genetic material into a host cell. Thus, any suitable method, e.g.,
including but not limited to the methods provided herein, which provides
for effective introduction of a nucleic acid into a cell or protoplast
can be employed and finds use with the invention.
[0291] The engineered host cells can be cultured in conventional nutrient
media modified as appropriate for such activities as, for example,
activating promoters or selecting transformants. These cells can
optionally be cultured into transgenic plants. In addition to Sambrook,
Berger and Ausubel, all infra, plant regeneration from cultured
protoplasts is described in Evans et al. (1983) "Protoplast Isolation and
Culture," Handbook of Plant Cell Cultures 1, 124-176 (MacMillan
Publishing Co., New York); Davey (1983) "Recent Developments in the
Culture and Regeneration of Plant Protoplasts," Protoplasts, pp. 12-29,
(Birkhauser, Basel); Dale (1983) "Protoplast Culture and Plant
Regeneration of Cereals and Other Recalcitrant Crops," Protoplasts pp.
31-41, (Birkhauser, Basel); Binding (1985) "Regeneration of Plants,"
Plant Protoplasts, pp. 21-73, (CRC Press, Boca Raton, Fla.). Additional
details regarding plant cell culture and regeneration include Payne et
al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley &
Sons, Inc. New York, N.Y.; Gamborg and Phillips (eds) (1995) Plant Cell,
Tissue and Organ Culture; Fundamental Methods Springer Lab Manual,
Springer-Verlag (Berlin Heidelberg New York) and Plant Molecular Biology
(1993) R. R. D. Croy, Ed. Bios Scientific Publishers, Oxford, U.K. ISBN 0
12 198370 6. Cell culture media in general are also set forth in Atlas
and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press,
Boca Raton, Fla. Additional information for cell culture is found in
available commercial literature such as the Life Science Research Cell
Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis, Mo.)
("Sigma-LSRCCC") and, e.g., the Plant Culture Catalogue and supplement
(e.g., 1997 or later) also from Sigma-Aldrich, Inc (St Louis, Mo.)
("Sigma-PCCS").
[0292] The present invention also relates to the production of transgenic
organisms, which may be bacteria, yeast, fungi, animals or plants,
transduced with the nucleic acids of the invention (e.g., nucleic acids
comprising the marker loci and/or QTL noted herein). A thorough
discussion of techniques relevant to bacteria, unicellular eukaryotes and
cell culture is found in references enumerated herein and is briefly
outlined as follows. Several well-known methods of introducing target
nucleic acids into bacterial cells are available, any of which may be
used in the present invention. These include: fusion of the recipient
cells with bacterial protoplasts containing the DNA, treatment of the
cells with liposomes containing the DNA, electroporation, projectile
bombardment (biolistics), carbon fiber delivery, and infection with viral
vectors (discussed further, below), etc. Bacterial cells can be used to
amplify the number of plasmids containing DNA constructs of this
invention. The bacteria are grown to log phase and the plasmids within
the bacteria can be isolated by a variety of methods known in the art
(see, for instance, Sambrook). In addition, a plethora of kits are
commercially available for the purification of plasmids from bacteria.
For their proper use, follow the manufacturer's instructions (see, for
example, EasyPrep™ and FlexiPrep™, both from Pharmacia Biotech;
StrataClean™ from Stratagene; and QIAprep™ from Qiagen). The
isolated and purified plasmids are then further manipulated to produce
other plasmids, used to transfect plant cells or incorporated into
Agrobacterium tumefaciens related vectors to infect plants. Typical
vectors contain transcription and translation terminators, transcription
and translation initiation sequences, and promoters useful for regulation
of the expression of the particular target nucleic acid. The vectors
optionally comprise generic expression cassettes containing at least one
independent terminator sequence, sequences permitting replication of the
cassette in eukaryotes, or prokaryotes, or both, (e.g., shuttle vectors)
and selection markers for both prokaryotic and eukaryotic systems.
Vectors are suitable for replication and integration in prokaryotes,
eukaryotes, or preferably both. See, Gillman & Smith (1979) Gene 8:81;
Roberts et al. (1987) Nature 328:731; Schneider et al. (1995) Protein
Expr. Purif. 6435:10; Ausubel, Sambrook, Berger (all infra). A catalogue
of Bacteria and Bacteriophages useful for cloning is provided, e.g., by
the ATCC, e.g., The ATCC Catalogue of Bacteria and Bacteriophage (1992)
Gherna et al. (eds) published by the ATCC. Additional basic procedures
for sequencing, cloning and other aspects of molecular biology and
underlying theoretical considerations are also found in Watson et al.
(1992) Recombinant DNA, Second Edition, Scientific American Books, NY. In
addition, essentially any nucleic acid (and virtually any labeled nucleic
acid, whether standard or non-standard) can be custom or standard ordered
from any of a variety of commercial sources, such as the Midland
Certified Reagent Company (Midland, Tex.), The Great American Gene
Company (Ramona, Calif.), ExpressGen Inc. (Chicago, Ill.), Operon
Technologies Inc. (Alameda, Calif.) and many others.
[0293] Introducing Nucleic Acids into Plants.
[0294] Embodiments of the present invention pertain to the production of
transgenic plants comprising the cloned nucleic acids, e.g., isolated
ORFs and cDNAs encoding tolerance genes. Techniques for transforming
plant cells with nucleic acids are widely available and can be readily
adapted to the invention. In addition to Berger, Ausubel and Sambrook,
all infra, useful general references for plant cell cloning, culture and
regeneration include Jones (ed) (1995) Plant Gene Transfer and Expression
Protocols--Methods in Molecular Biology, Volume 49 Humana Press, Totowa,
N.J.; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems
John Wiley & Sons, Inc. New York, N.Y. (Payne); and Gamborg and Phillips
(eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods
Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York)
(Gamborg). A variety of cell culture media are described in Atlas and
Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca
Raton, Fla. (Atlas). Additional information for plant cell culture is
found in available commercial literature such as the Life Science
Research Cell Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis,
Mo.) (Sigma-LSRCCC) and, e.g., the Plant Culture Catalogue and supplement
(1997) also from Sigma-Aldrich, Inc (St Louis, Mo.) (Sigma-PCCS).
Additional details regarding plant cell culture are found in Croy, (ed.)
(1993) Plant Molecular Biology, Bios Scientific Publishers, Oxford, U.K.
[0295] The nucleic acid constructs of the invention, e.g., plasmids,
cosmids, artificial chromosomes, DNA and RNA polynucleotides, are
introduced into plant cells, either in culture or in the organs of a
plant by a variety of conventional techniques. Where the sequence is
expressed, the sequence is optionally combined with transcriptional and
translational initiation regulatory sequences which direct the
transcription or translation of the sequence from the exogenous DNA in
the intended tissues of the transformed plant.
[0296] Isolated nucleic acids of the present invention can be
introduced into plants according to any of a variety of techniques known
in the art. Techniques for transforming a wide variety of higher plant
species are also well known and described in widely available technical,
scientific, and patent literature. See, for example, Weising et al.
(1988) Ann. Rev. Genet. 22:421-477.
[0297] The DNA constructs of the invention, for example plasmids,
phagemids, cosmids, phage, naked or variously conjugated-DNA
polynucleotides, (e.g., polylysine-conjugated DNA, peptide-conjugated
DNA, liposome-conjugated DNA, etc.), or artificial chromosomes, can be
introduced directly into the genomic DNA of the plant cell using
techniques such as electroporation and microinjection of plant cell
protoplasts, or the DNA constructs can be introduced directly to plant
cells using ballistic methods, such as DNA particle bombardment.
[0298] Microinjection techniques for injecting, e.g., plant cells,
embryos, callus and protoplasts, are known in the art and well described
in the scientific and patent literature. For example, a number of methods
are described in Jones (ed) (1995) Plant Gene Transfer and Expression
Protocols--Methods in Molecular Biology, Volume 49 Humana Press, Totowa,
N.J., as well as in the other references noted herein and available in
the literature.
[0299] For example, the introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski, et al., EMBO J. 3:2717
(1984). Electroporation techniques are described in Fromm, et al., Proc.
Natl. Acad. Sci. USA 82:5824 (1985). Ballistic transformation techniques
are described in Klein, et al., Nature 327:70-73 (1987). Additional
details are found in Jones (1995) and Gamborg and Phillips (1995), supra,
and in U.S. Pat. No. 5,990,387.
[0300] Alternatively, and in some cases preferably, Agrobacterium mediated
transformation is employed to generate transgenic plants.
Agrobacterium-mediated transformation techniques, including disarming and
use of binary vectors, are also well described in the scientific
literature. See, for example, Horsch, et al. (1984) Science 233:496; and
Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80:4803 and recently
reviewed in Hansen and Chilton (1998) Current Topics in Microbiology
240:22 and Das (1998) Subcellular Biochemistry 29: Plant Microbe
Interactions, pp 343-363.
[0301] DNA constructs are optionally combined with suitable T-DNA flanking
regions and introduced into a conventional Agrobacterium tumefaciens host
vector. The virulence functions of the Agrobacterium tumefaciens host
will direct the insertion of the construct and adjacent marker into the
plant cell DNA when the cell is infected by the bacteria. See, U.S. Pat.
No. 5,591,616. Although Agrobacterium is useful primarily in dicots,
certain monocots can be transformed by Agrobacterium. For instance,
Agrobacterium transformation of maize is described in U.S. Pat. No.
5,550,318.
[0302] Other methods of transfection or transformation include (1)
Agrobacterium rhizogenes-mediated transformation (see, e.g., Lichtenstein
and Fuller (1987) In: Genetic Engineering, vol. 6, P W J Rigby, Ed.,
London, Academic Press; and Lichtenstein, C. P., and Draper (1985) In:
DNA Cloning, Vol. II, D. M. Glover, Ed., Oxford, IRL Press; WO 88/02405,
published Apr. 7, 1988, describes the use of A. rhizogenes strain A4 and
its Ri plasmid along with A. tumefaciens vectors pARC8 or pARC16), (2)
liposome-mediated DNA uptake (see, e.g., Freeman et al. (1984) Plant Cell
Physiol. 25:1353), and (3) the vortexing method (see, e.g., Kindle (1990)
Proc. Natl. Acad. Sci. (USA) 87:1228).
[0303] DNA can also be introduced into plants by direct DNA transfer into
pollen as described by Zhou et al. (1983) Methods in Enzymology, 101:433;
D. Hess (1987) Intern Rev. Cytol. 107:367; Luo et al. (1988) Plant Mol.
Biol. Reporter 6:165. Expression of polypeptide coding genes can be
obtained by injection of the DNA into reproductive organs of a plant as
described by Pena et al. (1987) Nature 325:274. DNA can also be injected
directly into the cells of immature embryos and the desiccated embryos
rehydrated as described by Neuhaus et al. (1987) Theor. Appl. Genet.
75:30; and Benbrook et al. (1986) in Proceedings Bio Expo Butterworth,
Stoneham, Mass., pp. 27-54. A variety of plant viruses that can be
employed as vectors are known in the art and include cauliflower mosaic
virus (CaMV), geminivirus, brome mosaic virus, and tobacco mosaic virus.
[0304] Generation/Regeneration of Transgenic Plants
[0305] Transformed plant cells which are derived by any of the above
transformation techniques can be cultured to regenerate a whole plant
that possesses the transformed genotype and thus the desired phenotype.
Such regeneration techniques rely on manipulation of certain
phytohormones in a tissue culture growth medium, typically relying on a
biocide and/or herbicide marker which has been introduced together with
the desired nucleotide sequences. Plant regeneration from cultured
protoplasts is described in Payne et al. (1992) Plant Cell and Tissue
Culture in Liquid Systems John Wiley & Sons, Inc. New York, N.Y.; Gamborg
and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture;
Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin
Heidelberg New York); Evans et al. (1983) Protoplasts Isolation and
Culture, Handbook of Plant Cell Culture pp. 124-176, Macmillan
Publishing Company, New York; and Binding (1985) Regeneration of Plants,
Plant Protoplasts pp. 21-73, CRC Press, Boca Raton. Regeneration can also
be obtained from plant callus, explants, somatic embryos (Dandekar et al.
(1989) J. Tissue Cult. Meth. 12:145; McGranahan, et al. (1990) Plant Cell
Rep. 8:512) organs, or parts thereof. Such regeneration techniques are
described generally in Klee et al. (1987) Ann. Rev. of Plant Phys.
38:467-486. Additional details are found in Payne (1992) and Jones
(1995), both supra, and Weissbach and Weissbach, eds. (1988) Methods for
Plant Molecular Biology Academic Press, Inc., San Diego, Calif. This
regeneration and growth process includes the steps of selection of
transformant cells and shoots, rooting the transformant shoots and growth
of the plantlets in soil. These methods are adapted to the invention to
produce transgenic plants bearing QTLs and other genes isolated according
to the methods of the invention.
[0306] In addition, the regeneration of plants containing the
polynucleotide of the present invention and introduced by Agrobacterium
into cells of leaf explants can be achieved as described by Horsch et al.
(1985) Science 227:1229-1231. In this procedure, transformants are grown
in the presence of a selection agent and in a medium that induces the
regeneration of shoots in the plant species being transformed as
described by Fraley et al. (1983) Proc. Natl. Acad. Sci. (U.S.A.)
80:4803. This procedure typically produces shoots within two to four
weeks and these transformant shoots are then transferred to an
appropriate root-inducing medium containing the selective agent and an
antibiotic to prevent bacterial growth. Transgenic plants of the present
invention may be fertile or sterile.
[0307] It is not intended that plant transformation and expression of
polypeptides that provide disease resistance, as provided by the present
invention, be limited to soybean species. Indeed, it is contemplated that
the polypeptides that provide disease tolerance in soybean can also
provide disease resistance when transformed and expressed in other
agronomically and horticulturally important species. Such species include
primarily dicots, e.g., of the families: Leguminosae (including pea,
beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover,
alfalfa, lupine, vetch, lotus, sweet clover, wisteria, and sweetpea);
and, Compositae (the largest family of vascular plants, including at
least 1,000 genera, including important commercial crops such as
sunflower).
[0308] Additionally, preferred targets for modification with the nucleic
acids of the invention include, as well as those specified above, plants
from the genera: Allium, Apium, Arachis, Brassica, Capsicum, Cicer, Cucumis,
Cucurbita, Daucus, Fagopyrum, Glycine, Helianthus, Lactuca, Lens,
Lycopersicon, Medicago, Pisum, Phaseolus, Solanum, Trifolium, Vigna, and
many others.
[0309] Common crop plants which are targets of the present invention
include soybean, sunflower, canola, peas, beans, lentils, peanuts, yam
beans, cowpeas, velvet beans, clover, alfalfa, lupine, vetch, sweet
clover, sweetpea, field pea, fava bean, broccoli, Brussels sprouts,
cabbage, cauliflower, kale, kohlrabi, celery, lettuce, carrot, onion,
pepper, potato, eggplant and tomato.
[0310] In construction of recombinant expression cassettes of the
invention, which include, for example, helper plasmids comprising
virulence functions, and plasmids or viruses comprising exogenous DNA
sequences such as structural genes, a plant promoter fragment is
optionally employed which directs expression of a nucleic acid in any or
all tissues of a regenerated plant. Examples of constitutive promoters
include the cauliflower mosaic virus (CaMV) 35S transcription initiation
region, the 1'- or 2'-promoter derived from T-DNA of Agrobacterium
tumefaciens, and other transcription initiation regions from various
plant genes known to those of skill. Alternatively, the plant promoter
may direct expression of the polynucleotide of the invention in a
specific tissue (tissue-specific promoters) or may be otherwise under
more precise environmental control (inducible promoters). Examples of
tissue-specific promoters under developmental control include promoters
that initiate transcription only in certain tissues, such as fruit, seeds
or flowers.
[0311] Any of a number of promoters which direct transcription in plant
cells can be suitable. The promoter can be either constitutive or
inducible. In addition to the promoters noted above, promoters of
bacterial origin that operate in plants include the octopine synthase
promoter, the nopaline synthase promoter and other promoters derived from
native Ti plasmids. See, Herrera-Estrella et al. (1983), Nature, 303:209.
Viral promoters include the 35S and 19S RNA promoters of cauliflower
mosaic virus. See, Odell et al. (1985) Nature, 313:810. Other plant
promoters include Kunitz trypsin inhibitor promoter (KTI), SCP1, SUP,
UCD3, the ribulose-1,5-bisphosphate carboxylase small subunit promoter
and the phaseolin promoter. The promoter sequence from the E8 gene and
other genes may also be used. The isolation and sequence of the E8
promoter is described in detail in Deikman and Fischer (1988) EMBO J.
7:3315. Many other promoters are in current use and can be coupled to an
exogenous DNA sequence to direct expression of the nucleic acid.
[0312] If expression of a polypeptide from a cDNA is desired, a
polyadenylation region at the 3'-end of the coding region is typically
included. The polyadenylation region can be derived from the natural
gene, from a variety of other plant genes, or from, e.g., T-DNA.
[0313] The vector comprising the sequences (e.g., promoters or coding
regions) from genes encoding expression products and transgenes of the
invention will typically include a nucleic acid subsequence, a marker
gene which confers a selectable, or alternatively, a screenable,
phenotype on plant cells. For example, the marker can encode biocide
tolerance, particularly antibiotic tolerance, such as tolerance to
kanamycin, G418, bleomycin, hygromycin, or herbicide tolerance, such as
tolerance to chlorsulfuron, or phosphinothricin (the active ingredient
in the herbicides bialaphos or Basta). See, e.g., Padgette et al. (1996)
In: Herbicide-Resistant Crops (Duke, ed.), pp 53-84, CRC Lewis
Publishers, Boca Raton ("Padgette, 1996"). For example, crop selectivity
to specific herbicides can be conferred by engineering genes into crops
that encode appropriate herbicide metabolizing enzymes from other
organisms, such as microbes. See, Vasil (1996) In: Herbicide-Resistant
Crops (Duke, ed.), pp 85-91, CRC Lewis Publishers, Boca Raton ("Vasil,
1996").
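The elements of an expression cassette discussed above (a promoter, a coding region, a polyadenylation region, and a selectable or screenable marker on the vector) can be summarized as a simple record. The following Python sketch is only an organizational aid: the class name, field names and the example values are hypothetical, and the coding region is a placeholder rather than a sequence from this disclosure.

```python
from dataclasses import dataclass

# Promoter categories named in the text above.
PROMOTER_TYPES = {"constitutive", "tissue-specific", "inducible"}


@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str
    promoter_type: str
    coding_region: str
    polyadenylation_region: str
    selectable_marker: str

    def __post_init__(self):
        # Reject promoter categories other than those listed above.
        if self.promoter_type not in PROMOTER_TYPES:
            raise ValueError(f"unknown promoter type: {self.promoter_type}")

    def layout(self) -> str:
        """Return the 5'-to-3' order of the expressed elements as text."""
        return " | ".join(
            [self.promoter, self.coding_region, self.polyadenylation_region]
        )


cassette = ExpressionCassette(
    promoter="CaMV 35S",
    promoter_type="constitutive",
    coding_region="tolerance ORF (hypothetical placeholder)",
    polyadenylation_region="T-DNA-derived polyadenylation region",
    selectable_marker="kanamycin tolerance",
)
print(cassette.layout())
```

Recording the marker gene alongside the cassette makes explicit which selection agent applies when transformants are later screened.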
[0314] One of skill will recognize that after the recombinant expression
cassette is stably incorporated in transgenic plants and confirmed to be
operable, it can be introduced into other plants by sexual crossing. Any
of a number of standard breeding techniques can be used, depending upon
the species to be crossed. In vegetatively propagated crops, mature
transgenic plants can be propagated by the taking of cuttings or by
tissue culture techniques to produce multiple identical plants. Selection
of desirable transgenics is made and new varieties are obtained and
propagated vegetatively for commercial use. In seed propagated crops,
mature transgenic plants can be self crossed to produce a homozygous
inbred plant. The inbred plant produces seed containing the newly
introduced heterologous nucleic acid. These seeds can be grown to produce
plants that would produce the selected phenotype. Parts obtained from the
regenerated plant, such as flowers, seeds, leaves, branches, fruit, and
the like are included in the invention, provided that these parts
comprise cells comprising the isolated nucleic acid of the present
invention. Progeny, variants, and mutants of the regenerated plants
are also included within the scope of the invention, provided that they
comprise the introduced nucleic acid sequences.
[0315] Transgenic or introgressed plants expressing a polynucleotide of
the present invention can be screened for transmission of the nucleic
acid of the present invention by, for example, standard nucleic acid
detection methods or by immunoblot protocols. Expression at the RNA level
can be determined to identify and quantitate expression-positive plants.
Standard techniques for RNA analysis can be employed and include RT-PCR
amplification assays using oligonucleotide primers designed to amplify
only heterologous or introgressed RNA templates and solution
hybridization assays using marker or linked QTL specific probes. Plants
can also be analyzed for protein expression, e.g., by Western immunoblot
analysis using antibodies that recognize the encoded polypeptides. In
addition, in situ hybridization and immunocytochemistry according to
standard protocols can be done using heterologous nucleic acid specific
polynucleotide probes and antibodies, respectively, to localize sites of
expression within transgenic tissue. Generally, a number of transgenic
lines are usually screened for the incorporated nucleic acid to identify
and select plants with the most appropriate expression profiles.
[0316] A preferred embodiment of the invention is a transgenic plant that
is homozygous for the added heterologous nucleic acid; e.g., a transgenic
plant that contains two added nucleic acid sequence copies, e.g., a gene
at the same locus on each chromosome of a homologous chromosome pair. A
homozygous transgenic plant can be obtained by sexually mating
(self-fertilizing) a heterozygous transgenic plant that contains a single
added heterologous nucleic acid, germinating some of the seed produced
and analyzing the resulting plants produced for altered expression of a
polynucleotide of the present invention relative to a control plant
(e.g., a native, non-transgenic plant). Back-crossing to a parental plant
and out-crossing with a non-transgenic plant can be used to introgress
the heterologous nucleic acid into a selected background (e.g., an elite
or exotic soybean line).
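The selfing step described above can be illustrated with a short sketch. It assumes a single transgene locus, ordinary Mendelian segregation, and no selection among gametes or seed; the genotype labels ("T" for the added nucleic acid, "-" for its absence) are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def selfing_ratios(parent=("T", "-")):
    """Expected offspring genotype fractions when a plant is self-fertilized.

    `parent` lists the two alleles at one locus. Each gamete carries one
    allele with equal probability (an assumption of this sketch).
    """
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent, repeat=2)
    )
    total = sum(offspring.values())
    return {genotype: Fraction(n, total) for genotype, n in offspring.items()}

ratios = selfing_ratios()
for genotype, fraction in sorted(ratios.items()):
    print(genotype, fraction)
```

Under these assumptions about one quarter of the seed is expected to be homozygous for the added nucleic acid, which is why only some of the germinated seed needs to be analyzed to recover a homozygous plant.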
Methods for Identifying Soybean Plants Tolerant to Iron Deficient Growth
Conditions
[0317] Experienced plant breeders can recognize tolerant soybean plants in
the field, and can select the tolerant individuals or populations for
breeding purposes or for propagation. In this context, the plant breeder
recognizes "tolerant" and "non-tolerant," or "susceptible" soybean plants
in fortuitous naturally-occurring field observations.
[0318] However, plant breeding practitioners will appreciate that plant
tolerance is a phenotypic spectrum consisting of extremes in tolerance,
susceptibility and a continuum of intermediate phenotypes. Tolerance also
varies due to environmental effects. Evaluation of phenotypes using
reproducible assays and tolerance scoring methods is of value to
scientists who seek to identify genetic loci that impart tolerance,
conduct marker assisted selection to create tolerant soybean populations,
and for introgression techniques to breed a tolerance trait into an elite
soybean line, for example.
[0319] In contrast to fortuitous field observations that classify plants
as either "tolerant" or "susceptible," various methods are known in the
art for determining (and quantitating) the tolerance of a soybean plant
to iron-deficient growth conditions. These techniques can be applied to
different fields at different times, or to experimental greenhouse or
laboratory settings, and provide approximate tolerance scores that can be
used to characterize a given strain regardless of growth conditions or
location. See, for example, Diers et al. (1992) "Possible identification
of quantitative trait loci affecting iron efficiency in soybean," J.
Plant Nutr. 15:2127-2136; Dahiya and M. Singh (1979) "Effect of salinity,
alkalinity and iron sources on availability of iron," Plant and Soil
51:13-18; and Gonzalez-Vallejo et al. (2000) "Iron Deficiency Decreases
the Fe(III)-Chelate Reducing Activity of Leaf Protoplasts" Plant Physiol.
122 (2): 337-344.
[0320] The degree of IDC in a particular plant or stand of plants can be
quantitated by using a system to score the severity of the disease in
each plant. A plant strain or a number of plant strains are planted and
grown in a single stand in soil that is known to produce chlorotic plants
as a result of iron deficiency ("field screens", i.e., in fields that
have previously demonstrated IDC), or alternatively, in controlled
nursery conditions. When the assay is conducted in controlled nursery
conditions, defined soil can be used, where the concentration of iron
(e.g., available iron) has been previously measured. The plants can be
scored at maturity, or at any time before maturity. The scoring system
rates each plant on a scale of one (most susceptible; most severe
disease) to nine (most tolerant; no disease), as follows:
TABLE 1

Plant or Plant
Stand Score      Symptoms
--------------   ------------------------------------------------------------
1                Most plants are completely dead. The plants that are still
                 alive are approximately 10% of normal height, and have
                 very little living tissue.
2                Most leaves are almost dead, most stems are still green.
                 Plants are severely stunted (10-20% of normal height).
3                Most plants are yellow and necrosis is seen on most leaves.
                 Most plants are approximately 20-40% of normal height.
4                Most plants are yellow, and necrosis is seen on the edges of
                 less than half the leaves. Most plants are approximately 50%
                 of normal height.
5                Most plants are light green to yellow, and no necrosis is
                 seen on the leaves. Most plants are stunted (50-75% of
                 normal height).
6                More than half the plants show moderate chlorosis, but no
                 necrosis is seen on the leaves.
7                Less than half of the plants showing moderate chlorosis
                 (light green leaves).
8                A few plants are showing very light chlorosis on one or two
                 leaves.
9                All plants are normal green color.
[0321] It will be appreciated that any such scale is relative, and
furthermore, there may be variability between practitioners as to how the
individual plants and the entire stand as a whole are scored. Optionally,
the degree of chlorosis can be measured using a chlorophyll meter, e.g.,
a Minolta SPAD-502 Chlorophyll Meter, where readings off a single plant
or a stand of plants can be made. Optionally, multiple readings can be
obtained and averaged.
[0322] The IDC scoring of soybean stands can occur at any time. For
example, plots can be scored in the early season, typically mid July
(depending on geographic latitude), so that the results can be used in
making crossing decisions. Alternatively, soybean plots can be scored in
the late season, which generally yields more precise data.
[0323] In general, while there is a certain amount of subjectivity to
assigning severity measurements for disease symptoms, assignment to a
given scale as noted above is well within the skill of a practitioner in
the field. Measurements can also be averaged across multiple scorers to
reduce variation in field measurements.
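As a minimal sketch of the averaging described above, the following Python function combines the scores that several scorers assign to one plot on the one-to-nine scale. The function name and the example scores are hypothetical and are not taken from the experiments described herein.

```python
from statistics import mean

def consensus_score(scores):
    """Average the 1-9 IDC scores given to one plot by several scorers."""
    if not scores:
        raise ValueError("at least one score is required")
    for score in scores:
        if not 1 <= score <= 9:
            raise ValueError(f"score {score} is outside the 1-9 scale")
    return mean(scores)

# Hypothetical scores for one plot from three scorers.
plot_scores = [4, 5, 6]
print(consensus_score(plot_scores))
```

The same function could be applied to repeated chlorophyll meter readings if the range check were adapted to the units of the meter.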
[0324] Although protocols using field nurseries known to produce chlorotic
plants can be used in assessing tolerance, it is typical for tolerance
ratings to be based on actual field observations of fortuitous natural
disease incidence, with the information corresponding to disease
incidence for a cultivar being averaged over many locations and,
typically, several seasons of crop plantings. Optionally, field stands or
nursery/greenhouse plantings can be co-cultivated with IDC susceptibility
"reference checks." A reference check is a planting of soybean strains
with known susceptibilities to IDC, for example, highly tolerant strains
and highly susceptible strains. This parallel planting can aid the
breeder in scoring disease severity by allowing the breeder to compare
the plant pathology in the experimental stands with the plant pathology
in the reference stands.
[0325] When plants are studied in a fortuitous natural field setting, if
there is no chlorosis present, the rating system above cannot be used,
because the existence of iron-deficient soil cannot be ascertained.
However, if some number of plants demonstrate IDC symptoms, the growth
conditions in that field can be assumed to be iron-deficient, and the
entire stand can be scored as described above. These scores can
accumulate over multiple locations and years to show disease tolerance
for given cultivars. Thus, older lines can have more years of observation
than newer ones etc. However, relative tolerance measurements between
different strains in the same field at the same time can easily be made
using the scoring system noted above. Furthermore, the tolerance ratings
can be updated and refined each year based on the previous year's
observations in the field. The experiments described herein (see, Example
1) scored soybean tolerance to iron deficiency using the scale described
above at several nursery locations and over several years.
[0326] In assessing linkage of markers to tolerance, either quantitative
or qualitative approaches can be used. For example, an average rating for
each line that is a single number (for each line) from 1 to 9 can be
assessed for linkage. This approach is quantitative and uses the scores
from lines that have both marker data and IDC scores. In an alternative
approach, an "intergroup" comparison of tolerant versus susceptible lines
is used. In this approach, those soybean lines that are considered to be
representative of either the tolerant or susceptible classes are used for
assessing linkage. A list of tolerant lines is constructed, e.g., having
average rating of 6 to 9 on the above scale (when averaged over years and
locations). The susceptible lines are those with an average rating of 1
to 4 over years and locations. Only lines that can be reliably placed in
the 2 groups are used. Once a line is included in the group, it is
treated as an equal in that group--i.e. the actual quantitative ratings
are not used.
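The intergroup assignment described above can be sketched as follows. The thresholds (average rating of 6 to 9 for tolerant, 1 to 4 for susceptible) come from the text; the function name, the example ratings, and the choice to return None for lines that fall between the two groups are assumptions made for illustration.

```python
from statistics import mean

def classify_line(ratings, tolerant_min=6.0, susceptible_max=4.0):
    """Place a soybean line in a group from its ratings over years and locations.

    Returns "tolerant", "susceptible", or None when the average falls
    between the two groups and the line is left out of the comparison.
    """
    average = mean(ratings)
    if average >= tolerant_min:
        return "tolerant"
    if average <= susceptible_max:
        return "susceptible"
    return None

# Hypothetical ratings; each list holds one line's scores across locations.
lines = {"line_A": [7, 6, 8], "line_B": [2, 3, 4], "line_C": [5, 5]}
groups = {name: classify_line(r) for name, r in lines.items()}
print(groups)
```

Once a line has been placed in a group, only the group label is carried forward, which mirrors the statement that the actual quantitative ratings are not used.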
Automated Detection/Correlation Systems of the Invention
[0327] In some embodiments, the present invention includes an automated
system for detecting markers of the invention and/or correlating the
markers with a desired phenotype (e.g., tolerance). Thus, a typical
system can include a set of marker probes or primers configured to detect
at least one favorable allele of one or more marker locus associated with
tolerance or improved tolerance to iron deficiency. These probes
or primers are configured to detect the marker alleles noted in the
tables and examples herein, e.g., using any available allele detection
format, e.g., solid or liquid phase array based detection,
microfluidic-based sample detection, etc.
[0328] For example, in one embodiment, the marker locus is S60210-TB,
SAC1006, SATT391, SAC1724, SATT307, P13073A-1, P10598A-1, SATT334,
SATT510, SATT335, P5219A-1, P7659A-2, SAT.sub.--117, SATT191, S60143-TB,
SATT451, SATT367, SATT495, P10649C-3, SATT613, SATT257, SATT581 and/or
SATT153 as well as any of the chromosome intervals:
[0329] (a) S60210-TB and SATT391 (LG-C1);
[0330] (b) P10598A-1 and SATT334 (LG-F);
[0331] (c) SATT510 and SATT335 (LG-F);
[0332] (d) P5219A-1 and P7659A-2 (LG-G);
[0333] (e) SAT.sub.--117 and S60143-TB (LG-G);
[0334] (f) SATT451 and SATT367 (LG-I);
[0335] (g) SATT495 and P10649C-3 (LG-L); and
[0336] (h) SATT250 and SATT346 (LG-M),
and the probe set is configured to detect the locus.
[0337] The typical system includes a detector that is configured to detect
one or more signal outputs from the set of marker probes or primers, or
amplicon thereof, thereby identifying the presence or absence of the
allele. A wide variety of signal detection apparatus are available,
including photo multiplier tubes, spectrophotometers, CCD arrays, arrays
and array scanners, scanning detectors, phototubes and photodiodes,
microscope stations, galvo-scans, microfluidic nucleic acid amplification
detection appliances and the like. The precise configuration of the
detector will depend, in part, on the type of label used to detect the
marker allele, as well as the instrumentation that is most conveniently
obtained for the user. Detectors that detect fluorescence,
phosphorescence, radioactivity, pH, charge, absorbance, luminescence,
temperature, magnetism or the like can be used. Typical detector
embodiments include light (e.g., fluorescence) detectors or radioactivity
detectors. For example, detection of a light emission (e.g., a
fluorescence emission) or other probe label is indicative of the presence
or absence of a marker allele. Fluorescent detection is especially
preferred and is generally used for detection of amplified nucleic acids
(however, upstream and/or downstream operations can also be performed on
amplicons, which can involve other detection methods). In general, the
detector detects one or more label (e.g., light) emission from a probe
label, which is indicative of the presence or absence of a marker allele.
[0338] The detector(s) optionally monitors one or a plurality of signals
from an amplification reaction. For example, the detector can monitor
optical signals which correspond to "real time" amplification assay
results.
[0339] System instructions that correlate the presence or absence of the
favorable allele with the predicted tolerance are also a feature of the
invention. For example, the instructions can include at least one look-up
table that includes a correlation between the presence or absence of the
favorable alleles and the predicted tolerance or improved tolerance. The
precise form of the instructions can vary depending on the components of
the system, e.g., they can be present as system software in one or more
integrated unit of the system (e.g., a microprocessor, computer or
computer readable medium), or can be present in one or more units (e.g.,
computers or computer readable media) operably coupled to the detector.
As noted, in one typical embodiment, the system instructions include at
least one look-up table that includes a correlation between the presence
or absence of the favorable alleles and predicted tolerance or improved
tolerance. The instructions also typically include instructions providing
a user interface with the system, e.g., to permit a user to view results
of a sample analysis and to input parameters into the system.
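A look-up table of the kind described above could, for example, be held as a simple mapping from a marker allele to a predicted phenotype. The sketch below is illustrative only: the allele codes ("A1", "A2", "B1") and the predictions attached to them are hypothetical placeholders, not the favorable alleles reported in the tables and figures herein.

```python
# Hypothetical look-up table: (marker locus, allele code) -> predicted phenotype.
LOOKUP = {
    ("SATT334", "A1"): "tolerant",
    ("SATT334", "A2"): "susceptible",
    ("SATT510", "B1"): "tolerant",
}

def predict(genotype):
    """Map each detected marker allele to a prediction.

    `genotype` is a dict of marker locus -> detected allele code.
    Alleles that are absent from the table are reported as "no call".
    """
    return {
        marker: LOOKUP.get((marker, allele), "no call")
        for marker, allele in genotype.items()
    }

sample = {"SATT334": "A1", "SATT510": "B9"}
print(predict(sample))
```

Keeping the table as data rather than as program logic allows the correlations to be updated as more lines are scored without changing the instructions that consult it.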
[0340] The system typically includes components for storing or
transmitting computer readable data representing or designating the
alleles detected by the methods of the present invention, e.g., in an
automated system. The computer readable media can include cache, main,
and storage memory and/or other electronic data storage components (hard
drives, floppy drives, storage drives, etc.) for storage of computer
code. Data representing alleles detected by the method of the present
invention can also be electronically, optically, or magnetically
transmitted in a computer data signal embodied in a transmission medium
over a network such as an intranet or internet or combinations thereof.
The system can also or alternatively transmit data via wireless, IR, or
other available transmission alternatives.
[0341] During operation, the system typically comprises a sample that is
to be analyzed, such as a plant tissue, or material isolated from the
tissue such as genomic DNA, amplified genomic DNA, cDNA, amplified cDNA,
RNA, amplified RNA, or the like.
[0342] The phrase "allele detection/correlation system" in the context of
this invention refers to a system in which data entering a computer
corresponds to physical objects or processes external to the computer,
e.g., a marker allele, and a process that, within a computer, causes a
physical transformation of the input signals to different output signals.
In other words, the input data, e.g., amplification of a particular
marker allele is transformed to output data, e.g., the identification of
the allelic form of a chromosome segment. The process within the computer
is a set of instructions, or "program," by which positive amplification
or hybridization signals are recognized by the integrated system and
attributed to individual samples as a genotype. Additional programs
correlate the identity of individual samples with phenotypic values or
marker alleles, e.g., using statistical methods. In addition, numerous
software tools are available, e.g., C/C++ programs for computing, Delphi
and/or Java programs for GUI interfaces, and productivity tools (e.g.,
Microsoft Excel and/or SigmaPlot) for charting or creating look-up tables
of relevant allele-trait correlations. Other useful software tools in the
context of
the integrated systems of the invention include statistical packages such
as SAS, Genstat, Matlab, Mathematica, and S-Plus and genetic modeling
packages such as QU-GENE. Furthermore, additional programming languages
such as Visual Basic are also suitably employed in the integrated systems
of the invention.
[0343] For example, tolerance marker allele values assigned to a
population of progeny descending from crosses between elite lines are
recorded in a computer readable medium, thereby establishing a database
that matches tolerance alleles with unique identifiers for members of
the population of progeny. Any file or folder, whether custom-made or
commercially available (e.g., from Oracle or Sybase) suitable for
recording data in a computer readable medium is acceptable as a database
in the context of the present invention. Data regarding genotype for one
or more molecular markers, e.g., ASH, SSR, RFLP, RAPD, AFLP, SNP, isozyme
markers or other markers as described herein, are similarly recorded in a
computer accessible database. Optionally, marker data is obtained using
an integrated system that automates one or more aspects of the assay (or
assays) used to determine marker(s) genotype. In such a system, input
data corresponding to genotypes for molecular markers are relayed from a
detector, e.g., an array, a scanner, a CCD, or other detection device
directly to files in a computer readable medium accessible to the central
processing unit. A set of system instructions (typically embodied in one
or more programs) encoding the correlations between tolerance and the
alleles of the invention is then executed by the computational device to
identify correlations between marker alleles and predicted trait
phenotypes.
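One possible way to record marker genotypes against unique progeny identifiers is sketched below using the SQLite module from the Python standard library. This is an assumption chosen for brevity; the text does not prescribe a particular database, and the table layout, progeny identifiers, and allele codes are hypothetical.

```python
import sqlite3

# In-memory database used for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE genotype (
           progeny_id TEXT NOT NULL,
           marker     TEXT NOT NULL,
           allele     TEXT NOT NULL,
           PRIMARY KEY (progeny_id, marker)
       )"""
)

# Hypothetical genotype calls relayed from a detector.
calls = [
    ("P001", "SATT334", "A1"),
    ("P002", "SATT334", "A2"),
    ("P003", "SATT334", "A1"),
    ("P001", "SATT510", "B1"),
]
conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)", calls)
conn.commit()

def progeny_with_allele(connection, marker, allele):
    """Return the identifiers of progeny carrying `allele` at `marker`."""
    cursor = connection.execute(
        "SELECT progeny_id FROM genotype "
        "WHERE marker = ? AND allele = ? ORDER BY progeny_id",
        (marker, allele),
    )
    return [row[0] for row in cursor]

print(progeny_with_allele(conn, "SATT334", "A1"))
```

The composite primary key allows one allele call per progeny per marker, which suits calls made on bulked tissue; a design that stores both alleles of a heterozygote would need an additional column.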
[0344] Typically, the system also includes a user input device, such as a
keyboard, a mouse, a touchscreen, or the like, for, e.g., selecting
files, retrieving data, reviewing tables of marker information, etc., and
an output device (e.g., a monitor, a printer, etc.) for viewing or
recovering the product of the statistical analysis.
[0345] Thus, in one aspect, the invention provides an integrated system
comprising a computer or computer readable medium comprising a set of files
and/or a database with at least one data set that corresponds to the
marker alleles herein. The system also includes a user interface allowing
a user to selectively view one or more of these databases. In addition,
standard text manipulation software such as word processing software
(e.g., Microsoft Word.TM. or Corel WordPerfect.TM.) and database or
spreadsheet software (e.g., spreadsheet software such as Microsoft
Excel.TM., Corel Quattro Pro.TM., or database programs such as Microsoft
Access.TM. or Paradox.TM.) can be used in conjunction with a user
interface (e.g., a GUI in a standard operating system such as a Windows,
Macintosh, Unix or Linux system) to manipulate strings of characters
corresponding to the alleles or other features of the database.
[0346] The systems optionally include components for sample manipulation,
e.g., incorporating robotic devices. For example, a robotic liquid
control armature for transferring solutions (e.g., plant cell extracts)
from a source to a destination, e.g., from a microtiter plate to an array
substrate, is optionally operably linked to the digital computer (or to
an additional computer in the integrated system). An input device for
entering data to the digital computer to control high throughput liquid
transfer by the robotic liquid control armature and, optionally, to
control transfer by the armature to the solid support is commonly a
feature of the integrated system. Many such automated robotic fluid
handling systems are commercially available. For example, a variety of
automated systems are available from Caliper Technologies (Hopkinton,
Mass.), which utilize various Zymate systems, which typically include,
e.g., robotics and fluid handling modules. Similarly, the common
ORCA.RTM. robot, which is used in a variety of laboratory systems, e.g.,
for microtiter tray manipulation, is also commercially available, e.g.,
from Beckman Coulter, Inc. (Fullerton, Calif.). As an alternative to
conventional robotics, microfluidic systems for performing fluid handling
and detection are now widely available, e.g., from Caliper Technologies
Corp. (Hopkinton, Mass.) and Agilent Technologies (Palo Alto, Calif.).
[0347] Systems for molecular marker analysis of the present invention can,
thus, include a digital computer with one or more of high-throughput
liquid control software, image analysis software for analyzing data from
marker labels, data interpretation software, a robotic liquid control
armature for transferring solutions from a source to a destination
operably linked to the digital computer, an input device (e.g., a
computer keyboard) for entering data to the digital computer to control
high throughput liquid transfer by the robotic liquid control armature
and, optionally, an image scanner for digitizing label signals from
labeled probes hybridized, e.g., to markers on a solid support operably
linked to the digital computer. The image scanner interfaces with the
image analysis software to provide a measurement of, e.g., nucleic acid
probe label intensity upon hybridization to an arrayed sample nucleic
acid population (e.g., comprising one or more markers), where the probe
label intensity measurement is interpreted by the data interpretation
software to show whether, and to what degree, the labeled probe
hybridizes to a marker nucleic acid (e.g., an amplified marker allele).
The data so derived is then correlated with sample identity, to determine
the identity of a plant with a particular genotype(s) for particular
markers or alleles, e.g., to facilitate marker assisted selection of
soybean plants with favorable allelic forms of chromosome segments
involved in agronomic performance (e.g., tolerance or improved
tolerance).
[0348] Optical images, e.g., hybridization patterns viewed (and,
optionally, recorded) by a camera or other recording device (e.g., a
photodiode and data storage device) are optionally further processed in
any of the embodiments herein, e.g., by digitizing the image and/or
storing and analyzing the image on a computer. A variety of commercially
available peripheral equipment and software is available for digitizing,
storing and analyzing a digitized video or digitized optical image, e.g.,
using PC (Intel x86 or Pentium chip-compatible DOS.TM., OS2.TM.,
WINDOWS.TM., WINDOWS NT.TM. or WINDOWS95.TM. based machines),
MACINTOSH.TM., LINUX, or UNIX based (e.g., SUN.TM. work station)
computers.
EXAMPLES
[0349] The following examples are offered to illustrate, but not to limit,
the claimed invention. It is understood that the examples and embodiments
described herein are for illustrative purposes only, and persons skilled
in the art will recognize various reagents or parameters that can be
altered without departing from the spirit of the invention or the scope
of the appended claims.
Example 1
Intergroup Allele Frequency Distribution Analysis
[0350] Two independent allele frequency distribution analyses were
undertaken to identify soybean genetic marker loci associated with
tolerance to low-iron growth conditions. By identifying such genetic markers,
marker assisted selection (MAS) can be used to improve the efficiency of
breeding for improved tolerance of soybean to iron-deficient growth
conditions.
Soybean Lines and Tolerance Scoring
[0351] The plant varieties used in the analysis were from diverse sources,
including elite germplasm, commercially released cultivars and other
public lines representing a broad range of germplasm. The lines used in
the study had a broad maturity range varying from group 0 to group 6.
[0352] Two groups of soybean lines were assembled for each analysis based
on their phenotypic extremes in tolerance to iron-deficient growth
conditions, where the plants were sorted into either highly susceptible
or highly tolerant varieties. The classifications of tolerant and
susceptible were based solely on observations of fortuitous, naturally
occurring disease incidence in greenhouse and field tests over several
years. The degree of plant tolerance to iron-poor
growth conditions varied widely, as measured using a scale from one
(highly susceptible) to nine (highly tolerant). Generally, a score of two
(2) indicated the most susceptible strains, and a score of seven (7) was
assigned to the most tolerant lines. A score of one (1) was generally not
used, as soybean strains with such extremely high susceptibility were not
typically propagated. Tolerance scores of eight (8) and nine (9) were
reserved for tolerance levels that are very rare and generally not
observed in existing germplasm. If no disease was present in a field, no
tolerance scoring was done. However, if a disease did occur in a specific
field location, all of the lines in that location were scored. Scores for
test strains accumulated over multiple locations and multiple years, and
an averaged (e.g., consensus) score was ultimately assigned to each line.
[0353] Individual fields showing naturally-occurring FEC were monitored
for disease symptoms. Data collection was made in successive scorings on
multiple days. Scorings continued until worsening symptoms could no
longer be quantified or until the symptoms were confounded by other
factors such as other diseases, insect pressure, severe weather, or
advancing maturity.
[0354] In assessing linkage of markers to tolerance, a qualitative
"intergroup allele frequency distribution" comparison approach was used.
Using this approach, those soybean lines that were considered to be
representative of either the tolerant or susceptible classes were used
for assessing linkage. A list of tolerant lines was constructed, where
strains having a tolerance score of 6 or greater were considered
"tolerant." Similarly, soybean lines with scores of four or less were
collectively considered susceptible. Only lines that could be reliably
placed into the two groups were used. Once a line was included in the
"tolerant" or "susceptible" group, it was treated as an equal in that
group, i.e., the actual quantitative ratings were not used.
[0355] In one of the analyses, 62 soybean lines were identified that were
considered tolerant in the phenotypic spectrum; these plants formed the
"TOLERANT" group. Also, 64 soybean lines were identified that were judged
to be susceptible to iron-poor growth conditions; these strains formed
the "SUSCEPTIBLE" group. In the second analysis, there were 32 tolerant
lines and 36 susceptible lines.
Soybean Genotyping
[0356] Each of the tolerant and susceptible lines was genotyped with SSR
and SNP markers that span the soybean genome using techniques well known
in the art. The genotyping protocol consisted of collecting young leaf
tissue from eight individuals from each tolerant and susceptible soybean
strain, pooling (i.e., bulking) the leaf tissue from the eight
individuals, and isolating genomic DNA from the pooled tissue. The
soybean genomic DNA was extracted by the CTAB method, as described in
Saghai-Maroof et al. (1984) Proc. Natl. Acad. Sci. (USA) 81:8014-8018.
[0357] The isolated genomic DNA was then used in PCR reactions using
amplification primers specific for a large number of markers that covered
all chromosomes in the soybean genome. The lengths of the PCR amplicon or
amplicons from each PCR reaction were characterized. The lengths of the
amplicons generated in the PCR reactions were compared to known allele
definitions for the various markers (see, e.g., FIG. 4), and allele
designations were assigned. SNP-type markers were genotyped using an ASH
protocol (see, FIG. 3).
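The comparison of amplicon lengths to known allele definitions can be sketched as a simple matching step. The allele names, the lengths in base pairs, and the matching tolerance below are hypothetical and do not reproduce the allele dictionary of FIG. 4.

```python
# Hypothetical allele dictionary: marker -> {allele name: amplicon length in bp}.
ALLELE_DICTIONARY = {
    "SATT334": {"allele_1": 189, "allele_2": 195},
}

def call_allele(marker, length_bp, tolerance_bp=1):
    """Return the allele whose defined length matches the observed amplicon.

    A match requires the observed length to be within `tolerance_bp` of
    the defined length (an assumed sizing tolerance). Returns None when
    no definition matches, so the sample can be flagged for review.
    """
    for name, expected in ALLELE_DICTIONARY.get(marker, {}).items():
        if abs(length_bp - expected) <= tolerance_bp:
            return name
    return None

print(call_allele("SATT334", 190))
print(call_allele("SATT334", 192))
```

Returning None instead of the nearest allele avoids assigning a designation to an amplicon that may represent an allele not yet present in the dictionary.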
Intergroup Allele Frequency Analysis
[0358] An "Intergroup Allele Frequency Distribution" analysis was
conducted using GeneFlow.TM. version 7.0 software. An intergroup allele
frequency distribution analysis provides a method for finding non-random
distributions of alleles between two phenotypic groups.
[0359] During processing, a contingency table of allele frequencies is
constructed and from this a G-statistic and probability are calculated
(the G statistic is adjusted by using Williams' correction factor).
The probability value is adjusted to take into account the fact that
multiple tests are being done (thus, there is some expected rate of false
positives). The adjusted probability is proportional to the probability
that the observed allele distribution differences between the two classes
would occur by chance alone. The lower that probability value, the
greater the likelihood that the low iron tolerance phenotype and the
marker will co-segregate. A more complete discussion of the derivation of
the probability values can be found in the GeneFlow.TM. version 7.0
software documentation. See, also, Sokal and Rohlf (1981), Biometry: The
Principles and Practice of Statistics in Biological Research, 2nd ed.,
San Francisco, W. H. Freeman and Co.
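For illustration, a G-statistic with Williams' correction can be computed from a contingency table of allele counts as sketched below. This is a textbook form of the test written for this example; it is not the GeneFlow.TM. implementation, it does not include the adjustment for multiple tests, and the counts are hypothetical. The probability helper is valid only for one degree of freedom (two groups by two alleles).

```python
import math

def g_test_williams(table):
    """Return (Williams-adjusted G, degrees of freedom) for an R x C count table.

    Rows are phenotypic groups and columns are allele classes.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column needs at least one observation")
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df == 0:
        raise ValueError("table needs at least two rows and two columns")
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # an empty cell contributes nothing to G
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    # Williams' correction factor q; the adjusted statistic is G / q.
    q = 1.0 + (
        (n * sum(1.0 / r for r in row_totals) - 1.0)
        * (n * sum(1.0 / c for c in col_totals) - 1.0)
    ) / (6.0 * n * df)
    return max(g, 0.0) / q, df

def p_value_one_df(g):
    """Upper-tail chi-square probability; valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(g / 2.0))

# Hypothetical counts: rows = tolerant, susceptible; columns = two alleles.
counts = [[30, 10], [12, 28]]
g_adj, df = g_test_williams(counts)
print(round(g_adj, 2), df, p_value_one_df(g_adj))
```

For markers with more than two allele classes the degrees of freedom exceed one, and the probability would instead be taken from the chi-square distribution with the returned degrees of freedom.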
[0360] The underlying logic is that markers with significantly different
allele distributions between the tolerant and susceptible groups (i.e.,
non random distributions) might be associated with the trait and can be
used to separate them for purposes of marker assisted selection of
soybean lines with previously uncharacterized tolerance or susceptibility
to low iron growth conditions. The present analysis examined one marker
locus at a time and determined if the allele distribution within the
tolerant group is significantly different from the allele distribution
within the susceptible group. A statistically different allele
distribution is an indication that the marker is linked to a locus that
is associated with reaction to iron-poor conditions. In this analysis,
adjusted probabilities less than approximately 0.10 are considered highly
significant. Allele classes represented by less than 5 observations
across both groups were not included in the statistical analysis. In this
analysis, 424 marker loci had enough observations for analysis.
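The exclusion of sparsely observed allele classes can be sketched as a filtering step applied before a contingency table is built for a marker. The threshold of 5 observations across both groups comes from the text; the function name and the counts are hypothetical.

```python
def filter_allele_classes(tolerant_counts, susceptible_counts, min_obs=5):
    """Drop allele classes with fewer than `min_obs` observations overall.

    Each argument maps allele name -> number of lines carrying it in that
    group. Returns (table, kept_alleles), where table has one row per
    group (tolerant first) and one column per retained allele.
    """
    alleles = set(tolerant_counts) | set(susceptible_counts)
    kept = sorted(
        allele
        for allele in alleles
        if tolerant_counts.get(allele, 0) + susceptible_counts.get(allele, 0)
        >= min_obs
    )
    table = [
        [tolerant_counts.get(allele, 0) for allele in kept],
        [susceptible_counts.get(allele, 0) for allele in kept],
    ]
    return table, kept

# Hypothetical allele counts at one marker locus.
tolerant = {"A": 20, "B": 3, "C": 1}
susceptible = {"A": 8, "B": 25, "C": 2}
table, kept = filter_allele_classes(tolerant, susceptible)
print(kept, table)
```

In this example allele "C" is seen only three times across both groups and is left out, while allele "B" is retained because its combined count is 28 even though it is rare in the tolerant group.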
[0361] This analysis compares the plants' phenotypic score with the
genotypes at the various loci. This type of intergroup analysis neither
generates nor requires any map data. Subsequently, map data (for example,
a composite soybean genetic map) is relevant in that multiple significant
markers that are also genetically linked can be considered as
corroborating evidence that a given chromosomal region is associated with
the trait of interest.
Results
[0362] FIG. 1 provides a table listing the soybean markers that
demonstrated linkage disequilibrium with the low iron
tolerance/susceptibility phenotype. Also indicated in that figure are the
chromosomes on which the markers are located and their approximate map
position relative to other known markers, given in cM, with position zero
being the first (most distal) marker known at the beginning of the
chromosome. These map positions are not absolute, and represent an
estimate of map position. The statistical probabilities that the marker
allele and tolerance phenotype are segregating independently are
reflected in the adjusted probability values.
[0363] FIG. 2 provides the PCR primer sequences that were used to genotype
the SSR marker loci. FIG. 2 also provides the pigtail sequence used on
the 5' end of the right SSR-marker primers and the number of nucleotides
in the repeating element in the SSR. The observed alleles that are known
to occur for these marker loci are provided in the allele dictionary in
FIG. 4. SNP-type markers were genotyped using an ASH protocol with
appropriate primers and allele-specific probes (see, FIG. 3).
Discussion
[0364] There are a number of ways to use the information provided in this
analysis for the development of improved soybean varieties. One
application is to use the associated markers (or more based on a higher
probability cutoff value) as candidates for mapping QTL in specific
populations that are segregating for plants having tolerance to iron poor
growth conditions. In this application, one proceeds with conventional
QTL mapping in a segregating population, but focusing on the markers that
are associated with low iron tolerance, instead of using markers that
span the entire genome. This makes mapping efforts more cost-effective by
dramatically reducing lab resources committed to the project. For
example, instead of screening segregating populations with a large set of
markers that spans the entire genome, one would screen with only those
few markers that met some statistical cutoff in the intergroup allele
association study. This will not only reduce the cost of mapping but will
also eliminate false leads that will undoubtedly occur with a large set
of markers. In any given cross, it is likely that only a small subset of
the associated markers will actually be correlated with tolerance to low
iron conditions. Once the few relevant markers are identified in any
tolerant parent, future marker assisted selection (MAS) efforts can focus
on only those markers that are important for that source of tolerance. By
pre-selecting lines that have the allele associated with tolerance via
MAS, one can eliminate the undesirable susceptible lines and concentrate
the expensive field testing resources on lines that have a higher
probability of being tolerant to iron-poor growth conditions.
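The pre-selection step described above can be illustrated with a short sketch. The marker names SATT334 and SATT510 appear in this disclosure, but the line names, the allele labels and the choice of which allele is favorable are hypothetical placeholders used only for illustration.

```python
def preselect_lines(genotypes, favorable, min_matches=1):
    """Return the names of lines carrying favorable alleles.

    genotypes: {line_name: {marker_name: allele_label}}
    favorable: {marker_name: allele_label associated with tolerance}
    A line is kept when it carries the favorable allele at no fewer
    than `min_matches` of the listed markers.
    """
    selected = []
    for line, calls in genotypes.items():
        matches = sum(1 for marker, allele in favorable.items()
                      if calls.get(marker) == allele)
        if matches >= min_matches:
            selected.append(line)
    return selected


# Hypothetical genotype calls for three candidate lines.
genotypes = {
    "line_1": {"SATT334": "allele_2", "SATT510": "allele_1"},
    "line_2": {"SATT334": "allele_1", "SATT510": "allele_3"},
    "line_3": {"SATT334": "allele_2", "SATT510": "allele_3"},
}
favorable = {"SATT334": "allele_2", "SATT510": "allele_1"}

field_test_candidates = preselect_lines(genotypes, favorable)
```

Only the lines returned by the filter would then be advanced to field testing under iron-poor growth conditions.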
Example 2
Association Mapping Analysis
[0365] An association mapping strategy was undertaken to identify soybean
genetic markers associated with tolerance to low iron growth conditions.
The study was completed twice, generating two independent data sets. By
identifying such genetic markers, marker assisted selection (MAS) can be
used to improve the efficiency of breeding for improved tolerance of
soybean to low iron growth conditions. Association mapping is known in
the art, and is described in various sources, e.g., Jorde (2000), Genome
Res., 10:1435-1444; Remington et al. (2001), "Structure of linkage
disequilibrium and phenotype associations in the maize genome," Proc Natl
Acad Sci USA 98:11479-11484; and Weiss and Clark (2002), Trends in
Genetics 18:19-24.
Association Mapping
[0366] Understanding the extent and patterns of linkage disequilibrium
(LD) in the genome is a prerequisite for developing efficient association
approaches to identify and map quantitative trait loci (QTL). Linkage
disequilibrium (LD) refers to the non-random association of alleles in a
collection of individuals. When LD is observed among alleles at linked
loci, it is measured as LD decay across a specific region of a
chromosome. The extent of the LD is a reflection of the recombinational
history of that region. The average rate of LD decay in a genome can help
predict the number and density of markers that are required to undertake
a genome-wide association study and provides an estimate of the
resolution that can be expected.
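As an illustration of how LD between a pair of loci can be quantified, the following sketch computes the commonly used r.sup.2 statistic from two-locus haplotype counts. It assumes biallelic loci and known phase (as in inbred lines); multi-allelic SSR loci would need a generalized statistic, and the function name and the counts are hypothetical.

```python
def ld_r_squared(hap_counts):
    """r^2 between two biallelic loci from haplotype counts.

    hap_counts: dict with keys "AB", "Ab", "aB", "ab" giving the number
    of individuals (or gametes) carrying each two-locus haplotype.
    """
    n = sum(hap_counts.values())
    p_ab = hap_counts["AB"] / n                         # frequency of haplotype A-B
    p_a = (hap_counts["AB"] + hap_counts["Ab"]) / n     # frequency of allele A
    p_b = (hap_counts["AB"] + hap_counts["aB"]) / n     # frequency of allele B
    d = p_ab - p_a * p_b                                # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


# Hypothetical counts: mostly parental haplotypes, so LD is high.
example = ld_r_squared({"AB": 45, "Ab": 5, "aB": 5, "ab": 45})
```

Values near one indicate that the two loci are almost always inherited together, and values near zero indicate independent assortment of the alleles.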
[0367] Association or LD mapping aims to identify significant
genotype-phenotype associations. It has been exploited as a powerful tool
for fine mapping in outcrossing species such as humans (Corder et al.
(1994) "Protective effect of apolipoprotein-E type-2 allele for
late-onset Alzheimer-disease," Nat Genet 7: 180-184; Hastbacka et al.,
(1992) "Linkage disequilibrium mapping in isolated founder populations:
diastrophic dysplasia in Finland," Nat Genet 2:204-211; Kerem et al.,
(1989) "Identification of the cystic fibrosis gene: genetic analysis,"
Science 245:1073-1080) and maize (Remington et al., (2001) "Structure of
linkage disequilibrium and phenotype associations in the maize genome,"
Proc Natl Acad Sci USA 98:11479-11484; Thornsberry et al. (2001) "Dwarf8
polymorphisms associate with variation in flowering time," Nat Genet
28:286-289; reviewed by Flint-Garcia et al. (2003) "Structure of linkage
disequilibrium in plants," Annu Rev Plant Biol., 54:357-374), where
recombination among heterozygotes is frequent and results in a rapid
decay of LD. In inbreeding species where recombination among homozygous
genotypes is not genetically detectable, the extent of LD is greater
(i.e., larger blocks of linked markers are inherited together) and this
dramatically lowers the resolution of association mapping (Wall and
Pritchard (2003) "Haplotype blocks and linkage disequilibrium in the
human genome," Nat Rev Genet 4:587-597).
[0368] The recombinational and mutational history of a population is a
function of the mating habit, as well as the effective size and age of a
population. Large population sizes offer enhanced possibilities for
detecting recombination, while older populations are generally associated
with higher levels of polymorphism, both of which contribute to
observably accelerated rates of LD decay. On the other hand, smaller
effective population sizes, i.e., those that have experienced a recent
genetic bottleneck, tend to show a slower rate of LD decay, resulting in
more extensive haplotype conservation (Flint-Garcia et al. (2003)
"Structure of linkage disequilibrium in plants," Annu Rev Plant Biol.,
54:357-374).
[0369] Elite breeding lines provide a valuable starting point for
association analyses. Association analyses use quantitative phenotypic
scores (e.g., disease tolerance rated from one to nine for each soybean
line) in the analysis (as opposed to looking only at tolerant versus
susceptible allele frequency distributions in intergroup allele
distribution types of analysis). The availability of detailed phenotypic
performance data collected by breeding programs over multiple years and
environments for a large number of elite lines provides a valuable
dataset for genetic marker association mapping analyses. This paves the
way for a seamless integration between research and application and takes
advantage of historically accumulated data sets. However, an
understanding of the relationship between polymorphism and recombination
is useful in developing appropriate strategies for efficiently extracting
maximum information from these resources.
[0370] This type of association analysis neither generates nor requires
any map data, but rather, is independent of map position. This analysis
compares the plants' phenotypic score with the genotypes at the various
loci. Subsequently, any suitable soybean map (for example, a composite
map) can optionally be used to help observe distribution of the
identified QTL markers and/or QTL marker clustering using previously
determined map locations of the markers.
Soybean Lines and Phenotypic Scoring
[0371] Soybean lines were phenotypically scored based on their degree of
tolerance to low iron growth conditions (in contrast to simple
categorization of "tolerant" or "susceptible"). The plant varieties used
in the analysis were from diverse sources, including elite germplasm,
other commercially released cultivars and proprietary experimental
varieties. The RIL collections comprised 205 Pioneer soybean R3+ lines,
or alternatively, 177 Pioneer R3+ lines. The lines used in the study had
a broad maturity range varying from group 0 to group 6.
[0372] The tolerance scoring was based solely on observations in
fortuitous, naturally occurring fields displaying disease incidence in
multienvironmental field tests over several years. The degree of plant
tolerance to low iron growth conditions varied widely, as measured using
a scale from one (1; highly susceptible) to nine (9; highly tolerant).
Generally, a score of two (2) indicated the most susceptible strains, and
a score of eight (8) was assigned to the most tolerant lines. A score of
one (1) was generally not used, as soybean strains with such extremely
high susceptibility were not typically propagated. A tolerance score of
nine (9) was reserved for tolerance levels that are very rare and
generally not observed in existing germplasm.
[0373] Experimental plants were scored for the low iron
tolerance/susceptibility phenotype according to a scoring scale as
described above. If no disease (chlorosis) was present in a field, no
tolerance scoring was done. However, if disease did occur in a specific
field location, all of the lines in that location were scored. Tolerance
scores for the reference strains accumulated over multiple locations and
years, and an averaged (e.g., consensus) score was ultimately assigned to
each line. Tolerance scores for the 205 variety collection or the 177
variety collection were collected over a single growing season.
[0374] Individual fields showing iron chlorosis were monitored for disease
symptoms. Data collection was typically done in multiple scorings on
multiple days. Scorings continued until worsening symptoms could no
longer be quantified or until the symptoms were confounded by other
factors such as other diseases, insect pressure, severe weather, or
advancing maturity.
[0375] In assessing the linkage of markers to tolerance, a quantitative
approach was used, where a tolerance score for each soybean line was
assessed and incorporated into the association mapping statistical
analysis.
Soybean Genotyping
[0376] The independent populations of either 205 or 177 soybean lines that
were scored for disease tolerance were then genotyped. The 205 member
population was genotyped using 287 SSR and ASH markers. The 177 member
population was genotyped using 374 SSR and ASH markers. These SSR and SNP
markers collectively spanned each chromosome in the plant genome. The
genotyping protocol consisted of collecting young leaf tissue from eight
individuals from each soybean strain, pooling (i.e., bulking) the leaf
tissue from the eight individuals, and isolating genomic DNA from the
pooled tissue. The soybean genomic DNA was extracted by the CTAB method,
as described in Maroof et al., (1984) Proc. Natl. Acad. Sci. (USA)
81:8014-8018.
[0377] The isolated genomic DNA was then used in PCR reactions using
amplification primers specific for a large number of markers that covered
all chromosomes in the soybean genome. The length of the PCR amplicon or
amplicons from each PCR reaction was characterized. SNP-type markers
were genotyped using an ASH protocol. The lengths of the amplicons
generated in the PCR reactions were compared to known allele definitions
for the various markers (see FIG. 4), and allele designations for each
tested marker were assigned.
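The assignment of allele designations from amplicon lengths can be sketched as a lookup against an allele dictionary. The sketch below is illustrative only: the allele labels, the fragment sizes and the matching tolerance are hypothetical and do not reproduce the contents of FIG. 4.

```python
def call_allele(amplicon_bp, allele_sizes, tolerance_bp=0.5):
    """Assign an allele label by matching an amplicon length to known sizes.

    allele_sizes: {allele_label: expected_length_in_bp}
    Returns the label of the closest known allele within `tolerance_bp`,
    or None when no known allele is close enough.
    """
    best_label, best_diff = None, None
    for label, size in allele_sizes.items():
        diff = abs(amplicon_bp - size)
        if diff <= tolerance_bp and (best_diff is None or diff < best_diff):
            best_label, best_diff = label, diff
    return best_label


# Hypothetical allele dictionary for a trinucleotide-repeat SSR marker.
allele_dictionary = {"allele_1": 180, "allele_2": 183, "allele_3": 186}

called = call_allele(183.2, allele_dictionary)
uncalled = call_allele(184.5, allele_dictionary)
```

An amplicon that falls between two defined sizes is left uncalled rather than forced into the nearest class, so that ambiguous fragments can be reviewed manually.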
Statistical Methods
[0378] Monomorphic loci are considered uninformative and thus are
eliminated from LD analyses. The monomorphic loci are defined as those
whose gene diversity ($1 - \sum_{i=1}^{n} p_i^2$, where p.sub.i is the
i.sup.th allele frequency in the population of study) is less than 0.10.
Since rare alleles (frequency<0.05) tend to cause
large variances for the estimates of r.sup.2, they were treated as
missing data and pooled together. Marker screening and partitioning are
conducted using PowerMarker software (version 2.72), which was developed
by Jack Liu and is available at http://152.14.14.48.
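The marker screening step can be sketched as follows, assuming that gene diversity is the standard expected heterozygosity (one minus the sum of squared allele frequencies). The sketch is not the PowerMarker implementation; the function names are hypothetical, and the thresholds are those stated above.

```python
from collections import Counter


def gene_diversity(alleles):
    """Expected heterozygosity, 1 - sum(p_i^2); missing calls (None) are ignored."""
    calls = [a for a in alleles if a is not None]
    n = len(calls)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(calls).values())


def mask_rare_alleles(alleles, min_freq=0.05):
    """Replace alleles whose frequency is below `min_freq` with None (missing)."""
    calls = [a for a in alleles if a is not None]
    n = len(calls)
    counts = Counter(calls)
    return [a if a is not None and counts[a] / n >= min_freq else None
            for a in alleles]


def is_informative(alleles, min_diversity=0.10):
    """True when the locus is polymorphic enough to keep for LD analysis."""
    return gene_diversity(alleles) >= min_diversity


# Hypothetical calls at one locus across 100 lines.
locus = ["A"] * 97 + ["B"] * 3
keep_locus = is_informative(locus)      # diversity is about 0.058, so False
masked = mask_rare_alleles(locus)       # the three "B" calls become None
```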
[0379] The rate of LD decay with genetic distance (cM) was calculated for
pairs of markers on the same chromosome and was evaluated using linear
regression in which the genetic distances were transformed by taking
log.sub.10, as described by McRae et al. (2002). Population structure was
evaluated using Pritchard's model-based method (Pritchard et al. 2000)
and the software, STRUCTURE (version 2.0; see the web at:
pritch.bsd.uchicago.edu/index.html). This version of the program controls
for linked markers and correlated allelic frequencies (Falush et al.
(2003) "Inference of population structure using multilocus genotype data:
linked loci and correlated allele frequencies," Genetics 164: 1567-1587).
It detects population structure in structured or admixed populations.
This method is more appropriate than conventionally used genetic
distance-based method, because Structure provides the likelihood
associated with different numbers of sub-populations and the estimated
percentage of shared ancestry with each sub-population for each entry.
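The LD decay regression described in the first part of this paragraph can be sketched as an ordinary least squares fit of r.sup.2 on log.sub.10 of the genetic distance. This is a minimal sketch under that assumption; the function name and the example marker-pair values are hypothetical.

```python
import math


def ld_decay_fit(pairs):
    """Least squares fit of r^2 on log10(genetic distance in cM).

    pairs: iterable of (distance_cM, r_squared) for marker pairs on the
    same chromosome; pairs with a distance of zero or less are skipped
    because their logarithm is undefined.
    Returns (intercept, slope).
    """
    points = [(math.log10(d), r2) for d, r2 in pairs if d > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two marker pairs with positive distance")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all marker pairs have the same distance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical marker pairs: LD falls as the distance between markers grows.
intercept, slope = ld_decay_fit([(1, 0.8), (10, 0.5), (100, 0.2)])
```

A more negative slope corresponds to faster LD decay and therefore to a need for denser marker coverage in an association study.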
[0380] Associations of individual SSR markers with tolerance to low iron
growth conditions were evaluated by logistic regression in TASSEL (Trait
Analysis by aSSociation, Evolution, and Linkage) using the Structured
Association analysis mode. TASSEL is provided by Edward Buckler, and
information about the program can be found on the Buckler Lab web page at
the Institute for Genomic Diversity at Cornell University. The
significance level for each association was tested using an empirical
distribution that was established by running 5,000 permutations.
Modifications of established procedures were made to accommodate the
nature and characteristics of soybean and the soybean data set,
especially with regard to those aspects that differ from rice.
Results
[0381] FIG. 1 provides a table listing the soybean markers that
demonstrated linkage disequilibrium with the low iron tolerance phenotype
using the Association Mapping method. Also indicated in that figure are
the chromosomes on which the markers are located and their approximate
map position relative to other known markers, given in cM, with position
zero being the first (most distal) marker known at the beginning of the
chromosome. These map positions are not absolute, and represent an
estimate of map position. The SNP-type markers were detected by an allele
specific hybridization (ASH) method, as known in the art (see, e.g.,
Coryell et al, (1999) "Allele specific hybridization markers for
soybean," Theor. Appl. Genet., 98:690-696). FIG. 2 provides the PCR
primer sequences that were used to genotype these seven marker loci. FIG.
2 also provides the pigtail sequence used on the 5' end of the right
SSR-marker primers and the number of nucleotides in the repeating element
in the SSR. The alleles that are known to occur for the marker loci are
provided in the SSR allele dictionary in FIG. 4. FIG. 3 provides the PCR
amplification primer sequences and the allele-specific probes that were
used to genotype the SNP-type marker loci.
[0382] The statistical probabilities that the marker allele and disease
tolerance phenotype are segregating independently are reflected in the
adjusted probability values in FIG. 1, which is a probability (P) derived
from 5000 rounds of permutation analysis between genotype and phenotype.
The permutations method for probability analysis is known in the art, and
described in various sources, for example, Churchill and Doerge (1994),
Genetics 138: 963-971; Doerge and Churchill (1996), Genetics 142:
285-294; Lynch and Walsh (1998) in Genetics and analysis of quantitative
traits, published by Sinauer Associates, Inc. Sunderland, Mass. 01375, p.
441-442.
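The permutation approach to deriving an empirical probability can be sketched as follows. This is an illustration of the general technique only and is not the TASSEL implementation: the test statistic (between-genotype sum of squares), the function names, the random seed and the example data are all assumptions made for the sketch.

```python
import random


def group_mean_spread(scores, genotypes):
    """Between-genotype sum of squares of the phenotype scores."""
    overall = sum(scores) / len(scores)
    groups = {}
    for score, genotype in zip(scores, genotypes):
        groups.setdefault(genotype, []).append(score)
    return sum(len(v) * (sum(v) / len(v) - overall) ** 2 for v in groups.values())


def permutation_p_value(scores, genotypes, n_perm=5000, seed=0):
    """Empirical probability that the observed association arises by chance.

    The phenotype scores are shuffled relative to the marker genotypes
    `n_perm` times; the returned value is the fraction of shuffles whose
    statistic is at least as large as the observed one (with the usual
    +1 correction so the estimate is never exactly zero).
    """
    rng = random.Random(seed)
    observed = group_mean_spread(scores, genotypes)
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if group_mean_spread(shuffled, genotypes) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: lines carrying allele "A" score high, allele "B" low.
scores = [7, 8] * 5 + [2, 3] * 5
genotypes = ["A"] * 10 + ["B"] * 10
p = permutation_p_value(scores, genotypes)
```

With 5000 permutations the smallest probability that can be reported is 1/5001, which sets the resolution of the adjusted probability values.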
[0383] The lower the probability value, the more significant is the
association between the marker genotype at that locus and the low iron
tolerance phenotype. A more complete discussion of the derivation of the
probability values can be found in the software documentation. See, also,
Sokal and Rohlf (1981), Biometry: The Principles and Practice of
Statistics in Biological Research, 2nd ed., San Francisco, W. H. Freeman
and Co.
Example 3
QTL Interval Mapping and Single Marker Regression Analysis
[0384] Two independent QTL interval mapping and marker regression analyses
were undertaken to identify soybean genetic markers and chromosome
intervals associated with tolerance that allow the plant to escape the
pathology associated with growth in iron-poor conditions. QTL mapping and
marker regression are widely used methods to identify genetic loci that
co-segregate with a desired phenotype. By identifying such genetic loci,
marker assisted selection (MAS) can be used to improve the efficiency of
breeding for improved soybean strains.
Study A
[0385] Materials and Methods
[0386] A mapping population for iron-deficiency tolerance was created
using the mapping population UP1C6-43/90B73. The population had 458
progeny that were used in the mapping.
[0387] Phenotypic scoring took place in replicated plots that were planted
in fields known to promote FEC pathology. Planting sites near Cottonwood,
Minn. and near Glyndon, Minn. were used. All plots were scored once at
late vegetative to early reproductive stage of growth. Scoring was on a
scale of one to nine, where one is susceptible and nine is tolerant.
Phenotypic scoring of each of the progeny lines was based on two years of
collection data, with ten reps of data for each scored line. The overall
means of ten reps were used for QTL interval mapping.
[0388] Soybean Genotyping--No genotype information was available for the
parent UP106-43. For the purpose of identifying polymorphic loci, the
first plate was run with all available SSR markers (approximately 550).
Based on the collected data, 210 markers that were potentially
polymorphic were selected for further study. These markers were screened
against the rest of four plates. Because of missing data for one of the
parents, manual editing was used in analyzing the SSR data.
[0389] Of the 210 SSR markers screened, 41 had no meaningful
data and were dropped from the analysis. Out of the remaining 169 SSR
markers, 143 were mappable. These 143 markers were mapped to 20 linkage
groups with between 2 and 14 markers per linkage group. Linkage groups
A2, B2, D1a, I and N only have two to four markers per linkage group.
Compared to the most comprehensive public map, this map covers about 50%
of the genome.
[0390] QTL Interval Mapping--QTL mapping analyses were performed on the
overall mean of two years of data of ten repetitions for each
observation. One thousand (1000) permutation tests were used to establish
the threshold for statistical significance (likelihood ratio
statistic--LRS). The LRS threshold for means at P=0.05 is 13.4. The LRS
provides a measure of the linkage between variation in the phenotype and
genetic differences at a particular genetic locus. LRS values can be
converted to LOD scores (logarithm of the odds ratio) by dividing by
4.61. Generally, the LRS values above 15 are considered significant for
simple interval maps. The term "likelihood of odds" is used to describe
the relative probability of two or more explanations of the sources of
variation in a trait. The probability of these two different explanations
(models) can be computed, and the most likely model chosen. If model A is
1000 times more probable than model B, then the ratio of the odds is
1000:1 and the logarithm of the odds ratio is 3. MapManager-QTXb20 (2004)
was used for the QTL interval mapping.
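The relationship between LRS values and LOD scores can be made concrete with a short sketch. The divisor 4.61 corresponds to two times the natural logarithm of ten (approximately 4.605); the function names below are hypothetical.

```python
import math

# LRS = 2 * ln(likelihood ratio) and LOD = log10(likelihood ratio),
# so LRS / LOD = 2 * ln(10), which is approximately 4.605.
LRS_PER_LOD = 2 * math.log(10)


def lrs_to_lod(lrs):
    """Convert a likelihood ratio statistic to a LOD score."""
    return lrs / LRS_PER_LOD


def odds_ratio_to_lod(ratio):
    """LOD score for a model that is `ratio` times more probable than another."""
    return math.log10(ratio)


threshold_lod = lrs_to_lod(13.4)        # about 2.91 for the P=0.05 threshold
example_lod = odds_ratio_to_lod(1000)   # 3.0, as in the 1000:1 example
```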
Results
[0391] QTL Interval Mapping--The present study identified two chromosome
intervals that correlate with QTL that associate with
tolerance/susceptibility to iron-deficient growth conditions according to
a constrained additive model. Two QTL were identified for FEC tolerance.
One QTL is on LG-F, which has an LRS of 25.2 and explains 5% of total
variation. The flanking markers for this QTL are Satt334 and Satt510.
Another QTL is on LG-L; this QTL has an LRS of 67.8 and explains 14% of the
total variation. The flanking markers for this QTL are Satt613 and
Satt513. This QTL has a much larger effect on FEC than the QTL on LG-F.
[0392] Marker Regression--Using single marker regression, there are a
number of markers showing association with FEC, as provided in FIG. 1.
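Single marker regression can be sketched as a simple linear regression of the phenotypic score on a numeric code for the marker genotype. The sketch below is illustrative only and is not the MapManager implementation; the 0/1 coding of the two parental alleles, the function name and the example data are assumptions.

```python
def single_marker_regression(genotypes, scores):
    """Regress phenotype score on a 0/1 marker genotype code.

    genotypes: list of 0 or 1 (allele from one parent or the other)
    scores:    list of phenotypic scores for the same lines
    Returns (slope, intercept, r_squared); r_squared is the fraction of
    phenotypic variation explained by the marker.
    """
    n = len(scores)
    mean_x = sum(genotypes) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    syy = sum((y - mean_y) ** 2 for y in scores)
    if sxx == 0 or syy == 0:
        raise ValueError("marker or phenotype shows no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)


# Hypothetical progeny: lines carrying allele 1 tend to score higher.
slope, intercept, r_squared = single_marker_regression([0, 0, 1, 1], [3, 5, 6, 8])
```

The slope estimates the difference in mean score between the two genotype classes at the marker.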
Study B
[0393] Materials and Methods
[0394] A mapping population for iron-deficiency tolerance was created
using the mapping population P1082/90B73. The population had 460 progeny
that were used in the mapping.
[0395] Phenotypic scoring of each of the progeny lines was based on two
years of collection data, with ten reps of data for each scored line.
Phenotypic scoring took place in replicated plots that were planted in
fields known to promote FEC pathology. Sites near Ada, Minn. and near
Glyndon, Minn. were used. All plots were scored once at late vegetative
to early reproductive stage of growth. Scoring was on a scale of one to
nine, where one is susceptible and nine is tolerant. The overall means of
ten reps were used for QTL interval mapping.
[0396] Soybean Genotyping--A total of 245 markers were analyzed. After
review and editing, data for 200 SSR markers were loaded into MapManager-QTX.
Of the 200 SSRs loaded into the mapping program, 164 were placed into
18 linkage groups. The remaining 36 markers were unlinked, although some
of them have linkage group information available. The map covers about
75% of the genome. No markers were placed on LG-I or LG-N (markers were
genotyped for these two linkage groups, but because of large
gaps, these markers cannot be placed together).
[0397] QTL Interval Mapping--MapManager was used for both genetic and QTL
interval mapping. One thousand (1000) permutation tests were used to establish the
threshold for statistical significance (likelihood ratio statistic--LRS).
The LRS threshold at P=0.05 was 26.1.
[0398] Results
[0399] A total of six QTL intervals were identified. These QTL are on
LG-C2, LG-D1b, LG-D2, LG-G, LG-L and LG-M. Using single marker
regression, there are a number of markers showing association with the
FEC tolerance phenotype. By reviewing the mapping results from both
interval mapping and marker regression analysis for two mapping
populations, several consistent QTL for FEC were identified. For example,
a QTL interval was identified on LG-M, which is defined by and includes
the termini SATT250 and SATT346. Furthermore, markers within these
intervals were also confirmed. For example, the EST-SSR marker S60126-TB
maps between SATT250 and SATT346, and also correlates with the low-iron
tolerance phenotype (p=0.0000).
DISCUSSION/CONCLUSIONS
[0400] The present mapping study has identified chromosome intervals and
individual markers that correlate with tolerance to iron-deficient growth
conditions. Markers that lie within these intervals are useful in
MAS, as well as for other purposes.
[0401] While the foregoing invention has been described in some detail for
purposes of clarity and understanding, it will be clear to one skilled in
the art from a reading of this disclosure that various changes in form
and detail can be made without departing from the true scope of the
invention. For example, all the techniques and apparatus described above
can be used in various combinations. All publications, patents, patent
applications, and/or other documents cited in this application are
incorporated by reference in their entirety for all purposes to the same
extent as if each individual publication, patent, patent application,
and/or other document were individually indicated to be incorporated by
reference for all purposes.
Sequence Listing
Number of sequences: 69. All sequences are DNA, artificial sequence.
SEQ ID NO  Length  Description             Sequence
1          22      oligonucleotide primer  tcattgcgtg attgattttc cg
2          22      oligonucleotide primer  gtttcgctct gagtctccca gg
3          22      oligonucleotide primer  caatcaggtt agtggtccta cc
4          20      oligonucleotide primer  caaaaggttt tcagtggtgg
5          24      oligonucleotide primer  tgctcaaagg gtcaatttct ttcc
6          28      oligonucleotide primer  tgtgtaattt ctatcacctt attgtgcc
7          25      oligonucleotide primer  cgactaacac ctttcacttg acttg
8          22      oligonucleotide primer  gcaggaattt gggggagtct gt
9          23      oligonucleotide primer  gctggccttt agaacgtctg act
10         22      oligonucleotide primer  cgttggattc gactttttgg ga
11         27      oligonucleotide primer  gcgttaagaa tgcatttatg tttagtc
12         25      oligonucleotide primer  gcgagttttt ggttggattg agttg
13         28      oligonucleotide primer  gcgagtttcg ccgttaccac ctcagctt
14         28      oligonucleotide primer  ccctcttatt tcaccctaag acctacaa
15         20      oligonucleotide primer  caagctcaag cctcacacat
16         22      oligonucleotide primer  tgaccagagt ccaaagttca tc
17         22      oligonucleotide primer  tgctcatgtg gtcctaccca ga
18         26      oligonucleotide primer  cgctatccct ttgtattttc ttttgc
19         24      oligonucleotide primer  aaagcatttt tggcagtttc ttgt
20         22      oligonucleotide primer  ggaatgtccc aagtgtcagc aa
21         22      oligonucleotide primer  gcgatcatgt ctctgccatc ag
22         22      oligonucleotide primer  cctcttgaaa ccgtgaaacc gt
23         22      oligonucleotide primer  ccccaacaac aacgatcatc aa
24         22      oligonucleotide primer  tttgtaggta accaccgcag gc
25         26      oligonucleotide primer  gcgcaattaa aaggataact tatatc
26         26      oligonucleotide primer  cccctctttg gccctcacac cttctc
27         25      oligonucleotide primer  gcggatatgc cacttctctc gtgac
28         25      oligonucleotide primer  gcggaatagt tgccaaacaa taatc
29         25      oligonucleotide primer  tggagattta atatagatgc cgcga
30         24      oligonucleotide primer  gcaccatgtt ctttttccat caaa
31         26      oligonucleotide primer  gcggaatatg atcattggta atgtac
32         22      oligonucleotide primer  cggcttcaaa cggcaaataa tc
33         22      oligonucleotide primer  gaaccacaga ggctgcaact cc
34         22      oligonucleotide primer  acctggttga agaggtggtg ga
35         20      oligonucleotide primer  cgccagctag ctagtctcat
36         24      oligonucleotide primer  aatttgctcc agtgttttaa gttt
37         22      oligonucleotide primer  accttcacca ccaccaccat ct
38         22      oligonucleotide primer  tagtttccgt tgctgggagg ag
39         22      oligonucleotide primer  gttcggaggg aggaaagtgt tg
40         26      oligonucleotide primer  ccataaaaca tagcaactgt cgtctc
41         26      oligonucleotide primer  gagcaggaca ttttttttat ccttga
42         25      oligonucleotide primer  tgcttccatt agtctctcat cctcc
43         25      oligonucleotide primer  gcgactttct tttcaatttc actcc
44         21      oligonucleotide primer  gcgcaattgt caccaacaca t
45         23      oligonucleotide primer  ccaaagctga gcagctgata act
46         25      oligonucleotide primer  ccctcactcc tagattattt gttgt
47         26      oligonucleotide primer  gggttatatc agtttttctt tttgtt
48         20      oligonucleotide primer  ccatcctcgt tagcatctat
49         29      oligonucleotide primer  gatggctgtc attgctacag aggagtatc
50         38      oligonucleotide primer  gtgactccaa aggaaagaga aatgtttctt aaatcatc
51         31      oligonucleotide primer  caattcttgt gggttgaagc cttgttctga c
52         30      oligonucleotide primer  ggaatcaact tcttcgtgag tgggttgttc
53         32      oligonucleotide primer  cacactatca acacctattg gtgaccattg ta
54         33      oligonucleotide primer  ggagggtgct tatgtaaatg atgtaaagac cat
55         33      oligonucleotide primer  catgaagctc caccatttgc tagtacatga aac
56         33      oligonucleotide primer  ccagagttac caaaccatct gtgagaaata tcc
57         30      oligonucleotide primer  gagggctatg ttttcttctc cagatgtgag
58         26      oligonucleotide primer  aaggtcggct tggtggttaa aggcag
59         14      oligonucleotide probe   aatgataatt tagt
60         13      oligonucleotide probe   aatgatcatt tag
61         12      oligonucleotide probe   gaatgacttt ga
62         13      oligonucleotide probe   gaatgatttt gac
63         13      oligonucleotide probe   ttatagacac ttg
64         12      oligonucleotide probe   tataggcact tg
65         12      oligonucleotide probe   gaggagatgt ag
66         12      oligonucleotide probe   gaggaaatgt ag
67         13      oligonucleotide probe   tcatctgtga taa
68         13      oligonucleotide probe   tcatgtgtga taa
69         13      oligonucleotide probe   tcatctctga taa
* * * * *