SP3 Use Cases 2008 Document 2

From ICISWiki


Specifications for a Molecular Selection software application, MOSEL

Background

Discussions with breeders at ICRISAT, IRRI and CIMMYT indicate that the adoption and effectiveness of molecular breeding are constrained by the lack of a reliable, easy-to-use application that displays and integrates genotype, pedigree and phenotype information and allows selection of lines for promotion to the next cycle of breeding.

Purpose

The proposed Molecular Selection tool, MOSEL, will access pedigree, genotype and phenotype data for test and reference lines. It will access marker and QTL data from mapping studies and will integrate the data sources into a single display. It will show the genetic proximity of loci in test lines to foreground and background target genotypes and allow selection of test lines based on marker and/or phenotypic traits.

Input

  • Test germplasm will be specified through a germplasm list containing germplasm identification data such as GID, Entry No, Entry Code, Designation and Source.
  • Genotyping and, optionally, phenotype data for test lines will be supplied through accessible data sets (studies in ICIS). Genotyping data consist of molecular genotypes for each test line at a number of loci.

Phenotype data consist of one or more phenotypic measurements on each line. If there is more than one measurement, an averaging scheme must be available. Initial prototypes of MOSEL will assume a single phenotype value per line, which should have been derived through appropriate analysis if necessary.

  • Ancestors are determined from the test germplasm by tracing pedigrees according to one of three neighborhood types:
    • Maintenance sources for 1, 2 … or all generations. Maintenance sources are ancestors related to the test line by maintenance methods such as seed increase, regeneration or release.
    • Derivative sources for 1, 2 … or all generations. These are ancestors related to the test germplasm by derivative methods such as selection, purification or double haploidy. Any derivative neighborhood contains all maintenance ancestors also.
    • Progenitors for 1, 2 … or n generations. These are parental lines related by generative methods such as crossing, mutagenesis or mixture. These generative neighborhoods include all derivative and maintenance sources. Genotype and/or phenotype data for ancestors should be supplied through accessible data sets, which may be the same as those for test lines.
  • Founders are the earliest (genealogically), genotyped ancestors from which all the other ancestors and the test lines have been produced. Usually, they are the direct parents giving rise to the test set.
  • Genetic information about the loci may be provided through a map data set. This should group the observed loci and optionally order them within groups. The map data may also specify QTL and known gene information for any traits of interest.
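The three neighborhood types above nest inside one another, so tracing can be sketched as a breadth-first walk that stops whenever a line was produced by a method broader than the requested neighborhood. The following is a minimal sketch only: the `MethodType` ranking, the `Germplasm` record and the `trace_ancestors` function are hypothetical names, not part of ICIS, and each pedigree step is counted as one generation, which is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional, Set


class MethodType(IntEnum):
    """Hypothetical ranking of breeding method types, narrowest first."""
    MAINTENANCE = 1  # e.g. seed increase, regeneration, release
    DERIVATIVE = 2   # e.g. selection, purification, double haploidy
    GENERATIVE = 3   # e.g. crossing, mutagenesis, mixture


@dataclass
class Germplasm:
    gid: int
    method: MethodType   # method that produced this line
    sources: List[int]   # GIDs of the immediate source(s) or parents


def trace_ancestors(
    pedigree: Dict[int, Germplasm],
    test_gid: int,
    neighborhood: MethodType,
    max_generations: Optional[int] = None,
) -> Set[int]:
    """Collect ancestors reachable through methods no broader than `neighborhood`.

    Assumption: every pedigree step counts as one generation;
    max_generations=None means "all generations".
    """
    ancestors: Set[int] = set()
    frontier = [test_gid]
    generation = 0
    while frontier and (max_generations is None or generation < max_generations):
        next_frontier = []
        for gid in frontier:
            record = pedigree.get(gid)
            # Stop at unknown lines and at lines produced by a broader method.
            if record is None or record.method > neighborhood:
                continue
            for src in record.sources:
                if src not in ancestors:
                    ancestors.add(src)
                    next_frontier.append(src)
        frontier = next_frontier
        generation += 1
    return ancestors
```

Because the method types are ordered, a derivative neighborhood automatically includes maintenance steps, and a generative neighborhood includes both, which matches the nesting described in the list.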

Genotypes

  • Observed genotypes

For each line (test, ancestor or founder) and for each locus, the genotyping data specify the frequencies of one or more alleles. For each allele at each locus in each line, the pedigree information can be used to calculate the probability that the allele came from each founder. Hence, the genotype of line i (i = 1, 2, …, I) at locus j (j = 1, 2, …, J) consists of the frequencies Xijk of allele k (k = 1, 2, …, Kj) and the probabilities pijkl that allele k came from founder l (l = 1, 2, …, L). For example, for germplasm i and locus j with two alleles and two founders, the genotype is:

     AlleleID   1=A   2=B
     Frequency  Xij1  Xij2
     Founder 1  pij11 pij21
     Founder 2  pij12 pij22
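The genotype of one line at one locus could be held in a small record such as the one sketched below. This is an illustration only: the class and method names are hypothetical, and the checks that allele frequencies sum to 1 and that the founder probabilities for each allele sum to 1 are assumptions not stated in this specification. The `founder_contribution` helper, which weights the founder probabilities by the allele frequencies, is likewise an illustrative derived quantity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocusGenotype:
    """Genotype of line i at locus j (hypothetical container)."""
    allele_ids: List[str]             # e.g. ["A", "B"], indexed by k
    frequencies: List[float]          # frequencies[k] = X_ijk
    founder_probs: List[List[float]]  # founder_probs[k][l] = p_ijkl

    def validate(self, tol: float = 1e-9) -> None:
        """Check shapes and the (assumed) sum-to-one constraints."""
        if len(self.frequencies) != len(self.allele_ids):
            raise ValueError("one frequency is needed per allele")
        if len(self.founder_probs) != len(self.allele_ids):
            raise ValueError("one row of founder probabilities is needed per allele")
        if abs(sum(self.frequencies) - 1.0) > tol:
            raise ValueError("allele frequencies should sum to 1")
        for allele, row in zip(self.allele_ids, self.founder_probs):
            if abs(sum(row) - 1.0) > tol:
                raise ValueError(
                    f"founder probabilities for allele {allele} should sum to 1"
                )

    def founder_contribution(self) -> List[float]:
        """Sum over alleles k of X_ijk * p_ijkl, one value per founder l."""
        n_founders = len(self.founder_probs[0])
        return [
            sum(x * row[l] for x, row in zip(self.frequencies, self.founder_probs))
            for l in range(n_founders)
        ]
```

For the two-allele, two-founder example, `founder_probs[0]` holds the probabilities for allele A across both founders, so each row here corresponds to a column of the table above.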

Parameterization

A number of parameters control the display and will be set through a user interface. Because the selection process will generally span more than one session, the application will be able to store and retrieve parameter settings.
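One simple way to store and retrieve settings between sessions is to serialize them to a file. The sketch below uses JSON and covers only a few of the data source parameters listed in this section. All names (`StudyEffect`, `DataSourceParams`, `save_params`, `load_params`) and the use of integer identifiers are assumptions made for illustration, not the ICIS data model.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StudyEffect:
    """Hypothetical reference to an ICIS study and effect."""
    study_id: int
    effect_id: int


@dataclass
class DataSourceParams:
    """Subset of the data source parameters (field names are assumptions)."""
    test_list_id: int                            # germplasm list of test lines
    test_genotype: StudyEffect                   # test line genotyping data
    test_phenotype: Optional[StudyEffect] = None
    map_study_id: Optional[int] = None


def save_params(params: DataSourceParams, path: str) -> None:
    """Write the settings to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(params), fh, indent=2)


def load_params(path: str) -> DataSourceParams:
    """Read settings back, rebuilding the nested study/effect records."""
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    for key in ("test_genotype", "test_phenotype"):
        if raw.get(key) is not None:
            raw[key] = StudyEffect(**raw[key])
    return DataSourceParams(**raw)
```

A plain-text format such as JSON keeps saved sessions readable and easy to edit by hand. A real implementation might instead store settings in the database alongside the studies they refer to.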

  • Data source parameters
    • The identification of the list of test germplasm (ICIS germplasm list ID)
    • The identification of the test line genotyping data (ICIS study and effect)
    • The identification of the test line phenotype data (ICIS study and effect)
    • Ancestor genotyping data (ICIS study and effect)
    • Ancestor phenotyping data (ICIS study and effect)
    • Set of founder germplasm IDs (if the founders are amongst the ancestors) or identification of the list of founder germplasm (ICIS germplasm list ID)
    • Founder genotype data source
    • Founder phenotype data source
    • Map data source (ICIS map study)
  • Locus organization

The genetic loci specified by the test germplasm genotyping data may be assigned to linkage groups by the map data (see Input). Test loci that are not on the map are allocated to an unordered default group. If no map data are specified, all loci are allocated to the unordered default group.
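The grouping rule can be sketched as follows. This is a minimal illustration: the function name, the `"unmapped"` label for the default group and the representation of map data as a dictionary from locus name to a (linkage group, position) pair are all assumptions. A group is ordered only when every one of its loci has a position, since the map may group loci without ordering them.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

DEFAULT_GROUP = "unmapped"  # hypothetical label for the unordered default group

# Hypothetical map representation: locus -> (linkage group, position or None)
MapData = Dict[str, Tuple[str, Optional[float]]]


def organize_loci(
    test_loci: Iterable[str], map_data: Optional[MapData] = None
) -> Dict[str, List[str]]:
    """Assign test loci to linkage groups, falling back to the default group."""
    groups = defaultdict(list)
    for locus in test_loci:
        entry = map_data.get(locus) if map_data else None
        if entry is None:
            # Locus not on the map, or no map supplied at all.
            groups[DEFAULT_GROUP].append((locus, None))
        else:
            group, position = entry
            groups[group].append((locus, position))

    organized: Dict[str, List[str]] = {}
    for group, members in groups.items():
        # Order within a group only when the map gives every locus a position.
        if group != DEFAULT_GROUP and all(pos is not None for _, pos in members):
            members = sorted(members, key=lambda m: m[1])
        organized[group] = [locus for locus, _ in members]
    return organized
```

Loci in the default group, and in any group without complete position information, keep the order in which they appear in the genotyping data.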
