News - 27th July 2016

A new barley Illumina iSelect 50k SNP chip, designed by the James Hutton Institute, will be available later in 2016. Please contact barleysnpchip@hutton.ac.uk for further information.

Barley iSelect SNP Chip

Release 21st March 2012

Overview

The barley iSelect chip is based on the Illumina Infinium genotyping assay (http://www.illumina.com/products/infinium_iselect_custom_genotyping_beadchip.ilmn). The current chip features a total of 7,842 SNP assays: 2,832 existing barley OPA (BOPA) SNPs used for the existing Illumina GoldenGate assay and 5,010 new SNPs discovered from next-generation sequencing (NGS) data.

New NGS SNPs

This set of SNPs was derived from Illumina RNAseq reads mapped onto the Harvest 35 reference sequences (http://www.harvest-web.org/hweb/bin/wc.dll?hwebProcess~hmain~&versid=5). The raw Illumina reads were a mixture of 54 nt and 76 nt reads from the cultivars Barke, Betzes, Bowman, Derkado, Intro, Morex, Optic, Quench, Sergeant, and Tocada. In total, 191,210,425 reads were used for the mapping. Reads were quality trimmed and then mapped to the Harvest 35 unigene sequences using the Mosaik read mapping tool (http://bioinformatics.bc.edu/marthlab/Mosaik). The mapping allowed 2% mismatches and used the “-m unique” parameter setting, which discards reads that could be mapped to more than one location in the reference sequence. The Illumina reads mapped to 41,484 of the 50,938 unigenes (81.4%). Of the 191,210,425 reads, 112,721,213 (58.9%) were mapped successfully to the reference sequences.
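The two mapping rates quoted above can be recomputed from the raw counts. The short Python sketch below does only that arithmetic; the constant and function names are chosen for illustration and are not part of the original pipeline.

```python
# Counts quoted in the text above.
TOTAL_UNIGENES = 50_938
UNIGENES_WITH_READS = 41_484
TOTAL_READS = 191_210_425
MAPPED_READS = 112_721_213


def percentage(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


unigene_rate = percentage(UNIGENES_WITH_READS, TOTAL_UNIGENES)
read_rate = percentage(MAPPED_READS, TOTAL_READS)

print(f"unigenes with mapped reads: {unigene_rate:.2f}%")
print(f"reads mapped to the reference: {read_rate:.2f}%")
```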

The initial round of SNP discovery was carried out using the GigaBayes tool (http://bioinformatics.bc.edu/marthlab/GigaBayes) with a deliberately relaxed initial set of parameter values:

  • --CRL 6: minimum read coverage = 6
  • --CAL 3: minimum number of instances of an allele required = 3
  • --PSL 0.5: minimum likelihood of SNP being genuine = 0.5
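To make the meaning of the three thresholds concrete, here is a minimal Python sketch that applies equivalent cutoffs to a candidate SNP record. It is not GigaBayes code and does not reproduce its internal logic; the `Candidate` record, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass

# Cutoffs mirroring the relaxed parameter values listed above.
MIN_COVERAGE = 6        # corresponds to --CRL 6
MIN_ALLELE_READS = 3    # corresponds to --CAL 3
MIN_PROBABILITY = 0.5   # corresponds to --PSL 0.5


@dataclass
class Candidate:
    """Hypothetical summary of one candidate SNP position."""
    name: str
    read_coverage: int      # number of reads covering the position
    min_allele_reads: int   # read count of the least supported allele
    snp_probability: float  # caller's estimate that the SNP is genuine


def passes_relaxed_thresholds(c: Candidate) -> bool:
    """Return True if the candidate meets all three relaxed cutoffs."""
    return (
        c.read_coverage >= MIN_COVERAGE
        and c.min_allele_reads >= MIN_ALLELE_READS
        and c.snp_probability >= MIN_PROBABILITY
    )


# Made-up candidates for illustration only.
candidates = [
    Candidate("cand_1", read_coverage=12, min_allele_reads=4, snp_probability=0.9),
    Candidate("cand_2", read_coverage=5, min_allele_reads=3, snp_probability=0.8),
    Candidate("cand_3", read_coverage=30, min_allele_reads=2, snp_probability=0.99),
]
kept = [c.name for c in candidates if passes_relaxed_thresholds(c)]
print(kept)
```

Relaxed cutoffs like these keep many marginal candidates on purpose, leaving the stricter selection to the later filtering steps.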

We then extracted SNP manifest sequences for the full set of SNPs from the GigaBayes run (n = 240,119) and submitted them to the Illumina Assay Design Tool (ADT) to assess the suitability of the SNPs for the assay. Of the 240,119 SNPs submitted, 92,063 passed the ADT run, and 76,831 of these had a final design score of >= 0.6 (Illumina’s recommended threshold for high-confidence design). This set of SNPs was then filtered further to remove any overlap with the existing BOPA SNPs and to exclude SNPs where there was evidence of heterozygosity within samples.

This left 31,616 potentially usable SNPs of varying robustness. To allow ranking of these we extracted the following statistics:

  • The minimum allele count for any one sample. For each sample that contributed reads to a SNP, we count the reads that support that sample's genotype call (in homozygous samples the count for the major, and only, allele; in heterozygous samples the count for the minor allele) and then take the minimum of these counts across all samples. This allows us to control the robustness of an actual genotype call (referred to as “minimum read replication” in the data).
  • The minor allele frequency. This allows us to remove false positive SNPs that are caused purely by read errors (referred to as “minor allele frequency” in the data).
  • The proportion of samples – out of the total number of samples that contributed reads to a SNP – which had the genotype of the minor allele at the SNP location. This reflects the level of diversity for the SNP (referred to as “diversity index” in the data).
  • The number of samples, rather than the proportion above, which had the genotype of the minor allele at the SNP location (referred to as “number of Samples With Minor Allele As Genotype” in the data).
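The four statistics above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used to build the chip: it assumes every sample is homozygous (SNPs with evidence of heterozygosity were already excluded), it takes the minor allele to be the one with fewer pooled reads, and it computes the minor allele frequency from pooled read counts. The function name, input layout, and read counts are made up for the example.

```python
from collections import Counter


def snp_statistics(calls):
    """Compute the four ranking statistics for one SNP.

    `calls` maps sample name -> (allele, supporting_read_count).
    Assumption: each sample is homozygous, so one allele per sample.
    """
    reads_per_allele = Counter()
    samples_per_allele = Counter()
    for allele, reads in calls.values():
        reads_per_allele[allele] += reads
        samples_per_allele[allele] += 1

    # Assumption: the minor allele is the one with fewer pooled reads
    # (ties broken alphabetically so the result is deterministic).
    minor = min(reads_per_allele, key=lambda a: (reads_per_allele[a], a))
    total_reads = sum(reads_per_allele.values())

    return {
        "minimum_read_replication": min(reads for _, reads in calls.values()),
        "minor_allele_frequency": reads_per_allele[minor] / total_reads,
        "diversity_index": samples_per_allele[minor] / len(calls),
        "samples_with_minor_allele": samples_per_allele[minor],
    }


# Made-up read counts for four of the cultivars, for illustration only.
example_calls = {
    "Barke": ("A", 12),
    "Morex": ("G", 8),
    "Optic": ("A", 20),
    "Bowman": ("G", 10),
}
stats = snp_statistics(example_calls)
print(stats)
```

In this example allele G has 18 of the 50 pooled reads and is carried by two of the four samples, so the minor allele frequency is 0.36 and the diversity index is 0.5.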

We then carried out further filtering and sorting as follows:

  • Filter by “number of Samples With Minor Allele As Genotype” using a cutoff of >=2.
  • Sort the data by unigene name (ascending), then by “minimum read replication” (descending).
  • Remove duplicates so that only a single SNP remains per unigene. In combination with the previous sort, this will select from each unigene the SNP with the highest value for “minimum read replication”.
  • Sort the remaining SNPs again, first by “minimum read replication” (descending), then by “minor allele frequency” (descending). This puts the most robust SNPs topmost.
  • Select the top SNPs from this list (n = 5,010).
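The filtering and sorting steps above can be expressed as a short Python sketch. The record layout (a list of dictionaries), the key names, and the example values are assumptions made for illustration; the original work may have used a spreadsheet or other tooling.

```python
def select_top_snps(snps, n):
    """Apply the filter, sort, de-duplicate, re-sort, and select steps."""
    # Step 1: keep SNPs where at least two samples carry the minor allele.
    kept = [s for s in snps if s["samples_with_minor_allele"] >= 2]

    # Step 2: sort by unigene (ascending), then read replication (descending).
    kept.sort(key=lambda s: (s["unigene"], -s["minimum_read_replication"]))

    # Step 3: keep only the first (best replicated) SNP per unigene.
    best_per_unigene = {}
    for s in kept:
        best_per_unigene.setdefault(s["unigene"], s)

    # Step 4: rank by read replication, then minor allele frequency,
    # both descending, so the most robust SNPs come first.
    ranked = sorted(
        best_per_unigene.values(),
        key=lambda s: (-s["minimum_read_replication"], -s["minor_allele_frequency"]),
    )

    # Step 5: take the top n.
    return ranked[:n]


# Made-up records with hypothetical SNP and unigene names.
records = [
    {"name": "snp_a", "unigene": "U1", "minimum_read_replication": 5,
     "minor_allele_frequency": 0.3, "samples_with_minor_allele": 2},
    {"name": "snp_b", "unigene": "U1", "minimum_read_replication": 9,
     "minor_allele_frequency": 0.2, "samples_with_minor_allele": 3},
    {"name": "snp_c", "unigene": "U2", "minimum_read_replication": 9,
     "minor_allele_frequency": 0.4, "samples_with_minor_allele": 2},
    {"name": "snp_d", "unigene": "U3", "minimum_read_replication": 20,
     "minor_allele_frequency": 0.5, "samples_with_minor_allele": 1},
    {"name": "snp_e", "unigene": "U4", "minimum_read_replication": 4,
     "minor_allele_frequency": 0.1, "samples_with_minor_allele": 2},
]
top_two = [s["name"] for s in select_top_snps(records, 2)]
print(top_two)
```

Here snp_d is dropped by the first filter despite its high read replication, and snp_a loses to snp_b within unigene U1. For the chip itself the final selection size was n = 5,010.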

Validation of Illumina Mappings

To check the validity of the Illumina read mappings on which the SNP discovery was based, we extracted genotype calls from the Illumina read mappings at the BOPA SNP positions and compared them to existing genotype data from the Illumina GoldenGate assay. Validation rates ranged from about 93% to 96%:

Sample      Percentage Correctly Called
Betzes      93.62
Bowman      95.08
Intro       94.59
Morex       95.98
Optic       94.95
Sergeant    93.25
Tocada      93.78
Derkado     95.03
Barke       94.70
Quench      93.15
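A per-sample figure of this kind can be computed as in the Python sketch below. It is a minimal illustration, assuming genotype calls are stored as dictionaries keyed by marker name and that only markers called on both platforms are compared; the function name, marker names, and genotypes are hypothetical.

```python
def percent_correctly_called(rnaseq_calls, reference_calls):
    """Percentage of shared markers where both genotype calls agree.

    Both arguments map marker name -> genotype call for one sample.
    Assumption: markers missing from either data set are ignored.
    """
    shared = [m for m in rnaseq_calls if m in reference_calls]
    if not shared:
        raise ValueError("no markers in common between the two call sets")
    matches = sum(rnaseq_calls[m] == reference_calls[m] for m in shared)
    return 100.0 * matches / len(shared)


# Made-up calls for one sample, for illustration only.
rnaseq = {"marker_1": "AA", "marker_2": "GG", "marker_3": "CC", "marker_4": "TT"}
reference = {"marker_1": "AA", "marker_2": "GG", "marker_3": "TT", "marker_5": "AA"}

rate = percent_correctly_called(rnaseq, reference)
print(f"{rate:.2f}% of shared markers agree")
```

In this example three markers are shared and two of them agree.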

Download Data or Search for Individual Markers

This data resource is searchable for individual markers, and we hope to add further functionality over time. The complete data set can also be downloaded in plain text or Excel format using the download links in the User Panel below. The plain text file can be opened in a text editor or imported into Excel.

User Panel


For help please contact barleysnpchip@hutton.ac.uk

1. Download Data



Download All Data (Text)

Download All Data (Excel)

2. Search iSelect Database


Enter a marker name here to search the database for additional information. Wildcards are automatically added to the beginning and end of your search string.

For example, to find all markers whose names contain the characters '12', enter 12 in the search box here or at the top right-hand side of each page.
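Adding a wildcard to both ends of the search string amounts to a substring match on the marker name. The Python sketch below illustrates that behaviour only; it is not the site's search code, the marker names are hypothetical, and the case-sensitive comparison is an assumption, since the text does not say how the real search treats case.

```python
def search_markers(marker_names, query):
    """Return the marker names that contain `query` anywhere in the name.

    Equivalent to matching the pattern *query* with wildcards at both ends.
    Assumption: the comparison is case-sensitive.
    """
    return [name for name in marker_names if query in name]


# Hypothetical marker names for illustration only.
names = ["marker_112", "marker_205", "snp_1200", "snp_34"]
hits = search_markers(names, "12")
print(hits)
```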


3. Browse SNPs by Chromosome


Display all mapped SNPs, grouped by the chromosome they have been mapped to.