possible by observing its external appearance. Moreover, the content of biochemical substances in hop can vary depending on the cultivation conditions.

There are several reports on the use of DNA-based analytical techniques for assessing genetic diversity in hops, such as random amplified polymorphic DNAs (RAPDs) [1–5], restriction fragment length polymorphisms (RFLPs) [3, 6], amplified fragment length polymorphisms (AFLPs) [3, 4, 7, 8], and microsatellites (simple sequence repeats, SSRs) [7, 9–14], as well as other approaches such as markers in spacer or noncoding regions [15], sequence-tagged site (STS) markers [3, 13, 16], and diversity arrays technology (DArT) [17]. Since most of these methods are based on a fingerprinting approach, the results obtained are sometimes unclear and prone to misjudgment, thereby limiting the detection of contaminating varieties.

Analysis of single nucleotide polymorphisms (SNPs; single-nucleotide differences in homologous DNA among different varieties) in genomic DNA might be a better and more reproducible tool for the identification of varieties. SNPs are widely distributed in the genome and could be used as markers for the assessment of genotypes. For example, variation in the DNA sequence and expression of the valerophenone synthase (VPS) gene, a key gene of the bitter acid biosynthesis pathway, has been investigated in hop using SNPs [18]. However, a large amount of DNA sequence is needed to obtain enough SNPs to distinguish the different varieties. In this context, high-throughput next-generation sequencing (NGS) generally provides several hundred thousand times more sequence data in a single analysis than the conventional Sanger method, but whole-genome sequencing by either method is still very expensive and time-consuming. In fact, the genome sizes of two representative hop varieties, lupulus and neomexicanus, are 2.74 and 2.97 Gb, respectively [19], which are comparable to that of the human genome [20].

To overcome these problems, transcriptome analysis has been employed for the identification of hop varieties. The transcriptome is the entire mRNA content transcribed from the genome, and its size is roughly one two-hundredth to one hundredth of that of the genome, that is, about 13.5 to 30 Mb for a genome of about 2.7 to 3.0 Gb. Nevertheless, even by a conservative estimate of one SNP per 1000 bp on average, based on the frequency of SNPs observed in the human genome [20], 13.5K to 30K SNPs could be expected in a relatively short period. Such a large number of SNPs would be enough for the identification of hop varieties. The discovery of a large number of SNPs and their specific combinations in each variety could lead to the identification of many hop varieties and the detection of contaminants in mixed varieties using these specific SNP-combination-based markers. In the present study, we developed an SNP-based identification method for hop varieties [21].

324 Next Generation Sequencing - Advances, Applications and Challenges

**2. Research protocols, methods, results, and discussion**

**2.1. Identification of SNP markers by second generation sequencing and transcriptome analysis**

*2.1.1. SNPs*

In order to obtain intravariety DNA polymorphic regions required for developing a hop variety identification technique, we attempted searching for SNPs in a large amount of

*2.1.2. Sample collection and storage*

The hop varieties to be identified were Saaz, Sládek, and Premiant, which originated in the Czech Republic; Tradition, Spalter, Spalter Select, Perle, Tettnang, Brewer's Gold, Northern Brewer, Magnum, Herkules, German Nugget, and Taurus, which originated in Germany; and Cascade, Zeus, Summit, Galena, Super Galena, Nugget, and Columbus/Tomahawk, which originated in the United States (as is widely assumed, we treated Columbus and Tomahawk as genetically identical). Pellets or dried samples of these varieties were obtained from appropriate suppliers. In addition, three varieties (referred to as A, B, and C for convenience) were selected, and fresh leaves were collected from them. The leaves were sampled and stored according to the procedure described below:

*Tissue*: The youngest leaves available (small, yellow-green, and soft) were collected. Those with white foreign matter on the surface were excluded.

*Methodology*: To prevent RNase contamination, leaves were collected with gloved hands and soaked in a reagent that prevents RNA degradation (RNA Save; Biological Industries Israel Beit Haemek Ltd., Israel).

*Storage*: Although the RNA was stable for at least 1 week even at room temperature, the leaves were kept refrigerated for as much of the time as possible until they were used for RNA isolation and for transcriptome and SNP sequence analysis.
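For bookkeeping, the sample set described above can be captured as a small manifest. The following is only an illustrative sketch: the names `VARIETIES_BY_ORIGIN`, `ALIASES`, `origin_of`, and `total_varieties` are hypothetical and do not come from the study. The one modeling decision taken from the text is that Columbus and Tomahawk are treated as a single entry.

```python
# Varieties grouped by country of origin, as listed in the text.
VARIETIES_BY_ORIGIN = {
    "Czech Republic": ["Saaz", "Sládek", "Premiant"],
    "Germany": [
        "Tradition", "Spalter", "Spalter Select", "Perle", "Tettnang",
        "Brewer's Gold", "Northern Brewer", "Magnum", "Herkules",
        "German Nugget", "Taurus",
    ],
    "United States": [
        "Cascade", "Zeus", "Summit", "Galena", "Super Galena",
        "Nugget", "Columbus/Tomahawk",
    ],
}

# Columbus and Tomahawk are considered genetically identical, so both
# names resolve to the same manifest entry (hypothetical alias table).
ALIASES = {
    "Columbus": "Columbus/Tomahawk",
    "Tomahawk": "Columbus/Tomahawk",
}


def origin_of(name):
    """Return the country of origin for a variety name, or None if unlisted."""
    canonical = ALIASES.get(name, name)
    for origin, varieties in VARIETIES_BY_ORIGIN.items():
        if canonical in varieties:
            return origin
    return None


def total_varieties():
    """Count manifest entries, with Columbus/Tomahawk counted once."""
    return sum(len(names) for names in VARIETIES_BY_ORIGIN.values())
```

Keeping Columbus/Tomahawk as one entry means any downstream count of distinguishable varieties reflects the assumption stated in the text rather than the number of trade names.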
