*3.3.1. Biological sample*

Taking ALLPATHS as an example, memory use is estimated at roughly 1.7 bytes per read base, which amounts to about 102 GB of RAM for a 1-Gb genome sequenced at 60× coverage. A requirement of this size can readily be met today. Alternatively, it can be met by sharing memory across compute nodes, or by distributing the workload among the nodes of a computer cluster, which is accessible at most universities and research institutions. In addition, cloud computing gives access to high-speed computer clusters on a pay-as-you-go basis, and several cloud-based sequence assemblers have recently been developed (summarized in Table 3).

**Table 3.** Cloud computing-based sequence assemblers

**Figure 2.** Outline of De Bruijn graph construction during sequence assembly. A short model genome is sequenced, and four short reads are generated from the template. A *k*-mer length of 5 was chosen for the assembly. For each *k*-mer, the left *k* – 1 and right *k* – 1 bases are represented as nodes in the De Bruijn graph, and each left part is connected to its possible right parts by directed edges. The red digits show the number of occurrences of each node. The cyclic edge at the rightmost end of the graph causes a gap in the contig assembly. Thus, the final assembly does not fully represent the "repeat" in the genome sequence.
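The graph construction described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by any particular assembler: the function names and the toy reads are hypothetical, and the node count is assumed to be the number of *k*-mers in which a (*k* – 1)-mer appears as the left or right part, which may differ from how the figure counts occurrences.

```python
from collections import defaultdict


def kmers(read, k):
    """Yield every k-mer (substring of length k) of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Build a De Bruijn graph from reads (hypothetical helper).

    Each k-mer contributes a directed edge from its left (k-1)-mer
    to its right (k-1)-mer. Returns (edges, node_counts), where
    node_counts is an assumed occurrence measure: how many times a
    (k-1)-mer appears as the left or right part of a k-mer.
    """
    edges = defaultdict(set)
    node_counts = defaultdict(int)
    for read in reads:
        for kmer in kmers(read, k):
            left, right = kmer[:-1], kmer[1:]
            edges[left].add(right)
            node_counts[left] += 1
            node_counts[right] += 1
    return dict(edges), dict(node_counts)


# Toy reads (not the reads shown in Figure 2), assembled with k = 5.
toy_reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
toy_edges, toy_counts = build_de_bruijn(toy_reads, 5)

# A low-complexity repeat collapses onto a single node with a cyclic edge.
repeat_edges, _ = build_de_bruijn(["AAAAAA"], 5)
```

In the repeat example the only node, `AAAA`, points back to itself, so the graph alone cannot tell how many copies of the repeat the genome contains. This is the kind of cyclic edge that the caption identifies as the cause of a gap in the contig assembly.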

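The ALLPATHS memory estimate quoted earlier (roughly 1.7 bytes per read base) is simple arithmetic and can be checked with a short sketch. The function name is hypothetical, and the result assumes decimal gigabytes (10^9 bytes) and a genome size given in bases.

```python
def estimate_assembly_ram_gb(genome_size_bases, coverage, bytes_per_read_base=1.7):
    """Rough RAM estimate in GB (10^9 bytes) for an assembly.

    Total read bases = genome size x coverage; each read base is
    assumed to cost `bytes_per_read_base` bytes of memory.
    """
    read_bases = genome_size_bases * coverage
    return read_bases * bytes_per_read_base / 1e9


# A 1-Gb genome sequenced at 60x coverage.
ram_gb = estimate_assembly_ram_gb(1e9, 60)
```

For a 1-Gb genome at 60× coverage this gives about 102 GB, matching the figure quoted in the text. The estimate scales linearly with both genome size and coverage.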

*X. couchianus* was maintained by sibling inbreeding, and the fish that were sequenced were in their 77th generation of inbreeding. *X. hellerii* was maintained by reciprocal crossbreeding between two distinct *X. hellerii* strains that differ in sword color. All fish used for genome sequencing were female, since the high proportion of repetitive DNA generally found on Y chromosomes can confound the downstream assembly.
