The chromosome-level genome was constructed by aligning *de novo* assembled contigs to the *X. maculatus* chromosome assembly using Nucmer 3.0 from the MUMmer 3 package (http://mummer.sourceforge.net). For each species, the contig sequences and the locations of their alignments on the *X. maculatus* chromosomes were recorded. A customized Perl script then organized these sequences and the alignment information into chromosomes.

*3.3.3. Genome annotation*

To annotate the newly assembled *X. couchianus* and *X. hellerii* genomes, two methods were used, the rapid annotation transfer tool (RATT) and *de novo* assembled transcriptomes, and the results from each were compared to each other.

Transcript sequences and associated functional annotations can be transferred between closely related species. A modified gene annotation method, RATT, was applied using the *X. maculatus* genome and gene models as a reference to quickly transfer genome annotation [27]. Since the *X. maculatus* genome was already available, using RATT to transfer annotation minimizes the computational and human resources required for genome annotation. The *X. couchianus* and *X. hellerii* genomic scaffold sequences were each used as the query and aligned to the well-annotated *X. maculatus* genome using Nucmer 3.0 with the parameters implemented by RATT for annotation transfer. To avoid frame shifts between species, synteny between each query species and the reference was established, and insertions/deletions were identified. *X. maculatus* gene models were then transferred to both query species and corrected. Of the 20,482 gene models annotated in the *Xiphophorus* genome, 20,300 and 20,325 of them were transferred to *X. couchianus* and *X. hellerii*, respectively (Table 4).

For comparison with the RATT annotation transfer method, the *X. couchianus* and *X. hellerii* genomes were also annotated with a different method using *de novo* assembled transcriptomes. This method is independent of a reference genome. Briefly, RNA samples from one-month-old whole fish of *X. hellerii* and *X. couchianus* and from a collection of tissues of mature individuals of each species were sequenced on the Illumina GAIIx platform as 60 bp paired-end reads and on the HiSeq-2000 platform as 100 bp paired-end reads. *De novo* transcript assembly and reporting of putative transcripts were performed using Velvet v1.1.05 and Oases v0.1.22 [28, 29]. The transcriptome assembly resulted in 110,604 and 242,675 transcripts for *X. couchianus* and *X. hellerii*, respectively.

Comparing the two annotation methods in terms of transcriptome quality, the *de novo* method produced much larger transcriptomes in number of transcripts and final

**Table 4.** Comparisons between reference-based annotation and *de novo*-based annotation

*3.3.4. Transposable elements analysis*

70 Next Generation Sequencing - Advances, Applications and Challenges

As found previously, transposable elements (TEs) make up ~5% of the *X. maculatus* transcriptome [3]. Although the percentage of TEs is only slightly higher than in the compact genomes of puffer fishes and is close to that of the chicken genome, the *X. maculatus* genome has a high diversity of TE families [3, 30, 31].

To annotate the TEs in the *X. couchianus* and *X. hellerii* genomes, a previously established library was extended using the RepeatScout (http://bix.ucsd.edu/repeatscout/) and RepeatModeler (http://www.repeatmasker.org/RepeatModeler.html) software. Redundant sequences were discarded, leaving 1019 sequences in the new library. RepeatMasker (http://www.repeatmasker.org/) was subsequently used to mask the genome assemblies. A custom Perl script was then used to establish repeat coverage and copy numbers. After removing TE sequences that are smaller than 80 nt and share less than 80% identity with the reference library, TEs were found to make up ~12% of each *Xiphophorus* genome (*X. maculatus*, 12.11%; *X. couchianus*, 12.61%; *X. hellerii*, 12.14%; unpublished data). A detailed classification of TEs in each *Xiphophorus* genome is shown in Table 5.
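The filtering and coverage step described above can be illustrated with a minimal Python sketch. It is not the custom Perl script used in the study. The `Hit` record, the function names, the toy hits, and the 1000 bp genome size are all hypothetical, and the sketch assumes that a hit is retained only when it is at least 80 nt long and at least 80% identical to its library sequence. It also assumes that overlapping hits should be merged so that no base is counted twice when computing coverage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One masked repeat hit (hypothetical record, not a RepeatMasker format)."""
    seq: str         # scaffold or chromosome name
    start: int       # 1-based inclusive start
    end: int         # 1-based inclusive end
    family: str      # TE family label from the library
    identity: float  # percent identity to the library sequence


def keep(hit, min_len=80, min_identity=80.0):
    """Assumed filter: retain hits that pass both thresholds."""
    length = hit.end - hit.start + 1
    return length >= min_len and hit.identity >= min_identity


def covered_bases(hits):
    """Count bases covered by at least one hit, merging overlaps per sequence."""
    by_seq = defaultdict(list)
    for h in hits:
        by_seq[h.seq].append((h.start, h.end))
    total = 0
    for intervals in by_seq.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:  # overlapping or directly adjacent
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
    return total


def summarize(hits, genome_size):
    """Return (percent of genome covered, copy number per family)."""
    kept = [h for h in hits if keep(h)]
    copies = defaultdict(int)
    for h in kept:
        copies[h.family] += 1
    coverage_pct = 100.0 * covered_bases(kept) / genome_size
    return coverage_pct, dict(copies)


# Toy data for illustration only.
hits = [
    Hit("chr1", 1, 100, "Tc1", 92.0),     # kept: 100 nt, 92% identity
    Hit("chr1", 51, 150, "Tc1", 85.0),    # kept: overlaps the previous hit
    Hit("chr1", 300, 350, "LINE", 95.0),  # dropped: only 51 nt long
    Hit("chr2", 10, 209, "Gypsy", 70.0),  # dropped: identity below 80%
]
coverage_pct, copies = summarize(hits, genome_size=1000)
```

With real data, the hits would be parsed from the masking output and `genome_size` would be the total assembly length. Whether copy numbers should count individual hits or merged loci is a design choice that the text does not specify; the sketch counts individual hits.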


**Table 5.** Transposable elements in *Xiphophorus* genomes
