**3.7.** *De novo* **assembly**

The performance and accuracy of *de novo* transcriptome assembly depend largely on the complexity of the genome (e.g., genome size, number of paralogs, ploidy level), the differential read coverage of the sequenced data, and sequencing error. Transcriptome assembly is complex and differs from genome assembly, in which read coverage is uniform. In RNASeq, by contrast, the abundance of reads varies with gene expression, and isoforms originating from the same gene can have different expression levels. This poses a significant challenge for estimating abundance, especially for lowly expressed genes when the sequencing depth is too low. In general, *de novo* transcriptome assembly requires much higher sequencing depth than reference-based transcriptome assembly.
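The effect of expression level on read coverage can be illustrated with a rough calculation. The sketch below is not taken from any assembler. It assumes an idealized model in which reads are sampled in proportion to transcript length times relative abundance, and it ignores sequencing bias and edge effects. The function name `expected_coverage`, the isoform names, and all numbers are hypothetical.

```python
def expected_coverage(total_reads, read_length, transcripts):
    """Estimate mean per-base coverage for each transcript.

    transcripts maps a name to (length_nt, relative_abundance).
    Assumption: reads are drawn in proportion to length * abundance.
    """
    # Sampling weight of each transcript under the idealized model
    weights = {
        name: length * abundance
        for name, (length, abundance) in transcripts.items()
    }
    total_weight = sum(weights.values())

    coverage = {}
    for name, (length, _abundance) in transcripts.items():
        # Expected number of reads assigned to this transcript
        reads = total_reads * weights[name] / total_weight
        # Mean coverage = sequenced bases / transcript length
        coverage[name] = reads * read_length / length
    return coverage


# Two hypothetical isoforms of equal length, differing 100-fold in abundance
example = {
    "isoform_A": (2000, 100.0),
    "isoform_B": (2000, 1.0),
}
cov = expected_coverage(total_reads=101_000, read_length=100, transcripts=example)
for name, value in cov.items():
    print(f"{name}: {value:.1f}x")
```

Under these assumptions the two isoforms differ 100-fold in coverage even though they were sequenced in the same library, which is why a depth adequate for the abundant isoform can leave the rare one too thinly covered to assemble.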

*De novo* transcriptome assembly generally takes more time and is more computation-intensive than reference-based assembly [131]. The number of transfragments produced by the *de novo* approach is quite high, which can reflect multiple similar transcripts/isoforms at a locus arising from allelic variation, or can be due to artifacts. Additionally, the contiguity and completeness of a *de novo* assembled transcriptome are lower than those of a reference-based assembly, especially for data with low sequencing depth [132].
