**4. Discussion**

Gene expression profiling via next-generation sequencing has emerged as a tool to assess the biological impact of environmental pollutants and natural chemical stressors. This approach supports better understanding of whole-genome expression responses to chemical stress, aids in the identification of ecological and toxicological modes of action, and provides hundreds (or in some cases thousands) of rigorously tested markers for stress responses. Gene expression profiles thus provide a more comprehensive, sensitive, and actionable insight into toxicity than typical toxicological parameters such as morphological changes, altered reproductive capacity, or mortality [39]. For example, it has been demonstrated that chemicals from the same class of compounds give rise to discernible gene expression profiles that bear more similarity to each other than to patterns corresponding to exposure to compounds from a different class [40]. It would be useful for a database for gene expression responses to environmental pollutants to be developed for recognizing compound-specific expression profiles and molecular signatures of stress responses.

Functional annotations of genes may provide insights into affected regulatory and proliferative and repair/adaptive pathways, with immediate management and conservation implications. To improve the rate of functional annotations, expanded sequencing with more complete sequence information will be required. Recent bioinformatics developments, such as improvements to Trinotate (https://trinotate.github.io/), also hold promise of improved functional annotations. However, we note that complete annotations of affected genes are not necessary. Unannotated transcripts are potentially useful markers not only for compound-specific expression profiling but also candidates for cost-effective and rapid individual biomarker assays based on real-time polymerase chain reaction (PCR), which could be developed for use in the laboratory or even in field settings. Despite the limited sampling, we found dozens of highly significant gene expression markers (after the correction for false discovery rate) for the two stressors tested. As many as 66 genes were expressed only in the presence of the two compounds, with zero transcripts (i.e., no sequences found) in the control group.

We sought quantitative differences between the RNA pools of control and arsenate- or sulfate-exposed mussels. We sought evidence of altered expression of genes known to be linked to toxin response (e.g., heat-shock proteins, glutathione transferase, metallothioneins, cytochrome P450, etc.), and confirmed that some of the pathways, such as poly(ADP-ribose), were indeed induced. We also sought to identify genes whose expression is influenced by toxin exposure, but of which we have no existing knowledge. We found many such putative markers of toxin-induced responses. In this latter case, more complete sequence data will be required to enhance functional annotation, but even if we do not know the functions of all the associated genes, they are still candidate markers for pollutant response.

#### **4.1. Comparisons among biomarker types**

positively correlated with the activity of glutathione-S-transferase, but there was little separation between treatments and control\_2 for both gene expression and enzyme activity. The expression of one annotated gene (c152797\_g1; BRAFLDRAFT\_74879) and one unannotated gene (c126914\_g1) was positively correlated with the concentration of reduced glutathione, and there was good separation between treatments and control\_2. The expression of two unannotated genes (c155139\_g1 and c155139\_g2) was negatively correlated with the concentration of reduced glutathione and there was good separation between the treatments and control\_2. Correlation between the expression of a sixth gene (unannotated; c154720\_g4) did not meet the α = 0.01 criterion for strong correlation, but the plot demonstrated a similar pattern as the other two genes negatively correlated with reduced glutathione concentration, with good separation between treatments and control\_2. For the eight mussels included in these correlations, concentrations of reduced glutathione were higher in the control and lower in the two treatments, whether these genes are directly related to reduced glutathione production or utilization is unknown. The small sample of mussels utilized for genetic analysis limits the interpretation of the relationships but suggests that these genes warrant

Gene expression profiling via next-generation sequencing has emerged as a tool to assess the biological impact of environmental pollutants and natural chemical stressors. This approach supports better understanding of whole-genome expression responses to chemical stress, aids in the identification of ecological and toxicological modes of action, and provides hundreds (or in some cases thousands) of rigorously tested markers for stress responses. Gene expression profiles thus provide a more comprehensive, sensitive, and actionable insight into toxicity than typical toxicological parameters such as morphological changes, altered reproductive capacity, or mortality [39]. For example, it has been demonstrated that chemicals from the same class of compounds give rise to discernible gene expression profiles that bear more similarity to each other than to patterns corresponding to exposure to compounds from a different class [40]. It would be useful for a database for gene expression responses to environmental pollutants to be developed for recognizing compound-specific expression profiles

Functional annotations of genes may provide insights into affected regulatory and proliferative and repair/adaptive pathways, with immediate management and conservation implications. To improve the rate of functional annotations, expanded sequencing with more complete sequence information will be required. Recent bioinformatics developments, such as improvements to Trinotate (https://trinotate.github.io/), also hold promise of improved functional annotations. However, we note that complete annotations of affected genes are not necessary. Unannotated transcripts are potentially useful markers not only for compound-specific expression profiling but also candidates for cost-effective and rapid individual biomarker assays based on real-time polymerase chain reaction (PCR), which could be

further investigation, particularly as related to As(V) exposure.

and molecular signatures of stress responses.

**4. Discussion**

110 Organismal and Molecular Malacology

Exposure to toxic compounds would be expected to affect gene expression, molecular flux through biochemical pathways, and ultimately histological structures in freshwater mussels. Further, the respective markers would be expected to be causally related to one another. Hence, we sought statistical indicators of such correlations among our data sets.

Correlations among histological and genetics data provide provisional bases for hypotheses regarding links among histological lesions and genes identified during this study. While many of the significant correlations involved genes without known function, a few involved genes of known function. Gene LOC101731886 was significantly negatively correlated with fractions of reproductive acini containing mature or developing gametes. The gene has been linked to the activity of poly [ADP-ribose] polymerase (15-like) (*PARP*) [41]. The *PARP* enzyme family is associated with physiological functions during cell division, transcriptional regulation, repair of DNA damage, and cytoplasmic cell response [42]. Females that provided the data for the correlations had completed oogenesis, and oocytes in acini were mostly atretic or resorbing; it seems logical that nucleus-related activities controlled by *PARP* in postgametogenic females would be negatively related to abundances of developing oocytes, since active gametogenesis had ceased. The positive relationship between abundances of lipofuscin in kidney cells and upregulation of gene LOC101731886 (expression of *PARP*) may be related to oxidative stress. Increased production of reactive oxygen species (ROS) has been related to DNA single-strand damage incurred during contaminant exposure and inadequate aquaculture holding conditions [43, 44]. Since abundances of intra-lysosomal lipofuscin granules are related to both contaminant exposure and oxidative stress [45, 46], it is logical that they are positively related to the expression of *PARP* activity to repair DNA.

Abundances of lesions in digestive cells were negatively correlated with genes BRAFL-DRAFT\_118372 (histamine N-methyltransferase activity) and CGI\_10002926 (uncharacterized protein with unknown function) [47–49]. A negative correlation between gene regulation of histamine methylation and lesions in digestive cells seems reasonable, because with increasing abundance of lesions, destruction of cytoplasmic organelles probably occurs. A positive correlation between the expression of Sp-Hypp\_8991 and digestive lesions may be related to activities to repair nuclear damage incurred during treatment and captive holding.

#### **4.2. Implications for future research**

Although there has been some limited application in marine mollusks, and one study has identified candidate stress-response genes in the freshwater mussel *Elliptio complanata* [50], RNAseq has not been evaluated for its utility for monitoring of the responses of freshwater mussels to coal-related environmental contaminants. Successful execution of this project provides proof of principle for using RNAseq technology to approach issues of toxicogenomics in freshwater mussels. Among key findings, our results collectively support the view that substantial changes of gene expression occur before dramatic changes in biochemical and histological effects. Our understanding of how changes in gene expression result in physiological and histological effects on the organism is hampered by the current situation in which many freshwater mollusk genes remain unannotated.

RNAseq-based assessment of global gene expression identified candidate markers for quantitative polymerase chain reaction (qPCR) assays, which could be developed and validated for rapidly assessing single-gene responses to exposure to toxic compounds. As noted above, Hamadeh et al. [40] demonstrated that chemicals from the same class of compounds give rise to discernible gene expression profiles that bear more similarity to each other than to patterns corresponding to exposure to compounds from a different class [40]. It would be useful for a comprehensive database for gene expression responses to environmental pollutants to be developed toward recognizing compound-specific expression profiles and molecular signatures of stress responses. Additional, well-controlled laboratory studies would be appropriate for the purpose of defining defensible biomarkers for toxicant-induced harm in freshwater mussels. Ultimately, practical work would apply (modifications of) the respective approaches to a range of sites in the Clinch, Powell, and other aquatic ecosystems.
