60 Mitochondrial DNA - New Insights

This hypothesis of mutations directed at stop codons is in line with observations that polymerase errors are more frequent in stop codon contexts, interpreted as an adaptive bias to introduce mutations in stops [75]. In the next section, GenBank is explored to detect further mitogenomes in which genetic codes were switched by producing stop codons in ORFs and stop-depletion in other frames.

**2. Methods**

**2.1. Exploring GenBank for genetic code switches**

Previous Blastp searches found proteins already described in GenBank that align with hypothetical peptides translated from randomly chosen frameshifted vertebrate mitochondrial genes. These analyses detected the unusual proteins translated from ORFs of the mitogenome of *Lepidochelys olivacea* [9]. In these cases, the regular mitochondrial proteins are coded in frames that include stops, and hence these frames were not recognized as the regular genes. The annotated frame is stopless, but codes for other, unknown peptides. These other peptides are homologous to peptides translated after frameshift from regular mitochondrial protein coding genes of other mitogenomes that did not undergo stop codon depletion in non-ORF frames.

**2.2. Choice of seed sequences for BLAST searches**

The method described above only detects homologies for sequences sufficiently similar to the "seed" sequences used for BLAST analyses of GenBank. Therefore, using the human mitogenome as seed, mainly vertebrate proteins were detected, as for the above-mentioned *Lepidochelys olivacea*. A similar situation occurs for the detection of swinger DNA/RNA sequences: the original searches, using as seed swinger-transformed versions of the human mitogenome, only detected vertebrate sequences [45], but BLAST analyses using a randomly chosen invertebrate mitogenome, from the North Pacific krill *Euphausia pacifica* (NC\_016184), detected numerous additional swinger sequences, from insect mitogenomes [38].

This search principle for insect nucleotide sequences can also be applied to proteins. As seeds, I use the peptides translated from the five "noncoding" frames of each of the 13 regular protein coding genes of the previously randomly chosen invertebrate mitogenome of *Euphausia*. These 65 peptides were blasted to search GenBank for proteins already described and with high homologies with peptides translated from *Euphausia*'s noncoding frames.

**3. Results**

**3.1. Preliminary results from** *Aleurodicus dispersus*

Preliminary BLAST analyses of peptides translated from noncoding frames of *Euphausia*'s mitochondrial protein coding genes detected GenBank proteins from the mitogenome of *Aleurodicus dispersus*, a sap-sucking spiraling whitefly. These have high homology levels with peptides translated from the antisense sequence of several of *Euphausia*'s protein coding genes. These unusual CDs in this insect mitogenome are reminiscent of previously described unusual CDs in the mitogenome of the marine turtle *Lepidochelys olivacea*. This justifies detailed analyses of peptides translated from the six frames of the 13 protein coding genes of the mitogenome of *Aleurodicus dispersus* (JX566506) and, for comparative purposes, of its closest relative with a complete mitogenome in GenBank, *Aleurodicus dugesii* (NC\_005939), whose predicted proteome seems coded according to regular rules.

#### **3.2.** *Aleurodicus dispersus* **protein coding genes**

All six frames of the 13 mitochondrial protein coding genes of *Aleurodicus dispersus* were translated according to the regular invertebrate mitochondrial genetic code. First, peptides translated from GenBank-annotated, stopless ORFs were analyzed by BLAST to verify which of these peptides are "normal," i.e., have regular homologies with the corresponding protein predicted for the regular ORF of the mitogenome of *Aleurodicus dugesii*.
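The six-frame translation step can be illustrated with a minimal sketch. It assumes that the "regular invertebrate mitochondrial genetic code" corresponds to NCBI translation table 5. The function names, the frame labels (`+1` to `+3` for the sense strand, `-1` to `-3` for the antisense strand), and the short test sequence are hypothetical and chosen for illustration only; this is not the software used for the analyses reported here.

```python
from itertools import product

# Codon order follows the NCBI convention: bases T, C, A, G, third position varying fastest.
BASES = "TCAG"

# Amino acids of NCBI translation table 5 (invertebrate mitochondrial code).
# It differs from the standard code at AGA/AGG (Ser), ATA (Met) and TGA (Trp);
# only TAA and TAG remain stop codons, written "*".
AA_TABLE_5 = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSSSS"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AA_TABLE_5)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate the three sense and three antisense frames of a gene.

    Frame labels are an illustrative convention, not taken from the chapter.
    """
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


def stop_count(peptide):
    """Number of stop codons in a translated frame; 0 means the frame is stopless."""
    return peptide.count("*")


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any mitogenome.
    for label, peptide in six_frame_translation("ATGATATGAAGA").items():
        print(label, peptide, stop_count(peptide))
```

Counting stops per frame, as in `stop_count`, is one simple way to flag which of the six frames of a gene is stopless before submitting its peptide to Blastp.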

These analyses confirm that the GenBank-annotated ORFs of the six *Aleurodicus dispersus* mitogenes COI, COII, AT6, COIII, ND3, and ND2 code for typical invertebrate proteins homologous with the corresponding proteins in regular insect mitogenomes, notably that of *Aleurodicus dugesii*. The remaining seven genes follow different coding structures described below, based on frameshifts and/or stop depletion/translation. Blastp does not detect any homologies for proteins predicted according to GenBank annotations for genes AT8, ND1, ND6, ND5, ND4, and ND4L, and detects only partial homology for CytB.

Mitochondrial metabolism without the regular proteins usually translated from these seven genes seems impossible. Regular analyses of the mitogenome of *Aleurodicus dugesii* detect these proteins as they are annotated in GenBank. The possibility that these genes were transferred in *Aleurodicus dispersus* to the nucleus and that proteins are imported to the mitochondrion seems unlikely as ORFs occur at positions corresponding to gene locations coding for the seven missing proteins in the predicted mitoproteome of *Aleurodicus dispersus*.
