*4.3.1 UNG*

In humans, two splice variants of uracil-N-glycosylase (UNG) are expressed from the *UNG* gene. The two isoforms are identical in sequence except for their N-termini, which are unique to each protein. The names UDG and UNG are often used interchangeably for this protein; in this text, however, we refer to the protein as UNG, and the superfamily encompassing all the uracil-targeting DNA glycosylases will continue to be referred to as UDG. UNG1 is expressed constitutively in the mitochondria, first as a 35 kDa precursor that is then processed at the N-terminus to a 29 kDa protein, whereas UNG2 is a 36 kDa serine/threonine phosphoprotein located in the nucleus that retains its N-terminal sequence. In human cells, U detection and removal is predominantly carried out by UNG, since it has the highest activity on single-stranded (ss)DNA and is very active on double-stranded (ds)DNA compared with the other UDGs present in human cells (SMUG1, TDG and MBD4) [50, 58, 59]. Additionally, UNG has a turnover at least 10¹–10³ times higher than that of the other three human UDGs [50, 59, 60]. UNG2 is the only DNA glycosylase in the nucleus that is able to remove U from U:A pairs close to passing replication forks [61, 62], and UNG is mostly located in replication foci during S-phase [59, 61, 63]. Because UNG is so efficient compared with the other human UDGs, it is thought to be predominantly responsible for removing U:G mismatches [59, 60]; however, this has not been directly reported except for U:G mismatches produced by activation-induced cytidine deaminase (AID), which are critical in the adaptive immune system.

#### **Figure 2.**

*Detection and removal of uracil in DNA by the base excision repair pathway. A simplified schematic showing the two arms of the base excision repair (BER) pathway in the repair of uracil that is either misincorporated (dU:dA) or mismatched (dU:dG) in DNA. A uracil DNA glycosylase, UDG/SMUG1/TDG/MBD4 in humans, detects and 'flips' out the uracil base from the DNA and cleaves it, leaving an abasic site. AP endonuclease (APE) then nicks the DNA backbone 5′ to the abasic site, creating a single-strand DNA break. From there, the BER pathway proceeds down either short patch BER or long patch BER depending on the type of damage, the stage of the cell cycle, and the cell differentiation state. In short patch BER, DNA Polβ fills in the gap at the abasic site with the correct base and also cleaves the deoxyribosephosphate (5′ dRP) left over from the abasic site. After DNA polymerisation, a single-strand DNA break remains and is ligated by LIG3:XRCC1. In long patch BER, DNA Polδ:Polε:PCNA inserts multiple bases from the abasic site, creating a 'flap' of single-stranded DNA. FEN1 cleaves this flap and LIG1 seals the single-strand DNA break left over from the process. Abbreviations: 5′ dRP = deoxyribosephosphate, dA = deoxyadenosine, dG = deoxyguanosine, dU = deoxyuridine, dT = deoxythymidine, APE = apurinic/apyrimidinic endonuclease, BER = base excision repair, DNA Polβ = DNA polymerase β, FEN1 = flap endonuclease 1, LIG = DNA ligase, MBD4 = methyl-CpG-binding domain protein 4, SMUG1 = single-strand selective monofunctional uracil DNA glycosylase, TDG = thymine DNA glycosylase, UDG = uracil DNA glycosylase, and XRCC1 = X-ray repair cross-complementing protein 1.*

When an infection occurs in humans, the adaptive immune system tries to generate antibodies specific to the antigen, allowing the infection to be cleared efficiently. B lymphocytes are responsible for generating antibodies, but to generate specific antibodies they need a mechanism to induce heterogeneity, which is achieved via somatic hypermutation (SHM) and class switch recombination (CSR) [29]. The *Ig* locus encodes the heavy chain of an antibody: the Ig variable region (the part of the heavy chain that directly binds to the antigen) and the Ig constant region (which determines the class of antibody, which could be IgM, IgG, IgA or IgE). AID deaminates Cs in specific regions of this locus [31]. During CSR, UNG2 targets U and, with APE, generates abasic sites with single-strand breaks. Since AID generates clustered regions of U-containing DNA, double-strand breaks can form once UNG2 and AP endonuclease have processed enough of them. This triggers non-homologous end joining, which connects the Ig variable region to a new constant region and thereby determines the class of the antibody. During SHM, AID introduces U into the Ig variable region and UNG2 removes it, producing abasic sites; error-prone polymerases then introduce mutations that alter the DNA sequence. Overall, this means that different B lymphocytes produce unique antibodies encoded by their *Ig* loci. The unique antibody is displayed on the surface of the B lymphocyte, and if that antibody has affinity for the antigen presented to it, the cell survives and gives rise to plasma cells, which produce the cloned antibody and allow the adaptive immune system to target the antigen [64].

In addition to its role in the adaptive immune system, UNG is also involved in the innate immune system. Virally infected cells are exposed to proviral DNA (viral DNA that is yet to become active), which allows the virus to propagate further by hijacking cellular functions. To counteract this, cells express APOBEC3 enzymes (another family of DNA cytosine deaminases) that can associate with proviral DNA before it integrates into the cell's genome and deaminate C→U [65]. This either leads to degradation of the proviral DNA (with the help of AP endonuclease) or, if the proviral DNA does integrate into the host genome, leaves it hypermutated (G:C→A:T) and therefore non-functional. Other functions of UNG may include removal of some oxidation products of C, including alloxan, isodialuric acid and 5-hydroxyuracil, but it is unknown whether this occurs *in vivo* [66]. Additionally, UNG2 might be involved in TET-mediated demethylation of cytosine, which would reverse the epigenetic silencing of certain genes [67].
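The G:C→A:T hypermutation signature described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not a description of any published tool: the function names (`reverse_complement`, `deaminate`, `hypermutate`) are hypothetical, the deaminase is assumed to act on the strand complementary to the one being read, every targeted C is assumed to be converted, and uracil excision by UNG is deliberately ignored so that each U simply templates an A when the strand is copied.

```python
from typing import Iterable, Optional

# Base-pairing rules; U pairs like T, so it templates A during copying.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def deaminate(strand: str, positions: Optional[Iterable[int]] = None) -> str:
    """Convert C to U at the given 0-based positions (all Cs if None).

    Positions that do not hold a C are left unchanged.
    """
    targets = None if positions is None else set(positions)
    return "".join(
        "U" if base == "C" and (targets is None or i in targets) else base
        for i, base in enumerate(strand)
    )


def hypermutate(plus_strand: str, positions: Optional[Iterable[int]] = None) -> str:
    """Toy model: deaminate the complementary strand, then copy it.

    Assumption: no uracil is excised, so each U templates an A and the
    original G:C pair is read out as A:T on the returned strand.
    """
    minus_strand = reverse_complement(plus_strand)
    edited_minus = deaminate(minus_strand, positions)
    return reverse_complement(edited_minus)


if __name__ == "__main__":
    original = "ATGGCC"
    mutated = hypermutate(original)
    print(original, "->", mutated)
```

In this sketch, each G on the input strand whose partner C was deaminated is read as A after copying, while Cs on the input strand are untouched because only the complementary strand is edited. Real APOBEC3 enzymes show sequence-context preferences that this sketch does not attempt to model.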
