Checking for numts and avoiding them

Numt symptoms

Avoiding numts

Testing whether there is more than one mtDNA-like sequence

Establishing whether extra mtDNA-like sequences are not mitochondrial

 

Methods for mtDNA purification suggested in a mailing list

Other ways to separate mtDNA from numts

Establishing the nuclear location of mtDNA-like regions

Numts may not always give ambiguous sequence

References

 

Numt symptoms

Symptoms of Numt contamination include PCR ghost bands; extra bands in restriction profiles (Sorenson and Quinn, 1998); sequence ambiguities, particularly if they are at polymorphic sites or are encountered when sequencing from both strands (Sunnucks and Hales, 1996; Bensasson, 1999); frameshift mutations; stop codons; and an unexpected phylogenetic placement (Collura and Stewart, 1995). Numts may not always give ambiguous sequence (see below).
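Two of these symptoms, stop codons and frameshifts, can be screened for computationally once a protein-coding sequence is in hand. The following is only a minimal sketch: the function names are hypothetical, the reading frame and reference length are assumed to be known, and the stop codon sets are those of the standard vertebrate and invertebrate mitochondrial genetic codes. A clean result does not rule out a Numt, since recent Numts may carry no such mutations.

```python
# Stop codons in the vertebrate and invertebrate mitochondrial genetic codes
# (TGA encodes tryptophan in both, so it is not treated as a stop here).
MITO_STOPS = {
    "vertebrate": {"TAA", "TAG", "AGA", "AGG"},
    "invertebrate": {"TAA", "TAG"},
}


def internal_stops(seq, frame=0, code="vertebrate"):
    """Return 0-based positions of in-frame stop codons, skipping the last codon."""
    seq = seq.upper().replace("-", "")
    stops = MITO_STOPS[code]
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # The last complete codon may be the gene's genuine stop, so it is ignored.
    return [frame + 3 * k for k, codon in enumerate(codons[:-1]) if codon in stops]


def length_suggests_frameshift(seq, reference_length):
    """True if the ungapped length differs from the reference by a non-multiple of 3."""
    return (len(seq.replace("-", "")) - reference_length) % 3 != 0
```

For example, internal_stops("ATGAGACCCTAA") reports the AGA codon at position 3 under the vertebrate code, but nothing under the invertebrate code, so the choice of genetic code matters.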

Avoiding numts

Zeh et al (2003) used the uniparental (maternal) inheritance of their mtDNA sequence in the harlequin beetle-riding pseudoscorpion to show that the sequence they obtained was not nuclear. Another method, frequently used because it avoids the need to establish firmly whether the problem is one of Numts, is to amplify the mtDNA region in question again, using different primers and amplifying a larger product. This may involve long PCR (see below) or just a larger PCR product (e.g. Nardi et al, 2003). The methods described below are useful for characterizing Numts, but they also suggest ways of separating mtDNA from nuclear DNA: for example, RT-PCR, long PCR, or cloning and sequencing PCR products.

Testing whether there is more than one mtDNA-like sequence

Which of these methods is chosen will depend on facilities, time and the suspected Numt proportion (unfortunately, false negatives are common for all approaches).

·      Cutting at polymorphism (CAPing). Cut the PCR product with restriction enzymes whose recognition sites fall on sites of ambiguous sequence. Is there more than one restriction profile after electrophoresis (Hu and Thilly, 1994)? Partial digestion can be controlled for using a restriction enzyme approach (Zhang and Hewitt, 1996b), or the other methods mentioned below can be used to characterise the Numts further. This approach is quick and easy, and it allows Numts to be detected even when there are multiple Numts and the mtDNA proportion is too high for single-copy Numts to be visualised on electrophoresis gels (e.g. in the case of grasshoppers; Bensasson et al, 2000b).

·      Single-stranded conformation polymorphism (SSCP) (Sunnucks et al, 2000). Run the PCR product on an SSCP gel to check for multiple bands or smears. This method has the advantage that it is quick and may also give a quantitative measure of the number of Numts.

·      Cloning and sequencing PCR products. Many clones will be needed if the proportion of Numts is small relative to mtDNA and no other approach is used to select which clones to sequence. However, this method has the advantage that the data gained are unambiguous, can be used to determine whether the extra sequences are nuclear, and are useful for many applications, especially if the PCR product that is cloned is amplified with a high-fidelity polymerase. For example, Numt sequence data obtained by cloning and sequencing mtDNA-like PCR products can be used to reconstruct ancestral mtDNA states (Hu and Thilly, 1994), to root mtDNA phylogenies (Quinn, 1992; Zischler et al, 1995), to characterise spontaneous nuclear mutation (Bensasson et al, 2000a), or for continued checking for contaminants in that species (Vartanian and Wain-Hobson, 2002).
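How many clones "many" means for the cloning approach can be roughly estimated. The sketch below assumes that each sequenced clone is an independent draw from the PCR product and that Numt and mtDNA fragments clone equally well; both are simplifying assumptions, and the function names are hypothetical. Under those assumptions the chance of sampling at least one Numt among n clones is 1 - (1 - p)^n, where p is the Numt fraction of the product.

```python
import math


def prob_detect(numt_fraction, n_clones):
    """Probability that at least one of n_clones carries a Numt insert."""
    return 1.0 - (1.0 - numt_fraction) ** n_clones


def clones_needed(numt_fraction, confidence=0.95):
    """Smallest number of clones giving at least `confidence` of sampling a Numt."""
    if not 0.0 < numt_fraction < 1.0:
        raise ValueError("numt_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - numt_fraction))
```

With these assumptions, a Numt fraction of 10% calls for 29 clones to reach 95% confidence of seeing at least one Numt, while a fraction of 50% calls for only 5. Detecting every distinct Numt in a product that contains several would require more.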

Establishing whether extra mtDNA-like sequences are not mitochondrial

Extra mtDNA-like sequences in a PCR product could represent heteroplasmic mtDNA, duplications within the mitochondrial genome, or non-mitochondrial and non-nuclear “episomal” DNA (see Zhang and Hewitt, 1996c; Sunnucks and Hales, 1996; and Mirol et al, 2000 for fuller discussion).

Numts can be avoided, and their non-mitochondrial location established, if the proportion of amplified mtDNA is increased. This can be done by purifying mitochondria prior to DNA extraction (Sunnucks and Hales, 1996; Zhang and Hewitt, 1996b; see also the methods for mtDNA purification suggested in a mailing list) or by long PCR amplification (Sorenson and Quinn, 1998; Sato et al, 1999; Hwang et al, 2001). Detailed protocols and discussion of an mtDNA purification approach and long PCR are also given in Bensasson (1999, PhD thesis). Another approach is to use tissue that is rich in mtDNA relative to nuclear DNA (Sorenson and Quinn, 1998; Greenwood and Paabo, 1999). For example, Numts sometimes amplify preferentially from the nucleated blood of birds (Sorenson and Quinn, 1998) and from mammalian hair, while mtDNA is more likely to predominate in amplifications from muscle or liver (Sorenson and Quinn, 1998), or from mammalian blood (Greenwood and Paabo, 1999). Tissues particularly rich in mtDNA, such as amphibian or fish oocytes, are listed in Dowling et al (1996).

Although all of these methods can be effective, none is guaranteed. Numts may still be amplified in long PCR products (Sorenson and Quinn, 1998; Bensasson et al 1999); when primers are more specific to the pseudogene sequence, Numts can still be amplified from purified mtDNA (Collura and Stewart, 1995); and sometimes there is no discernible difference in mtDNA proportion among somatic tissues (Bensasson et al, 2000b).

Other ways to separate mtDNA from numts

Numts can be avoided by RT-PCR (Collura et al, 1996; Herrnstadt et al, 1999; Williams and Knowlton, 2001), but occasionally Numts are transcribed (Blanchard and Schmidt, 1995; Blanchard and Schmidt, 1996). Where mtDNA and Numt sequences are known and mtDNA is monophyletic, mtDNA-specific primers can be designed (Sorenson and Quinn, 1998); where Numts are monophyletic, they can be digested with restriction enzymes prior to PCR (Zhang and Hewitt, 1996a; Sorenson and Quinn, 1998). Alternatively, if PCR products are cloned and sequenced, it is sometimes possible to infer from the mode of evolution observed among individual clones which sequence is mitochondrial and which are nuclear (Lemos et al, 1999; Mirol et al, 2000; Vartanian and Wain-Hobson, 2002).

Establishing the nuclear location of mtDNA-like regions

The nuclear location of mitochondrial pseudogenes has also been established by fluorescent in situ hybridisation (FISH) (Lopez et al, 1994; Vaughan et al, 1999), by enriching for nuclear DNA (Zhang and Hewitt, 1996b), and by comparison to tissues rich in nuclear DNA (e.g. sperm heads) (Sorenson and Quinn, 1998). For metazoans, a non-mitochondrial location can also be established by screening large-fragment genomic libraries (Yuan et al, 1999) or by Southern blot analysis (Zhang and Hewitt, 1996c). In the case of humans, PCR products have been compared to those amplified from tissues grown on ethidium bromide so that they have few or no mitochondria (Wallace et al, 1997), and Numts have also been mapped using human/rodent cell lines (Zischler et al, 1995; Zischler et al, 1998).

Numts may not always give ambiguous sequence

PCR products obtained for aphids (Sunnucks and Hales, 1996), grasshoppers (Bensasson et al, 2000b), elephants (Greenwood and Paabo, 1999), gorillas (Jensen-Seaman, 2000), and gall wasps (Antonis Rokas, personal communication) contained many different Numt sequences in addition to the true mitochondrial sequence. Although in all these cases the mtDNA sequence was usually the single most common mtDNA-like region, it represented less than 50% of the PCR product. If these PCR products had been sequenced directly, the sequence obtained would probably not have been mitochondrial, and would not necessarily have been ambiguous. Sequence obtained by direct sequencing of a Numt-mtDNA mixture will only be ambiguous at a site if the amplified sequences that differ from each other at that site are present in approximately equal proportions. At sites where the mtDNA has changed recently, most Numts will have retained the ancestral state, so the ancestral state may still represent the majority of sequences in the PCR product. At other sites, which the mtDNA shares with many recent Numts, the derived (mitochondrial) state may represent the majority of sequences. If the majority base is called at each of these sites, the resulting sequence will be the majority-rule consensus sequence, and this may not have a physical existence.
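The point about the majority-rule consensus can be illustrated with a toy calculation. The sequences and proportions below are invented purely for illustration, and calling the most abundant base at each site is only a crude stand-in for how a direct sequencing read is interpreted. The mtDNA is the commonest single sequence (40%) but makes up less than half of the mixture.

```python
from collections import Counter


def majority_consensus(mixture):
    """Call the most abundant base at each site of an aligned, weighted mixture."""
    length = len(mixture[0][0])
    consensus = []
    for site in range(length):
        totals = Counter()
        for seq, proportion in mixture:
            totals[seq[site]] += proportion
        consensus.append(totals.most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical aligned sequences with their proportions in the PCR product.
mixture = [
    ("ATGCC", 0.40),  # mtDNA: recent change at site 1 (G -> A)
    ("GTACC", 0.30),  # recent Numt: shares T at site 2 with mtDNA, own change at site 3
    ("GCGCC", 0.30),  # older Numt: retains the ancestral state at sites 1 and 2
]

consensus = majority_consensus(mixture)
exists_physically = consensus in [seq for seq, _ in mixture]
```

Here the consensus is GTGCC, which matches none of the three sequences in the mixture, and no site has two bases at near-equal proportions, so the read would not look ambiguous.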

References

Bensasson D (1999). A study of mitochondrial-like DNA in Podisma pedestris and other grasshoppers. PhD thesis, School of Biological Sciences, University of East Anglia, Norwich.

Bensasson D, Petrov DA, Zhang D-X, Hartl DL, Hewitt GM (2000a). Genomic gigantism: DNA loss is slow in mountain grasshoppers. Molecular Biology and Evolution 18: 246-253.

Bensasson D, Zhang D-X, Hewitt GM (2000b). Frequent assimilation of mitochondrial DNA by grasshopper nuclear genomes. Molecular Biology and Evolution 17: 406-415.

Blanchard JL, Schmidt GW (1995). Pervasive Migration of Organellar DNA to the Nucleus in Plants. Journal of Molecular Evolution 41: 397-406.

Blanchard JL, Schmidt GW (1996). Mitochondrial DNA Migration Events in Yeast and Humans: Integration by a Common End-joining Mechanism and Alternative Perspectives on Nucleotide Substitution Pattern. Molecular Biology and Evolution 13: 537-548.

Collura RV, Auerbach MR, Stewart CB (1996). A quick direct method that can differentiate expressed mitochondrial genes from their nuclear pseudogenes. Current Biology 6: 1337-1339.

Collura RV, Stewart C-B (1995). Insertions and duplications of mtDNA in the nuclear genomes of Old World monkeys and hominoids. Nature 378: 485-489.

Dowling TE, Moritz C, Palmer JD, Rieseberg LH (1996) Nucleic Acids III: Analysis of Fragments and Restriction Sites. In: Hillis, Moritz (eds) Molecular Systematics. Sinauer Associates, Inc.: Sunderland, MA, p 249-320

Greenwood A, Paabo S (1999). Nuclear insertion sequences of mitochondrial DNA predominate in hair but not in blood of elephants. Molecular Ecology 8: 133-137.

Herrnstadt C, Clevenger W, Soumitra SG, Anderson C, Fahy E, Miller S, Howell N, Davis RE (1999). A Novel Mitochondrial DNA-like Sequence in the Human Nuclear Genome. Genomics 60: 67-77.

Hu G, Thilly WG (1994). Evolutionary trail of the mitochondrial genome as based on human 16S rDNA pseudogenes. Gene 147: 197-204.

Hwang UW, Park CJ, Yong TS, Kim W (2001). One-step PCR amplification of complete arthropod mitochondrial genomes. Mol. Phyl. Evol. 19: 345-353.

Jensen-Seaman MI (2000) Evolutionary Genetics of Gorillas. Yale University, New Haven, CT.

Lemos B, Canavez F, Moreira MAM (1999). Mitochondrial DNA-like sequences in the nuclear genome of the opossum genus Didelphis (Marsupialia: Didelphidae). Journal of Heredity 90: 543-547.

Lopez JV, Yuhki N, Masuda R, Modi W, O'Brien SJ (1994). Numt, a Recent Transfer and Tandem Amplification of Mitochondrial DNA to the Nuclear Genome of the Domestic Cat. Journal of Molecular Evolution 39: 174-190.

Mirol PM, Mascheretti S, Searle JB (2000). Multiple nuclear pseudogenes of mitochondrial cytochrome b in Ctenomys (Caviomorpha, Rodentia) with either great similarity to or high divergence from the true mitochondrial sequence. Heredity 84: 538-547.

Nardi F, Carapelli A, Dallai R, Frati F (2003). The mitochondrial genome of the olive fly Bactrocera oleae: two haplotypes from distant geographical locations. Insect Molecular Biology 12: 605-611.

Quinn TW (1992). The genetic legacy of Mother Goose - phylogeographic patterns of lesser snow goose Chen caerulescens caerulescens maternal lineages. Molecular Ecology 1: 105-117.

Sato A, O'hUigin C, Figueroa F, Grant PR, Grant BR, Tichy H, Klein J (1999). Phylogeny of Darwin's finches as revealed by mtDNA sequences. Proc. Natl. Acad. Sci. USA 96: 5101-5106.

Sorenson MD, Quinn TW (1998). Numts: A challenge for avian systematics and population biology. The Auk 115: 214-221.

Sunnucks P, Hales DF (1996). Numerous Transposed Sequences of Mitochondrial Cytochrome Oxidase I-II in Aphids of the Genus Sitobion (Hemiptera: Aphididae). Molecular Biology and Evolution 13: 510-524.

Sunnucks P, Wilson ACC, Beheregaray LB, Zenger K, French J, Taylor C (2000). SSCP is not so difficult: the application and utility of single-stranded conformation polymorphism in evolutionary biology and molecular ecology. Molecular Ecology 9: 1699-1710.

Vartanian J-P, Wain-Hobson S (2002). Analysis of a library of macaque nuclear mitochondrial sequences confirms macaque origin of divergent sequences from old oral polio vaccine samples. PNAS 99: 7566-7569.

Vaughan HE, Heslop-Harrison JS, Hewitt GM (1999). The localization of mitochondrial sequences to chromosomal DNA in orthopterans. Genome 42: 874-880.

Wallace DC, Stugard C, Murdock D, Schurr T, Brown MD (1997). Ancient mtDNA sequences in the human nuclear genome: A potential source of errors in identifying pathogenic mutations. Proc. Natl. Acad. Sci. USA 94: 14900-14905.

Williams ST, Knowlton N (2001). Mitochondrial pseudogenes are pervasive and often insidious in the snapping shrimp genus Alpheus. Molecular Biology and Evolution 18: 1484-1493.

Yuan JD, Shi JX, Meng GX, An LG, Hu GX (1999). Nuclear pseudogenes of mitochondrial DNA as a variable part of the human genome. Cell Research 9: 281-290.

Zeh JA, Zeh DW, Bonilla NM (2003). Phylogeography of the harlequin beetle-riding pseudoscorpion and the rise of the Isthmus of Panamá. Molecular Ecology 12: 2759-2769.

Zhang D-X, Hewitt GM (1996a). An effective method for allele-specific sequencing using restriction enzyme and biotinylation (ASSURE B). Molecular Ecology 5: 591-594.

Zhang D-X, Hewitt GM (1996b). Highly conserved nuclear copies of the mitochondrial control region in the desert locust Schistocerca gregaria: some implications for population studies. Molecular Ecology 5: 295-300.

Zhang D-X, Hewitt GM (1996c). Nuclear integrations: challenges for mitochondrial DNA markers. Trends Ecol. Evol. 11: 247-251.

Zischler H, Geisert H, von Haeseler A, Paabo S (1995). A nuclear 'fossil' of the mitochondrial D-loop and the origin of modern humans. Nature 378: 489-492.

Zischler H, Geisert H, Castresana C (1998). A hominoid-specific nuclear insertion of the mitochondrial D-loop: implications for reconstructing ancestral mitochondrial sequences. Molecular Biology and Evolution 15: 463-469.
