Strategies for cloning unknown cellular flanking DNA sequences from foreign integrants.
Hui EK, Wang PC, Lo SJ.
Abstract
Many virus and transposon DNAs can integrate into the host genome. In this review, techniques, including inverse polymerase chain reaction (IPCR), novel Alu-PCR and vectorette- or splinkerette-PCR are introduced as possible strategies for cloning flanking DNA regions of the integrants. Targeted gene-walking PCR, restriction-site PCR, capture PCR, and panhandle PCR and boomerang DNA amplification are also described. The principles, advantages and limitations of each approach are discussed.
Inverse and splinkerette PCR work really well, and are used almost exclusively to identify the transposon insertion sites.
The one problem with traditional mouse transgenes is that they can form concatamers. If that is the case, you will also isolate flanking transgene DNA. You will have to clone the PCR fragments and sequence individual clones.
I would suggest a combination of Southern blot and a PCR. The PCR primers can be designed properly where one primer is in transgene and another one in the expected flanking site. The product can be sequenced to ensure the correct product. Good luck!!
You might try a fly trick. Digest the genome with multiple common restriction enzymes. Probe for your transgene and remove that fragment(s). Sequence the fragment and BLAST it against the mouse genome. Portions of the fragment that are not your transgene are adjacent mouse genome sequences.
The PCR is not as easy - you have to do something like RACE or inverse PCR to get out of the transgene into the surrounding genome! Southern blot won't help you localize it. Miriam Meisler cloned several insertion sites and wrote methods papers about it - some took years to clone, some were quick. The easiest way to get a rough idea is to do fluorescent probe chromosomal stains - some companies do it. Then you know 3C or some such location - if you want to know whether it's a known gene with the same phenotype or not, that is worth it. But it won't help you clone the break - for that you really have to do the inverse PCR or RACE. I wish I had kept a mouse with a great phenotype and such an insertion myself.... but a postdoc worked for years on it and didn't get it and then gave up.
These solutions are clever, but if you can spend a few grand, sequencing the whole genome is always a possibility. It's getting much cheaper these days, and it will obviously be more accurate.
I agree with several others: this is not an easy task, and you might want to evaluate carefully whether the outcome of the study will provide new and necessary information. Anyway, RACE and/or inverse PCR is your approach. First of all, you need to determine the nature of the integration with a traditional Southern hybridization: are you dealing with multiple copies at a single chromosomal integration site, or multiple copies at multiple integration sites? If there is a single site with multicopy integration, which is the more typical case, your RACE or inverse PCR is relatively straightforward. However, you will face a significant cloning and sequencing task if you have multiple, multicopy chromosomal integration sites, which also often occurs. In either case, unless you have a good reason that justifies your curiosity, I strongly feel that the information you gain is not proportional to the labor and cost invested.
If you want to know into which chromosome(s) your transgenes have inserted, then fluorescent in situ hybridization (FISH) will give you an answer, and tell you things like whether you are too close to the centromere or not. If you want a more specific location and to get information on copy number as well, then whole genome seq is the way to go. If that is too costly, then you could do inverse PCR, but you will likely have to modify this approach if you have multiple insertion sites. I would suggest using a restriction digest that you know is a single cutter within your transgene sequence, thus hopefully minimizing any overly long templates that could arise from tandem repeats. But you are still going to have to do Solexa sequencing, or make a custom chip, or find some other way to get and analyze single reads (unless you get lucky and have only one insertion site). Best of luck.
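The single-cutter suggestion above can be checked on the transgene sequence before going to the bench. The following is a minimal sketch, assuming you have the transgene sequence as a plain string; the enzyme table is a small illustrative subset and the toy sequence is hypothetical, not a real construct.

```python
import re

# Recognition sites for a few common 6-cutters. All are palindromic,
# so scanning one strand is enough. Illustrative subset only.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
}

def count_sites(seq, site):
    """Count occurrences of a recognition site, including overlapping ones."""
    return len(re.findall(f"(?={site})", seq.upper()))

def single_cutters(seq, enzymes=ENZYMES):
    """Return the names of enzymes that cut the sequence exactly once."""
    return sorted(name for name, site in enzymes.items()
                  if count_sites(seq, site) == 1)

# Hypothetical toy transgene: one EcoRI site, two BamHI sites, one HindIII site.
toy_transgene = "ATGCGAATTCTTAGGATCCAAGGATCCTTTAAGCTTGG"
print(single_cutters(toy_transgene))
```

In a tandem array, an enzyme that cuts once per transgene copy releases unit-length internal fragments plus one junction fragment at each end of the array, which is why a single cutter keeps the inverse PCR templates from getting overly long.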
I appreciate all of the feedback. I have 6 different founders so genome sequencing would be far too costly. I like the FISH suggestion because I mostly need to determine which chromosome the transgene inserted into for downstream crosses with knock-out mice. I haven't heard about splinkerette PCR. Bianca - do you have a good reference? Several people said the inverse PCR was very difficult. Since the protocol is straightforward I assume that means it just didn't work. Have others had a different experience?
I have been trying to figure out the best way to do this for some time now. I agree that FISH or modifications thereof may be the most efficient way to get a crude localization of the transgene insertion site. However, you will need some other method of finding the exact location or determining if the insertion disrupted any important coding region. Another consideration is the size of the transgene. If the line was created using a large DNA fragment such as a BAC, it is also possible that an incomplete copy was integrated, which complicates the detection since the ends of the transgene are not guaranteed to be what you expect. You may also wish to see some of these papers describing diverse methodology for finding Tg insertion sites:
Nakanishi et al, 2002 Genomics
"FISH Analysis of 142 EGFP Transgene Integration Sites into the Mouse Genome"
Bryda et al, 2007 Biotechniques
"Method for detection and identification of multiple chromosomal integration sites in transgenic animals created with lentivirus"
Haraguchi & Nakagawara, 2009 PLoS One
"A Simple PCR Method for Rapid Genotype Analysis of the TH-MYCN Transgenic Mouse"
Matsui et al, 2002 Mammalian Genome
"Rapid localization of transgenes in mouse chromosomes with a combined Spectral Karyotyping/FISH technique"
Zhang et al, 2012 PLoS One
"Molecular Characterization of Transgene Integration by Next-Generation Sequencing in Transgenic Cattle"
Bryda & Bauer, 2010 Methods in Molecular Biology
"A Restriction Enzyme-PCR-Based Technique to Determine Transgene Insertion Sites"
Thirulogachandar et al, 2011 Anal Biochem
"An affinity-based genome walking method to find transgene integration loci in transgenic genome."
Thanks for the follow up. I tried several of the different PCR methods to no avail. I tried to find a company that did mouse FISH but couldn't find one. The question isn't worth the cost of WGS for 6 founders. We wound up just testing the crosses to see if they worked - i.e., the tg is not closely linked to our ko strain of interest. All 6 strains gave us positives of our tg in the ko bkgd. So we picked a low and a high copy number founder line to keep backcrossing. Now we just have to add a second tg to the mix and see if we can get all 3 together in both mouse lines. (Which is why I had hoped to figure out the insertion sites in advance.) If not we still have the other founders we can go back to and try. If we succeed then I think it would be well worth it to do WGS on one or both of them.
I agree with you that it is important to look for integration sites as transgene integrated on the same chromosome will not be expressed if you are using homozygous mice.
I wish you best of luck for it and will be interested to read what you figured out!
In the future, if interested in SKY/FISH analysis, check out the Van Andel Institute Cytogenetics core. They perform this analysis on human, mouse and rat samples at a pretty reasonable rate.
During my PhD I used inverse PCR (GATC-overhang-producing HpaII) to map a lot of P-element inserts in Drosophila. By some miracle, enough fragments circularized, instead of ligating end-to-end with other fragments, so that I got a good amount of amplicon from the outward-facing primer pair.
Along with iPCR, I used Southern blotting with a 32P-labeled, transgene-complementary probe on genomic DNA digested with 6-cutters (e.g. BglII and others) to determine whether there was more than one copy of the transgene. My scheme allowed me to resolve tandem arrays, but did not provide sequence information...that was done using iPCR.
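For readers who have not run iPCR, here is a toy string model of why the outward-facing primers only give a product once the restriction fragment has circularized. It is a sketch under simplifying assumptions: exact primer matches, a single fragment, and made-up sequences (`flank`, `known`, and both primers are hypothetical).

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def inverse_pcr_product(fragment, fwd_primer, rev_primer):
    """Top-strand product expected from a self-ligated (circular) fragment.

    fwd_primer matches the top strand and extends to the right;
    rev_primer is the reverse complement of a top-strand site and
    extends to the left. On the linear fragment the reverse site lies
    upstream of the forward site, so the primers point away from each
    other and only meet across the ligation junction of the circle.
    """
    fwd_start = fragment.find(fwd_primer)
    rev_site = revcomp(rev_primer)
    rev_start = fragment.find(rev_site)
    if fwd_start < 0 or rev_start < 0:
        return None  # one of the primers does not anneal
    rev_end = rev_start + len(rev_site)
    if rev_end > fwd_start:
        raise ValueError("primers are not outward-facing on this fragment")
    # Read from the forward primer to the fragment end, wrap around the
    # ligation junction, and continue to the end of the reverse site.
    return fragment[fwd_start:] + fragment[:rev_end]

# Hypothetical toy fragment: unknown genomic flank followed by known transgene end.
flank = "TTGACCGTAAGCTTCA"
known = "GGCATCGATCGTACCTGAAGTCCA"
fragment = flank + known

rev_primer = revcomp(known[:8])   # points left, out of the known region
fwd_primer = known[14:22]         # points right, out of the known region

product = inverse_pcr_product(fragment, fwd_primer, rev_primer)
print(product)
```

The product starts and ends in known transgene sequence and carries the unknown flank in the middle, so sequencing it from either primer reads across the ligation junction into the genomic DNA.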
Today I work with transgenic human cell lines. I had not been able to get iPCR to work. I just read a paper from Potter et al. 2010 "Splinkerette PCR for Mapping Transposable Elements in Drosophila." The technique uses a hairpin dsOligo adapter to provide a primer binding site for PCR of unknown flanking DNA. I am going to try adapting this to human transgene sequencing.
Lastly, I saw a recent paper that used NGS in a clever way to map transgenes in mammals. Prior to alignment of the sequences to the reference genome, they used some sort of algorithm to search for mate-pair sequencing fragments that were part transgene, part genomic. They only used that subset to align back to the genome. Srivastava et al. 2014 "Discovery of transgene insertion sites by high throughput sequencing of mate pair libraries."
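The idea of keeping only reads that are part transgene, part genomic can be illustrated with a much simpler split-read filter. To be clear, this is not the mate-pair algorithm from the cited paper; it is a rough sketch that assumes exact matches, forward-strand reads only, and known transgene ends. The function name, parameters, and toy sequences are all hypothetical.

```python
def junction_flanks(reads, transgene, k=12, min_flank=8):
    """Collect non-transgene sequence next to a transgene end within single reads.

    A read counts as a junction read if it contains the first or last k bases
    of the transgene with at least min_flank bases beyond that end. Flanks that
    are themselves transgene sequence (concatemer junctions) are discarded.
    """
    head, tail = transgene[:k], transgene[-k:]
    flanks = []
    for read in reads:
        # 3' junction: transgene tail followed by something else
        i = read.find(tail)
        if i >= 0:
            flank = read[i + k:]
            if len(flank) >= min_flank and flank not in transgene:
                flanks.append(("3prime", flank))
        # 5' junction: something else followed by transgene head
        j = read.find(head)
        if j >= min_flank:
            flank = read[:j]
            if flank not in transgene:
                flanks.append(("5prime", flank))
    return flanks

# Hypothetical toy data
transgene = "ATGGCTAGCTAGGATCCGATTACGCCAAGCTTGGTACCGAGCTCGGATCCACTAGT"
reads = [
    "TTCAGGCATTGACC" + transgene[:20],   # genome -> transgene junction
    transgene[-20:] + "CAGTTGCAAGGTCA",  # transgene -> genome junction
    transgene[-20:] + transgene[:20],    # head-to-tail concatemer junction
    transgene[10:40],                    # internal transgene read
]
flanks = junction_flanks(reads, transgene)
print(flanks)
```

Only the recovered flanks would then be aligned (for example with BLAST) against the mouse genome. In practice you would also need to handle reverse-strand reads, sequencing errors, and truncated transgene ends, which this sketch ignores.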
Thanks Karmella - I would love to hear back if you get the hairpin dsOligo method to work for you. I just gave up since no one else seems to publish their tg insert sites but it can really be important since we cross our tg mice to various knockout strains and would want to avoid linked regions.
TLA (targeted locus amplification) sequencing is currently the best-suited method for identifying the precise location of a transgene. Article: Targeted sequencing by proximity ligation for comprehensive ...
Thank you Arpit - Another responder also recommended this method and cited a 2017 NAR vol 45, No. 8 article by Cain-Hom et al from Genentech that specifically addresses transgenic mice. There is also a company now that offers this service called Cergentis (www.cergentis.com). The question has become moot for my project so I never tried this and cannot vouch for it, but I know many others were interested so I thought I would pass on what I have learned.