Hello,
I have raw PacBio sequencing datasets from cells infected with Trypanosoma cruzi. We're trying to determine whether the protozoan integrates kDNA sequences of ~12 bp into the host cell genome. However, I don't yet know which bioinformatics approach to use to look for these sequences. I am considering two possible strategies:
1) Use pbmm2 (PacBio's minimap2 wrapper) to align the PacBio reads against T. cruzi kDNA sequences collected from NCBI, and then visualize the alignments in IGV.
or
2) Assemble the PacBio reads into a complete genome and then align the NCBI kDNA sequences against that assembly. Finally, I would check visually whether any sequence aligned with reasonable similarity.
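To make strategy 1 more concrete, here is a rough sketch of how the alignments could be screened once they exist. It assumes the read-vs-kDNA alignments have been exported in PAF format (the tab-separated format minimap2 writes), and it keeps reads where the kDNA hit covers only part of the read, since the unaligned flank could then be checked against the host genome. The function names and the `min_identity` and `min_flank` thresholds are hypothetical placeholders, not recommended values.

```python
from collections import namedtuple

# One alignment record, reduced to the PAF columns used below.
PafHit = namedtuple("PafHit", "read read_len q_start q_end target identity")


def parse_paf_line(line):
    """Parse the mandatory PAF columns of one alignment line."""
    f = line.rstrip("\n").split("\t")
    read_len, q_start, q_end = int(f[1]), int(f[2]), int(f[3])
    matches, aln_len = int(f[9]), int(f[10])
    # Approximate identity: matching bases / alignment block length.
    identity = matches / aln_len if aln_len else 0.0
    return PafHit(f[0], read_len, q_start, q_end, f[5], identity)


def partial_kdna_hits(paf_lines, min_identity=0.8, min_flank=500):
    """Return hits where the read extends beyond the kDNA match.

    min_identity and min_flank are illustrative thresholds (assumptions).
    """
    candidates = []
    for line in paf_lines:
        if not line.strip():
            continue
        hit = parse_paf_line(line)
        left_flank = hit.q_start                  # read bases before the hit
        right_flank = hit.read_len - hit.q_end    # read bases after the hit
        if hit.identity >= min_identity and max(left_flank, right_flank) >= min_flank:
            candidates.append(hit)
    return candidates


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    example = [
        "read1\t10000\t4000\t5400\t+\tkDNA_minicircle\t1400\t0\t1400\t1300\t1400\t60",
        "read2\t1400\t0\t1400\t+\tkDNA_minicircle\t1400\t0\t1400\t1390\t1400\t60",
    ]
    for hit in partial_kdna_hits(example):
        print(hit.read, hit.q_start, hit.q_end, round(hit.identity, 3))
```

Reads that pass this filter would only be candidates: the flanking portions would still need to be aligned to the host reference to see whether they really are host sequence.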
Please, I would like to hear any opinions or suggestions regarding this problem and possible bioinformatics strategies I could use (including software) to solve this task. Thank you very much.