Hello,
I am working on viral metagenomics and have started using high-throughput sequencing reads for virus genome assembly. Since I am a beginner, a few things are unclear to me.

Because these are metagenomic samples with unknown viruses, I first run a de novo assembly of the reads to get contigs. I then map the reads back to the contigs in Geneious, starting with the longest contig, and use iterative assembly to extend the contig ends. Finally, I align the extended contigs to the closest reference sequence to build the final genome.

My questions are:

1. How can I make sure that the contig extension was performed correctly, with no misassembly?
2. When visually inspecting the alignment of a contig against the reference sequence, what should I pay attention to in order to get a reliable genome assembly?

As an example, I aligned an extended contig to a closely related reference sequence. The contig seems to cover only 8-10% of the reference genome, and it contains early stop codons. That does not make sense to me, because before extension a blastn search showed 60% nucleotide identity between the contig and the reference sequence. My impression is that something went wrong in the contig extension step.

Could you please share your experience with genome assembly that combines de novo contigs and reference mapping? Thanks a lot for your time and help!
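
To make the "early stop codons" observation concrete, here is a minimal sketch of one way to list stop codon positions in the three forward reading frames of a contig. It is plain Python, not part of Geneious; the function name `stop_codon_positions` and the toy sequence are hypothetical, and it assumes the standard genetic code and ignores the reverse strand.

```python
# Stop codons of the standard genetic code (assumption: no alternative code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_positions(seq):
    """Return {frame: [0-based start positions of stop codons]}
    for the three forward reading frames of a nucleotide sequence."""
    seq = seq.upper()
    hits = {}
    for frame in range(3):
        # Step through the sequence codon by codon in this frame.
        hits[frame] = [
            i
            for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
    return hits


# Toy sequence for illustration only, not real data.
toy_contig = "ATGAAATAGCCCTGA"
print(stop_codon_positions(toy_contig))
```

Comparing these positions before and after extension might show whether the stop codons sit in the original de novo contig or only in the extended ends, which would point to the extension step.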