I used Illumina sequencing to sequence a phage genome, and it assembled into a single contig. How do I know whether the genome I've got is complete and, if it is linear, how do I know where it ends?
Your options depend on the NGS library preparation protocol you used.
The first thing to do, I believe, is to check whether the de novo assembled contig of your phage genome is (pseudo)circular and was simply opened at an arbitrary position, and to run it through ORF calling and ORF product functional annotation tools.
For example, assuming a dsDNA phage: if a transposon-based library preparation (e.g., Nextera) was used, PhageTerm and other mapped-read pile-up analyses will not really shed light on the genome termini of a phage with defined termini (i.e., not a "headful" packaging phage, where different virions carry circularly permuted genomes). The authors of the PhageTerm paper state this explicitly as well (Garneau et al.).
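One quick way to look for a (pseudo)circular contig is to test whether the same stretch of sequence is repeated at both of its ends, which some assemblers leave behind when they walk around a circular or concatemeric template. The sketch below is only an illustration under that assumption: the function names and the `min_len`/`max_len` thresholds are hypothetical, it requires an exact match, and a hit could equally reflect genuine direct terminal repeats, so treat the result as a hint rather than proof.

```python
def terminal_overlap(seq: str, min_len: int = 20, max_len: int = 1000) -> int:
    """Return the length of the longest prefix of seq that is also its suffix.

    Only exact matches between min_len and max_len bases are considered
    (and never more than half the contig). Returns 0 if none is found.
    """
    seq = seq.upper()
    upper = min(max_len, len(seq) // 2)
    # Try the longest candidate first so the first hit is the longest overlap.
    for k in range(upper, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def trim_overlap(seq: str, overlap: int) -> str:
    """Drop the duplicated copy at the 3' end so the repeat appears only once."""
    return seq[: len(seq) - overlap] if overlap > 0 else seq


if __name__ == "__main__":
    # Toy contig: a 25-base stretch repeated at both ends (made-up sequence).
    repeat = "ACGTTGCAAGGCTTACCGATGGCTA"
    middle = "T" * 10 + "G" * 10 + "C" * 10 + "A" * 10 + "T" * 10 + "G" * 10
    contig = repeat + middle + repeat
    n = terminal_overlap(contig)
    print("terminal overlap:", n)
    print("length after trimming:", len(trim_overlap(contig, n)))
```

If an overlap is found, the trimmed sequence can be treated as a circular representation for annotation purposes; whether the physical molecule in the capsid is circularly permuted or has defined ends still has to be established separately.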
I would advise you to first try to predict the packaging strategy/physical genome termini type from the terminase large subunit (TerL) phylogeny. For this, you would need to take the terminase/TerL protein amino acid sequences from a set of phages that were experimentally shown to have a particular packaging strategy (see Figure 6 and the corresponding dataset from Merrill et al.), add the TerL sequence of your phage, do a multiple sequence alignment, and then build a phylogenetic tree. TerL sequences were shown to cluster in phylogenies according to the respective phage packaging strategies/termini types. This could give a hint at what to expect for your phage genome termini if its TerL sequence convincingly falls within one of the packaging strategy clades in the TerL tree.
Then, depending on how intergenomically similar your phage is to phages with complete genome sequences available, you can check whether some of those similar genomes are really base-to-base complete, i.e., whether there is an annotation of the genome termini or an associated paper showing a faithful representation of the genome molecule "as seen within the capsid" (in case defined genome termini are expected). You can then rearrange your genome accordingly, so that the approximate expected physical termini region lies at the beginning and at the end of your genome representation.
Finally, you can design custom primers for Sanger sequencing runs that read outward, beyond the ends of your current genome representation (have a look through the paper by Casjens and Gilcrease).
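Rearranging the genome representation to start near the expected termini amounts to rotating the sequence, and possibly reverse-complementing it to match the orientation of the reference phage. A minimal sketch follows, assuming the contig is a circular representation with any duplicated end already removed; the function names and the 1-based `new_start` coordinate are illustrative choices, not part of any tool mentioned above.

```python
# Translation table for complementing DNA, including N and lowercase bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rotate(seq: str, new_start: int) -> str:
    """Rotate a circular sequence so that 1-based position new_start comes first."""
    if not 1 <= new_start <= len(seq):
        raise ValueError("new_start must be between 1 and the sequence length")
    i = new_start - 1
    return seq[i:] + seq[:i]


if __name__ == "__main__":
    genome = "AACCGGTT"  # toy sequence for illustration only
    print(rotate(genome, 3))
    print(reverse_complement("AACG"))
```

Note that rotation is only meaningful while the contig is being treated as circular; once the physical ends have been confirmed experimentally, the representation should be fixed to start and end at those positions.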
Hope that helps,
Nikita Zrelovs
References:
Garneau, J. R., Depardieu, F., Fortier, L. C., Bikard, D., & Monot, M. (2017). PhageTerm: a tool for fast and accurate determination of phage termini and packaging mechanism using next-generation sequencing data. Scientific reports, 7(1), 8292. https://doi.org/10.1038/s41598-017-07910-5
Merrill, B. D., Ward, A. T., Grose, J. H., & Hope, S. (2016). Software-based analysis of bacteriophage genomes, physical ends, and packaging strategies. BMC genomics, 17(1), 679. https://doi.org/10.1186/s12864-016-3018-2
Casjens, S. R., & Gilcrease, E. B. (2009). Determining DNA packaging strategy by analysis of the termini of the chromosomes in tailed-bacteriophage virions. Methods in molecular biology (Clifton, N.J.), 502, 91–111. https://doi.org/10.1007/978-1-60327-565-1_7