Filtering structural variants (SVs) from WGS/WES data is a crucial step in identifying potentially disease-causing mutations. Here's a breakdown of the process:
Challenges with WES data:
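WES and WGS both typically use short reads, so the main limitation of WES is its targeted capture: most SV breakpoints fall in introns or intergenic regions that are never sequenced, and uneven capture coverage makes depth-based signals noisy. SV detection from WES is therefore less sensitive than from WGS.
Consider tools designed for WES data, which are mostly read-depth-based copy-number callers such as ExomeDepth, XHMM, or CoNIFER.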
General Filtering Steps:
Choose a variant caller: Several tools can detect SVs from WGS/WES data. Popular options include Manta, Delly, and Lumpy.
Quality score filtering: Set a minimum quality score threshold to retain high-confidence SV calls. Quality scores indicate the variant caller's confidence in the identified variant.
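Depth filtering: Filter out SVs with little read support. For SVs, the useful measure is the number of paired-end and split reads supporting the variant, together with local coverage, since calls backed by only one or two reads are often mapping or sequencing artifacts.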
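Strand bias filtering: Where the caller reports it, remove SVs whose supporting reads come predominantly from one strand. This metric is more common for small variants; for SVs, agreement between paired-end and split-read evidence is often a more informative check.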
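Database filtering: Exclude SVs that are common in public SV databases such as DGV, dbVar, or gnomAD-SV to focus on rare or novel variants. dbSNP catalogs SNPs and small indels and is not suitable for this step. Because SV breakpoints are imprecise, calls are usually matched by type and reciprocal overlap rather than by exact coordinates.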
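Functional impact filtering: Annotate the remaining SVs with SV-aware tools like AnnotSV or Ensembl VEP to assess which genes, exons, or regulatory regions they affect. Predictors such as SIFT or MutationTaster score single-nucleotide and small variants and do not apply to SVs.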
Sample-specific filtering: Consider incorporating information about the patient's phenotype and family history to prioritize relevant SVs.
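The quality, read-support, and database steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production filter: it assumes a single-sample VCF with Manta-style PR and SR FORMAT fields (ref,alt read counts), the thresholds are arbitrary placeholders, and known_svs is a hypothetical in-memory list standing in for a population SV catalogue. Other callers use different tags, so adjust the field names to your caller's VCF header.

```python
from typing import Dict, List, Tuple

# Hypothetical catalogue entry: (chrom, start, end, svtype)
KnownSV = Tuple[str, int, int, str]


def parse_info(info: str) -> Dict[str, str]:
    """Turn a VCF INFO string such as 'SVTYPE=DEL;END=5000' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag entries without '=' map to an empty string
    return fields


def alt_support(format_keys: str, sample: str) -> int:
    """Sum alt-supporting paired-end (PR) and split (SR) reads for one sample.

    Assumes Manta-style 'ref,alt' counts; tags that are absent are skipped.
    """
    values = dict(zip(format_keys.split(":"), sample.split(":")))
    total = 0
    for tag in ("PR", "SR"):
        if tag in values and "," in values[tag]:
            total += int(values[tag].split(",")[1])
    return total


def reciprocal_overlap(start_a: int, end_a: int, start_b: int, end_b: int) -> float:
    """Smaller of the two overlap fractions (overlap / length of each interval)."""
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0
    return min(overlap / (end_a - start_a), overlap / (end_b - start_b))


def keep_record(
    line: str,
    known_svs: List[KnownSV],
    min_qual: float = 20.0,    # placeholder threshold
    min_support: int = 4,      # placeholder threshold
    max_overlap: float = 0.5,  # placeholder: 50% reciprocal overlap
) -> bool:
    """Return True if a VCF data line passes all three illustrative filters."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, qual, flt = cols[0], int(cols[1]), cols[5], cols[6]
    info = parse_info(cols[7])

    # 1. Caller filter status and quality score
    if flt not in ("PASS", "."):
        return False
    if qual != "." and float(qual) < min_qual:
        return False

    # 2. Supporting reads, only checked when a sample column is present
    if len(cols) > 9 and alt_support(cols[8], cols[9]) < min_support:
        return False

    # 3. Same-type overlap with the catalogue. Records without END
    #    (e.g. breakends) get a zero-length interval and never match here.
    end = int(info.get("END", pos))
    svtype = info.get("SVTYPE")
    for k_chrom, k_start, k_end, k_type in known_svs:
        if (k_chrom == chrom and k_type == svtype
                and reciprocal_overlap(pos, end, k_start, k_end) >= max_overlap):
            return False
    return True


record = "chr1\t1000\tsv1\tN\t<DEL>\t50\tPASS\tSVTYPE=DEL;END=5000\tGT:PR:SR\t0/1:20,6:15,4"
print(keep_record(record, known_svs=[]))
print(keep_record(record, known_svs=[("chr1", 1100, 5100, "DEL")]))
```

In practice the catalogue comparison would also take population allele frequency into account, so that only common SVs are removed, and a VCF library such as pysam or cyvcf2 would replace the manual column splitting.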
Additional Tips:
Use visualization tools like IGV to inspect SVs in the genomic context.
Consider established workflows such as GATK-SV or other analysis pipelines designed specifically for SV analysis.
Remember, this is a general guideline. The specific filtering criteria will depend on your research question and the analysis tools you use.