I am new to bioinformatics and am currently analyzing DNA methylation patterns in my whole-genome bisulfite sequencing (WGBS) data. The data are paired-end reads processed through the standard Bismark pipeline, and I have generated the necessary output files, including the bedGraph and coverage reports. However, I am having trouble interpreting the methylation percentages. After completing the methylation extraction step, I noticed that the methylation percentages in all three contexts (CpG, CHG and CHH) appear unusually high and almost equal.

I’m trying to understand whether this is normal and if others have observed similar trends with Bismark, especially after the methylation extraction step. Specifically, I’d like to ask:

  • Is it common to observe unusually high methylation percentages with Bismark after methylation extraction?
  • Normalization or filtering steps: are there any additional steps to account for biases or noise in the methylation calculations?
  • Do these high methylation percentages indicate an issue in the wet-lab work, especially bisulfite conversion or library preparation?
  • This is the script I used for the analysis:

    ____________________________________________________________________________

    # Run Bismark alignment
    $BISMARK_PATH/bismark \
        --genome $GENOME_PATH \
        -1 $READ1 \
        -2 $READ2 \
        -o $OUTPUT_PATH

    echo "Step 2: Alignment with Bismark - Completed at $(timestamp)"

    # Step 3: Deduplication (optional, if you want to remove PCR duplicates)
    echo "Step 3: Deduplication - Started at $(timestamp)"

    $BISMARK_PATH/deduplicate_bismark \
        --bam \
        --paired \
        $OUTPUT_PATH/${READ1}_bismark_bt2.bam

    echo "Step 3: Deduplication - Completed at $(timestamp)"

    # Step 4: Methylation extraction
    "$BISMARK_PATH/bismark_methylation_extractor" \
        --bedGraph \
        --comprehensive \
        --no_overlap \
        --ignore 3 \
        --ignore_r2 2 \
        --ignore_3prime 2 \
        --ignore_3prime_r2 2 \
        --CX_context \
        --cytosine_report \
        --gzip \
        --parallel 8 \
        --buffer_size 75% \
        --genome "Path/to/genome" \
        -o "$OUTPUT_DIR" \
        "$BAM_FILE"

    echo -e "${GREEN}Methylation extraction for $BAM_FILE - Completed at $(timestamp)${NC}"

    ___________________________________________________________________________
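To make the "unusually high and almost equal" observation concrete, the per-context percentages can be recomputed directly from the genome-wide cytosine report that `--CX_context --cytosine_report` produces. The following is only a rough sketch, not part of the Bismark pipeline: the function name `context_methylation` and the optional minimum-coverage argument are hypothetical, and it assumes the report is tab-separated with the columns chromosome, position, strand, methylated count, unmethylated count, context (CG/CHG/CHH) and trinucleotide context.

```shell
#!/usr/bin/env bash
# Sketch: summarise methylation per cytosine context from a Bismark CX report.
# Assumed columns (tab-separated):
#   1 chromosome, 2 position, 3 strand, 4 count methylated,
#   5 count unmethylated, 6 context (CG/CHG/CHH), 7 trinucleotide context
# context_methylation is a hypothetical helper name, not a Bismark tool.
context_methylation() {
    local report="$1"
    local min_cov="${2:-1}"   # skip positions with fewer calls than this

    # Read gzipped reports (from --gzip) as well as plain-text ones.
    case "$report" in
        *.gz) gzip -cd "$report" ;;
        *)    cat "$report" ;;
    esac | awk -F'\t' -v min_cov="$min_cov" '
        ($4 + $5) >= min_cov {
            meth[$6]  += $4        # methylated calls per context
            total[$6] += $4 + $5   # all calls per context
        }
        END {
            # Output: context, methylated calls, total calls, percent methylated
            for (ctx in total)
                if (total[ctx] > 0)
                    printf "%s\t%d\t%d\t%.2f\n", ctx, meth[ctx], total[ctx], 100 * meth[ctx] / total[ctx]
        }' | sort
}
```

It could be called as `context_methylation "$OUTPUT_DIR/sample.CX_report.txt.gz" 5`, where the file name is a placeholder and `5` restricts the summary to positions covered by at least five calls. The percentage is pooled over all calls in a context (methylated calls divided by total calls), so it may differ slightly from a mean of per-position percentages.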
