I am new to bioinformatics and am currently working on whole-genome bisulfite sequencing (WGBS) data to analyze DNA methylation patterns. The data are paired-end reads processed through the standard Bismark pipeline, and I have generated the necessary files, including bedGraph and coverage reports. However, I am having trouble interpreting the methylation percentages. After the methylation extraction step, I noticed that the percentages for all three contexts (CpG, CHG and CHH) are unusually high and almost equal.
I’m trying to understand whether this is normal and if others have observed similar trends with Bismark, especially after the methylation extraction step. Specifically, I’d like to ask:
This is the script I used for the analysis:
____________________________________________________________________________
# Step 2: Run Bismark alignment
$BISMARK_PATH/bismark \
--genome $GENOME_PATH \
-1 $READ1 \
-2 $READ2 \
-o $OUTPUT_PATH
echo "Step 2: Alignment with Bismark - Completed at $(timestamp)"
# Step 3: Deduplication (optional, if you want to remove PCR duplicates)
echo "Step 3: Deduplication - Started at $(timestamp)"
$BISMARK_PATH/deduplicate_bismark \
--bam \
--paired \
"$OUTPUT_PATH/${READ1}_bismark_bt2_pe.bam"  # paired-end Bismark output carries the _pe suffix
echo "Step 3: Deduplication - Completed at $(timestamp)"
# Step 4: Methylation extraction:
"$BISMARK_PATH/bismark_methylation_extractor" \
--bedGraph \
--comprehensive \
--no_overlap \
--ignore 3 \
--ignore_r2 2 \
--ignore_3prime 2 \
--ignore_3prime_r2 2 \
--CX_context \
--cytosine_report \
--gzip \
--parallel 8 \
--buffer_size 75% \
--genome "Path/to/genome" \
-o "$OUTPUT_DIR" \
"$BAM_FILE"
echo -e "${GREEN}Methylation extraction for $BAM_FILE - Completed at $(timestamp)${NC}"
___________________________________________________________________________
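For reference, the per-context percentages described above could be recomputed directly from the cytosine report with something like the sketch below. It assumes the usual tab-separated CX report layout (chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide context). The function name and the file name in the usage note are hypothetical, not part of Bismark.

```shell
#!/usr/bin/env bash
# Sketch: pooled methylation percentage per cytosine context (CG, CHG, CHH).
# Assumed input columns (tab-separated):
#   1 chr, 2 pos, 3 strand, 4 count methylated, 5 count unmethylated,
#   6 context, 7 trinucleotide context
# summarize_context_methylation is a hypothetical helper name.
summarize_context_methylation() {
    local report="$1"

    # Decompress on the fly when the report is gzipped.
    if [[ "$report" == *.gz ]]; then
        gzip -dc -- "$report"
    else
        cat -- "$report"
    fi | awk -F'\t' '
        # Accumulate methylated and unmethylated calls per context.
        { meth[$6] += $4; unmeth[$6] += $5 }
        END {
            for (ctx in meth) {
                total = meth[ctx] + unmeth[ctx]
                # Skip contexts with no covered cytosines.
                if (total > 0)
                    printf "%s\t%.2f\n", ctx, 100 * meth[ctx] / total
            }
        }' | sort
}
```

It would be called as `summarize_context_methylation sample.CX_report.txt.gz` (a placeholder file name) and prints one line per context with the pooled percentage, which can be compared against the values in the Bismark summary report.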