01 January 2020

I am currently using miRDeep2 to identify miRNAs in a non-model organism (a butterfly). After running mapper.pl to map the small RNA reads to the genome, I used miRDeep2.pl to predict novel miRNAs of the species. Here are some of my questions:

  • I have 8 samples from 2 conditions (4 replicates per condition). Do I have to run miRDeep2.pl on each sample individually and then select the miRNAs common to the samples as the novel miRNAs?
  • There are no previously reported mature miRNAs or hairpins in miRBase for my species. When I set them to 'none' and include the mature sequences of related species (all Lepidoptera plus Drosophila), the outputs are all classified as 'novel'. In that case, I can't tell how many of them are conserved (or share homology) in other species. I then tried supplying all metazoan miRNAs and hairpins, or all Lepidoptera miRNAs and hairpins, as the 'reference' miRNAs and hairpins of my species (which of course they are not). I got different numbers of predicted miRNAs and known miRNAs in each case (metazoa versus Lepidoptera).
  • What are the criteria for selecting true-positive miRNAs from all predicted miRNAs? As I understand it, I should choose those with a significant randfold p-value (labeled 'yes') and a high miRDeep2 score. The miRDeep2 score is said to range from -10 to 10, but I got many extremely high scores, up to 1.8e+6. Why is that? Also, some miRNAs have very high miRDeep2 scores and read counts but non-significant randfold p-values. Should I consider them true positives as well?
  • How should I deal with precursors that show substantial sequence redundancy? Many identical miRNA loci come from different chromosomal locations. Since I will run differential expression analysis on the mature sequences, I need to exclude the extra loci from the downstream analysis. Which of those loci should I choose as the representative? Do I have to look for the redundant loci and modify them manually?
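
To make the last question concrete, here is a minimal sketch of the kind of collapsing I have in mind. It assumes the predictions have already been parsed into records; the field names (`id`, `mature_seq`, `score`) and the example values are hypothetical and not taken from the miRDeep2 output format. Picking the highest-scoring locus as the representative is only one possible tie-break, not an established rule.

```python
from collections import defaultdict


def collapse_loci(candidates):
    """Group candidate loci by mature sequence and keep one locus per group.

    Assumption: each candidate is a dict with hypothetical keys
    'id', 'mature_seq' and 'score'. The representative is the locus
    with the highest score (an arbitrary choice for illustration).
    """
    groups = defaultdict(list)
    for cand in candidates:
        # Normalise case and T/U so identical sequences compare equal.
        key = cand["mature_seq"].lower().replace("t", "u")
        groups[key].append(cand)

    representatives = {}
    for seq, loci in groups.items():
        best = max(loci, key=lambda c: c["score"])
        representatives[seq] = {
            "representative": best["id"],
            "merged_loci": sorted(c["id"] for c in loci),
        }
    return representatives


# Made-up example records: two loci share the same mature sequence.
candidates = [
    {"id": "chr1_101", "mature_seq": "ugagguaguagguuguauagu", "score": 5.2},
    {"id": "chr7_2040", "mature_seq": "UGAGGUAGUAGGUUGUAUAGU", "score": 9.8},
    {"id": "chr3_77", "mature_seq": "uaaggcacgcggugaaugcc", "score": 4.1},
]

result = collapse_loci(candidates)
for seq, info in result.items():
    print(seq, info["representative"], info["merged_loci"])
```

With this approach each distinct mature sequence would appear once in the count table, and the list of merged loci is kept so the other genomic copies can still be reported.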