Yes, I found some scripts, but a user-friendly tool would be better. FASTX-Toolkit should be able to do it, but this function is not available in Galaxy, so I think it would have to be run from the command line.
-m: Maximum depth to print [default: not set]. For example, -m 200 truncates any stack with depth above 200 and prints only 200 copies.
-n: Skip sequences with fewer than "n" reads [default: not set]. For example, -n 2 skips any singleton sequences (those with only one read).
-x: Skip sequences with depth greater than "x" [default: not set]. For example, -x 100 skips any sequences with depth greater than 100.
Call program like:
./splitStackedFasta.pl -i [and any additional options]
Example:
./splitStackedFasta.pl -i test.fasta -m 2 -n 2
On your example, this would output only:
>2-1
ATAT
>2-2
ATAT
Whereas ./splitStackedFasta.pl -i test.fasta
would output:
>1-1
TGCG
>2-1
ATAT
>2-2
ATAT
>2-3
ATAT
>2-4
ATAT
>3-1
TGGC
>4-1
TGAG
>5-1
TTCA
Also, I had to put it in a .zip archive to upload it to RG, so you need to unzip it first. Let me know if you have any issues. I wrote it in only ~10 minutes, so it isn't tested!
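For readers who want to see the logic behind the options described above, here is a minimal Python sketch of the same idea. It is not the Perl script itself, only an illustration under some assumptions: the input is assumed to be a collapsed FASTA whose headers look like ">ID-COUNT" (for example ">2-4" for sequence 2 seen 4 times), and the function and parameter names (read_collapsed, expand, max_depth, min_reads, max_reads) are hypothetical names chosen to mirror -m, -n and -x.

```python
import io
import re

# Assumed header format: ">ID-COUNT", e.g. ">2-4" (sequence 2, seen 4 times).
HEADER = re.compile(r"^>(\S+)-(\d+)$")


def read_collapsed(handle):
    """Yield (seq_id, count, sequence) for each record of a collapsed FASTA."""
    seq_id, count, chunks = None, 0, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, count, "".join(chunks)
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"Unexpected header: {line}")
            seq_id, count, chunks = match.group(1), int(match.group(2)), []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, count, "".join(chunks)


def expand(handle, max_depth=None, min_reads=None, max_reads=None):
    """Yield (header, sequence) pairs, one per copy of each stacked sequence.

    max_depth mirrors -m (truncate the number of copies printed),
    min_reads mirrors -n (skip sequences with fewer reads),
    max_reads mirrors -x (skip sequences with more reads).
    """
    for seq_id, count, seq in read_collapsed(handle):
        if min_reads is not None and count < min_reads:
            continue
        if max_reads is not None and count > max_reads:
            continue
        copies = count if max_depth is None else min(count, max_depth)
        for i in range(1, copies + 1):
            yield f">{seq_id}-{i}", seq


if __name__ == "__main__":
    # Assumed content of test.fasta, reconstructed from the outputs shown above.
    collapsed = ">1-1\nTGCG\n>2-4\nATAT\n>3-1\nTGGC\n>4-1\nTGAG\n>5-1\nTTCA\n"
    # Equivalent in spirit to: -m 2 -n 2
    for header, seq in expand(io.StringIO(collapsed), max_depth=2, min_reads=2):
        print(header)
        print(seq)
```

With the assumed input, the call with max_depth=2 and min_reads=2 keeps only the ATAT stack and prints two copies of it, which matches the first example output above.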
Do you think that a similar script could be written for SAM/BAM/BED mapping files? It could be very useful, as one could map the unique reads and then go back to the redundant reads in order to visualize the mapped reads with their coverage depth.