Dear all

I have a question about a problem that I often run into in my bioinformatic analyses and usually struggle to work around.

Let's say I have a fasta file like the one below

input.fasta

>ID1
ATGTGTCG.....
>ID2
CGCGTGTGATAT
>ID3
GCGCGCGCAAAA..

and then I have a tab-separated file where each ID is associated with a feature

input.txt

ID1 Nannochloropsis_oceanica
ID2 Nannochloropsis_gaditana
ID3 Nannochloropsis_oculata

and I'd like to edit the fasta identifiers by adding this feature, such that the desired output would be

output.fasta

>ID1_Nannochloropsis_oceanica
ATGTGTCG
>ID2_Nannochloropsis_gaditana
CGCGTGTGATAT
>ID3_Nannochloropsis_oculata
GCGCGCGCAAAA.

Does anyone know a simple Unix/Python/R way to do this automatically when working with thousands of sequences?

I used to do this with a Python script by Tony Walters (https://gist.github.com/walterst/9147f9405cadf67a88471cc87b508333), but it no longer works in my own Unix environment (it fails with a Python error).
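
To make the goal concrete, the renaming I am after could be sketched in Python roughly as follows. This is only a minimal sketch under my own assumptions: the function names (load_labels, rename_fasta, main) are made up for illustration, the table has one ID and one feature per line separated by a tab or spaces, and the ID in each header is everything between ">" and the first space. Headers whose ID is not in the table are left unchanged.

```python
def load_labels(lines):
    """Build a dict {ID: feature} from 'ID<whitespace>feature' lines."""
    labels = {}
    for line in lines:
        parts = line.strip().split(None, 1)  # split on the first tab/space run
        if len(parts) == 2:                  # skip blank or incomplete lines
            labels[parts[0]] = parts[1]
    return labels


def rename_fasta(lines, labels, sep="_"):
    """Yield FASTA lines, appending the feature to each known header ID."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Assumption: the ID ends at the first space in the header.
            seq_id, _, rest = line[1:].partition(" ")
            if seq_id in labels:
                line = ">" + seq_id + sep + labels[seq_id]
                if rest:
                    line += " " + rest       # keep any trailing description
        yield line


def main(fasta_path, table_path, out_path):
    """Read both inputs from disk and write the renamed FASTA file."""
    with open(table_path) as table:
        labels = load_labels(table)
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in rename_fasta(fasta, labels):
            out.write(line + "\n")
```

With the files above, this would be called as main("input.fasta", "input.txt", "output.fasta"). The table is held in a dictionary and the FASTA file is read line by line, so a few thousand sequences should not be a problem.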

Thanks for your attention

Sergio
