Dear all,
I have a question about a problem I often run into in my bioinformatics analyses and struggle to work around.
Let's say I have a FASTA file like the one below:
input.fasta
>ID1
ATGTGTCG.....
>ID2
CGCGTGTGATAT
>ID3
GCGCGCGCAAAA..
I also have a tab-separated file in which each ID is associated with a feature:
input.txt
ID1 Nannochloropsis_oceanica
ID2 Nannochloropsis_gaditana
ID3 Nannochloropsis_oculata
I'd like to edit the FASTA identifiers by appending this feature, so that the desired output would be:
output.fasta
>ID1_Nannochloropsis_oceanica
ATGTGTCG
>ID2_Nannochloropsis_gaditana
CGCGTGTGATAT
>ID3_Nannochloropsis_oculata
GCGCGCGCAAAA.
Does anyone know of a simple Unix/Python/R solution that does this automatically for thousands of sequences?
I used to do this with a Python script from Tony Walters (https://gist.github.com/walterst/9147f9405cadf67a88471cc87b508333), but it no longer works in my Unix environment (it fails with a Python error).
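For reference, here is a minimal Python sketch of the renaming I have in mind. It is only an illustration, not the script linked above: the function names (load_mapping, rename_headers) and the script name in the usage line are made up, and it assumes that the ID is the first whitespace-delimited token on each header line, that the table has the ID in the first column and the feature in the second, and that everything fits in memory.

```python
import sys


def load_mapping(path):
    """Read a two-column table (ID, feature) into a dict.

    Assumption: columns are separated by a tab or other whitespace,
    and the first column is the sequence ID.
    """
    mapping = {}
    with open(path) as handle:
        for line in handle:
            fields = line.strip().split(None, 1)
            if len(fields) == 2:
                mapping[fields[0]] = fields[1].strip()
    return mapping


def rename_headers(fasta_lines, mapping):
    """Yield FASTA lines with '_<feature>' appended to known IDs.

    Headers whose ID is not in the mapping are passed through unchanged.
    Note: for renamed headers, any description after the ID is dropped.
    """
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            seq_id = tokens[0] if tokens else ""
            if seq_id in mapping:
                line = ">{}_{}".format(seq_id, mapping[seq_id])
        yield line


# Hypothetical usage: python rename_fasta.py input.fasta input.txt output.fasta
if __name__ == "__main__" and len(sys.argv) == 4:
    fasta_path, table_path, out_path = sys.argv[1:4]
    id_to_feature = load_mapping(table_path)
    with open(fasta_path) as fin, open(out_path, "w") as fout:
        for out_line in rename_headers(fin, id_to_feature):
            fout.write(out_line + "\n")
```

Sequence lines are copied as they are, so multi-line sequences should be preserved. If there is a cleaner or more standard way to do this (awk, seqkit, Biopython, R), I would be glad to hear it.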
Thanks for your attention
Sergio