I have an Excel sheet containing reaction IDs and gene IDs. The gene IDs originally carried transcript suffixes (for example 54575.1, 54575.2). I do not need the transcripts, so I stripped the suffixes, and now the gene IDs contain duplicates. I want to remove those duplicates. The sheet has around 4000 rows and 15 columns, so doing this manually is not feasible and would be error-prone.

Here is an example of the data:

UGT1A3r 7364 54575 54575

UGT1A4r 54575 54490 54576

UGT1A8r 7363 54575 7363 54576

UGT1A9r 54490

UMPK 51727

UMPK2 51727

I want to remove the duplicates that appear in the subsequent columns, so that each value is kept once and any later repeats are removed. Can anyone tell me how to remove these duplicates? Which program can I use for this, or is Excel itself enough?
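To make the desired result concrete, here is a minimal Python sketch of what I am after. It assumes the duplicates are to be removed within each row (first occurrence kept, order preserved) and that the first column is the reaction ID. The function name `dedupe_row` and the hard-coded `rows` list are only for illustration; in practice the rows would come from the sheet, for example after exporting it to CSV.

```python
def dedupe_row(cells):
    """Return the non-empty cells of one row, keeping only the
    first occurrence of each value and preserving column order."""
    seen = set()
    unique = []
    for cell in cells:
        cell = cell.strip()
        if cell and cell not in seen:
            seen.add(cell)
            unique.append(cell)
    return unique


# Sample rows copied from the example above:
# reaction ID first, then the gene IDs.
rows = [
    ["UGT1A3r", "7364", "54575", "54575"],
    ["UGT1A4r", "54575", "54490", "54576"],
    ["UGT1A8r", "7363", "54575", "7363", "54576"],
    ["UGT1A9r", "54490"],
    ["UMPK", "51727"],
    ["UMPK2", "51727"],
]

# Leave the reaction ID alone and deduplicate only the gene ID columns.
cleaned = [[row[0]] + dedupe_row(row[1:]) for row in rows]

for row in cleaned:
    print("\t".join(row))
```

With this logic, UGT1A3r would keep 7364 and 54575, and UGT1A8r would keep 7363, 54575 and 54576. Rows such as UMPK and UMPK2 that share a gene ID are left as they are, since the comparison is only within a row.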
