Clustering of protein sequences based on Similarity?

Muddsair Sharif

Try RapidMiner and Weka 3.6. I did a similar task, though it was for signaling/speech packets. If you run into any problems, feel free to write again and I will help you.

Regards

Muddsair

Lazaros Mavridis

You could try our new clustering algorithm.

Here is the paper describing it. If you are interested, send me an email and I will send you an executable.

http://www.biomedcentral.com/1471-2105/14/213/abstract

Regards

Lazaros

Ying Zhang

Have you tried "uclust" (http://drive5.com/usearch/manual/uclust_algo.html)? Based on its description, I think it is a good fit for your problem.

Roberto Santana

You can use the affinity propagation algorithm, which takes as input a matrix of similarities between the points (your proteins) and outputs the clusters along with an exemplar for each cluster. Matlab and C implementations are available from http://www.psi.toronto.edu/index.php?q=affinity%20propagation
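
To make the input and output concrete, here is a minimal plain-Python sketch of the affinity propagation message passing (responsibilities and availabilities, with damping). It is not the Matlab or C code linked above: the function name `affinity_propagation`, its parameters, and the toy score matrix are all assumptions made up for illustration. In a real run the matrix would hold pairwise similarity scores between your proteins, for example from an all-vs-all alignment.

```python
def affinity_propagation(S, preference=None, damping=0.5, iterations=200):
    """Cluster n items from an n x n similarity matrix S (higher = more similar).

    Returns (exemplars, labels): the indices chosen as exemplars and, for each
    item, the index of the exemplar it is assigned to. Illustrative sketch only.
    """
    n = len(S)
    if preference is None:
        # Assumed default: the median of the off-diagonal similarities.
        off = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off[len(off) // 2]
    # The diagonal holds the "preference" of each item to become an exemplar.
    S = [[preference if i == k else S[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i,
        # compared with the best competing candidate.
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: how much support k has gathered from other items.
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(pos) - pos[k]  # positive support from items other than k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n  # no exemplar emerged with these settings
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Made-up symmetric scores for six hypothetical proteins: 0-2 resemble each
# other, 3-5 resemble each other. Diagonal entries are ignored (replaced by
# the preference inside the function).
toy_scores = [
    [0, 149, 146, 50, 29, 6],
    [149, 0, 149, 69, 50, 29],
    [146, 149, 0, 86, 69, 50],
    [50, 69, 86, 0, 149, 146],
    [29, 50, 69, 149, 0, 149],
    [6, 29, 50, 146, 149, 0],
]
exemplars, labels = affinity_propagation(toy_scores)
print("exemplars:", exemplars)
print("labels:", labels)
```

The number of clusters is not set directly; it follows from the preference value, with lower preferences giving fewer clusters. This pure-Python loop is only practical for small inputs. If you work in Python, scikit-learn's `AffinityPropagation` with `affinity="precomputed"` accepts a similarity matrix in the same way.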