In K-means clustering, every data point is assigned to a cluster, so points that should be treated as noise end up inside clusters as well. Is there any way to reduce such noise, or a hybrid of K-means clustering that handles it?
I guess it depends on which algorithm you use to get the detections before applying K-means. You could also filter the positions first with another algorithm, using parameters such as the radius of the neighborhood or the number of positions required within that neighborhood, and only then run a Ripley's K analysis. You can also use the neighborhood density function, which is a non-cumulative way of studying clustering.
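As a rough sketch of that kind of pre-filter (not from the thread itself; scipy is assumed, and the radius and count thresholds below are made-up illustrative values in the data's own units):

```python
import numpy as np
from scipy.spatial import cKDTree

def filter_by_neighborhood(points, radius, min_neighbors):
    """Keep only points that have at least `min_neighbors` OTHER points
    within `radius`; isolated detections are dropped as noise."""
    tree = cKDTree(points)
    # query_ball_point includes the point itself, hence the -1
    counts = np.array(
        [len(tree.query_ball_point(p, radius)) - 1 for p in points]
    )
    return points[counts >= min_neighbors]

# toy example: one tight cluster plus scattered noise
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 1.0, size=(50, 2))
noise = rng.uniform(-50.0, 50.0, size=(10, 2))
data = np.vstack([cluster, noise])

filtered = filter_by_neighborhood(data, radius=5.0, min_neighbors=3)
```

Only after such a filter would one move on to Ripley's K or a clustering step, so the isolated detections no longer distort the statistics.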
In any case, the best way to reduce noise depends on the technique you use to acquire the data and on how you treat the sample.
I'm currently working with STORM, a super-resolution microscopy technique. Overcounting and generating thousands of artifacts is a really common mistake, so we have to be very careful about this. There are hundreds of papers on clustering in the context of microscopy.
I will try to incorporate these suggestions and also look into the neighborhood density function. I work with protein-protein interactions, so I have to find the best binding pose to predict the binding orientation through statistical methods.
I have tried the neighborhood density function, and it worked fine after a little optimization of the maximum neighborhood distance (MND). The problem is that I analyzed the data manually and selected the MND by hand. Is there any way to compute the optimum MND with an algorithm? The dataset has 2000 points, and finding the MND manually every time costs too much effort.
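One common heuristic for this kind of distance threshold (borrowed from how DBSCAN's eps is often chosen, not something the thread itself prescribes) is the sorted k-nearest-neighbor distance plot: the "knee" of that curve is a reasonable automatic MND candidate. A sketch, assuming scipy and using the simple maximum-distance-to-chord rule to locate the knee:

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_mnd(points, k=4):
    """Heuristic MND: sort every point's distance to its k-th nearest
    neighbor, then pick the 'knee' of that curve -- the sample farthest
    from the straight line joining the curve's two endpoints."""
    tree = cKDTree(points)
    # k + 1 because each point's nearest neighbor is itself
    dists, _ = tree.query(points, k=k + 1)
    kdist = np.sort(dists[:, -1])
    n = len(kdist)
    x = np.arange(n)
    x0, y0, x1, y1 = 0.0, kdist[0], float(n - 1), kdist[-1]
    # perpendicular distance of each curve sample from the endpoint chord
    num = np.abs((y1 - y0) * x - (x1 - x0) * kdist + x1 * y0 - y1 * x0)
    den = np.hypot(y1 - y0, x1 - x0)
    knee = int(np.argmax(num / den))
    return kdist[knee]

# toy data: two clusters plus uniform background noise
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal([0.0, 0.0], 0.5, size=(100, 2)),
    rng.normal([10.0, 10.0], 0.5, size=(100, 2)),
    rng.uniform(-5.0, 15.0, size=(20, 2)),
])
mnd = estimate_mnd(data, k=4)
```

The value of `k` still has to be chosen, but it is far less sensitive than the MND itself, so one fixed `k` can usually be reused across datasets of similar density.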
I'm not sure what you mean by "data points which are supposed to be treated as noise" - did you create them, or do they just appear to be noisy?
Data that doesn't fit nicely into any class may be noise, or it may be a new, previously undiscovered class...
Take a look at http://ti.arc.nasa.gov/tech/rse/synthesis-projects-applications/autoclass/references/ ; the software is available for download at http://ti.arc.nasa.gov/tech/rse/synthesis-projects-applications/autoclass/
I'll briefly describe my understanding of AutoClass as it applies to your question, but do go to the sources above :) AutoClass implements a hybrid of K-means in which each observation is assigned a probability of belonging to each cluster. The number of clusters is determined automatically, and so is the sense of distance: each cluster has its own distance measure, based on the mean and standard deviation of the observations assigned to it.
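A minimal illustration of those two ideas, soft (probabilistic) cluster membership and automatic selection of the number of clusters, can be sketched with a Gaussian mixture model. To be clear, this is not AutoClass itself (AutoClass is a Bayesian mixture system with its own model search); scikit-learn's `GaussianMixture` with BIC-based model selection is assumed here as a stand-in:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# toy data: two well-separated clusters with different spreads
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal([0.0, 0.0], 0.5, size=(100, 2)),
    rng.normal([5.0, 5.0], 1.5, size=(100, 2)),
])

# choose the number of components automatically via BIC,
# echoing AutoClass's automatic determination of cluster count
bics = [
    GaussianMixture(n_components=k, covariance_type="full",
                    random_state=0).fit(data).bic(data)
    for k in range(1, 6)
]
best_k = int(np.argmin(bics)) + 1

gmm = GaussianMixture(n_components=best_k, covariance_type="full",
                      random_state=0).fit(data)
# soft assignment: each row gives the point's membership
# probability in every cluster, and each row sums to 1
probs = gmm.predict_proba(data)
```

The `covariance_type="full"` setting gives each cluster its own covariance, which parallels AutoClass's per-cluster sense of distance; points with no high-probability cluster can then be flagged as likely noise.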
I appreciate that this thread is 4 years old, but I stumbled upon it when I encountered the same issue. I solved it by using a density-based spatial clustering method. I thought I would add this just in case anyone encounters a similar problem and ends up here!
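For anyone landing here, a minimal sketch of that approach, assuming DBSCAN from scikit-learn as the density-based method (the eps and min_samples values below are illustrative, not tuned):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# toy data: two dense clusters plus sparse background noise
rng = np.random.default_rng(2)
clusters = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(40, 2)),
    rng.normal([4.0, 4.0], 0.3, size=(40, 2)),
])
noise = rng.uniform(-2.0, 6.0, size=(8, 2))
data = np.vstack([clusters, noise])

# unlike K-means, DBSCAN leaves low-density points unassigned:
# they get the label -1 instead of being forced into a cluster
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(data)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
```

This directly addresses the original question: the points K-means would have forced into a cluster come back labeled -1 and can simply be discarded as noise.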