Is there a way of running cluster analysis with missing data?

Murat Karakoyun

Hello Christopher Brooke,

If you have missing values in your data set, you have two choices before you run the clustering process. The first is to delete the rows that contain missing values from the data set. The second is to use a missing-value estimation (imputation) method. Several are available, such as replacing each missing value with the mean, the mode, or the median of its variable. So if you don't want to remove the rows with missing values, choose an estimation method and apply it to your data set, then run k-means or any other clustering algorithm.
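As a rough illustration of the second choice, here is a minimal Python sketch that fills each missing value with its column mean and then runs a plain k-means. It uses only the standard library so it can be run as-is; the function names `impute_columns` and `kmeans`, the toy data, and the use of `None` to mark missing values are all assumptions made for the example. In practice a library such as scikit-learn (`SimpleImputer`, `KMeans`) would normally be used instead.

```python
from statistics import mean, median

def impute_columns(rows, strategy=mean):
    """Replace None in each column with that column's mean (or median) of observed values."""
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        fills.append(strategy(observed))
    return [[fills[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def kmeans(rows, k, n_iter=100):
    """Plain Lloyd's algorithm; the first k rows are used as initial centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(n_iter):
        # Assign every row to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centroids[c])))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy data: two variables, None marks a missing value.
data = [
    [1.0, 2.0],
    [9.0, 9.5],
    [1.2, None],
    [None, 9.0],
    [0.8, 1.9],
    [8.8, 9.2],
]
complete = impute_columns(data)          # pass strategy=median for median imputation
labels, centroids = kmeans(complete, k=2)
```

Note that mean imputation pulls the filled-in values toward the centre of the data, which can blur the boundaries between clusters, so it is worth comparing the result against a run on the complete rows only.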

Best regards.

Balqish Hassan

You can use the mean value for the missing data. That's what I did for my cluster analysis using SPSS.

Christopher Brooke

Thanks for the input. knnImputation with the cluster means seems to have worked well.
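For readers unfamiliar with the idea, k-nearest-neighbour imputation can be sketched roughly as follows. This is not a reimplementation of the knnImputation function mentioned above, only a simplified Python illustration under stated assumptions: neighbours are drawn from the complete rows only, distance is Euclidean over the variables observed in the incomplete row, and the fill value is the unweighted mean of the neighbours. The function name `knn_impute` and the toy data are hypothetical.

```python
from math import sqrt

def knn_impute(rows, k=2):
    """Fill each None with the mean of that column over the k nearest complete rows."""
    complete_rows = [r for r in rows if None not in r]
    if len(complete_rows) < k:
        raise ValueError("need at least k complete rows to impute from")
    result = []
    for r in rows:
        if None not in r:
            result.append(list(r))
            continue
        # Compare rows only on the variables that are observed in r.
        observed = [j for j, v in enumerate(r) if v is not None]

        def dist(other):
            return sqrt(sum((r[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete_rows, key=dist)[:k]
        result.append([
            v if v is not None else sum(n[j] for n in neighbours) / k
            for j, v in enumerate(r)
        ])
    return result

# Hypothetical toy data: None marks a missing value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [9.0, 9.5, 8.0],
    [8.8, 9.4, 8.2],
    [1.05, None, 3.1],
    [None, 9.6, 8.1],
]
filled = knn_impute(data, k=2)
```

If the variables are on very different scales, they should be standardised before computing distances, otherwise the variable with the largest range dominates the choice of neighbours.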

Erik Cuevas

One way is to fill in the missing values through the use of a probabilistic model.

Giuseppe Biondi-Zoccai

You can find here some additional and useful suggestions: https://www.displayr.com/5-ways-deal-missing-data-cluster-analysis/

Yasar Sattar

I recommend using osample or esample; try running entropy matching with ebalance in Stata.

Ali O Ilhan

I think for any case with missing data, the key, at least for me, is to understand why the data is missing in the first place. Is there a pattern? Is the "missingness" seemingly random, or is there a process, which we can define to some degree, that creates the missing observations? As noted in the other answers, there are many approaches to missing data, but choosing one over another (typically) rests on a different set of assumptions, and that again brings me back to the fundamental question of why the data is missing in the first place.

Sorry, I hit add before finishing. Of course this is not my area, but I am guessing there are specific mechanisms whereby some species are more likely to be missing than others. If those causes are known, they can be used in a model-based imputation method. In my experience, imputation has a craft side to it, and blindly choosing one method over another based on some purely exogenous mathematical criteria generally does not do the trick.
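A simple first check for a pattern is to tabulate how often a variable is missing within each group. The Python sketch below does that for a hypothetical data set; the field names `species` and `abundance`, the function name `missing_rate_by_group`, and the values themselves are invented for illustration. A large difference between groups suggests the missingness is not completely random, although a table like this cannot by itself say why.

```python
from collections import defaultdict

def missing_rate_by_group(records, group_key, value_key):
    """Return {group: fraction of records in that group whose value is None}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        g = rec[group_key]
        totals[g] += 1
        if rec[value_key] is None:
            missing[g] += 1
    return {g: missing[g] / totals[g] for g in totals}

# Hypothetical records: None marks a missing observation.
records = [
    {"species": "A", "abundance": 12},
    {"species": "A", "abundance": 15},
    {"species": "A", "abundance": None},
    {"species": "A", "abundance": 9},
    {"species": "B", "abundance": None},
    {"species": "B", "abundance": None},
    {"species": "B", "abundance": 3},
    {"species": "B", "abundance": None},
]
rates = missing_rate_by_group(records, "species", "abundance")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} missing")
```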