Can anyone suggest a few recent research papers related to cluster analysis and cluster evaluation techniques in data mining?

Hi,

Could you describe your motivation? Then it will be easier to give you some suggestions. I have been working with DM and clustering for three years!

Looking forward to hearing from you.

Regards

Muddsair

Aman Kothari

Hi Muddsair,

Thank you for showing interest.

I am new to data mining. Basically, I want to study clustering algorithms and use existing evaluation methods to verify their performance on various data sets.

Saptarsi Goswami

Hello

I would recommend that you start with the following paper:

Jain, Anil K. "Data clustering: 50 years beyond K-means." Pattern recognition letters 31.8 (2010): 651-666.

You will find the PDF link on Google Scholar. Also, clustering is a well-documented topic, so I recommend reading the following book before you read the papers.

Introduction to Data Mining

Pang-Ning Tan, Michigan State University,

Michael Steinbach, University of Minnesota

Vipin Kumar, University of Minnesota

http://www-users.cs.umn.edu/~kumar/dmbook/index.php

Happy researching

Andreas Kanavos

You can take a look at this book as well

Introduction to Information Retrieval

http://nlp.stanford.edu/IR-book/

Regards,

Andreas

Alok Ranjan

Dear Aman,

Here are a few research articles that may suit your area of interest, to warm up...

1. AN ANALYSIS ALGORITHM AND APPLICATION OF CLUSTER IN DATA MINING

2. Research and improvement of clustering algorithm in data mining

3. Using cluster analysis for data mining in educational technology research

4. Data mining: Concepts and techniques

5. Top 10 algorithms in data mining

6. Data mining with big data

7. Survey of Multiobjective Evolutionary Algorithms for Data Mining: Part II

8. Also go through this recent book: Introduction to data mining with case studies (2014)

Best wishes

Anwar Ali Yahya

Hi Everybody

One of the latest trends in data clustering is based on Swarm Intelligence algorithms. Here is a useful and recent review paper on this:

"Research on particle swarm optimization based clustering : A systematic review of literature and techniques"

Shafiq Alam, Gillian Dobbie, Yun Sing Koh, Patricia Riddle, Saeed Ur Rehman

Richard Hyde

If I can blow my own trumpet and point you towards my recent work:

Data Density Based Clustering (DDC)

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6930157

This is a fast density based clustering technique using recursive density estimation. The easiest way to describe it is that it is similar in approach to subtractive clustering, but has adaptive cluster radii and is infinitely faster. It is on a par with k-means for speed, but without the need to 'know the answer' first and provide the number of clusters.

Perhaps more interesting is:

A Fully Autonomous Data Density Based Clustering Technique

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7009512

A development of the technique above that requires no user input whatsoever!

It uses the density measure along each axis to estimate the initial radii for DDC. The adaptive radii in DDC allow for the estimate to be very approximate and still provide the same clusters.

Both of the above are best used with data in hyper-ellipsoid type clusters and with larger datasets. I am currently completing an extension for both of these that finds arbitrarily shaped clusters.

Joseph Alexander Brown

I guess I will get on the "ask to be included" bandwagon.

I would love to see some extensions to my work on extending K-means if you find it an interesting algorithm. If you have any questions about it, I would be happy to provide more information.

K-Models Clustering, a Generalization of K-Means Clustering

Jeannie Fitzgerald

You may find the clustering algorithms in the Python scikit-learn library useful for experimentation. The documentation provides quite a lot of useful information on the relative strengths of the various approaches, together with details of suitable performance metrics:

http://scikit-learn.org/stable/modules/clustering.html
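
As a starting point for that kind of experimentation, here is a minimal sketch, assuming scikit-learn is installed. It clusters a synthetic data set with k-means and scores the result with one internal metric (silhouette coefficient, which needs no ground truth) and one external metric (adjusted Rand index, which compares against known labels). The function name `evaluate_kmeans`, the blob centres and the other parameter values are illustrative choices of mine, not anything prescribed by the library or by the papers mentioned in this thread.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score


def evaluate_kmeans(n_clusters=3, seed=0):
    """Cluster a synthetic data set with k-means and score the result.

    Hypothetical helper for illustration only.
    """
    # Synthetic data: three well-separated Gaussian blobs (assumed setup).
    X, y_true = make_blobs(
        n_samples=300,
        centers=[[0, 0], [6, 6], [-6, 6]],
        cluster_std=0.8,
        random_state=seed,
    )

    # Fit k-means and get one cluster label per sample.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(X)

    return {
        # Internal metric: uses only the data and the predicted labels.
        "silhouette": silhouette_score(X, labels),
        # External metric: agreement with the known generating labels.
        "ari": adjusted_rand_score(y_true, labels),
    }


if __name__ == "__main__":
    # Compare a few candidate values of k on the same data.
    for k in (2, 3, 4):
        print(k, evaluate_kmeans(n_clusters=k))
```

On real data sets you usually have no true labels, so only internal metrics such as the silhouette coefficient apply. Swapping `KMeans` for another estimator from the page above (for example `DBSCAN` or `AgglomerativeClustering`) lets you compare algorithms under the same metrics.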
