How do you know whether the reduced number of features is sufficient? Is there any rule of thumb for it? Any good ideas in this direction are highly appreciated.
It depends on which dimensionality reduction method you are using. If you are using PCA or SVD, you can measure how much of the variance your reduced number of dimensions retains (95% is a common target). If you mean this from a feature selection perspective, it is quite subjective and will depend on your feature selection method, the types of features, and the domain. What I usually do is run experiments with all the features, then use correlation analysis, information gain, gain ratio, etc. to identify redundant features, remove them, and run the experiments again. This tells you how many features you can remove while still maintaining decent results. Hope this helps!
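As a rough sketch of the PCA/SVD variance criterion mentioned above: compute the per-component variance from the singular values of the centred data and keep the smallest number of components whose cumulative share reaches the threshold. The function name and the synthetic data here are my own illustration, not from any particular library.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `threshold`."""
    Xc = X - X.mean(axis=0)                      # centre the data
    # squared singular values are proportional to per-component variance
    _, s, _ = np.linalg.svd(Xc, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # first index where the cumulative ratio reaches the threshold
    return int(np.searchsorted(cumulative, threshold) + 1)

rng = np.random.default_rng(0)
# synthetic data: four features with sharply decreasing variance
X = rng.normal(size=(500, 4)) * np.array([10.0, 5.0, 1.0, 0.1])
k = n_components_for_variance(X, threshold=0.95)
print(k)  # here the first two components already explain >95% of the variance
```

With scikit-learn you would get the same number from `PCA(n_components=0.95)`, which accepts a variance fraction directly.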
Put simply, imagine that you select two features. While testing, you realise that the values of both features increase or decrease together. This means the two features behave in the same way, so if you keep one and discard the other, there will be no effect on the results.
To judge how much dimensionality reduction is enough, look at the results. If they are good enough, the redundant features have been removed and the system is working well.
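The "both features move together" idea above is exactly what a pairwise correlation check captures: if two features are almost perfectly correlated, one of them can be dropped. A minimal sketch of that heuristic (the function, names, and 0.95 cut-off are my own illustration):

```python
import numpy as np

def drop_correlated(X, names, threshold=0.95):
    """Greedily drop the later feature of any pair whose absolute
    Pearson correlation exceeds `threshold` (a common heuristic)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # keep feature j only if it is not too correlated with anything kept so far
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

rng = np.random.default_rng(1)
a = rng.normal(size=300)
b = 2.0 * a + rng.normal(scale=0.01, size=300)   # near-duplicate of a
c = rng.normal(size=300)                          # independent feature
X = np.column_stack([a, b, c])
print(drop_correlated(X, ["a", "b", "c"]))        # b is redundant given a
```

After dropping, rerun your experiments as suggested above to confirm the results are unchanged.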
As noted above, explained variance is the best indicator for deciding how far to reduce the feature dimensionality.
Could you elaborate on your question: "... is sufficient?" Sufficient for what purpose? What do you intend to do with the reduced feature set? Whether a given reduced set is sufficient depends on what you do with it and how successful that attempt is, so stating the intended purpose may help elicit a more focused response.
I agree with Ahmed that Rough Set Theory (RST) actually deals with the sufficiency of selected features; the relevant concept is known as a reduct.
It gives you something more than a rule of thumb: RST provides a mathematical basis for showing that a certain subset of features is sufficient for the task at hand (usually classification).
However, one prerequisite for RST reduct-finding algorithms to work (there is no free lunch) is that continuous (real-valued) features must be discretised first. Then again, discretisation is needed in practically every classification (or even unsupervised learning) algorithm of this kind.
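The core RST sufficiency test is easy to state on an already-discretised decision table: a feature subset is sufficient if no two objects agree on all features in the subset yet carry different decision labels. The sketch below shows only that consistency check, not a full reduct-finding algorithm; the function name and toy table are my own illustration.

```python
def preserves_decision(rows, labels, subset):
    """RST-style sufficiency check: a feature subset is sufficient if no
    two objects agree on every subset feature but differ in label."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        # setdefault stores the first label seen for this key;
        # a mismatch later means the subset cannot separate the classes
        if seen.setdefault(key, label) != label:
            return False
    return True

# toy decision table with three discrete features and a class label
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
labels = ["a", "a", "b", "b"]
print(preserves_decision(rows, labels, [0]))   # feature 0 alone is sufficient
print(preserves_decision(rows, labels, [2]))   # feature 2 alone is not
```

A reduct is then a minimal subset that passes this check; reduct-finding algorithms search over subsets for exactly this property.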