Should I keep or remove duplicated features?

Khurram Hameed

If the features are 100% the same, they can result in biased training, e.g. an imbalanced dataset. Yes, J. Rafiee is right: it's hard to say whether the features are exactly the same. So check the results: if using all features (including the similar ones) makes the NN training biased, then try removing them.
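One way to run that check is to train the same network twice, once with the duplicated columns and once without, and compare cross-validated accuracy. The snippet below is only a sketch under assumptions: the data is synthetic (`make_classification`), the duplicates are simulated by copying the first two columns, and the network size and `cv=5` are arbitrary choices, not settings taken from this thread.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 300 samples, 6 features.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0, random_state=0
)

# Simulate duplicated features by appending exact copies of columns 0 and 1.
X_dup = np.hstack([X, X[:, :2]])


def mean_cv_accuracy(features, labels):
    """Mean 5-fold accuracy of a small MLP on standardized features."""
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    )
    return cross_val_score(model, features, labels, cv=5, scoring="accuracy").mean()


acc_with = mean_cv_accuracy(X_dup, y)
acc_without = mean_cv_accuracy(X, y)
print(f"with duplicates:    {acc_with:.3f}")
print(f"without duplicates: {acc_without:.3f}")
```

On real data, replace `X`, `X_dup` and `y` with your own arrays. A small gap between the two scores can simply be noise from the random initialization, so it is worth repeating the comparison with several `random_state` values before drawing conclusions.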

Amin Honarmandi Shandiz

Dear Meriem Ferdjouni,

Of course, if you remove your duplicate features your accuracy will decrease: with duplicate values it is easier for your model to predict, so accuracy increases. You are supposed to remove them to prevent overfitting.

Riadh Ayachi

Try to clean the data using robust techniques, or preprocess it so that the training data is balanced. Besides, you can apply pruning techniques to remove redundant features safely, without much loss in accuracy.
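As a rough illustration of removing redundant features, here is a simple correlation filter. This is an assumption about what "pruning" could look like, not a specific method named in this thread: the function name `drop_highly_correlated`, the 0.95 threshold and the toy data are all made up for the example.

```python
import numpy as np


def drop_highly_correlated(X, threshold=0.95):
    """Return indices of columns to keep.

    A column is kept only if its absolute Pearson correlation with every
    previously kept column is below `threshold`. Assumes no constant
    columns (their correlation is undefined and shows up as NaN).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep


# Toy data (assumption): column 2 is a linear copy of column 0,
# column 3 is column 0 plus a little noise, column 1 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, 2 * a + 1, a + 0.01 * rng.normal(size=200)])

keep = drop_highly_correlated(X)
X_pruned = X[:, keep]
print("kept columns:", keep)
```

The filter only catches linear redundancy, and it always keeps the first column of a correlated group, so the column order matters.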

Bahaa Saif

Preparing data is a very important step in any MLP. In your project, accuracy (ACC) decreased after removing the duplicated features, which means that these features aren't the same and are important for model fitting.

Meriem Ferdjouni

J. Rafiee, Bahaa Saif, thank you for answering.

When I checked the duplicated features, I found them to be 100% identical. However, I am not using the MLP as a final model. It is used for evaluating feature selection techniques.

It was selected because my classification problem is nonlinear.
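For anyone who wants to verify exact duplicates in the same way, a check along these lines is one option. It is a minimal sketch on a made-up array, not the code used in this project: `np.unique` with `axis=1` and `return_index=True` returns the position of the first occurrence of each distinct column.

```python
import numpy as np

# Toy feature matrix (assumption): column 2 is an exact copy of column 0.
X = np.array(
    [
        [1.0, 2.0, 1.0, 5.0],
        [3.0, 4.0, 3.0, 6.0],
        [0.0, 7.0, 0.0, 8.0],
    ]
)

# Indices of the first occurrence of each distinct column.
_, first_idx = np.unique(X, axis=1, return_index=True)

# Sort so the surviving columns keep their original order.
keep = np.sort(first_idx)
dropped = np.setdiff1d(np.arange(X.shape[1]), keep)

X_dedup = X[:, keep]
print("kept columns:   ", keep.tolist())
print("dropped columns:", dropped.tolist())
```

This only finds columns that match exactly, value for value. Columns that differ by rounding or scaling are not flagged and need a similarity measure instead.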

Khurram Hameed, thank you for answering.

Could you please elaborate on how removing features/columns can cause data imbalance?

Thank you.

Abhishek D. Patange

See if you can use deep learning-based networks, as they involve automatic feature extraction rather than handcrafted features. That way there won't be any confusion over duplicate features.

Mohammad H. Nadimi-Shahraki

The following article can be useful too.

Article B-MFO: A Binary Moth-Flame Optimization for Feature Selectio...

Ahmed Refaat Ragab

Use this file