Zeynab Mousavikhamene Without feature selection, feature extraction, or attribute selection you can still perform classification, but the results will not be as accurate. If you select the right features or attributes, the model concentrates on the informative ones and the results improve.
Feature/attribute correlation is an important step in the feature selection/reduction phase of data pre-processing, especially for datasets whose features are continuous.
If the goal is to reduce redundancy caused by having many features, or if a large dataset is slowing down classification, then you should keep checking the correlations until you are satisfied; there must be a balance between classification performance and the number of features used. One common way to run that loop is sketched below.
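Here is a minimal sketch of correlation-based redundancy pruning, assuming a pandas DataFrame `X` of continuous features; the 0.9 cut-off is an arbitrary starting point, and in practice you would re-train the classifier after pruning and adjust the threshold until the accuracy/feature-count trade-off satisfies you.

```python
# Minimal sketch: drop one feature from every highly correlated pair.
# `X` and the 0.9 threshold are assumptions for illustration.
import numpy as np
import pandas as pd

def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    corr = X.corr().abs()
    # Keep only the upper triangle so each pair is considered exactly once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)
```

Using only the upper triangle of the correlation matrix ensures each pair is tested once, so only one feature from each highly correlated pair is removed rather than both.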
It is not strictly necessary to check the correlation between features in order to build a classification model, but if you know how to do it, you should. If you have many features and most of them are irrelevant to your predictions, your model will probably not reach its full potential.
Understanding which data are related to what you are trying to predict can significantly help your model achieve higher accuracy. At the same time, training time is reduced because of the lower dimensionality of the data.
Let me give you a simple example. Let's say you want to predict whether a client of a car insurance company will cost money based on his/her characteristics. You already have information about past clients and whether they requested money to cover their car expenses or not. The data you have for the clients are: name, surname, age, years of driving experience, number of past accidents, marital status, address, etc. Some of this information will be very useful for your model, like age and years of experience, while other fields, like name and surname, will not help your model at all. So you will be better off leaving that information out of your training data. A rough sketch of how you might rank such features is shown below.
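As a rough sketch of ranking features by their relationship to the target, consider a toy version of the insurance example; all column names and values here are invented for illustration, not taken from any real dataset.

```python
# Hypothetical insurance data: which features track the claim outcome?
import pandas as pd

df = pd.DataFrame({
    "age":              [23, 45, 31, 52, 19, 60],
    "years_experience": [2, 20, 10, 30, 1, 35],
    "past_accidents":   [1, 0, 2, 0, 3, 0],
    "made_claim":       [1, 0, 1, 0, 1, 0],   # target: requested money or not
})

# Pearson correlation of each numeric feature with the binary target
# (equivalent to the point-biserial correlation for a 0/1 target).
relevance = df.drop(columns="made_claim").corrwith(df["made_claim"]).abs()
print(relevance.sort_values(ascending=False))
```

Identifier-like columns such as name and surname would simply be dropped before this step, since they carry no predictive signal.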
If you are not sure whether a feature is relevant, it is better to include it, because you do not want to lose helpful information. If you can, train your classification model on all the features first, then remove the ones you believe are irrelevant and try again. You can compare the two results and decide which model to use based on their accuracy, as in the sketch below.
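A hedged sketch of that "try both and compare" approach, using scikit-learn cross-validation; the dataset, the classifier, and the choice of which columns to prune are all assumptions made for illustration.

```python
# Compare a model trained on all features against one trained on a subset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0)

# Baseline: cross-validated accuracy with every feature included.
full_score = cross_val_score(model, X, y, cv=5).mean()

# Prune a group of columns you suspect are redundant (an assumption here).
X_pruned = X.drop(columns=[c for c in X.columns if c.startswith("worst")])
pruned_score = cross_val_score(model, X_pruned, y, cv=5).mean()

print(f"all features: {full_score:.3f}  pruned: {pruned_score:.3f}")
```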
We must not forget that our models can sometimes find correlations between features that we could not spot with data analysis. That is the reason we use these models, after all, isn't it?
I have done feature analysis in some of my articles after (sic!) I got a sufficient result with the classification, mainly as an insight into the features, to explain to the reader and to myself which features were most important. As I can see, you have also done the classification already. So you first hypothesized about which features to use, then used them, and only after that came to the question: how do I check which features were correct to use? I think that, at present, we mainly rely on expert opinion about the features to build classifiers, because the statistical instruments are not ready for modern classification tasks, as data science and computer methods of data aggregation are developing very fast. I tend to think there are either no methods yet to bind the correlation of features to classification performance, or these methods are poorly rationalized and, hence, not reliable. Therefore, if you choose to do a statistical analysis of features, you will end up having to rationalize the statistical method first. However, the effort might pay off and you will contribute to the field. We try to do something like that in the pre-print I attach below.
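For the post-hoc insight described above (train first, then ask which features the model actually relied on), permutation importance is one common choice; this is a minimal sketch under assumed data and model, not a claim about the method used in the attached pre-print.

```python
# Train a classifier, then measure how much test accuracy drops when each
# feature is shuffled: features whose shuffling hurts most mattered most.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)

# Report the five most influential features on held-out data.
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[idx]:<25} {result.importances_mean[idx]:.3f}")
```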