How to pre-process a data set for a machine learning task?

Chanchal Suman

Min-max normalization can be used for the continuous features; the binary features need no pre-processing. I am not entirely sure, but I think it could give good results. You can refer to this page. I hope it helps.

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html
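To make the idea concrete, here is a minimal sketch of min-max scaling written in plain Python, so it runs without any library. It rescales each continuous column to [0, 1] with (x - min) / (max - min) and leaves the binary column untouched. The column names and values are made up for illustration, and mapping a constant column to 0.0 is just one possible choice. In practice the MinMaxScaler linked above does the same kind of rescaling.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: there is no spread to rescale.
        # Mapping everything to 0.0 is an assumption made for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy data: two continuous features and one binary feature.
rows = [
    {"age": 20, "income": 1000.0, "is_member": 0},
    {"age": 30, "income": 3000.0, "is_member": 1},
    {"age": 40, "income": 2000.0, "is_member": 1},
]
continuous = ["age", "income"]  # only these columns are scaled

# Scale each continuous column independently.
scaled_columns = {
    name: min_max_scale([row[name] for row in rows]) for name in continuous
}

# Rebuild the rows; the binary column "is_member" is copied through unchanged.
scaled_rows = [
    {**row, **{name: scaled_columns[name][i] for name in continuous}}
    for i, row in enumerate(rows)
]

for row in scaled_rows:
    print(row)
```

If you later split the data into training and test sets, the minimum and maximum would normally be taken from the training part only and reused for the test part.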

Abdulkader Helwan

It depends on your data. However, normalization and feature selection can be enough.

Anwar Said

Various pre-processing steps can be performed, depending on your data and your research problem. The most common steps are:

1) Explore the data in the initial stage by plotting it, to understand its nature. You can also plot the correlations in the data.

2) Describe your data, for example by finding the mean, mode, median and range, to see its distribution. Different visualizations (histogram, line graph, box plot) can help you in this step.

3) After the first two steps you should understand your data well enough to clean it: remove inconsistent values, duplicate records, missing values, invalid data, outliers, etc. to prepare the data for applying machine learning models.

You can also perform normalization, feature selection, dimensionality reduction, etc. if required.
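The describe-then-clean steps above can be sketched roughly as follows, using only the Python standard library. The records, the field names and the 1.5 x IQR outlier rule are assumptions chosen for illustration, not something prescribed in this thread; a real project would more likely use pandas for the same operations.

```python
import statistics

# Hypothetical toy records; None marks a missing value.
records = [
    {"id": 1, "height_cm": 170.0},
    {"id": 2, "height_cm": 165.0},
    {"id": 2, "height_cm": 165.0},   # duplicate record
    {"id": 3, "height_cm": None},    # missing value
    {"id": 4, "height_cm": 172.0},
    {"id": 5, "height_cm": 168.0},
    {"id": 6, "height_cm": 175.0},
    {"id": 7, "height_cm": 1700.0},  # probably a data-entry error
]


def describe(values):
    """Summary statistics: mean, median, mode and range."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
    }


# Describe the raw, non-missing values to see the distribution first.
raw_values = [r["height_cm"] for r in records if r["height_cm"] is not None]
summary_before = describe(raw_values)

# Remove exact duplicate records, keeping the first occurrence.
seen = set()
unique = []
for r in records:
    key = tuple(sorted(r.items()))
    if key not in seen:
        seen.add(key)
        unique.append(r)

# Remove records with a missing value.
complete = [r for r in unique if r["height_cm"] is not None]

# Remove outliers with the 1.5 * IQR rule (an assumed, common heuristic).
q1, _, q3 = statistics.quantiles([r["height_cm"] for r in complete], n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [r for r in complete if low <= r["height_cm"] <= high]

summary_after = describe([r["height_cm"] for r in cleaned])
print("before:", summary_before)
print("after: ", summary_after)
```

Comparing the two summaries shows how much a single extreme value can distort the mean and the range, which is why the exploration step comes before any modelling.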