How to determine the appropriate pre-processing technique for artificial neural networks (ANNs)?

More Mahmoud Omid's questions See All

Is time really equal to money?

When someone says “Time is Money” they are also referring to what is called economic cost. Economic cost is the cost of deciding to do one thing over another. Why time equals money, but money may...

03 April 2018 7,490 98 View

Should there be ZERO guns or a more control of guns by civilians?

U.S.'s gun crisis. Everyone was so shocked and horrified by Las Vegas attack News. 59 people were killed and at least 527 were hurt Sunday night when Stephen Paddock rained gunfire on...

09 October 2017 388 72 View

How to tell if someone is lying?

Current US presidential election forced me to ask this question. As we walk around we are faced with all kinds of liars: politicians, combative locals who don’t want to tell you where the bad guys...

09 October 2016 6,055 88 View

What is the most beautiful theorem in mathematics?

Is it Euler’s identity eiπ + 1 = 0 where e is Euler's number (2.718), the base of natural logarithms, i is the imaginary unit, which satisfies i2 = -1, and π (3.14) is the ratio of the...

04 May 2016 1,846 80 View

Is it acceptable to use cell phones in classroom?

To Ban or Not to Ban (cell phones, tablets and iPads) in classroom , What's Your Policy?

03 April 2016 2,105 88 View

Why are those studying Humanities face with humiliation these days?

People who are studying Humanities (philosophy, ancient and modern languages, literature, ...) have declined steadily since the 1970s. Why? Recently, I read in Internet about Republican...

02 March 2016 2,023 66 View

What is the most visited tourist attraction in your country?

Without prejudges, Eiffel Tower is perhaps the most famous monument in the world, and the icon for Paris and France, If not, what do you think is the most famous monument around the...

02 March 2016 4,117 89 View

Why do politicians lie?

The reasons politicians often lie is because the public doesn't want to hear the truth. People want to hear what they want to hear. Why do they get away with lying? Why do we let them lie to get...

02 March 2016 10,265 92 View

Would you break the law to save a loved one?

Why? Do you have an example? Should it be no law on loving a person?

31 December 2015 3,456 89 View

What was the best (good) News for you in 2015?

When reading Newspapers, watching the News or looking at Internet, it can often feel as if the world is a very sad place, with the headlines delivering a conveyor belt of bad news, tragedy and...

11 December 2015 9,713 24 View

Feedback defines the constitution of an organism?

“Here is a thought experiment. Let's place Rodolpho Llinas's jarred-brain on top of a body (Fig. 1). I bet Llinas would argue that his jarred-brain retains its own consciousness, and the android...

11 August 2024 2,483 1 View

How to learn more about SPSS and its Application?

I would like to learn more about SPSS and Its application especially in regards to data analysis. Please suggest me how I can learn more about it. Thank you so much.

11 August 2024 9,101 4 View

Text-Communication from the M1 Hand Area using BCI—and then there is Elon Musk?

Willett, Shenoy et al. (2021) have developed a brain computer interface (BCI) that used neural signal collected from the hand area of the motor cortex (area M1) of a paralyzed patient. The...

10 August 2024 7,180 0 View

Handling Missing Data and Building a Predictive Model with Incomplete Information ?

I am developing a predictive model for a water supply network that involves 20 influencing points. However, I only have historical data for 10 out of these 20 points. I would like to know how to...

10 August 2024 4,005 2 View

Has anyone applied Python in the field of textile engineering for data analysis, automation, or smart textiles?

I'm currently exploring the application of Python in textile engineering, specifically in areas like data analysis, process automation, and the development of smart textiles. I'm interested in...

10 August 2024 7,429 2 View

Which Scopus Journal provides the most affordable fees?

"PUBLISHING IN A SCOPUS JOURNAL" Researchers are now at a cross road. The critical need to publish in a Scopus or ISI, etc journal is ever vital. Journal Publication fees must be submitted....

10 August 2024 8,621 1 View

Seeking Advice on Viability and Execution of Undergraduate Thesis Topic?

Hello everyone, I am currently developing a thesis proposal and would appreciate your input on its viability and how to effectively carry it out. My proposed topic is: "Does the perceived threat...

10 August 2024 8,992 0 View

Can we mark 'EFL Learners shifting from general digital to AI technologies' as technological transition?

After COVID-19 it has seen that EFL learners technological affiliation has raised. In addition, in the post-COVID period learners started to engage AI technologies like ChatGPT while learning...

08 August 2024 8,964 4 View

Who will be moral responsible for the death of thousands of people in the event of an earthquake?

Who will bear moral responsibility for the deaths of thousands of people in the event of an earthquake? Weeks and months remain before the onset of strong earthquakes that bring death to...

08 August 2024 6,134 12 View

What are examples of AI for good projects a teacher can assign to students?

So I am organizing an AI seminar. What are possible AI projects in the AI for good spirit? something the students can do and have an impact?

08 August 2024 9,437 4 View

Mahmoud Omid Popular answer

Normalization is required so that all the inputs are at a comparable range. The general approach to data normalization is part of the data exploration which include:

Data exploration (sometime called data clean-up)
Data transformation
Outlier detection and removal
Data normalization
Data analysis
Validation of results

Ehsan Mirsadeghi

I like to look at your question in the field of feature extraction/selection.

As a result, you are able to analysis and process your data to prepare them as inputs for NN.

For this, you can look for methods for feature extraction. On the other hand, you can project your data to another space which is more robust to noise and outliers or you may use dimension reduction methods like PCA and LDA to find more important elements and discard poor and non-valuable elements.

It's clear that no exact solution exist for your problem. It is a good idea to use different transfer functions, number of hidden layers, training function, normalization methods and etc to find best architecture and parameters of NN for your application.

Jabar H. Yousif

The pre-processing stage (encoding) is the process that transforms the input data into a suitable form, which the network can identify and use. Essentially, every input in the network will associate with a bit-vector of output .It aims to increase the number of significant locations and decrease the usage of memory storage. Then the error function (Bishop 1995) or the cost function is used to measure the distance between the targets and the outputs of the network. Then the weights of the network are updated in the direction that makes the error function minimum. However, there is several error functions can be used for training neural networks such as Sum-of-squares error, City-block, Cross-entropy (single & multiple), Kohonen, Cross-entropy, etc.

So, it is not depend on using 0, 1 notation or the bipolar notation -1 for 0 , 1 for 1. It is depend on the behavior of data-sets pair (input, target).

Maryam Nikzad

Although the input of neural network can be in any range, there is a saturation effect so that the unit is only sensitive to inputs within a fairly limited range. Numeric values have to be scaled into a range which is appropriate for the network. Typically, raw variable values are scaled linearly. In some circumstances, nonlinear scaling may be appropriate (for example, if you know that a variable is exponentially distributed, you might take the logarithm.)

Mahmoud Omid

Krishnan Umachandran

ANN - Processing Units - ANN consists of a pool of simple processing units that communicate by sending signals to each other over a large number of weighted connections. Each unit performs a relatively basic task; receive input from neighbors or external sources and utilize that information to compute an output signal that is propagated to other units in the ANN.

http://www.dhtusa.com/media/NeuralNetworkIntro.pdf

Behrouz Ahmadi-Nedushan

The objectives of data preprocessing include size reduction of the input space, smoother relationships, data normalization, noise reduction, and feature extraction.

I recommend reading chapter 15 of the book: Fuzzy neural intelligent systems: Mathematical foundation and the applications in engineering (see attached PDF).

Data normalization can provide a better modeling and avoid numerical problems. Several algorithms can be used to normalize the data:

1) Min-Max Normalization, min-max normalization is a linear scaling algorithm. It transforms the original input range into a new data range (typically 0-1).

2) Zscore Normalization, in Zscore normalization, the input variable data is converted into zero mean and unit variance.

3) Sigmoidal Normalization, sigmoidal normalization is a nonlinear transformation. It transforms the input data into the range -1 to 1, using a sigmoid function

The outliers of the data points usually have large values. In order to represent those large outlier data, the sigmoidal normalization is an appropriate approach.

Li, H., Chen, C. P., & Huang, H. P. (2000). Fuzzy neural intelligent systems: Mathematical foundation and the applications in engineering. CRC Press.

http://www.sciencedirect.com/science/article/pii/S2212017313003137

Using highly variable data for training ANN models may induce "overrating". So for practical reasons as well as making all the inputs at a comparable range normalizing or standardizing the inputs in ANNs can make training faster and reduce the chances of getting stuck in local optima. If you use Matlab you can use either mapminmax or mapstd as follows:

[yn, ps] = mapminmax(ymin, ymax)

MinMax or mapminmax algorithm: y = (x-xmean)*(ystd/xstd) + ymean;

[yn, ps] = mapstd(ymean, ystd)

Z-Score or mapstd algorithm: y = (x-xmean)*(ystd/xstd) + ymean;

where yn is normalized data.

http://www.mathworks.com/help/nnet/ref/mapminmax.html

http://www.mathworks.com/help/nnet/ref/mapstd.html

One common problem with MLP is saturation When the weighted sum is big, the output of the tanh (or sigmoid) saturates, and the gradient tends towards 0. Hence, we should initialize the parameters such that the average weighted sum is in the linear part of the transfer function.

Normalization for regression and classification problems

Normalization of Inputs:

normalized with zero mean and unit variance,

Normalizatioh of Targets:

For regression problems: normalized with zero mean and unit variance,
For classification problems:
output transfer function is tanh: 0.6 and -0.6
output transfer function is sigmoid: 0.8 and 0.2

See Prof. Leon Bottou’s thesis for details:

Léon Bottou: Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole, Ph.D. thesis, Université de Paris XI, Orsay, France, 1991.

http://leon.bottou.org/papers

Mahamad Nabab Alam

Normalization of input and target data is a key for properly training the artificial neural network.