Let's assume we have a standard feedforward ANN with just a single hidden layer. It is standard practice to normalize the input data, usually to [0,1] or [-1,1]. Let's assume min-max normalization. If we have a sigmoid activation function, wouldn't it be more sensible to normalize to a range like [-4,4] or [-5,5]? The sigmoid is essentially linear on roughly [-2,2], so if we normalize to [-1,1] the network operates mostly in that near-linear region and the approximated function is linear for the most part. One might argue that, for certain weights, the input to the sigmoid can still fall outside the normalized range, but that is generally the exception (and depends on what values the weights take).
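
To illustrate the near-linearity claim, here is a minimal sketch (Python/NumPy, with the two candidate ranges above; the tangent line at 0 has slope 0.25 and intercept 0.5):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Compare the sigmoid to its tangent line at 0 over the two candidate input ranges.
for lo, hi in [(-1.0, 1.0), (-4.0, 4.0)]:
    z = np.linspace(lo, hi, 201)
    linear_approx = 0.25 * z + 0.5
    max_dev = np.max(np.abs(sigmoid(z) - linear_approx))
    print(f"range [{lo}, {hi}]: max deviation from linearity = {max_dev:.3f}")

# On [-1,1] the deviation stays around 0.02, i.e. the unit behaves almost linearly;
# on [-4,4] it grows to about 0.5, so the nonlinearity is actually exercised.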

As for how to initialize the weights: a common rule is to draw them uniformly from the range [-b,b], where b = 1 / sqrt(N_input + N_hidden) (assuming a sigmoid activation function).
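
A sketch of that initialization rule (the layer sizes here are hypothetical; NumPy assumed):

import numpy as np

n_input, n_hidden, n_output = 10, 5, 1      # hypothetical layer sizes
b = 1.0 / np.sqrt(n_input + n_hidden)       # the bound discussed above

# Input-to-hidden and hidden-to-output weights drawn uniformly from [-b, b]
W_ih = np.random.uniform(-b, b, size=(n_hidden, n_input))
W_ho = np.random.uniform(-b, b, size=(n_output, n_hidden))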

It is usually said that small weights are preferable (see also regularized error functions that penalize large weights), since large weights are more likely to lead to overfitting.
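
As one concrete example of such a penalty, an L2 (weight decay) term can be added to the data error; this is only a sketch, with lam as a free hyperparameter:

import numpy as np

def regularized_error(data_error, weights, lam=0.01):
    # L2 penalty: large weights increase the total error, which pushes
    # training toward smaller weights and smoother, less overfit functions.
    return data_error + 0.5 * lam * sum(np.sum(W**2) for W in weights)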

Any thoughts?
