Unpredictable performance under random initialisation of network weights?

Dear Matthew Gadd

If we make the randomness predictable, we can achieve consistent results.

To make the randomness predictable, we use the concept of a seed.

A seed gives predictable, repeatable results on every run.

If we do not set the seed, we get different random numbers at every invocation.

Setting the seed to some fixed value, say 0 or 123, generates the same random numbers on every execution of the code, on the same machine or on different machines.
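
As a minimal sketch of the idea, the snippet below uses Python's built-in random module. That module is chosen here only because it needs no extra packages; it is not part of the original answer, and numpy.random.seed follows the same pattern. The helper name draw is hypothetical.

```python
import random


def draw(n, seed_value=None):
    """Return n pseudo-random floats, optionally after fixing the seed."""
    if seed_value is not None:
        random.seed(seed_value)  # the same seed restarts the same sequence
    return [random.random() for _ in range(n)]


# With a fixed seed, two separate calls produce identical numbers.
first = draw(3, seed_value=0)
second = draw(3, seed_value=0)
print(first == second)

# A different seed gives a different sequence.
other = draw(3, seed_value=123)
print(first == other)
```

Calling draw(3) without a seed right after starting a fresh interpreter would give different numbers on each run, which is the behaviour the unseeded ANN shows.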

To control the randomness of an ANN we use:

NumPy random seed
TensorFlow set_random_seed

Let's build a simple ANN without setting the random seed, and then set the random seed. We will implement the code in Keras.

I have used the Housing dataset from Kaggle.

Demonstrating the randomness of ANN

# Importing required libraries
import numpy as np
import pandas as pd
from keras import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor

# Reading the data
dataset = pd.read_csv('housingdata.csv')
dataset.head(2)

# Creating independent and dependent variables
X = dataset.iloc[:, 0:13]
y = dataset.iloc[:, 13].values

# Creating train and test data (first 400 rows for training, the rest for testing)
X_train, X_test, y_train, y_test = X[:400], X[400:], y[:400], y[400:]

# Building a simple ANN for regression
def build_regressor():
    regressor = Sequential()
    regressor.add(Dense(units=13, input_dim=13))
    regressor.add(Dense(units=1))
    regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])
    return regressor

# Creating the Keras regressor with 100 epochs
regressor = KerasRegressor(build_fn=build_regressor, batch_size=32, epochs=100)

# Fitting the training data
results = regressor.fit(X_train, y_train)

# Making predictions on the test data
y_pred = regressor.predict(X_test)

# Printing the first 5 predictions for comparison
y_pred[:5]

The output of the first run of the program is shown in Fig 01.

The output of the second run is shown in Fig 02.

You will get a different output at each execution.

Fixing the randomness of our ANN

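We will import two additional functions and set the seed:

from numpy.random import seed
from tensorflow import set_random_seed

Setting the NumPy seed and the TensorFlow seed (set_random_seed is the TensorFlow 1.x name; in TensorFlow 2.x the equivalent call is tf.random.set_seed):

seed(0)
set_random_seed(0)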

Final code

# Importing required libraries
import numpy as np
import pandas as pd
from numpy.random import seed
from tensorflow import set_random_seed
from keras import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor

# Setting the seed
seed(0)
set_random_seed(0)

# Reading the data
dataset = pd.read_csv('housingdata.csv')
dataset.head(2)

# Creating independent and dependent variables
X = dataset.iloc[:, 0:13]
y = dataset.iloc[:, 13].values

# Creating train and test data (first 400 rows for training, the rest for testing)
X_train, X_test, y_train, y_test = X[:400], X[400:], y[:400], y[400:]

# Building a simple ANN for regression
def build_regressor():
    regressor = Sequential()
    regressor.add(Dense(units=13, input_dim=13))
    regressor.add(Dense(units=1))
    regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])
    return regressor

# Creating the Keras regressor with 100 epochs
regressor = KerasRegressor(build_fn=build_regressor, batch_size=32, epochs=100)

# Fitting the training data
results = regressor.fit(X_train, y_train)

# Making predictions on the test data
y_pred = regressor.predict(X_test)

# Printing the first 5 predictions for comparison
y_pred[:5]

The output of the first run, and of every later run, is shown in Fig 03.

Conclusion:

By nature, ANNs are non-deterministic because of the random initialization of weights and biases, the use of dropout, and stochastic optimization techniques. We can set the seed for both NumPy and TensorFlow to get consistent results with the same dataset, either on the same computer or on different computers.

Hope this helps.

Regards

Qamar Ul Islam

Danilo Djekic

Hi Matthew,

First, I would like to clarify the problem you are facing. What do you mean by "random variation in the early performance of CNNs with randomly initialized weights"?

If you mean: I initialized my model many times, and each time I'm getting different results:

Then I agree with Qamar: it is good practice to fix a random seed to make your results more reproducible.

However, in practice, different random seeds shouldn't make worryingly large differences in results and convergence time when everything else stays the same.

If you do see such differences, it suggests that the gradient varies a lot between different starting points. I would suggest printing your gradients for different random seeds to confirm whether this is true, or trying a different optimizer or loss function.
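
A rough sketch of that check is given below. It is not the CNN from the question: it assumes a toy linear model with a mean squared error loss on synthetic data, written with NumPy, and the function name initial_gradient_norm is hypothetical. In a real Keras or TensorFlow model you would obtain the gradients with the framework's own tools, but the comparison across seeds is the same.

```python
import numpy as np


def initial_gradient_norm(seed_value, X, y):
    """Gradient norm of the MSE loss for a linear model at a random initialisation."""
    rng = np.random.default_rng(seed_value)
    w = rng.normal(size=X.shape[1])        # random initial weights for this seed
    residual = X @ w - y
    grad = 2.0 * X.T @ residual / len(y)   # gradient of mean((Xw - y)^2) w.r.t. w
    return float(np.linalg.norm(grad))


# Synthetic data, generated once so that only the initialisation varies.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])

# Compare the gradient magnitude at the starting point for several seeds.
norms = {s: initial_gradient_norm(s, X, y) for s in range(5)}
for s, g in norms.items():
    print(f"seed={s}  initial gradient norm={g:.3f}")
```

If the printed norms differ by orders of magnitude between seeds, the starting point is likely driving the variation in early performance.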

If you mean: During training my validation loss has large variance and doesn't seem to go down:

Large variance in the loss is often the result of a learning rate that is too large or a batch size that is too small. Try reducing the learning rate and see whether convergence stabilizes. If this doesn't work, and your validation loss keeps jumping around while your training loss converges, you may be overfitting drastically; try simplifying your model.
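
To illustrate the learning-rate point only, here is a small sketch that assumes the simplest possible setting: plain gradient descent on f(w) = w**2, not a neural network. The function name run_gradient_descent and the two learning rates are arbitrary choices for the illustration.

```python
def run_gradient_descent(learning_rate, steps=20, w0=1.0):
    """Minimise f(w) = w**2 with plain gradient descent and return the loss history."""
    w = w0
    losses = []
    for _ in range(steps):
        losses.append(w * w)            # loss before the update
        w -= learning_rate * 2.0 * w    # the gradient of w**2 is 2w
    return losses


small = run_gradient_descent(0.1)    # each step scales w by 0.8, so the loss shrinks
large = run_gradient_descent(1.05)   # each step scales w by -1.1, so w overshoots and grows

print("small learning rate, first and last loss:", small[0], small[-1])
print("large learning rate, first and last loss:", large[0], large[-1])
```

With the larger rate every update jumps past the minimum and lands further away on the other side, which is the toy analogue of a loss curve that keeps bouncing instead of settling.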

Hope this helps :)

Danilo
