I have a multivariate time-series forecasting problem which I am trying to solve using LSTM models. I have transformed my data so that each sample looks back at the last 20 timestamps to reconstruct the record at the current timestamp, using a deep LSTM model. The model architecture is as follows:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model3 = Sequential()
# Input shape: (timesteps=20, n_features); all but the last LSTM return sequences
model3.add(LSTM(1000, input_shape=(train_X.shape[1], train_X.shape[2]), return_sequences=True))
model3.add(Dropout(0.2))
model3.add(LSTM(500, activation='relu', return_sequences=True))
model3.add(Dropout(0.2))
model3.add(LSTM(250, activation='relu', return_sequences=False))
model3.add(Dropout(0.2))
model3.add(Dense(train_y.shape[1]))  # one output per target variable
model3.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
model3.summary()

history3 = model3.fit(train_X, train_y, epochs=100, batch_size=36,
                      validation_data=(test_X, test_y), verbose=2, shuffle=False)
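For context, the 20-timestamp look-back transformation described at the top was done roughly as below. This is only a minimal sketch: the make_windows helper, the assumption that data is a 2-D NumPy array of shape (n_samples, n_features), and the assumption that every feature is also a target are mine, not necessarily the exact pipeline used.

import numpy as np

def make_windows(data, n_lags=20):
    X, y = [], []
    for i in range(n_lags, len(data)):
        X.append(data[i - n_lags:i, :])  # previous 20 timestamps
        y.append(data[i, :])             # record at the current timestamp
    return np.array(X), np.array(y)

# train_X then has shape (samples, 20, n_features), which matches
# input_shape=(train_X.shape[1], train_X.shape[2]) in the model above.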
Attached is the graph of the training output. The validation loss computed on the test data oscillates a lot from epoch to epoch but does not really decrease.
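The loss curves in the graph were plotted from history3 roughly like this (a sketch assuming the standard Keras History keys 'loss' and 'val_loss'):

import matplotlib.pyplot as plt

plt.plot(history3.history['loss'], label='train loss')
plt.plot(history3.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()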
Can anyone explain how to interpret this graph? Also, what are potential ways to ensure that the model actually improves with every new epoch?