Of course, if you write the LSTM code yourself, or are able to edit the library code, you can weight the gates. A standard LSTM cell has three sigmoid gates (input, forget and output) plus a tanh candidate update, which are often counted together as four gate transforms. You can experiment with these operators, and maybe spend some time replacing them with softmax, argmax of a softmax, weighted sigmoid, weighted tanh, softplus, Gaussian and other similar functions.
A weighted sigmoid or tanh function can influence the behaviour of the gates. Hence, instead of σ(W*[x, h]) and tanh(W*[r,h,x]), you could have α*σ(W*[x, h]) and β*tanh(W*[r,h,x]), where α, β, etc. are control parameters. The gates apply a squashing nonlinearity and control how much data flows through them (input gate), when to update the state (the tanh candidate update, sometimes called the update gate), when to forget the state (forget gate) and how much data is allowed through to the output (output gate).
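To make the weighting concrete, here is a minimal pure-Python sketch of a single LSTM step with scaled gates. It is only an illustration under some assumptions: the names `weighted_lstm_step` and `init_params` are made up for this example, the input is concatenated as [h, x], and one `alpha` is shared by all three sigmoid gates for brevity (separate values per gate would work the same way). Setting `alpha = beta = 1` gives back the ordinary cell.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def init_params(n_in, n_hid, seed=0):
    """Small random weights and zero biases for the four transforms (illustrative only)."""
    rng = random.Random(seed)
    params = {}
    for name in ("i", "f", "o", "g"):
        params["W" + name] = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + n_in)]
                              for _ in range(n_hid)]
        params["b" + name] = [0.0] * n_hid
    return params

def weighted_lstm_step(x, h, c, params, alpha=1.0, beta=1.0):
    """One LSTM step with sigmoid gates scaled by alpha and the tanh candidate by beta.

    alpha and beta are fixed control parameters, not trained weights (an assumption
    of this sketch).
    """
    v = h + x  # list concatenation, i.e. [h, x]
    i = [alpha * sigmoid(z) for z in affine(params["Wi"], params["bi"], v)]   # input gate
    f = [alpha * sigmoid(z) for z in affine(params["Wf"], params["bf"], v)]   # forget gate
    o = [alpha * sigmoid(z) for z in affine(params["Wo"], params["bo"], v)]   # output gate
    g = [beta * math.tanh(z) for z in affine(params["Wg"], params["bg"], v)]  # candidate
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

# Usage: 2 inputs, 3 hidden units, starting from a zero state.
params = init_params(n_in=2, n_hid=3)
h, c = weighted_lstm_step([0.1, -0.2], [0.0] * 3, [0.0] * 3, params, alpha=0.8, beta=1.5)
```

Note that with α above 1 a gate is no longer bounded by 1, so the forget path can amplify the cell state instead of only attenuating it; that is worth keeping in mind when picking the range of values to try.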
The control weights for the gate functions can be selected with a hyperparameter search suited to your objective. Remember that constant weights are easier to tune than learnable ones and add no trainable parameters, which matters especially for deeper models.
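As a rough sketch of such a search, a plain grid over candidate α and β values could look like the following. The function `toy_validation_loss` is only a stand-in so the example runs on its own; in practice it would train the model with the given fixed weights and return its validation loss. All names and grid values here are illustrative assumptions.

```python
import itertools

def grid_search(evaluate, alphas, betas):
    """Return (best_loss, best_alpha, best_beta) over the Cartesian grid."""
    best = None
    for alpha, beta in itertools.product(alphas, betas):
        loss = evaluate(alpha, beta)
        if best is None or loss < best[0]:
            best = (loss, alpha, beta)
    return best

def toy_validation_loss(alpha, beta):
    # Stand-in objective for illustration only; replace it with a function that
    # trains the network using these fixed gate weights and returns validation loss.
    return (alpha - 1.0) ** 2 + (beta - 0.5) ** 2

best_loss, best_alpha, best_beta = grid_search(
    toy_validation_loss, alphas=[0.5, 1.0, 1.5], betas=[0.25, 0.5, 1.0]
)
```

Because each grid point means a full training run, a coarse grid like this (or a random search over the same ranges) is usually the practical starting point.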
Another option could be to statistically model probability scores that control the gates' operation. This somewhat mimics dropout in determining when a gate operation is applied or skipped. Note that such probabilistic modelling is stochastic rather than deterministic, so whether it is appropriate depends on your problem.
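One possible reading of this idea, sketched below, is to accept each unit's state update with some probability and otherwise carry the old value over. The function name `stochastic_gate`, the fixed `keep_prob`, and the choice to use the expected blend at inference time are all assumptions of this sketch; the probability could just as well come from a fitted statistical model.

```python
import random

def stochastic_gate(new_state, old_state, keep_prob, rng=None, training=True):
    """Per unit, apply the update with probability keep_prob, else keep the old value.

    At inference (training=False) the expected value of that random choice is
    returned instead, so the output is deterministic.
    """
    if not training:
        return [keep_prob * n + (1.0 - keep_prob) * o
                for n, o in zip(new_state, old_state)]
    rng = rng or random.Random()
    return [n if rng.random() < keep_prob else o
            for n, o in zip(new_state, old_state)]

# Usage: a seeded generator makes the random skipping reproducible.
rng = random.Random(42)
mixed = stochastic_gate([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], keep_prob=0.7, rng=rng)
```

Seeding the generator gives repeatable runs, but different seeds will still give different trajectories, which is the non-determinism mentioned above.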
Chinedu, many thanks for the detailed answer. I was thinking along the lines of tuning the weights or setting up a statistical model to control the gates, but then I wonder why I should use an LSTM in the first place; instead, the network could be redesigned to offer more control. Do you have links to similar published works?
Dairi, I am trying to figure out if it's possible to influence the gates using external input (e.g. in the case of a real-time data feed, so that I can utilise external input that hasn't been seen during the training phase, i.e. new knowledge), etc.
I am interested in modelling longitudinal sensor data and also social network feeds.