In linear regression, the effect of each independent variable can be measured, and we can also determine whether that effect is positive or negative. Can the same be done in an artificial neural network?
There is no exact way to measure the effect of each input node (input variable). However, if you look at the weights connected to each node, larger weights clearly indicate a stronger influence of the corresponding input variable, even though the relationship cannot be treated as linear. Nodes with very small weights connecting to the next layer certainly have much less influence on the final output.
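As a rough sketch of that idea, you can sum the absolute weights leaving each input node as a crude influence score. The weight matrix below is made up for illustration, and the score ignores later layers and non-linearities, so treat it only as a screening heuristic:

```python
import numpy as np

# Hypothetical first-layer weight matrix of a trained network:
# shape (n_inputs, n_hidden) -- rows are input nodes, columns hidden units.
W1 = np.array([[ 0.9, -1.2,  0.3],
               [ 0.05, 0.02, -0.01],
               [-0.7,  0.8,  0.6]])

# Crude influence score: sum of absolute weights leaving each input node.
# This ignores downstream layers and activations, so it is only a rough
# screening heuristic, not a measure of true effect size.
influence = np.abs(W1).sum(axis=1)
ranking = np.argsort(influence)[::-1]  # most to least influential input

print(influence)  # input 1 has near-zero weights, hence a tiny score
print(ranking)
```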
There may be a more formal approach; I've used a basic method, correlation, in the past. In this simple approach, I plotted each input against each output of the artificial neural network (ANN). The matrix of scatter plots then gives a visual indication of each input's contribution to the outputs of the ANN. The gradient of each scatter plot indicates a positive, negative, or negligible contribution between the input and the output.
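The numeric counterpart of reading those scatter-plot gradients is the Pearson correlation of each input with the network's output. This is a minimal sketch: `ann_predict` here is a fixed non-linear function standing in for a real trained model's predict method, so the example runs on its own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained model's predict function (assumed for illustration):
# output rises with input 0 and falls with input 1.
def ann_predict(X):
    return np.tanh(2.0 * X[:, 0] - 0.5 * X[:, 1])

X = rng.normal(size=(500, 2))
y = ann_predict(X)

# The sign of each correlation plays the role of the scatter-plot gradient:
# positive, negative, or near-zero contribution.
for j in range(X.shape[1]):
    r = np.corrcoef(X[:, j], y)[0, 1]
    print(f"input {j}: r = {r:+.2f}")
```

Like the scatter plots, this only captures monotone, roughly linear relationships; an input acting through strong interactions can correlate weakly with the output yet still matter.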
Once the model has been trained/fitted, you can view the weight/theta matrices with any number of visualization tools. But as has been mentioned, the value of a weight doesn't linearly correlate with its influence on the output - the network is intentionally non-linear.
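For instance, scikit-learn's `MLPClassifier` (used here as a stand-in for whatever framework you train with) exposes the fitted weight matrices through its `coefs_` attribute, one matrix per layer, ready for any visualization tool:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Toy classification data: the label depends on the first two inputs only.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=500, random_state=0)
clf.fit(X, y)

# coefs_ holds one weight matrix per layer: (n_inputs, n_hidden), then
# (n_hidden, n_outputs). These are the matrices you would plot or inspect.
for i, W in enumerate(clf.coefs_):
    print(f"layer {i}: weight matrix shape {W.shape}")
```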
My questions would be: 1) what can we do to manipulate parts of the model and observe the resulting output, and 2) why, beyond some deeper understanding of the hidden layers, do you need to do this - what are we learning by watching the weights?
The hidden layers are named that way not because you can't see them, but because you don't need to see them - training/fitting produces a set of weights that yields non-linear decision boundaries.
But back to 1): a) you could drop out some units in a layer and observe the results; b) you could change your lambda value (the amount of smoothing on the decision boundary) separately, or while doing dropout experiments, and observe the results. I would basically look at how much the output changes when a weight (or a connected unit) is changed or removed.
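A minimal sketch of that ablation idea, using a tiny hand-rolled two-layer network (the weights are illustrative, not from a real trained model) so the effect of zeroing a hidden unit is transparent:

```python
import numpy as np

# Illustrative weights for a 2-input, 2-hidden-unit, 1-output network.
W1 = np.array([[1.0, -0.5], [0.3, 0.8]])   # input -> hidden
W2 = np.array([[1.2], [0.1]])              # hidden -> output

def forward(x, keep=(True, True)):
    # "Drop" a hidden unit by zeroing its activation.
    h = np.tanh(x @ W1) * np.array(keep, dtype=float)
    return h @ W2

x = np.array([0.7, -0.2])
base = forward(x)

# Ablate each hidden unit in turn and record how much the output moves:
# the larger the shift, the more that unit matters for this input.
for i in range(2):
    keep = [True, True]
    keep[i] = False
    delta = abs(forward(x, keep) - base)[0]
    print(f"dropping hidden unit {i}: |delta output| = {delta:.4f}")
```

Unit 0 feeds the output through a large weight (1.2) while unit 1 barely contributes (0.1), so removing unit 0 shifts the output far more.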
3) You could also observe how the weights change during training, e.g. show your weight matrices over training iterations and watch how backpropagation/gradient descent finds the optimum weight values. You could assign non-random initial weights, such as all ones or all zeros, and capture/graph the delta.
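A minimal sketch of capturing that trajectory, assuming a single linear neuron trained by plain gradient descent (so the weight path is easy to follow) with a non-random all-zeros start:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy regression: the true weights are [2.0, -1.0], so we can watch
# gradient descent walk from the all-zeros start toward them.
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0])

w = np.zeros(2)            # non-random initial weights (all zeros)
lr = 0.1
history = [w.copy()]
for _ in range(50):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # mean-squared-error gradient
    w = w - lr * grad
    history.append(w.copy())               # snapshot after every iteration

history = np.array(history)  # shape (51, 2): one row per iteration, ready to plot
print("start:", history[0], "end:", np.round(history[-1], 3))
```

Plotting each column of `history` against the iteration index gives exactly the "weights over training iterations" view described above; with backpropagation in a deeper network you would snapshot every layer's matrix instead of a single vector.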