Bias is defined as the difference between the average prediction of our model and the correct value we are trying to predict.
The error due to variance is the variability of a model's predictions for a given data point.
Regarding your questions, the four cases representing the combinations of high and low bias and variance are usually shown graphically. Dealing with bias and variance is really about dealing with over- and under-fitting: as more and more parameters are added to a model, its complexity rises, bias steadily falls, and variance becomes the primary concern.
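This complexity story can be checked with a short simulation (a minimal sketch; the target function sin(2πx), the noise level, and the test point are arbitrary choices): refit polynomials of increasing degree to many fresh noisy samples and decompose the prediction error at one point into squared bias and variance.

```python
import numpy as np

def bias_variance(degree, x_test=0.25, n_trials=300, n_points=30, noise=0.3, seed=0):
    """Fit many polynomial models to fresh noisy samples of sin(2*pi*x)
    and decompose the prediction error at x_test into bias^2 and variance."""
    rng = np.random.default_rng(seed)
    true_f = lambda x: np.sin(2 * np.pi * x)
    preds = []
    for _ in range(n_trials):
        x = rng.uniform(0, 1, n_points)                 # a fresh training set
        y = true_f(x) + rng.normal(0, noise, n_points)  # noisy observations
        preds.append(np.polyval(np.polyfit(x, y, degree), x_test))
    preds = np.asarray(preds)
    bias_sq = (preds.mean() - true_f(x_test)) ** 2      # systematic error
    variance = preds.var()                              # spread across training sets
    return bias_sq, variance

for d in (1, 3, 9):
    b2, v = bias_variance(d)
    print(f"degree {d}: bias^2 = {b2:.4f}, variance = {v:.4f}")
```

Running this shows the degree-1 model dominated by bias and the degree-9 model dominated by variance, exactly the under-/over-fitting trade-off described above.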
You can refer to Chapter 7 of the excellent book written by distinguished professors of Stanford University (see the link; it is freely available!):
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. New York: Springer.
You have bias in your data when the distribution of the real values of your variable differs from the distribution you collect through your experiment.
Variance is a measure of how spread out your data are.
A large variance can exist even when there is a large bias, and as the bias shrinks, the spread of your data should come to match reality. Conversely, low variance in your biased data can coexist with high variance in the real possible values if the bias is large.
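A quick numeric illustration of this point (the +1.5 offset and the noise levels are made-up values for the sketch): a measurement process with a systematic offset shifts the collected distribution away from the real one, and its own noise inflates the spread.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Real" values of the variable in the population.
real = rng.normal(loc=10.0, scale=2.0, size=100_000)

# A biased measurement process: a systematic offset (+1.5) plus its own noise,
# so the collected data are shifted (bias) and more spread out (extra variance).
measured = real + 1.5 + rng.normal(0.0, 1.0, size=real.size)

bias = measured.mean() - real.mean()  # the systematic shift, close to 1.5
print(f"bias ~ {bias:.2f}")
print(f"real variance ~ {real.var():.2f}, measured variance ~ {measured.var():.2f}")
```

The estimated bias recovers the offset, and the measured variance exceeds the real one because the measurement noise adds on top of the population spread.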
I am not sure about the four statements you make. To distinguish between bias and variance, I would say variance is the variability of the dependent variable as a result of changes in the independent variable, while bias is systematic variation in the dependent variable due to a variable correlated with the independent variable.
I will cast my answer in terms of the types of error in statistical measurement and estimation, and I think the dartboard analogy in Berhouz's post is very useful.
The target (e.g. the bull's eye) is the true but unknown value (not usually possible to really know this outside a simulation experiment), and you make multiple attempts (throws) to obtain (hit) this true value.
Bias refers to accuracy: how close you are on AVERAGE over multiple attempts.
Imprecision, or unreliability, refers to how scattered the attempts are around the true value; this is assessed by the VARIANCE.
We can study the success of our estimation/measurement strategy in terms of these two properties.
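These two properties can be illustrated with a toy dartboard simulation (the two throwers and their parameters are invented for the sketch): one thrower is centred on the bull's eye but erratic, the other is tightly clustered but off-centre.

```python
import numpy as np

rng = np.random.default_rng(3)
bullseye = np.array([0.0, 0.0])

# Two hypothetical throwers, 500 throws each (x, y positions on the board).
throwers = {
    "low bias / high variance": bullseye + rng.normal(0.0, 1.0, size=(500, 2)),
    "high bias / low variance": np.array([2.0, 2.0]) + rng.normal(0.0, 0.2, size=(500, 2)),
}

for name, hits in throwers.items():
    centroid = hits.mean(axis=0)
    bias = np.linalg.norm(centroid - bullseye)  # accuracy: average miss of the centre
    spread = hits.var(axis=0).sum()             # precision: scatter around the centroid
    print(f"{name}: bias = {bias:.2f}, variance = {spread:.2f}")
```

The first thrower's centroid sits near the bull's eye (low bias) but the throws scatter widely; the second clusters tightly (low variance) around the wrong spot.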
For example, this paper considers using an estimator (a random-coefficient shrinkage estimator) that trades some bias for a reduction in imprecision: it aims to reduce the mean squared error, which equals the sum of the variance and the squared bias.
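That MSE decomposition can be demonstrated with a deliberately simple shrinkage estimator (not the paper's random-coefficient estimator; the true mean, noise level, and shrinkage factor are arbitrary choices): multiplying the sample mean by a factor below 1 introduces bias but cuts variance, and for these settings the total MSE falls.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 1.0, 2.0, 10   # true mean, noise scale, sample size
lam = 0.7                     # shrinkage factor < 1, pulls the estimate toward 0
trials = 200_000

# Sample mean of n observations, repeated over many trials.
xbar = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)

for name, est in (("sample mean", xbar), ("shrunk mean", lam * xbar)):
    bias_sq = (est.mean() - mu) ** 2  # squared systematic error
    var = est.var()                   # sampling variability
    print(f"{name}: bias^2 = {bias_sq:.3f}, variance = {var:.3f}, "
          f"MSE = {bias_sq + var:.3f}")
```

The unbiased sample mean has MSE near sigma²/n = 0.4, while the shrunk estimator pays a small squared bias for a larger drop in variance, so its total MSE is lower here.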
In econometrics (especially) you also often see the letters BLUE; this stands for Best (i.e. most precise) Linear Unbiased (i.e. accurate) Estimator. For example, the Ordinary Least Squares estimator is BLUE provided the Gauss-Markov assumptions hold; this is saying that, out of all the linear unbiased estimators, OLS is the most reliable and accurate procedure.
Article: A model-based approach to the analysis of a large table of ...