Do you need to make an empirical or theoretical pdf? Would you like a cumulative distribution function (cdf) or the pdf?
Because I don't know your background knowledge, I'll answer as if you were a student. I'm sorry if you find it too basic.
The empirical pdf is a curve made from your observations whereas the theoretical pdf is a mathematical function fitted to your data. The cumulative distribution function (cdf, or F(x)) is the integral, or the sum, of probabilities up to x in your pdf f(x).
This page displays the cdf in the upper plot and the corresponding pdf in the lower plot:
See the difference between an empirical cdf and a cdf here: http://en.wikipedia.org/wiki/File:Empirical_CDF.png
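To make the pdf/cdf relation concrete, here is a minimal Python sketch, assuming NumPy is available. The discharge values and the number of bins are made up for illustration. It builds an empirical pdf as a normalised histogram and then sums the probabilities up to each bin edge to get the cdf.

```python
import numpy as np

# Hypothetical daily discharge values (m^3/s), for illustration only.
flows = np.array([8.2, 9.1, 10.4, 11.0, 12.0, 12.7, 14.3, 16.8, 19.7, 30.5])

# Empirical pdf: histogram normalised so that the bar areas sum to 1.
density, edges = np.histogram(flows, bins=4, density=True)

# Empirical cdf at the right edge of each bin: cumulative sum of
# (density * bin width), i.e. the discrete version of integrating f(x).
widths = np.diff(edges)
cdf_at_right_edges = np.cumsum(density * widths)

for right_edge, F in zip(edges[1:], cdf_at_right_edges):
    print(f"F({right_edge:.2f}) = {F:.2f}")
```

The last cdf value is 1 because all observations lie below the final bin edge. The shape of the histogram depends on the bin count, so try a few values.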
To produce an empirical cdf manually, start by sorting your data. Are you using daily data for a three-month period (spring) or the maximum value? Regardless of your data, the method is the same.
1) Sort your data from smallest to largest.
2) Assign an index i to each observation. The first (smallest) observation is i=1, the second is i=2, etc., up to i=N.
3) Calculate the probability of non-exceedance by using Weibull's plotting formula: i/(N+1).
If you would rather have exceedance probabilities, simply calculate 1-(i/(N+1)).
Step 3) provides the empirical cumulative distribution function for your data.
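The three steps above can be sketched in Python as follows, assuming NumPy. The function name and the sample values are hypothetical and only there to illustrate the procedure.

```python
import numpy as np

def weibull_plotting_positions(values):
    """Return sorted values with non-exceedance and exceedance probabilities."""
    x = np.sort(np.asarray(values, dtype=float))  # step 1: sort ascending
    n = x.size
    i = np.arange(1, n + 1)                       # step 2: index i = 1..N
    p_non_exceed = i / (n + 1)                    # step 3: Weibull formula i/(N+1)
    p_exceed = 1.0 - p_non_exceed                 # exceedance probability
    return x, p_non_exceed, p_exceed

# Hypothetical discharge observations (m^3/s).
discharge = [12.0, 30.5, 8.2, 19.7]
x, p, q = weibull_plotting_positions(discharge)

for xi, pi_, qi in zip(x, p, q):
    print(f"x = {xi:5.1f}  P(X <= x) = {pi_:.2f}  P(X > x) = {qi:.2f}")
```

Plotting x against p gives the empirical cdf. With N = 4 the probabilities are 0.2, 0.4, 0.6 and 0.8, so the formula never assigns a probability of exactly 0 or 1 to an observation.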
To find the theoretical pdf, you have to do exploratory data analysis (histograms, pp-plots, qq-plots) to determine which distribution may be appropriate. For daily discharge, the normal, gamma, or lognormal distributions are candidates. For extremes, the Gumbel, Frechet, or Weibull distributions are candidates. When you have decided which distribution to choose, use an estimation method to determine its parameters. The method of moments is one such method: http://en.wikipedia.org/wiki/Method_of_moments_(statistics)
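As an illustration of the method of moments, here is a minimal Python sketch for the Gumbel distribution, assuming NumPy. It uses the standard Gumbel relations mean = loc + 0.5772*scale and variance = (pi^2/6)*scale^2. The function names and the annual maxima are made up for the example. Whether Gumbel suits your data is something the exploratory analysis has to show.

```python
import math
import numpy as np

def fit_gumbel_moments(values):
    """Estimate Gumbel location and scale by matching sample mean and std."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)                      # sample standard deviation
    scale = std * math.sqrt(6.0) / math.pi   # from var = (pi^2 / 6) * scale^2
    loc = mean - np.euler_gamma * scale      # from mean = loc + gamma * scale
    return loc, scale

def gumbel_pdf(x, loc, scale):
    """Theoretical pdf f(x) of the Gumbel (maximum) distribution."""
    z = (np.asarray(x, dtype=float) - loc) / scale
    return np.exp(-(z + np.exp(-z))) / scale

def gumbel_cdf(x, loc, scale):
    """Theoretical cdf F(x), the non-exceedance probability."""
    z = (np.asarray(x, dtype=float) - loc) / scale
    return np.exp(-np.exp(-z))

# Hypothetical annual maximum discharges (m^3/s).
annual_maxima = [310.0, 455.0, 280.0, 520.0, 390.0, 610.0, 345.0, 430.0]
loc, scale = fit_gumbel_moments(annual_maxima)
print(f"loc = {loc:.1f}, scale = {scale:.1f}")
print(f"F(500) = {float(gumbel_cdf(500.0, loc, scale)):.3f}")
```

Comparing gumbel_cdf at the sorted observations with the Weibull plotting positions i/(N+1) is a quick visual check of the fit. Other estimation methods, such as maximum likelihood, may give somewhat different parameters.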
You can do this in R. There is a lot of documentation on how to do it in R.
Afterwards, you can have a look at Tutorial 8 on the page this link opens: https://www.google.de/?gfe_rd=ctrl&ei=um4lU5jAOofLtAbisoHYBg&gws_rd=cr#q=probability+density+function+for+dischage
If you want, there is an easy way out: there is software on the market for doing this. For all sorts of earth-scientific (and other) data, I have, for example, used @Risk from palisade.com. It is an easy-to-use plug-in for Microsoft Excel; you just load (or type) your values into Excel and press the button to obtain the best-fitting distribution for your data (and all sorts of statistical values such as different means, etc.). The software can do much more; it is mainly used for Monte Carlo simulations. Unfortunately, the plug-in costs about 1'500 GBP, but maybe there is a scientific licence available.
PS: Needless to say, one needs at least some statistical education in order to use this tool correctly; "erratic button pressing" will otherwise only produce statistical garbage.
Cheers Herwig
Remark: I am not an employee of Palisade, nor do I benefit in any way from this "advertisement". I am only sharing my personal experience here...
How do I use Monte Carlo simulation, and how is it applied to determining a pdf with the @Risk software? Can anyone suggest a step-by-step procedure?