Andrzej, the answer is simple. If you know nothing about the model, you can assume any model and compute an estimate, but with no proof that it is correct, because without a model you cannot calculate the estimation error. However, I doubt that you know nothing about the model, since it is tied to your system dynamics.
Yuriy, you're right about the model. We know that the system is inertial and that the signal grows no faster than an exponential curve, according to the criteria that follow from the Laurent generalization (the assumptions for the z-transform).
Of course, zero initial conditions.
The nature of the noise is close to a normal distribution, but with small perturbations that disturb the normality.
In that case, the function seems to be oversampled. Then try the unbiased FIR filter [An unbiased FIR filter for TIE model of a local clock in applications to GPS-based timekeeping, IEEE Trans. on Ultrason., Ferroelec., and Freq. Control, 53, 5, 862-870, 2006], which was derived for this kind of system. Use the ramp or quadratic impulse response function on an interval of N points.
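A minimal sketch of the ramp case, assuming the commonly quoted first-degree unbiased FIR impulse response h(i) = 2(2N - 1 - 3i) / (N(N + 1)), i = 0..N-1. The function names and the horizon N = 50 are illustrative and are not taken from the cited paper; the quadratic case is not shown.

```python
import random


def ramp_ufir_coefficients(N):
    """Impulse response h(i), i = 0..N-1, of a first-degree (ramp) unbiased FIR filter.

    Assumed form: h(i) = 2(2N - 1 - 3i) / (N(N + 1)).
    The weights sum to 1 and satisfy sum(i * h(i)) = 0, so a noise-free
    linear signal passes through without bias.
    """
    if N < 2:
        raise ValueError("N must be at least 2")
    return [2.0 * (2 * N - 1 - 3 * i) / (N * (N + 1)) for i in range(N)]


def ufir_filter(y, N):
    """Estimate the signal at each time n >= N-1 from the last N samples.

    h[0] weights the newest sample y[n], h[N-1] the oldest sample y[n-N+1].
    """
    h = ramp_ufir_coefficients(N)
    return [sum(h[i] * y[n - i] for i in range(N)) for n in range(N - 1, len(y))]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical test signal: a slow ramp plus Gaussian noise.
    truth = [0.5 + 0.02 * n for n in range(200)]
    noisy = [x + random.gauss(0.0, 0.1) for x in truth]
    estimate = ufir_filter(noisy, N=50)
    print("last estimate:", estimate[-1], "true value:", truth[-1])
```

A longer horizon N averages out more noise, but the ramp assumption must then hold over a longer interval, so N is a tuning choice.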
I would suggest you consider using Parzen probability density estimators.
The Parzen density estimators are potential functions that, when convolved with your data points, yield a non-parametric probability density estimate for each of your classes of points. From these estimates of the pdfs you can compute your Maximum Likelihood Estimates.
The potential functions, or kernels, are locally parametric but assume nothing about the global distributions; the aggregate of the data points determines those.
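A minimal one-dimensional sketch of this idea, assuming a Gaussian kernel. The function names, the kernel width h and the sample values are hypothetical; a real application such as a 4-dimensional feature space would need a multivariate kernel.

```python
import math


def parzen_pdf(x, samples, h):
    """Parzen estimate of a 1-D density at x, using a Gaussian kernel of width h."""
    if h <= 0 or not samples:
        raise ValueError("need h > 0 and at least one sample")
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)


def ml_classify(x, class_samples, h):
    """Return the class label whose Parzen density estimate is largest at x."""
    return max(class_samples, key=lambda label: parzen_pdf(x, class_samples[label], h))


if __name__ == "__main__":
    # Hypothetical training points for two classes.
    class_samples = {
        "cool": [0.0, 0.1, -0.1, 0.3],
        "hot": [5.0, 5.1, 4.9, 4.6],
    }
    for x in (0.2, 2.0, 4.8):
        print(x, ml_classify(x, class_samples, h=0.5))
```

The kernel width h controls the smoothing: a small h follows the individual points closely, while a large h blurs the classes together.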
I have used these very successfully in 4-dimensional feature spaces derived from infrared imagery of objects that heat up and then cool down - very dynamic!
Antonio, good idea. Integration is the problem in real-time systems. What do you think about the operator function z/(z-e^-aT) as a candidate for the probability density estimator at the discretization points?
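For illustration, z/(z-e^-aT) is the z-transform of the causal exponential sequence e^(-a n T), n >= 0, so it can be applied with one multiplication and one addition per sample instead of an explicit integration. The sketch below assumes a > 0 and T > 0; the function name and the optional normalization to a unit-sum kernel are assumptions added here, not part of the question.

```python
import math


def exp_kernel_filter(x, a, T, normalize=True):
    """Apply H(z) = z / (z - exp(-a*T)) to the sequence x recursively.

    Recursion: s[n] = x[n] + exp(-a*T) * s[n-1], with zero initial condition.
    With normalize=True the output is scaled by (1 - exp(-a*T)), so the
    kernel weights exp(-a*n*T) sum to one.
    """
    if a <= 0 or T <= 0:
        raise ValueError("a and T must be positive")
    pole = math.exp(-a * T)
    gain = (1.0 - pole) if normalize else 1.0
    out, state = [], 0.0
    for sample in x:
        state = sample + pole * state
        out.append(gain * state)
    return out


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 5
    # Without normalization the impulse response is exp(-a*n*T).
    print(exp_kernel_filter(impulse, a=2.0, T=0.1, normalize=False))
```

Because this kernel is one-sided (causal), it weights only past samples, which is what makes it usable in real time.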
@Andrzej: Sure, why not. The form of the kernel is not crucial. If you have a physical/mathematical/computational reason for choosing one kernel rather than another, then try it and see the results you get. If they appear reasonable, then use it.
My maximum likelihood solution for real-time systems is in the appendix, as a graphical interpretation.
In this example, the function ML(z) is obtained after four steps, H(z)=H1(z)*H2(z)*H3(z)*H4(z), with optimal shift k = 3.
If someone is convinced by this solution and can provide a clear mathematical proof, it may be used for publication, with me and my collaborator added as co-authors.
Otherwise, if someone can point out an important mistake that rules out this solution, I will be very grateful for the help.
What do you think about using Markov chain Monte Carlo (MCMC) simulation methods in dynamic systems with non-Gaussian noise, with the maximum likelihood function (MLE) used to estimate the expected value (Ex)?