both are quadratic, both can be speeded up with similar techniques BUT they do not do the same thing !
the following might be an interesting read :
The Influence of Global Constraints on Similarity Measures for Time-Series Databases
Vladimir Kurbalija et al
Abstract. A time series consists of a series of values or events obtained over repeated measurements in time. Analysis of time series represents and important tool in many application areas, such as stock market analysis, process and quality control, observation of natural phenomena, medical treatments, etc. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Being of quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. Furthermore, it has been reported that such constrained measures can also achieve better accuracy. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constrains can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10–15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbor graphs. This observation explains the potential for accuracy gains when using constrained measures, highlighting the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
This problem is usually solved using https://en.wikipedia.org/wiki/Dynamic_programming and always involves some sort of similarity function. Maybe you can explain a little bit more in detail what kind of answer you would expect here ?
both are quadratic, both can be speeded up with similar techniques BUT they do not do the same thing !
the following might be an interesting read :
The Influence of Global Constraints on Similarity Measures for Time-Series Databases
Vladimir Kurbalija et al
Abstract. A time series consists of a series of values or events obtained over repeated measurements in time. Analysis of time series represents and important tool in many application areas, such as stock market analysis, process and quality control, observation of natural phenomena, medical treatments, etc. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Being of quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. Furthermore, it has been reported that such constrained measures can also achieve better accuracy. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constrains can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10–15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbor graphs. This observation explains the potential for accuracy gains when using constrained measures, highlighting the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
I try to find an example of Edit distance with real penalty (ERP) and Edit distance on real sequences (EDR) but I cannot. Both measures are for sequence of number, right?
My data set is symbolic sequences. What is the best measure for this type of data? I think LCSS is better than ERP or EDR. I am right?
my data:
s1: ABCD
s2: CDAFAG
s3: FAGEBCE
s4: ABC
....
I want to know that ERP and EDR can apply for symbolic sequences or not?
there is no "better" solution by itself : everything will depend on the nature of the data and your objective
to put it very simply (too roughly, maybe) :
- with LCSS, you do not care about the symbols which are not part of the LCSS between two strings; therefore
ABCD and AAAAAAAAAAAAAAAAAAABCD are at the same distance from ZBCD
- with ED, all the symbols are important and
ABCD is much closer to ZBCD than AAAAAAAAAAAAAAAAAAABCD
everything will depend on your data and application : if your strings may contain a lot of "garbage" and the informative part may be hidden in it (i would think of intrusion detection as a natural example ... but this may be due to my lack of knowledge in this area !), LCSS will make sense since it allows to get rid of the "garbage (uninformative) symbols" by concentrating on the common part between strings only ; on the contrary, if all the symbols are informative (think of orthographic correction), ED will make sense.
Unless you are working with very large sequences the meaning of the alignment takes precedence over the time complexity. In certain biological applications one develops new algorithms because the existing ones do not adequately address the priorities. See Fabrice's answer above.
The best measure depends on the permitted operations. For example, if you are working with genomes (where each symbol is a gene) where transpositions and reversals abound then of course you do not want to use substitution.
Here transpositions and reversals indicate shuffling of the order (that happens) whereas substitution indicates that a gene X suddenly becomes gene Y!!.