Suppose I want to establish a new metric for a certain bioinformatics problem. What are the mathematical prerequisites for establishing this metric? Are lemmas and theorems required for establishing the metric?
I think it is not difficult to construct a function d on the set X of elements of your model that satisfies the following conditions:
d: X×X → R
(1) d(a,b) ≥ 0 for every a,b ∈ X, and d(a,b) = 0 if and only if a = b.
(2) d(a,b) = d(b,a) for every a,b ∈ X.
(3) d(a,b) + d(b,c) ≥ d(a,c) for every a,b and c in X.
A mapping d with these properties is called a metric on X.
You can give the mapping d a biological, physical, chemical, or other interpretation based on the model and its elements. The main point is that d must satisfy the properties above.
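As a rough illustration of what satisfying these properties means in practice, here is a minimal Python sketch that brute-force checks the usual metric axioms (non-negativity, zero distance exactly for identical elements, symmetry, triangle inequality) on a small sample. The helper names `hamming` and `is_metric_on` and the toy sequences are my own assumptions for the example. Passing the check on a sample is only evidence; for the whole set X you still need a proof.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_metric_on(sample, d, tol=1e-12):
    """Check the metric axioms for d on all pairs and triples from `sample`.

    Hypothetical helper: a True result says nothing about elements
    outside the sample.
    """
    for a, b in product(sample, repeat=2):
        if d(a, b) < -tol:                      # non-negativity
            return False
        if (abs(d(a, b)) <= tol) != (a == b):   # d(a,b) = 0 exactly when a = b
            return False
        if abs(d(a, b) - d(b, a)) > tol:        # symmetry
            return False
    for a, b, c in product(sample, repeat=3):
        if d(a, b) + d(b, c) < d(a, c) - tol:   # triangle inequality
            return False
    return True


def squared_hamming(a: str, b: str) -> int:
    """Squared Hamming distance; used here as a counterexample."""
    return hamming(a, b) ** 2


sample = ["ACGT", "ACGA", "TCGA", "TTTT"]  # toy sequences, not real data
print(is_metric_on(sample, hamming))
print(is_metric_on(sample, squared_hamming))
```

The squared Hamming distance is included because it looks like a reasonable dissimilarity but breaks the triangle inequality on this sample (ACGT to ACGA and ACGA to TCGA cost 1 each, while ACGT to TCGA costs 4).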
Well, to avoid putting the cart before the horse, the first thing to consider is "what type of answer do I expect from the metric?" or even "what is my uncertainty about the system (I'd guess a set of sequences or something like this) that I'd like to eliminate by measuring certain properties of the system?" and "how could the new arrangement of elements according to my metric add to my knowledge?" Questions like these come first, no maths yet.
Once you know the purpose, you can select a set of measurable parameters you believe might be useful in reducing your uncertainty and start combining them into a single-valued metric.
A simple example would be measuring the difference between two DNA sequences, where the difference is usually measured as a distance in a multidimensional space. If you know the sequence codes for a protein, you might wish to assign different weights to point mutations that still code for the same amino acid (the genetic code is degenerate) and to those that code for a different one. You might also wish to factor in the effect of the point mutation on the phenotype, etc.
At the end of the day, you arrive at a set of parameters, their weights, and the way they get combined into your response function (metric).
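To make the DNA example concrete, here is a minimal Python sketch of one possible weighted distance between two aligned coding sequences. Everything specific in it is an assumption for illustration: the function name `codon_distance`, the weights `w_syn` and `w_nonsyn`, and the simplification of comparing whole codons in a shared reading frame rather than individual point mutations. The codon table is the standard genetic code written in the usual TCAG order.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codon_distance(seq1: str, seq2: str, w_syn: float = 0.2,
                   w_nonsyn: float = 1.0) -> float:
    """Weighted codon-wise distance between two aligned coding sequences.

    Identical codons cost 0, synonymous changes cost w_syn, and changes of
    the encoded amino acid cost w_nonsyn. The default weights are arbitrary
    placeholders, not recommended values.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    total = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 == c2:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            total += w_syn       # same amino acid (degenerate code)
        else:
            total += w_nonsyn    # different amino acid
    return total


# Toy sequences: ATG GCT AAA vs ATG GCC AAG (two synonymous changes)
print(codon_distance("ATGGCTAAA", "ATGGCCAAG"))
# Toy sequences: ATG GCT AAA vs ATG GAT AAA (Ala -> Asp)
print(codon_distance("ATGGCTAAA", "ATGGATAAA"))
```

With 0 < w_syn ≤ w_nonsyn the per-codon cost satisfies the triangle inequality, and a sum of per-codon metrics is again a metric, so this particular combination keeps the metric properties. Adding a phenotype term or other parameters would need the same kind of check.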