I am searching for a valid way to evaluate whether the difference between two measurements of the same laboratory parameter (e.g. potassium) is clinically significant, despite the fact that a paired t-test shows a p-value
"Clinical significance" is a subjective term. But I suppose you could establish some cutoff level that you deem clinically important, and compare your difference to that rather than to 0 (which is the default null hypothesis in significance tests). For example, if a paired t-test shows that the difference between measures is significantly greater than 0, but you think that something needs to be greater than 2 to be clinically important, then you can instead use the paired t-test (or the 95% confidence interval of the difference, which is formally equivalient) to test whether the difference is significantly different from 2.
In the meantime I did a little further research. Maybe the RCV method (Reference Change Values) could be of service? It was originally proposed to judge whether consecutive values from one patient differ by a clinically significant amount, but maybe I could use it for my question as well, since it takes both the analytical imprecision and the biological variation into account.
I'll try both methods and comment here when I have a result.
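For reference, the commonly cited RCV formula is RCV = sqrt(2) * Z * sqrt(CV_A^2 + CV_I^2), with CV_A the analytical and CV_I the within-subject biological coefficient of variation. A minimal Python sketch follows; the CVs and the two results are hypothetical placeholders, not published values for potassium, and Z = 1.96 assumes a two-sided 95% level.

```python
import math


def reference_change_value(cv_analytical, cv_within_subject, z=1.96):
    """RCV in percent: sqrt(2) * z * sqrt(CV_A^2 + CV_I^2), all CVs in percent."""
    return math.sqrt(2) * z * math.sqrt(cv_analytical**2 + cv_within_subject**2)


def percent_change(first, second):
    """Relative change from the first to the second result, in percent."""
    return 100.0 * (second - first) / first


# Hypothetical CVs (in %) and results, chosen only to illustrate the arithmetic.
rcv = reference_change_value(cv_analytical=2.0, cv_within_subject=4.0)
change = percent_change(4.0, 4.6)

print(f"RCV = {rcv:.1f} %, observed change = {change:.1f} %")
print("change exceeds RCV:", abs(change) > rcv)
```

A change larger than the RCV is unlikely to be explained by analytical and biological variation alone; whether it matters clinically is still a separate judgment.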
If anybody has further suggestions .... anytime :)
You might also check the difference in Net Benefit (ΔNB), difference in Relative Utility (ΔRU), and weighted NRI (wNRI), cf. van Calster et al. (see below).
For these, however, you will need a defined clinical "outcome".
Good question with probably no good answer. Stephen already mentioned the reason for that: the selection of any cutoff to prompt clinical intervention remains subjective.
For prediction models, assessment of reclassification (or misclassification) is probably quite useful, as Alexander pointed out. A similar idea is calibration, and I highly recommend the article by Nancy Cook (see below), which is a great introduction to the topic.
Another (I guess slightly overlapping) idea is the concept of 'number needed to treat' or a variation thereof (number needed to test etc). Here, risk of an undesired outcome determines 'clinical significance' in addition to statistical significance and effect size. In other words: the utilities of two identical interventions/tests (i.e. same statistical significance and effect size) in two populations (with different risks for undesired outcome) may differ.
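A small Python sketch of that last point, using the standard relation NNT = 1 / absolute risk reduction. The baseline risks and the relative risk reduction are invented numbers for illustration only.

```python
def number_needed_to_treat(baseline_risk, relative_risk_reduction):
    """NNT = 1 / ARR, where ARR = baseline risk * relative risk reduction."""
    absolute_risk_reduction = baseline_risk * relative_risk_reduction
    return 1.0 / absolute_risk_reduction


# Same hypothetical intervention (25 % relative risk reduction)
# applied to two hypothetical populations with different baseline risk.
for label, risk in [("high-risk population", 0.20), ("low-risk population", 0.02)]:
    nnt = number_needed_to_treat(risk, relative_risk_reduction=0.25)
    print(f"{label}: baseline risk {risk:.0%}, NNT about {nnt:.0f}")
```

The relative effect is identical in both populations, but the number needed to treat differs by a factor of ten, which is one way the same effect size can carry different clinical weight.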
I think it all depends on the range of the measured data. We had to consider the measurement uncertainty and the LOQ (limit of quantification) of the measurement method. If the range is wide, then orthogonal regression is used. Comparing two laboratories (two analytical methods) isn't easy, especially if you want to consider "clinical" significance, which you have to define first.
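As a rough sketch of what orthogonal regression computes, the following Python function fits a line by minimising perpendicular distances. It assumes equal error variance in both measurements (Deming regression with a variance ratio of 1); the function name and the data are hypothetical.

```python
import math
import statistics


def orthogonal_regression(x, y):
    """Slope and intercept of the orthogonal (total least squares) line.

    Assumes equal error variance in x and y and a non-zero covariance.
    """
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy**2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical results of the same samples measured by two methods.
method_1 = [3.5, 4.0, 4.4, 5.1, 5.8]
method_2 = [3.6, 4.1, 4.3, 5.3, 5.9]

slope, intercept = orthogonal_regression(method_1, method_2)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope near 1 and an intercept near 0 suggest the two sets of measurements agree; how far from those values is still acceptable is again the clinical cutoff that has to be defined first.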
To be more specific: I have samples taken from one person in two different states, which may influence the analytical outcome through preanalytical alteration. Measurement and all other conditions remain the same. Some of the respective measurement pairs now differ significantly when using a paired t-test. What I want to know, however, is whether these differences are clinically relevant. So I thought there may be a valid way to calculate this rather than leaving the decision eminence-based.
But reading all your valuable suggestions I think I indeed have to define the clinical significance cutoff points myself and add some calculations thereafter.