Assume we have six vectors: A, B, C, A', B', and C'. A, B, and C are outputs of method X; A', B', and C' are outputs of method Y. Both methods are iterative, and the outputs are collected at certain iterations across several repetitions of the experiment. For example, let A = [a1, a2, ..., an] contain the outputs of X at the 10th iteration over multiple experiments. Then a1 is the output of the first experiment, a2 is the output of the second, and so on.
We want to check whether X and Y have different means at the specific iterations, using the Wilcoxon rank-sum test. The comparisons are therefore as follows:
A vs A' (e.g., the outputs at the 10th iteration)
B vs B' (e.g., the outputs at the 20th iteration)
C vs C' (e.g., the outputs at the 30th iteration)
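
To make the setup concrete, here is a minimal sketch of the three comparisons in Python. Everything in it is an assumption for illustration: the data are synthetic placeholders, the names `rank_sum_test`, `outputs_x`, and `outputs_y` are hypothetical, and the test uses the large-sample normal approximation with mid-ranks for ties and no tie or continuity correction. A library routine such as `scipy.stats.ranksums` could be used instead of the hand-written function.

```python
import math
import random


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Ties receive mid-ranks; no tie or continuity correction is applied.
    Returns (z, p_value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    # Rank sum of the first sample and its null mean and standard deviation.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Synthetic placeholder data: 30 repetitions per method and iteration.
random.seed(0)
iterations = [10, 20, 30]
outputs_x = {it: [random.gauss(1.0, 1.0) for _ in range(30)] for it in iterations}
outputs_y = {it: [random.gauss(1.3, 1.0) for _ in range(30)] for it in iterations}

# One test per iteration: A vs A', B vs B', C vs C'.
results = {it: rank_sum_test(outputs_x[it], outputs_y[it]) for it in iterations}
for it, (z, p) in results.items():
    print(f"iteration {it}: z = {z:.3f}, unadjusted p = {p:.4f}")
```

This produces one unadjusted p-value per iteration, which are the three values any significance-level correction would be applied to.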
My question is this: although we are performing multiple comparisons, each comparison uses separate samples that differ from those in the other comparisons. Do we still need to correct the significance level? If so, which correction method would you suggest?