There is a conceptual difference between the standard deviation and the standard error: the standard deviation describes the variable's values, whereas the standard error is the standard deviation of an estimator. The standard error measures how far the estimator diverges from its expected value, while the standard deviation measures how far the sample values spread around the mean, so they are not substitutes for each other.
Now to your questions:
1. 4.39 is the standard error of the sample mean; alternatively, you can calculate the standard deviation from the population and then divide it by the square root of 3.
2. The standard error differs from sample to sample, and each one is used only in the study of its respective sample.
3. As discussed above, the standard error describes the variation of the mean, so we cannot use the standard deviation, which describes the spread of the data around the mean.
If you are referring to the global mean (grand mean), you do not distinguish between the three groups. So you have 12 values (X) from which a single estimate is calculated. The standard error (SE) of the estimate (the grand mean) is SD(X) / sqrt(12). [SD = standard deviation]
If, in contrast, you are referring to the variability of the group of 3 means (m1, m2, m3) around their mean (which is identical to the grand mean when the groups are equally sized), then the standard error for this is SD(m1, m2, m3) / sqrt(3).
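The two calculations above can be sketched in a few lines of Python. The group values here are made up for illustration (12 values in 3 groups of 4), not the numbers from the original question:

```python
import statistics

# Hypothetical data: 12 values in 3 groups of 4 (placeholder numbers)
groups = [[25, 18, 12, 9], [22, 16, 14, 8], [27, 20, 11, 10]]
x = [v for g in groups for v in g]

# SE of the grand mean from all 12 values: SD(X) / sqrt(12)
se_grand = statistics.stdev(x) / len(x) ** 0.5

# SE of the 3 group means around the grand mean: SD(m1, m2, m3) / sqrt(3)
means = [statistics.mean(g) for g in groups]
se_means = statistics.stdev(means) / len(means) ** 0.5

print(se_grand, se_means)
```

Note that the two numbers answer different questions: the first is the precision of the grand mean, the second the spread of the three group means around it.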
Q2:
If you can assume that the variability in all three populations should be similar, you'd better use the pooled SD or pooled SE (see Google or Wikipedia).
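A minimal sketch of the pooled SD, again with made-up group values: the within-group variances are combined, weighted by their degrees of freedom, and the square root is taken at the end.

```python
import statistics

# Hypothetical groups (placeholder values, not the poster's data)
groups = [[25, 18, 12, 9], [22, 16, 14, 8], [27, 20, 11, 10]]

# Pooled variance: sum of (n_i - 1) * var_i divided by total degrees of freedom
num = sum((len(g) - 1) * statistics.variance(g) for g in groups)
den = sum(len(g) - 1 for g in groups)
pooled_sd = (num / den) ** 0.5
print(pooled_sd)
```

The pooled SE of a single group mean would then be `pooled_sd / sqrt(n_i)` for that group's sample size.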
Q3:
The SD is a property (variability) of the data, whereas the SE is a property (variability) of the estimate (e.g. of the mean). If you want to assess whether the estimates may estimate a given value, you should look at the SE (actually you would look at the confidence interval, which is often calculated based on the SE).
Note that SEs of individual group means do not help much in assessing whether estimates from different samples may all estimate the same value (-> is there reason to assume that the population means are different?). To answer this question, one should calculate the SEs (and, if needed, the confidence intervals) of the comparisons directly (a comparison can be a difference of group means; it can also be a ratio). You can google "Tukey HSD plot" to learn more about this.
If I get it right, the SD gives an indication of how much each value from a particular sample differs from the mean value of that sample. E.g., in sample A of the example, using the SD value of 7.5, individual A1 (25) differs from the sample mean by +7.5, and if I add this value to A1 it will be approximately equal to the mean. Similarly, the SE of sample A gives an estimate of how good the sample mean is, and in this case the sample mean can differ by +/- 3.75.
If I have different samples, e.g. samples A, B and C, then I have to use another test (Tukey's test) to compare the sample means (m1, m2, m3) based on confidence intervals (CI).
Yes, the SD is a kind of "average distance of the individual values from their mean". More precisely, it is the square root of the average squared distance. When your sample has a large SD, this indicates that the individual values are - on average - quite far away from the mean. A very small SD indicates that the values are (almost) all very close to the mean.
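That "square root of the average squared distance" can be written out directly (here with made-up values and the usual sample version that divides by n - 1):

```python
import statistics

# Made-up sample values for illustration
values = [25, 18, 12, 9]

# SD = square root of the average squared distance from the mean
# (sample version: divide by n - 1 instead of n)
mean = sum(values) / len(values)
sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5

print(round(sd, 3))                               # 7.071
assert abs(sd - statistics.stdev(values)) < 1e-9  # matches the library value
```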
I mentioned Tukey's HSD because you can find instructive examples and plots that emphasize the effect sizes (i.e. the differences between the means - what you actually want to analyze) instead of the mean estimates. Occasionally you might find it interesting what the mean value of a group is, but, honestly, most researchers put the main focus on the differences between means - yet neither show nor quantify these differences (which is a bad habit, mostly due to ignorance).
Tukey's HSD makes use of a pooled SD/SE estimate and provides confidence intervals for the differences in means. These confidence intervals are adjusted in a way that the family-wise type-I error rate (FWER) is controlled when hypotheses about zero-differences between the sample means are rejected.
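The idea can be sketched as follows with made-up data. The critical value of the studentized range, q, is an assumed table value here (roughly 3.95 for k = 3 groups and 9 residual degrees of freedom at the 5% level); for real data, look it up for your own k and df, or use a library routine such as scipy.stats.tukey_hsd:

```python
import statistics
from itertools import combinations

# Hypothetical groups (placeholder values, not the poster's data)
groups = {"A": [25, 18, 12, 9], "B": [22, 16, 14, 8], "C": [27, 20, 11, 10]}

# Pooled SD from the within-group variances
num = sum((len(g) - 1) * statistics.variance(g) for g in groups.values())
df = sum(len(g) - 1 for g in groups.values())
pooled_sd = (num / df) ** 0.5

# Assumed critical value q(0.05; k=3, df=9) of the studentized range;
# replace with the value from a table for your own k and df.
q = 3.95

# Tukey CI for each pairwise difference: diff +/- (q/sqrt(2)) * s * sqrt(1/ni + 1/nj)
for (na, a), (nb, b) in combinations(groups.items(), 2):
    diff = statistics.mean(a) - statistics.mean(b)
    margin = (q / 2 ** 0.5) * pooled_sd * (1 / len(a) + 1 / len(b)) ** 0.5
    print(f"{na}-{nb}: {diff:+.2f}  95% CI [{diff - margin:.2f}, {diff + margin:.2f}]")
```

An interval that excludes zero corresponds to rejecting the hypothesis of equal means for that pair, with the FWER controlled across all pairs.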
It depends on your aims whether it is reasonable to reject null hypotheses at all, and further whether the FWER should be controlled. If you go this way, you should check that the underlying assumptions, mainly the normal distribution of the residuals(*), are not severely violated.
(*) A "residual" is the difference (distance) between an individual value and its group mean.
You have shown why there is a need for the t-statistic or Tukey's HSD.
You have not said why you wanted to know the average, SD, or SE. The reason for examining the population of weights determines which measure you employ.
In question 4 you ask whether to use the SD or the SE. Only you can answer this. Are you interested in the population spread or in how good the estimate of the mean is?
Question 1. The mean of the means (the grand mean) has the correct SD, but usually an SE is not reported. Yes, n = 3.
Question 2. Neither of the calculated values is better. Observe the spread. (I agree with Jochen.)
Question 3. Your question contains the answer. What is your interest, the spread or the estimate of the mean?
Question 4. This SD is different from the sample SDs. Test for normality and examine the range.
Statistics is too often taught without focus on the question asked.