My research measures F1 and F2 of long and short /a/. My question is whether these two values can be combined into a single value called "formant". Is that phonetically correct?
You can use the difference between the first two formants (F2 − F1). The literature treats this as a robust way to verify how intermediate productions shift relative to the target.
Suggested references
FABRICIUS, A.; WATT, D.; JOHNSON, D. A comparison of three speaker-intrinsic vowel formant frequency normalization algorithms for sociophonetics. Language Variation and Change, 21, 2009, p. 413–435.
KENT, R. D.; VORPERIAN, H. K. Static Measurements of Vowel Formant Frequencies and Bandwidths: A Review. J Commun Disord, 74, 2018, p. 74–97.
LINDBLOM, B.; SUNDBERG, J. Acoustical consequences of lip, tongue, jaw, and larynx movements. Journal of the Acoustical Society of America, 50, 1971, p. 1166–1179.
No, Eman Altrike, it does not make sense to turn the values of F1 and F2 into one. The spectrogram of a vowel is the acoustic effect of a coordination of articulatory gestures. Each formant is the acoustic result of the action of one articulator (jaw, tongue, lips). For tongue advancement/retraction, the most reliable acoustic correlate is the difference (distance) between the first and second formants. Good luck with your studies!
Turning both values into one value is indeed not appropriate. But, as I mentioned, dynamic productions that fall in between categories in the vowel space have been well characterised by interformant values (for tongue retraction, though, F3 − F2 differences would work well).
Since you have two vowel categories (long and short /a/), interformant measures may give a useful representation of your data.
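To make the interformant idea concrete, here is a minimal Python sketch that computes F2 − F1 per token and averages it per vowel category. The token values and the names `tokens`, `interformant_distance` and `mean_distance_by_category` are invented purely for illustration; they are not real measurements, and your own data layout may well differ.

```python
from statistics import mean

# Hypothetical placeholder measurements in Hz: (vowel category, F1, F2).
# These numbers are made up for illustration, not taken from any data set.
tokens = [
    ("short_a", 700.0, 1500.0),
    ("short_a", 680.0, 1450.0),
    ("long_a", 750.0, 1300.0),
    ("long_a", 770.0, 1350.0),
]


def interformant_distance(f1, f2):
    """Return the F2 - F1 distance in Hz for one token."""
    return f2 - f1


def mean_distance_by_category(rows):
    """Group tokens by vowel category and average their F2 - F1 distance."""
    grouped = {}
    for category, f1, f2 in rows:
        grouped.setdefault(category, []).append(interformant_distance(f1, f2))
    return {category: mean(values) for category, values in grouped.items()}


summary = mean_distance_by_category(tokens)
print(summary)
```

Note that F1 and F2 are still kept as separate measurements here; the distance is an additional derived measure, not a replacement for them.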
As I mentioned in the comment above, check the references listed there.
It would help to know what you are trying to look into. As Leônidas suggests, a transformation may be appropriate for your research question, and there are various ones out there depending on your needs (for example, there are several ways to normalise the vowel space between speakers).
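As one example of such a normalisation, here is a minimal sketch of Lobanov-style z-scoring, where each formant is standardised within a speaker. It is only one of the available options, not a recommendation for your data. The function name `lobanov`, the use of the sample standard deviation, and the example values are my assumptions. The method also assumes that each speaker contributes tokens from several vowel categories, so it may not suit a data set that contains only /a/.

```python
from statistics import mean, stdev


def lobanov(values):
    """z-score one speaker's values for a single formant (e.g. all their F1s).

    Assumption: the sample standard deviation is used; some implementations
    use the population standard deviation instead.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical F1 values (Hz) for one speaker, invented for illustration.
speaker_f1 = [300.0, 500.0, 700.0]
normalised = lobanov(speaker_f1)
print(normalised)
```

In practice you would apply this separately to each formant of each speaker, and then compare the normalised values across speakers.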
The reason I don't think there is a clear-cut answer to which transformation to use, or whether to use one at all, is that we are dealing with a complex phenomenon. A useful way of thinking about this is that a formant is just an acoustic measurement (or perhaps a series of acoustic measurements, if you can deal with time series data) that reflects the state of a speaker's articulatory system. It arises from some possibly very non-trivial interaction of the voice source and the vocal tract resonances. The resonances are not formants per se, because they are properties of the physical system that may or may not be fed acoustic energy. There's a developing body of literature on using computational models to determine the resonances and to examine their relation to observed formants. The results tell us that for large, low-voiced males the picture is mainly the way textbooks often lay it out, but that there are confounds (such as variation in vocal tract anatomy due to age, gender and individual factors, and, dynamically, nasality) that produce surprising effects.
And finally, depending on your question and materials, it might make sense to look at even more formants and possibly transformations between them. Before doing so, I would encourage you to plot the data and see how it behaves, so that you have a clearer understanding of what is happening. I've seen F4 recommended as a normalising value for one of the lower formants; when I then looked at F4 in a data set I was working on, I discovered that it was so badly tracked that I could not trust it, and I had to use other analysis methods.
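Alongside plotting, a rough numeric check of how well a formant was tracked could look like the sketch below. It is not the method I used on my own data. The function name `track_quality`, the 500 Hz jump threshold and the example track are all assumptions made for illustration, and a sensible threshold will depend on your frame rate and speakers.

```python
def track_quality(track, max_jump_hz=500.0):
    """Summarise one formant track given as a list of Hz values per frame.

    Frames where the tracker returned nothing are represented as None.
    Jumps are counted between consecutive present frames, so a pair that
    straddles a missing frame is treated as adjacent (a simplification).
    """
    missing = sum(1 for v in track if v is None)
    present = [v for v in track if v is not None]
    jumps = sum(
        1 for a, b in zip(present, present[1:]) if abs(b - a) > max_jump_hz
    )
    n_pairs = max(len(present) - 1, 1)  # avoid division by zero
    return {
        "missing_share": missing / len(track) if track else 0.0,
        "jump_share": jumps / n_pairs,
    }


# Hypothetical F4 track (Hz), invented for illustration.
f4_track = [3500.0, 3520.0, None, 2600.0, 3490.0, 3510.0]
report = track_quality(f4_track)
print(report)
```

A high share of missing frames or large jumps suggests that the formant should not be trusted without checking the tracks by eye.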