To my knowledge, there are no unsupervised phonetic segmentation methods based on energy alone. Energy can help distinguish voiced from unvoiced phonemes, but that distinction also requires a correlation measure (for example, autocorrelation), because voicing shows up as periodicity in the speech waveform. Phonetic segmentation usually requires a labeled corpus for training, and the features are typically MFCCs together with their first and second derivatives, in addition to energy.
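To illustrate why energy alone is not enough for the voiced/unvoiced decision, here is a minimal sketch that combines frame energy with the peak of the normalized autocorrelation. It is only an illustration under assumptions of mine, not a published method: the function names, the F0 search range (60 to 400 Hz) and both thresholds are hypothetical choices that would need tuning on real speech.

```python
import math
import random

def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def peak_normalized_autocorr(frame, min_lag, max_lag):
    """Largest autocorrelation value, normalized by lag 0, for lags in [min_lag, max_lag]."""
    mean = sum(frame) / len(frame)
    x = [v - mean for v in frame]
    r0 = sum(v * v for v in x)
    if r0 == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        best = max(best, r / r0)
    return best

def is_voiced(frame, sample_rate, f0_min=60.0, f0_max=400.0,
              energy_floor=1e-4, corr_threshold=0.5):
    """Assumed decision rule: enough energy AND a strong periodicity peak."""
    if frame_energy(frame) < energy_floor:
        return False  # too quiet to be speech at all
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    return peak_normalized_autocorr(frame, min_lag, max_lag) >= corr_threshold

# Synthetic 40 ms frames at 8 kHz (stand-ins for real speech frames).
SAMPLE_RATE = 8000
N = 320
periodic = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(N)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]
silence = [0.0] * N

print(is_voiced(periodic, SAMPLE_RATE))
print(is_voiced(noise, SAMPLE_RATE))
print(is_voiced(silence, SAMPLE_RATE))
```

The noise frame has plenty of energy but no periodicity, so an energy threshold alone would confuse it with the periodic frame; the correlation peak is what separates the two.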
Although finding the boundaries is difficult, you can use the concept of landmarks in speech. A landmark represents each segment with a single point in time. Landmarks can also be used to derive the phone or segment boundaries. See the work of Carol Espy-Wilson and Sharlene A. Liu (1996 JASA paper).
The tool "Prosogram" can perform an automatic segmentation based on local peaks in the intensity of the band-pass filtered speech signal. The tool does not need a labeled corpus. However, although I have used it several times, I do not know how accurate its automatic segmentation is without labels. It may be worth a try.
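The general idea of segmenting at local intensity peaks can be sketched as follows. This is not Prosogram's actual algorithm: the band-pass filtering step is left out entirely, and the function names, the smoothing width and the peak-height threshold are assumptions of mine for illustration only.

```python
def short_time_energy(samples, frame_len, hop):
    """Mean squared amplitude per frame (a crude intensity envelope)."""
    return [sum(x * x for x in samples[s:s + frame_len]) / frame_len
            for s in range(0, len(samples) - frame_len + 1, hop)]

def smooth(values, half_width):
    """Moving average; the window is shortened at the edges."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def local_peaks(envelope, min_height):
    """Indices of local maxima that reach at least min_height."""
    return [i for i in range(1, len(envelope) - 1)
            if envelope[i] >= min_height
            and envelope[i] > envelope[i - 1]
            and envelope[i] >= envelope[i + 1]]

def boundaries_between_peaks(envelope, peaks):
    """Place one boundary at the envelope minimum between consecutive peaks."""
    bounds = []
    for a, b in zip(peaks, peaks[1:]):
        segment = envelope[a:b + 1]
        bounds.append(a + segment.index(min(segment)))
    return bounds

# Hand-made envelope standing in for the smoothed intensity of two syllables.
envelope = [0.0, 0.2, 1.0, 0.3, 0.1, 0.4, 0.9, 0.2, 0.0]
peaks = local_peaks(envelope, min_height=0.5)
bounds = boundaries_between_peaks(envelope, peaks)
print(peaks, bounds)
```

On real recordings you would compute the envelope with short_time_energy, smooth it, and convert the resulting frame indices to seconds using the hop size and sample rate.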
While it is notoriously hard to find phoneme boundaries without supervision, the timing of syllable onsets is easier to obtain. I have listed some references below in case they are useful for your application:
Mermelstein, P. (1975). Automatic segmentation of speech into syllabic units. The Journal of the Acoustical Society of America, 58(4), 880-883.
Nagarajan, T., Murthy, H. A., & Hegde, R. M. (2003). Segmentation of speech into syllable-like units. In Proceedings of Eurospeech.
Hyafil, A., & Cernak, M. (2015). Neuromorphic Based Oscillatory Device for Incremental Syllable Boundary Detection. In Proceedings of Interspeech.
Räsänen, O., Doyle, G., & Frank, M. C. (2015). Unsupervised word discovery from speech using automatic segmentation into syllable-like units. In Proceedings of Interspeech.
There are much better ways to do this, such as segmentation into acoustic units using hierarchical Dirichlet processes: https://www.researchgate.net/publication/271849557_Speech_Acoustic_Unit_Segmentation_Using_Hierarchical_Dirichlet_Processes