Why is audio compression based on frame-by-frame processing?

Daniel Robert Franklin

Latency and memory requirements would then both depend on the number of samples in the signal; for communications applications, latency is critical. Compression may also be less effective, as the transformed signal would probably no longer be sparse in the frequency domain (when the transform is taken over the entire music or speech signal).
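To put rough numbers on the latency and memory point, here is a minimal sketch. The 44.1 kHz sample rate, the 16-bit stereo format, and the 1024-sample frame are illustrative assumptions, not figures from this thread, and the helper names are hypothetical. The latency shown is only the time needed to collect one block before any transform can start; it ignores processing time and any encoder look-ahead.

```python
def buffering_latency_ms(frame_len, sample_rate_hz):
    """Time needed to collect one block of samples, in milliseconds."""
    return 1000.0 * frame_len / sample_rate_hz


def buffer_bytes(frame_len, channels=2, bytes_per_sample=2):
    """Memory needed to hold one block of raw PCM (assumed 16-bit stereo)."""
    return frame_len * channels * bytes_per_sample


SAMPLE_RATE = 44_100  # assumed CD-quality sample rate

cases = [
    ("1024-sample frame", 1024),
    ("whole 3-minute track", 180 * SAMPLE_RATE),
]
for label, n in cases:
    print(label, buffering_latency_ms(n, SAMPLE_RATE), "ms,", buffer_bytes(n), "bytes")
```

Under these assumptions the frame has to be buffered for roughly 23 ms, while the single-block approach cannot start until the whole track has arrived.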

Guang Hua

Hi Daniel,

Thank you for your comments. So am I correct to say that if we do not consider buffering, latency, etc. (indeed, these have little to do with audio compression itself), then single-frame processing is acceptable?

Best wishes,

Guang HUA

As I said, compression will probably be less effective as you are more likely to have energy present right across the frequency domain. Consider a short musical scale played on a piano. With framing, each frame is likely to only contain one significant frequency component, so the transformed frame will be very sparse. The entire signal, however, has energy present at many frequencies and so will not be sparse. This is even more the case for speech which is a time-varying mix of random and periodic components.
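The piano-scale argument can be checked with a toy experiment. This is a sketch under strong simplifying assumptions: each "note" is a pure cosine that fits a whole number of cycles into one frame, the frame boundaries line up exactly with the note changes, and "significant" means a DFT bin whose magnitude is above 1% of the largest bin. The frame length, note frequencies, threshold, and function names are all made up for illustration.

```python
import cmath
import math


def dft_magnitudes(x):
    """Naive O(N^2) DFT; returns the magnitude of every bin."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def significant_bins(x, rel_threshold=0.01):
    """Count bins whose magnitude exceeds rel_threshold times the peak."""
    mags = dft_magnitudes(x)
    peak = max(mags)
    return sum(1 for m in mags if m > rel_threshold * peak)


FRAME_LEN = 64
NOTE_CYCLES = [4, 6, 8, 10]  # cycles per frame for each successive "note"

# Build the "scale": one pure tone per frame, played one after another.
signal = []
for k in NOTE_CYCLES:
    signal.extend(math.cos(2 * math.pi * k * t / FRAME_LEN) for t in range(FRAME_LEN))

frames = [signal[i:i + FRAME_LEN] for i in range(0, len(signal), FRAME_LEN)]

framed_total = sum(significant_bins(f) for f in frames)  # summed over all frames
whole_total = significant_bins(signal)                   # one transform of everything

print("significant bins, frame by frame:", framed_total)
print("significant bins, single block:  ", whole_total)
```

With framing, each frame needs only two bins (the tone and its mirror image), so eight in total. In the single-block transform every note is effectively a short burst inside a long window, so its energy leaks into neighbouring bins and many more coefficients are needed.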

Also, using framing, it is trivial to add a checksum to each frame so that you can just drop any frames that are damaged. With a single block transform, you are at much greater risk of losing the entire block (you would probably need to add a lot of forward error correction).
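As a sketch of the per-frame checksum idea, the following prefixes each frame with a CRC-32 of its payload and drops any frame whose checksum no longer matches. The 4-byte big-endian header layout and the function names are assumptions made for this example, not the format of any real codec.

```python
import struct
import zlib


def pack_frames(payload, frame_size):
    """Split payload into frames, each prefixed with a CRC-32 of its bytes."""
    frames = []
    for i in range(0, len(payload), frame_size):
        chunk = payload[i:i + frame_size]
        frames.append(struct.pack(">I", zlib.crc32(chunk)) + chunk)
    return frames


def unpack_frames(frames):
    """Return each frame's bytes, or None where the checksum does not match."""
    recovered = []
    for frame in frames:
        (expected_crc,) = struct.unpack(">I", frame[:4])
        chunk = frame[4:]
        recovered.append(chunk if zlib.crc32(chunk) == expected_crc else None)
    return recovered


payload = bytes(range(256)) * 4          # 1024 bytes of stand-in "audio" data
frames = pack_frames(payload, 256)       # four frames

# Simulate channel damage: flip one bit inside the third frame's payload.
damaged = bytearray(frames[2])
damaged[10] ^= 0x01
frames[2] = bytes(damaged)

recovered = unpack_frames(frames)
print([("ok" if chunk is not None else "dropped") for chunk in recovered])
```

Only the damaged frame is lost; the decoder can conceal or skip it and carry on with the rest. With a single block, the same bit error would put the whole signal in doubt.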

Understood! Thank you so much Daniel. This is a nice explanation.

Don Knox

Might also be worth mentioning the transient smearing issue. Energy in a transient portion of the audio signal can result in artefacts during the psychoacoustic compression stages. These can appear both before and after the transient in time, causing the well-known 'pre-echo' problem. MP3 compression alters frame sizes in response to signal content in an attempt to minimise this effect.

Bruno Putzeys

DRF's answer pretty much nails the fundamentals from an information theory perspective. There's also the psychoacoustic perspective (which, not surprisingly, says the same thing). The ear is something of a time/frequency analyzer. We don't hear the full spectrum of a whole piece of music at once, but at the other extreme, neither do we hear each sample individually. The spectral content of a part of the signal that's coming 1 second in the future has no bearing on what the signal sounds like now, so there's no point in taking it into account. If you want to base a compression algorithm on what the ear can and cannot discern, its operation should roughly mirror that of the ear.

Thank you so much Don and Bruno.

David G Shaw

Can I suggest that the answers above all fit into one category: the statistics of the signal change a lot over time. There is no good way to model this as a single block, so the work is done in frames that are assumed to have roughly constant statistics over their duration. As mentioned, when there are rapid changes, encoders often shorten the block or frame over which the analysis is done to deal with them.
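The idea of shortening the block around sudden changes can be sketched as follows. This is a toy energy-jump detector, not the decision rule of MP3 or any other real encoder (those typically rely on a psychoacoustic model); the block lengths, the ratio threshold, and the function name are assumptions chosen for illustration.

```python
import math


def choose_block_lengths(signal, long_len=64, short_len=16, ratio=8.0):
    """For each long frame, use short blocks if energy jumps between sub-blocks."""
    lengths = []
    for start in range(0, len(signal) - long_len + 1, long_len):
        frame = signal[start:start + long_len]
        energies = [
            sum(s * s for s in frame[i:i + short_len])
            for i in range(0, long_len, short_len)
        ]
        # Transient: a sub-block is much louder than the one just before it.
        transient = any(
            energies[i] > ratio * (energies[i - 1] + 1e-9)
            for i in range(1, len(energies))
        )
        if transient:
            lengths.extend([short_len] * (long_len // short_len))
        else:
            lengths.append(long_len)
    return lengths


def tone(amplitude, start, count):
    """A sine with an 8-sample period, so every 16-sample block has equal energy."""
    return [amplitude * math.sin(2 * math.pi * t / 8) for t in range(start, start + count)]


# Quiet passage, then a sudden loud attack halfway through the second frame.
signal = tone(0.1, 0, 64) + tone(0.1, 64, 32) + tone(1.0, 96, 32) + tone(1.0, 128, 64)

lengths = choose_block_lengths(signal)
print(lengths)
```

The steady frames keep the long block, where the statistics are roughly constant, and only the frame containing the attack is split into short blocks, which limits how far any coding noise can spread in time around the transient.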

Abdelhalim abdelnaby Zekry

You want to increase the size of the audio frame so that it contains the whole audio message. At first glance, I think this question needs some investigation to answer.

But I think the frame size is determined by the latency allowed in the transmission system.

All of the signal processing has to be carried out within that latency time. Another important point is that the transmission medium may be dynamic, so decoding may become very complicated or even impossible if the frame size is increased. I think the frame size is dictated more by the more complex decoding process than by the coding process. The transmission medium also imposes restrictions on the frame size.

Also, the cost of the processing must be considered, even if it increases only in proportion to the frame size, as one cannot use powerful computing platforms in all communication equipment.

Another important point is that an audio signal is not continuous in nature but contains interruptions.

I think the optimum frame size varies among applications.

Best wishes