I am currently learning how DWT (discrete wavelet transform) decomposition is used in time series forecasting with ML models. However, I have a concern about potential data leakage during the wavelet decomposition step.

During the convolution step, the wavelet filters may use future data points to compute the coefficients at each decomposition level. This means that even if we split the data into training and testing sets first and then decompose each set separately, the test set can still "see into the future" at any given time step, unless a causal wavelet filter (e.g., db1, the Haar wavelet) is used.
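To make the concern concrete, here is a minimal sketch of the effect (assuming PyWavelets, imported as `pywt`, a db4 filter, and a purely synthetic series): the detail coefficients computed for the same time positions change once later samples are appended, because the filter and the boundary handling draw on values after the analysed point.

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
x = rng.standard_normal(512)          # synthetic series, illustrative only

# Level-1 detail coefficients from the first 256 samples only
_, d_short = pywt.dwt(x[:256], 'db4')

# Same positions, but 32 "future" samples are now available to the filter
_, d_long = pywt.dwt(x[:288], 'db4')

# Coefficients near the end of the 256-sample window are no longer equal:
# originally they relied on a symmetric boundary extension, now they use
# the true future values.
n = len(d_short)
print(np.allclose(d_short, d_long[:n]))      # expected: False
print(np.max(np.abs(d_short - d_long[:n])))  # non-zero near the boundary
```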

Now, I have two main questions:

  • How can we effectively prevent data leakage during wavelet decomposition in time series forecasting? (I sketch one candidate approach after these questions.)
  • Am I missing something? I've come across many high-impact journal papers in which wavelet-ML models are used for time series forecasting. These papers typically decompose the entire dataset with non-causal wavelet filters and then feed the decomposed signals into the ML model, and they often report very high prediction accuracy (even for 2- or 3-step-ahead forecasting). But if future data influences the decomposition, are those metrics truly valid?
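For the first question, one candidate fix I am considering is a walk-forward scheme: recompute the decomposition at every forecast origin using only the samples observed up to that point, so no coefficient ever depends on data after the origin. Below is a rough sketch, again assuming PyWavelets; the helper name `causal_wavelet_features` is mine, not taken from any paper.

```python
import numpy as np
import pywt

def causal_wavelet_features(x, t, wavelet='db4', level=2):
    """Features for forecasting x[t+1], built from x[0..t] only."""
    history = x[: t + 1]
    coeffs = pywt.wavedec(history, wavelet, level=level, mode='symmetric')
    # Take the most recent coefficient of each sub-band, so nothing beyond
    # time t ever enters the feature vector.
    return np.array([band[-1] for band in coeffs])

rng = np.random.default_rng(0)
x = rng.standard_normal(400)          # synthetic series, illustrative only

# Build a training matrix with a walk-forward loop over the training span;
# x[300:] is left untouched here as a test span.
X, y = [], []
for t in range(64, 299):
    X.append(causal_wavelet_features(x, t))
    y.append(x[t + 1])
X, y = np.array(X), np.array(y)
```

The obvious downside is cost, since the transform is recomputed at every forecast origin, which is part of why I am asking whether there is a standard way to handle this.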