What information does an advanced IAER-AMT hybrid algorithm provide to a climatologist?

Water turbidity (TU) is a widely used indicator of water quality. Given the time-consuming nature of direct TU measurement, the development of an accurate prediction model is essential. In this study, the Alternating Model Tree (AMT) and its hybrid versions via iterative absolute error regression (IAER), bootstrap aggregating (bagging, BA), the weighted instance handler wrapper (WIHW), and random subspace (RS) were used to predict TU in the Clackamas River, USA. Daily time series data of physicochemical water quality variables from 2006 to 2023, including water temperature (Tw), specific conductance (SC), dissolved oxygen (DO), and pH, as well as river physical parameters, including daily water discharge (Q) and water stage (WS), were used as potential input variables for TU prediction. The manual approach, principal component analysis (PCA), and correlation-based feature selection subset evaluation (CfsSubsetEval) techniques were compared under different input scenarios. Finally, the performance of the models was evaluated using various statistical measures, including the coefficient of determination (R²), root mean square error (RMSE), percentage bias (PBIAS), Nash–Sutcliffe efficiency (NSE), and root mean standard deviation ratio (RSR). WS had the greatest impact on TU prediction, while Tw had the least correlation. Furthermore, the input scenario that included all variables resulted in the highest model performance.
Based on the experimental dataset, the new hybrid algorithm IAER-AMT outperformed the others by achieving RMSE of 1.20 Formazin Nephelometric Units (FNU), NSE of 0.72, PBIAS of 3.17 % and RSR of 0.53, followed by BA-AMT (RMSE = 1.30 FNU, NSE = 0.67, PBIAS = −9.73 % and RSR = 0.57), WIHW-AMT (RMSE = 1.34 FNU, NSE = 0.65, PBIAS = −0.35 %, RSR = 0.58), RS-AMT (RMSE = 1.37 FNU, NSE = 0.64, PBIAS = −20.95 % and RSR = 0.60) and AMT (RMSE = 1.38 FNU, NSE = 0.63, PBIAS = −26.81 % and RSR = 0.60).

1. Introduction

Water turbidity (TU), often referred to as water clarity, is a critical parameter of water quality that significantly affects aquatic ecosystems. TU is influenced by suspended particles such as silt, clay, microscopic organic and inorganic matter, algae, and plankton. These materials typically originate from anthropogenic activities (e.g., mining, agriculture, construction, urbanization, and sewage generation), soil erosion, and phytoplankton growth in open waters, making them the main sources of TU [44]. Although TU does not directly impact human health, high TU can obscure harmful pathogens like Cryptosporidium—an apicomplexan parasite that can cause respiratory and gastrointestinal diseases—thereby posing indirect health risks. Elevated TU can also harm aquatic life, disrupt ecosystems (especially water-dependent biota), and increase water treatment costs. Organic materials may alter water characteristics such as alkalinity, while inorganic matter can introduce toxins and odors, all of which degrade water quality and pose potential consumption risks [63]. Together with contaminants from human, livestock, and industrial waste, these factors further deteriorate water quality and render it unfit for consumption [63]. Therefore, effective monitoring and measurement of TU are essential for sustainable water resource management. One common approach for measuring TU involves in-situ sensors capable of continuously monitoring several water quality parameters.
However, this method entails high initial costs, ongoing maintenance, and regular calibration to ensure accuracy [45,58]. Given that these sensors are sparsely distributed and primarily located at catchment outlets, there is a clear need for reliable, practical, flexible, and accurate alternatives to track temporal variations in river turbidity. Physical and numerical models, such as the consistent particle method (CPM) [61], offer predictive capabilities but often require extensive data for calibration and entail a time-consuming setup process. Moreover, simpler models like multiple regression [47] and the autoregressive integrated moving average (ARIMA) are limited by their linear structure and often cannot capture the complexity of TU dynamics. Effective models must accurately represent water quality characteristics, including extreme values, to predict TU in such complex systems. In recent years, machine learning (ML) has gained substantial attention in hydrology and water resources management as it is able to handle large datasets, uncover complex input–output relationships, address missing data, and operate with fewer input variables. ML models are generally easier to set up, implement, and calibrate than physical or numerical models and do not rely on parametric assumptions [13,64–66]. More recently, artificial neural networks (ANN) have been extensively applied to predict a wide variety of environmental phenomena, including wildfire susceptibility [36], landslide and land subsidence [4,41,56,67], streamflow forecasting [27,32], as well as flood [11] and water quality modeling [3,19,31,42,55]. However, these methods often encounter challenges related to data length requirements and convergence [9].
As an alternative, adaptive neuro-fuzzy inference systems (ANFIS), which integrate ANN with fuzzy logic, along with other ML approaches such as support vector regression (SVR), extreme learning machine (ELM), gene expression programming (GEP), and group method of data handling (GMDH), have been widely applied in water resources studies for predicting various water quality parameters [21,37,50,54]. Each of these models, however, has limitations: for instance, SVR involves complex hyperparameter tuning, and ELM often requires large datasets and careful weight determination in membership functions [1,6]. Traditional ML models have been enhanced by integrating them with various metaheuristic algorithms; however, their implementation remains challenging due to inherent complexities. Moreover, selecting an appropriate, flexible, and robust model is not straightforward. Recent advancements in ML—especially tree-based methods (e.g., random forest (RF), M5 Prime, random tree (RT), and reduced error pruning tree (REPT)), lazy learners (e.g., KStar and locally weighted learning), and deep learning (DL) algorithms—consistently outperform traditional approaches [28,30]. These advanced models typically require less effort in hyperparameter optimization (except for DL models) and often incorporate classifier-based strategies such as bootstrap aggregation (bagging, BA), disjoint aggregation (dagging, DA), and additive regression to further boost predictive performance. Devi and Mamatha [14] compared linear regression, k-nearest neighbors (k-NN), and decision tree regression for predicting Saki Lake’s TU, demonstrating that decision tree regression produced superior results. Building on these insights, researchers have explored hybrid and ensemble techniques to enhance model performance. For example, Kargar et al. 
[29] integrated rotation forest-based classifiers (ROF) and weighted instance handler wrappers (WIHW) with models such as M5 Prime, REPT, RF, and RT for suspended sediment load prediction. Comparing these integrated approaches with empirical models, they found that hybridization generally improved performance, with the WIHW-RT hybrid model showing the most promise. Despite these advances, a knowledge gap persists concerning which classifier-based models most effectively enhance standalone model performance. Given its ability to encapsulate the predictive power of decision tree ensembles within a single tree structure [17], the AMT model emerges as a strong candidate for further hybridization. Additionally, Ehteram et al. [15] developed a hybrid model comprising a convolutional neural network (CNN), clockwork recurrent neural network (Clockwork RNN), and M5 Tree (CNN-CRNN-M5T) to predict water quality at the Gombak River in Malaysia. Their findings indicated that this integrated model outperformed others, underscoring its potential as a practical and cost-effective tool for water quality prediction. Rachid et al. [51] applied RF and SVM for predicting water potability, reporting that RF outperformed SVM. The literature consistently demonstrates that tree-based models serve as robust and reliable predictive tools across various domains. The current study addresses the existing knowledge gap by integrating tree-based models with several new hybrid algorithms for predicting TU in the Clackamas River, USA.
Using 6,083 data records from 2006 to 2023, the specific objectives were to: (1) develop a model to accurately predict TU using a few readily available river water quality variables, such as water temperature (Tw), specific conductance (SC), dissolved oxygen (DO), pH, daily water discharge (Q), and water stage (WS); (2) evaluate the potential of hybrid tree-based models—particularly the AMT and its ensemble variant via iterative absolute error regression (IAER-AMT)—and compare them with benchmark models such as standalone AMT, BA-AMT, WIHW-AMT, and RS-AMT; (3) determine the most effective combination of input variables for predicting TU; and (4) assess the efficiency of intelligent feature selection techniques, including principal component analysis (PCA) and correlation-based feature selection subset evaluation (CfsSubsetEval), relative to a manually constructed approach. To the best of our knowledge, this study is the first to apply a wide range of hybrid tree-based models, specifically the IAER-AMT algorithm, for TU prediction. The findings provide new insights into the potential of these algorithms to deliver simple, fast, accurate, and efficient predictions of turbidity in rivers and streams.

2. Materials and methods

2.1. Study area

The present study was conducted using data collected from a station operated by the United States Geological Survey (USGS; https://or.water.usgs.gov). The selected station, USGS 14210000, is located on the Clackamas River, Oregon, USA (latitude 45.2298, longitude −122.3539 in decimal degrees; see Fig. 1). The river’s catchment area covers 942 mi² (2,440 km²) and features considerable variation in elevation, ranging from 2 to 2,200 m above sea level. The Clackamas River exhibits notable seasonal turbidity variation, making it an ideal candidate for testing newly developed models due to the availability of comprehensive time-series data.

2.2. Data

This study utilized approximately 17 years of time-series data on physicochemical water quality variables from January 25, 2006, to October 21, 2023, collected daily at the Clackamas River. The variables considered as potential inputs for predicting water turbidity (TU; Formazin Nephelometric Units, FNU) included daily mean water discharge (Q; ft³ s⁻¹), water stage (WS; ft), specific conductance (SC; μS cm⁻¹), pH, dissolved oxygen (DO; mg L⁻¹), and water temperature (Tw; °C) (Table 1). Outliers were identified and removed using a Q–Q plot approach to enhance data quality, even though ML models are relatively robust to missing or noisy data. The cleaned dataset was then divided into two subsets: 70 % was allocated for model development (training) using data from January 25, 2006, to September 30, 2018, while the remaining 30 % was reserved for model evaluation (testing) using data from October 1, 2018, to October 21, 2023.

2.3. Input scenarios

In this study, both manual and intelligent feature selection techniques—specifically Correlation-based Feature Subset Evaluation (CfsSubsetEval, CSE) and Principal Component Analysis (PCA)—were employed to identify the most efficient and optimal input–output scenario.
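The chronological 70/30 split described above can be sketched with pandas; the frame, column names, and dates below are illustrative placeholders, not the actual USGS records:

```python
import pandas as pd

# Toy daily series standing in for the station records; values and
# dates are illustrative only.
df = pd.DataFrame(
    {"Q": [651.0, 700.0, 2800.0, 900.0], "TU": [0.4, 0.6, 5.2, 1.1]},
    index=pd.to_datetime(["2006-01-25", "2010-06-01", "2018-09-30", "2019-03-15"]),
)

# Chronological split: records up to the cutoff date train the model,
# later records are held out for testing (no shuffling, so the test
# period remains strictly after the training period).
cutoff = pd.Timestamp("2018-09-30")
train = df.loc[:cutoff]
test = df.loc[df.index > cutoff]
```

Keeping the split chronological, rather than random, avoids leaking future hydrological conditions into training, which matters for time-series evaluation.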
Input scenarios 3 and 4 were determined using CSE and PCA, respectively, whereas the remaining scenarios were constructed manually. Although various feature selection methods such as the F-test, Chi-square, mutual information, relief, and Laplacian score (LAP) exist [18], our focus was on identifying an optimal set of inputs based on high correlation and variability within the dataset.

2.3.1. Manual approach

The manual approach began by calculating the correlation coefficients between each potential input and the output variable. Initially, the most highly correlated variable was selected to form the first input scenario. Subsequently, the next most correlated variable was added to construct a second scenario consisting of the top two correlated variables. This iterative process continued, adding one variable at a time in descending order of correlation strength, until all potential inputs were included. Given six candidate input variables, this procedure resulted in six distinct input scenarios (Table 2).

Fig. 1. Map of the study area in Oregon, USA, showing the USGS measurement site (14210000) and the watershed boundary in which the station is located.

Table 1. Descriptive statistics of the water quality parameters during training (2006–2018, n = 4262) and testing (2018–2023, n = 1821) phases, including maximum, minimum, mean, and standard deviation (SD).

Phase     Parameter       Maximum    Minimum   Mean      SD
Training  Q (ft³ s⁻¹)     28900.00   651.00    2766.66   2531.06
          WS (ft)         21.23      10.35     12.26     1.44
          SC (μS cm⁻¹)    73.00      27.00     50.95     11.64
          pH              8.40       6.70      7.41      0.16
          DO (mg L⁻¹)     15.50      8.10      11.27     1.49
          Tw (°C)         20.60      0.40      9.95      4.77
          TU (FNU)        155.00     0.00      2.66      6.87
Testing   Q (ft³ s⁻¹)     24800.00   549.00    2340.14   2253.93
          WS (ft)         19.97      10.46     12.40     1.29
          SC (μS cm⁻¹)    75.00      27.00     54.02     11.24
          pH              7.80       7.00      7.47      0.12
          DO (mg L⁻¹)     14.80      8.30      11.21     1.48
          Tw (°C)         20.30      2.10      9.98      4.97
          TU (FNU)        59.00      0.40      1.65      2.82

Table 2. Overview of the six input combination scenarios proposed for water turbidity (TU) modeling. The manual approach sequentially adds correlated variables to the model, while CfsSubsetEval (CSE) and principal component analysis (PCA) techniques reduce dimensionality by removing redundant and irrelevant features.

Scenario  Input combination                   Output
1         Q                                   TU
2         Q, WS                               TU
3         Q, WS, SC (CSE method)              TU
4         Q, WS, SC, pH (PCA method)          TU
5         Q, WS, SC, pH, DO                   TU
6         Q, WS, SC, pH, DO, Tw               TU

2.3.2. CfsSubsetEval technique

The performance of ML models can be significantly impaired by the presence of redundant and irrelevant features, leading to decreased accuracy and unreliable results. Feature selection algorithms are therefore critical for identifying and removing such features [22–24]. CfsSubsetEval, proposed by Hall [22], employs a greedy stepwise search algorithm to evaluate subsets of features based on their individual predictive abilities while accounting for redundancy among them. Specifically, CfsSubsetEval selects features that exhibit high correlation with the target variable and minimal inter-correlation with other features. The heuristic “merit” of a feature subset S containing k features is defined using Pearson’s correlation coefficients as follows:

Merit_S = k·R̄_cf / √(k + k(k − 1)·R̄_ff)    (1)

where Merit_S is the heuristic merit of subset S, R̄_cf is the average correlation between the features and the target variable, and R̄_ff is the average inter-correlation among the features [22–24].

2.3.3. PCA technique

PCA is a multivariate statistical technique used to describe and reduce interdependencies among correlated features.
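The merit criterion in Eq. (1) can be checked numerically; the function name and correlation values below are invented for illustration:

```python
import math

def cfs_merit(k, r_cf, r_ff):
    """Heuristic merit of a k-feature subset (Eq. 1): r_cf is the
    average feature-target correlation, r_ff the average
    feature-feature inter-correlation."""
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# A subset whose features correlate well with the target but little
# with each other scores higher than an equally relevant but
# internally redundant subset.
relevant = cfs_merit(3, r_cf=0.6, r_ff=0.2)
redundant = cfs_merit(3, r_cf=0.6, r_ff=0.9)
```

The denominator grows with the inter-correlation R̄_ff, which is exactly how CfsSubsetEval penalizes redundancy during its greedy stepwise search.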
By transforming a high-dimensional dataset into a lower-dimensional space, PCA preserves as much of the original variance as possible while eliminating redundancy. Through this transformation, PCA generates new uncorrelated variables known as principal components (PCs), which are linear combinations of the original variables. Each principal component captures a portion of the total variance in the data, with the first PC explaining the largest share, the second PC explaining the next largest share of the remaining variance, and so on. PCs are associated with eigenvalues (λ) and eigenvectors that define the directions of maximum variance in the feature space. The eigenvalues are ranked in descending order to reflect the amount of variance each PC accounts for. This ranking aids in selecting the number of components to retain, ensuring minimal information loss while simplifying the dataset. PCA is particularly useful for mitigating multicollinearity, as it transforms correlated features into a reduced set of orthogonal components that can serve as robust inputs for predictive modeling.

2.4. Hyperparameter tuning

Hyperparameters can vary depending on the dataset and regional characteristics; therefore, calibrating the model to determine the optimal hyperparameter values has a significant impact on modeling performance. In this study, the optimal values for each hyperparameter were identified through a trial-and-error process. The model was initially executed using default parameters, after which the effects of adjusting these values—both lower and higher—on performance were systematically investigated. A reduction in root mean squared error (RMSE) was used as the primary indicator of enhanced model efficiency and optimal hyperparameter configuration. All analyses were conducted and model performance evaluated using the Waikato Environment for Knowledge Analysis (WEKA 3.9) software.

2.5. Model description

This section provides an overview of the various ML models and ensemble techniques employed in this study, highlighting their fundamental principles and how they contribute to improving prediction accuracy.

2.5.1. Alternating model tree (AMT)

AMT, proposed by Frank et al. [17], is an ensemble learning model that improves upon the traditional decision tree (DT) approach. Unlike standard DTs that rely solely on discrete splits, the AMT incorporates two principal types of nodes within a single tree structure: (1) splitter nodes (S-N), which partition the data by splitting attributes at their median values; and (2) prediction nodes (P-N), each of which contains a simple linear regression model to predict continuous numerical outcomes. AMT combines aspects of boosting and tree-based modeling by using a forward stagewise approach alongside cross-validation to grow the tree and determine its optimal size. Two key parameters must be set during training: (1) the number of iterations for tree growth and (2) the shrinkage parameter (λ), which controls the learning rate [17,20,46]. This combination allows AMT to capture complex nonlinear relationships while maintaining interpretability and efficiency.

2.5.2. Iterative absolute error regression (IAER)

IAER is a regression algorithm designed to optimize model parameters by iteratively minimizing the absolute error between predicted and target values. IAER operates under the assumption that the data space can be partitioned into several regions, each of which can be modeled using a linear approximation. During the first iteration, the algorithm fits a linear regression model on a portion of the data, creating multiple local models that collectively form an ensemble. Through successive iterations, IAER refines these models by recalculating weights and adjusting parameters to reduce the overall absolute error.
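The core mechanic—shrinking absolute error by iteratively reweighting a fit—can be illustrated with a generic iteratively reweighted least squares (IRLS) sketch for a single linear model. This is our simplified illustration of the idea, not the WEKA IAER implementation:

```python
import numpy as np

def l1_linear_fit(X, y, n_iter=50, eps=1e-8):
    """IRLS that approximately minimises the mean absolute error of a
    linear model: small residuals earn large weights, pulling each
    successive weighted least-squares solve toward the L1 optimum."""
    Xb = np.column_stack([np.ones(len(X)), X])   # add intercept column
    w = np.ones(len(y))                          # start unweighted
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * Xb, sw * y, rcond=None)
        # reweight: 1/|residual|, guarded against division by zero
        w = 1.0 / np.maximum(np.abs(y - Xb @ beta), eps)
    return beta

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[3] += 30.0                     # one gross outlier
intercept, slope = l1_linear_fit(x, y)
```

Because the absolute-error criterion down-weights the single outlier, the recovered line stays close to the nine clean points, which is the behavior that motivates IAER-style hybrids for peak-laden turbidity series.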
This iterative process not only improves accuracy but also accounts for experimental uncertainties, making IAER especially beneficial when dealing with heterogeneous data distributions [49,59].

2.5.3. Bootstrap aggregating (BA)

BA, commonly known as bagging, is an ensemble technique introduced by Breiman [5] that aims to improve predictive accuracy by reducing variance. Bagging involves two distinct steps: (i) bootstrap resampling: multiple subsets of the training data are randomly sampled with replacement, and each subset is used to train a separate model, often referred to as a weak learner; (ii) aggregation: predictions from all the individual models are combined, typically by averaging for regression tasks or majority voting for classification tasks. By aggregating diverse models trained on different data subsets, bagging reduces overfitting and stabilizes the overall prediction, leading to more robust performance [25,60].

2.5.4. Weighted instance handler wrapper (WIHW)

The WIHW method enhances model training by assigning different weights to individual training instances. It employs a resampling strategy where the Euclidean distance between a new data point and each training instance is computed. Instances that are more similar (i.e., have a smaller distance) to the new point receive higher weights, thereby exerting more influence on the prediction. This approach helps to emphasize informative instances while diminishing the impact of outliers or less relevant data, potentially improving model accuracy [39].

2.5.5. Random subspace (RS)

RS, introduced by Ho et al. [26], is a feature partitioning method designed to address challenges in high-dimensional data, including class imbalance and redundancy. Unlike instance-based algorithms, RS partitions the feature space, making it particularly suitable for datasets with numerous redundant or irrelevant features [62].
In RS, an ensemble is constructed by randomly selecting multiple subspaces from the feature space and assigning a single classifier to each subspace, which aids in managing high dimensionality [10]. For each selected subset, the partial correlation among different features is calculated using the pseudoinverse, ensuring that the most informative relationships are captured [7]. The final RS prediction is then obtained by combining the outputs from the ensemble of base learners derived from these subspaces [26].

2.5.6. Model development and integration

Fig. 2 illustrates how ensemble ML algorithms are coupled with standalone algorithms to form hybrid methods. In this study, the primary predictive model is the AMT, which serves as the base learner. Multiple instances of the AMT are trained on different subsets of the training data (from one to N), and these individual models are then integrated using ensemble techniques such as IAER, BA, WIHW, and RS. This hybridization aims to leverage the strengths of both the base AMT model and the ensemble methods to enhance predictive accuracy and robustness.

2.6. Model evaluation

Model validation on the testing split involved the application of both qualitative and quantitative metrics. Qualitative techniques included scatter plots, time-series graphs, Taylor diagrams, raincloud plots, and histograms of the frequency distribution of residual errors, each providing visual insights into model performance and error distribution. On the quantitative side, several statistical metrics were employed: the coefficient of determination (R²), root mean squared error (RMSE), Nash–Sutcliffe efficiency (NSE), percentage bias (PBIAS), and the root mean standard deviation ratio (RSR), as detailed in Table 3, where TU_M and TU_P are the measured and predicted turbidity, respectively; the corresponding overbarred terms denote their mean values; and n is the number of records in the testing dataset. R² ranges from 0 to 1 and indicates the proportion of variance in the observed data that is explained by the model. Typically, an R² value exceeding 0.75 is considered meaningful and strong [53]. However, R² is sensitive to extreme or outlier values, which can mask differences between predicted and measured datasets [43]. Therefore, relying solely on R² is not advisable for evaluating ML models. RMSE serves as a quadratic scoring rule that calculates the average magnitude of prediction errors, with greater emphasis on larger discrepancies due to the squaring of errors.
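The four quantitative metrics can be computed directly from the standard definitions; the sample arrays below are invented, and the PBIAS sign convention (positive values indicate underestimation) is the usual one:

```python
import numpy as np

def evaluation_metrics(measured, predicted):
    """RMSE, NSE, PBIAS (%) and RSR from the standard definitions."""
    m = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((m - p) ** 2))
    # NSE: 1 minus residual variance over observed variance
    nse = 1.0 - np.sum((m - p) ** 2) / np.sum((m - m.mean()) ** 2)
    # PBIAS: positive when the model underestimates on average
    pbias = 100.0 * np.sum(m - p) / np.sum(m)
    # RSR: RMSE normalised by the standard deviation of observations
    rsr = rmse / np.std(m)
    return {"RMSE": rmse, "NSE": nse, "PBIAS": pbias, "RSR": rsr}

metrics = evaluation_metrics([1.0, 2.0, 4.0, 8.0], [1.2, 1.8, 4.5, 7.5])
```

Because RSR divides RMSE by the observed standard deviation, it is unit-free, which is what makes it comparable across stations and studies.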
The NSE, which ranges from negative infinity to one, compares the variance of model residuals to the variance of the measured data, providing insight into the noise-to-information ratio. NSE is widely regarded as a reliable error metric in hydrology, with classifications such as unsatisfactory (NSE ≤ 0.4), acceptable (0.40 < NSE ≤ 0.50), satisfactory (0.50 < NSE ≤ 0.65), good (0.65 < NSE ≤ 0.75), and very good (0.75 < NSE ≤ 1.00) [43]. The PBIAS metric evaluates the average tendency of predicted values to be larger or smaller than observed values, with thresholds categorizing performance as very good (< ±10 %), good (±10 % to < ±15 %), or satisfactory (±15 % to < ±25 %) [43]. Finally, the RSR metric provides a comprehensive evaluation of model performance by integrating error index statistics with a normalization factor, facilitating direct comparisons across different models and studies.

3. Results

3.1. Variable importance

To assess the relative contribution of each predictor to TU, we conducted two complementary analyses: (1) a correlation-based assessment (both Pearson’s r and Spearman’s rank ρ) and (2) a random forest (RF) feature-importance analysis. Fig. 3 depicts both Pearson’s and Spearman’s correlation coefficients (upper-diagonal panels) alongside scatter plots (lower-diagonal panels) and kernel density distributions (diagonal histograms). For TU, Q exhibits the highest Pearson correlation (r = 0.71; p < 0.05), while WS has the highest Spearman correlation (ρ = 0.70; p < 0.05). These strong positive relationships suggest that higher discharge and water stage values tend to coincide with elevated turbidity. Conversely, SC, pH, and Tw display moderate to weak negative correlations with TU, indicating that increasing conductivity, more alkaline conditions, or warmer temperatures are generally associated with lower turbidity.
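The two correlation measures can be reproduced with pandas; the toy values below are illustrative, not the Clackamas data:

```python
import pandas as pd

# Toy records standing in for the daily series; values invented so
# that Q, WS, and TU rise together.
df = pd.DataFrame({
    "Q":  [700, 900, 1500, 2800, 5200, 9000],
    "WS": [10.5, 10.9, 11.4, 12.8, 14.1, 16.0],
    "TU": [0.5, 0.7, 1.1, 2.4, 6.0, 14.0],
})

pearson = df.corr(method="pearson")["TU"]     # linear association
spearman = df.corr(method="spearman")["TU"]   # rank (monotonic) association
```

Spearman's ρ only asks whether the relationship is monotonic, which is why it can exceed Pearson's r when turbidity responds nonlinearly to stage or discharge, as in the Fig. 3 comparison.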
Among the predictor-predictor relationships, DO and Tw present the most pronounced negative correlation (r = −0.96; p < 0.05), reflecting their well-known inverse dependence (cooler water often contains more dissolved oxygen, and vice versa). Additionally, Q and WS are themselves highly correlated (r = 0.94; p < 0.05), consistent with standard rating curves, highlighting that larger discharges typically elevate water levels. While these strong inter-predictor correlations can be indicative of shared hydrological processes, they also underscore the importance of evaluating each variable’s unique contribution to turbidity through more advanced, multivariate techniques. To complement the correlation-based findings, we trained a random forest regressor and extracted each predictor’s importance score (Fig. 4). This multivariate approach considers interactions among variables and quantifies how much each feature reduces prediction error across the ensemble of DTs. In our analysis, WS emerged as the most influential parameter (importance = 0.53), followed by Q (0.21). These results reinforce the earlier correlation findings, suggesting that flow-related variables strongly drive the variability in turbidity. Physically, higher water levels and larger discharges can transport greater volumes of suspended sediments, thereby increasing turbidity. The remaining predictors—pH, SC, DO, and Tw—show comparatively smaller yet nontrivial importance scores, ranging from 0.07 down to 0.05. Their roles may be partially overshadowed by the dominant hydrological parameters, but they still contribute to overall predictive performance. These combined correlation and feature-importance results guided subsequent modeling strategies and the selection of optimal input scenarios for TU prediction.

3.2. Optimum input scenario

The configuration including all predictors—Scenario 6—yielded the most accurate TU predictions.
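The RF feature-importance analysis described above can be sketched with scikit-learn; the synthetic two-variable data, names, and coefficients are ours, standing in for the study's six predictors:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in: "turbidity" driven mainly by stage, weakly by
# temperature, plus a little noise.
stage = rng.uniform(10, 20, 500)
temp = rng.uniform(0, 20, 500)
tu = 0.8 * stage + 0.05 * temp + rng.normal(0.0, 0.1, 500)

X = np.column_stack([stage, temp])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, tu)

# Impurity-based importances: the mean reduction in prediction error
# attributable to splits on each feature, normalised to sum to 1.
importance = dict(zip(["stage", "temp"], rf.feature_importances_))
```

Because the importances are normalized to sum to one, they are relative shares rather than absolute effect sizes, which is how values such as 0.53 for WS and 0.21 for Q should be read.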
Although the PCA and CfsSubsetEval feature-selection methods each produced reduced sets of input variables, neither outperformed this manually curated scenario. Specifically, PCA led to about 215 % higher RMSE in the training phase and 165 % higher in the testing phase when compared to Scenario 6, while the CSE approach showed increases of 227 % and 199 % in the training and testing phases, respectively. These findings suggest that, although both PCA and CSE are useful for automatically screening out redundant or irrelevant variables, retaining the full set of physically relevant inputs (Q, WS, SC, pH, DO, and Tw) significantly strengthens the model’s ability to capture the multifaceted processes affecting turbidity in a river system.

3.3. Models’ performance evaluation

Following model calibration, we validated the standalone and ensemble versions of the AMT using the testing dataset. Figs. 5–8 collectively illustrate how well each model replicates measured turbidity, covering aspects such as scatter plots (Fig. 5), time-series profiles (Fig. 6), a Taylor diagram (Fig. 7), and distributional characteristics (Fig. 8). The measured-versus-predicted comparisons (Fig. 5) reveal that AMT and RS-AMT perform almost identically in terms of coefficient of determination (R² ≈ 0.76–0.78). By contrast, BA-AMT, IAER-AMT, and WIHW-AMT (all yielding R² ≈ 0.80) show slightly higher correlations. In terms of RMSE, IAER-AMT achieves the lowest value (1.20 FNU), outperforming the other ensemble methods (BA-AMT: 1.302 FNU; RS-AMT: 1.373 FNU; WIHW-AMT: 1.344 FNU) and the standalone AMT (1.380 FNU). While none of the models perfectly captures rare extreme events (> 30 FNU)—with some underestimating and others overestimating—our ensemble-based algorithms still produce reliable, physically reasonable estimates and generally track medium-to-high turbidity values more accurately than the standalone AMT.
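Rankings this close are usually backed by paired significance tests on the per-day errors; a sketch with SciPy, where the two error series are synthetic and model "B" is constructed to be slightly worse every day:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic per-day absolute errors for two hypothetical models.
err_a = np.abs(rng.normal(0.0, 1.0, 200))
err_b = err_a + np.abs(rng.normal(0.10, 0.05, 200))

t_stat, t_p = stats.ttest_rel(err_a, err_b)   # paired t-test on daily errors
w_stat, w_p = stats.wilcoxon(err_a, err_b)    # distribution-free counterpart
```

Pairing by day removes the shared day-to-day variability of the river from the comparison, so even a small but consistent error gap between two models can register as significant.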
Turbidity prediction is inherently challenging due to (1) complex, nonlinear sediment-transport processes; (2) the influence of multiple interacting environmental factors; and (3) the stochastic nature of extreme flow events. Despite these hurdles and the expected misalignment at the highest values, our hybrid models achieve high-quality performance, as demonstrated by the performance metrics and the time-series comparison in Fig. 6. To further evaluate model behavior over a broad range of hydrological conditions, Fig. 6 compares predicted and measured turbidity time-series from 2018 to 2023, highlighting the NSE metric. Among the five models, IAER-AMT again leads with an NSE of 0.72—well above the 0.65 threshold often categorized as “good” and approaching the 0.75 threshold for “very good” performance. The other models fall within an NSE range of 0.63–0.67, indicating that they also replicate overall turbidity patterns satisfactorily, though somewhat less effectively. While extreme peaks pose challenges for all approaches, IAER-AMT demonstrates a higher fidelity to both low (< 4.0 FNU) and high (> 4.0 FNU) turbidity values than the other models, which are more prone to under- or overestimations at peak turbidity levels. Table 5 collates the numerical performance of all five models, reinforcing the ensemble approaches’ superiority over the standalone AMT. IAER-AMT emerges as the top performer, achieving the lowest RMSE (1.20 FNU) and highest NSE (0.72), followed by BA-AMT (RMSE = 1.30 FNU, NSE = 0.67), WIHW-AMT (1.34 FNU, 0.65), RS-AMT (1.37 FNU, 0.64), and finally AMT (1.38 FNU, 0.63). Paired t-tests, Wilcoxon signed-rank tests, normal-theory 95 % confidence intervals (CIs) on MAE differences, and bootstrap 95 % CIs on RMSE differences show that IAER-AMT reduces mean absolute error by 0.0557 FNU vs. AMT (95 % CI; p = 0.003) and by 0.0695 FNU vs. WIHW-AMT (95 % CI; p < 0.001), whereas differences with BA-AMT and RS-AMT were not significant (p > 0.05). Overall, and according to the Moriasi et al.
[68] guidelines for hydrological modeling, IAER-AMT, BA-AMT, and WIHW-AMT fall into the “good” category (0.65 < NSE ≤ 0.75), whereas RS-AMT and AMT are rated “satisfactory” (0.50 < NSE ≤ 0.65). Regarding bias, IAER-AMT slightly underestimates turbidity (PBIAS = 3.17 %), whereas the other ensembles show mild to moderate overestimation (negative PBIAS); BA-AMT (−9.73 %) and WIHW-AMT (−0.35 %) remain within the “very good” threshold of < ±15 %, while RS-AMT (−20.95 %) and the standalone AMT (−26.81 %) fall into the “good” class (±15 % ≤ |PBIAS| < ±30 %). Across all error indices, IAER-AMT clearly offers the most balanced and accurate turbidity predictions, confirming the efficacy of integrating AMT with an iterative absolute-error approach for modeling riverine water quality.

4. Discussion

Predicting TU in a river system is crucial not only for monitoring turbidity but also for enabling broader water quality assessments. Maintaining continuous monitoring stations is expensive, and sensors require frequent calibration. Thus, accurate and cost-effective alternatives—particularly data-driven models—are indispensable. Numerous factors can influence the predictive efficacy of these models, including data quality and resolution, model architecture, computational efficiency, input-variable selection, the ratio of training-to-testing data, and the total length of records [70]. The following subsections discuss how these elements affect model performance.

4.1. Effect of input variables and optimum input scenario on models’ performance

Irrelevant input data can undermine the predictive accuracy of ML models; therefore, removing them is often beneficial. In this study, we compared manually selected inputs with those obtained via two feature-selection techniques (i.e., PCA and CfsSubsetEval). Contrary to expectations, the “intelligent” methods did not yield the best combination of predictors.
Feature-selection algorithms tend to favor variables strongly correlated with the target—potentially discarding variables that, despite lower individual correlations, can add complementary information to improve prediction. By systematically adding and removing predictors, our manually constructed approach revealed that including all predictors (Q, WS, SC, pH, DO, and Tw) delivered superior results. This process also doubles as a form of sensitivity analysis, enabling researchers to see how each predictor influences turbidity estimates. Feature selection remains valuable in scenarios with exceptionally large numbers of candidate variables, where a purely manual approach becomes intractable. Our findings—aligned with [33,35]—reinforce the importance of considering hydrological context and physical relevance, not just correlation strength, when determining an optimal input set.

4.2. Effect of data splitting ratio on models’ performance

Selecting the proportion of data allocated for training versus testing is a perennial question in ML research. Kisi et al. [38] compared different splits (50:50, 60:40, 75:25) and reported that increasing the training portion generally enhances predictive power. Similarly, Nguyen et al. [48] tested ratios ranging from 10:90 to 90:10 and found that 70:30 achieved superior performance. Our choice of a 70:30 split follows this consensus, balancing the need to train sufficiently complex models while preserving ample data for unbiased validation. This ratio is also prevalent in both time-series forecasting [2,40] and spatial modeling [8,69].

4.3. Comparison of models’ performance

We evaluated multiple tree-based algorithms, all relying on the same optimized set of input variables, yet displaying varying degrees of accuracy. Their structural differences—particularly how they split and combine decision nodes—yield unique responses to data noise and nonlinearities.
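The structural idea behind the BA-AMT hybrid, bootstrap aggregation [5], is easy to illustrate. AMT itself is not available in common Python libraries, so the sketch below substitutes a one-split regression stump as the base learner (an assumption for illustration only); the resample-fit-average loop is the bagging mechanism itself:

```python
import numpy as np

def fit_stump(x, y):
    """One-split regression stump on a single feature: the simplest base learner."""
    t = np.median(x)
    left, right = y[x <= t], y[x > t]
    lo = left.mean() if left.size else y.mean()
    hi = right.mean() if right.size else y.mean()
    return t, lo, hi

def predict_stump(stump, x):
    t, lo, hi = stump
    return np.where(x <= t, lo, hi)

def bagged_predict(x, y, x_new, n_estimators=50, seed=0):
    """Bootstrap aggregation: fit base learners on resamples, average predictions."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(y), len(y))  # resample training pairs with replacement
        preds.append(predict_stump(fit_stump(x[idx], y[idx]), x_new))
    return np.mean(preds, axis=0)
```

The random subspace (RS) variant differs only in what is resampled: feature columns rather than training instances [26].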
Generally, tree-based models are robust to non-normal distributions and require minimal data preprocessing [9,16,34]. The ensemble techniques tested here also outperformed the standalone AMT model, indicating that hybridization can bolster predictive reliability and flexibility. Among the ensemble variants, IAER-AMT ranked highest in nearly all evaluation metrics, likely due to its iterative emphasis on minimizing absolute errors between predictions and observations. By sequentially refining these local “mini-models,” IAER-AMT achieves a more balanced fit, especially in handling outliers or complex patterns. These findings reinforce earlier studies [12] that demonstrated how mixing or chaining learning algorithms often delivers better results than any standalone approach.

4.4. Seasonality of water quality parameters

Fig. 10 underscores the seasonal variability of hydrological and water quality parameters at the study site. Although the traditional definition of seasons (winter, spring, summer, autumn) is applicable, the data more naturally split into a “warm” period (May-October) and a “cold” period (November-April). During the warm season, both Q and WS tend to be at their lowest levels, with minima commonly observed between July and September (particularly in August). This seasonal low-flow period coincides with higher SC readings: as discharge diminishes, the relative concentration of dissolved constituents can increase, elevating specific conductance. Warmer water also holds less dissolved oxygen; indeed, DO attains its minimum in July, aligned with peak temperatures (Tw). Under these conditions, turbidity remains relatively low—limited precipitation and runoff lead to less sediment entering the river, and extended base-flow conditions support clarity. During the cold season, elevated precipitation and snowmelt raise Q and WS, with both often peaking around April.
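The warm/cold partition described above can be reproduced directly from a daily record. In the self-contained sketch below, a synthetic discharge series (peaking in mid-April, as in the text) stands in for the station data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2006-01-01", "2023-12-31", freq="D")

# Synthetic daily discharge peaking around mid-April (day-of-year ~105).
doy = np.asarray(dates.dayofyear, dtype=float)
q = 40 + 25 * np.cos(2 * np.pi * (doy - 105) / 365.25)
q = q + rng.normal(0.0, 3.0, len(dates))
df = pd.DataFrame({"Q": q}, index=dates)

# Warm season: May-October; cold season: November-April (as in the text).
df["season"] = np.where(df.index.month.isin(range(5, 11)), "warm", "cold")

monthly_median = df.groupby(df.index.month)["Q"].median()  # monthly pattern
seasonal_mean = df.groupby("season")["Q"].mean()           # warm vs. cold contrast
```

The same month-wise grouping applied to each observed variable yields the seasonal patterns summarized in Fig. 10.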
Although SC shows some fluctuation in winter, its overall range is narrower, in part because greater flows dilute dissolved minerals. By contrast, cold water increases oxygen solubility, resulting in higher DO levels (commonly peaking around January). Meanwhile, turbidity tends to spike in mid to late winter (December-February) due to runoff, soil erosion, and possible storm events mobilizing sediments. Over the entire year, pH exhibits only modest fluctuations—ranging from about 7.3 to 7.6—with slightly higher values in the warm months. These minor pH changes likely reflect biological activity (e.g., algal photosynthesis) and temperature-dependent chemical equilibria, rather than large-scale shifts in watershed geochemistry. Overall, the cyclical interplay among discharge, water temperature, and dissolved oxygen—combined with seasonal variability in precipitation and runoff—drives monthly turbidity patterns in this river, consistent with the correlation analyses linking Q, WS, and DO to TU.

4.5. Limitations and future work recommendations

In this study, model hyperparameters were fine-tuned primarily through trial and error. Although this approach yielded robust predictions, integrating metaheuristic algorithms (e.g., the gradient-based optimizer (GBO), Runge-Kutta optimizer (RUN), and differential evolution (DE); [52]) could further automate and optimize hyperparameter searches. Moreover, the data-splitting ratio—and the specific periods chosen for training and testing—can significantly influence model performance. Future research might systematically evaluate how shifting these splits across different hydrologic years affects predictive skill and model transferability. Furthermore, we strongly recommend applying these calibrated models to additional regions worldwide with both similar and contrasting characteristics to rigorously evaluate their generalization capabilities.
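Two of the checks recommended above, a chronological 70:30 split and a bootstrap comparison of paired model errors (as in the model-comparison tests reported earlier), can be sketched with numpy alone. The percentile-CI variant here is an assumption, since the text does not specify which bootstrap form was used:

```python
import numpy as np

def chrono_split(n, train_frac=0.70):
    """Chronological split: first 70 % of the record for training, rest for testing."""
    cut = int(round(n * train_frac))
    return np.arange(cut), np.arange(cut, n)

def bootstrap_rmse_diff(err_a, err_b, n_boot=2000, seed=0):
    """Percentile 95 % CI on RMSE(A) - RMSE(B) from paired test-set residuals."""
    rng = np.random.default_rng(seed)
    n = len(err_a)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample residual pairs jointly
        diffs[b] = (np.sqrt(np.mean(err_a[idx] ** 2))
                    - np.sqrt(np.mean(err_b[idx] ** 2)))
    return np.percentile(diffs, [2.5, 97.5])
```

A CI lying entirely below zero indicates model A has significantly lower RMSE than model B at the 5 % level.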
Additionally, we recommend incorporating meteorological data, particularly rainfall, that are openly accessible through platforms such as Google Earth Engine [57] as additional predictor variables and evaluating their effectiveness for turbidity prediction. Despite these limitations, our results offer practical insights for water resources managers, local stakeholders, and policymakers, particularly in regions with limited hydrometric networks. Accurate turbidity estimates can guide water-treatment decisions, aid in the early detection of contamination spikes, and inform sustainable water resource planning.

5. Conclusions

This study evaluated the predictive capabilities of several ensemble tree-based ML algorithms for modeling daily TU in the Clackamas River, USA. A 17-year record (2006–2023) of physicochemical and hydrological parameters—Tw, SC, DO, pH, Q, and WS—served as potential model inputs. Three feature-selection strategies (PCA, CfsSubsetEval, and a manual approach) were compared, and multiple variants of the AMT—including IAER-AMT, BA-AMT, WIHW-AMT, and RS-AMT—were developed. The key findings are:

• WS exerted the strongest influence on turbidity, whereas Tw showed a comparatively weak correlation.

• A manually derived selection of Q, WS, SC, pH, DO, and Tw outperformed PCA- or CfsSubsetEval-based subsets. Although automated feature selection can be useful for large datasets, manual approaches remain indispensable when physical insight indicates that less-correlated variables may still add crucial information.

• The novel IAER-AMT hybrid model achieved superior predictive accuracy, followed by BA-AMT, WIHW-AMT, RS-AMT, and the standalone AMT model. These findings confirm that combining AMT with ensemble methods notably enhances model robustness.

• IAER-AMT, BA-AMT, and WIHW-AMT displayed “good” predictive skill (0.65 < NSE ≤ 0.75), while RS-AMT and AMT were classified as “satisfactory” (0.50 < NSE ≤ 0.65).
Compared to the standalone AMT, the four ensemble models improved performance by 14.3 %, 6.34 %, 3.2 %, and 1.6 %, respectively.

• Although this study used six readily available variables, other water-quality factors (such as oxidation-reduction potential) and meteorological inputs (such as precipitation) could further refine the model.

• DL architectures could be tested and compared with tree-based ensembles, potentially improving high-turbidity event capture. Incorporating metaheuristic algorithms for hyperparameter tuning may likewise yield additional gains in model accuracy and stability.

Overall, these results underscore the effectiveness of ensemble tree-based ML methods for turbidity prediction, even when relying on a relatively small set of easily measured inputs. The IAER-AMT model, in particular, shows strong promise for extension to other rivers and for broader water-quality forecasting applications.

Author statement

During the preparation of this work, the authors used ChatGPT solely to correct spelling and grammar. They then reviewed and edited the content as necessary and accept full responsibility for the publication’s content.

Data availability

Data related to this study are available upon request.

CRediT authorship contribution statement

Khabat Khosravi: Writing – review & editing, Writing – original draft, Software, Formal analysis, Data curation, Conceptualization. Aitazaz Farooque: Conceptualization, Methodology, Supervision, Writing – review & editing. Ali Reza Shahvaran: Writing – original draft, Software, Methodology, Conceptualization. Prasad Daggupati: Writing – review & editing, Methodology, Conceptualization. Salim Heddam: Writing – original draft, Data curation, Conceptualization. Javad Hatamiafkoueieh: Writing – original draft, Data curation, Conceptualization.
Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Khabat Khosravi reports article publishing charges and statistical analysis were provided by the University of Prince Edward Island. Khabat Khosravi reports a relationship with the University of Prince Edward Island that includes: employment and funding grants. Khabat Khosravi has a patent pending to Khabat Khosravi and Aitazaz Farooque.

Declarations

Ethics Approval: Not applicable.

Consent to Participate: Not applicable.

Consent for Publication: Not applicable.

Conflicts of Interest: The authors declare that there is no conflict of interest associated with this research or manuscript. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] Ahmad MW, Reynolds J, Rezgui Y. Predictive modelling for solar thermal energy systems: a comparison of support vector regression, random forest, extra trees and regression trees. J Clean Prod 2018;203:810–21.
[2] Ali M, Prasad R, Xiang Y, Deo R. Near real-time significant wave height forecasting with hybridized multiple linear regression algorithms. Renew Sustain Energy Rev 2020;132:110003.
[3] Alsaeed R, Alaji B, Ebrahim M. Predicting turbidity and aluminum in drinking water treatment plants using hybrid network (GA-ANN) and GEP. Drink Water Eng Sci Discuss 2021 [preprint]. https://doi.org/10.5194/dwes-2021-8.
[4] Azarakhsh Z, Azadbakht M, Matkan A. Estimation, modeling, and prediction of land subsidence using Sentinel-1 time series in Tehran-Shahriar plain: a machine learning-based investigation. Remote Sens Appl: Soc Environ 2022;25:100691.
[5] Breiman L. Bagging predictors. Machine Learning 1996;24:123–40.
[6] Bui DT, Pradhan B, Nampak H, Bu QT, Tran QA, Nguyen QP.
Hybrid artificial intelligence approach based on neural fuzzy inference model and metaheuristic optimization for flood susceptibility modeling in a high-frequency tropical cyclone area using GIS. J Hydrol 2016;540:31.
[7] Chen T, Ryali S, Qin S, Menon V. Estimation of resting-state functional connectivity using random subspace based partial correlation: a novel method for reducing global artifacts. Neuroimage 2013;82:87–100. https://doi.org/10.1016/j.neuroimage.2013.05.118.
[8] Chen W, Panahi M, Pourghasemi HR. Performance evaluation of GIS-based new ensemble data mining techniques of adaptive neuro-fuzzy inference system (ANFIS) with genetic algorithm (GA), differential evolution (DE) for landslide spatial modeling. Catena 2017;157:310–24.
[9] Choubin B, Darabi H, Rahmati O, Sajedi-Hosseini F, Kløve B. River suspended sediment modelling using the CART model: a comparative study of machine learning techniques. Sci Total Environ 2018;615:272–81.
[10] Choudhury SD, Yu JG, Samal A. Leaf recognition using contour unwrapping and apex alignment with tuned random subspace method. Biosyst Eng 2018;170:72–84. https://doi.org/10.1016/j.biosystemseng.2018.04.001.
[11] Chu H, Wu W, Wang QJ, Nathan R, Wei J. An ANN-based emulation modelling framework for flood inundation modelling: application, challenges and future directions. Environ Model Softw 2020;124:104587.
[12] De’ath G, Fabricius KE. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 2000. https://doi.org/10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2.
[13] Deng T, Duan H, Keramat A. Spatiotemporal characterization and forecasting of coastal water quality in the semi-enclosed Tolo Harbour based on machine learning and EKC analysis. Eng Appl Comput Fluid Mech 2022;16(1).
[14] Devi PD, Mamatha G.
Machine learning approach to predict the turbidity of Saki Lake, Telangana, India, using remote sensing data. Meas: Sens 2024;33:101139.
[15] Ehteram M, Ahmed A, Sherif M, El-Shafie A. An advanced deep learning model for predicting water quality index. Ecol Ind 2024;160:111806.
[16] Fijani E, Khosravi K. Hybrid iterative and tree-based machine learning algorithms for lake water level forecasting. Water Resour Manag 2023:1–27.
[17] Frank E, Mayo M, Kramer S. Alternating model trees. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing; 2015. p. 871–878. https://doi.org/10.1145/2695664.2695848.
[18] Galal Uddin M, Nash S, Rahman A, Olbert A. Assessing optimization techniques for improving water quality model. J Clean Prod 2023;385:135671.
[19] Galal Uddin M, Rahman A, Taghikhah FR, Olbert A. Data-driven evolution of water quality models: an in-depth investigation of innovative outlier detection approaches - a case study of Irish water quality index (IEWQI) model. Water Res 2024;225:121499.
[20] Gao W, Alsarraf J, Moayedi H, Shahsavar A, Nguyen H. Comprehensive preference learning and feature validity for designing energy-efficient residential buildings using machine learning paradigms. Appl Soft Comput 2019;84:105748. https://doi.org/10.1016/j.asoc.2019.105748.
[21] Haghiabi AH, Nasrolahi AH, Parsaie A. Water quality prediction using machine learning methods. Water Qual Res J 2018;53(1):3–13. https://doi.org/10.2166/wqrj.2018.025.
[22] Hall MA. Correlation-based feature selection for machine learning. Doctoral dissertation. The University of Waikato; 1999. p. 198.
[23] Hall MA. Correlation-based feature selection of discrete and numeric class machine learning (working paper 00/08). Hamilton, New Zealand: University of Waikato, Department of Computer Science; 2000.
[24] Hall MA, Smith LA. Feature subset selection: a correlation based filter approach; 1997.
[25] Hassan A, Bhuiyan MIH.
Computer-aided sleep staging using complete ensemble empirical mode decomposition with adaptive noise and bootstrap aggregating. Biomed Signal Process Control 2016;24:1–10. https://doi.org/10.1016/j.bspc.2015.09.002.
[26] Ho TK. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 1998;20(8):832–44. https://doi.org/10.1109/34.709601.
[27] Hussain M, Bari SH, Mahmud I, Siddiquee H. Application of different artificial neural network for streamflow forecasting. In: Advances in Streamflow Forecasting: From Traditional to Modern Approaches; 2021. p. 149–70. https://www.sciencedirect.com/science/article/pii/B9780128206737000068.
[28] Hussain D, Khan AA. Machine learning techniques for monthly river flow forecasting of Hunza River, Pakistan. Earth Sci Inf 2020;13:939–49.
[29] Kargar K, Safari MJS, Khosravi K. Weighted instances handler wrapper and rotation forest-based hybrid algorithms for sediment transport modeling. J Hydrol 2021;598:126452.
[30] Kashyap V, Poddar A, Sihag P, et al. Forecasting compressive strength of jute fiber reinforced concrete using ANFIS, ANN, RF and RT models. Asian J Civ Eng 2023. https://doi.org/10.1007/s42107-023-00892-y.
[31] Khairi MTM, Ibrahim S, Yunus MAM, et al. Artificial neural network approach for predicting the water turbidity level using optical tomography. Arab J Sci Eng 2016;41:3369–79. https://doi.org/10.1007/s13369-015-1904-6.
[32] Khan M, Khan AU, Khan S, Haleem K, Khan F. Streamflow forecasting for the Hunza river basin using ANN, RNN, and ANFIS models. Water Pract Technol 2023;18(5):981–93. https://doi.org/10.2166/wpt.2023.060.
[33] Khosravi K, Golkarian A, Melesse AM, Deo RC. Suspended sediment load modeling using advanced hybrid rotation forest based elastic network approach. J Hydrol 2020;610:127963.
[34] Khosravi K, Pham BT, Chapi K, Shirzadi A, Shahabi H, Revhaug I.
A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran. Sci Total Environ 2018;627:744–55.
[35] Khosravi K, Barzegar R, Miraki S, Adamowski J, Daggupati P, et al. Stochastic modeling of groundwater fluoride contamination: introducing lazy learners. Groundwater 2020;58(5):723–34.
[36] Khosravi K, Mosallanejad A, Bateni SM, Kim D, Jun C, Shahvaran AR, et al. Assessing pan-Canada wildfire susceptibility by integrating satellite data with novel hybrid deep learning and black widow optimizer algorithms. Sci Total Environ 2025;977:179369. https://doi.org/10.1016/j.scitotenv.2025.179369.
[37] Kikon A, Dodmani B, Barma S, Naganna S. ANFIS-based soft computing models for forecasting effective drought index over an arid region of India. AQUA - Water Infrastructure, Ecosystems and Society 2023;72(6):930–46. https://doi.org/10.2166/aqua.2023.204.
[38] Kisi O, Khosravinina P, Nikpour MR, Sanikhani H. Hydrodynamics of river-channel confluence: toward modeling separation zone using GEP, MARS, M5 tree and DENFIS techniques. Stoch Environ Res Risk Assess 2019;33:1089–107.
[39] Kohavi R, John GH. Wrappers for feature subset selection. Artif Intell 1997;97(1–2):273–324.
[40] Kouadio L, Deo RC, Byrareddy V, Adamowski JF, Mushtaq S, Phuong Nguyen V. Artificial intelligence approach for the prediction of robusta coffee yield using soil fertility properties. Comput Electron Agric 2018;155:324–38. https://doi.org/10.1016/j.compag.2018.10.014.
[41] Ku CY, Liu CY. Modeling of land subsidence using GIS-based artificial neural network in Yunlin County, Taiwan. Sci Rep 2023;13:4090. https://doi.org/10.1038/s41598-023-31390-5.
[42] Kumar L, Afzal MS, Ahmad A. Prediction of water turbidity in a marine environment using machine learning: a case study of Hong Kong. Reg Stud Mar Sci 2022;52:102260.
[43] Legates D, McCabe G.
Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour Res 1999;35(1):233–41.
[44] Leigh C, Kandanaarachchi S, McGree JM, Hyndman RJ, Alsibai O, Mengersen K, et al. Predicting sediment and nutrient concentrations from high-frequency water-quality data. PLoS One 2019;14(8):e0215503. https://doi.org/10.1371/journal.pone.0215503.
[45] Leigh C, Alsibai O, Hyndman R, Kandanaarachchi S, King OC, McGree J, et al. A framework for automated anomaly detection in high frequency water-quality data from in situ sensors. Sci Total Environ 2019;664:885–98.
[46] Moayedi H, Aghel B, Foong LK, Bui DT. Feature validity during machine learning paradigms for predicting biodiesel purity. Fuel 2020;262:116498. https://doi.org/10.1016/j.fuel.2019.116498.
[47] Muharni Y, Hartono N. An application of multiple regression for predicting turbidity of standard water quality for industrial and household consumption. J Ind Serv 2021;7(1):8–11.
[48] Nguyen QH, Ly HB, Ho LS, Al-Ansari N, Le HV, Tran VQ, et al. Influence of data splitting on performance of machine learning models in prediction of shear strength of soil. Math Probl Eng 2021, Special Issue: Artificial Intelligence for Civil Engineering, Article ID 4832864. https://doi.org/10.1155/2021/4832864.
[49] Nordemann DJR, Rigozo NR, de Souza Echer MP, Echer E. Principal components and iterative regression analysis of geophysical series: application to sunspot number (1750–2004). Comput Geosci 2008;34(11):1443–53. https://doi.org/10.1016/j.cageo.2007.09.022.
[50] Rahbar A, Mirarabi A, Nakhaei M, et al. A comparative analysis of data-driven models (SVR, ANFIS, and ANNs) for daily karst spring discharge prediction. Water Resour Manage 2022;36:589–609. https://doi.org/10.1007/s11269-021-03041-9.
[51] Rachid B, Abderrahim S, Hafid A, Souad R.
Predicting water potability using a machine learning approach. Environ Challenges 2025;19:101131. https://www.sciencedirect.com/science/article/pii/S2667010025000496.
[52] Samadi-Koucheksaraee A, Shirvani-Hosseini S, Ahmadianfar I, Gharabaghi B. Optimization algorithms surpassing metaphor. Singapore: Springer; 2022. https://doi.org/10.1007/978-981-19-2519-1_1.
[53] Santhi C, Arnold JG, Williams JR, Dugas WA, Srinivasan R, Hauck LM. Validation of the SWAT model on a large river basin with point and nonpoint sources. J Am Water Resour Assoc 2001;37(5):1169–88.
[54] Saravani MJ, Saadatpour M, Shahvaran AR. A web GIS based integrated water resources assessment tool for Javeh reservoir. Expert Syst Appl 2024;252:124198. https://doi.org/10.1016/j.eswa.2024.124198.
[55] Saravani MJ, Noori R, Jun C, Kim D, Bateni SM, Kianmehr P, et al. Predicting chlorophyll-a concentrations in the world’s largest lakes using Kolmogorov-Arnold networks. Environ Sci Tech 2025;59(3):1801–10. https://doi.org/10.1021/acs.est.4c11113.
[56] Selamat SN, Majid NA, Taha MR, Osman A. Landslide susceptibility model using artificial neural network (ANN) approach in Langat River basin, Selangor, Malaysia. Land 2022;11:833. https://doi.org/10.3390/land11060833.
[57] Shahvaran AR, Pour HK, Binding C, Van Cappellen P. Mapping satellite-derived chlorophyll-a concentrations from 2013 to 2023 in western Lake Ontario using Landsat 8 and 9 imagery. Sci Total Environ 2025;968:178881. https://doi.org/10.1016/j.scitotenv.2025.178881.
[58] Shi Z, Chow C, Fabris R, Liu J, Jin B. Applications of online UV-vis spectrophotometer for drinking water quality monitoring and process control: a review. Sensors 2022;22(8):2987.
[59] Strasters JK, Kim ST, Khaledi MG. Multiparameter optimizations in micellar liquid chromatography using the iterative regression optimization strategy. J Chromatogr A 1991;586(2):221–32.
[60] Tahraoui H, Amrane A, Belhadj AE, Zhang J.
Modeling the organic matter of water using the decision tree coupled with bootstrap aggregated and least squares
