27 June 2024

In other words, I did not use the trained XGBoost model to make predictions on the test set and then use SHAP for interpretation. The reasons are as follows:

  • Even with the best and most comprehensive data available, we cannot achieve accurate predictions because the patterns are highly variable. (I constructed a 1,048-day dataset at 30-minute time resolution. Input features: soil temperature, soil moisture, air temperature, rain, barometric pressure, and relative humidity; target feature: soil CO2 (ppm).)
  • Cross-validation and test sets are primarily used to ensure the stability and generalizability of the model, while SHAP interpretation is for understanding and analyzing the model's decision-making process. These two aspects are not equivalent.
  • If we treat the test set as an exam, then only performing well on that exam proves the model's reliability and justifies SHAP interpretation. In my context, however, can cross-validation instead be viewed as reviewing and understanding the learned material, rather than just sitting an exam? From this perspective, even when the questions are very difficult or the patterns vary greatly, the process of reviewing and understanding is still valuable.
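The per-fold interpretation workflow described above (fit on each training split, interpret on that fold's held-out data, then aggregate across folds) can be sketched as follows. This is a minimal illustration with synthetic data standing in for the soil dataset; it uses scikit-learn's `GradientBoostingRegressor` as a lightweight stand-in for XGBoost and permutation importance as a stand-in for SHAP values, so the structure of the loop, not the specific explainer, is the point.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 1,048-day / 30-minute soil dataset.
features = ["soil_temp", "soil_moisture", "air_temp", "rain", "pressure", "rh"]
n = 500
X = rng.normal(size=(n, len(features)))
# Synthetic target dominated by soil temperature, as a CO2 proxy.
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

fold_importances = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = GradientBoostingRegressor(random_state=0).fit(X[train_idx], y[train_idx])
    # Interpret on the held-out fold rather than a separate test set:
    # this mirrors "reviewing the learned material" per fold.
    result = permutation_importance(
        model, X[val_idx], y[val_idx], n_repeats=5, random_state=0
    )
    fold_importances.append(result.importances_mean)

# Aggregate the per-fold attributions into one ranking.
mean_importance = np.mean(fold_importances, axis=0)
for name, imp in sorted(zip(features, mean_importance), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

With real data, replacing the inner interpretation step with `shap.TreeExplainer(model).shap_values(X[val_idx])` and stacking the per-fold SHAP arrays gives the same fold-wise picture of the model's decision-making without relying on a single held-out test set.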