I am about to conduct a comprehensive study of evaluation metrics for machine learning algorithms (classification, regression, and clustering).

I am aware of various real-world data sets and synthetic data generators. However, I am not aware of any of the following:

  • Brand-new, very challenging real-world tabular data sets
  • New synthetic data generators with a reasonable number of hyperparameters that mimic some real-world data phenomenon

Any advice, links, or hints would be greatly appreciated.
