I need data-sets of scientific workflows (or any other workflows) for machine-learning-based scheduling and resource allocation. I am looking for suitable resources.
Many classic OR benchmark data-sets (e.g., http://people.brunel.ac.uk/~mastjjb/jeb/info.html, http://www.schedulingbenchmarks.org/, http://www.om-db.wi.tum.de/psplib/data.html) await you.
However, the classic data-sets usually do not come with features (aka attributes) for learning. So, you can:
define features (and labels) after downloading those benchmark problems; then
apply any machine learning method to predict the labels that indicate the (sub-)optimal solutions, via classification (discrete labels) or regression (continuous labels).
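To make the two steps above concrete, here is a minimal sketch in Python, not a definitive recipe. Everything in it is an assumption for illustration: small random permutation flow shop instances stand in for downloaded benchmark problems, the three per-job features and the "scheduled in the first positions" label are hypothetical choices (they are not the features from the references below), and a hand-written 1-nearest-neighbour rule stands in for whatever learning method you prefer.

```python
import itertools
import random


def makespan(proc, order):
    """Makespan of a permutation flow shop; proc[j][m] is the time of job j on machine m."""
    n_machines = len(proc[0])
    finish = [0] * n_machines  # completion time of the latest scheduled job on each machine
    for j in order:
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the job
            # has left machine m - 1.
            start = max(finish[m], finish[m - 1] if m > 0 else 0)
            finish[m] = start + proc[j][m]
    return finish[-1]


def best_order(proc):
    """Brute-force optimal permutation (only feasible for very small instances)."""
    jobs = range(len(proc))
    return min(itertools.permutations(jobs), key=lambda o: makespan(proc, o))


def job_features(proc, j):
    """Hypothetical features: the job's share of total work, first- and last-machine load."""
    total = sum(sum(row) for row in proc)
    return [sum(proc[j]) / total, proc[j][0] / total, proc[j][-1] / total]


def build_dataset(n_instances, n_jobs=5, n_machines=3, seed=0):
    """Step 1: turn instances into (features, label) rows, one row per job."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_instances):
        # Random instance as a stand-in for a parsed benchmark file (assumption).
        proc = [[rng.randint(1, 20) for _ in range(n_machines)] for _ in range(n_jobs)]
        for pos, j in enumerate(best_order(proc)):
            X.append(job_features(proc, j))
            # Discrete label: 1 if the job sits in the first n_jobs // 2 positions
            # of an optimal sequence, else 0.
            y.append(1 if pos < n_jobs // 2 else 0)
    return X, y


def predict_1nn(X_train, y_train, x):
    """Step 2: classify x with the label of its nearest training row (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X_train]
    return y_train[dists.index(min(dists))]
```

For example, `X, y = build_dataset(50)` followed by `predict_1nn(X, y, job_features(proc, j))` gives a predicted label for job `j` of a new instance `proc`. For regression, the label could instead be a continuous value such as the job's relative position in the optimal sequence. For real benchmark instances, brute force would have to be replaced by the best-known solutions published with the data-sets.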
You may refer to some examples of effective feature definitions:
Article A Tensor Based Hyper-heuristic for Nurse Rostering
Thesis A suboptimum- and proportion-based heuristic generation meth...
(Table 6.4: Features defined for flow shop scheduling; and Chap. 6)
Could you take a look at this link (https://pdfs.semanticscholar.org/541d/3becf71e934db6ef3081dd1ee8f821e9dba5.pdf)? You may find important information there regarding scheduling benchmark data-sets.