I am working on a problem for which there is no historical data on which to build an association rule model (and eventually a decision tree model), so we are trying to generate some. Our original feature set contains 21 feature variables, each of which can take 3 to 6 categorical values (most have 5). We built a tool that randomly assigns one of the associated categorical values to each feature and then has the users make a selection based on those values. This yields roughly 5x10^13 possible combinations, which is far too many for the handful of users we plan on using.
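For concreteness, here is a minimal sketch of the kind of random-assignment tool I mean; the feature names and category sizes below are placeholder assumptions, not our actual schema:

```python
import math
import random

# Placeholder schema: 21 features, 5 categorical values each. Our real
# features have 3-6 values (most have 5), so this is only illustrative.
CATEGORY_SIZES = [5] * 21

features = {
    f"feature_{i}": [f"value_{j}" for j in range(size)]
    for i, size in enumerate(CATEGORY_SIZES)
}

def random_scenario():
    """Assign one randomly chosen categorical value to each feature."""
    return {name: random.choice(values) for name, values in features.items()}

# The combination space is the product of the per-feature category counts.
# With this placeholder schema that is 5**21 (about 4.8e14); our real mix
# of 3-6 value features comes out around 5e13.
print(math.prod(CATEGORY_SIZES))
print(random_scenario())  # one randomly generated scenario for a user to label
```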
This leads to my question: am I overthinking this? I know a number of factors affect an ML model, but for a proof of concept we are trying to keep things simple. Can we still get good data from a data set of 150 samples when there are so many possible combinations? Alternatively, since we are working with a toy problem, we could greatly reduce the number of features and make every choice binary (for example, seven binary features give only 2^7 = 128 combinations, which 150 samples could cover almost exhaustively), but that approach loses some of its luster since it leaves the user with little real decision making.
I would appreciate any thoughts.