I have the following problem to solve:

I have a set of objects, each with some features, and I'd like to find the combination that best fulfills a requirement.

To give a practical example: I have a bunch of sticks of different lengths. I do not know in advance how many there are, how long they are, or in what order they come. With that in mind, I'd like to find the combination (order does not matter) whose total length is as close as possible to a desired value, while penalizing the number of sticks used. The example is heavily simplified, since in my real problem the properties of the data are not directly correlated with the desired output (so I cannot use brute force, for example), but it gives an idea of the issues.
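To make the stick example concrete, the objective could be written as a single cost to minimize. This is only a sketch of one possible formulation: the names `combination_cost` and `reward`, and the additive `penalty` weight per stick, are my own assumptions and not something fixed by the problem.

```python
from typing import Sequence


def combination_cost(lengths: Sequence[float], target: float, penalty: float = 0.1) -> float:
    """Cost of a chosen set of sticks; lower is better.

    Assumed form: distance of the total length from the target,
    plus a fixed penalty for every stick used.
    """
    return abs(sum(lengths) - target) + penalty * len(lengths)


def reward(lengths: Sequence[float], target: float, penalty: float = 0.1) -> float:
    """The same quantity with the sign flipped, so that higher is better."""
    return -combination_cost(lengths, target, penalty)


if __name__ == "__main__":
    # Both selections hit the target exactly; the single stick is cheaper.
    print(combination_cost([3.0, 4.0], target=7.0, penalty=0.5))
    print(combination_cost([7.0], target=7.0, penalty=0.5))
```

With this form, the `penalty` value decides how much extra distance from the target is acceptable in exchange for using one stick fewer.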

To me it sounds like a reinforcement learning problem, since I have a way to reward the policy. That said, I have no idea where to start, and I have not found any work on the topic.
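One way I could imagine framing it as reinforcement learning is as an episode in which the sticks are offered one at a time and the agent decides to take or skip each one, with the reward given only at the end. The sketch below is just an illustration under that assumption: `StickEpisode`, `run_episode`, and `greedy_policy` are hypothetical names, the state layout is my own guess, and the greedy rule is a hand-written stand-in for a learned policy.

```python
from typing import Callable, List, Tuple

Observation = Tuple[float, float, int]


class StickEpisode:
    """Hypothetical episodic environment: sticks are offered one at a time."""

    def __init__(self, sticks: List[float], target: float, penalty: float = 0.1):
        self.sticks = list(sticks)  # assumed non-empty
        self.target = target
        self.penalty = penalty      # assumed cost per stick used
        self.index = 0
        self.total = 0.0
        self.used = 0

    def observe(self) -> Observation:
        """State: (length of the stick on offer, remaining gap to target, sticks used)."""
        return (self.sticks[self.index], self.target - self.total, self.used)

    def step(self, take: bool) -> Tuple[float, bool]:
        """Apply the take/skip action and return (reward, done).

        The reward is zero until the last stick has been handled.
        """
        if take:
            self.total += self.sticks[self.index]
            self.used += 1
        self.index += 1
        done = self.index >= len(self.sticks)
        if not done:
            return 0.0, False
        return -(abs(self.total - self.target) + self.penalty * self.used), True


def run_episode(env: StickEpisode, policy: Callable[[Observation], bool]) -> float:
    """Play one episode with the given policy and return the final reward."""
    reward, done = 0.0, False
    while not done:
        reward, done = env.step(policy(env.observe()))
    return reward


def greedy_policy(obs: Observation) -> bool:
    """Placeholder policy: take a stick only if it does not overshoot the target."""
    length, gap, _used = obs
    return length <= gap


if __name__ == "__main__":
    env = StickEpisode([5.0, 3.0, 2.0], target=7.0, penalty=0.5)
    print(run_episode(env, greedy_policy))
```

A learning algorithm would replace `greedy_policy` with something trained on the final reward; whether this sequential framing actually suits the real data is an open question.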

Thank you very much for your time!
