On a standard multi-core CPU machine, it is easy to specify how many cores a program should use. For comparison, I would like to know how CUDA programs scale with the number of GPU cores they use. How does one change the number of GPU cores that are used to handle a task?
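
For concreteness, this is roughly the kind of control I mean on the CPU side. It is only a minimal sketch, assuming plain C++ threads; the helper name `parallel_sum` and its `num_workers` parameter are hypothetical and just for illustration. Strictly speaking it fixes the number of worker threads, which the operating system then schedules onto cores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (illustration only): sums `data` using exactly
// `num_workers` threads, each handling one contiguous chunk.
long long parallel_sum(const std::vector<int>& data, unsigned num_workers) {
    assert(num_workers > 0);

    // One partial result per worker, so no two threads write the same slot.
    std::vector<long long> partial(num_workers, 0);
    std::vector<std::thread> workers;

    // Chunk size rounded up so that every element is covered.
    const std::size_t chunk = (data.size() + num_workers - 1) / num_workers;

    for (unsigned w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }

    // Wait for all workers before combining their partial sums.
    for (auto& t : workers) {
        t.join();
    }
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling `parallel_sum(values, 1)`, `parallel_sum(values, 2)`, `parallel_sum(values, 4)` and timing each run gives a simple scaling curve on the CPU. I am looking for the equivalent knob in CUDA.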