For a set of n elements, say S_n = {1,2,...,n}, a set partition is a set P = {s_1,s_2,...,s_k} of nonempty subsets s_i of S_n that are pairwise disjoint and whose union equals S_n.
I can't compare different algorithms for doing this, but one that comes to mind immediately, and is probably not bad, is simply to count up to b^n in base b, where the base b is the number of parts you want.
So, for instance, if you want partitions of 10 things into 2 different sets, enumerate all 10-digit base-2 numbers: the 0s in a given number correspond to one part and the 1s to the other. Similarly, for partitions of 10 things into 3 different sets, enumerate all 10-digit base-3 numbers.
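A minimal sketch of this counter idea in Python (the function name is mine; `itertools.product` plays the role of the base-b counter):

```python
from itertools import product

def partitions_by_counting(n, b):
    """Enumerate every assignment of n items to b labelled blocks by
    stepping through all n-digit base-b numbers. Digit i names the
    block of item i. Blocks may come out empty."""
    for digits in product(range(b), repeat=n):
        blocks = [[] for _ in range(b)]
        for item, d in enumerate(digits):
            blocks[d].append(item)
        yield blocks

# items 0..2 into 2 labelled blocks: 2**3 = 8 assignments in all
assignments = list(partitions_by_counting(3, 2))
```

Note that this enumerates labelled, possibly-empty blocks, so it overcounts relative to the set-partition definition above.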
Obviously, the number of partitions increases explosively, but this just means you'll probably have to resort to stochastic methods anyway. For example, if your problem requires partitioning 1000 things into 10 parts, there are 10^1000 possible assignments, so you can pick random numbers in this range and write them in base 10 to get your partitions. However, at this point you might just as well generate 1000 random digits from 0 to 9 directly, which scales (about) linearly with the number of items.
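The "generate n random digits" idea might look like the following sketch (names are mine; as with the counter, blocks can come out empty, so re-draw if that matters):

```python
import random

def random_assignment(n, k, rng=random):
    """Assign each of n items independently and uniformly to one of
    k labelled blocks -- one random base-k digit per item, O(n) time."""
    blocks = [[] for _ in range(k)]
    for item in range(n):
        blocks[rng.randrange(k)].append(item)
    return blocks

parts = random_assignment(1000, 10)
```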
Yes, output sensitivity is hard to deal with, and it is unavoidable: even randomization can't help here, since this is a generation problem, not a counting problem. Combinatorial explosion is a reality in any generation problem; even a lovely loop-less algorithm would still be explosive in its output. Since this is generation rather than counting, the desired set must be returned in exact form. What I would recommend is to find a correspondence with another combinatorial object, generate that instead, and then translate it back to set partitions. Otherwise, I would jump right onto the base technique Devon gave. I couldn't name any algorithms by name or paper off the top of my head.
Consult section 7.2.1.5 (Generating all set partitions) in D. E. Knuth, The Art of Computer Programming, Vol. 4A (Combinatorial Algorithms, Part 1), Addison Wesley, 2011. A common approach is to use restricted growth strings.
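A simple recursive sketch of the restricted growth string idea (a string a[0..n-1] with a[0] = 0 and a[i] <= 1 + max of the earlier digits encodes the set partition in which a[i] is the block of element i); the function name is mine, and Knuth's own algorithms are iterative and more refined:

```python
def restricted_growth_strings(n):
    """Generate all restricted growth strings of length n, i.e. all
    set partitions of {0,...,n-1}, in lexicographic order."""
    a = [0] * n
    def rec(i, mx):
        if i == n:
            yield tuple(a)
            return
        for v in range(mx + 2):   # reuse a seen block, or open one new block
            a[i] = v
            yield from rec(i + 1, max(mx, v))
    yield from rec(1, 0)

# the count is the Bell number: B(4) = 15 partitions of a 4-element set
assert sum(1 for _ in restricted_growth_strings(4)) == 15
```

Every string produced is a valid set partition, so no iterations are wasted on invalid objects.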
I don't think your answer is correct. Notice that the question asks for partitions with non-empty parts, whereas your "counter" algorithm, as stated, doesn't provide a way to respect this constraint. If one seeks a 10-partition of 1000 objects, then a partition that doesn't use each of the 10 "digits" at least once is not valid. For example, consider 3-partitions of 3 items. Here 012 is a valid partition, but 000, 001, 002, 010, and 011 are all invalid, since each leaves at least one part empty (000 misses parts 1 and 2; 001, 010, and 011 miss part 2; 002 misses part 1). Thus, your "counter" algorithm would cycle through vast numbers of invalid partitions before hitting on valid ones. In this sense, your algorithm is not only not the fastest algorithm, but is one of the slower ones.
A minimal principle one could follow to define what one means by "fast" is that each iteration produces a valid combinatorial object.
A clever idea is needed to patch this algorithm and make it correct; it is given in a 1987 paper by M. C. Er. It suffices to consider "codewords" of the form
1111
1112
1121
1122
1123
1211
1212
1213
and so on. The invariant here is an interesting little puzzle: each codeword starts with 1, and each subsequent digit is at most one more than the maximum of the digits before it (the restricted growth condition). This is like a "truncated counter" and is correct.
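A sketch of one way to generate such codewords for exactly k nonempty parts (the function name and the pruning step are mine, not Er's exact algorithm): enforce the restricted growth condition, cap digits at k, and keep only words whose maximum reaches k.

```python
def k_partition_codewords(n, k):
    """Generate codewords a[1..n] with a[1] = 1, each digit at most
    one more than the running maximum and at most k, and maximum
    exactly k -- i.e. valid k-partitions with every part nonempty."""
    a = [1] * n
    def rec(i, mx):
        if k - mx > n - i:     # prune: too few positions left to reach k parts
            return
        if i == n:
            yield tuple(a)
            return
        for v in range(1, min(mx + 1, k) + 1):
            a[i] = v
            yield from rec(i + 1, max(mx, v))
    yield from rec(1, 1)

# the count is the Stirling number of the second kind: S(4,2) = 7
assert sum(1 for _ in k_partition_codewords(4, 2)) == 7
```

Unlike the raw counter, every word emitted is valid, so no time is spent skipping invalid assignments.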
A recursive algorithm would also work: a k-partition of n items comes from a partition of the first n-1 items, either a k-partition with the new element appended to one of its k parts, or a (k-1)-partition with the new element placed in its own separate part.
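The recursion above can be sketched as follows (function name mine):

```python
def k_partitions(items, k):
    """Generate all partitions of `items` into exactly k nonempty
    parts: recurse on all but the last element, then either append
    the last element to each existing part, or give it its own part."""
    n = len(items)
    if k < 1 or k > n:
        return
    if n == 1:
        yield [[items[0]]]
        return
    head, last = items[:-1], items[-1]
    # last element joins one of the k existing parts
    for p in k_partitions(head, k):
        for i in range(k):
            yield [part + [last] if j == i else part
                   for j, part in enumerate(p)]
    # last element forms its own new part
    for p in k_partitions(head, k - 1):
        yield p + [[last]]
```

The two branches mirror the Stirling recurrence S(n,k) = k*S(n-1,k) + S(n-1,k-1), so again every object generated is valid.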