This is an interesting question that remains open in the metaheuristics community.
As you probably already know, the behavior of a metaheuristic can differ greatly from one problem to another because of problem characteristics (ruggedness, plateaus, convexity, ...).
Indeed, metaheuristics are parameterized algorithms, and the population size is one of their main parameters. The optimal population size can therefore differ greatly from one problem to another, which suggests looking for a robust population-size setting for a given set or class of problems.
Since the dimension n (the number of decision variables) is one of the main characteristics of a problem, the population size P is often set with respect to the dimension: e.g., in CMA-ES it is 4 + 3 ln(n), and in Differential Evolution the recommended population size is around 10n to 15n.
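As a rough sketch, the dimension-based rules of thumb above can be written as a small helper. The constants come from the literature cited here; the function name and the choice of returning a range for DE are my own, and the exact values vary between papers, so treat these as starting points only:

```python
import math

def suggested_population_size(n, method):
    """Heuristic population sizes as a function of problem dimension n.

    These are rules of thumb from the literature, not universal settings.
    """
    if method == "cma-es":
        # CMA-ES default offspring number: lambda = 4 + floor(3 * ln(n))
        return 4 + int(3 * math.log(n))
    if method == "de":
        # Differential Evolution: commonly cited range of 10n to 15n
        return (10 * n, 15 * n)
    raise ValueError(f"unknown method: {method}")

for n in (2, 10, 50):
    print(n, suggested_population_size(n, "cma-es"),
          suggested_population_size(n, "de"))
```

For n = 10 this gives a CMA-ES population of 10 and a DE range of (100, 150), which illustrates how strongly the recommended size depends on the algorithm, not just the problem.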
For PSO, it differs from one paper to another, but the value I most often remember is P = 30 (check the two papers attached above). In particular, look at Shi, Y., & Eberhart, who empirically investigate the mean best fitness as the population size grows, showing larger populations outperforming small ones, and also suggesting that the population size may be increased linearly with the dimension.
If you have any particular questions about algorithm configuration or how to obtain a robust parameter setting, don't hesitate to contact me.
It is well known that the population size used by meta-heuristic algorithms plays a significant role in achieving the best performance. Small population sizes tend to result in faster convergence but increase the risk of converging to a local optimum, while large population sizes explore the search space better but cause slow convergence. The optimal population size therefore depends on a complex interaction of factors, including the problem to which the meta-heuristic algorithm is being applied.
Solution populations are needed from several points of view. In addition to the points made by Nacim Belkhir and Soheila Ghambari, an important consideration is the need to reconcile the solutions of the system tied to the optimization task with the wider system it belongs to. Among other factors, this involves evaluating the solutions in the population through complementary simulation and graphical-representation procedures, which supplement the studied model with others that represent the studied object or problem more faithfully. From this point of view, sufficiently large population sizes are required to carry out a rich enough analysis of the problem. In theory, your model is only an approximation of the real problem; it can be shown that there exists a parameter h such that the optimal solution of the original problem may be found among solutions that differ from the optimum of the approximate model by no less than h. This last aspect is studied in my book Engineering Systems, published in Spanish, which you can download from my ResearchGate profile.
Not contradicting the previous answers, here is what I have to say. In my 2005 publication https://www.researchgate.net/publication/216486175_Biology_Physics_Small_Worlds_and_Genetic_Algorithms (available at RG) I have shown that there is a lower limit on how numerous the population should be. The resulting numbers are surprisingly small and are not directly applicable to all 'population based' algorithms; they are strictly valid for the generic version of a genetic algorithm operating on chromosomes composed of binary genes. Yet some 'translation' or 'accommodation' to other types of algorithms seems not so complicated. Just assume that each of your 'chromosomes' is a set of searched unknowns, each expressed as a 64-bit machine number. This makes 64N binary genes, and according to formula (26) you need at least Ni individuals satisfying the inequality Ni^2 >= 128N. See also the table following formula (32). A smaller population is almost guaranteed to fail, i.e., to stop at a local optimum. Using a more numerous population is thus recommended, but a larger population means longer execution time (and more memory too, but that is not a problem today). In summary, I would recommend a population 2-5 times larger than the critical size given above. But certainly not 30 times more numerous!
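To make the bound above concrete, here is a minimal sketch (my own helper, assuming the answer's setup: N unknowns, each a 64-bit number, and the critical size Ni = sqrt(128N) implied by Ni^2 >= 128N):

```python
import math

def minimum_population(num_unknowns):
    """Critical GA population size from the inequality Ni^2 >= 128*N.

    N is the number of searched unknowns, each encoded as a 64-bit
    machine number (64*N binary genes). Rounds up to a whole individual.
    """
    return math.ceil(math.sqrt(128 * num_unknowns))

n = 10
ni = minimum_population(n)
# The answer recommends 2-5 times the critical size as a safety margin.
print(ni, 2 * ni, 5 * ni)
```

For 10 unknowns this gives a critical size of 36 individuals, so the recommended 2-5x margin suggests roughly 72-180 individuals.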
An important issue when using meta-heuristic algorithms is selecting proper values for the algorithm parameters. Trial and error can be used. Simulated data is also a good way to adjust the parameters: since the optimal solution for a simulated data set is known, the results of runs with different parameter values can be used to calibrate the algorithm's parameters. The sixth section of the following paper discusses this.
I agree with Soheila. As she said, you should choose the population size by considering the trade-off between speed of convergence and exploring as much of the search space as possible.
One possible answer to your question is given by Grady et al. (2005):
"In the implementation of the genetic algorithm for optimization, the population size, the number of subpopulations, and the maximum number of generations for evolution must be chosen. All of these quantities are determined based on the number of independent variables, n, in the system to be optimized. The population must contain enough individuals to adequately span the computational space. The population size should contain a minimum of sqrt(n) individuals, spread over subpopulations numbering 2*sqrt(n). The minimum number of generations is 200*sqrt(n)."
Grady SA, Hussaini MY, Abdullah MM. Placement of wind turbines using genetic algorithms. Renewable Energy 2005; 30(2):259-270.
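As a quick sketch, the Grady et al. (2005) recommendations quoted above translate directly into a helper (the function name and rounding up to whole individuals are my own choices):

```python
import math

def grady_settings(n):
    """GA settings following Grady et al. (2005) for n independent variables:
    minimum population sqrt(n), 2*sqrt(n) subpopulations, 200*sqrt(n) generations.
    """
    root = math.sqrt(n)
    return {
        "min_population": math.ceil(root),
        "subpopulations": math.ceil(2 * root),
        "min_generations": math.ceil(200 * root),
    }

print(grady_settings(25))
```

For n = 25 this gives a minimum population of 5, spread over 10 subpopulations, run for at least 1000 generations; note how modest these sizes are compared to the 10n-15n rule cited for Differential Evolution earlier in the thread.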
No one can give you a clear-cut answer to this question. The population size (PopSize) in meta-heuristic algorithms is best treated as a control parameter to be adjusted for solution quality. By observing how the performance of the algorithm changes as the population size varies, one can determine the right number of search agents for a particular problem. A parameter sensitivity analysis, therefore, is in order in this type of situation.
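A sensitivity sweep like the one suggested above can be sketched as follows. This uses a deliberately simple toy search on the sphere function purely to illustrate the sweep structure; the algorithm, function, and settings are illustrative assumptions, not a tuned metaheuristic:

```python
import random

def sphere(x):
    """Toy objective: minimum 0 at the origin."""
    return sum(v * v for v in x)

def toy_metaheuristic(pop_size, dim=5, generations=50, seed=0):
    """A very simple truncation-selection random search, used only to
    demonstrate how performance can be measured as pop_size varies."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=sphere)
        # keep the better half as parents, then mutate to refill the population
        parents = pop[: max(1, pop_size // 2)]
        pop = [[g + rng.gauss(0, 0.3) for g in rng.choice(parents)]
               for _ in range(pop_size)]
    return min(sphere(ind) for ind in pop)

# Sensitivity analysis over population size. A fixed total evaluation
# budget per setting would be a fairer comparison in a real study.
for pop_size in (5, 20, 80):
    print(pop_size, toy_metaheuristic(pop_size))
```

In a real study you would repeat each setting over several seeds and compare distributions of the final fitness, but even this rough sweep shows the shape of a sensitivity analysis.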