Although the points raised are valid, you can probably get enough data to do good work from a relatively "small" dataset.
On projects where we are comparing several samples (e.g. time series or cross-sectional studies), we routinely sequence up to 12 samples per single HiSeq 2000 lane, and this gives us plenty of data.
You may also consider the MiSeq platform: it has only one lane, but if it gives even more reads than one HiSeq lane (and the low cost is really attractive).
That was the kind of answer I was looking for. Are you working on 16S PCR products or whole DNA extracted from the soil? And how are the assembly and the metabolism findings from databases like KEGG working out?
We are currently working with gut microbiome samples, so, loads of diversity.
Assembly is... complicated... to say the least. We usually spend some time at the beginning just trying different assembly strategies on a few samples, to get good-quality, long contigs, before running the whole plate through.
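One common way to compare assembly strategies on those few test samples is to look at contig length statistics such as N50. Below is a minimal sketch, assuming you already have the contig lengths from each assembler run; the function name and the example lengths are made up for illustration and are not from any real dataset or from our own pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths (bp) from two assembly strategies
# run on the same test sample.
strategy_a = [5000, 3000, 2000, 1000, 500]
strategy_b = [9000, 1500, 500, 300, 200]

for name, lengths in [("A", strategy_a), ("B", strategy_b)]:
    print(f"strategy {name}: total={sum(lengths)} bp, N50={n50(lengths)} bp")
```

N50 alone says nothing about misassemblies, so it is best read alongside other checks (e.g. how many reads map back to the contigs).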
Annotation can be done with a variety of tools. CAMERA has a nice pipeline, which you can use online (if you don't have too much data) or install locally by contacting them. It includes KEGG annotations, COG, and 16S prediction/searches. Just remember that it probably uses an outdated version of KEGG, as the current database isn't free anymore, but it is still pretty good.
http://camera.calit2.net/
Incidentally, we use a lot of custom-made code to annotate our data.
I agree with Chris. Your questions are valid and important. The Illumina HiSeq platform gives you a large amount of data: for a 2x100 bp DNA run you can get up to 40 Gb of good-quality data. The MiSeq (esp. with v2 hardware) offers a cheap alternative that still gives you sufficient data to get good results. I am more satisfied with the MiSeq 2x150 kit than with the 2x250; the 2x250 drastically loses quality after 10 bp or so.
Since you are interested in 16S amplicons, a MiSeq 2x10 run should give you enough data and coverage to answer your questions.
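To put the numbers in this thread together, here is a rough back-of-the-envelope sketch of how much data each sample gets when you multiplex. It assumes "40 Gb" means 40 gigabases from a single 2x100 bp unit of sequencing, that 12 samples (the figure mentioned earlier in the thread) are pooled evenly, and that nothing is lost to demultiplexing or filtering. Real yields will differ, and the function name is just for illustration.

```python
def per_sample_yield(total_bases, read_length, samples, paired=True):
    """Split a total yield evenly across multiplexed samples.

    Returns bases per sample and sequenced fragments per sample
    (a fragment is one read pair if paired=True, else one read).
    """
    reads_per_fragment = 2 if paired else 1
    bases_per_fragment = read_length * reads_per_fragment
    fragments = total_bases // bases_per_fragment
    return {
        "bases_per_sample": total_bases // samples,
        "fragments_per_sample": fragments // samples,
    }


# Assumed inputs: 40 gigabases, 2x100 bp reads, 12 samples pooled evenly.
estimate = per_sample_yield(
    total_bases=40_000_000_000,
    read_length=100,
    samples=12,
)
print(estimate)
```

Under those assumptions each sample ends up with roughly 3.3 Gb, or about 16.7 million read pairs. Whether that is enough depends on the diversity of the community and on whether you are doing amplicons or shotgun sequencing.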