I have a table with species/site data and one with ecological conditions for each site. I would like to define some species groups having similar requirements. Is there any other method apart from clustering?
There are a number of such methods. PCA is the best choice for environmental variables; for biotic assemblages, Canonical Correspondence Analysis gives good results, plotting species, assemblages and the environmental factors shaping them together. SIMPER is good when you divide your communities ad hoc: it allows you to recognize the characteristic taxa for your groups of samples. I use NMDS rather than PCA for biotic assemblages, and a SOM neural network when working on more complicated data. You can easily use these methods with the PRIMER 6 and CANOCO software. PAST is free and available online, but it does not include all of these methods.
There are many different ways to get such groups. What has already been suggested is fine; alternatively, you can use the free software JUICE (http://www.sci.muni.cz/botany/juice/). The manual has a section dedicated to the Cocktail method, which uses exactly this kind of 'sociological species group' (groups of species with the same ecological requirements). There you can find information on how to build such groups using the phi coefficient (an index of fidelity). You can also find a lot of literature online about this topic.
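To make the phi coefficient concrete, here is a minimal sketch of the usual 2x2 presence/absence form of the fidelity index. It is not JUICE's own code; the function name `phi_coefficient` and the example plots are hypothetical.

```python
from math import sqrt


def phi_coefficient(presence, in_group):
    """Phi fidelity of one species to one group of plots.

    presence: one truthy/falsy value per plot (species recorded or not)
    in_group: one truthy/falsy value per plot (plot belongs to the group or not)
    """
    if len(presence) != len(in_group):
        raise ValueError("presence and in_group must have the same length")
    presence = [bool(p) for p in presence]
    in_group = [bool(g) for g in in_group]

    N = len(presence)                 # total number of plots
    N_p = sum(in_group)               # plots in the target group
    n = sum(presence)                 # plots where the species occurs
    n_p = sum(p and g for p, g in zip(presence, in_group))  # occurrences inside the group

    denom = sqrt(n * N_p * (N - n) * (N - N_p))
    if denom == 0:
        # Phi is undefined when the species or the group covers none or all plots;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return (N * n_p - n * N_p) / denom


# Hypothetical data: 8 plots, the first 4 form the target group,
# and the species occurs in 3 of those 4 plots and nowhere else.
presence = [1, 1, 1, 0, 0, 0, 0, 0]
in_group = [1, 1, 1, 1, 0, 0, 0, 0]
phi = phi_coefficient(presence, in_group)
print(round(phi, 3))  # 0.775
```

Values close to 1 mean the species is almost confined to the group, values near 0 mean no preference, and negative values mean the species avoids the group.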
Note, though, that PCA should not be used for community data (only for environmental data), and that NMDS ordinates your sites according to species composition (I think that was not the intention here). You could use RDA to constrain community data with some environmental variables, but then again RDA should only be used with a set of a priori defined environmental variables (not the whole lot of variables that you just happened to measure).
In my opinion, Canonical Correspondence Analysis will be the best choice for plotting species, sites and environmental factors (as vectors) using the triplot option. Yes, SIMPER is also good when you want to identify characteristic species for your samples. You can easily run these methods with the PRIMER 6 (cluster, MDS, SIMPER) and CANOCO (CCA) software.
You can also try a multivariate regression tree (MRT), which gives not only groups of sites with the same ecological conditions but also the threshold values of the variables for each group of sites.
Another piece of advice: CCA and RDA are both direct ordination methods in which you can include environmental variables. However, the choice between them also depends on the length of the gradient of variation in your data. RDA is based on a linear model, whereas CCA is based on a unimodal model. Šmilauer and Lepš, in their book that accompanies CANOCO, advise analysing your community data first using DCA, an indirect method (i.e. one that takes just the community data as input) based on a unimodal model. The gradient length resulting from this method is in units of standard deviation (SD). If your gradient (i.e. the length of the first ordination axis) is 3 SD or less, a linear model, i.e. RDA, is more appropriate, whereas if it is 4 SD or greater, you should use a unimodal model, i.e. CCA.
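The rule of thumb above can be written down as a tiny helper. This is only a sketch: it assumes you have already obtained the DCA gradient length (in SD units) from CANOCO or another package, and the function name and the wording returned for the 3 to 4 SD range are my own, since the rule as quoted does not cover that range.

```python
def suggest_constrained_method(gradient_length_sd):
    """Suggest RDA or CCA from the DCA first-axis gradient length (SD units).

    Thresholds follow the rule of thumb quoted in this thread:
    3 SD or less -> linear model (RDA); 4 SD or more -> unimodal model (CCA).
    """
    if gradient_length_sd < 0:
        raise ValueError("gradient length cannot be negative")
    if gradient_length_sd <= 3.0:
        return "RDA"
    if gradient_length_sd >= 4.0:
        return "CCA"
    # The quoted rule leaves this range open; this label is an assumption of the sketch.
    return "RDA or CCA"


# Hypothetical gradient lengths read from three different DCA outputs.
for length in (2.1, 3.5, 4.6):
    print(length, suggest_constrained_method(length))
```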
According to Wikipedia, "Indicator value is a term that has been used in ecology for two different indices. The older usage of the term refers to Ellenberg's indicator values, which are based on a simple ordinal classification of plants according to the position of their realized ecological niche along an environmental gradient.[1] More recently, the term has also been used to refer to Dufrêne & Legendre's indicator value, which is a quantitative index that measures the statistical alliance of a species to any one of the classes in a classification of sites.[2]". The latter is the one that would work for you. There are many ways to address your question in addition to the good suggestions above; see the references for the latter.
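As an illustration of the Dufrêne & Legendre index, here is a minimal sketch for a single species. It multiplies specificity (the group's share of the summed group mean abundances) by fidelity (the fraction of the group's sites where the species occurs) and scales to a percentage. The function name `indval` and the data are hypothetical, and the permutation test normally used to judge significance is left out.

```python
def indval(abundances, groups):
    """Indicator value (percent) of one species for each group of sites.

    abundances: abundance of the species at each site
    groups: group label of each site, in the same order
    """
    if len(abundances) != len(groups):
        raise ValueError("abundances and groups must have the same length")

    labels = sorted(set(groups))
    mean_abundance = {}
    frequency = {}
    for label in labels:
        values = [a for a, g in zip(abundances, groups) if g == label]
        mean_abundance[label] = sum(values) / len(values)
        frequency[label] = sum(1 for v in values if v > 0) / len(values)

    total = sum(mean_abundance.values())
    if total == 0:
        # Species absent everywhere: no group can be indicated.
        return {label: 0.0 for label in labels}

    return {
        label: 100.0 * (mean_abundance[label] / total) * frequency[label]
        for label in labels
    }


# Hypothetical species counted at six sites, already classified into two groups.
abundances = [5, 3, 0, 0, 0, 1]
groups = ["wet", "wet", "wet", "dry", "dry", "dry"]
result = indval(abundances, groups)
for label, value in result.items():
    print(label, round(value, 1))
```

A species is usually assigned to the group where its value is highest, so repeating this for every species gives candidate species groups for each site class.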
First, you can simply run any kind of unconstrained ordination on the species data alone, just to inspect the overall variation among samples. The way the samples group in two-dimensional space may already reveal an ecological gradient that you can interpret accordingly.
Second, run several types of constrained ordination, factor analysis or discriminant methods using CANOCO or a similar software package. Unlike simple ordination methods, these allow some statistical inference as a basis for scientific reasoning. This is how I initially approach similar questions.
Hi Spyros. Before any analysis (and I agree with the CCA or RDA methods), you should ask whether your data reflect steady-state conditions or not. All these methods rest on the hypothesis that species and environment are more or less in equilibrium, in other words that species reflect the ecological conditions exactly. This is the principle of bioindication. But be aware of changes that may have occurred at the sites you sampled. Sometimes historical conditions explain the species better than present conditions!
You could also try Principal Coordinate Analysis (PCoA). You first calculate ecological similarity using the Gower metric (useful when you have abiotic variables with different units) and then ordinate the similarity values. Another option is NMDS.
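A minimal sketch of that two-step procedure with NumPy, assuming all variables are numeric and that rows are sites. The Gower step here is the range-scaled mean absolute difference, expressed as a dissimilarity, and the PCoA step is the classical double-centring and eigendecomposition. The function names and the example values are hypothetical.

```python
import numpy as np


def gower_distance(X):
    """Gower dissimilarity between rows of X (numeric variables only)."""
    X = np.asarray(X, dtype=float)
    ranges = X.max(axis=0) - X.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant variables then contribute zero distance
    scaled = X / ranges
    # Mean over variables of the range-scaled absolute differences.
    return np.abs(scaled[:, None, :] - scaled[None, :, :]).mean(axis=2)


def pcoa(D, n_axes=2):
    """Classical PCoA: returns site coordinates and all eigenvalues."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Negative eigenvalues can occur for non-Euclidean distances; they are clipped here.
    scale = np.sqrt(np.clip(eigvals[:n_axes], 0.0, None))
    return eigvecs[:, :n_axes] * scale, eigvals


# Hypothetical environmental table: 4 sites x 3 variables in different units
# (pH, temperature in degrees C, conductivity in microsiemens per cm).
env = np.array([
    [6.8, 12.0, 150.0],
    [7.1, 13.5, 180.0],
    [5.2, 8.0, 60.0],
    [5.5, 9.0, 75.0],
])
D = gower_distance(env)
coords, eigvals = pcoa(D, n_axes=2)
print(np.round(coords, 3))
```

Sites that plot close together on the first axes have similar conditions. The same `pcoa` function accepts any other dissimilarity matrix, for example one computed between species.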
For plants you can use the plant life-form concept of Raunkiaer, as modified by Ellenberg & Mueller-Dombois (1967); for birds and insects you can use the guild concept. Both are essentially the same in that they group organisms with similar requirements and functions. See the book Island Ecosystems, IBP Synthesis Vol. 15, by Dieter Mueller-Dombois et al. (1981).
Be careful about the difference between realized and fundamental niches: what you see in your data may not 100% reflect requirements, because other processes may be at play (competition, predation, best habitat not available...). I would also suggest doing a literature review to document species requirements and running a cluster analysis on the trait matrix to create groups of species with similar requirements, and then checking whether you see similar patterns. Food for thought.
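A minimal sketch of the trait-matrix clustering, assuming SciPy is available and that the traits are numeric. The species names, the two traits and the choice of average linkage with two groups are all hypothetical; with real data you would pick the linkage method and the number of groups yourself.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical trait matrix compiled from the literature:
# one row per species, columns = preferred pH and a moisture index (0 to 1).
species = ["sp_A", "sp_B", "sp_C", "sp_D"]
traits = np.array([
    [7.5, 0.9],
    [7.2, 0.8],
    [4.5, 0.2],
    [4.8, 0.3],
])

# Standardise each trait so that differences in units do not dominate the distances.
z = (traits - traits.mean(axis=0)) / traits.std(axis=0)

# Average-linkage hierarchical clustering on Euclidean distances.
tree = linkage(z, method="average", metric="euclidean")

# Cut the tree into at most two groups (an arbitrary choice for this example).
labels = fcluster(tree, t=2, criterion="maxclust")

trait_groups = {}
for name, label in zip(species, labels):
    trait_groups.setdefault(int(label), []).append(name)
print(trait_groups)
```

The resulting groups can then be compared with the groups obtained from the field data, to see whether the observed patterns match the documented requirements.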