There is a good repository of gene set collections at the Broad Institute's Molecular Signatures Database (MSigDB): http://www.broadinstitute.org/gsea/msigdb/index.jsp . You can download collections based on Gene Ontology terms, on gene positions, or on the genes' role in cancer or immune response.
A gene set is any set of genes that share some common feature. This can be a function, the location of the gene product, the participation of the product in some metabolic or signalling pathway, the protein structure, the presence of transcription-factor-binding sites or other regulatory elements, the participation in multiprotein complexes, or literally anything else you can imagine.
The most commonly used resources to retrieve "curated gene sets" are GO and KEGG:
The Gene Ontology (GO) arranges genes in a hierarchical ("tree-like") structure according to the "molecular function", the "biological process", or the "cellular component" (the localization of the gene product). So from GO you may, for instance, get all genes (gene products) that have esterase activity (a molecular function), or that are located in the inner mitochondrial membrane (a cellular component).
The KEGG pathway database provides gene sets that are related to "pathways", where "pathways" are defined based on metabolism, information processing, cellular processes, organismal systems, human diseases and drug development.
But there are many others, as you can see from the MSigDB link posted by Ahmed above.
But one does not necessarily need to use the sets as they are provided by these databases. You can create your own gene sets based on your needs and your knowledge. (I know a lab that has worked on Il1-signalling for many years; they set up their own "Il1-signalling pathway" gene sets that incorporate all their knowledge, which is usually a bit more detailed than what is found in databases like KEGG, GO, et al.)
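If it helps, here is a minimal sketch of how a home-made gene set could be stored in the tab-separated GMT layout (one set per line: set name, description, then the gene symbols), which is the layout MSigDB collections are distributed in. The set name, the description and the gene list are illustrative placeholders, not the set used by the lab mentioned above, and `write_gmt` / `read_gmt` are hypothetical helper names.

```python
import io


def write_gmt(gene_sets, handle):
    """Write {set_name: (description, genes)} as tab-separated GMT lines."""
    for name, (description, genes) in gene_sets.items():
        unique_genes = sorted(set(genes))  # drop duplicates, keep a stable order
        handle.write("\t".join([name, description, *unique_genes]) + "\n")


def read_gmt(handle):
    """Read GMT lines back into {set_name: (description, genes)}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:  # a set needs a name, a description and >= 1 gene
            continue
        name, description, *genes = fields
        gene_sets[name] = (description, genes)
    return gene_sets


# Placeholder example set (assumption: symbols chosen only for illustration).
custom_sets = {
    "LAB_IL1_SIGNALLING": (
        "hypothetical in-house Il1 signalling set",
        ["IL1B", "IL1R1", "MYD88", "IRAK1", "IRAK4", "TRAF6", "IL1B"],
    )
}

buffer = io.StringIO()
write_gmt(custom_sets, buffer)
gmt_text = buffer.getvalue()
round_trip = read_gmt(io.StringIO(gmt_text))
```

In practice you would write to a real file (for example one ending in `.gmt`) instead of the in-memory buffer and pass that file to your enrichment tool.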
However, for GSEA we usually use a pre-ranked list (a text file with the extension '.rnk') in which official gene symbols are ranked by p-value, with the direction of the fold change supplying the sign of the ranking score.
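A rough sketch of how such a list could be built, assuming one common convention for the score: the magnitude is -log10(p-value) and the sign is the sign of the log2 fold change. That formula, the function names and the three example genes are assumptions for illustration; other ranking metrics (for example a test statistic) are used as well.

```python
import math


def rank_score(p_value, log2_fold_change, floor=1e-300):
    """Signed score: -log10(p) carrying the sign of the fold change."""
    p = max(p_value, floor)  # avoid log10(0) for p-values reported as 0
    magnitude = -math.log10(p)
    return math.copysign(magnitude, log2_fold_change)


def build_rnk(results):
    """Turn (symbol, p_value, log2FC) rows into tab-separated .rnk text."""
    scored = [(gene, rank_score(p, lfc)) for gene, p, lfc in results]
    # Most strongly up-regulated genes first, most strongly down-regulated last.
    scored.sort(key=lambda item: item[1], reverse=True)
    return "".join(f"{gene}\t{score:.4f}\n" for gene, score in scored)


# Placeholder differential-expression results (made-up values).
results = [
    ("GENE_A", 0.001, 2.0),
    ("GENE_B", 0.5, -0.3),
    ("GENE_C", 0.0001, -1.5),
]

rnk_text = build_rnk(results)
```

The resulting text has one gene per line (symbol, tab, score) and can be saved with the '.rnk' extension.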