I have 12 species from the same genus. For the supermatrix approach, we generally build a multiple sequence alignment for each set of single-copy orthologs, concatenate the alignments, remove the badly aligned columns, and infer an ML tree. In my case, I used around 2000 single-copy orthologs.
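For concreteness, the concatenation step I mean can be sketched roughly as below. This is only a minimal illustration, not what any particular tool does: the function name `concatenate_alignments`, the dict-based input, and the toy sequences are all hypothetical, and I am assuming that a taxon missing from a gene is padded with gaps.

```python
def concatenate_alignments(gene_alignments, taxa):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence
    taxa: list of taxon names to include in the supermatrix
    Assumption: a taxon missing from a gene is padded with gaps ('-').
    """
    supermatrix = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        # All sequences within one alignment must share the same length.
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon].append(alignment.get(taxon, "-" * length))
    return {taxon: "".join(parts) for taxon, parts in supermatrix.items()}


# Toy example with two hypothetical genes and two taxa.
genes = [
    {"sp1": "ATG-", "sp2": "ATGA"},
    {"sp1": "CC", "sp2": "CT"},
]
matrix = concatenate_alignments(genes, ["sp1", "sp2"])
```

Column trimming and the ML tree inference would then run on the concatenated matrix.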
To add an outgroup species, do I first need to cluster the outgroup sequences together with the sequences from my 12 species? If so, I suspect this will decrease the number of single-copy orthologs: they are possibly part of the core genome of the genus, so probably not all of them will be present in the outgroup, and some may be removed by MCL clustering because the outgroup sequences are less similar than the others.
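To illustrate the concern about losing orthologs, here is a minimal sketch of the single-copy filter applied to made-up clusters. The function `single_copy_clusters`, the cluster layout, and all species and gene names are hypothetical. It only shows the filtering step on a fixed clustering; it does not model how MCL itself might group sequences differently once the outgroup is included.

```python
from collections import Counter


def single_copy_clusters(clusters, species):
    """Return ids of clusters with exactly one gene from every listed species.

    clusters: dict mapping cluster id -> list of (species, gene_id) pairs
    species: species names that must each appear exactly once
    Genes from species outside the list are ignored.
    """
    required = set(species)
    kept = []
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _ in members if sp in required)
        if set(counts) == required and all(n == 1 for n in counts.values()):
            kept.append(cluster_id)
    return kept


# Made-up toy clusters: two ingroup species plus one outgroup ("out").
clusters = {
    "c1": [("sp1", "g1"), ("sp2", "g2"), ("out", "g3")],
    "c2": [("sp1", "g4"), ("sp2", "g5")],  # gene absent in outgroup
    "c3": [("sp1", "g6"), ("sp2", "g7"), ("out", "g8"), ("out", "g9")],  # duplicated in outgroup
}
ingroup_only = single_copy_clusters(clusters, ["sp1", "sp2"])
with_outgroup = single_copy_clusters(clusters, ["sp1", "sp2", "out"])
```

In this toy case all three clusters are single-copy for the ingroup, but only `c1` remains once the outgroup is also required to be present in exactly one copy.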