I am planning a ddRAD project, but while going through the literature I came across two terms, RAD tag and RAD locus, which are confusing me.
I would say a RAD tag is a fragment of DNA amplified by the particular method (for ddRAD, it would be a fragment with the appropriate 2 cut sites). So a RAD library has lots of RAD tags in it.
A RAD locus is a genomic location from which RAD tags come. The two are not equivalent, because many RAD tags come from one RAD locus, and they can even differ in sequence (due to SNPs or sequencing errors). The Stacks program adds to the jargon by saying that a set of RAD tags aligning to a RAD locus forms a stack, since the reads pile up in a stack when you look at them with IGV or other visualization software.
Eric, am I correct that your definition of a RAD tag is equivalent to a 'read'?
I have the impression tag and locus are (also) used as equivalents, i.e. both are identified by a stack of reads. For instance Wikipedia says "An important aspect of RAD markers and mapping is the process of isolating RAD tags, which are the DNA sequences that immediately flank each instance of a particular restriction site of a restriction enzyme throughout the genome." citing a paper by Miller et al. (http://en.wikipedia.org/wiki/Restriction_site_associated_DNA_markers)
The following is a quote from the Stacks manual "For example, the program will identify loci with SNPs that didn’t have high enough coverage to be identified by the SNP caller. It will also check that homozygous tags have a minimum depth of coverage, since a low-coverage polymorphic locus may appear homozygous simply because the other allele wasn’t sequenced." (p.22) If a tag can be homozygous, it means it can also be heterozygous, so here the authors don't use 'tag' to refer to a single read. To me it looks like they use locus and tag interchangeably.
I also find this matter very confusing, because non-identical stacks (roughly, loci) can be merged at different stages in the process when, depending on the program settings, they are recognised as alleles of the same locus: first within an individual and subsequently across the population. So what were 2 or more loci at one stage become 1 locus later on.
Mastretta-Yanes et al. (2014) write "we refer to a locus as a short DNA sequence produced by clustering together unique RAD alleles". Here the word 'unique' is confusing (it should have been left out), but later in their paper it is clear that loci can be polymorphic. These authors use the word tag only to refer to the barcode sequences used to identify the individuals in the data set. (http://onlinelibrary.wiley.com/doi/10.1111/1755-0998.12291/abstract)
I have a different question about Stacks, but will start another thread for it.
Hmmm, definitely confusing, and there is real confusion about it! I can't say that the terminology was worked out with any great rigor. One thing to note is that RAD markers were first developed for RAD microarrays, and the term 'tag' was coined at that point, so in my mind it refers to the fragment and not just a read.
I think it is useful to think of reads, which may contain different sequencing errors, collapsing to a particular tag, and of tags, which may contain different alleles, collapsing to a locus. Other authors may or may not use the terms the same way!
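To make the read / tag / locus hierarchy concrete, here is a toy Python sketch. It is only an illustration of the three levels as defined above, not how Stacks or any other pipeline actually works: the function names, the `min_depth` and `max_dist` parameters, and the example sequences are all made up for this post, and it assumes equal-length reads from a single individual.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def reads_to_tags(reads, min_depth=2):
    """Collapse reads into tags (toy rule, an assumption for illustration).

    Sequences seen at least min_depth times become tags. Rarer sequences
    exactly one mismatch away from a tag are treated as reads of that tag
    carrying a sequencing error; anything else is discarded.
    """
    counts = Counter(reads)
    tags = {seq: n for seq, n in counts.items() if n >= min_depth}
    for seq, n in counts.items():
        if seq in tags:
            continue
        for tag in tags:
            if hamming(seq, tag) == 1:
                tags[tag] += n  # absorb the presumed error read
                break
    return tags

def tags_to_loci(tags, max_dist=1):
    """Group tags within max_dist mismatches of each other into one locus."""
    loci = []
    for tag in sorted(tags, key=tags.get, reverse=True):
        for locus in loci:
            if any(hamming(tag, member) <= max_dist for member in locus):
                locus.append(tag)  # a second allele of an existing locus
                break
        else:
            loci.append([tag])  # first tag seen for a new locus
    return loci

# Hypothetical reads: two alleles of one locus, one error read, one other locus.
reads = ["ACGTAC"] * 5 + ["ACGTAT"] * 4 + ["ACGAAC"] + ["TTGCAA"] * 6
tags = reads_to_tags(reads)
loci = tags_to_loci(tags)
print(tags)
print(loci)
```

In this example the 16 reads collapse to 3 tags, and the 3 tags collapse to 2 loci, one of which is heterozygous. Real data would need more careful handling of ambiguous matches, paralogs and coverage thresholds.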
Your definitions are certainly useful. It is best to have 3 terms to refer to non-overlapping concepts. At the same time, we need to be aware that others may apply different definitions. For the sake of completeness: I've also come across phrases like "we define a locus as having a minimal stack depth of xxx and exactly 1 snp". So here they use locus to refer only to the subset they analysed.
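That last kind of definition is really a filter on loci rather than a definition of one. A rough Python sketch of such a filter follows. It is purely illustrative: the data layout, the function names, the threshold of 5 and the choice to measure depth as the total over all alleles of a locus are my own assumptions, not taken from any particular paper or program.

```python
def count_snps(alleles):
    """Count positions at which equal-length allele sequences differ."""
    return sum(len(set(column)) > 1 for column in zip(*alleles))

def keep_locus(allele_depths, min_depth):
    """Return True if a locus passes the filter.

    allele_depths maps each allele sequence to its read depth.
    Depth is summed over alleles here, which is an assumption.
    """
    depth = sum(allele_depths.values())
    return depth >= min_depth and count_snps(list(allele_depths)) == 1

# Hypothetical loci, keyed by a made-up locus name.
example_loci = {
    "locus_1": {"ACGTAC": 6, "ACGTAT": 4},  # depth 10, 1 SNP
    "locus_2": {"TTGCAA": 6},               # depth 6, monomorphic
    "locus_3": {"GGCATT": 2, "GGCTTA": 1},  # depth 3, 2 SNPs
}
kept = [name for name, alleles in example_loci.items()
        if keep_locus(alleles, min_depth=5)]
print(kept)
```

Only locus_1 survives here, so a paper using this wording would report far fewer "loci" than the pipeline originally assembled.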