Hi,
I am new to the field of structural biology and am trying to understand how sequence similarity clusters are defined in the Protein Data Bank. As a non-scientist, I would be grateful for a high-level answer.
Specifically, my question is: what is the definition of a sequence similarity cluster, and does that definition stay the same over time?
a.) In other words, once a protein chain is put into a sequence similarity cluster, does it always stay in that cluster? That is, are the same protein chains always grouped together in a cluster (although new chains may be added to the cluster over time as the Protein Data Bank grows)?
b.) Or do protein chains get grouped with different chains over time as the Protein Data Bank grows?
c.) How are new sequence similarity clusters born?
Thank you!