I want to perform a similarity screening between a reference vector and a set of query vectors of the same length and type, using the cosine (vector) similarity score (CSS). For vectors with non-negative components, the CSS ranges from 0 (no similarity between the compared vectors) to 1 (the compared vectors point in exactly the same direction). To decide which of the compared entities to use for further analysis, I need a CSS threshold (cut-off value). For example: if a comparison yields a CSS greater than or equal to the threshold (t), I'll keep that query vector; otherwise (CSS < t) it'll be rejected.
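To make the screening rule concrete, here is a minimal sketch in plain Python. The function names `cosine_similarity` and `screen`, the example vectors, and the value t = 0.8 are all illustrative assumptions; in particular, 0.8 is only a placeholder and not a recommended cut-off.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def screen(reference, queries, t):
    """Return (index, CSS) for every query whose CSS is >= t."""
    kept = []
    for i, query in enumerate(queries):
        css = cosine_similarity(reference, query)
        if css >= t:  # accept; anything with CSS < t is rejected
            kept.append((i, css))
    return kept

# Hypothetical data, chosen only for illustration.
reference = [1.0, 2.0, 3.0]
queries = [
    [2.0, 4.0, 6.0],  # same direction as the reference
    [3.0, 2.0, 1.0],  # components in reversed order
    [1.0, 2.0, 2.0],  # close to the reference
]

accepted = screen(reference, queries, t=0.8)  # t = 0.8 is an arbitrary placeholder
for index, css in accepted:
    print(f"query {index}: CSS = {css:.3f}")
```

With these made-up vectors, the first and third queries pass and the second is rejected. Which queries pass depends entirely on the value chosen for t, which is exactly the open question here.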
I have searched the web, scientific articles, and books for such a CSS threshold (even one given as a rule of thumb), but so far I have found nothing.
Do you know of a pre-determined CSS threshold, or of a method by which one can be calculated or defined?