I want to know how Urkund works. On what basis does it compute the similarity index? Is there any way for mathematics researchers to avoid a high similarity index?
Urkund is, first, a database filled with documents of all sorts - many millions of them - academic ones as well as others. All reports from project work at universities (at least those in Sweden) are to be uploaded to this system. Second, examiners can use the system to see whether there are similarities among student reports, basically through text recognition. Urkund also searches the web for plagiarism detection. As an examiner, when you upload a new document you get back a measure of its similarity with other documents; depending on the level of similarity, the student(s) may then need to explain why the text is relatively unoriginal. I do not know which universities are using it, or how similar a document needs to be to bring about an alert. We do use it, and we have found in our courses that the plagiarism problem has decreased quite a lot.
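To make the idea of a "measure of similarity" concrete, here is a minimal sketch in Python. I do not know Urkund's actual algorithm, so this is only an assumption about how such a score could be computed: it breaks each text into overlapping three-word sequences and reports the fraction of the submission's sequences that also appear in the comparison documents. The function names and the choice of three words are hypothetical, chosen for illustration only.

```python
import re

def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"\w+", text.lower())
    # Texts shorter than k words yield an empty set.
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def containment(submission, source, k=3):
    """Fraction of the submission's k-word sequences found in one source."""
    sub = shingles(submission, k)
    if not sub:
        return 0.0
    return len(sub & shingles(source, k)) / len(sub)

def similarity_index(submission, corpus, k=3):
    """Fraction of the submission's k-word sequences found anywhere in the corpus.

    This is an illustrative score, not Urkund's real similarity index.
    """
    sub = shingles(submission, k)
    if not sub:
        return 0.0
    seen = set()
    for doc in corpus:
        seen |= shingles(doc, k)
    return len(sub & seen) / len(sub)
```

For example, `similarity_index(report, [book_text, other_report])` would return a number between 0 and 1, where `report`, `book_text`, and `other_report` are plain strings. A real system would presumably work on a far larger database and use more robust matching than exact word sequences.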
I should also say that Urkund has caught a few students who had taken stretches of text directly from a database (e.g., from a book or article) or borrowed from fellow students. Those detections, and the penalties that followed (such as no access to the university for a while, including a suspension from taking exams), have in fact led to a decrease in cheating of this form. (Of course, it has not reduced the amount of cheating overall, as so-called scholars all over the world steal (i.e., copy) heaps of papers, transform them, and then publish them as their own. Junk science, that's what it is.)
The gist of this is that if you use a bit of text that's not your own,
(1) you will not learn as much,
(2) the real authors will not be acknowledged as they should be, and
(3) you have effectively stolen property that belongs to someone else.