Plagiarism can take different forms, and the form largely determines how hard it is to detect. For example, Maurer, H., Kappe, F., Zaka, B.: Plagiarism – A Survey. Journal of Universal Computer Science, vol. 12, no. 8, pp. 1050–1084, 2006, gives a hierarchy of plagiarism categories ranging from copy-paste plagiarism to idea copying to non-existent or incorrect references. Simple detection methods are based on distance metrics; others rely on cryptographic hashing of text fragments for comparison. A more promising approach, in my opinion, is along the lines of "Intrinsic Plagiarism Detection Using Character n-gram Profiles" by Efstathios Stamatatos.
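To give a feel for the intrinsic idea (detecting plagiarism without a reference collection, purely from style shifts inside one document), here is a rough Python sketch: it compares the character n-gram profile of each sliding window against the whole-document profile and flags windows that deviate strongly. The dissimilarity function and thresholds below are simplifications I chose for illustration, not the exact measure from the Stamatatos paper.

```python
from collections import Counter

def char_ngram_profile(text, n=3):
    """Normalized character n-gram frequency profile of a text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def dissimilarity(window_profile, doc_profile):
    """Mean squared relative frequency difference over the document's
    n-grams (a simplified stand-in for the normalized d1 measure)."""
    diffs = []
    for gram, f_doc in doc_profile.items():
        f_win = window_profile.get(gram, 0.0)
        diffs.append(((2 * (f_win - f_doc)) / (f_win + f_doc)) ** 2)
    return sum(diffs) / len(diffs)

def flag_outlier_windows(text, n=3, window=1000, step=200, threshold=1.5):
    """Return start offsets of windows whose style deviates from the
    document average by more than `threshold` standard deviations."""
    doc_profile = char_ngram_profile(text, n)
    positions, scores = [], []
    for start in range(0, max(1, len(text) - window + 1), step):
        win = text[start:start + window]
        positions.append(start)
        scores.append(dissimilarity(char_ngram_profile(win, n), doc_profile))
    mean = sum(scores) / len(scores)
    std = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5 or 1.0
    return [pos for pos, s in zip(positions, scores) if (s - mean) / std > threshold]
```

Windows flagged this way are only candidates for a style break; any real system would need tuning of the window size, n-gram order, and threshold.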
We have used simple text mining techniques to assess the originality of students' projects and reject those that are copied. The simplest task was detecting 100% copied projects ;)
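For the 100%-copy case, something as simple as hashing a normalized version of each submission already works. The sketch below is illustrative only (the function names and the name-to-text mapping are hypothetical, not the code we actually used): whitespace and case are normalized so trivially reformatted copies still collide on the same fingerprint.

```python
import hashlib
import re

def fingerprint(text):
    """Hash of lowercased, whitespace-normalized text."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_exact_copies(submissions):
    """Group submissions (student name -> project text) sharing a fingerprint."""
    groups = {}
    for name, text in submissions.items():
        groups.setdefault(fingerprint(text), []).append(name)
    return [names for names in groups.values() if len(names) > 1]
```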
This shared task series investigates a number of interesting plagiarism scenarios, including attempts to detect plagiarised text that has been passed through several automatic machine translation engines, and tasks that involve finding re-used text in a pair of documents. It is a good starting point for investigating the state of the art in what is quite a complex area.
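As a very crude baseline for the "re-used text in a pair of documents" setting, you can look at which word n-grams the two documents share; the real shared-task systems go much further (alignment, obfuscation handling, etc.), so treat this only as a sketch of the underlying idea.

```python
def word_ngrams(text, n=5):
    """Set of word n-grams in a text (lowercased, whitespace-tokenized)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_passages(doc_a, doc_b, n=5):
    """Word n-grams appearing in both documents: candidate re-used fragments."""
    return word_ngrams(doc_a, n) & word_ngrams(doc_b, n)
```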
Here's the overview PDF, with references to the individual systems in the shared task.