Plagiarized code may be either an exact copy of the original code or a variant produced by applying textual transformations such as:
- Changing the order of operands/operators in expressions,
- Changing data types,
- Adding redundant statements or variables,
- Changing comments,
- Changing whitespace,
- Renaming and reordering identifiers,
- Reordering and replacing code blocks, and
- Reordering statements within code blocks.
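
To illustrate why some of these transformations are easy to undo, here is a minimal sketch in Python (chosen only for illustration). It strips comments and whitespace and masks every identifier before comparing two token streams. The names `normalize` and `similarity` are hypothetical helpers, not part of any existing plagiarism checker, and the sketch does nothing about reordering, data type changes, or redundant statements.

```python
import difflib
import io
import keyword
import tokenize

# Token types that only carry layout or comments, not program logic.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalize(source):
    """Return the token strings of `source` with comments and layout
    dropped and every non-keyword name replaced by the placeholder "ID"."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")  # hides identifier renaming
        else:
            tokens.append(tok.string)
    return tokens


def similarity(a, b):
    """Similarity ratio in [0, 1] between the normalized token sequences."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b),
                                      autojunk=False)
    return matcher.ratio()


original = (
    "def total(xs):\n"
    "    # sum the list\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
# Same logic with renamed identifiers, a different comment, extra spaces.
variant = (
    "def add_all(values):\n"
    "    acc = 0   # accumulator\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)

print(similarity(original, variant))
```

Because the two snippets differ only in names, comments, and spacing, their normalized token sequences are identical. The structural transformations in the list (reordering blocks or statements, changing operand order) would need a more robust comparison than this.
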
I would like to do research on source code plagiarism detection.
Is there any publicly available benchmark dataset?