Here's an open-ended question relating to copyright, ethics, power relations in academia, and corpus linguistics:

What is the situation in your country/university with respect to the intellectual property rights of corpora/data collected and constituted by a PhD student during the preparation of their thesis?

All other considerations aside (i.e. suppose that the data is original, with no prior copyright holders, and that it has been duly collected with the consent of participants):

(1) Does the PhD student retain the intellectual property rights to such data? Or do they automatically become the intellectual property of the university, by means of an employment contract or another legal document (e.g. one that PhD students may be forced to sign in order to be authorised to defend their thesis)?

(2) What happens if the PhD student wishes to share/publish their data/corpora under an Open Access license (e.g. Creative Commons) after their defence or even before it? Do they need the permission of their supervisor, of a higher-level university body, of their funding agency, or of all of the above? Has this ever happened at your university? Have there been cases where a researcher wanted to share data under an Open Access license and was prevented from doing so by another level of the hierarchy?

(3) If the data does become the intellectual property of the university, does the university have any obligations afterwards (e.g. is it obliged to make the data available through an institutional repository)? If the data becomes part of an institutional repository, does the PhD student have any say in the type of license under which it will be distributed (for example, do they get to choose "non-commercial")?

(4) After the defence, is it possible for the university (or even an individual supervisor) to formally ask their former student (now Dr) to refrain from using the data/corpus they collected during their thesis? Note that, in theory, if the corpus automatically becomes the intellectual property of the university, this is entirely possible. Do you know of any cases of universities sending formal "cease and desist" letters to their former PhD students?

I would like to collect information about current practice and law in different countries with respect to this issue. For example, some countries limit these practices (considering them an abusive use of copyright); some Codes of Conduct in Dutch universities explicitly state that, unlike other outputs, the copyright of a PhD thesis is retained by the PhD holder; in "business-friendly" Belgium, the issue is dealt with under labour law (therefore a PhD student is just another employee and everything they produce belongs to their employer).

Researchers are becoming increasingly aware that the current situation is not really conducive to early-career researchers sharing their corpora under Open Access licenses.

Legal experts will provide data and analyses, as these matters can get complicated. But I would also like to hear the experiences and opinions of corpus linguistics practitioners. Any pointers to your country's laws, your university's code of conduct, case law, cases reported in the media, stories and anecdotes, or even personal experiences (if you don't mind sharing them) are welcome.

Thank you very much for participating in the discussion and thank you for your help!
