“Philosophical discussion in the absence of a theory is no criterion of the validity of evidence.”
-- A. N. Whitehead, Adventures of Ideas (1933: 221)
In an investigation or in a disciplinary technology, empirically speaking (irrationally speaking, i.e., speaking in a strictly non-Cartesian way), data/corpora are the raw material (ephemeral ‘arbitrary signifiers’ in the case of linguistics) from which to build up a theory following the inductive method.
Why, then, is a mere ‘corpus’ tagged with linguistics, an epistemological disciplinary technology?
‘Corpus’ is not tagged with Physics, Geology, Psychology, Sociology, etc. (e.g., Corpus Physics or Corpus Sociology), though they also deal with data!
Collecting data and arranging them (typing?) in a digital machine does not involve any knowledge or wis(h)dom, only a special skill that needs clerical precision. Documentation, no doubt, is a tiresome job. Utilizing a tool (a digital machine) as a repertoire does not necessarily entail the birth of a discipline.
Ascribing static (“thetic...”, Kristeva, 1974) meaning to those entries, though, needs epistemology, and that can be handled by well-established theory-based disciplines: Lexicology, Semantics, Pragmatics, etc. If we have such levels of linguistic analysis, do we need such a dubious coinage as “Corpus Linguistics”?
And each empirical discipline needs data for further observation, experimentation and inductive generalization (one may raise Popper’s [1934, 2009] points for refuting Inductivism here), i.e., data is an initial part of the whole, but neither a theory nor a praxis.
However, it is a celebrated discipline now! Why is it so? What is the purpose of such a discipline?
My friend says, “We, the residents of the so-called third world, are part of the data-collection team—don’t you understand that? How dare you? You cannot be allowed to perform theoretical plays.” (Galtung, 1980)