I am trying to work out the ways in which discourses diffuse by examining corpora and mining databases. Does anybody have experience with tools like RapidMiner applied within a discourse-analytic framework?
You could try contacting Chris Culy, who specializes in the visualization of linguistic data, i.e. the visual representation of language data in graphs or other formats, to help linguists both present their findings and spot patterns they might otherwise miss. I went to his workshop at Coventry University last December. While the workshop was of no immediate relevance to me, many corpus linguists working in a range of areas (grammar, lexis, engineering discourse, newspaper articles, etc.) said that they had found it very useful. He was based at Tuebingen University, but he is now freelance. He offers free visualization software if you're interested. I'm not sure how the software would cope with a really large corpus, but if there were problems, you could always look at your corpus section by section.
A quick Google search should give you an idea of where he is now.