What big data cleansing techniques would you suggest? Is your big data messy? If so, how (or how much) does it affect your work or research? How do you get rid of noise? How do you assess the veracity of big data, especially if the source is social media? I would appreciate any suggestions and/or pointers to recent media articles, research papers, or documented best practices on big data verification, quality assessment, or quality assurance.
