What big data cleansing techniques would you suggest? Is your big data messy? If so, how, and how much, does it affect your work or research? How do you get rid of noise? How do you verify the veracity of big data, especially when the source is social media? I would appreciate any suggestions or pointers to recent media articles, research papers, or documented best practices on big data verification, quality assessment, or quality assurance.