I'm currently building a dataset for fake news detection in Bengali, and I'm looking for guidance on:

1. Identifying credible sources for real and fake news

2. Handling code-mixing and dialects

3. Annotation standards and class balance

4. Cultural/linguistic challenges in low-resource NLP tasks

If anyone has worked on similar datasets or has experience with multilingual fake news detection, I’d really appreciate your insights 🙂
