I am trying to run a topic modeling study on a dataset of about 4 million tweets using Mallet, and I am running into issues with working memory, or "heap space." My computer has around 15 GB of RAM, but Mallet uses only 1 GB by default, so I get the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

I increased Mallet's heap space allocation as described at https://programminghistorian.org/lessons/topic-modeling-and-mallet#issues-with-big-data, but the error persists. Does anyone have a solution?
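One way to check whether a larger heap setting actually reaches the JVM is a small standalone test like the sketch below. The class name HeapCheck and its helper methods are made up for illustration and are not part of Mallet; the only real API used is the standard Runtime.maxMemory() call.

```java
// Minimal sketch: report the maximum heap the current JVM is allowed to use.
// HeapCheck is a hypothetical helper class, not part of Mallet.
public class HeapCheck {

    // Convert a byte count to whole megabytes.
    static long bytesToMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum heap size of this JVM in megabytes, as reported by the runtime.
    static long maxHeapMegabytes() {
        return bytesToMegabytes(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: "
                + maxHeapMegabytes() + " MB");
    }
}
```

Running it with the standard JVM option, for example java -Xmx8g HeapCheck after compiling, should report a value close to the requested size. If Mallet still fails with the same error even though a larger value was set, the edited setting may not be the one the launcher script actually passes to Java, which would be worth ruling out first.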

Thanks.
