09 September 2018

I am running an R script that downloads and preprocesses all the available methylation data sets from TCGA, using the Bioconductor package MethylMix. However, when I try to process the 450K breast cancer methylation data set (roughly 13 GB), I get a "Cannot allocate vector of size 12.8 Gb" error.
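
For context, the loop is roughly along these lines; the cancer codes, target directory, and the use of MethylMix's GetData() entry point here are illustrative rather than my exact code:

    library(MethylMix)

    cancers <- c("BRCA", "OV", "GBM")      # a few TCGA cancer codes, for illustration
    target  <- "~/tcga_methylation"        # illustrative download directory

    for (cc in cancers) {
      message("Processing ", cc)
      # download and preprocess the TCGA methylation data for one cancer type
      GetData(cancerSite = cc, targetDirectory = target)
      rm(list = setdiff(ls(), c("cancers", "target", "cc")))  # drop per-cancer objects
      gc()                                 # ask R to release freed memory between iterations
    }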

I am running R 3.4.0 on 64-bit x86_64-pc-linux-gnu on my school's computing cluster, where each node has the following properties:

  • Dual Socket
  • Xeon E5-2690 v3 (Haswell): 12 cores per socket (24 cores/node), 2.6 GHz
  • 64 GB DDR4-2133 (8 x 8 GB dual-rank x8 DIMMs)
  • No local disk
  • Hyperthreading Enabled - 48 threads (logical CPUs) per node

so it seems there should be enough memory for this operation. Since the operating system is Linux, I assumed R would simply use all available memory, unlike on Windows. Checking the process limits with ulimit also returns "unlimited", so I am not sure where the problem lies. My script is a loop that iterates over all the cancers available on TCGA, if that makes any difference.
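
For reference, the checks I can run from inside the R session look something like this (the object names are placeholders):

    gc()                                  # current memory use; also triggers a collection
    # list the ten largest objects in the workspace (sizes in bytes)
    sort(sapply(ls(), function(x) object.size(get(x))), decreasing = TRUE)[1:10]
    system("ulimit -a")                   # confirm the limits the R process inherited
    system("free -g")                     # how much memory is actually free on the node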
