Have you tried transforming the datasets into a data graph in which each element is a tensor, and then pre-processing the tensors using a twisted manifold? If that works, then try an FNN (feed-forward neural network) trained with backpropagation as the model.
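To make the graph-of-tensors plus feed-forward idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy graph, the node names, the logical-OR labels, and the FeedForwardNet class are hypothetical, and the manifold pre-processing step is not covered (each tensor is simply flattened). It only shows the general shape of a one-hidden-layer network trained with backpropagation.

```python
import math
import random

# Hypothetical data graph: each node holds a small tensor (nested list) and
# the edges record which elements are related. All names are illustrative.
graph = {
    "nodes": {
        "a": [[0.0, 0.0]],
        "b": [[0.0, 1.0]],
        "c": [[1.0, 0.0]],
        "d": [[1.0, 1.0]],
    },
    "edges": [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
    "labels": {"a": 0.0, "b": 1.0, "c": 1.0, "d": 1.0},  # toy target: logical OR
}


def flatten(tensor):
    """Flatten a nested list (tensor) into a flat feature vector."""
    if isinstance(tensor, list):
        return [x for item in tensor for x in flatten(item)]
    return [tensor]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class FeedForwardNet:
    """One hidden layer, sigmoid activations, squared-error loss."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [
            sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x, y, lr):
        hidden, out = self.forward(x)
        # Backpropagation: error signal at the output, then at each hidden unit
        # (computed before any weight is changed).
        delta_out = (out - y) * out * (1.0 - out)
        delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(self.w2, hidden)]
        # Gradient-descent updates.
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.b2 -= lr * delta_out
        for j, dh in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dh * xi
            self.b1[j] -= lr * dh


def mean_loss(net, samples):
    """Mean of 0.5 * (prediction - target)^2 over all samples."""
    return sum(0.5 * (net.forward(x)[1] - y) ** 2 for x, y in samples) / len(samples)


# Turn each node's tensor into a feature vector paired with its label.
samples = [(flatten(t), graph["labels"][name]) for name, t in graph["nodes"].items()]

net = FeedForwardNet(n_in=2, n_hidden=3)
initial_loss = mean_loss(net, samples)
for _ in range(2000):
    for x, y in samples:
        net.train_step(x, y, lr=0.5)
final_loss = mean_loss(net, samples)

print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
```

The edges are stored but not used by the network here; a real model would need some way to feed the graph structure in, which depends on your data.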
I would suggest trying the new DataStax Cassandra offering (trial version) or the community version. You would normally start Cassandra with the -g option. This worked pretty well for CERN's LHC project.
Your problem then moves to the data collection systems, where some economizing algorithms may be useful. Also check whether your model can use multi-core computing techniques.
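One rough way to check the multi-core point is to split the work into chunks and hand them to a process pool, then compare against the serial result and timing. The sketch below uses only the Python standard library. The function names (split_chunks, score_chunk, parallel_score) are hypothetical, and score_chunk is just a stand-in for whatever per-record work your model does.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def score_chunk(chunk):
    """Stand-in for per-record model work (hypothetical): sum of squares."""
    return sum(x * x for x in chunk)


def parallel_score(items, workers=None):
    """Score chunks in separate processes, one chunk per worker."""
    workers = workers or os.cpu_count() or 1
    chunks = split_chunks(items, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(score_chunk, chunks))


# Usage (call from under an `if __name__ == "__main__":` guard so that
# worker processes can import this module safely):
#     data = list(range(100_000))
#     assert parallel_score(data) == score_chunk(data)
```

If the parallel version is not noticeably faster than the serial one on your data, the per-chunk work is probably too small relative to the cost of starting processes and moving data between them.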