I'm new to data science and am wondering if someone could point me in the right direction for choosing the correct statistical analysis, decomposition, or modeling approach for the following data set.

I have a number of continuous variables that I can add or not add to a control sample (which has no additives) to change the output variables by some amount relative to the control's output values (example dataset picture attached).

How can I correlate each input to the outputs, and use this to build a model that will help me predict the expected output given the addition of some new ratio of additives A, B, C, D, E, F, G at amounts a*, b*, c*, d*, e*, f*, g*? A-G are not necessarily independent.
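To make the setup concrete, here is a minimal sketch of one possible baseline, not a recommendation of the correct method. It assumes the response is roughly linear and additive in the additive amounts, and it uses ridge regression (my choice for illustration) because the penalty stays stable when inputs are correlated. The data are synthetic stand-ins, and every name (`fit_ridge`, `predict`, `W_true`, `x_star`) is hypothetical. Both inputs and outputs are expressed as changes relative to the control, so no intercept is fitted: zero additives means zero change by definition.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Fit Y ~ X @ W with an L2 penalty (closed-form ridge, no intercept).

    X: (n_samples, n_inputs) additive amounts (control = all zeros).
    Y: (n_samples, n_outputs) output changes relative to the control.
    """
    n_inputs = X.shape[1]
    # W = (X^T X + alpha * I)^-1 X^T Y, solved without forming the inverse
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_inputs), X.T @ Y)

def predict(W, x_new):
    """Predicted output changes for new additive amounts."""
    return np.asarray(x_new) @ W

# Synthetic stand-in data (assumption: NOT the real data set)
rng = np.random.default_rng(0)
n_samples, n_inputs, n_outputs = 400, 7, 3   # 7 inputs stand for A-G
X = rng.uniform(0.0, 1.0, size=(n_samples, n_inputs))
X[:, 1] = 0.5 * X[:, 0] + 0.5 * X[:, 1]      # make A and B correlated
W_true = rng.normal(size=(n_inputs, n_outputs))
Y = X @ W_true + 0.05 * rng.normal(size=(n_samples, n_outputs))

W = fit_ridge(X, Y, alpha=1.0)

# Hypothetical new mixture a*, b*, ..., g*
x_star = np.array([0.2, 0.1, 0.0, 0.5, 0.0, 0.3, 0.0])
y_pred = predict(W, x_star)
print(y_pred)
```

Each column of `W` can be read as the estimated per-unit effect of every additive on one output, under the linearity assumption. If interactions between additives matter, this sketch would miss them unless product terms are added as extra input columns.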

In the real data set, I have ~400 sample data points for ~30 input variables and ~20 output variables, so machine learning will not help, as this is most definitely intractable. Further, this is a biological system, so building the underlying network itself would be worthy of a PhD, an amount of work that I'm trying to figure out how to avoid.
