Application of Physics-Informed Machine Learning (PIML) in Petroleum Reservoir Engineering
With reference to the availability of Reservoir Rock Properties, we typically have relatively little data for Porosity (despite its near-normal distribution); very little data for Permeability (owing to its log-normal distribution at field scale); and almost no data for Rock Compressibility (which becomes critical in layered formations), when compared with the relatively abundant Reservoir Fluid Property data.
Further, at field scale, we can never measure the fundamental multi-phase fluid-flow properties at the sub-pore scale associated with capillary forces (capillary pressure, interfacial tension) and wettability (contact angle); we can only obtain "equivalent" values from much smaller laboratory-scale investigations using experimental techniques (lab-estimated data, not at the required Darcy or continuum scale), and in the absence of the reservoir's physical and chemical heterogeneities.
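For context, the usual bridge from such lab-scale measurements to field-scale "equivalent" values is the Leverett J-function,

J(S_w) = \frac{P_c(S_w)}{\sigma \cos\theta} \sqrt{k/\phi},

which normalizes the measured capillary pressure P_c by interfacial tension σ, contact angle θ, and the permeability-porosity ratio of the core sample, and is then rescaled with field-scale k and φ; this is precisely the "equivalent, not measured" step at issue here.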
1. So, with huge data noise; with heterogeneous data noise; with significant outliers; with drastic distribution shift in the data; with data sparsity associated with the several fluid regimes of the reservoir; and with randomness in data partitioning, how will we be able to quantify the sources of uncertainty associated with Reservoir Rock Properties (the data side)?
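One partial, data-side answer is to at least expose the spread that noise and sparsity induce in the model itself. Below is a minimal bootstrap-ensemble sketch, assuming a generic well-log-to-rock-property regression; the synthetic features, the log10(permeability) target, and the network size are hypothetical placeholders, not a recommended workflow:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical sparse, noisy data: X = well-log attributes,
# y = log10(permeability) (log-transformed because permeability
# is log-normally distributed at field scale).
X = rng.normal(size=(120, 4))
y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(scale=0.5, size=120)

# Bootstrap ensemble: each member sees a resampled data set and a
# different initialization, so member disagreement reflects both
# data noise and model (epistemic) uncertainty.
members = []
for seed in range(20):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
    m = MLPRegressor(hidden_layer_sizes=(32, 32),
                     max_iter=2000, random_state=seed)
    m.fit(X[idx], y[idx])
    members.append(m)

X_new = rng.normal(size=(5, 4))
preds = np.stack([m.predict(X_new) for m in members])  # shape (20, 5)
mean, std = preds.mean(axis=0), preds.std(axis=0)
print(np.c_[mean, std])  # per-point prediction and uncertainty band
```

The ensemble spread gives only a combined uncertainty estimate; separating the individual sources listed above (noise heterogeneity, outliers, distribution shift, partitioning randomness) remains exactly the open question being raised.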
2. And, in the case of the fundamental multi-phase Reservoir Fluid-Flow parameters (IFT, contact angle), we do not even have real field data; we merely manage these parameters using lab-scale investigations. If so, then how can we justify 'Data Quality' in Reservoir Engineering applications?
3. While dealing with the strongest prior in Reservoir Engineering applications, i.e., partial differential equations (unlike the relatively weaker priors such as ordinary differential equations, stochastic differential equations, symmetry constraints, or intuitive-physics constraints), and given the poor justification for Data Quantity and Data Quality, how will we be able to deduce the Reservoir "Architecture" from such data sets?
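For concreteness, the kind of "strongest prior" meant here is a governing PDE such as the single-phase pressure diffusivity equation,

\frac{\partial p}{\partial t} = \frac{k}{\phi \mu c_t} \nabla^2 p,

with permeability k, porosity φ, viscosity μ, and total compressibility c_t; its coefficient bundles exactly the rock properties whose data quantity and quality are questioned above.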
4. With a poor architecture, how would it remain feasible to deduce 'Improved Loss Functions' through the translation from data pre-processing, the deep neural network, data post-processing, and automatic differentiation?
What exactly is happening inside the 'Automatic Differentiation' and 'Physics Information' modules (operating on the DNN output)?
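A minimal PyTorch sketch of what those two modules do, under illustrative assumptions (the tiny network, the random collocation sampling, and the constant diffusivity eta = k/(φ·μ·c_t) are placeholders, not a reservoir model), for the 1-D form of the diffusivity equation above:

```python
import torch

torch.manual_seed(0)

# DNN surrogate p̂(x, t); inputs are space-time coordinates.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

eta = 0.1  # assumed diffusivity k / (phi * mu * c_t)

# Collocation points where the PDE residual is enforced.
xt = torch.rand(256, 2, requires_grad=True)  # columns: x, t
p = net(xt)

# 'Automatic Differentiation' module: exact derivatives of the network
# output w.r.t. its inputs via the chain rule (no finite-difference stencils).
grads = torch.autograd.grad(p, xt, torch.ones_like(p), create_graph=True)[0]
p_x, p_t = grads[:, 0:1], grads[:, 1:2]
p_xx = torch.autograd.grad(p_x, xt, torch.ones_like(p_x),
                           create_graph=True)[0][:, 0:1]

# 'Physics Information' module: assemble the PDE residual p_t - eta * p_xx,
# which should vanish wherever the network honours the physics.
residual = p_t - eta * p_xx
loss_pde = (residual ** 2).mean()
print(float(loss_pde))
```

So automatic differentiation returns exact derivatives of the DNN output, and the physics-information step merely arranges them into a residual that is penalized during training.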
5. How will we be able to deduce the "individual weighting functions" associated with the PDEs, initial conditions, boundary conditions, and data?
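For reference, those weights enter a composite loss of the standard form

\mathcal{L} = \lambda_{PDE}\,\mathcal{L}_{PDE} + \lambda_{IC}\,\mathcal{L}_{IC} + \lambda_{BC}\,\mathcal{L}_{BC} + \lambda_{data}\,\mathcal{L}_{data},

where each term is a mean-squared residual over its own point set. The open issue raised here is that the λ values are usually hand-tuned or adapted heuristically (e.g., by balancing gradient magnitudes across terms), with no settled recipe.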
6. With a compromised loss-function estimate, how would we be able to proceed with reasonable 'optimization' before we start 'inferring' from the generated results towards forecasting, whether for a neural simulation or an inverse problem?
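On the optimization step, common PINN practice is a two-stage scheme: Adam for a robust warm-up, then L-BFGS to polish. The sketch below is hedged in the same way as above (same illustrative 1-D diffusivity residual, placeholder network and eta); 'inference' then means either evaluating the trained network at new space-time points (neural simulation) or additionally treating coefficients such as eta as trainable parameters (inverse problem):

```python
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
eta = 0.1  # assumed diffusivity, as in the sketch above

def pde_loss(net):
    # Same residual construction as in the previous sketch.
    xt = torch.rand(256, 2, requires_grad=True)
    p = net(xt)
    g = torch.autograd.grad(p, xt, torch.ones_like(p), create_graph=True)[0]
    p_x, p_t = g[:, 0:1], g[:, 1:2]
    p_xx = torch.autograd.grad(p_x, xt, torch.ones_like(p_x),
                               create_graph=True)[0][:, 0:1]
    return ((p_t - eta * p_xx) ** 2).mean()

# Stage 1: Adam, a robust first-order warm-up.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    pde_loss(net).backward()
    opt.step()

# Stage 2: L-BFGS, a quasi-Newton polish near the optimum.
lbfgs = torch.optim.LBFGS(net.parameters(), max_iter=200)
def closure():
    lbfgs.zero_grad()
    loss = pde_loss(net)
    loss.backward()
    return loss
lbfgs.step(closure)

# Forecasting = evaluating net at new (x, t) points; an inverse problem
# would instead make eta a trainable torch.nn.Parameter in the same loop.
```

None of this resolves the underlying concern: if the loss itself is compromised, both stages converge confidently to the wrong place, which is exactly the point of the question.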
Suresh Kumar Govindarajan
Professor (HAG) IIT-Madras
https://home.iitm.ac.in/gskumar/
https://iitm.irins.org/profile/61643
17-Aug-2024