Jorge Guevara
in
Workshop: Machine Learning for Geophysical & Geochemical Signals
Abstract
Jorge Guevara, Blanca Zadrozny, Alvaro Buoro, Ligang Lu, John Tolle, Jan Limbeck, Mingqi Wu, Detlef Hohl (IBM Research and Shell Inc.)
An Interpretable Machine Learning Methodology for Well Data Integration and Sweet Spotting Identification.
The large volume of heterogeneous data produced by the petroleum industry brings both opportunities and challenges for machine learning methodologies aimed at optimizing and automating processes and procedures in this area. For instance, petrophysical data recorded in well logs, completions datasets, and well production data are good examples of data for training machine learning models that automate procedures and provide data-driven solutions to problems arising in the petroleum industry. In this work, we present a machine learning methodology for oil exploration that 1) integrates heterogeneous well data such as completions, engineering values, well production data, and petrophysical data; 2) performs feature engineering of petrophysical data from horizontal and vertical wells using Gaussian Process Regression (Kriging); 3) enables the discovery of new locations with high production potential by using machine learning models for sweet spot identification; 4) facilitates the analysis of the effect, role, and impact of some engineering decisions on production by means of interpretable machine learning models; 5) allows the incorporation of prior/expert knowledge by using Shape Constrained Additive Models (SCAMs); and 6) enables the construction of hypothetical "what-if" scenarios for production prediction by means of conditional plots based on residual plot analysis. We validated this methodology using real well production data, and we used nested leave-one-out cross-validation to assess the generalization power of the models. Among the results, it is important to highlight that 1) performance improves when prior knowledge is included via SCAMs: for example, there is a percentage change of 24% between the best RMSE obtained by black-box ML models and the RMSE of a model that incorporates prior knowledge; 2) we were able to construct hypothetical what-if scenarios based on actual petrophysical data and hypothetical completion and engineering values; and 3) we were able to assess the validity of ML models through effect analysis via conditional plots.
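The validation procedure named in the abstract, nested leave-one-out cross-validation, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the authors' work: the synthetic data, the choice of scikit-learn's `Ridge` as a stand-in model, the `alpha` grid, and the helper names `nested_loo_rmse` and `percentage_change`. The outer loop holds out one sample at a time to estimate RMSE, while the inner loop selects a hyperparameter using only the remaining samples.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, LeaveOneOut


def nested_loo_rmse(X, y, param_grid):
    """Estimate RMSE with nested leave-one-out cross-validation.

    Outer loop: hold out one sample for testing.
    Inner loop: pick hyperparameters by leave-one-out on the rest.
    (Hypothetical helper; Ridge is a stand-in for any regressor.)
    """
    squared_errors = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        search = GridSearchCV(
            Ridge(),
            param_grid,
            cv=LeaveOneOut(),
            # MSE is well defined on a single held-out sample (R^2 is not).
            scoring="neg_mean_squared_error",
        )
        search.fit(X[train_idx], y[train_idx])
        prediction = search.predict(X[test_idx])[0]
        squared_errors.append((prediction - y[test_idx][0]) ** 2)
    return float(np.sqrt(np.mean(squared_errors)))


def percentage_change(baseline, new):
    """Relative change of `new` with respect to `baseline`, in percent.

    The sign convention is an assumption: negative means `new` is lower.
    """
    return 100.0 * (new - baseline) / baseline


# Synthetic stand-in data (assumed; not real well data).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=20)

rmse = nested_loo_rmse(X, y, {"alpha": [0.01, 0.1, 1.0]})
print(f"Nested LOO RMSE: {rmse:.3f}")
```

Under this convention, comparing two models amounts to calling `percentage_change(rmse_baseline, rmse_new)` on their nested-LOO RMSE values. Because the hyperparameter search never sees the held-out sample, the outer RMSE is not biased by model selection.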