A machine learning analysis based on big data for eagle ford shale formation

Title

A machine learning analysis based on big data for eagle ford shale formation

Subject

Machine learning
Forecasting
Hydrocarbons
Big data
Shale
Regression analysis
Resource valuation
Well testing
Decision trees
Random forests
Multivariant analysis
Petroleum reservoir evaluation
Seismology
Oil well logging
Petrophysics

Description

Hydrocarbon production from shale formations has become an essential part of the global energy supply over the past decade. The life of a project in an unconventional play depends heavily on the prediction of Estimated Ultimate Recovery (EUR). However, conventional methods of predicting EUR are less accurate for shale formations, which significantly affects the economic returns of projects in unconventional plays. The objective of this article is to identify the most important independent variables, including petrophysical and completion parameters, for estimating EUR with a machine learning algorithm. A novel machine learning model based on Random Forest Regression is introduced to predict EUR and to rank the importance of the independent variables.
In this article, production, petrophysics and engineering data with more than 25 variables from 4000 wells in the Eagle Ford are summarized for analysis. The data are collected from production monitoring, well logging, well testing, seismic interpretation and lab experiments. This paper has three major components. Firstly, a multivariate linear regression model is created to predict the overall EUR. Secondly, a spatial autocorrelation analysis is carried out to identify whether spatial variables could affect the accuracy of the multivariate regression model. Thirdly, Random Forest Regression models are trained to examine their reliability in predicting EUR with spatially autocorrelated data. The importance of key predictors is also identified. The final models are tuned with optimized hyperparameters. Throughout the article, the predictive capabilities of each Random Forest Regression model are discussed in detail to understand the physics behind unconventional hydrocarbon production mechanisms.
The results and workflow presented in this paper are insightful and novel. Firstly, we test the multivariate regression analysis with all the petrophysics and completion variables using the backward elimination method. This widely used model has the limitation of excluding spatial information. To identify the impact of spatial variables, we calculate Moran's Index and find that the data in this study are clustered, that is, spatially autocorrelated. The p-values for EUR, Oil EUR and Gas EUR are 0.000002, 0.000000 and 0.12, which all reject the null hypothesis that the data is randomly distributed. To include the spatial information in the prediction, we use an advanced machine learning technique, Random Forest, to predict EUR from a combination of petrophysics, completion variables and spatial information. The key variables for predicting EUR, Oil EUR and Gas EUR with Random Forest Regression are identified. However, the importance of the key variables differs between Oil EUR and Gas EUR. Therefore, we split the overall EUR Random Forest Regression model (57% explained) into two prediction models, one for Oil EUR and one for Gas EUR. The Gas EUR Random Forest Regression model performs better (76% explained) than the Oil EUR Random Forest Regression model (60% explained).
This study provides a deeper understanding of unconventional hydrocarbon production prediction from a big data perspective, and proposes a novel and reliable machine learning model to predict EUR and evaluate economic returns in the Eagle Ford. Compared to the traditional multivariate regression model, our Random Forest Regression models are more reliable. In addition, the Random Forest technique is able to rank the importance of the relevant independent variables, and this ranking can be used to guide and improve data collection and model training in further studies on this topic. The workflow presented in this article can also be applied to data from other unconventional resource plays.
2019, Society of Petroleum Engineers
2019-September
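The spatial autocorrelation step described in the abstract can be illustrated with a minimal sketch of global Moran's I. This is not the authors' implementation: the abstract does not state the spatial weighting scheme or how the p-values were obtained, so the inverse-distance weights, the permutation test, the function names and the synthetic well grid below are all assumptions made for illustration.

```python
import math
import random


def inverse_distance_weights(coords):
    """Build a dense spatial weight matrix w[i][j] = 1 / distance(i, j).

    Assumption: inverse-distance weighting; coincident points are not handled.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = coords[i][0] - coords[j][0]
                dy = coords[i][1] - coords[j][1]
                w[i][j] = 1.0 / math.hypot(dx, dy)
    return w


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in z)
    return (n / s0) * (num / den)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical data: 25 wells on a 5 x 5 grid whose "EUR" rises from west to east,
# so neighbouring wells have similar values (a clustered pattern).
coords = [(x, y) for x in range(5) for y in range(5)]
eur = [float(x) for x, _ in coords]

w = inverse_distance_weights(coords)
observed_i, p_value = permutation_p_value(eur, w)
print(f"Moran's I = {observed_i:.3f}, pseudo p-value = {p_value:.3f}")
```

A positive Moran's I with a small p-value indicates that similar values sit near each other, which is the kind of clustering the abstract reports for the Eagle Ford data. For a real data set of several thousand wells, a sparse neighbour-based weight matrix would be more practical than the dense matrix used here.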

Creator

Liang, Yu
Zhao, Peidong

Publisher

SPE Annual Technical Conference and Exhibition 2019, ATCE 2019, September 30, 2019 - October 2, 2019

Date

2019

Type

conferencePaper

Identifier

26386712
10.2118/196158-MS

Citation

Liang, Yu and Zhao, Peidong, “A machine learning analysis based on big data for eagle ford shale formation,” Lamar University Midstream Center Research, accessed May 18, 2024, https://lumc.omeka.net/items/show/28311.
