Accepted Manuscript

Application of artificial intelligence to forecast hydrocarbon production from shales Palash Panja, Raul Velasco, Manas Pathak, Milind Deo

PII: S2405-6561(17)30114-1

DOI: 10.1016/j.petlm.2017.11.003

Reference: PETLM 175

To appear in: Petroleum

Received Date: 15 June 2017 Revised Date: 22 September 2017 Accepted Date: 22 November 2017

Please cite this article as: P. Panja, R. Velasco, M. Pathak, M. Deo, Application of artificial intelligence to forecast hydrocarbon production from shales, Petroleum (2017), doi: 10.1016/j.petlm.2017.11.003.


Application of Artificial Intelligence to Forecast Hydrocarbon Production from Shales

Palash Panja1 *, Raul Velasco1, Manas Pathak2, Milind Deo2

1Energy & Geoscience Institute

432 Wakara Way, Suite 300, Salt Lake City, UT 84108

2Department of Chemical Engineering, University of Utah 50 Central Campus Dr., Salt Lake City, UT 84112

*ppanja@egi.utah.edu Corresponding author

Abstract

Artificial intelligence (AI) methods and applications have recently gained a great deal of attention in many areas, including mathematics, neuroscience, economics, engineering, linguistics, and gaming. This is due to the surge of innovative and sophisticated applications of AI techniques to highly complex problems as well as powerful new developments in high-speed computing. Applications of AI in everyday life include machine learning, pattern recognition, robotics, and data processing and analysis. The oil and gas industry is not behind either; AI techniques have recently been applied to estimate PVT properties, optimize production, predict recoverable hydrocarbons, optimize well placement using pattern recognition, optimize hydraulic fracture design, and aid in reservoir characterization efforts. In this study, three different surrogate models are trained and used to forecast hydrocarbon production from hydraulically fractured wells. Two widely used artificial intelligence methods, namely the Least Square Support Vector Machine (LSSVM) and Artificial Neural Networks (ANN), are compared to a traditional curve fitting method known as the Response Surface Model (RSM), which uses second order polynomial equations, to determine production from shales. The objective of this work is to further explore the potential of AI in the oil and gas industry. Eight parameters are considered as input factors to build the models: reservoir permeability, initial dissolved gas-oil ratio, rock compressibility, gas relative permeability, slope of gas-oil ratio, initial reservoir pressure, flowing bottom hole pressure, and hydraulic fracture spacing. The range of values used for these parameters resembles real field scenarios from prolific shale plays such as the Eagle Ford, Bakken, and Niobrara in the United States. Production data consist of oil recovery factor and produced gas-oil ratio (GOR) generated from a generic hydraulically fractured reservoir model using a commercial simulator. The Box-Behnken experimental design was used to minimize the number of simulations for this study. Five time-based models (for production periods of 90 days, 1 year, 5 years, 10 years, and 15 years) and one rate-based model (when oil rate drops to 5 bbl/day/fracture) were considered. A Particle Swarm Optimization (PSO) routine is used in all three surrogate models to obtain the associated model parameters. Models were trained using 80% of all data generated through simulation while 20% was used for testing of the models. All models were evaluated by measuring the goodness of fit through the coefficient of determination (R²) and the Normalized Root Mean Square Error (NRMSE). Results show that RSM and LSSVM have very accurate oil recovery forecasting capabilities while LSSVM shows the best performance for complex GOR behavior. Furthermore, all surrogate models are shown to serve as reliable proxy reservoir models useful for fast fluid recovery forecasts and sensitivity analyses.

Keywords: Surrogate models; LSSVM; ANN; Oil recovery; Artificial intelligence; Unconventional reservoirs

1. Introduction

Surrogate models are particularly useful for quick predictions given a range of input parameters. These models are used to forecast oil production and perform sensitivity and uncertainty analyses. Polynomial equations and other non-linear equations known as response surface models (RSM) have been popularized for their simple mathematical structure and ease of implementation. Recently, artificial intelligence applications have gained the interest of engineers and scientists due to their unconventional ways of connecting input data to output. RSM coupled with a proper design of experiments [1] has proven to be an efficient and fast proxy model for forecasting production performance and analyzing uncertainties [2]. Oil rate and water cut results were also predicted using RSM [3]. Response surface models are widely applied to various aspects of reservoir engineering, including estimating initial hydrocarbon uncertainty [4] and production uncertainty [5-10], finding an optimal scheme for well placement [7, 11-14], history matching [13, 15, 16], and determining the dew point of water in natural gas processing units [17]. Field cases have been studied using pattern recognition techniques [18] to determine pressure and production variation according to well locations.

Even though researchers have developed numerical, analytical, and semi-analytical techniques to understand the physics underlying production from hydraulically fractured tight formations [19-22], many of these systems grow in complexity, rendering most of these methods inapplicable. The AI approach, on the other hand, is very useful when dealing with highly complex systems. At the cost of understanding the physical mechanisms taking place in tight formations, AI helps us analyze and forecast hydrocarbon production and assess performance. In this study, two of the most common AI techniques, namely ANN and LSSVM, as well as a second order polynomial RSM are used to predict oil recovery and produced gas-oil ratio from a hydraulically fractured low permeability reservoir. The comparison of these three methods in terms of performance and accuracy is also discussed.

The application of ANN started before that of LSSVM; in the early 1990s, data from well tests were already being interpreted using ANN [23, 24]. Rock characteristics such as lithology were determined from well logs using fuzzy neural networks [25]. Reservoir heterogeneity with respect to porosity, permeability, and oil saturation was characterized from geophysical well logs such as gamma ray, bulk density, and deep induction using ANN [26]. Thermodynamic properties of reservoir fluids such as bubble point pressures and formation volume factors at the bubble point have been predicted from four inputs (solution GOR, reservoir temperature, oil gravity, and gas density) using ANN, SVM and nonlinear regression [27]. Similarly, crude oil viscosity and solution GOR as functions of pressure have been determined using ANN from 12 variables including oil composition, bubble point pressure, and bubble point viscosity [28]. Calculations of gas condensate dew point pressures were also made using gas composition, temperature, and heavy fraction properties [29-31] and condensate to gas ratio [32]. ANN predictions of asphaltene precipitation [33] compared well with experimental studies [34]. Oil rates in pipelines have also been estimated using ANN for varying pressures and temperatures [35]. Various applications of LSSVM include porosity and permeability determination [36-39], water coning in horizontal wells [40, 41], well placement [40], gas-oil relative permeability curves [42], phase equilibrium calculations of hydrates [43], oil flow rate predictions [44], and temperature-pressure relationships in natural gas production and processing [45]. Wide applications of artificial intelligence in improved oil recovery were recently described by researchers [46-49]. Other applications include the description of CO2 solubility [50] and calcium carbonate [51] in brine sequestration processes.

Eight important parameters are considered as input data. They include geological parameters (initial reservoir pressure, rock compressibility, and permeability), an operational parameter (bottom hole pressure), a completion parameter (fracture spacing), a rock-fluid property (Corey gas relative permeability exponent), and fluid properties (initial solution gas-oil ratio and the linear slope of solution gas-oil ratio versus pressure). These parameters were selected from a previous study [52] in which a mechanistic analysis revealed them to be highly significant. Six models (five time-based models for production after 90 days, 1 year, 5 years, 10 years, and 15 years and one rate-based model when oil rate drops to 5 bbl/day/fracture) of oil recovery and produced GOR are developed for each surrogate model (RSM, ANN and LSSVM). Data are generated from a generic reservoir model with one vertical hydraulic fracture placed in the middle of the reservoir using a commercial reservoir simulator. The mathematical formulations and workflow to create these surrogate models are discussed in this article. The results obtained for all models are compared using error analyses in terms of coefficient of determination (R²) and normalized root mean square error (NRMSE).

2. Reservoir Model

Unconventional reservoirs such as shales and other tight formations are very complex in terms of possible natural fracture presence and heterogeneity. However, it is possible to build a homogeneous reservoir model using average properties if the variation is not very large. Typically, wells are drilled vertically and then directed horizontally for 1 to 2 miles, where as many as 100 vertical hydraulic fractures are induced to generate highly conductive flow paths to the wellbore. Simulating an entire reservoir model that consists of 100 hydraulic fractures is computationally very expensive. Hence, only a small representative portion of the reservoir is simulated, where production from a single vertical fracture is considered. The reservoir properties are assumed to be homogeneous as listed in Table 1.

Table 1: Simulation parameters

| Parameter | Value |
|---|---|
| Reservoir top depth (ft) | |
| Reservoir thickness (ft) | 200 |
| Reservoir width (ft) | 750 |
| Fracture width (ft) | 0.05 |
| Fracture height (ft) | 200 |
| Fracture half-length (ft) | 375 |
| Fracture orientation | Parallel to YZ plane |
| Reservoir porosity (%) | 5 |
| Initial water saturation (%) | 16 |

The number of unique input parameter combinations could lead to an enormous number of experiments or simulations. The Box-Behnken method [53] is chosen in this study to keep the required number of simulations to a minimum. This simulation design is also suitable for second order response surface models. Using the Box-Behnken design, 114 simulations are modeled for eight input parameters in three levels (minimum, medium and maximum) as shown in Table 2. The values of all input parameters are converted to a -1, 0, and 1 scale using a linear relationship, except for matrix permeability and rock compressibility (where logarithmic values are used instead).

Table 2: Input parameters and their values

| No. | Variable | Symbol | Minimum (-1) | Medium (0) | Maximum (+1) |
|---|---|---|---|---|---|
| 1 | Matrix Permeability (nD) | X1 | 10 | 225 | 5000 |
| 2 | Gas Rel. Permeability Exponent, ng | X2 | 1 | 2 | 3 |
| 3 | Rock Compressibility (1/psi) | X3 | 4×10⁻⁶ | 4×10⁻⁵ | 4×10⁻⁴ |
| 4 | dRs/dp (SCF/STB/psi) | X4 | 0.50 | 0.65 | 0.80 |
| 5 | Initial Gas Oil Ratio, Rsi (SCF/STB) | X5 | 800 | 1900 | 3000 |
| 6 | Initial Pressure, Pi (psi) | X6 | 4000 | 5250 | 6500 |
| 7 | BHP (psi) | X7 | 500 | 1000 | 1500 |
| 8 | Fracture Spacing (ft) | X8 | 60 | 180 | 300 |
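The conversion of Table 2 values to the -1, 0, +1 scale can be sketched as follows. This is a minimal illustration, assuming that the linear relationship maps the minimum to -1 and the maximum to +1 and that base-10 logarithms are taken first for matrix permeability and rock compressibility; the function name `code_variable` and the dictionary keys are hypothetical and not taken from the study.

```python
import math

# (minimum, medium, maximum, coded on a log10 scale?) for a few rows of Table 2.
# The key names are illustrative only.
TABLE_2 = {
    "X1_matrix_permeability_nD": (10.0, 225.0, 5000.0, True),
    "X3_rock_compressibility_1_per_psi": (4e-6, 4e-5, 4e-4, True),
    "X6_initial_pressure_psi": (4000.0, 5250.0, 6500.0, False),
    "X8_fracture_spacing_ft": (60.0, 180.0, 300.0, False),
}

def code_variable(value, low, high, use_log=False):
    """Map a physical value onto the coded scale where low -> -1 and high -> +1."""
    if use_log:
        # Assumption: base-10 logarithm for the log-scaled variables.
        value, low, high = (math.log10(v) for v in (value, low, high))
    center = 0.5 * (low + high)
    half_range = 0.5 * (high - low)
    return (value - center) / half_range

for name, (low, mid, high, use_log) in TABLE_2.items():
    coded = [code_variable(v, low, high, use_log) for v in (low, mid, high)]
    print(name, [round(c, 3) for c in coded])
```

Under these assumptions the medium permeability of 225 nD lands very close to, but not exactly at, the coded value 0, because 225 is a rounded geometric mean of 10 and 5000.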

Apart from the 114 simulation results that were used to train the models, 30 additional simulations were run to test the models. Therefore, the training data comprise approximately 80% of the total data set (114 out of 144) while the testing portion comprises approximately 20% (30 out of 144). The list of simulations used to train and test the models can be found in the Appendix (Tables A.3 and A.4). IMEX™ from the Computer Modeling Group, Calgary, Canada was used to conduct all black-oil simulations. The minimum number of simulation grid blocks necessary to obtain accurate results and avoid convergence issues was used as prescribed by Panja et al. [54].

3. Surrogate Model

As mentioned earlier, three types of surrogate models - a Response Surface Model (RSM), a Least Square Support Vector Machine (LSSVM) model, and an Artificial Neural Networks (ANN) model - were developed and compared in this study. Simulation results in terms of oil recovery and gas oil ratio (GOR) were obtained in two ways: by recording oil recovery and GOR after certain production times and when the oil rate dropped to 5 bbl/day/fracture. In other words, five time-based models and one rate-based model were developed as summarized below:

• Time based model: Models for oil recovery and GOR at 90 days, 1 year, 5 years, 10 years and 15 years.

• Rate based model: Models for oil recovery and GOR when oil rate drops to 5 bbl/day/fracture.

All unknown parameters in the surrogate models (RSM, LSSVM and ANN) are obtained using an optimization routine known as Particle Swarm Optimization (PSO) using Matlab (MathWorks® Inc.). The same optimization routine was used for all surrogate models to eliminate any performance bias. Sometimes, unacceptable physical values such as negative oil recovery factors or gas oil ratios are obtained using surrogate models. To avoid this pitfall, logarithms of the outcomes (recovery factors and gas-oil ratio) are used to build the models. A simplified schematic of methodology used to develop the surrogate model is shown in Figure 1.
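The effect of modeling the logarithm of an outcome can be illustrated with a small sketch. The data below are synthetic and the straight-line fit is only a stand-in for a surrogate model; neither comes from the study. A model fitted to ln(Y) is transformed back with the exponential function, so its prediction is positive by construction, whereas a model fitted directly to Y may return negative values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)      # one coded input (synthetic)
y = np.exp(0.5 + 2.0 * x)                # strictly positive synthetic outcome

# Fit a straight line to Y directly and to ln(Y)
direct_fit = np.polyfit(x, y, deg=1)
log_fit = np.polyfit(x, np.log(y), deg=1)

x_new = np.linspace(-1.5, 1.5, 7)        # includes mild extrapolation
pred_direct = np.polyval(direct_fit, x_new)       # not constrained to be positive
pred_log = np.exp(np.polyval(log_fit, x_new))     # positive for any input

print("direct model minimum:", pred_direct.min())
print("log model minimum:   ", pred_log.min())
```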

Figure 1: Surrogate model development schematic

All unknown parameters are listed in Table 3. These parameters are discussed in more detail in the upcoming sections.

Table 3: Number of parameters determined in each surrogate model

| Method | Number of parameters | Optimized parameters |
|---|---|---|
| RSM | 1 intercept; 44 coefficients | All |
| LSSVM | 1 bias term (b); 1 regularization parameter (γ); 1 kernel parameter (σ); 92 support values (αi) | Regularization parameter (γ); kernel parameter (σ) |
| ANN | 15 biases (14 hidden layer + 1 output layer); 126 weights (8×14 for hidden layer + 14 for output layer) | All |

The first two models (i.e., RSM and LSSVM) are discussed in detail in a previous article [55]. Therefore, these two models are intentionally discussed only briefly here and the reader is referred to the referenced article for more details.

3.1. Response Surface Model (RSM)

The response surface model is the most common method used in many branches of engineering. Basically, an algebraic equation is fitted to develop a relationship between input and output data. During equation fitting with training data, the parameters (coefficients, intercepts, etc.) are determined through an optimization routine to minimize error. A second order polynomial equation is chosen in this study. The equation is defined as:

$$Y = a_0 + \sum_{k=1}^{8} a_k X_k + \sum_{i=1}^{8} \sum_{j=i}^{8} a_{ij} X_i X_j \qquad (1)$$

For 8 input variables, there are 8 linear coefficients, $a_k$, 36 second order and interaction coefficients, $a_{ij}$, and one intercept, $a_0$, as shown in Equation 1. A workflow to develop surrogate models (RSM and LSSVM) is shown in Figure 2.
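The term count of the second order polynomial can be checked with a short sketch. It builds the 45 columns (1 intercept, 8 linear terms, 36 squared and cross terms) for coded inputs and fits the coefficients. Note that ordinary least squares is used here purely for brevity, whereas the study obtains the coefficients with PSO; the training matrix is random and the helper name `rsm_terms` is hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def rsm_terms(X):
    """Return the design matrix [1, X_k, X_i*X_j (j >= i)] of a second order RSM."""
    X = np.atleast_2d(X)
    n_var = X.shape[1]
    cols = [np.ones(len(X))]                                   # intercept a0
    cols += [X[:, k] for k in range(n_var)]                    # linear terms a_k
    cols += [X[:, i] * X[:, j]                                 # second order terms a_ij
             for i, j in combinations_with_replacement(range(n_var), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
X_train = rng.choice([-1.0, 0.0, 1.0], size=(114, 8))   # synthetic three-level coded inputs
true_coef = rng.normal(size=45)                          # synthetic "true" coefficients
y_train = rsm_terms(X_train) @ true_coef

# Least squares stands in for the PSO search used in the study.
coef, *_ = np.linalg.lstsq(rsm_terms(X_train), y_train, rcond=None)
print(rsm_terms(X_train).shape)
```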

Figure 2: Workflow used to develop RSM and LSSVM. Modified from Panja et al. [55]

As part of the development of a model, validation is performed using test data to assess robustness. An accepted error margin is set for the surrogate model. In this fashion, surrogate models are iteratively improved until the error reaches its acceptance limit.

3.2. Least Square Support Vector Machine (LSSVM)

The Support Vector Machine (SVM) is usually used for classification and regression analysis. A modified form of SVM, namely the least square support vector machine (LSSVM), is used in this study. LSSVM is close to the SVM formulation but solves a linear system instead of a quadratic programming (QP) problem. It has been widely applied in various fields because it is easier to implement and converges quickly. On the other hand, LSSVM has an inherent tendency to overfit while minimizing error. Various combinations of training and testing sets, such as 90-10 (%), 85-15 (%), 80-20 (%), and 70-30 (%), were tried. Eventually, a data set with 80% used for training and 20% used for testing yielded the best prediction capabilities in this study. The same combination was used for the other two surrogate models (RSM and ANN). The input and output relationship in LSSVM is given by Equation 2:

$$y_j = w^T \varphi(x_j) + b, \quad \text{where } x_j \in \mathbb{R}^n \text{ and } y_j \in \mathbb{R} \qquad (2)$$

The final form of LSSVM is given by Equation 3:

$$\begin{bmatrix} 0 & 1 & \cdots & 1 \\ 1 & K(x_1, x_1) + \frac{1}{\gamma} & \cdots & K(x_1, x_N) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & K(x_N, x_1) & \cdots & K(x_N, x_N) + \frac{1}{\gamma} \end{bmatrix}_{(N+1) \times (N+1)} \begin{bmatrix} b \\ \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix} = \begin{bmatrix} 0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix} \qquad (3)$$

$K(x, x_i)$ is known as the kernel function, which is chosen a priori. The Radial Basis Function (RBF) kernel is used in this study as shown in Equation 4:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{\sigma^2}\right) \qquad (4)$$

Where,

$x_i$: Input vector of the ith data

$b$: Bias term

$\gamma$: Regularization parameter

$\sigma$: Kernel parameter

$\alpha_i$: Support values

It is evident from Equations 2, 3, and 4 that if the regularization parameter, γ, and the kernel parameter, σ, are provided, the bias term and all support values can be determined from a linear relationship. This is accomplished by using an optimization technique where initial γ and σ values are guessed and iteratively improved as described in Figure 2. For the optimization part, the training data is further divided into LSSVM training data (80%) and optimization data (20%) as shown in Figure 3.

Figure 3: Division of total data set into LSSVM training, optimization, and test data. All data (144) are divided into training data (≈80%, 114) and test data (≈20%, 30); the training data are further divided into LSSVM training data (≈80%, 91) and optimization data (≈20%, 23).

LSSVM training is over once all parameters in Table 3 are found. At this point, the model can be applied to any unknown input vector using the RBF kernel as shown in Equation 5:

$$y(x) = \sum_{i=1}^{N} \alpha_i \exp\left(-\frac{\|x - x_i\|^2}{\sigma^2}\right) + b \qquad (5)$$
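The linear solve behind the LSSVM equations can be sketched as below. This is a minimal illustration on synthetic data, assuming the RBF kernel is written with σ² in the denominator; the regularization and kernel parameters are fixed by hand here, whereas the study tunes them with PSO. The function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K(a, b) = exp(-||a - b||^2 / sigma^2) for every row pair of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2)

def lssvm_fit(X, y, gamma, sigma):
    """Solve the (N+1) x (N+1) linear system for the bias b and support values alpha."""
    n = X.shape[0]
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0                                   # first row: 0, 1, ..., 1
    M[1:, 0] = 1.0                                   # first column: 0, 1, ..., 1
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(M, rhs)
    return solution[0], solution[1:]

def lssvm_predict(X_new, X, alpha, b, sigma):
    """Evaluate y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X, sigma) @ alpha + b

rng = np.random.default_rng(0)
X_demo = rng.uniform(-1.0, 1.0, size=(40, 8))        # synthetic coded inputs
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1] ** 2
GAMMA, SIGMA = 100.0, 2.0                            # hand-picked, not optimized
b_demo, alpha_demo = lssvm_fit(X_demo, y_demo, GAMMA, SIGMA)
y_fit = lssvm_predict(X_demo, X_demo, alpha_demo, b_demo, SIGMA)
```

Because each row of the system states that the kernel prediction plus α_i/γ equals y_i, a larger γ forces the model closer to the training data, which is one way to see the overfitting tendency mentioned in the text.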

3.3. Artificial Neural Networks (ANNs)

The Artificial Neural Networks (ANNs) algorithm was developed based on human learning processes through brain and nerve networks. This is a connectionist technique where input and output are linked through neurons. The most common feed forward architecture consists of one input layer, one or more hidden layers, and one output layer as shown in Figure 4.


Figure 4: Basic structure of ANN with input, hidden, and output layers.

Links between input and output are established through the internal computations in the hidden layers. The complexity and non-linearity of the model are increased by increasing the number of hidden layers, where the individual components of a layer are known as nodes. In this study, there are nine input nodes (eight input parameters as listed in Table 2 and one bias) and one hidden layer. A weight was given to each connection between nodes and a bias term was added to each hidden and output node. The numbers of biases and weights used in this study are summarized for one output in Table 3. In the process of training the ANN model, all weights and biases are determined by minimizing the error between the predicted output and the training output via the activation function at each node. All output data are normalized as shown in Equation 6:

$$Y_{j,\text{normalized}} = \frac{Y_j - \text{Minimum}(Y_j)}{\text{Maximum}(Y_j) - \text{Minimum}(Y_j)} \qquad (6)$$

Computations in hidden nodes and output nodes are shown in Figure 5.

Figure 5: Input-to-output structure and calculations inside (a) hidden and (b) output nodes

As shown in Figure 5, computation consists of two calculations: summation and transformation through activation functions, which may be linear or non-linear. In this study, the sigmoid transfer function was used as shown in Equation 7:

$$f(x) = \frac{1}{1 + \exp(-x)} \qquad (7)$$

The sensitivity of the model to the number of hidden nodes (neurons) was also investigated. As described earlier, the non-linearity of the relationship between input and output data increases with the number of hidden nodes. However, increasing non-linearity does not always guarantee higher prediction accuracy. To find the optimum number of hidden nodes (neurons), a sensitivity study was conducted on the training and testing data for oil recovery and gas oil ratio after 5 years of production as shown in Figure 6.

Figure 6: Coefficients of determination using different numbers of hidden nodes for (a) oil recovery and (b) gas oil ratio at 5 years. The horizontal axis of both panels is the number of neurons in the first layer.

It is evident from Figure 6 that R² is close to unity for training data. On the other hand, the R² value for the test data increases initially with the number of neurons for oil recovery and gas oil ratio. A maximum R² value can be clearly identified at 14 neurons for the case of oil recovery. Therefore, 14 hidden neurons are used in this study. The ANN parameters used in this study are summarized below:

• Number of neurons in the first layer (hidden layer) = 14

• Number of neurons in the second layer (output layer) = 1

• Number of weights = Number of neurons in the first layer * Number of inputs + Number of neurons in the first layer * Number of neurons in the second layer = 8 * 14 + 14 * 1 = 126

• Number of biases = Number of neurons in the first layer + Number of neurons in the second layer = 15
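The parameter counts listed above, together with the node computations (summation followed by the sigmoid transfer function), can be sketched as a single forward pass. The sketch assumes the sigmoid is applied at both the hidden and the output nodes and packs all parameters into one flat vector, which is a convenient layout for a swarm optimizer; the weights are random and the function names are hypothetical.

```python
import numpy as np

N_INPUT, N_HIDDEN, N_OUTPUT = 8, 14, 1

def sigmoid(x):
    """Sigmoid transfer function of Equation 7."""
    return 1.0 / (1.0 + np.exp(-x))

def unpack(theta):
    """Split a flat parameter vector into weights and biases."""
    i = 0
    W1 = theta[i:i + N_INPUT * N_HIDDEN].reshape(N_INPUT, N_HIDDEN)
    i += W1.size
    W2 = theta[i:i + N_HIDDEN * N_OUTPUT].reshape(N_HIDDEN, N_OUTPUT)
    i += W2.size
    b1 = theta[i:i + N_HIDDEN]
    i += N_HIDDEN
    b2 = theta[i:i + N_OUTPUT]
    return W1, W2, b1, b2

def forward(theta, X):
    """Weighted sum plus bias, then sigmoid, at the hidden and output nodes."""
    W1, W2, b1, b2 = unpack(theta)
    hidden = sigmoid(X @ W1 + b1)
    return sigmoid(hidden @ W2 + b2).ravel()

n_weights = N_INPUT * N_HIDDEN + N_HIDDEN * N_OUTPUT   # 8*14 + 14*1
n_biases = N_HIDDEN + N_OUTPUT                         # 14 + 1
theta_demo = np.random.default_rng(0).normal(size=n_weights + n_biases)
y_out = forward(theta_demo, np.zeros((3, N_INPUT)))    # three dummy input vectors
print(n_weights, n_biases, theta_demo.size)
```

With a sigmoid at the output node the prediction lies strictly between 0 and 1, which matches the range of the normalized outputs of Equation 6.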

The unknown parameters in the ANN structure are summarized in Table 3. During training of the ANN, these 126 weights and 15 biases are determined using an optimization routine, namely Particle Swarm Optimization (PSO), which is discussed in the upcoming sections.

3.4. Goodness of Fit

There are various error measuring tools used in every branch of science and engineering. Their uses are mostly dependent on the model and purpose of the system. During the fitting portion of the model (training the model) the Mean Square Error (MSE) is set as the objective function to determine the optimized model parameters using PSO. As the minimum value of MSE is the indication of a good match between experimental or simulated values and modeled values, MSE is minimized during optimization in PSO. The MSE is calculated between experimental or simulated values and modeled values as shown in Equation 8.

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(Y_{obs,i} - Y_{model,i}\right)^2 \qquad (8)$$

$Y_{obs}$ and $Y_{model}$ are the simulated and modeled values, respectively, and n is the number of data sets. In this study, the Normalized Root Mean Square Error (NRMSE) and the coefficient of determination (R²) are adopted to measure the discrepancy between simulated data and model data. The NRMSE is used instead of the MSE to compare various models (time-based and rate-based models) on the same scale. The coefficient of determination, R², is defined as shown in Equation 9:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \qquad (9)$$

Where,

$SS_{res} = \sum_{i=1}^{n} \left(Y_{obs,i} - Y_{model,i}\right)^2$, the residual sum of squares

$SS_{tot} = \sum_{i=1}^{n} \left(Y_{obs,i} - \bar{Y}_{obs}\right)^2$, the total sum of squares

$\bar{Y}_{obs} = \frac{1}{n} \sum_{i=1}^{n} Y_{obs,i}$, the mean of observed values

The NRMSE is defined in Equation 10:

$$NRMSE = \frac{\sqrt{MSE}}{Y_{obs,max} - Y_{obs,min}} \qquad (10)$$

Where $Y_{obs,max}$ is the maximum value and $Y_{obs,min}$ is the minimum value of the observed data.

The value of R² varies from 0 to 1. R² values close to unity and small NRMSE values indicate a good fit.
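The three measures can be written compactly as below. This is a minimal sketch using the standard definitions of the residual and total sums of squares; the four observed and modeled values are made up for illustration and the function names are hypothetical.

```python
import numpy as np

def mse(y_obs, y_model):
    """Mean square error between observed and modeled values."""
    return np.mean((y_obs - y_model) ** 2)

def nrmse(y_obs, y_model):
    """Root mean square error normalized by the range of the observed data."""
    return np.sqrt(mse(y_obs, y_model)) / (y_obs.max() - y_obs.min())

def r_squared(y_obs, y_model):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_obs - y_model) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_obs = np.array([1.0, 2.0, 3.0, 4.0])     # illustrative values only
y_model = np.array([1.1, 1.9, 3.2, 3.8])
print(mse(y_obs, y_model), nrmse(y_obs, y_model), r_squared(y_obs, y_model))
```

For these values the MSE is 0.025, the NRMSE is about 0.053 (5.3% of the observed range), and R² is 0.98.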

4. Optimization Routine: Particle Swarm Optimization

Inspired by the motion of bird swarms, the Particle Swarm Optimization (PSO) routine was developed by Eberhart and Kennedy [56]. In this method, each potential solution is treated as a particle. Each particle is characterized by its position and velocity. The position of a particle is defined in a hyperspace whose dimension is equal to the number of unknown parameters being optimized as shown in Table 3. For example, in the case of ANN, particles fly in a 141-dimensional hyperspace. Several particles are initially defined in the hyperspace, where they iteratively change their positions to determine the optimum position. The fitness of a particle is determined by a fitness function such as the MSE. This algorithm is similar to the method followed by a flock of birds searching for food in a vast area.

Two solutions, pbest and gbest, at any iteration during execution of the algorithm are tracked. The local best or pbest is defined as the best position of a particle in the hyperspace as determined by the fitness value. The global best or gbest is the overall best value by any particle so far in the population. At each iteration step, the velocity is updated first and then position. Accelerating the particle towards its pbest and gbest by updating velocity is done by two separate random numbers (random 1 and random 2) as shown in Equation 11

$$v^i_{k+1} = \underbrace{w \cdot v^i_k}_{\text{inertia component}} + \underbrace{c_1 \cdot random_1 \cdot \left(pbest^i - p^i_k\right)}_{\text{cognitive component}} + \underbrace{c_2 \cdot random_2 \cdot \left(gbest - p^i_k\right)}_{\text{social component}} \qquad (11)$$

The inertia component is the weighted current velocity, the cognitive component represents the influence of the particle itself, and the social component represents the influence of the entire population. The cognitive component guides the local search from the particle's local best (pbest) and the social component is responsible for the global search depending on the population best (gbest) [57]. In Equation 11, $v^i_{k+1}$ is the velocity in the next iteration step, which is partially preserved from the current velocity by an inertia weight, w (range 0.4 to 0.9). The acceleration coefficients ($c_1$ and $c_2$) for the cognitive and social components are chosen by trial and error. The position of a particle for the next iteration step is updated by the following equation:

$$p^i_{k+1} = p^i_k + v^i_{k+1} \cdot \Delta t \qquad (12)$$

The values of w, $c_1$, $c_2$, and other parameters used in this study are given in Table 4.

Table 4: Particle swarm optimization parameters in various surrogate models

| Surrogate model | Parameters optimized | Number of particles for a single parameter | C1 | C2 | w | Maximum iteration |
|---|---|---|---|---|---|---|
| RSM | 1 intercept; 44 coefficients | 100 | 2 | 2 | 0.6 | 1000 |
| LSSVM | 1 regularization parameter (γ); 1 kernel parameter (σ) | 100 | 2 | 2 | 0.6 | 1000 |
| ANN | 15 biases; 126 weights | 100 | 2 | 2 | 0.6 | 1000 |

The initial position and velocity of each particle are randomly distributed. After the initialization of positions and velocities of all particles, fitness is calculated. In subsequent steps, positions and velocities are updated iteratively by the local best and the global best parameters as summarized in the flowchart shown in Figure 7 [58].

Figure 7: Particle Swarm Optimization flow chart. Modified from Ahmadi et al. [58]

The entire flowchart can be divided into four parts, namely, initialization, fitness evaluation, condition check, and updates of velocity and position. Acceptance of any particle as a potential solution is determined by its fitness value, which is calculated in each iteration step. As described earlier, one local best, pbest, and one global best, gbest, are recorded during each iteration. The number of iterations is only limited by time and computational constraints; hence the maximum iteration number is defined by the user.
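The four parts of the flowchart can be sketched in a few lines. This is a minimal illustration rather than the Matlab routine used in the study: it assumes a unit time step, one pair of random numbers per particle, random initialization inside user-supplied bounds, and a velocity clamp that the text does not mention. The default w, c1, and c2 follow Table 4, and the toy fitness function (MSE of a straight-line fit) and all names are hypothetical.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=100, max_iter=1000,
                 w=0.6, c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    v_max = hi - lo                                  # assumption: velocity clamp
    # 1. Initialization of positions and velocities
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = rng.uniform(-v_max, v_max, size=(n_particles, dim))
    # 2. Fitness evaluation and initial bests
    pbest = pos.copy()
    pbest_f = np.array([fitness(p) for p in pos])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    history = [gbest_f]
    # 3. Condition check: stop at the user-defined maximum iteration
    for _ in range(max_iter):
        r1 = rng.random((n_particles, 1))            # random_1, one per particle
        r2 = rng.random((n_particles, 1))            # random_2, one per particle
        # 4. Velocity update (Equation 11), then position update (Equation 12, dt = 1)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_max, v_max)
        pos = pos + vel
        f = np.array([fitness(p) for p in pos])
        improved = f < pbest_f
        pbest[improved] = pos[improved]
        pbest_f[improved] = f[improved]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
        history.append(gbest_f)
    return gbest, gbest_f, history

# Toy problem: find slope and intercept of a line by minimizing the MSE
x = np.linspace(-1.0, 1.0, 20)
y = 1.5 * x - 0.5
best, best_f, hist = pso_minimize(
    lambda p: np.mean((p[0] * x + p[1] - y) ** 2),
    dim=2, n_particles=30, max_iter=200, bounds=(-5.0, 5.0))
print(best, best_f)
```

Because gbest is only replaced when a better fitness is found, the recorded best fitness never increases from one iteration to the next, whatever the values of w, c1, and c2.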

5. Results and Discussion

Five time-based models (90 days, 1 year, 5 years, 10 years and 15 years) and one rate-based model (5 bbl/day/fracture) were trained using RSM, LSSVM and ANN. The objective of this study is to compare the performance of the three surrogate models. Production performance in terms of oil recovery and gas-oil ratio is compared with simulation data. Since all time-based models behave similarly, only two time-based models (one early production model after 90 days and a long production model after 10 years) along with one rate-based model are discussed here. The fitness of a model with training data and with test data are both discussed in this section. Once the model is trained, it is tested against an unknown data set (i.e., test data) to check for robustness of forecasting capabilities. As discussed earlier, the fitness is determined by two measures, the coefficient of determination (R²) and the normalized root mean square error (NRMSE), which are used here to compare different surrogate models.

5.1. Training

In this section, model fitness as compared with training data is evaluated and discussed. It is important to assess individual models to check for overfitting. Three surrogate models (RSM, LSSVM, and ANN) for oil recovery and gas oil ratio after 90 days of production, after 10 years of production, and at a terminal rate (5 bbl/day/fracture) are shown in Figure 8.

Figure 8: Comparison of RSM, ANN and LSSVM models using training data for (a) oil recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after oil rate drops to 5 bbl/day/fracture, (d) gas oil ratio after 90 days, (e) gas oil ratio after 10 years, and (f) gas oil ratio after oil rate drops to 5 bbl/day/fracture. The horizontal axis of each panel is the simulation value; oil recovery is in % and gas oil ratio in SCF/STB.

High model fitness with training data can be observed for all cases in Figure 8. However, capturing the production behavior using surrogate models without describing the underlying physics is a great challenge. It is difficult to model the GOR from low permeability reservoirs [59] due to its complex behavior. Higher values of GOR (obtained from the 10-year model) occur mainly when flow becomes boundary dominated. On the other hand, lower GOR values (obtained from the 90-day and 1-year models) occur when flow is in the transient linear regime. Overall, both oil recovery and GOR from surrogate models are in good agreement with simulation results. The errors are calculated in terms of R² and NRMSE as listed in Tables A.1 and A.2. For visual comparison, R² and NRMSE for RSM, ANN and LSSVM models of oil recovery are shown in Figure 9a and b, respectively.

Figure 9: Fitness of RSM, ANN and LSSVM models of oil recovery for training data: (a) coefficient of determination, R², and (b) NRMSE. The horizontal axis of each panel lists the 90 days, 1 yr., 5 yrs., 10 yrs., 15 yrs., and rate-based models.

R2 and NRMSE for oil recovery using RSM, LSSVM and ANN for all time- and rate-based models are greater than 0.95 and less than 6% respectively. These values are evidence of well-trained models.

5.2. Testing

Some models may have the tendency to overfit with training data and consequently fail to predict unseen test data with high accuracy. In this study, 20 percent of all data was used to check the forecasting capabilities of all developed models. Results for oil recovery and GOR after 90 days, 10 years production, and at terminal oil rate (5 bbl/day/fracture) are shown in Figure 10.

Figure 10: Comparison of RSM, ANN and LSSVM models using test data for (a) oil recovery after 90 days, (b) oil recovery after 10 years, (c) oil recovery after oil rate drops to 5 bbl/day/fracture, (d) gas oil ratio after 90 days, (e) gas oil ratio after 10 years, and (f) gas oil ratio after the oil rate drops to 5 bbl/day/fracture. The horizontal axis of each panel is the simulation value; oil recovery is in % and gas oil ratio in SCF/STB.

Although all models had high fitness compared with training data, they showed relatively lower fitness when compared with test data. Considering that the test data was not used during training, the models show promising forecasting capabilities without significant aberrations. R2 and NRMSE are calculated as listed in Tables A.1 and A.2. The R2 and NRMSE values for the RSM, ANN, and LSSVM oil recovery models are shown in Figures 11a and 11b, respectively.

Figure 11: Fitness of RSM, ANN and LSSVM oil recovery models for testing data (90 days, 1 year, 5 years, 10 years, 15 years and rate based): (a) coefficient of determination, R2, and (b) NRMSE

Except for a few cases, the forecast accuracy of all models is within acceptable ranges; for oil recovery, the test R2 values in Table A.1 range from 0.48 to 0.97. As shown in the figures above, RSM predicts oil recovery most accurately, followed by LSSVM. As shown in Figures 10 and 11, AI tools have the potential to predict oil recoveries and fluid ratios given a small training data set. Large amounts of completion, geological, and production data can indeed be used to train more robust models and complement current conventional tools to evaluate the potential of tight oil reservoirs. The cost, however, is that AI skips the physical description and understanding of the multiphase production mechanisms in tight formations. This cost may not be too high to pay, since the current conventional understanding of these systems may not yet be sufficiently developed. In fact, researchers have recently reported discrepancies between conventional thinking and the behavior of fluids under nanoconfinement in tight formations [60, 61].
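The ranking of the models for oil recovery can be checked directly against the test-data R2 values in Table A.1. The short sketch below copies those six values per model (90 days, 1 year, 5 years, 10 years, 15 years and rate based) and averages them; the simple average is only an illustrative summary chosen here and is not a metric reported in this study.

```python
# Test-data R2 for oil recovery, copied from Table A.1
# (order: 90 days, 1 year, 5 years, 10 years, 15 years, rate based).
test_r2_oil = {
    "RSM":   [0.69, 0.78, 0.63, 0.91, 0.97, 0.57],
    "LSSVM": [0.52, 0.69, 0.81, 0.90, 0.93, 0.54],
    "ANN":   [0.51, 0.53, 0.60, 0.72, 0.84, 0.48],
}

# Average over the six models as a rough single-number summary (an assumption).
mean_r2 = {model: sum(vals) / len(vals) for model, vals in test_r2_oil.items()}

# Sort model names from highest to lowest mean test R2.
ranking = sorted(mean_r2, key=mean_r2.get, reverse=True)

for model in ranking:
    print(f"{model}: mean test R2 = {mean_r2[model]:.3f}")
```

The averages come out in the order RSM, LSSVM, ANN, although individual cases differ: for the 5-year model, LSSVM has the highest test R2.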

6. Conclusion

Artificial intelligence tools aimed at predicting oil recovery and gas-oil ratio from hydraulically fractured tight formations can be successfully developed using simulation information as a training framework. In this study, three models were developed based on RSM, ANN, and LSSVM to predict recovery from wells producing under time-based (90 days, 1 year, 5 years, 10 years, and 15 years) and rate-based (5 bbl/day/fracture) constraints. Eight key factors, namely matrix permeability, gas relative permeability exponent, rock compressibility, initial gas-oil ratio, slope of solution gas-oil ratio versus pressure, initial pressure, flowing bottom-hole pressure, and fracture spacing, were considered as input parameters for all cases. After all models were trained with the same database, they were used to predict production for different scenarios. Using simulation as a comparison basis, all models were evaluated in terms of their predictive capabilities for oil recovery and producing gas-oil ratio. It was found that RSM and LSSVM have better predictive capabilities for oil recovery than ANN. In addition, LSSVM exhibits the highest accuracy with respect to gas-oil ratio prediction.

Field-scale modeling and simulation of hydraulically fractured ultra-low permeability reservoirs lead to very expensive computational overhead in commercial simulators. Surrogate reservoir models, on the other hand, are useful for quick oil production forecast and assessment. Additionally, these models can be used for risk and uncertainty analysis. Overall, artificial intelligence applications such as LSSVM have promising applications in various aspects of production and reservoir engineering.
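Because a surrogate evaluates almost instantly, it can be called many thousands of times inside a Monte Carlo loop. The sketch below illustrates the idea under strong assumptions: `toy_surrogate` is a hypothetical two-input polynomial standing in for a trained model, the inputs are assumed to be scaled to [-1, 1] and uniformly distributed, and none of the numbers come from this study.

```python
import numpy as np


def toy_surrogate(x):
    """Hypothetical stand-in for a trained surrogate model.

    x has shape (n, 2) with both inputs scaled to [-1, 1]; the return
    value plays the role of a predicted oil recovery.
    """
    return 10.0 + 4.0 * x[:, 0] - 2.0 * x[:, 1] + 1.5 * x[:, 0] * x[:, 1]


rng = np.random.default_rng(42)

# Sample the uncertain inputs (uniform distributions are an assumption).
samples = rng.uniform(-1.0, 1.0, size=(10_000, 2))

# One vectorized call replaces 10,000 reservoir simulations.
recovery = toy_surrogate(samples)

# Summarize the spread of outcomes with the 10th, 50th and 90th percentiles.
low, median, high = np.percentile(recovery, [10, 50, 90])
print(f"10th: {low:.2f}, 50th: {median:.2f}, 90th: {high:.2f}")
```

The same pattern applies to any of the three model types once trained, provided the sampled inputs stay within the ranges covered by the training data.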

Nomenclature

Symbol      Description                                          Units
γ           Regularization Parameter                             -
σ           Kernel Parameter                                     -
αi          Support Values                                       Unit of Output
            Mean Of Observed Values                              Unit of Output
a0          The Intercept Of The Surrogate Model                 Unit of Output
AI          Artificial Intelligence                              -
aii         Coefficient Of 2nd Order Interaction Of Inputs       -
ak          Coefficient Of Independent Input                     -
ANN         Artificial Neural Networks                           -
b           Bias Term                                            Unit of Output
BHP         Bottom Hole Pressure                                 psi
C1          Acceleration Coefficient For Cognitive Components
C2          Acceleration Coefficient For Social Components
DOE         Design Of Experiments
dRs/dp      Slope Of Gas/Oil Ratio In PVT                        (SCF/STB)/psi
gbest       Population's Best Particle's Position
GOR         Gas/Oil Ratio                                        SCF/STB
LSSVM       Least Square Support Vector Machine                  -
MSE         Mean Square Error                                    Unit of Output
ng          Exponent Of Relative Permeability Curve For Gas      -
NRMSE       Normalized Root Mean Square Error
pbest       Particle's Best Position
Pi          Initial Reservoir Pressure                           psi
PSO         Particle Swarm Optimization                          -
PVT         Pressure-Volume-Temperature                          -
R2          Coefficient Of Determination                         -
Rsi         Initial Gas/Oil Ratio                                SCF/STB
RSM         Response Surface Model                               -
SSres       Residual Sum Of Squares                              Unit of Output
SStot       Total Sum Of Squares                                 Unit of Output
SVM         Support Vector Machine                               -
vk          Velocity Of Particle                                 -
wi          Inertia Weight                                       -
Ymodel,i    Modeled Value                                        Unit of Output
Yobs,i      Observed Data                                        Unit of Output
Yobs,max    The Maximum Value Of Observed Data                   Unit of Output
Yobs,min    The Minimum Value Of Observed Data                   Unit of Output

References

[1]. Yeten, B., A. Castellini, B. Guyaguler, and W.H. Chen. A Comparison Study on Experimental Design and Response Surface Methodologies. in SPE Reservoir Simulation Symposium. The Woodlands, Texas: Society of Petroleum Engineers Inc. (2005).

[2]. Amorim, T.C.A.D. and D.J. Schiozer. Risk Analysis Speed-Up With Surrogate Models. in SPE Latin America and Caribbean Petroleum Engineering Conference. Mexico City, Mexico: Society of Petroleum Engineers (2012).

[3]. Li, B. and F. Friedmann. A Novel Response Surface Methodology Based on "Amplitude Factor" Analysis for Modeling Nonlinear Responses Caused by Both Reservoir and Controllable Factors. in SPE Annual Technical Conference and Exhibition. Dallas, Texas: Society of Petroleum Engineers (2005).

[4]. Peng, C.Y. and R. Gupta. Experimental Design in Deterministic Modelling: Assessing Significant Uncertainties. in SPE Asia Pacific Oil and Gas Conference and Exhibition. Jakarta, Indonesia: Society of Petroleum Engineers (2003).

[5]. Dejean, J.P. and G. Blanc. Managing Uncertainties on Production Predictions Using Integrated Statistical Methods. in SPE Annual Technical Conference and Exhibition. Houston, Texas: Society of Petroleum Engineers (1999).

[6]. Corre, B., P. Thore, V.d. Feraudy, and G. Vincent. Integrated Uncertainty Assessment For Project Evaluation and Risk Analysis. in SPE European Petroleum Conference. Paris, France: Society of Petroleum Engineers Inc. (2000).

[7]. Manceau, E., M. Mezghani, I. Zabalza-Mezghani, and F. Roggero. Combination of Experimental Design and Joint Modeling Methods for Quantifying the Risk Associated With Deterministic and Stochastic Uncertainties - An Integrated Test Study. in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of Petroleum Engineers Inc. (2001).

[8]. Venkataraman, R. Application of the Method of Experimental Design to Quantify Uncertainty in Production Profiles. in SPE Asia Pacific Conference on Integrated Modelling for Asset Management. Yokohama, Japan: Society of Petroleum Engineers Inc. (2000).

[9]. Chewaroungroaj, J., O.J. Varela, and L.W. Lake. An Evaluation of Procedures to Estimate Uncertainty in Hydrocarbon Recovery Predictions. in SPE Asia Pacific Conference on Integrated Modelling for Asset Management. Yokohama, Japan: Society of Petroleum Engineers Inc. (2000).

[10]. Mohaghegh, S.D. Quantifying Uncertainties Associated With Reservoir Simulation Studies Using a Surrogate Reservoir Model. in SPE Annual Technical Conference and Exhibition. San Antonio, Texas, USA: Society of Petroleum Engineers (2006).

[11]. Guyaguler, B. and R.N. Horne. Uncertainty Assessment of Well Placement Optimization. in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of Petroleum Engineers Inc. (2001).

[12]. Manceau, E., F. Roggero, and I. Zabalza-Mezghani. Use Of Experimental Design Methodology To Make Decisions In An Uncertain Reservoir Environment From Reservoir Uncertainties To Economic Risk Analysis. World Petroleum Congress (2002).

[13]. Landa, J.L. and B. Güyagüler. A Methodology for History Matching and the Assessment of Uncertainties Associated with Flow Prediction. in SPE Annual Technical Conference and Exhibition. Denver, Colorado: Society of Petroleum Engineers (2003).

[14]. Carreras, P.E., S.E. Turner, and G.T. Wilkinson. Tahiti: Development Strategy Assessment Using Design of Experiments and Response Surface Methods. in SPE Western Regional/AAPG Pacific Section/GSA Cordilleran Section Joint Meeting. Anchorage, Alaska, USA: Society of Petroleum Engineers (2006).

[15]. Yang, C., L.X. Nghiem, C. Card, and M. Bremeier. Reservoir Model Uncertainty Quantification Through Computer-Assisted History Matching. in SPE Annual Technical Conference and Exhibition. Anaheim, California, U.S.A.: Society of Petroleum Engineers (2007).

[16]. Slotte, P.A. and E. Smorgrav. Response Surface Methodology Approach for History Matching and Uncertainty Assessment of Reservoir Simulation Models. in Europec/EAGE Conference and Exhibition. Rome, Italy: Society of Petroleum Engineers (2008).

[17]. Ahmadi, M.A., R. Soleimani, and A. Bahadori, A computational intelligence scheme for prediction equilibrium water dew point of natural gas in TEG dehydration systems. Fuel, 137 (2014). p. 145-154.

[18]. Mohaghegh, S.D., J.S. Liu, R. Gaskari, M. Maysami, and O.A. Olukoko. Application of Well-Base Surrogate Reservoir Models (SRMs) to Two Offshore Fields in Saudi Arabia, Case Study. in SPE Western Regional Meeting. Bakersfield, California, USA: Society of Petroleum Engineers (2012).

[19]. Velasco, R., P. Panja, and M. Deo. New Production Performance and Prediction Tool for Unconventional Reservoirs, URTEC-2461718-MS. in Unconventional Resources Technology Conference, 1-3 August. San Antonio, Texas, USA: Unconventional Resources Technology Conference (2016).

[20]. Patzek, T.W., F. Male, and M. Marder, Gas production in the Barnett Shale obeys a simple scaling theory. Proceedings of the National Academy of Sciences, 110(49) (2013). p. 19731-19736.

[21]. Wattenbarger, R.A., A.H. El-Banbi, M.E. Villegas, and J.B. Maggard. Production Analysis of Linear Flow Into Fractured Tight Gas Wells, SPE-39931-MS. in SPE Rocky Mountain Regional/Low-Permeability Reservoirs Symposium, 5-8 April. Denver, Colorado, USA: Society of Petroleum Engineers (1998).

[22]. Nobakht, M., L. Mattar, S. Moghadam, and D.M. Anderson, Simplified Forecasting of Tight/Shale-Gas Production in Linear Flow. Journal of Canadian Petroleum Technology, 51(06) (2012). p. 11.

[23]. Al-Kaabi, A.U. and W.J. Lee, Using Artificial Neural Networks To Identify the Well Test Interpretation Model (includes associated papers 28151 and 28165). SPE Formation Evaluation, 8(03) (1993).

[24]. Juniardi, I.R. and I. Ershaghi. Complexities of Using Neural Network in Well Test Analysis of Faulted Reservoirs. Society of Petroleum Engineers (1993).

[25]. Zhou, C.D., X.-L. Wu, and J.-A. Cheng. Determining Reservoir Properties in Reservoir Studies Using a Fuzzy Neural Network. in SPE Annual Technical Conference and Exhibition. Houston, Texas: Society of Petroleum Engineers (1993).

[26]. Mohaghegh, S., R. Arefi, S. Ameri, and M.H. Hefner. A Methodological Approach for Reservoir Heterogeneity Characterization Using Artificial Neural Networks. in SPE Annual Technical Conference and Exhibition. New Orleans, Louisiana: Society of Petroleum Engineers (1994).

[27]. E. El-Sebakhy, T.S., S. Al-Bokhitan, Y. Shaaban, I. Raharja, Y. Khaeruzzaman. Support Vector Machines Framework for Predicting the PVT Properties of Crude-Oil Systems. Kingdom of Bahrain: 15th SPE Middle East Oil & Gas Show and Conference (2007).

[28]. Oloso, M., A. Khoukhi, A. Abdulraheem, and M. Elshafei. Prediction of Crude Oil Viscosity and Gas/Oil Ratio Curves Using Recent Advances to Neural Networks. in SPE/EAGE Reservoir Characterization and Simulation Conference. Abu Dhabi, UAE: Society of Petroleum Engineers (2009).

[29]. Rabiei, A., H. Sayyad, M. Riazi, and A. Hashemi, Determination of dew point pressure in gas condensate reservoirs based on a hybrid neural genetic algorithm. Fluid Phase Equilibria, 387 (2015). p. 38-49.

[30]. Ahmadi, M.A. and M. Ebadi, Evolving smart approach for determination dew point pressure through condensate gas reservoirs. Fuel, 117 (2014). p. 1074-1084.

[31]. Ahmadi, M.A., M. Ebadi, and A. Yazdanpanah, Robust intelligent tool for estimating dew point pressure in retrograded condensate gas reservoirs: Application of particle swarm optimization. Journal of Petroleum Science and Engineering, 123 (2014). p. 7-19.

[32]. Ahmadi, M.A., M. Ebadi, P.S. Marghmaleki, and M.M. Fouladi, Evolving predictive model to determine condensate-to-gas ratio in retrograded condensate gas reservoirs. Fuel, 124 (2014). p. 241-257.

[33]. Ahmadi, M.A., Neural network based unified particle swarm optimization for prediction of asphaltene precipitation. Fluid Phase Equilibria, 314 (2012). p. 46-51.

[34]. Ahmadi, M.A. and S.R. Shadizadeh, New approach for prediction of asphaltene precipitation due to natural depletion by using evolutionary algorithm concept. Fuel, 102 (2012). p. 716-723.

[35]. Ahmadi, M.A., M. Ebadi, A. Shokrollahi, and S.M.J. Majidi, Evolving artificial neural network and imperialist competitive algorithm for prediction oil flow rate of the reservoir. Applied Soft Computing, 13(2) (2013). p. 1085-1098.

[36]. Fatai Adesina Anifowose, A.A. Prediction of Porosity and Permeability of Oil and Gas Reservoirs using Hybrid Computational Intelligence Models. Cairo, Egypt: North Africa Technical Conference and Exhibition, SPE (2010).

[37]. Fatai Adesina Anifowose, A.O.E., Safiriyu Ijiyemi. Prediction of Oil and Gas Reservoir Properties using Support Vector Machines. Bangkok, Thailand: International Petroleum Technology Conference, (2011).

[38]. Al-Anazi, A.F. and I.D. Gates, Support-Vector Regression for Permeability Prediction in a Heterogeneous Reservoir: A Comparative Study. SPE Reservoir Evaluation & Engineering, 13(03) (2010).

[39]. Mohammad-Ali Ahmadi, M.R.A., Seyed Moein Hosseini, Mohammad Ebadi, Connectionist model predicts the porosity and permeability of petroleum reservoirs by means of petro-physical logs: Application of artificial intelligence. Journal of Petroleum Science and Engineering, 123 (2014). p. 183-200.

[40]. Mohammad-Ali Ahmadi, A.B., A LSSVM approach for determining well placement and conning phenomena in horizontal wells. Fuel, 153 (2015). p. 276-283.

[41]. Mohammad Ali Ahmadi, M.E., Seyed Moein Hosseini, Prediction breakthrough time of water coning in the fractured reservoirs by implementing low parameter support vector machine approach. Fuel, 117 (2014). p. 579-589.

[42]. Ahmadi, M.A., Connectionist approach estimates gas-oil relative permeability in petroleum reservoirs: Application to reservoir simulation. Fuel, 140 (2015). p. 429-439.

[43]. Eslamimanesh, A., F. Gharagheizi, M. Illbeigi, A.H. Mohammadi, A. Fazlali, and D. Richon, Phase equilibrium modeling of clathrate hydrates of methane, carbon dioxide, nitrogen, and hydrogen + water soluble organic promoters using Support Vector Machine algorithm. Fluid Phase Equilibria, 316 (2012). p. 34-45.

[44]. Reza Gholgheysari Gorjaei, R.S., Mohammad Torkaman, Mohsen Safari, Ghassem Zargar, A novel PSO-LSSVM model for predicting liquid rate of two phase flow through wellhead chokes. Journal of Natural Gas Science and Engineering, 24 (2015). p. 228-237.

[45]. Ahmadi, M.-A., M.Z. Hasanvand, and A. Bahadori, A least-squares support vector machine approach to predict temperature drop accompanying a given pressure drop for the natural gas production and processing systems. International Journal of Ambient Energy, 38(2) (2015). p. 122-129.

[46]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Connectionist Model to Monitor the Efficiency of an In Situ Combustion Process: Application to Heavy Oil Recovery. Energy Technology, 2(9-10) (2014). p. 811-818.

[47]. Ahmadi, M.-A., M. Masumi, R. Kharrat, and A.H. Mohammadi, Gas Analysis by In Situ Combustion in Heavy-Oil Recovery Process: Experimental and Modeling Studies. Chemical Engineering & Technology, 37(3) (2014). p. 409-418.

[48]. Ahmadi, M.A., M. Masoumi, and R. Askarinezhad, Evolving Smart Model to Predict the Combustion Front Velocity for In Situ Combustion. Energy Technology, 3(2) (2015). p. 128-135.

[49]. Ahmadi, M.A., M. Zahedzadeh, S.R. Shadizadeh, and R. Abbassi, Connectionist model for predicting minimum gas miscibility pressure: Application to gas injection process. Fuel, 148 (2015). p. 202-211.

[50]. Ali Ahmadi, M. and A. Ahmadi, Applying a sophisticated approach to predict CO2 solubility in brines: application to CO2 sequestration. International Journal of Low-Carbon Technologies, 11(3) (2016). p. 325-332.

[51]. Ahmadi, M.-A., A. Bahadori, and S.R. Shadizadeh, A rigorous model to predict the amount of Dissolved Calcium Carbonate Concentration throughout oil field brines: Side effect of pressure and temperature. Fuel, 139 (2015). p. 154-159.

[52]. Panja, P., T. Conner, and M. Deo, Factors Controlling Production in Hydraulically Fractured Low Permeability Oil Reservoirs. International Journal of Oil, Gas and Coal Technology, 3(1) (2015). p. 18.

[53]. Box, G.E.P. and D.W. Behnken, Some New Three Level Designs for the Study of Quantitative Variables. Technometrics, 2(4) (1960). p. 455-475.

[54]. Panja, P., T. Conner, and M. Deo, Grid sensitivity studies in hydraulically fractured low permeability reservoirs. Journal of Petroleum Science and Engineering, 112(0) (2013). p. 78-87.

[55]. Panja, P., M. Pathak, R. Velasco, and M. Deo. Least Square Support Vector Machine: An Emerging Tool for Data Analysis. in SPE Low Perm Symposium, 5-6 May. Colorado, Denver: Society of Petroleum Engineers (2016).

[56]. Eberhart, R. and J. Kennedy. A new optimizer using particle swarm theory. in Micro Machine and Human Science, 1995. MHS '95., Proceedings of the Sixth International Symposium on. (1995).

[57]. Banerjee, C. and R. Sawal. PSO with dynamic acceleration Coefficient based on Multiple Constraint Satisfaction. in International Conference on Advances in Electronics Computers and Communications. Bangalore, India (2014).

[58]. Ahmadi, M.A., R. Soleimani, M. Lee, T. Kashiwao, and A. Bahadori, Determination of oil well production performance using artificial neural network (ANN) linked to the particle swarm optimization (PSO) tool. Petroleum, 1(2) (2015). p. 118-132.

[59]. Panja, P. and M. Deo, Unusual behavior of produced gas oil ratio in low permeability fractured reservoirs. Journal of Petroleum Science and Engineering, 144 (2016). p. 76-83.

[60]. Pathak, M., H. Kweon, P. Panja, R. Velasco, and M.D. Deo. Suppression in the Bubble Points of Oils in Shales Combined Effect of Presence of Organic Matter and Confinement. in SPE Unconventional Resources Conference, 15-16 February. Calgary, Alberta, Canada: Society of Petroleum Engineers (2017).

[61]. Velasco, R., M. Pathak, P. Panja, and M. Deo. What Happens to Permeability at the Nanoscale? A Molecular Dynamics Simulation Study. in SPE/AAPG/SEG Unconventional Resources Technology Conference, 24-26 July. Austin, Texas, USA: Unconventional Resources Technology Conference (2017).

Appendix A: Supplementary Information

Table A.1: Coefficient of determination (R2) of RSM, LSSVM and ANN for all models

Output          Model         Training Data             Test Data
                              RSM     LSSVM   ANN       RSM     LSSVM   ANN
Oil Recovery    90 days       0.99    0.99    0.96      0.69    0.52    0.51
Oil Recovery    1 year        0.98    0.99    0.98      0.78    0.69    0.53
Oil Recovery    5 years       0.99    0.99    0.99      0.63    0.81    0.60
Oil Recovery    10 years      0.99    0.99    0.98      0.91    0.90    0.72
Oil Recovery    15 years      0.99    0.99    0.99      0.97    0.93    0.84
Oil Recovery    Rate Based    0.98    0.98    0.99      0.57    0.54    0.48
Gas Oil Ratio   90 days       0.98    0.99    0.95      0.92    0.84    0.80
Gas Oil Ratio   1 year        0.98    0.98    0.96      0.93    0.91    0.90
Gas Oil Ratio   5 years       0.98    0.98    0.97      0.41    0.73    0.30
Gas Oil Ratio   10 years      0.88    0.92    0.93      0.76    0.73    0.46
Gas Oil Ratio   15 years      0.83    0.77    0.84      0.79    0.75    0.32
Gas Oil Ratio   Rate Based    0.84    0.88    0.92      0.68    0.45    0.43

Table A.2: Normalized Root Mean Square Error (NRMSE, %) of RSM, LSSVM and ANN for all models

Output          Model         Training Data             Test Data
                              RSM     LSSVM   ANN       RSM     LSSVM   ANN
Oil Recovery    90 days       1.9     1.9     3.5       16.5    20.3    20.7
Oil Recovery    1 year        2.4     2.3     2.5       12.4    14.7    18.1
Oil Recovery    5 years       2.0     1.9     2.1       16.1    11.5    16.7
Oil Recovery    10 years      1.9     1.7     2.6       7.9     8.5     14.0
Oil Recovery    15 years      2.7     2.4     2.1       4.9     7.3     10.8
Oil Recovery    Rate Based    3.5     3.3     2.4       20.7    21.2    22.6
Gas Oil Ratio   90 days       2.6     2.0     4.6       8.7     11.8    13.3
Gas Oil Ratio   1 year        3.3     3.3     4.2       7.9     9.3     9.7
Gas Oil Ratio   5 years       3.0     3.1     3.7       24.0    16.1    26.2
Gas Oil Ratio   10 years      5.7     4.6     4.3       16.1    17.2    24.3
Gas Oil Ratio   15 years      5.8     6.8     5.6       14.4    15.5    25.7
Gas Oil Ratio   Rate Based    5.2     4.5     3.8       14.1    18.4    18.8

Figure A.1: Training data comparison of RSM, ANN and LSSVM models with simulation results for (a) oil recovery (%) after 1 year, (b) oil recovery (%) after 5 years, (c) oil recovery (%) after 15 years, (d) gas-oil ratio (SCF/STB) after 1 year, (e) gas-oil ratio (SCF/STB) after 5 years, and (f) gas-oil ratio (SCF/STB) after 15 years

Table A.3: List of simulations using Box-Behnken DOE used to train surrogate models

Sr. No. Km (nD) ng (-) Cf (1/psi) dRs/dp ((SCF/STB)/psi) Rsi (SCF/STB) Pi (psi) Pwf (psi) Xf (ft.)

1 10 1 4.00E-05 0.65 1900 5250 1000 180

2 10 3 4.00E-05 0.65 1900 5250 1000 180

3 5000 1 4.00E-05 0.65 1900 5250 1000 180

4 5000 3 4.00E-05 0.65 1900 5250 1000 180

5 10 2 4.00E-06 0.65 1900 5250 1000 180

6 10 2 4.00E-04 0.65 1900 5250 1000 180

7 5000 2 4.00E-06 0.65 1900 5250 1000 180

8 5000 2 4.00E-04 0.65 1900 5250 1000 180

9 10 2 4.00E-05 0.5 1900 5250 1000 180

10 10 2 4.00E-05 0.8 1900 5250 1000 180

11 5000 2 4.00E-05 0.5 1900 5250 1000 180

12 5000 2 4.00E-05 0.8 1900 5250 1000 180

13 10 2 4.00E-05 0.65 800 5250 1000 180

14 10 2 4.00E-05 0.65 3000 5250 1000 180

15 5000 2 4.00E-05 0.65 800 5250 1000 180

16 5000 2 4.00E-05 0.65 3000 5250 1000 180

17 10 2 4.00E-05 0.65 1900 4000 1000 180

18 10 2 4.00E-05 0.65 1900 6500 1000 180

19 5000 2 4.00E-05 0.65 1900 4000 1000 180

20 5000 2 4.00E-05 0.65 1900 6500 1000 180

21 10 2 4.00E-05 0.65 1900 5250 500 180

22 10 2 4.00E-05 0.65 1900 5250 1500 180

23 5000 2 4.00E-05 0.65 1900 5250 500 180

24 5000 2 4.00E-05 0.65 1900 5250 1500 180

25 10 2 4.00E-05 0.65 1900 5250 1000 60

26 10 2 4.00E-05 0.65 1900 5250 1000 300

27 5000 2 4.00E-05 0.65 1900 5250 1000 60

28 5000 2 4.00E-05 0.65 1900 5250 1000 300

29 225 1 4.00E-06 0.65 1900 5250 1000 180

30 225 1 4.00E-04 0.65 1900 5250 1000 180

31 225 3 4.00E-06 0.65 1900 5250 1000 180

32 225 3 4.00E-04 0.65 1900 5250 1000 180

33 225 1 4.00E-05 0.5 1900 5250 1000 180

34 225 1 4.00E-05 0.8 1900 5250 1000 180

35 225 3 4.00E-05 0.5 1900 5250 1000 180

36 225 3 4.00E-05 0.8 1900 5250 1000 180

37 225 1 4.00E-05 0.65 800 5250 1000 180

38 225 1 4.00E-05 0.65 3000 5250 1000 180

39 225 3 4.00E-05 0.65 800 5250 1000 180

40 225 3 4.00E-05 0.65 3000 5250 1000 180

41 225 1 4.00E-05 0.65 1900 4000 1000 180

42 225 1 4.00E-05 0.65 1900 6500 1000 180

43 225 3 4.00E-05 0.65 1900 4000 1000 180

44 225 3 4.00E-05 0.65 1900 6500 1000 180

45 225 1 4.00E-05 0.65 1900 5250 500 180

46 225 1 4.00E-05 0.65 1900 5250 1500 180

47 225 3 4.00E-05 0.65 1900 5250 500 180

48 225 3 4.00E-05 0.65 1900 5250 1500 180

49 225 1 4.00E-05 0.65 1900 5250 1000 60

50 225 1 4.00E-05 0.65 1900 5250 1000 300

51 225 3 4.00E-05 0.65 1900 5250 1000 60

52 225 3 4.00E-05 0.65 1900 5250 1000 300

53 225 2 4.00E-06 0.5 1900 5250 1000 180

54 225 2 4.00E-06 0.8 1900 5250 1000 180

55 225 2 4.00E-04 0.5 1900 5250 1000 180

56 225 2 4.00E-04 0.8 1900 5250 1000 180

57 225 2 4.00E-06 0.65 800 5250 1000 180

58 225 2 4.00E-06 0.65 3000 5250 1000 180

59 225 2 4.00E-04 0.65 800 5250 1000 180

60 225 2 4.00E-04 0.65 3000 5250 1000 180

61 225 2 4.00E-06 0.65 1900 4000 1000 180

62 225 2 4.00E-06 0.65 1900 6500 1000 180

63 225 2 4.00E-04 0.65 1900 4000 1000 180

64 225 2 4.00E-04 0.65 1900 6500 1000 180

65 225 2 4.00E-06 0.65 1900 5250 500 180

66 225 2 4.00E-06 0.65 1900 5250 1500 180

67 225 2 4.00E-04 0.65 1900 5250 500 180

68 225 2 4.00E-04 0.65 1900 5250 1500 180

69 225 2 4.00E-06 0.65 1900 5250 1000 60

70 225 2 4.00E-06 0.65 1900 5250 1000 300

71 225 2 4.00E-04 0.65 1900 5250 1000 60

72 225 2 4.00E-04 0.65 1900 5250 1000 300

73 225 2 4.00E-05 0.5 800 5250 1000 180

74 225 2 4.00E-05 0.5 3000 5250 1000 180

75 225 2 4.00E-05 0.8 800 5250 1000 180

76 225 2 4.00E-05 0.8 3000 5250 1000 180

77 225 2 4.00E-05 0.5 1900 4000 1000 180

78 225 2 4.00E-05 0.5 1900 6500 1000 180

79 225 2 4.00E-05 0.8 1900 4000 1000 180

80 225 2 4.00E-05 0.8 1900 6500 1000 180

81 225 2 4.00E-05 0.5 1900 5250 500 180

82 225 2 4.00E-05 0.5 1900 5250 1500 180

83 225 2 4.00E-05 0.8 1900 5250 500 180

84 225 2 4.00E-05 0.8 1900 5250 1500 180

85 225 2 4.00E-05 0.5 1900 5250 1000 60

86 225 2 4.00E-05 0.5 1900 5250 1000 300

87 225 2 4.00E-05 0.8 1900 5250 1000 60

88 225 2 4.00E-05 0.8 1900 5250 1000 300

89 225 2 4.00E-05 0.65 800 4000 1000 180

90 225 2 4.00E-05 0.65 800 6500 1000 180

91 225 2 4.00E-05 0.65 3000 4000 1000 180

92 225 2 4.00E-05 0.65 3000 6500 1000 180

93 225 2 4.00E-05 0.65 800 5250 500 180

94 225 2 4.00E-05 0.65 800 5250 1500 180

95 225 2 4.00E-05 0.65 3000 5250 500 180

96 225 2 4.00E-05 0.65 3000 5250 1500 180

97 225 2 4.00E-05 0.65 800 5250 1000 60

98 225 2 4.00E-05 0.65 800 5250 1000 300

99 225 2 4.00E-05 0.65 3000 5250 1000 60

100 225 2 4.00E-05 0.65 3000 5250 1000 300

101 225 2 4.00E-05 0.65 1900 4000 500 180

102 225 2 4.00E-05 0.65 1900 4000 1500 180

103 225 2 4.00E-05 0.65 1900 6500 500 180

104 225 2 4.00E-05 0.65 1900 6500 1500 180

105 225 2 4.00E-05 0.65 1900 4000 1000 60

106 225 2 4.00E-05 0.65 1900 4000 1000 300

107 225 2 4.00E-05 0.65 1900 6500 1000 60

108 225 2 4.00E-05 0.65 1900 6500 1000 300

109 225 2 4.00E-05 0.65 1900 5250 500 60

110 225 2 4.00E-05 0.65 1900 5250 500 300

111 225 2 4.00E-05 0.65 1900 5250 1500 60

112 225 2 4.00E-05 0.65 1900 5250 1500 300

113 225 2 4.00E-05 0.65 1900 5250 1000 180

114 225 2 4.00E-05 0.65 1900 5250 1000 180
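The run list in Table A.3 appears to follow a regular pattern: for every pair of the eight factors, the two factors take their low and high levels in all four combinations while the remaining six stay at their center levels, and two center-point runs close the list. The sketch below regenerates a design with that pattern. The level values are copied from Table A.3, while the dictionary keys and the function name are labels chosen here for illustration; this is not the authors' code.

```python
import itertools

# (low, center, high) levels for each factor, copied from Table A.3.
levels = {
    "Km_nD":       (10, 225, 5000),
    "ng":          (1, 2, 3),
    "Cf_per_psi":  (4.0e-6, 4.0e-5, 4.0e-4),
    "dRs_dp":      (0.5, 0.65, 0.8),
    "Rsi_scf_stb": (800, 1900, 3000),
    "Pi_psi":      (4000, 5250, 6500),
    "Pwf_psi":     (500, 1000, 1500),
    "Xf_ft":       (60, 180, 300),
}


def pairwise_three_level_design(levels, n_center=2):
    """Vary two factors at a time between low and high; others at center."""
    names = list(levels)
    center = [levels[name][1] for name in names]
    runs = []
    for i, j in itertools.combinations(range(len(names)), 2):
        # Index 0 is the low level and index 2 the high level.
        for a, b in itertools.product((0, 2), repeat=2):
            run = list(center)
            run[i] = levels[names[i]][a]
            run[j] = levels[names[j]][b]
            runs.append(tuple(run))
    # Append the replicated center-point runs.
    runs.extend(tuple(center) for _ in range(n_center))
    return names, runs


names, runs = pairwise_three_level_design(levels)
print(len(runs))   # 28 factor pairs x 4 combinations + 2 center points
print(runs[0])     # first run: Km and ng both at their low levels
```

With eight factors there are 28 factor pairs, so the design has 28 x 4 + 2 = 114 runs, the same count as Table A.3.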

Table A.4: List of simulations performed using random input parameter values used to test the surrogate models

Sr. No. Km (nD) ng (-) Cf (1/psi) dRs/dp ((SCF/STB)/psi) Rsi (SCF/STB) Pi (psi) Pwf (psi) Xf (ft.)

1 1496 1.17 1.0E-05 0.62 2069 5738 743 300

2 361 1.27 3.8E-05 0.67 1456 4170 1417 180

3 31 1.35 8.0E-06 0.58 1625 4637 769 60

4 44 1.78 2.0E-05 0.59 1724 4560 1266 60

5 2475 2.66 1.9E-05 0.69 2408 5670 689 300

6 12 2.61 1.4E-05 0.58 2820 6111 787 60

7 210 1.12 2.0E-05 0.75 1775 4861 591 180

8 28 1.80 1.9E-05 0.79 2969 5951 1076 60

9 4390 2.05 6.0E-06 0.72 2531 5688 1183 300

10 840 1.83 5.4E-06 0.60 1460 4017 1047 300

11 225 2.31 4.0E-05 0.68 2106 5505 926 180

12 187 2.26 5.9E-06 0.53 1689 4967 1144 180

13 14 1.58 4.3E-06 0.77 2779 6290 1148 60

14 694 1.86 1.5E-05 0.76 1539 4003 1179 300

15 13 1.03 3.0E-05 0.75 1273 5156 1136 60

16 16 2.97 1.9E-05 0.58 1413 5061 1445 60

17 256 1.33 6.2E-06 0.68 1323 5152 709 180

18 18 1.21 9.4E-06 0.51 2281 5925 y~09 60

19 1618 1.74 1.2E-05 0.63 1552 4806 736 300

20 1612 1.40 3.8E-05 0.59 2527 5962 619 300

21 892 1.98 5.7E-06 0.55 1566 5178 1107 300

22 25 1.68 2.9E-05 0.55 1900 4089 950 60

23 604 2.90 1.8E-05 0.63 1614 4440 959 300

24 251 2.84 9.5E-06 0.53 2944 5804 1162 180

25 4237 1.11 6.2E-06 0.68 1710 5184 1270 300

26 565 2.48 1.1E-05 0.64 1966 4382 850 300

27 1448 1.54 1.2E-05 0.71 1393 4853 1162 300

28 168 1.85 5.3E-06 0.71 2676 5518 916 180

29 147 2.10 1.6E-05 0.69 1431 4479 1342 180

30 1692 2.89 6.7E-06 0.51 2672 5846 1333 300