South African Journal of Chemical Engineering IChemE

journal homepage: http://www.journals.elsevier.com/south-african-journal-of-chemical-engineering


Extraction, analysis and desaturation of gmelina seed oil using different soft computing approaches

F. Chigozie Uzoh*, D. Okechukwu Onukwuli

Chemical Engineering Department, Faculty of Engineering, Nnamdi Azikiwe University, Awka, Nigeria

ARTICLE INFO

ABSTRACT

Article history: Received 3 March 2016; received in revised form 15 July 2016; accepted 31 July 2016

Keywords:

Gmelina seed oil

Response surface methodology

Artificial neural network

Genetic algorithm

Levenberg-Marquardt algorithm

Optimization

An Artificial Neural Network (ANN)-Genetic Algorithm (GA) interface and Response Surface Methodology (RSM) were compared as tools for simulating and optimizing the gmelina seed oil extraction process. A multi-layer feed-forward network trained with the Levenberg-Marquardt back-propagation algorithm was used to develop a predictive model, which was then optimized using GA. The Design-Expert simulation and optimization tools were used for a detailed simulation and optimization of the same process by RSM. Oil yield increased with temperature, time and solvent volume but decreased with increasing seed particle size. Numerical optimization gave a maximum oil yield of 49.2% by RSM at the optimum conditions of 60 °C, 60 min extraction time, 150 µm seed particle size and 150 ml solvent volume, and 49.8% by ANN-GA at 40 °C, 40 min extraction time, 200 µm seed particle size and 100 ml solvent volume. The prediction accuracy of both models was more than 95%. Model validation experiments showed that the predicted and actual values were in close agreement. The extract was analyzed for its physico-chemical properties (acid value, iodine value, peroxide value, viscosity, saponification value, moisture and ash content, refractive index, smoke, flash and fire points and specific gravity), and its structure was elucidated by standard methods and instrumental techniques. The results revealed that the oil is non-drying and edible. Desaturation of the oil further revealed its potential in alkyd resin synthesis.

© 2016 The Authors. Published by Elsevier B.V. on behalf of Institution of Chemical Engineers. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

High costs, environmental impact and the depletion of fossil resources are the main reasons the research community is searching for alternative raw materials in different industrial fields. With the increase in world demand for oil and the challenge of expanding the existing oil supply for human consumption and industrial use (Basumatary et al., 2012), there is a need to utilise less expensive, non-edible oils in the synthesis of resins, biodiesel and other products in order to remain competitive in such industries. One such product, which offers desirable results in terms of cost, renewability, biodegradability and non-edibility, is gmelina seed oil (GSO).

Gmelina seed oil has been found to be a sustainable material for biodiesel and alkyd resin synthesis in terms of its availability and renewability. Gmelina seed oil based biodiesel has been produced with two criteria in mind: the biodiesel met all the technical and industrial standards of ASTM D6751 and EN 14214, and it met all the ecologically relevant standards (Basumatary et al., 2012; Sangay et al., 2014). However, there has been doubt over the sustained use of gmelina seed oil for large scale production of biodiesel and alkyd resin because of low yields. Most methods of oil extraction appear to be very costly because some inherent factors cannot be controlled. A lot of research has been carried out to find alternative ways of producing oil for the process and food industries. Almost all seeds have been found to contain oil, which gives researchers grounds to study the possible uses of other oil-producing materials found in everyday life. There are various ways of extracting oil from oilseeds, but solvent extraction has been reported to be the most efficient technique (Topallar and Gecgel, 2000). There is therefore a need for process industries to optimize current methods of extraction, thereby improving the profitability of production and ensuring a sufficient supply of oil.

* Corresponding author. E-mail address: cf.uzoh@unizik.edu.ng (F.C. Uzoh). http://dx.doi.org/10.1016/j.sajce.2016.07.001

1026-9185/© 2016 The Authors. Published by Elsevier B.V. on behalf of Institution of Chemical Engineers.

Gmelina arborea is a fast-growing tree that grows in different localities and prefers moist, fertile valleys; it attains a moderate to large height of up to 40 m and a diameter of up to 140 cm (Tewari, 1995). It occurs naturally throughout the greater part of India at altitudes up to 1500 m. It also occurs naturally in Myanmar, Thailand, Laos, Cambodia, Vietnam and the southern provinces of China, and has been planted extensively in Sierra Leone, Nigeria and Malaysia (Adegbehin et al., 1988). Gmelina seeds are already known to yield oil (Adeyeye, 1971), which is useful information for researchers seeking alternative sources of oil. The suitability of the oil for a given application depends on its constituents, its composition, the rate of production and the availability of processing technology. The study of these constituents is therefore important for their effective use. In a previous work (Uzoh and Onukwuli, 2014), we reported the kinetics and optimization of gmelina seed oil extraction using response surface methods, in which an optimal yield of 49.90% was predicted.

Mathematical and statistical techniques such as Response Surface Methodology (RSM) and Artificial Neural Networks (ANN) can assist in analyzing experimental data, finding optimum conditions and predicting results. Statistical design, first applied in industry in the 1930s, was promoted by Box and Wilson (1951) with the development of RSM. RSM can be defined as a group of mathematical and statistical techniques used to model and analyze data in which the response of interest is affected by several factors, with the aim of obtaining an optimum response (Montgomery, 2005). Once a system is fully characterized, the objective of process design may be pursued further by formulating a model that gives a rough local approximation to the actual surface. Such predictive models are usually developed from designed experiments to provide analytical solutions for the desired response, obviating the need to experiment in an ad hoc manner in search of the optimal setup. RSM and other designs spread throughout several industries over the next 30 years. The use of experimental design was expanded by Genichi Taguchi and others (Taguchi and Wu, 1980; Taguchi, 1987, 1991). However, the Robust Parameter Design (RPD) approach initially proposed by Taguchi caused a lot of controversy among statisticians (Wagoner, 1998). With the emergence of RSM, many efficient approaches that can effectively handle RPD problems are available (Montgomery, 2005).

Most traditional optimization techniques based on gradient methods can become trapped at a local optimum, depending on the degree of non-linearity and the initial guess. Hence, they do not guarantee the global optimum and have limited application. Non-traditional search and optimization methods based on natural phenomena, namely neural networks and evolutionary computation (simulated annealing, genetic algorithm and differential evolution), have been developed to overcome this problem (Babu, 2004). An artificial neural network (ANN) is a highly simplified model of the structure of a biological network (Mandal et al., 2009). The fundamental processing element of an ANN is an artificial neuron (or simply a neuron). A biological neuron receives inputs from other sources, combines them, performs a generally non-linear operation on the result, and then outputs the final result (Bas et al., 2007). The basic advantage of ANN is that it does not need any mathematical model, since an ANN learns from examples and recognizes patterns in a series of input and output data without any prior assumptions about their nature and interrelations (Mandal et al., 2009). ANN is a good alternative to conventional empirical modeling based on polynomial and linear regressions (Kose, 2008).

In recent years, genetic algorithms have been successfully applied in a variety of fields where optimization in the presence of complicated objective functions and constraints abounds. GAs are widely used because of their global search ability and independence of the initial value. A genetic algorithm (GA) is a stochastic general search method which proceeds iteratively by generating new populations of individuals from the old ones. GA applies stochastic operators such as selection, crossover and mutation to an initially random population in order to compute a new population (Holland, 1975). The search feature of the GA contrasts with those of gradient descent and Levenberg-Marquardt (LM) in that it is not trajectory-driven but population-driven. The GA is expected to avoid local optima frequently by promoting exploration of the search space, in opposition to the exploitative trend usually attributed to local search algorithms like gradient descent or LM (Ghaffari et al., 2006).

The central focus of the current research is to develop predictive models for gmelina seed oil extraction using RSM and ANN, to optimize the process through RSM and the ANN-GA interface, and to characterize and desaturate the extract in order to examine its applicability as a surface coating raw material. In the gmelina seed oil extraction process, where the performance of the system is described by many process variables, a proper optimization would seek a balanced trade-off among the desirable output responses. RSM and ANN-GA are suitable for this objective. Although Yasin et al. (2014) used the ANN-GA method for the removal of lead ions from aqueous solutions using intercalated tartrate-Mg-Al layered double hydroxides, it has not yet received much attention. Specifically, the ANN-GA combination has not yet been used for modeling and optimization of any oil extraction process. Gmelina seed is particularly suitable in that it is not edible, is widely available, will not interfere with the food chain and can serve as a sustainable material for energy.

2. Materials and methods

2.1. Materials

The gmelina fruits were collected locally from a forest in the ministry of forest reserve, Anambra State, Nigeria. They were soaked in water for eight days so that the fruit pulp could be separated easily from the seed (de-pulping). The seeds were sun-dried and crushed mechanically using a Corona blender; the crushed samples were then separated into different particle sizes using laboratory test sieves (150 µm, 300 µm, 600 µm, 850 µm and 1 mm). The samples were then dried in a Memmert oven, stored in airtight containers and labeled adequately.

Ethanol, sodium hydroxide, potassium hydroxide and hydrochloric acid were obtained from BDH Chemicals Ltd., Poole, England. Petroleum ether, diethyl ether, phenolphthalein indicator, glacial acetic acid, chloroform, distilled water, carbon tetrachloride and Wijs iodine solution were purchased from Merck Chemicals, Germany. Potassium iodide (KI) and sodium thiosulphate were obtained from M&B, England. The organic solvent used for the oil extraction was n-hexane. All the reagents were commercial grade and were used without further purification.

2.2. Methods

Ten grams of ground meal was extracted with n-hexane. The extraction temperature was varied from 20 °C to the boiling point of the solvent, while the extraction time was varied between 5 and 60 min. The solvent to solid ratio was investigated from 2:1 to 5:1 and the particle size was varied from 150 µm to 1000 µm. At the end of the extraction, the miscella was filtered under vacuum (Millipore glass base and funnel) to remove suspended solids. Subsequently, the solvent was separated from the oil using a rotary vacuum evaporator (Laborota 4000) and collected in the receiving flask. The oil remaining in the sample flask was weighed after the process was completed. The percentage oil yield was calculated by dividing the mass of oil obtained by the mass of the seed sample and multiplying by 100:

$$Y = \frac{W_0}{W} \times 100 \tag{1}$$

where Y is the oil yield (%), W_0 is the weight of pure oil extracted (g) and W is the weight of the gmelina seed sample used in the experiment (g).

The extract was chemically modified (desaturated) according to the method described by Odetoye et al. (2011).

2.3. Analysis of gmelina seed oil

The oil was analyzed with a Thermo Finnigan Trace GC/Trace DSQ/A1300 (E.I. quadrupole) equipped with an SGE-BPX5 MS fused silica capillary column (film thickness 0.25 µm) for GC-MS detection; an electron ionization system with an ionization energy of 700 eV was used. The carrier gas was helium at a flow rate of 10 mL/min; the injector and MS transfer line temperatures were set at 220 °C and 290 °C, respectively. The oven temperature was programmed from 50 °C to 150 °C at 3 °C/min, held isothermal for 100 min, and then raised to 250 °C at 10 °C/min. Diluted samples (1/100, v/v, in methylene chloride) of 1.00 mL were injected manually in the splitless mode. Individual components were identified by comparing their relative retention times with those of authentic samples on the SGE-BPX5 capillary column, and by matching their mass spectra with those of authentic samples and/or the Wiley 7N and TRLIB library spectra and published data. The chemical composition of the oil was also confirmed with a SHIMADZU FTIR-8400S. Viscosity was determined with a Brookfield viscometer, RVT model (spindle #3, 20 rpm). The physico-chemical properties of the extract were determined by standard methods (ASTM, 1973).

2.4. Statistical modeling

2.4.1. RSM modeling

In this study, curvature was observed in the response surface of the oil yield. Since a first-order equation was not adequate, a second-order polynomial, equation (2), was used to approximate the portion of the yield surface with curvature. The experimental design was a full-factorial central composite rotatable design (CCRD), which keeps the number of experimental runs small. The design allows the full effects of the four (4) factors to be characterized at five (5) distinct settings. The design matrix, written in coded notation, comprises sixteen (16) runs at the 2^4 full-factorial points, eight (8) runs at the unique axial points and six (6) replicate runs at the center point, giving a total of thirty experimental runs. Studying the effects of these process variables on the response analytically requires fitting an appropriate predictive model, usually obtained by regression analysis of the resulting data, on which the detailed statistical analysis and optimization are based.

$$Y = b_0 + \sum_{i=1}^{k} b_i X_i + \sum_{i=1}^{k-1}\sum_{j=i+1}^{k} b_{ij} X_i X_j + \sum_{i=1}^{k} d_{ii} X_i^2 + \varepsilon \tag{2}$$

Y is the measured system response (oil yield) of the gmelina seed oil extraction process. The first term on the right-hand side, b_0, is the model intercept; the second term characterizes the main linear effects of the individual process variables; the third term incorporates the interaction effects between the variables; the fourth term represents the main quadratic effects; and ε is the random experimental error. X_i and X_j denote the process variables and k is the number of factors (k = 4).

Equation (2) serves as the global predictive equation from which specific solutions may be derived. The unknown coefficients b_0, b_i, b_ij and d_ii were determined by regression analysis in the statistical software Design-Expert Version 9.1.7.1 (trial) from Stat-Ease Inc., using the data recorded in the investigation. The coded values of the independent variables for the design of the gmelina extraction experiments are given in Table 1. For statistical analysis, the variables X_i (i = 1, 2, ..., 4) were coded A, B, C and D. The determination of the unknown coefficients of equation (2) uses the design matrix of Table 2, formulated by judicious transformation of the

Table 1 - Experimental range and levels of the independent variables for gmelina seed oil extraction.

| Independent variable | -α | -1 | 0 | +1 | +α |
|---|---|---|---|---|---|
| Particle size (µm) (A) | 150 | 300 | 600 | 750 | 1000 |
| Temperature (°C) (B) | 20 | 30 | 40 | 50 | 60 |
| Volume of solvent (ml) (C) | 50 | 75 | 100 | 125 | 150 |
| Time (min) (D) | 10 | 20 | 30 | 40 | 60 |

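Table 2 - CCRD matrix for GSO extraction process (independent variables in coded levels).

| Run | A (µm) | B (°C) | C (ml) | D (min) | Response Y (%) |
|---|---|---|---|---|---|
| 1 | -1 | 1 | 1 | 1 | 31.79 |
| 2 | 1 | 1 | -1 | 1 | 17.67 |
| 3 | 0 | 0 | 0 | 0 | 31.67 |
| 4 | 0 | 0 | 2 | 0 | 43.56 |
| 5 | 1 | -1 | -1 | 1 | 20.14 |
| 6 | 0 | 0 | 0 | 0 | 31.62 |
| 7 | 1 | 1 | -1 | -1 | 15.43 |
| 8 | 0 | -2 | 0 | 0 | 21.54 |
| 9 | 1 | -1 | -1 | -1 | 17.75 |
| 10 | -1 | 1 | -1 | -1 | 32.76 |
| 11 | 0 | 0 | -2 | 0 | 20.81 |
| 12 | 1 | -1 | 1 | -1 | 19.66 |
| 13 | 0 | 0 | 0 | 0 | 31.65 |
| 14 | -1 | -1 | -1 | 1 | 20.55 |
| 15 | 1 | -1 | 1 | 1 | 22.68 |
| 16 | 1 | 1 | 1 | 1 | 29.61 |
| 17 | -1 | -1 | 1 | -1 | 28.65 |
| 18 | 0 | 0 | 0 | 0 | 31.65 |
| 19 | -1 | 1 | 1 | -1 | 52.09 |
| 20 | 1 | 1 | 1 | -1 | 27.02 |
| 21 | 0 | 2 | 0 | 0 | 47.25 |
| 22 | 0 | 0 | 0 | 2 | 29.25 |
| 23 | 2 | 0 | 0 | 0 | 19.54 |
| 24 | 0 | 0 | 0 | -2 | 23.87 |
| 25 | -1 | 1 | -1 | 1 | 29.11 |
| 26 | -2 | 0 | 0 | 0 | 50.82 |
| 27 | -1 | 1 | 1 | 1 | 54.60 |
| 28 | 0 | 0 | 0 | 0 | 31.65 |
| 29 | -1 | -1 | -1 | -1 | 17.93 |
| 30 | 0 | 0 | 0 | 0 | 31.64 |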

actual values of the four control variables, at the various levels over which the experiments were executed, to their coded equivalents, using -1 and +1 to designate the low and high factor settings and ±α and 0 for the axial and center points, respectively. Working with coded variables simplifies the matrix transformation and helps in the analysis of results. A suitable value of α is obtained from the general expression α = f^(1/4) (= 2), where f is the number of factorial points (= 16), or, for two-level designs with n factors (n = 4), α = 2^(n/4). The levels given in Table 1 were used to formulate the design matrix of Table 2, from which further analyses were derived. Y is the response (oil yield) across the various experimental runs. Equation (2) was fitted to the experimental data presented in Table 2 to obtain the final predictive equation for the extraction process in terms of the coded variables.
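The construction of the coded design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ccrd_design` is hypothetical, the runs are generated in a fixed order rather than the randomized run order of Table 2, and the rotatability rule α = f^(1/4) is the one quoted in the text.

```python
from itertools import product


def ccrd_design(n_factors=4, n_center=6):
    """Return (alpha, runs) for a rotatable central composite design in coded units.

    Hypothetical helper for illustration; not the Design-Expert routine used in the study.
    """
    n_factorial = 2 ** n_factors      # 16 factorial points for four factors
    alpha = n_factorial ** 0.25       # rotatability condition: alpha = f**(1/4)

    # Factorial portion: every combination of the low (-1) and high (+1) levels.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]

    # Axial portion: one factor at -alpha or +alpha, all others at the center.
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(point)

    # Replicated center points.
    center = [[0.0] * n_factors for _ in range(n_center)]
    return alpha, factorial + axial + center


alpha, runs = ccrd_design()
print(f"alpha = {alpha:.3f}, total runs = {len(runs)}")
```

With four factors and six center replicates the sketch yields the 16 + 8 + 6 = 30 runs stated in Section 2.4.1.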

2.5. Neural network modeling (ANN)

2.5.1. Data set

The experimental data employed in the ANN design are presented in Table 3. The thirty experimental data points were randomly divided into three sets: 20, 5 and 5 points were used for training, validation and testing, respectively. The training data were used to estimate the network parameters (weights and biases). The validation set was used to check the robustness of the network parameters, in line with standard ANN procedure, and to guard against the over-fitting that inflates the modeling error, while the test set, which plays no part in training, was used to assess the predictive ability of the generated model.
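A random 20/5/5 partition of the thirty runs can be sketched as follows. The function name `split_runs` and the seed are assumptions made for reproducibility of the illustration; the assignment actually used in the study is the one marked in Table 3 and is not reproduced by this code.

```python
import random


def split_runs(n_runs=30, n_train=20, n_val=5, seed=0):
    """Randomly partition run numbers 1..n_runs into training, validation and test sets.

    The seed is a hypothetical choice; it does not reproduce the split in Table 3.
    """
    indices = list(range(1, n_runs + 1))
    rng = random.Random(seed)
    rng.shuffle(indices)
    train = sorted(indices[:n_train])
    val = sorted(indices[n_train:n_train + n_val])
    test = sorted(indices[n_train + n_val:])
    return train, val, test


train, val, test = split_runs()
print(len(train), len(val), len(test))
```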

2.5.2. NN description

A neural network is well suited to modeling the response of dynamic systems about which little is known. In other words, it is useful for functional prediction and system modeling where the physical processes are not understood or are highly complex. In the present study, Neural Network Toolbox V7.12 of MATLAB was used to predict the yield of gmelina oil during the extraction process. The extraction process was found to be significantly influenced by four main process variables: particle size, temperature, solvent ratio and extraction duration. A multilayer perceptron (MLP) based on a feed-forward ANN was applied to build the predictive model for the pilot plant. The network consists of an input layer, one hidden layer and an output layer. The inputs to the network are particle size, temperature, solvent ratio and time; the output is the percentage gmelina oil yield. In the proposed feed-forward network, data always flow in the forward direction, i.e. from the input layer to the output layer. The connections between the input, hidden and output layers consist of weights (w) and biases (b), which are the parameters of the neural network (NN). The neurons in the input layer simply pass the scaled input data, via w, to the hidden layer. The neurons in the hidden layer carry out two tasks (Desai et al., 2008). First, they sum the weighted inputs to the neuron, including b, as shown by the following equation:

$$\text{sum} = \sum_{i=1}^{n} x_i w_i + b \tag{3}$$

where x_i is the i-th input parameter, w_i is the corresponding weight, b is the bias and n is the number of inputs to the neuron.

The weighted output was then passed through a transfer function. The most popular transfer functions for solving nonlinear and linear regression problems are hyperbolic tangent sigmoid (tansig), log sigmoid (logsig), and linear (purelin) (Khayet and Cojocaru, 2012). In this study tansig was used as transfer function between the input and hidden layer, while purelin was used as transfer function between the hidden and output layer. These transfer functions are described by the following expressions;

$$\text{purelin}(\text{sum}) = \text{sum} \tag{4}$$

$$\text{tansig}(\text{sum}) = \frac{1 - \exp(-\text{sum})}{1 + \exp(-\text{sum})} \tag{5}$$

The output produced by the hidden layer becomes the input to the output layer. The neurons in the output layer produce the output by the same method as the neurons in the hidden layer. An error function is then evaluated from the predicted output. Training an ANN model is an iterative process in which a pre-specified error function is minimized by adjusting w appropriately. The commonly used error function, the mean square error (MSE) defined in equation (6), was used in this study.

$$\text{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(Y_{t,i} - Y_{N,i}\right)^2 \tag{6}$$

where Y_t is the target output, Y_N is the predicted output and N is the number of data points.
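The weighted sum, the two transfer functions and the MSE defined above can be combined into a forward pass through a 4-5-1 network, sketched below in plain Python. The weights and biases are hypothetical placeholders, since the trained values are not reported, and the function names are illustrative. The `tansig` here follows the expression as printed in equation (5), which is mathematically equal to tanh(sum/2); MATLAB documents its own `tansig` as tanh(n), so the toolbox computation may differ from the printed form by a factor of two in the argument.

```python
import math


def tansig(s):
    # Form printed in equation (5); equals tanh(s / 2).
    # Intended for the moderate values that arise with normalized inputs.
    return (1.0 - math.exp(-s)) / (1.0 + math.exp(-s))


def purelin(s):
    # Linear transfer function of equation (4).
    return s


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> tansig hidden layer -> purelin output neuron."""
    hidden = []
    for w_row, b in zip(w_hidden, b_hidden):
        weighted_sum = sum(xi * wi for xi, wi in zip(x, w_row)) + b  # equation (3)
        hidden.append(tansig(weighted_sum))
    return purelin(sum(h * w for h, w in zip(hidden, w_out)) + b_out)


def mse(targets, predictions):
    # Mean square error of equation (6).
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical parameters for a 4-5-1 network (NOT the trained values of the study).
w_hidden = [[0.1 * (i + j) for j in range(4)] for i in range(5)]
b_hidden = [0.0] * 5
w_out = [0.2] * 5
b_out = 0.1

y = forward([0.5, 0.5, 0.5, 0.5], w_hidden, b_hidden, w_out, b_out)
print(y)
```

During training, an optimizer such as Levenberg-Marquardt would adjust `w_hidden`, `b_hidden`, `w_out` and `b_out` to reduce `mse` over the training set.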

There are various types of training algorithms. One of the most widely used classes of training algorithm for feed-forward neural networks (FFNN) is the back-propagation (BP) method. Training an ANN by means of the BP algorithm is an iterative process in which the MSE is minimized by adjusting w and b appropriately. There are many variations of BP for training

Table 3 - Experimental values (training, validation and testing), actual and model predicted yield of GSO during extraction.

| S/N | Particle size (µm) | Temp. (°C) | Solvent vol. (ml) | Time (min) | Actual yield (%) | Predicted yield (%) |
|---|---|---|---|---|---|---|
| 1a | 300 | 50 | 125 | 40 | 31.39 | 45.1193 |
| 2c | 750 | 50 | 75 | 40 | 17.67 | 23.4080 |
| 3a | 600 | 40 | 100 | 30 | 31.67 | 31.0183 |
| 4b | 600 | 40 | 150 | 30 | 43.56 | 40.5608 |
| 5a | 750 | 30 | 75 | 40 | 20.14 | 20.0018 |
| 6a | 600 | 40 | 100 | 30 | 31.62 | 31.0183 |
| 7a | 750 | 50 | 75 | 20 | 15.43 | 20.6185 |
| 8c | 600 | 20 | 100 | 30 | 21.54 | 23.2544 |
| 9a | 750 | 30 | 75 | 20 | 17.75 | 18.5794 |
| 10b | 300 | 50 | 75 | 20 | 32.76 | 29.9187 |
| 11a | 600 | 40 | 50 | 30 | 20.81 | 23.0919 |
| 12a | 750 | 30 | 125 | 20 | 19.66 | 20.7987 |
| 13a | 600 | 40 | 100 | 30 | 31.65 | 31.0183 |
| 14c | 300 | 30 | 75 | 40 | 20.55 | 28.5667 |
| 15a | 750 | 30 | 125 | 40 | 22.68 | 23.6967 |
| 16b | 750 | 50 | 125 | 40 | 29.61 | 29.8223 |
| 17a | 300 | 30 | 125 | 20 | 28.65 | 30.1895 |
| 18a | 600 | 40 | 100 | 30 | 31.65 | 31.0183 |
| 19a | 300 | 50 | 125 | 20 | 52.09 | 54.0941 |
| 20c | 750 | 50 | 125 | 20 | 27.02 | 25.9370 |
| 21a | 600 | 60 | 100 | 30 | 47.25 | 40.5167 |
| 22b | 600 | 40 | 100 | 60 | 29.25 | 40.5755 |
| 23a | 1000 | 40 | 100 | 30 | 19.54 | 17.4763 |
| 24a | 600 | 40 | 100 | 10 | 23.87 | 25.8305 |
| 25a | 300 | 50 | 75 | 40 | 29.11 | 32.8749 |
| 26c | 150 | 40 | 100 | 30 | 50.82 | 35.2047 |
| 27a | 300 | 50 | 125 | 40 | 54.60 | 45.1193 |
| 28b | 600 | 40 | 100 | 30 | 31.65 | 31.0183 |
| 29a | 300 | 30 | 75 | 20 | 17.93 | 23.6607 |
| 30a | 600 | 40 | 100 | 30 | 31.64 | 31.0183 |

a Training data set. b Validating data set. c Testing data set.

NNs. During the training step, w and b are iteratively updated by the Levenberg-Marquardt (LM, trainlm) algorithm until the error converges to a minimal value. Different variables may differ widely in magnitude, and a variable that is numerically small may still have an appreciable effect on the monitored quantity. In the implementation, the inputs and output were therefore normalized to a uniform range of [0, 1] using the relation

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{7}$$

where x is the variable, x_max is its maximum value and x_min is its minimum value.
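The min-max normalization above, and its inverse for converting a network output back to physical units, can be sketched as follows. The function names are illustrative, and the temperature levels of Table 1 are used purely as example input.

```python
def minmax_scale(values):
    """Scale a list of numbers to [0, 1] using (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def minmax_unscale(scaled, lo, hi):
    """Invert the scaling, e.g. to express a normalized prediction in original units."""
    return [s * (hi - lo) + lo for s in scaled]


temperatures = [20, 30, 40, 50, 60]   # temperature levels from Table 1, in °C
scaled = minmax_scale(temperatures)
print(scaled)                          # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The same x_min and x_max obtained from the training data would normally be reused when scaling validation and test inputs, so that all three sets share one scale.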

2.6. Genetic algorithm (GA) optimization technique

Once the generalized ANN model was developed, its input space was optimized using GA. The input vector of the model, made up of the process parameters, becomes the decision variable for the GA. The genetic algorithm is a global optimization method based on the principles of genetics and natural selection. The algorithm begins with a population of random solutions held in a structured array, followed by a number of operations intended to achieve convergence. The GA proceeds in the following steps: initialization of the population of solutions (chromosomes), fitness computation based on the objective function, selection of the best chromosomes, and genetic propagation of the chosen parent chromosomes by genetic operators such as crossover and mutation. Crossover and mutation are applied to produce a new and better population of chromosomes (Hagan and Menhaj, 1994). A fitness level is allocated to each solution, on the basis of which the solutions are ranked and the most suitable ones identified. These solutions generate new solutions that are closer to the target, on the logic that the fitter solutions are more likely to reproduce. Reproduction is carried out with GA operators such as crossover and mutation (Mitchell, 1999). The crossover operator combines the features of two solutions (by swapping their segments) to breed two similar solutions, while mutation randomly changes individual solutions to create new ones. In other words, mutation introduces new variability into the population, while crossover exchanges information between two particular solutions (Michalewicz, 1994).

Different network structures require different parameter settings, which means that each new structure must be optimized separately. Derivative-based methods are not applicable because the derivatives cannot be accessed directly. Therefore, black-box methods like GA yield the most satisfactory results. In the present work, GA optimization was applied to determine the most suitable ANN structure. The population size, crossover fraction, migration fraction and number of generations were set to 10, 0.9, 0.01 and 10, respectively, and the search stopped once the maximum number of generations was exceeded. The code developed in MATLAB consisted of three parts: the main section, the GA optimizer and the ANN modeler. The main section read the input and output values, set up the various ANN structures and called the GA optimizer to optimize the training parameters for each structure. The GA optimizer had two objectives, RMSE and R2, and its decision variable was the number of epochs. The ANN modeler applied the training functions separately to all structures, with the epoch number optimized by the GA optimizer.
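The GA steps listed above (initialization, fitness evaluation, selection, crossover and mutation) can be illustrated with a minimal real-coded GA in Python. This sketch makes several assumptions that should be read as such: the trained ANN is not available, so the reduced quadratic RSM model reported in Section 3.2 stands in as the objective; the search box is limited to the coded factorial range [-1, 1]; tournament selection, one-point crossover, Gaussian mutation and elitism are illustrative operator choices rather than the MATLAB settings of the study; and the migration fraction is not modeled. Only the population size (10), number of generations (10) and crossover fraction (0.9) are taken from the text.

```python
import random

BOUNDS = (-1.0, 1.0)  # assumed search box in coded units (factorial range)


def rsm_yield(x):
    """Stand-in objective: reduced quadratic model of Section 3.2, coded A, B, C, D."""
    a, b, c, d = x
    return (31.65 - 6.65 * a + 5.46 * b + 5.83 * c + 1.05 * d
            - 3.78 * a * b - 2.40 * a * c + 2.65 * b * c - 2.07 * d * d)


def tournament(pop, fitness, rng, k=2):
    # Selection: pick k random individuals and return the fittest of them.
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fitness[i])]


def crossover(p1, p2, rng):
    # One-point crossover: head of the first parent, tail of the second.
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]


def mutate(x, rng, rate=0.1, sigma=0.3):
    # Gaussian mutation of each gene with probability `rate`, clipped to the bounds.
    lo, hi = BOUNDS
    out = []
    for gene in x:
        if rng.random() < rate:
            gene = min(hi, max(lo, gene + rng.gauss(0.0, sigma)))
        out.append(gene)
    return out


def run_ga(objective, n_vars=4, pop_size=10, generations=10,
           crossover_fraction=0.9, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(*BOUNDS) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fitness = [objective(x) for x in pop]
        elite = max(range(pop_size), key=lambda i: fitness[i])
        history.append(fitness[elite])
        children = [pop[elite][:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < crossover_fraction:
                child = crossover(p1, p2, rng)
            else:
                child = p1[:]
            children.append(mutate(child, rng))
        pop = children
    fitness = [objective(x) for x in pop]
    elite = max(range(pop_size), key=lambda i: fitness[i])
    history.append(fitness[elite])
    return pop[elite], fitness[elite], history


best_x, best_y, history = run_ga(rsm_yield)
print(best_x, best_y)
```

Because the polynomial is an empirical fit, values it returns near the edges of the search box are model predictions rather than measured yields. As a check on the stand-in objective, `rsm_yield([-1, 1, 1, -1])` evaluates to 55.30, against 52.09% observed for run 19 of Table 2.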

3. Results and discussion

3.1. ANN modeling

The LM algorithm used in this study is an approximation to Newton's method (Yetilmezsoy and Damirel, 2008). The algorithm uses an approximation to the second-order derivatives of the mean square error between the desired and actual outputs, so that better convergence behavior can be achieved. The best topology of the ANN model comprises four (4) inputs, one hidden layer with five (5) neurons and one output layer (4-5-1), as shown in Fig. 1. The mean square error (MSE) and the coefficient of determination R2 for training, validation and all data, presented in Fig. 2 and Table 4, are reasonable: the test set error and the validation set error have similar characteristics, and no significant over-fitting appears to have occurred. The network response was analyzed by linear regression between the network outputs and the corresponding targets. The adequacy of the developed NN

Fig. 1 - Schematic of the 4-5-1 ANN architecture used in the study.

[Plot: error versus training epoch, epochs 0 to 10]

Fig. 2 - Optimal error and learning curve of the LM-trainlm algorithm.

Table 4 - Statistical performance measures of the ANN and RSM models.

| Best architecture | R2, ANN (training) | R2, ANN (validation) | R2, ANN (test) | R2, ANN (all) | R2, RSM |
|---|---|---|---|---|---|
| 4-5-1 | 0.9653 | 0.9653 | 0.9653 | 0.961 | 0.9523 |

model is demonstrated by the scatter plot of the actual and predicted yields shown in Fig. 3. The resulting R2 value of 0.856 implies a reasonable correlation between the experimental data and the predictions of the proposed model.

3.2. ANOVA for gmelina seed oil extraction process

The process characterization (or factor screening) was carried out to identify the system variables with the most important effects, in line with the research objectives. The necessary computations were carried out with the ANOVA function of the statistical software Design-Expert Version 8.1.7.1. In view of the curvature, a reduced quadratic model was appropriate for predicting the overall process characteristics. The ANOVA results derived from the predictive model (Table 5) show that the main linear effects of particle size (x1), temperature (x2) and solvent volume (x3), coded as A, B and C respectively (Table 1), are significant, with P-values < 0.05, whereas the linear effect of time (x4), coded as D, has a P-value of 0.1354. The interaction effects between particle size and temperature (AB), particle size and solvent volume (AC) and temperature and solvent volume (BC) are also significant, as is the quadratic effect of time, denoted by D2. The data were refitted with a modified model obtained by excluding the non-significant terms from the general predictive equation, and the results of the statistical analysis are summarized in Table 5. The coefficient of determination, R2 = 0.9523, obtained for the gmelina seed oil extraction process shows that more than 95% of the overall system variability can be explained by the empirical model given after Table 5 (Equation (8)), which is a specific case of the general predictive

[Plot: network output (A) versus target (T). Best linear fit: A = (0.698)T + (9.6)]

Fig. 3 - Scatter plot of ANN predicted and actual yield of Gmelina oil.

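Table 5 - Analysis of variance for GSO extraction.

| Source | Sum of squares | df | Mean square | F value | p-value (Prob > F) | Remark |
|---|---|---|---|---|---|---|
| Model | 3187.53 | 14 | 227.68 | 21.40 | < 0.0001 | significant |
| A-X1 | 1062.40 | 1 | 1062.40 | 99.84 | < 0.0001 | |
| B-X2 | 714.61 | 1 | 714.61 | 67.15 | < 0.0001 | |
| C-X3 | 815.03 | 1 | 815.03 | 76.59 | < 0.0001 | |
| D-X4 | 26.50 | 1 | 26.50 | 2.49 | 0.1354 | |
| AB | 229.07 | 1 | 229.07 | 21.53 | 0.0003 | |
| AC | 92.16 | 1 | 92.16 | 8.66 | 0.0101 | |
| AD | 2.27 | 1 | 2.27 | 0.21 | 0.6512 | |
| BC | 112.04 | 1 | 112.04 | 10.53 | 0.0054 | |
| BD | 3.13 | 1 | 3.13 | 0.29 | 0.5954 | |
| CD | 3.29 | 1 | 3.29 | 0.31 | 0.5862 | |
| A2 | 0.22 | 1 | 0.22 | 0.020 | 0.8881 | |
| B2 | 0.31 | 1 | 0.31 | 0.030 | 0.8658 | |
| C2 | 11.93 | 1 | 11.93 | 1.12 | 0.3064 | |
| D2 | 117.06 | 1 | 117.06 | 11.00 | 0.0047 | |
| Residual | 159.62 | 15 | 10.64 | | | |
| Lack of Fit | 159.62 | 10 | 15.96 | 59,857.40 | < 0.0001 | significant |
| Pure Error | 1.333E-003 | 5 | 2.667E-004 | | | |
| Cor Total | 3347.15 | 29 | | | | |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Std. Dev. | 3.26 | R-Squared | 0.9523 |
| Mean | 29.45 | Adj R-Squared | 0.9078 |
| C.V. % | 11.08 | Pred R-Squared | 0.7253 |
| PRESS | 919.41 | Adeq Precision | 16.924 |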

equation derived for the investigation from the multivariate regression analyses implemented on design expert.

Y = 31.65 - 6.65A + 5.46B + 5.83C + 1.05D - 3.78AB - 2.40AC + 2.65BC - 2.07D2

Here Y is the predicted value of the dependent variable (oil yield). The coefficients of A, B, C and D are the main linear effects of the independent process variables: particle size (x1), time (x2), solvent ratio (x3) and temperature (x4), respectively, in coded units. AB, AC and BC represent the linear interaction effects between particle size/time, particle size/solvent ratio and time/solvent ratio, respectively. A², B², C² and D² are the quadratic effects of the respective process variables. The "Pred R-Squared" of 0.7253 is in reasonable agreement with the "Adj R-Squared" of 0.9078. "Adeq Precision" measures the signal-to-noise ratio; a ratio greater than 4 is desirable. The ratio of 16.924 obtained for this design indicates an adequate signal, so the model can be used to navigate the design space. The model F-value of 21.40 further indicates that the model is significant: there is only a 0.01% probability that a "model F-value" this large could occur due to noise. P-values less than 0.05 indicate that model terms are significant.
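As an illustrative check, the fit statistics quoted above can be recomputed from the sums of squares in Table 5, and the fitted polynomial can be evaluated at any coded setting. The sketch below is a minimal Python example and not part of the original Design Expert workflow; the function names `anova_summary` and `predict_yield` are hypothetical, and the formulas used are the standard textbook definitions of R², adjusted R², predicted R² (from PRESS) and the coefficient of variation.

```python
import math

# Values read from Table 5 (sums of squares, degrees of freedom, PRESS, mean yield).
SS_MODEL, DF_MODEL = 3187.53, 14
SS_RESID, DF_RESID = 159.62, 15
SS_TOTAL, DF_TOTAL = 3347.15, 29
PRESS = 919.41
MEAN_YIELD = 29.45


def anova_summary():
    """Recompute the summary statistics from the tabulated sums of squares."""
    ms_model = SS_MODEL / DF_MODEL      # model mean square
    ms_resid = SS_RESID / DF_RESID      # residual mean square
    std_dev = math.sqrt(ms_resid)
    return {
        "F": ms_model / ms_resid,
        "R2": 1 - SS_RESID / SS_TOTAL,
        "adj_R2": 1 - ms_resid / (SS_TOTAL / DF_TOTAL),
        "pred_R2": 1 - PRESS / SS_TOTAL,
        "std_dev": std_dev,
        "cv_percent": 100 * std_dev / MEAN_YIELD,
    }


def predict_yield(a, b, c, d):
    """Evaluate the reduced quadratic model in coded units.

    a: particle size, b: time, c: solvent ratio, d: temperature (all coded).
    """
    return (31.65 - 6.65 * a + 5.46 * b + 5.83 * c + 1.05 * d
            - 3.78 * a * b - 2.40 * a * c + 2.65 * b * c
            - 2.07 * d ** 2)


if __name__ == "__main__":
    for name, value in anova_summary().items():
        print(f"{name}: {value:.4f}")
    print("Yield at the design centre:", predict_yield(0, 0, 0, 0))
```

At the centre of the design (all coded variables equal to zero) the model returns its intercept, which is a quick sanity check on the sign convention of the coefficients.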

3.3. Characteristics of crude and desaturated gmelina seed oil

The physico-chemical properties of crude and modified GSO are shown in Table 6. It was observed that the neutralization process reduced the FFA content of crude GSO from 1.85 to 1%, while the refractive index changed from 1.441 to 1.465. In the neutralization process, the free fatty acid content of the oil was converted to oil soluble soaps. Traces of impurities such as proteins and/or protein fragments, phosphatides and gummy or mucilaginous substances were also removed by the neutralization process. The acid value increased from 3.92 to 6.28 mg KOH/g. Of particular interest are the changes observed in the properties of the oil as it went through the various modification procedures (epoxidation, hydroxylation and dehydration) to obtain a chemically modified oil. At first, there was an increase in density through epoxidation and hydroxylation, but an apparent decrease in density after dehydration. The

Table 6 - The characteristics of the crude and modified gmelina seed oil.

| Properties | Crude oil | Neutralized oil | Epoxidized oil | Hydroxylated oil | Dehydrated oil |
|---|---|---|---|---|---|
| Free fatty acid value (%) | 1.85 | 1 | - | - | 1 |
| Color | Golden yellow | Golden yellow | Whitish yellow | Whitish yellow | Brown |
| Refractive index (25 °C) | 1.441 | 1.465 | - | - | - |
| Specific gravity (25 °C) | 0.893 | 0.899 | 0.9393 | 0.9456 | 0.9012 |
| Viscosity (Pa s) | 3.336 | 3.392 | - | - | 3.486 |
| Saponification value (mg KOH/g) | 38.23 | - | - | - | 143.61 |
| Acid value (mg KOH/g) | 3.92 | - | - | - | 6.28 |
| Iodine value (g I2/100 g) | 34.9 | - | - | - | 125.5 |
| Set to touch (hr) | - | - | - | - | 4 |
| Drying time (hr) | - | - | - | - | 6 |
| Oil content (wt%) | 55.16 | - | - | - | 4.38 |
| Physical state (25 °C) | Liquid | Liquid | Liquid | Liquid | - |
| Ash content (%) | 5 | - | - | - | - |

increase in density during epoxidation and hydroxylation indicates an increase in mass per unit volume of the oil sample, which could be attributed to the reduction of the low molecular weight free fatty acid content of the oil and the inclusion of oxygen atoms in the fatty acid structure. The increase in acid value and the decrease in density of the GSO may be attributed to loss of water molecules. Moreover, the acid value increase during the chemical modification step could be the result of competing side reactions, such as hydrolysis of triglycerides to free fatty acids due to the presence of mineral acid (H2SO4) during the epoxidation process. The color of the oil changed from golden yellow to whitish yellow after epoxidation and hydroxylation, but turned brown after dehydration because of the relatively high temperature at which the reaction was carried out. Moreover, the viscosity of the oil increased slightly during the modification. The iodine value showed a marked increase (39.3 g I2/100 g to 125.5 g I2/100 g), indicating an increase in the level of unsaturation after the chemical modification process. This shows that the epoxidation, hydroxylation and dehydration reactions actually took place at the point of unsaturation of the aliphatic chain by ring-opening, addition and elimination reactions, respectively. It is evident from the result that modified GSO is quite suitable for alkyd resin synthesis, and its level of unsaturation will accommodate the cross-linking reactions by which alkyds form a dry, hard solid film (Marshall, 1986). The oil content of the seed for the crude and modified oil was found to be 55.16 and 40.38%, respectively. On the basis of the oil content, Gmelina arborea seed would be highly suitable economically for industrial applications in surface coating, as any oil-bearing seed that can produce up to 30% oil is regarded as suitable. The saponification values of crude and dehydrated GSO were 38.23 and 143.61 mg KOH/g, respectively. The saponification value reflects the average molecular weight of the fatty acids of the triglycerides present in GSO. The iodine value and viscosity of the dehydrated GSO were within the limits of the ASTM standard. The drying time and set-to-touch time gave satisfactory results. Therefore, the dehydrated oil samples were acceptable for preparing alkyd resin.

The fatty acid profile of the crude GSO and the FTIR spectra of the crude and modified GSO are shown in Figs. 4 and 5 (a and b), respectively. The fatty acid profile, as analysed by gas chromatography-mass spectrometry (GC-MS), showed an abundance of palmitoleate (31.94 wt%) and arachidic acid (17.16 wt%). The most abundant unsaturated and saturated fatty acids were palmitoleate (31.94 wt%) and methyl stearate (14.22 wt%), respectively. The oil contains 50.86% saturated fatty acid and 49.10% unsaturated fatty acid. The high saturation in GSO shows that, if it is used in the production of alkyd resin, it will produce an alkyd with a slower drying rate but good color retention. GSO, therefore, if structurally modified, may give an alkyd resin with better performance characteristics. In the spectrum of crude GSO, the band at 3298 cm⁻¹ corresponds to the hydroxyl group (O-H) of the unsaturated fatty acid in the oil. The carbonyl group (C=O) is indicated at 1740 cm⁻¹. The -CH- stretch of the straight aliphatic chain is found at 2934 cm⁻¹. The alkene group (CH=CH) is attributed to the band at 3206 cm⁻¹. The FTIR spectrum of crude GSO also shows peaks at 938 cm⁻¹ and 1378 cm⁻¹, which correspond to the cyclic ester of saturated oil. The IR spectra of desaturated GSO showed a broad band shoulder around 3902 cm⁻¹, 3870 cm⁻¹ and 37,684 cm⁻¹ (initially at 3298 cm⁻¹ in the unmodified oil) corresponding to the hydroxyl (OH) of the unsaturated fatty acid in the mixture, which shows an increase in unsaturation as the GSO passes through the various modification processes. The peaks of the epoxidized GSO were similar to those of the crude samples, while those of the hydroxylated GSO were similar to those of the dehydrated samples.

3.4. Optimization by GA

The developed NN model was applied for optimization studies of the gmelina oil extraction process within the design space. The values of the GA-specific parameters were as follows: population size = 10, cross-over probability = 0.9 and mutation probability = 0.01. The optimum conditions, which give a maximum oil yield of 49.8%, were selected after observing the results of the GA for ten successive iterative runs (Fig. 6). The optimized conditions were obtained as follows: 200 mm particle size, 40 °C temperature, 100 ml volume of solvent and 40 min extraction period. A validation experiment performed using the predicted optimal parameters gave a 48.80% yield of gmelina oil.
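The GA settings quoted above can be illustrated with a minimal real-coded genetic algorithm in Python. This is only a sketch under stated assumptions, not the authors' implementation: the trained ANN is not available here, so the coded RSM polynomial from the regression analysis is used as a stand-in objective. The source specifies only the population size, cross-over probability and mutation probability, so the binary tournament selection, arithmetic cross-over, elitism, number of generations and the coded search range of -1 to +1 are all assumptions, as are the names `surrogate_yield` and `run_ga`.

```python
import random


def surrogate_yield(x):
    """Stand-in objective: the coded RSM polynomial (NOT the authors' ANN model)."""
    a, b, c, d = x
    return (31.65 - 6.65 * a + 5.46 * b + 5.83 * c + 1.05 * d
            - 3.78 * a * b - 2.40 * a * c + 2.65 * b * c - 2.07 * d * d)


def run_ga(fitness, n_vars=4, pop_size=10, p_cross=0.9, p_mut=0.01,
           generations=100, low=-1.0, high=1.0, seed=0):
    """Maximize `fitness` over [low, high]^n_vars with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(low, high) for _ in range(n_vars)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]

    def pick():
        # Binary tournament selection (assumed; not stated in the source).
        u, v = rng.sample(pop, 2)
        return u if fitness(u) >= fitness(v) else v

    for _ in range(generations):
        children = [best[:]]  # elitism: keep the current best unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:
                w = rng.random()
                # Arithmetic cross-over: a convex blend stays inside the bounds.
                child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Per-gene mutation: resample uniformly with probability p_mut.
            child = [rng.uniform(low, high) if rng.random() < p_mut else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    # Ten independent runs, mirroring the "ten successive runs" in the text.
    for run in range(10):
        best, history = run_ga(surrogate_yield, seed=run)
        print(run, [round(g, 3) for g in best], round(history[-1], 2))
```

Because the best individual is carried forward unchanged, the best fitness never decreases from one generation to the next, which makes run-to-run comparison of the final values straightforward.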

3.5. Comparative predictive and optimal evaluation of ANN-GA and RSM

The oil yield predicted by ANN-GA is observed to be closer to the actual readings obtained from the laboratory than that predicted by RSM, although ANN gave almost the same prediction as RSM. The coefficient of determination (R²) from the ANN prediction technique was 0.961, while that of RSM was 0.9523. This shows that ANN can account for more than 96% of the variability in the system, while RSM can only account for 95% of the variability of the system. The ANN model shows a good fit to the experimental data, and the result indicates that the performance of ANN is better than that of RSM. The models showed that all the mentioned variables had a significant effect on the response (oil yield). The prediction accuracy was more prominent for ANN since it was built up from the experimental results of RSM. RSM results were optimized in different combinations of operating variables, while ANN results were used with GA to optimize the oil yield from the gmelina seed oil extraction process. It was found that ANN was a powerful technique for prediction, and even over-prediction, of the yield response using the operating parameters as the inputs. RSM gave an optimal yield of 49.2%, while ANN-GA gave 49.8%. The corresponding experiments performed using the predicted optimal parameters gave 47.9% and 48.8% yield of gmelina seed oil for RSM and ANN-GA, respectively. This observation translates to 2.64% and 2.05% error for RSM and ANN-GA, respectively. It can be concluded that the ANN-GA interface is 1.21% more efficient than RSM in locating the optimum, although the two yielded similar results without considerable differences. Overall, the predicted optimum for ANN-GA seems to be more feasible.

Fig. 4 - Fatty acid profile of crude GSO.

Fig. 5 - a. FTIR spectra of crude and modified GSO. b. FTIR spectra of crude and modified GSO.
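The validation errors quoted above can be reproduced with a short calculation. The sketch below is illustrative only; the helper name `percent_error` is hypothetical, and the source does not state which value is used as the denominator. The reported 2.64% for RSM is reproduced when the deviation is expressed relative to the predicted yield, whereas the reported 2.05% for ANN-GA is reproduced relative to the experimental yield, so both conventions are exposed through a parameter.

```python
def percent_error(predicted, observed, relative_to="predicted"):
    """Absolute deviation between prediction and experiment, in percent.

    relative_to selects the denominator: "predicted" or "observed".
    """
    if relative_to not in ("predicted", "observed"):
        raise ValueError("relative_to must be 'predicted' or 'observed'")
    base = predicted if relative_to == "predicted" else observed
    return abs(predicted - observed) / base * 100.0


if __name__ == "__main__":
    # Predicted optimum vs. validation experiment, as reported in the text.
    cases = {"RSM": (49.2, 47.9), "ANN-GA": (49.8, 48.8)}
    for name, (pred, obs) in cases.items():
        for base in ("predicted", "observed"):
            print(name, base, round(percent_error(pred, obs, base), 2))
```

Whichever denominator is chosen, the ANN-GA prediction lies closer to its validation experiment than the RSM prediction does.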

4. Conclusion

An RSM model based on CCRD and a multi-layer neural network model (4-5-1 ANN architecture) were developed to meet the aim of the current research. The input variables were temperature, time, volume of solvent and particle size, while the oil yield was the output. The performance of the models was evaluated by mean square error (MSE) and coefficient of determination (R²), and the results show that the models are very efficient. The models showed that all the mentioned variables had a significant effect on the response (oil yield). The prediction accuracy was more prominent for ANN since it was built up from the experimental results of RSM. RSM results were optimized in different combinations of operating variables, while ANN results were used with GA to optimize the oil yield from the gmelina seed oil extraction process. It was found that ANN was a powerful technique for prediction, and even over-prediction, of the yield response using the operating parameters as the inputs. RSM gave an optimal yield of 49.2%, while ANN-GA gave 49.8%. The corresponding experiments performed using the predicted optimal parameters gave 47.9% and 48.8% yield of gmelina seed oil for RSM and ANN-GA, respectively. It was established that the ANN-GA interface was 1.21% more efficient than RSM in locating the optimum, although the two yielded similar results without considerable differences. The characterization and desaturation results indicate that the oil is non-drying and inedible, and can be used as a surface coating material.

Fig. 6 - GA optimization result.

Compliance with ethical standard

This research was funded by the authors.

We declare that there is no conflict of interest, and that the article does not contain any studies with human or animal participants performed by any of the authors.

References

Adegbehin, J., Abayomi, J., Nwaigbo, L., 1988. Gmelina arborea in Nigeria. Commonw. For. Rev. 159-166.

Adeyeye, A., 1971. Composition of seed oils of Gmelina arborea and teak Tectona grandis. Pak. J. Sci. Ind. Res. 34 (9), 359.

Babu, B.V., 2004. Process Plant Simulation. Oxford University Press, New Delhi, India, pp. 257-290.

Bas, D., Boyaci, I.H., 2007. Modeling optimization II: Comparison of estimation capabilities of response surface methodology with artificial neural networks in a biochemical reaction. J. Food Eng. 78, 846-854.

Basumatary, S., Deka Dinesh, C., Deka Dibakar, C., 2012. Composition of biodiesel from Gmelina arborea seed oil. Adv. Appl. Sci. Res. 3 (5), 2745-2753.

Box, G.E.P., Wilson, K.G., 1951. J. R. Stat. Soc. B 13, 1.

Desai, K.M., Survase, S.A., Soudagar, P.S., Lele, S.S., Singhal, R.S., 2008. Comparison of artificial neural network (ANN) and response surface methodology (RSM) in fermentation media optimization: case study of fermentative production scleroglucan. Biochem. Eng. J. 41, 266-273.

Ghaffari, A., Abdollahi, H., Khoshayand, M.R., Bozchalooi, I.S., Dadgar, A., Rafiee-Tehrani, M., 2006. Performance comparison of neural network training algorithms in modeling of bimodal drug delivery. Int. J. Pharm. 327, 126-138.

Hagan, M.T., Menhaj, M.B., 1994. Training feed-forward network with Marquardt algorithm. IEEE Trans. Neural Netw. 5, 989-993.

Holland, J.H., 1975. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Michigan.

Khayet, M., Cojocaru, C., 2012. Artificial neural network modeling and optimization of by air gap membrane distillation. Sep. Purif. Technol. 86, 171-182.

Kose, E., 2008. Modelling of colour perception of different age groups using artificial neural networks. Expert Syst. Appl. 34, 2129-2139.

Mandal, S., Sivaprasad, P.V., Venugopal, S., Murthy, K.P.N., 2009. Artificial neural network modeling to evaluate and predict the deformation behavior of stainless steel type AISI 304L during hot torsion. Appl. Soft Comput. 9, 237-244.

Marshall, G.L., 1986. Eur. Polym. J. 22, 217-230.

Michalewicz, Z., 1994. Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, Berlin.

Mitchell, M., 1999. An Introduction to Genetic Algorithms. The MIT Press, Massachusetts.

Montgomery, D.C., 2005. Design and Analysis of Experiments, sixth ed. John Wiley & Sons, Inc.

Odetoye, T.E., Ogunniyi, D.S., Olatunji, G.A., 2011. Improving Jatropha Curcas Linnaeus oil alkyd drying properties. Elsevier J. Prog. Org. Coat. 73 (2012), 374-381.

Sangay, B., Priyanka, B., Dibakar, O.D., 2014. Gmelina arborea and Tabernaemontana divaricata seed oils as non-edible feedstock for biodiesel production. J. Chem Tech Res. 6 (2), 1440-1445.

Taguchi, G., 1987. System of Experimental Design: Engineering Methods to Optimize Quality and Minimize Cost. UNIPUB, New York.

Taguchi, G., 1991. Introduction to Quality Engineering. Asian Productivity Organization, UNIPUB, New York.

Taguchi, G., Wu, Y., 1980. Introduction to Off-line Quality Control. Central Japan Quality Control Association, Nagoya, Japan.

Tewari, D.N., 1995. A Monograph on Gamari (Gmelina arborea Roxb). International Book Distributors.

Topallar, H., Gecgel, U., 2000. Kinetics and thermodynamics of oil extraction from sunflower seeds in the presence of aqueous acidic. Turk. J. Chem. 24 (3), 247-254.

Uzoh, C.F., Onukwuli, O.D., 2014. Extraction and characterization of Gmelina seed oil; kinetics and optimization studies. Open J. Chem. Eng. Sci. 1 (2), 1-18.

Wagoner, R., 1998. Experimental Process Design. Morehead State University.

Yasin, Y., Faujan Bin, H., Ahmadb, B.H., Ghaffari-Moghaddamc, M., Khajehca, M., November 2014. Application of a hybrid artificial neural network-genetic algorithm approach to optimize the lead ions removal from aqueous solutions using intercalated tartrate-Mg-Al layered double hydroxides. Environ. Nanotechnol. Monit. Manag. 1-2, 2-7.

Yetilmezsoy, K., Damirel, S., 2008. Artificial neural network (ANN) approach to modeling of lead(II) adsorption from aqueous solution by Antep pistachio (Pistacia vera L.) shells. J. Hazard. Mater. 153, 1288-1300.