Applicability of Neural Networks for the Fermentation of Propionic Acid by Propionibacterium acidipropionici

To the best of our knowledge, this is the first report applying artificial neural networks (ANNs) to the simulation of batch propionic acid (PA) fermentation. The main focus of this research was therefore to investigate the applicability of ANNs to PA fermentations. To demonstrate this, we used the results of 40 Propionibacterium acidipropionici fermentations (ca. 2,000 data points) to build the ANN, and two additional independent fermentations to test its prediction capability. Analysis of the predicted output parameters showed that the ratio of propionic acid to acetic acid (PA/AA) can only be used in the ANN after normalization. The fit of the ANN model to the measured data was good (average correlation coefficients above 0.9). A special feature was also tested: fermentation time was used as an input parameter, which made the ANN suitable for predicting the time course of PA fermentations, also with satisfactory results.


Introduction
An artificial neural network (ANN) approach offers an attractive option for nonlinear multivariate process modelling. Over the last two decades, ANNs have emerged as tools for this purpose, especially in situations where the development of phenomenological or conventional regression models is impractical or cumbersome [1,2]. An ANN is an information processing paradigm inspired by the way biological nervous systems, e.g. the brain, process information [3,4]. An ANN is a massively interconnected network structure consisting of many simple processing elements capable of performing parallel computation for data processing [5]. The fundamental processing elements of ANNs, namely connected units or nodes called artificial neurons, simulate the basic functions of biological neurons [6,7]. The benefit of ANNs is that they are generic in structure and able to learn from historical data. Furthermore, the main advantages of ANNs compared to the response surface methodology (RSM) are: 1. ANNs do not require any prior specification of a suitable fitting function and 2. ANNs have universal approximation capabilities. They can approximate almost all kinds of nonlinear functions, including quadratic functions, whereas RSM is useful only for quadratic approximations [8].
It is often believed that an ANN requires a much greater number of experiments (number of patterns) than RSM to build an efficient model. In fact, ANNs can work well even with relatively little data, as long as the data are statistically well distributed in the input domain, which is the case with Design of Experiments (DOE) [9]. Therefore, the experimental data of an RSM study should be sufficient to build an effective ANN model [10]. Only a few articles in the literature present comparative studies between ANNs and RSM. In the published studies, ANNs have consistently been found to perform better than RSM [11].
Over recent years, ANNs have been successfully applied to the modelling and control of various biological processes [12][13][14][15][16]. ANNs are now the most popular artificial learning tools in biotechnology, with applications ranging from pattern recognition in chromatographic spectra and expression profiles to functional analyses of genomic and proteomic sequences [17][18][19].
PA is a valuable C-3 platform chemical. Accordingly, PA and its derivatives are used in the agricultural, food and pharmaceutical industries, e.g. as an important chemical intermediate in the synthesis of herbicides, perfumes, cellulose fibers and pharmaceuticals [20][21][22]. As a three-carbon building block, it is used as a precursor of high-volume commodity chemicals such as propylene [23]. PA and its calcium, sodium and potassium salts are widely used as preservatives in animal feed and food for human consumption. Some of them are also important mould inhibitors [24,25].
The most commonly used wild-type Propionibacterium species for PA production are P. acidipropionici and P. freudenreichii. The latter is also capable of producing vitamin B12 [26,27]. P. acidipropionici is able to grow under microaerophilic conditions and produces more propionic acid than P. freudenreichii [28]. Propionic acid fermentation is known to suffer from end-product inhibition and by-product formation, mainly of acetic and succinic acids, which lower the yield and productivity of propionic acid [29]. To overcome these limitations, genetically modified microorganisms and various fermentation techniques are being researched [30]. At present, there is no established and successful method for economical bio-PA production [31]. Many articles address the utilization of alternative and renewable carbon and nitrogen sources [32]. With wild-type Propionibacterium species and the free-cell batch fermentation technique, only 20-30 g L−1 of PA is achievable [28].
In the present investigation, more than 2,000 data points from more than 40 fermentations were used to obtain a practical and acceptable model using an ANN. The fermentation parameters were derived from four experimental designs, which ensured that the data were well distributed. The results, namely PA titers, yields, N-PA/AA ratios and productivity, were evaluated by the ANN. To the best of our knowledge, this is the first report modelling PA fermentation by Propionibacterium acidipropionici using an ANN. The main focus of this research was to investigate the applicability of ANNs to PA fermentations. To demonstrate this, we used the results of 40 PA fermentations to build the ANN. Since the ANN was expected to be able to predict PA fermentations, its prediction capability was tested with two fermentations that were not involved in building, training or validating the ANN.

Microorganism
Propionibacterium acidipropionici DSM 20273 (equivalent to ATCC 4965) from DSMZ German Collection of Microorganisms and Cell Cultures was used in this study.

Media
The modified Peptone Yeast Glucose (PYG) medium for inoculum preparation was composed of the following per litre: 5.00 g trypticase peptone, 5.00 g peptone, 10.00 g yeast extract, 5.00 g beef extract, 5.00 g glucose, 2.00 g K2HPO4, 1.00 mL TWEEN 80 and 40.00 mL salt solution (see below). All medium components and equipment were sterilized in an autoclave; the glucose was sterilized separately and added aseptically to the broth. The salt solution was composed of the following per litre: 0.25 g CaCl2 · 2H2O, 0.50 g MgSO4 · 7H2O, 1.00 g K2HPO4, 1.00 g KH2PO4, 10.00 g NaHCO3 and 2.00 g NaCl.
The fermentation medium contained molasses, calcium lactate, glycerol, whey powder or glucose as the carbon source and yeast extract, corn steep liquor, tryptone, casein or corn germ flour as the nitrogen source, as required by the experimental design. In addition, each experimental broth contained 2.5 g K2HPO4 and 1.25 g KH2PO4 per litre.

Inoculum preparation
Stock cultures were inoculated from stab cultures into 100 mL of modified PYG medium in 100 mL Erlenmeyer flasks to produce the pre-inoculum, which was incubated at 32 °C for 72 h in a shaking incubator at an agitation speed of 150 rpm. To prepare the inoculum, 30 mL of the pre-inoculum culture was transferred into 270 mL of the modified PYG medium in 300 mL Erlenmeyer flasks. These were incubated at 32 °C for 3 days in a shaking incubator at an agitation speed of 150 rpm.

Fermentation conditions
To start the batch PA fermentations, 10 mL of inoculum was added to each 250 mL Erlenmeyer flask containing 240 mL of the fermentation medium. The flasks were incubated at 32 °C for 7 days on a rotary shaker at 150 rpm. To control the pH, CaCO3 was kept in only a small excess to avoid the formation of a cake at the bottom of the flasks due to the low level of agitation. Therefore, ca. 1.5 g of sterile CaCO3 was added to the broth during sampling if CaCO3 was no longer visible.

Experimental design for PA fermentations
To obtain a statistically well-distributed data matrix as input for the ANN, four experimental designs were applied. At first, a full factorial design was used with two factors at five levels: carbon sources (molasses, calcium lactate, glycerol, whey protein powder, glucose) and nitrogen sources (yeast extract, corn steep liquor, tryptone, casein, corn germ flour).
Thereafter, three central composite experimental designs were applied. The factor pairs were the following: yeast extract and molasses, yeast extract and glycerol, corn germ flour and molasses. The factors were varied at three levels (50 %, 100 %, 150 %), where 100 % was the quantity applied in the full factorial experimental design.
The sets of experimental conditions of all fermentations are listed in Table 1.
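The structure of these designs can be sketched in a few lines of Python. This is only an illustration under assumptions: the factor names are taken from the text, but the three-level grid below is a plain 3 x 3 enumeration and may not match the actual central composite run list (including replicated centre points) given in Table 1.

```python
from itertools import product

# Factor levels as named in the text.
carbon_sources = ["molasses", "calcium lactate", "glycerol", "whey powder", "glucose"]
nitrogen_sources = ["yeast extract", "corn steep liquor", "tryptone", "casein", "corn germ flour"]

# Full factorial design: every carbon source combined with every nitrogen source.
full_factorial = list(product(carbon_sources, nitrogen_sources))

# Three levels, expressed in percent of the amount used in the full factorial design.
levels = [50, 100, 150]


def three_level_grid(factor_a, factor_b):
    """Enumerate all level combinations for one factor pair (assumed layout)."""
    return [{factor_a: a, factor_b: b} for a, b in product(levels, levels)]


factor_pairs = [
    ("yeast extract", "molasses"),
    ("yeast extract", "glycerol"),
    ("corn germ flour", "molasses"),
]
grids = {pair: three_level_grid(*pair) for pair in factor_pairs}

print(len(full_factorial))  # 25 carbon/nitrogen combinations
```

The centre point of each grid (100 %, 100 %) corresponds to the quantities used in the full factorial design.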

Analysis of PA fermentations samples
Cell growth was monitored by measuring the optical density (OD) at 600 nm in a 1 mL cuvette using a spectrophotometer. Samples of the broth containing suspended cells were diluted fourfold with 2.5 % HCl solution. The cell-free supernatant of centrifuged samples (13,000 rpm, 5 min), diluted in the same way, was used as the blank.
Sugars and organic acids were quantified in filtered, tenfold-diluted supernatants of the samples by High-Performance Liquid Chromatography (HPLC). The system consisted of an inline degasser, isocratic pump, autosampler and refractive index (RI) detector at 40 °C, with an organic acid analysis column (Aminex HPX-87H, Bio-Rad) operated at 65 °C using 0.5 mM H2SO4 as the mobile phase at a flow rate of 0.5 mL/min.

Experiments to test ANN prediction
To verify the generated ANN, two fermentations were randomly selected and investigated in 250 mL Erlenmeyer flasks. The first medium contained the following per litre: 20.8 g yeast extract, 58.8 g glycerol, 2.5 g K2HPO4 and 1.25 g KH2PO4. The second medium contained the following per litre: 11.6 g yeast extract, 44.4 g molasses, 2.5 g K2HPO4 and 1.25 g KH2PO4.

Construction of ANN
TIBCO Statistica software (version 13.4.0.14) was used for both the experimental design and the ANN.
Raw experimental results were used for building the ANN except in the case of the PA/AA ratio. For that variable, the maximum value obtained for each carbon source was used for normalization because different carbon sources physiologically generate different orders of magnitude in terms of the PA/AA ratio.
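The normalization described above can be illustrated with a short Python sketch. The record layout, the field names (carbon_source, pa_aa, n_pa_aa) and the numbers are assumptions made for the example; they are not measured data.

```python
def normalize_pa_aa(records):
    """Divide each PA/AA ratio by the maximum ratio observed for its carbon source."""
    max_by_source = {}
    for rec in records:
        source = rec["carbon_source"]
        max_by_source[source] = max(max_by_source.get(source, 0.0), rec["pa_aa"])

    normalized = []
    for rec in records:
        peak = max_by_source[rec["carbon_source"]]
        # Guard against a carbon source whose ratios are all zero.
        n_ratio = rec["pa_aa"] / peak if peak > 0 else 0.0
        normalized.append({**rec, "n_pa_aa": n_ratio})
    return normalized


# Made-up illustrative values only.
samples = [
    {"carbon_source": "glycerol", "pa_aa": 30.0},
    {"carbon_source": "glycerol", "pa_aa": 15.0},
    {"carbon_source": "glucose", "pa_aa": 4.0},
    {"carbon_source": "glucose", "pa_aa": 2.0},
]
normalized = normalize_pa_aa(samples)
print([round(rec["n_pa_aa"], 2) for rec in normalized])  # [1.0, 0.5, 1.0, 0.5]
```

After this step every carbon source spans the same 0 to 1 range, so a source with very high raw ratios no longer dominates the error function.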
During ANN construction, we used the software's Automated Network Search (ANS) function and allowed it to use a minimum of 2 and a maximum of 30 hidden units (nodes) in multi-layer perceptron (MLP) networks, searching among all available activation functions (identity, logistic, hyperbolic tangent, exponential) for the hidden and output neurons, without weight decay and with fixed seeds, using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm.
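To make the searched model family concrete, the following Python sketch shows a single-hidden-layer MLP forward pass, a sum-of-squares error and the candidate search space. It is not the Statistica implementation: the number of inputs (3), the form of the exponential activation and the helper names are assumptions, and no BFGS training is performed.

```python
import math
import random

# Candidate activation functions; the exact definitions used by the
# software may differ (for example in the sign of the exponent).
ACTIVATIONS = {
    "identity": lambda x: x,
    "logistic": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "exponential": lambda x: math.exp(x),
}


def layer(inputs, weights, biases, activation):
    """Apply one fully connected layer followed by an activation."""
    f = ACTIVATIONS[activation]
    return [f(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, params, hidden_act, output_act):
    """Forward pass through one hidden layer and one output layer."""
    hidden = layer(inputs, params["w_hidden"], params["b_hidden"], hidden_act)
    return layer(hidden, params["w_out"], params["b_out"], output_act)


def sos_error(targets, outputs):
    """Sum-of-squares error between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))


def random_params(n_in, n_hidden, n_out, seed=0):
    """Small random weights with a fixed seed (hypothetical initialisation)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"w_hidden": matrix(n_hidden, n_in), "b_hidden": [0.0] * n_hidden,
            "w_out": matrix(n_out, n_hidden), "b_out": [0.0] * n_out}


# Search space: 2 to 30 hidden units times every hidden/output activation pair.
search_space = [(n, h, o) for n in range(2, 31)
                for h in ACTIVATIONS for o in ACTIVATIONS]

params = random_params(n_in=3, n_hidden=20, n_out=4)
outputs = mlp_forward([0.5, 0.2, 0.1], params, "exponential", "identity")
print(len(search_space), len(outputs))  # 464 4
```

An automated search would train each candidate in the search space and keep the network with the best performance on held-out data; that training loop is omitted here.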

Results
More than 200 cases with 11 data points were used to gain information about PA fermentation by Propionibacterium acidipropionici using the ANN.
The replicates at the centre points of the experimental designs had variances as low as 0.042, 0.855, 0.067 and 0.005 for N-PA/AA, PA, yield and productivity, respectively. The corresponding variances over all runs were 0.202, 4.749, 0.234 and 0.083, which indicates that the data are of appropriate quality for constructing an ANN that models the effect of the different variables on PA production.
The best model selected by the ANS had a performance of 0.980 on the training set, 0.954 on the test set and 0.941 on the validation set, with the sum-of-squares (SOS) error function.
It was achieved with 20 hidden neurons and an exponential activation function. 70 % of the data points were used for training, 15 % for testing and the remaining 15 % for validation. The determination coefficients are summarized in Table 2.
Residuals were also inspected: they were approximately normally distributed and showed no anomalies or regular patterns, therefore the statistical model is considered adequate (Figs. 1-4).
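A 70/15/15 partition of the cases can be sketched as follows. The function name, the seed and the rounding rule are assumptions for illustration; how the software actually assigns cases to subsets is not described in the text.

```python
import random


def split_cases(n_cases, train=0.70, test=0.15, seed=42):
    """Randomly partition case indices into training, test and validation subsets."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_train = round(n_cases * train)
    n_test = round(n_cases * test)
    train_idx = indices[:n_train]
    test_idx = indices[n_train:n_train + n_test]
    val_idx = indices[n_train + n_test:]  # the remainder, about 15 %
    return train_idx, test_idx, val_idx


train_idx, test_idx, val_idx = split_cases(200)
print(len(train_idx), len(test_idx), len(val_idx))  # 140 30 30
```

The three subsets are disjoint, so the validation performance is computed on cases the network never saw during training.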
Fig. 1 shows the measured (Target) and predicted (Output by the ANN) N-PA/AA ratio. To achieve the relatively good fit presented, the raw PA/AA ratios had to be normalized to their maximum values, otherwise a weak fit was observed for almost all outputs of the ANN. The large differences in the PA/AA ratio are related to the different carbon sources applied. Several studies have shown that glycerol can be a good carbon source for PA fermentation, with a higher yield of PA and a much lower amount of AA formed compared to glucose [33]. Glycerol induces homopropionic acid fermentation, and AA production was minimized to almost 1 mol for every 30 mol of PA produced or even less [34]. Zhang and Yang also reported the free-cell fermentation of glycerol by Propionibacterium acidipropionici where the PA/AA ratio was more than 100 g/g [35]. These high values can cause an unsatisfactory fit, but normalization can solve this problem. The fits of the model for the training, testing and validation data of N-PA/AA were 0.962, 0.939 and 0.937, respectively (Table 2).
One of the most important outputs of a fermentation is the concentration of the product. The fits of the model for the training, testing and validation data of PA were 0.968, 0.937 and 0.958, respectively (Table 2). All three values were in excess of 0.900 and were therefore considered acceptable. The measured (Target) and predicted (Output) PA values are shown in Fig. 2. It can be observed that the data points are located in the vicinity of the line with a small degree of variance. A typical PA batch fermentation takes ~3 days to reach a PA concentration of ~20 g L−1 with a PA yield of typically 0.4 g/g glucose [36]. Since time is one of the input variables, lower values are measured from samples taken early in the experiment.
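The fit values quoted above compare measured (Target) and predicted (Output) values. As a minimal sketch, assuming they are Pearson-type measures (the text refers to both correlation and determination coefficients), they could be computed as follows. The numbers are made up for illustration.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


# Made-up PA concentrations in g/L, not measured data.
measured = [2.0, 6.5, 11.0, 15.5, 19.0]
predicted = [2.4, 6.0, 11.5, 14.8, 19.6]

r = pearson_r(measured, predicted)
print(round(r, 3), round(r ** 2, 3))  # correlation and its square
```

A value close to 1 means the points lie near a straight line in a Target versus Output plot such as Figs. 1-4.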
The output parameter with the poorest fit of the four was the yield as is shown in Fig. 3. Zhang and Yang explained that the theoretical yields for PA production from glucose and glycerol were 0.55 g/g and 0.80 g/g, respectively [33,34].In Fig. 3, the data points were arranged in approximately two groups scattered around 0.4 and 0.7, slightly below the theoretical values for glucose and glycerol, respectively.Molasses showed similar results to those of glucose.
Conventional PA fermentation suffers from low productivity and yield due to strong end-product inhibition and the co-production of other by-products, mainly acetic and succinic acids. The PA productivity of the fermentation highly depends on the applied carbon source and the fermentation technique. The PA productivity that can be reached with a typical PA batch fermentation may vary between 0.15 and 0.30 g L−1 h−1 [37]. Fig. 4 presents the predicted and measured productivities. The best fit was achieved by the productivity parameter. The data points were located in the vicinity of the line with a relatively small standard deviation. Since time was also an input parameter, lower values were achieved at the beginning of the fermentations.
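As a small worked example, volumetric productivity can be computed from a PA concentration and the elapsed fermentation time. The definition used here (concentration divided by elapsed time) is an assumption, since the text does not spell out the formula; the input values are the typical figures quoted earlier for a batch fermentation (~20 g L−1 after ~3 days).

```python
def volumetric_productivity(pa_g_per_l, hours):
    """PA productivity in g L-1 h-1, assumed to be concentration over elapsed time."""
    if hours <= 0:
        raise ValueError("elapsed time must be positive")
    return pa_g_per_l / hours


# Typical batch figures from the text: about 20 g/L after about 72 h.
productivity = volumetric_productivity(20.0, 72.0)
print(round(productivity, 3))  # 0.278
```

Under this definition the typical batch figures fall inside the 0.15 to 0.30 g L−1 h−1 range cited above.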
To demonstrate the quality of the results, Fig. 5 shows the distributions of the standard residuals graphically. All of the standard residuals have normal distributions and narrow ranges. The widest range belongs to the propionic acid concentration, whose measured values were also the highest, so even its distribution is acceptable.
Fig. 5 Statistical analysis of results by Box-Whisker plot representing the distribution of standard residuals
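The quantities behind such a box-whisker plot can be sketched with the Python standard library. The standardization used here (residual divided by the sample standard deviation of the residuals) is an assumption; the software may scale residuals differently. The data are made up for illustration.

```python
import statistics


def standardized_residuals(targets, outputs):
    """Residuals (target minus output) scaled by their sample standard deviation."""
    residuals = [t - o for t, o in zip(targets, outputs)]
    sd = statistics.stdev(residuals)
    return [res / sd for res in residuals]


def box_summary(values):
    """Five-number summary of the kind drawn in a box-whisker plot."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Made-up measured and predicted values, not data from the study.
measured = [2.0, 6.5, 11.0, 15.5, 19.0, 21.0]
predicted = [2.4, 6.0, 11.5, 14.8, 19.6, 20.7]

z = standardized_residuals(measured, predicted)
summary = box_summary(z)
print(summary)
```

A narrow box centred near zero indicates that the network neither systematically over- nor under-predicts the output in question.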

Verification
To check the accuracy of the model, two techniques were applied: a validation step is generally involved in building an ANN, but here two additional fermentations were also both simulated and monitored. Data points from these fermentations were not used in developing the ANN. Fig. 6 shows the measured and predicted PA concentrations of the two verification experiments. Comparing the experimental and predicted PA concentrations with glycerol as the substrate, it can be observed that the forecasted values are similar to those measured. Considering that the fit of the PA concentration for all cases involved in the ANN was in excess of 0.95, a good fit was also expected here. This was further supported by the second verification experiment (molasses), where a good fit was achieved as well.
In the glycerol-based verification experiment, the N-PA/AA ratio could only be determined in a few samples because the AA concentration was often below the detection limit and was therefore assumed to be zero. Nevertheless, the few data points detected fitted well, as can be seen in Fig. 7. In the case of the molasses-based verification experiment, a slightly larger difference was obtained, especially between 20 and 50 hours, but it was still acceptable.
The yield produced the poorest fit of all the parameters during the development of the ANN. This is also reflected in the verification experiments, as presented in Fig. 8. After approximately 50 hours, marked differences were observed in the glycerol-based flasks. For the molasses-based verification experiment, a deviation was also observed after 20 h. However, the trends were still accurately predicted. The key to estimating the yield accurately is to know the amount of unused (i.e. residual) substrate.
In terms of process development, productivity was the most important parameter and resulted in an excellent fit without significant differences between the predicted and measured data, as can be seen in Fig. 9. In the initial
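The dependence of the yield on the residual substrate can be made explicit with a short sketch. The definition assumed here is PA formed per substrate consumed (g/g); the initial glycerol concentration is taken from the first verification medium, while the residual substrate and PA values are made-up numbers.

```python
def pa_yield(pa_g_per_l, substrate_initial, substrate_residual):
    """PA yield in g/g, assumed to be PA formed divided by substrate consumed."""
    consumed = substrate_initial - substrate_residual
    if consumed <= 0:
        raise ValueError("no substrate consumed; yield is undefined")
    return pa_g_per_l / consumed


# 58.8 g/L glycerol is the initial concentration of the first verification
# medium; the residual (10.0 g/L) and PA (30.0 g/L) values are illustrative only.
y = pa_yield(30.0, 58.8, 10.0)
print(round(y, 3))  # 0.615

# Ignoring the residual substrate would understate the yield:
y_naive = pa_yield(30.0, 58.8, 0.0)
print(round(y_naive, 3))  # 0.51
```

The gap between the two results shows why an error in the residual substrate measurement propagates directly into the yield.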

Conclusion
The applicability of ANNs to the simulation of PA fermentation was investigated. A statistical design of 40 fermentations was used to generate more than 2,000 data points for training, testing and validating the ANN. The predicted fermentation parameters, namely PA concentration, PA/AA ratio, yield and productivity, were in good agreement with the experimental values, with a fit in excess of 0.95 for all parameters except yield. For an appropriate description of the PA/AA ratio, it is necessary to normalize the experimental values, because without normalization the ratio can become infinitely high for some carbon sources (such as glycerol). Since time was an input parameter during the construction of the ANN, the time course of PA fermentation also became predictable. Using the developed ANN, it was possible to determine in advance the course of two additional fermentations, the results of which confirmed the prediction. Instead of carrying out in vitro experimental designs, this ANN can be used to forecast their results in silico, without the need to conduct real trials. Of course, some experimental runs should still be performed to verify the results predicted by the model. It is worth expanding the ANN database with new fermentation data, specifically with data that extend the studied ranges. It is conceivable that such an expanded ANN will be able to predict the yield with a sufficient degree of certainty. Many other possibilities could be exploited with regard to ANNs, e.g. numerous input parameters could be added and more information extracted by adding other output parameters. After creating a properly designed database, an ANN can be generated that can significantly facilitate and accelerate experimental work in a reliable manner.

Fig. 1 Measured (blue dots) and predicted (solid red line) N-PA/AA ratio data points of the ANN

Fig. 3 Measured (blue dots) and predicted (solid red line) yield data points of the ANN

Table 1
Set of experimental conditions

Table 2
Coefficients of determination of the ANN