Model Building on Selectivity of Gas Antisolvent Fractionation Method Using the Solubility Parameter

Solubility parameters are widely used in the polymer industry and are often applied in the high-pressure field as well, because they make it possible to combine the effects of all operational parameters on solubility in a single term. We demonstrate a statistical methodology for applying solubility parameters in constructing a model that describes antisolvent fractionation based chiral resolution, a complex process that includes a chemical equilibrium, precipitation and extraction. The solubility parameter used in this article is the Hansen parameter. Experimental results on the resolution and crystallization of ibuprofen with (R)-phenylethylamine, based on diastereomeric salt formation by the gas antisolvent fractionation (GASF) method, were evaluated. Two sets of experiments were performed: one with methanol as the organic solvent in an undesigned experiment and one with ethanol in a designed experiment. The use of a D-optimal design to decrease the necessary number of experiments and to overcome the problem of a constrained design space was demonstrated. Linear models including the dependence on pressure, temperature and the solubility parameter were appropriate to describe the selectivity of the GASF optical resolution method in both sets of experiments.


Introduction
Gas antisolvent precipitation (GAS) is one of the most promising innovative applications of supercritical fluids of the last decades. Its semicontinuous scaled-up version might also be of interest for continuous pharmaceutical production [1]. During GAS processing the environmentally benign carbon dioxide is mixed intimately with an organic solvent containing the solute to be precipitated. Due to the instant and high supersaturation, small crystals or amorphous particles are formed with controllable mean particle sizes, narrow size distributions and sometimes controllable morphologies as well. Applications of and parameter effects on antisolvent precipitations have been extensively reviewed [2][3][4]. While for typical GAS applications the goal is to completely precipitate the solute in a desired size, crystal habit and morphology, gas antisolvent fractionation (GASF) is a combination of a precipitation and an extraction step [5,6]. The GASF technique can be efficiently used in diastereomeric salt based optical resolutions [7][8][9][10][11] and in the purification of scalemic mixtures as well [7,12,13]. However, the efficiency of GASF is influenced by various operational parameters. The most important ones are the pressure, the temperature, and the concentrations of the solutes, the solvent and the CO2. Furthermore, their effects are often non-linear. Our goal was to find a better way to optimize optical resolutions with GASF than evaluating the effect of all these parameters individually.
The efficiency of the GASF optical resolution method can be characterized by the selectivity (S) of the process. Selectivity is calculated as the product of yield and diastereomeric purity. To describe selectivity, pressure (p), temperature (T) and the Hansen solubility parameter (HSP, δ) are taken into account. HSP is an attribute that describes how likely one material is to dissolve in another to form a solution. Materials with similar HSP mix well with each other. Besides the nature of the components, HSP also depends on temperature and pressure. The formulas for calculating the HSP of pure materials can be found in the literature [14], including those for carbon dioxide and organic solvents [15] and their mixtures [16]. The parameter optimization difficulties of supercritical antisolvent precipitation (or related) techniques have already been addressed using (Hansen) solubility parameters. However, only limited correlations were found [17][18][19][20].
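The definition of selectivity given above can be written as a one-line calculation. The sketch below assumes that yield and diastereomeric purity are both expressed as fractions between 0 and 1; the function name and the example values are illustrative and are not taken from the experiments.

```python
def selectivity(yield_fraction: float, diastereomeric_purity: float) -> float:
    """Selectivity S as the product of yield and diastereomeric purity.

    Both inputs are assumed to be fractions between 0 and 1.
    """
    for name, value in (("yield_fraction", yield_fraction),
                        ("diastereomeric_purity", diastereomeric_purity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return yield_fraction * diastereomeric_purity


# Hypothetical run: 60 % yield with 80 % diastereomeric purity.
s_example = selectivity(0.60, 0.80)
print(f"S = {s_example:.2f}")
```

With this convention S also lies between 0 and 1, and it reaches 1 only when both the yield and the purity are complete.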
Although HSP itself is a function of the conditions of the operating system (pressure, temperature, and the nature and ratio of the components of the solvent mixture), it cannot be expected that the selectivity could be described by this parameter alone, because optical resolutions are not merely solubility controlled. That is the reason why we also took temperature and pressure into account independently in our model. Evaluations of two experimental datasets are presented in this paper. The first belongs to the antisolvent process performed with methanol as the organic solvent. The second includes data obtained in the experiments performed with ethanol. A designed experiment was used with ethanol but not with methanol, because the methanol experiments were conducted without statistical support.

Methodology of model building
Two stages of model building are considered in this paper:
• model selection,
• residual analysis.
In the model selection phase, the aim is to find a model that suitably describes and predicts the behavior of the observed system while remaining as simple as possible. After a candidate model is selected, residual analysis is used to check the assumptions of the regression method, to verify the adequacy of the selected model and to identify outlier data.

Model selection
There are two basic philosophies behind model selection. First, one would want the model to describe the given data as well as possible and to have as high a predictive accuracy as possible. This can be achieved by adding more and more parameters and different functions of the parameters (quadratic terms, cross-products, logarithms, etc.) to the model. On the other hand, one would want the model to be simple, with as few terms as possible. The reason for the latter is that a simpler model is easier to handle and to calculate with, and it makes the behavior of the observed system easier to understand [21]. Also, as Occam's razor (i.e. the law of parsimony) states, the simplest solution tends to be the correct one, therefore it is preferable to choose the simplest one.
There is no perfect model that could be selected in every situation. As Box and Draper [22] describe: "The most that can be expected from any model is that it can supply a useful approximation to reality: All models are wrong; some models are useful". It is the task of the experimenter to choose an appropriate model in view of the above-mentioned philosophies.
There are different statistical methods that can be used to choose the most suitable model [21]. In this paper, the best subset regression procedure is used. Other methods such as stepwise regression and backward elimination could also be used; however, they are more constrained and give less freedom in choosing from candidate models. The best subset method gives information about the goodness of fit of all possible models that can be built from the candidate terms. Afterwards the preferable model can be chosen by considering the goodness of fit (i.e. descriptive power) and the number of terms in the model. There are different indicators to describe the goodness of fit [23]; in this paper the adjusted R² is applied.
The higher the adjusted R², the higher the descriptive power of the model. However, adjusted R² is a random variable, so a higher value may be observed only by chance when there is no true difference in expected values. Two models with adjusted R² values relatively close to each other are worth comparing based on their residual variances. A significantly lower residual variance indicates better goodness of fit (a significantly higher adjusted R²). The statistical comparison of variances is carried out with an F-test.
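The best subset procedure with adjusted R² as the ranking criterion can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data are synthetic, the factors are in coded units (-1, 0, 1), and ordinary least squares is solved through the normal equations with a small hand-written solver so that the block needs no external library. It is not the routine of the statistical software used by the authors.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Fit y on an intercept plus the given columns; return adjusted R^2."""
    n, k = len(y), len(columns)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k + 1)]
           for a in range(k + 1)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_mean = sum(y) / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


def best_subsets(terms, y):
    """Rank every non-empty subset of candidate terms by adjusted R^2."""
    results = []
    for size in range(1, len(terms) + 1):
        for names in combinations(terms, size):
            results.append((adjusted_r2([terms[t] for t in names], y), names))
    return sorted(results, reverse=True)


# Synthetic, hypothetical data in coded units (not the measured values).
p = [-1, 0, 1, -1, 0, 1, -1, 0, 1, -1, 0, 1]
T = [-1, -1, -1, 0, 0, 0, 1, 1, 1, -1, 0, 1]
d = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1]
noise = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01,
         0.005, -0.015, 0.01, -0.01, 0.0, 0.005]
y = [0.5 + 0.05 * pi - 0.03 * ti + 0.2 * di + ei
     for pi, ti, di, ei in zip(p, T, d, noise)]
terms = {
    "p": p, "T": T, "delta": d,
    "p^2": [v * v for v in p], "T^2": [v * v for v in T],
    "p*T": [a * b for a, b in zip(p, T)],
}
ranking = best_subsets(terms, y)
for score, names in ranking[:3]:
    print(f"{score:.4f}  {' + '.join(names)}")
```

With six candidate terms the procedure evaluates 63 models. The ranking only orders them; the choice between a slightly better fit and fewer terms remains with the experimenter, as discussed in the text.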
A more sensitive test to compare models is based on the extra sum of squares [24]. The extra sum of squares measures the reduction in the error sum of squares of a given model when terms are added to it. An F-test is used to test the null hypothesis that the expected value of the extra sum of squares is zero, i.e. adding these terms to the model does not increase its descriptive power significantly. The test statistic is calculated by Eq. (1):

$$F_0 = \frac{(SS_q - SS_r)/(r - q)}{SS_r/(n - r - 1)} \tag{1}$$

where SS_q is the error sum of squares of the basic model, q is the number of terms in that model, SS_r is the error sum of squares of the model with the added terms, r is the total number of terms in that model and n is the number of data points. If the test statistic F_0 does not exceed the critical F-value at the 5 % significance level with (r − q) and (n − r − 1) degrees of freedom, the null hypothesis is accepted, i.e. there is no significant difference between the two models.

Residual analysis

Checking assumptions
The assumptions of the regression method have to be fulfilled, otherwise the estimates of the regression parameters might be biased. In cases where the assumptions are not fulfilled, other statistical regression methods (e.g. generalized linear regression, nonparametric regression) are to be used. The assumptions to be tested are the normality of the residuals, their constant variance and their independence. Fulfilled assumptions are indicators of a proper model, therefore checking them also supports the verification of the model. The methods for checking the assumptions are not discussed in this paper but are demonstrated in the data evaluation section.

Outlier data detection
Identifying and removing outlier data is important, as they may distort the model estimation. A single point can be an outlier in two ways: in the space of the dependent variable or in the space of the independent variables.
Plotting studentized residuals against predicted values [25] is a way to identify potential outliers in the space of the dependent variable. The points should scatter around zero, approximately within a band of constant width. Statistical software usually provides the studentized residuals, thus they can be easily plotted. Points with absolute values higher than 2.5 should be checked more closely, as there is a great chance that they are outliers.
Mahalanobis distance (D_i) is used to identify data that are outliers in the space of the independent variables. The D_i values are usually provided by statistical programs (the calculation of D_i can be found for example in [26]). Plotting D_i against predicted values makes the examination easier through visualization. Points that are far from the rest may be considered outliers. A table of critical values for D_i can be found in [27]. A point with D_i higher than the critical value is considered an outlier in the space of the independent variables and should be removed from the dataset.
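As an illustration of how such distances arise, the sketch below computes squared Mahalanobis distances from the sample centroid for two independent variables only, which keeps the covariance inverse explicit. The settings are hypothetical (pressure in MPa, temperature in °C) and are not the experimental values. Whether a given program reports the squared or the unsquared distance is an assumption to be checked in its documentation, and the critical value has to be taken from a table such as the one in [27].

```python
def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample centroid."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (divisor n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the explicit inverse of the 2x2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


# Hypothetical (pressure, temperature) settings; the last one lies far away.
settings = [(10, 35), (12, 40), (14, 45), (16, 35),
            (18, 40), (20, 45), (15, 40), (30, 60)]
d2 = mahalanobis_sq(settings)
most_remote = max(range(len(settings)), key=lambda i: d2[i])
print(f"most remote setting: {settings[most_remote]}, D^2 = {d2[most_remote]:.2f}")
```

A useful check on such a calculation is that the squared distances sum to (n − 1) times the number of variables when the sample covariance matrix is used.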
Cook's distance (C_i) measures the combined influence of a point being an outlier in the space of the dependent variable and in the space of the independent variables. The parameters of the fitted model change considerably when a data point with high C_i (formulas for the calculation can be found in [28]) is omitted from the data set [29]. Cook suggests that data with C_i higher than 1 should be flagged as outliers and that omitting them should be considered [30].
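The two residual-based diagnostics can be illustrated together. The sketch below is a simplification: it uses a regression on a single predictor, for which the leverage has a closed form, whereas the models in this paper have several predictors and the values are taken from statistical software. The data are hypothetical, and internally studentized residuals are assumed.

```python
import math


def influence_diagnostics(x, y):
    """Internally studentized residuals and Cook's distances for y = b0 + b1*x."""
    n = len(x)
    n_params = 2  # intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - n_params)  # residual variance
    rows = []
    for xi, e in zip(x, residuals):
        leverage = 1.0 / n + (xi - x_mean) ** 2 / sxx
        r = e / math.sqrt(s2 * (1.0 - leverage))
        cook = r ** 2 / n_params * leverage / (1.0 - leverage)
        rows.append((r, cook))
    return rows


# Hypothetical data: the last point is remote in x and off the trend.
x = [1, 2, 3, 4, 5, 6, 12]
y = [0.10, 0.21, 0.29, 0.41, 0.50, 0.60, 0.70]
diagnostics = influence_diagnostics(x, y)
for i, (r, cook) in enumerate(diagnostics, start=1):
    flag = "check" if abs(r) > 2.5 or cook > 1 else ""
    print(f"point {i}: studentized residual {r:6.2f}, Cook's distance {cook:6.2f} {flag}")
```

In this constructed example the remote point stays below 2.5 in absolute studentized residual but exceeds 1 in Cook's distance, which shows why the diagnostics are examined together rather than one at a time.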

Methodology of designed experiments
Design of experiments (DOE) is a statistical tool for constructing an experiment with the smallest number of runs needed to fit a given model. The structure of an experiment designed by DOE can be made orthogonal, which means that the parameters of the fitted model are estimated independently of each other. This results in a less uncertain model estimation than that of non-designed experiments [31].
Because the solubility parameter itself depends on the other two controlled parameters (temperature and pressure), choosing the settings in a designed experiment is not straightforward. Three parameters define HSP: the carbon dioxide to organic solvent ratio, pressure and temperature. In order to tune HSP at any given combination of pressure and temperature, the ratio of the solvent mixture has to be set. However, this ratio is constrained by the solubility of ibuprofen and of the diastereomeric salts in the organic solvent. As the experiments were designed with constant apparent concentrations of both the ibuprofen and the salts in the GASF crystallizer, decreasing the amount of the organic solvent was limited by the solubility of the components in it. Increasing the amount of the organic solvent, which increases the solubility of all components of interest, was limited as well, since no fractionation is possible without precipitation of a diastereomeric salt. In practice, due to this constraint, the HSP cannot be set to constant maximum and minimum values during the experiment as would be desired in an orthogonal design. This means that full orthogonality cannot be achieved, only approximated. However, other advantages over undesigned experiments, such as a smaller variance of the estimates, can be gained by designing the experiment.
In our case there are three parameters, namely pressure, temperature and HSP. The expectation of the authors is that the selectivity of the antisolvent process can be described mostly by the linear effect of HSP, while temperature and pressure are less important factors and take part in the model only as correction terms. The effects that are evaluated are the linear effects of each parameter, the quadratic effects of temperature and pressure and the interaction of temperature and pressure. To estimate these effects a 2 × 3² design would be desired, where HSP would be set at 2 levels while temperature and pressure would be set at 3 levels during the experiments. However, since the setting of HSP is constrained as discussed above, an orthogonal 2 × 3² design cannot be constructed. Designs with other advantages over an undesigned experiment can nevertheless be constructed. The frequently used D-optimal design [32] minimizes the volume of the joint confidence region of the estimated model parameters, i.e. minimizes the uncertainty of the estimates. TIBCO Statistica software was used to create the D-optimal design for the experiments in ethanol with the Sequential algorithm [33].
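The idea of selecting runs from a candidate set so that the determinant of the information matrix X'X is as large as possible can be sketched as follows. This is a generic greedy illustration, not the Sequential algorithm of the software named above. The candidate points are the 18 points of an unconstrained 2 × 3² grid in coded units, which is an assumption made for brevity; in the constrained problem the practically achievable δ values would take their place. The small ridge term is a numerical device that keeps the determinant non-zero before enough runs have been chosen.

```python
from itertools import product


def model_row(p, t, d):
    """Model terms: intercept, p, T, delta, p^2, T^2 and p*T (coded units)."""
    return [1.0, p, t, d, p * p, t * t, p * t]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return det


def information_matrix(rows, ridge=0.0):
    """X'X for the chosen runs, optionally with a small ridge on the diagonal."""
    k = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) + (ridge if a == b else 0.0)
             for b in range(k)] for a in range(k)]


def sequential_d_optimal(candidates, n_runs, ridge=1e-6):
    """Greedily add the candidate that most increases det(X'X + ridge*I)."""
    chosen = []
    for _ in range(n_runs):
        best = max(candidates, key=lambda c: determinant(
            information_matrix([model_row(*x) for x in chosen + [c]], ridge)))
        chosen.append(best)  # a candidate may be chosen more than once
    return chosen


# Candidate set: (p, T, delta) on a 3 x 3 x 2 grid in coded units (assumption).
candidates = [(p, t, d) for d, p, t in
              product((-1.0, 1.0), (-1.0, 0.0, 1.0), (-1.0, 0.0, 1.0))]
design = sequential_d_optimal(candidates, n_runs=14)
det_design = determinant(information_matrix([model_row(*x) for x in design]))
print(f"{len(design)} runs selected, det(X'X) = {det_design:.1f}")
```

Because candidates may be selected repeatedly, such a procedure can return replicated settings. A greedy pass does not guarantee the global optimum; exchange-type algorithms are usually applied afterwards to improve the design.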

Data evaluation

Experiment with methanol as the solvent
The dataset contains 34 data points obtained in these experiments. The exact experimental methodology can be found in our previous paper [34]. The pressure, temperature and Hansen solubility parameter were varied in the experiments in a non-designed structure. The aim is to find an appropriate model to describe selectivity (S) as a function of temperature (T), pressure (p) and the Hansen parameter (δ). The data can be found in Table 1.

Model selection
The following effects were considered in the candidate models: the linear effects of temperature, pressure and Hansen parameter, the quadratic effects of temperature and pressure and the interaction of temperature and pressure. The choice of the candidate effects is arbitrary and based on the preferences and expectations of the experimenters. If the data cannot be described appropriately with a model including these effects, other effects should be considered.
The 10 best models (based on adjusted R²) obtained from the best subset method can be seen in Table 2. Every row belongs to a certain model and the + signs mark the terms that are included in that model. Based on the discussion of model selection above, a higher adjusted R² and a smaller number of effects are desired. Therefore models #1 and #5 were chosen for further comparison.
The former was chosen because of its high adjusted R² value and the latter because of its simplicity (and reasonably high R² value). Model #1 has a residual variance of 0.003276 with 29 degrees of freedom, while model #5 has a residual variance of 0.003584 with 30 degrees of freedom (not shown in the table). As the two R² values are relatively close to each other, a comparison is desirable. To compare the two models, the null hypothesis that the extra sum of squares is zero is tested. The F_0 calculated by Eq. (1) is 3.82. The critical F-value at the 5 % significance level with 1 and 29 degrees of freedom is 4.18.
As F_0 is smaller than the critical F-value, the null hypothesis is accepted: there is no significant difference in the descriptive power of the two models. This result supports the choice of the simpler model with only linear terms in it, namely p, T and the Hansen solubility parameter (model #5).
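The reported value of F_0 can be reproduced from the numbers given above. In the sketch below the error sums of squares are rebuilt as residual variance times residual degrees of freedom. The term counts q = 3 and r = 4 are inferred from the reported degrees of freedom (30 = 34 − 3 − 1 and 29 = 34 − 4 − 1) and are therefore an assumption, as is the function name.

```python
def extra_ss_f(ss_q, q, ss_r, r, n):
    """F statistic for adding (r - q) terms to a model with q terms."""
    return ((ss_q - ss_r) / (r - q)) / (ss_r / (n - r - 1))


n = 34  # data points in the methanol set

# Error sums of squares rebuilt from the reported residual variances:
# SS = residual variance * residual degrees of freedom.
ss_model5 = 0.003584 * 30  # simpler model, q = 3 terms
ss_model1 = 0.003276 * 29  # larger model, r = 4 terms (inferred)

f0 = extra_ss_f(ss_model5, 3, ss_model1, 4, n)
f_critical = 4.18  # value quoted in the text for 1 and 29 degrees of freedom
print(f"F0 = {f0:.2f}, reject H0: {f0 > f_critical}")
# prints: F0 = 3.82, reject H0: False
```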

Residual analysis
Residual analysis was carried out on the selected model in order to detect outlier data and to check the assumptions of regression analysis and the adequacy of the model. Normality of the residuals was examined with a normal probability plot (Fig. 1). The data fit well to the red line representing the normal distribution and no deviation can be seen. The Shapiro-Wilk test [35] supports the normality of the residuals (p = 0.98).
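The construction behind a normal probability plot can be sketched with the standard library alone. The residuals below are hypothetical, and the plotting positions (i − 0.5)/n are one common convention among several. The correlation of the plotted points is used here only as a rough numerical summary of straightness; it is not the Shapiro-Wilk test and yields no p-value.

```python
from statistics import NormalDist, mean


def normal_plot_points(residuals):
    """Pair sorted residuals with theoretical standard normal quantiles."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i - 0.5) / n; other conventions exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def correlation(pairs):
    """Pearson correlation of the plotted points; near 1 suggests normality."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical residuals, not those of the fitted model.
residuals = [-0.08, 0.05, -0.03, 0.02, -0.01, 0.0, 0.01, -0.02, 0.04, -0.05, 0.07]
points = normal_plot_points(residuals)
print(f"probability plot correlation: {correlation(points):.3f}")
```

Points that fall close to a straight line in such a plot, as described for Fig. 1, give a correlation close to 1.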
In order to detect outlier data, studentized residuals, Mahalanobis distances and Cook's distances were examined.
Fig. 2 shows the studentized residuals against the predicted values. Though points #6 and #15 are somewhat extreme compared to the rest, their values are lower than 2.5 and there is no reason to flag these points as outliers based on this plot. A potential anomaly can be seen in the plot, as if the residuals followed a curved trend. This pattern might suggest the need for a quadratic term in the model. However, the trend is not pronounced enough to be convincing, so it need not give rise to concern.
Fig. 3 shows the Mahalanobis distances against the predicted values. Based on the table of critical values of Mahalanobis distances [27], point #6 is an outlier in the space of the independent variables at the 5 % significance level.
To confirm the outlier nature of this point, Cook's distances are examined. Fig. 4 shows the Cook's distances against the predicted values. The Cook's distance of point #6 is evidently higher than those of the other points and also higher than 1, which is the limit suggested in [30] for flagging a point as an outlier. Based on the above observations, point #6 is flagged as an outlier and dropped from the dataset. It is not evident whether the point is an outlier only in the space of the independent variables or in the space of both the independent variables and the dependent variable. Additional experiments could be conducted with settings close to those of point #6 in order to test its outlier nature in the space of the dependent variable. These additional experiments would also make point #6 less extreme in the space of the independent variables.
As point #6 is dropped from the initial data set, the evaluation of the remaining data should be carried out from the beginning. The same model was chosen in the model selection phase as before, the assumptions of the analysis and the model adequacy were verified by residual analysis, and no further outlier data were detected.

Fitted model
The selectivity of the GASF optical resolution of ibuprofen from methanol can be described by the temperature, the pressure and the Hansen solubility parameter. The fitted model is presented in Eq. (2).
where p is pressure (MPa), T is temperature (°C), δ is the Hansen solubility parameter (MPa^0.5) and S is selectivity (-). The coefficient of HSP is much greater than those of the other two parameters, while the scales of the parameters are within the same order of magnitude; this reinforces the expectation of the authors that HSP has the greatest effect on the selectivity.

Experiments with ethanol as the solvent
The second set of experiments was conducted in order to evaluate the process with ethanol as the solvent in a designed experiment. Although the response showed only linear correlation with the parameters in the experiment with methanol, a design was constructed that allows evaluation of the quadratic effects of pressure and temperature as well. This experiment would contain 18 points (2 × 3²) in an orthogonal design. The difficulty is that the design space of the experiments is constrained, as discussed in Section 3. The settings of the Hansen parameter were the closest practically achievable to those of an orthogonal 2 × 3² design.
Table 3 shows the settings of the experiments. The δ* column represents the settings as if the design space were not constrained, while the δ column represents the HSP values that are practically achievable. In addition, the experimenters preferred to reduce the number of experiments to 14. The D-optimal design method was used to construct a design with 14 points (chosen from among the 18 practically achievable points of the 2 × 3² design) that allows evaluating the quadratic effects of temperature and pressure as well as the linear effects of the parameters and the cross-product of temperature and pressure.
If the descriptive power of the model is found to be low when only these effects are considered, the experimenter should include other effects. However, including other effects might require completing the designed experiment with more runs. The situation is special here, as the design space is restricted and the design is not orthogonal. This results in points (Table 3, the points with HSP around 13.0) that can be used to evaluate the quadratic effect of HSP as well. The quadratic effect is not considered in this paper, as it was found to be non-significant, as the experimenters expected.
The coloring in Table 3 marks the chosen points. Experiments #3, #8 and #17 were each selected twice by the optimizing algorithm in order to attain D-optimality. The experimental results can be found in Table 4. There are varying differences between the Hansen values defined in the designed experiment (Table 3, δ column) and those set in the experiments (Table 4, δ column). The reason is that it is hard to set the parameter to an exact value, therefore the values achieved in the experiments may differ from the ones defined in the design. The actual values set during the experiments were recorded and the calculations were performed with them.

Model selection
The simplest model, which is linear in all the parameters, is more than appropriate to describe the data, so comparison with other models is not needed. The adjusted R² is 0.99 while the residual variance is 0.000526. Note that this residual variance is markedly smaller than that in the experiments with methanol as the solvent.

Residual analysis
Residual analysis was carried out on the selected model; the details are not shown here. The assumptions of the regression analysis were verified and there was no sign of the model being inadequate. No data point was found to be an outlier.

Fitted model
The selectivity of the antisolvent fractionation of ibuprofen from ethanol can be described by the temperature, the pressure and the Hansen solubility parameter. The fitted model is given in Eq. (3), where p is pressure (MPa), T is temperature (°C), δ is the Hansen solubility parameter (MPa^0.5) and S is selectivity (-). It can be stated here as well that HSP has the greatest effect on the response, S.

Comparison of the estimates of the two models
It can be seen in Table 5 that the estimates of the parameters in the two models are quite similar. Further analysis could be carried out to test whether the two datasets can be described by a single model; however, this is not the focus of this paper. The standard errors of the estimates are functions of the residual variance, the number of data points and the structure of the design of the experiment. A smaller residual variance, a greater number of data points and the designed nature of the experiments all decrease the standard errors of the estimates.
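This dependence follows from the standard ordinary least squares result, restated here in generic notation that is not used elsewhere in the paper: X denotes the model matrix of the experiment, s² the residual variance and b the vector of estimated parameters.

```latex
\widehat{\operatorname{Cov}}(\mathbf{b}) = s^{2}\,(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1},
\qquad
\operatorname{se}(b_j) = \sqrt{s^{2}\,\bigl[(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\bigr]_{jj}}
```

The residual variance enters through s², while the number of data points and the structure of the design enter through X'X. A D-optimal design maximizes the determinant of X'X, which is equivalent to minimizing the volume of the joint confidence region of the estimates.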
Although the designed ethanol experiment contained fewer than half as many data points as the undesigned methanol experiment, the standard errors of the parameters of the ethanol model are much smaller. This is the effect of the much smaller residual variance in the ethanol experiments and of the designed nature of the experiment. Moreover, the probable reason for the smaller residual variance is the D-optimal property of the design. The 14 experiments with ethanol were performed consecutively, within a relatively short time, in a designed structure. In contrast, the previous experiments with methanol were performed less consecutively, over a longer time scale and without the use of DOE. The discontinuity of the experiments gave random effects (the effect of the day, for example) the opportunity to arise and to increase the variance of the process significantly, making the residual variance greater.

Conclusion
Evaluation and model building using statistical methods were presented in this paper for GASF experiments with methanol and ethanol as organic solvents. A proper model was found to describe the selectivity of the GASF method as a function of the operating parameters, namely pressure, temperature and the Hansen solubility parameter. The model shows that HSP has the greatest effect on selectivity. We are not aware of any other study in the literature that focuses on modelling the GASF process, especially not one that includes the Hansen solubility parameter. This quite simple model can easily be used to predict the selectivity of the process at given operating conditions. It can also be used to find the optimal operating parameters at which the selectivity is the highest. Furthermore, it was demonstrated that optimal design is a valuable tool in research with a constrained design space, such as the supercritical antisolvent method. The design not only reduces the number of experiments required to build a certain model but also reduces the uncertainty of the model estimates.

Table 1
Data of experiments with methanol

Table 2
Candidate models for further comparison

Table 3
Design of experiment

Table 4
Data of the experiments with ethanol

Table 5
Parameter estimates and their standard errors