Statistical Evaluation of 4-ethylphenol and 4-ethylguaiacol Concentrations to Support Sensory Evaluation of "Brett Character" of Wines: A Proposed Threshold

Analytical data of 260 sensory-evaluated wine samples were statistically examined. All samples had been classified by at least one taster of the five-member jury as having "Brett character". The wines were finally judged as "Good", as having "Other defects" or as having "Brett character". 4-ethylphenol (4-EP) and 4-ethylguaiacol (4-EG) concentrations showed a different distribution for the "Brett character" group, while the other two groups could not be distinguished from each other. Threshold concentrations for 4-EP, 4-EG and their sum (4-EP + 4-EG) were calculated to classify wine samples as "non-Brett" or "Brett character". 4-EP concentrations were found to be the most reliable markers, with a lower threshold of 245 μg/l and an upper threshold of 968 μg/l. Below this range a sample can reliably be classified as "non-Brett" and above it as "Brett character", while within this range only sensory evaluation can distinguish the two characteristics. Other tested classical analytical parameters did not show significant differences between these groups, except for free SO2, which was found to be lower in the "Brett character" group, stressing the importance of sulphiting as a tool in the fight against Brettanomyces.


Introduction
The "Brett character" of certain, mainly red, wines is a well-known problem for the wine industry. This term refers to an aroma complex described as "horsy", "leather", "medicinal", "smoky", "barnyard", "animal" etc., which has a very negative impact on the perception of wine quality. The compounds responsible for this are 4-ethylphenol (4-EP) and 4-ethylguaiacol (4-EG). These can be formed by the yeast genus Brettanomyces (or its perfect form Dekkera) from precursors (p-coumaric acid and ferulic acid) present in all grape musts, especially in those of red wines [1][2][3][4]. Brettanomyces, naturally present on the grape berry skin, can spoil wines at many phases of wine production, generally proliferating after alcoholic and/or malolactic fermentation. There is now a good consensus about the possible risk factors of Brettanomyces spoilage and about measures to tackle this problem [5][6][7][8]. Measures include sulphiting in the pre-fermentative phase, high or low temperature maceration, well-controlled alcoholic fermentation with minimal nutrient supply, use of starter cultures at the malolactic fermentation phase, adding SO2 and keeping molecular SO2 between 0.5 and 0.8 mg/l during maturation, systematic racking, fining [9,10] and filtration, and avoiding oxygen dissolution. A high-risk step can be the use of wooden barrels, since these can be deeply penetrated by Brettanomyces and there is practically no suitable method to completely disinfect an infected barrel [11,12]. However, Brettanomyces can even spoil wines after bottling, since it can grow in an anaerobic environment and can use ethanol as a carbon source. Lightly filtered wines and wines containing low levels of SO2 are especially prone to post-bottling Brettanomyces spoilage. Recent trends to limit the use of SO2 and to use "organic" wine making practices make the fight against Brettanomyces even more difficult [13]. Consequently, the "Brett character" is not only a current problem that can be avoided by careful production practices but remains a long-term problem for the industry.
While 4-EP and 4-EG can be found at some concentration in every wine, the definition and evaluation of the "Brett character" is an often debated issue. Once recognized, this character is clearly considered an off-flavor in wine. Reported threshold values range from 230 to 650 μg/L for 4-EP and from 33 to 135 μg/L for 4-EG [4][14][15][16], but there is a significant difference between the detection thresholds of these compounds and the overall judgement of the wine in question.
In complex matrices such as wine, aroma components interfere with each other. While certain results showed that both sub- and supraliminal concentrations of 4-EP could interfere with the perception of fruity notes [17], other research has demonstrated that isobutyric and isovaleric acids have a masking effect on the detection of ethylphenols [18], and that ethanol, polyphenols and even the 4-EP / 4-EG ratio have a significant influence on the olfactory perception of the "Brett character" [16]. Further research has shown that not only the wine itself but also the tasters' expertise, age and qualifications have an impact on the assessment of the "Brett character". These tests revealed a significant effect of academic degree and profession on the detection skills of experts. Winemakers and trained tasters apparently had greater detection capacities independently of their age, while winegrowers, and young and old tasters, showed a significant tendency to fail to identify the defect in samples containing ethylphenols [19]. While confirming the effect of wine knowledge on wine evaluations, another recent study has shown that consumers differentiate samples best between 500 and 1000 μg/L 4-EP, which corresponds to the reported detection threshold ranges cited above [20].
In spite of all uncertainties related to analytical results and their perception, sensory analysis of wine is a very important tool used at all levels of the wine industry, including food safety and market control authorities (OIV, 2015) [21]. For authorities, objectivity is a must, so it is important to establish the closest possible connection between analytical data and olfactory evaluation of wine defects. The aim of the present study is to determine analytical threshold values to support sensory evaluation of the "Brett character" of wines.

Materials and methods

Wine samples
In Hungary all wines must be submitted to the Directorate of Oenology and Alcoholic Beverages of the Hungarian National Food Chain Safety Office for marketing authorization before they are placed on the market. The laboratory tests approximately 15 000 samples per year, including those taken in the course of random sampling of the market. All samples go through basic laboratory tests. All received, producer-sealed bottles undergo a sensory evaluation for any obvious or suspected defect by a five-member jury. All bottles for which any of the evaluators indicated a sensory defect are also analyzed for certain defect-related aroma components. The present study analyzes the data of those 260 wines that were submitted for authorization between 1st January and 31st December 2017 and whose sensory evaluation indicated the need for a more comprehensive analysis (i.e. at least one evaluator found a sensory defect). Most of these wines were red (219), while 6 rosé and 35 white wines also fell in this category. Fig. 1 shows the varieties of the tested wines.

Analytical methods
Sensory analysis was carried out according to Hungarian Standard MSZ 9462:2016 "Guidelines for sensory analysis of wines" by five-member juries of specifically trained and experienced staff members of the authority, randomly selected from a pool of 12 experts with at least 3 years of experience in the sensory evaluation of wine. Juries were set up on a daily basis. The standard covers testing of the sensory ability of the members before each session. When a wine is rejected, the jury must give a detailed verbal explanation, using standard terms. The term "animal" was used to describe the "Brett character".
4-ethylphenol, 4-ethylguaiacol and 4-vinylphenol were determined using high-performance liquid chromatography with a fluorimetric detector [22] as follows. The wine sample was filtered through a 0.45 μm PTFE syringe cartridge and directly transferred into a 2 ml glass screw-top vial. Analysis was carried out by HPLC (Shimadzu LC-10AD VP) equipped with a fluorimetric detector (excitation at 225 nm; emission at 320 nm). Isocratic separation was used (50 mM NaH2PO4 buffer adjusted to pH 3.40 with [...]). The temperature of the column was 25 °C and the flow rate 1.5 ml/min. The injection volume was 30 μl. The analysis time was 15 min. Ethanol was determined by Alcolyzer-IR:R2001. Total sugar content, total and volatile acidity, and total and free SO2 were determined according to the international methods for wine and must analysis published by the Organisation Internationale de la Vigne et du Vin.

Statistical methods
The statistical evaluation of the datasets was carried out with TIBCO Statistica 13.4 software. To examine whether groups of data come from different distributions, the Kruskal-Wallis test was used. The null hypothesis of this non-parametric test is that the distribution functions of the populations are identical. If the null hypothesis is rejected, it can be concluded that at least one of the distributions deviates from the others. Post-hoc comparisons make it possible to detect which pairs of groups do not belong to the same distribution. To decrease the false positive error rate, a Bonferroni adjustment of the p-values can be used [23]. The selected significance level is 0.05 (family-wise error rate).
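
As an illustration of the procedure described above, the following minimal Python sketch computes the Kruskal-Wallis H statistic (with tie correction) and Bonferroni-adjusted pairwise comparisons using only the standard library. It is not the Statistica implementation used in this study: the function names, the example concentrations and the choice of pairwise Kruskal-Wallis tests as the post-hoc procedure are assumptions made for illustration, and the chi-square p-value is evaluated in closed form only for one or two degrees of freedom (i.e. two or three groups).

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf(h, df):
    """Upper tail of the chi-square distribution (closed form, df = 1 or 2)."""
    if df == 1:
        return math.erfc(math.sqrt(h / 2.0))
    if df == 2:
        return math.exp(-h / 2.0)
    raise ValueError("this sketch only supports two or three groups")


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples, using the chi-square approximation."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    tie_counts = {}
    for x in pooled:
        tie_counts[x] = tie_counts.get(x, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    h = max(h / correction, 0.0)  # guard against tiny negative rounding errors
    return h, chi2_sf(h, len(groups) - 1)


def pairwise_bonferroni(named_groups):
    """Pairwise tests with p-values multiplied by the number of comparisons."""
    pairs = list(combinations(named_groups, 2))
    adjusted = {}
    for a, b in pairs:
        _, p = kruskal_wallis([named_groups[a], named_groups[b]])
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return adjusted


# Hypothetical 4-EP concentrations (µg/l), NOT data from this study.
samples = {
    "Good": [120.0, 250.0, 310.0, 180.0, 90.0],
    "Other defects": [200.0, 150.0, 330.0, 270.0, 110.0],
    "Brett character": [1100.0, 1450.0, 980.0, 2100.0, 760.0],
}
h, p = kruskal_wallis(list(samples.values()))
adjusted = pairwise_bonferroni(samples)
print(f"H = {h:.2f}, p = {p:.4f}")
for pair, p_adj in adjusted.items():
    print(pair, f"adjusted p = {p_adj:.4f}")
```

With these invented values, the overall test rejects the null hypothesis and only the pairs involving the "Brett character" group remain significant after adjustment, which mirrors the pattern of the decision logic described in the text.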

Results and discussion
Sensory evaluation classified the 260 tested wines either as "good" (132), as having "Brett character" (97) or as having "other olfactory defects" (31). In the following discussion these three categories are referred to as "Good", "Brett character" and "Other defects". It is interesting that 12 of the white wine samples and 2 rosé wines were also found to have a "Brett character", while 7 of the white wine samples had other olfactory defects. In the following analysis, due to the statistically insufficient number of white and rosé wine samples, these are not distinguished from the red wine samples.

4-EP
First, extreme values and outliers were identified. There are several graphical techniques which can help to make outliers or extreme values visible (e.g. box plot, normal probability plot, histogram). Based on these, 2 suspicious values were detected in the "Brett character" group. The Grubbs test [24] verified that the 6500 and 4250 μg/l values, according to their probability of occurrence, are outliers. For this reason, these two data points were omitted from further evaluation. Fig. 2 shows the histogram of 4-EP concentrations of the three distinguished groups of samples. Analyzing the dataset with the Kruskal-Wallis test, the null hypothesis is rejected (p << 0.05). This means that, based on this dataset, at least one of the distributions differs significantly from the others.
To identify which group or groups are significantly different, post-hoc comparisons were conducted. Based on the adjusted p-values, the distribution of the "Brett character" category deviates from that of the other two groups (p << 0.05). For the categories "Good" and "Other defects", however, it could not be shown that these samples come from separate distributions.
As expected, many wines with "Brett character" have a higher 4-EP concentration than those qualified as good or as having different olfactory defects. At the same time, the reliability of the sensory evaluation is shown by the fact that the distributions of the "Good" and "Other defects" wines overlap, with very close expected values. It is also expected that there is no evident, clear border between wines with and without "Brett character", since many other factors interfere with the human perception of this defect, as discussed above [16][17][18].
In agreement with the findings of a former study [20], sensory evaluation could distinguish "Brett character" in the 500-1000 µg/l concentration range (Fig. 3). There were only two samples with a 4-EP concentration above 1000 µg/l (1023 and 1036 µg/l) in the "Good" category, while 70 % of "Brett character" samples had a 4-EP concentration above 1000 µg/l. It is important to note that one sample was qualified as "Brett character" although both its 4-EP and 4-EG concentrations were below the detection limit. This white wine sample had, however, an extreme 4-vinylphenol level (5104 µg/l), which is usually qualified as a medicinal or phenolic off-flavour.

Fig. 2 The histogram of the three categories of the 4-EP concentration

4-EG
Fig. 4 shows the histogram of 4-EG concentrations of the three distinguished groups of samples. Analyzing this dataset with the Kruskal-Wallis test, a similar conclusion could be drawn as in the case of 4-EP. Post-hoc comparisons with adjusted p-values demonstrated that the distribution of the "Brett character" category is significantly different from the other two groups (p << 0.05). For the categories "Good" and "Other defects", however, it could not be shown that these samples come from separate distributions. In addition, the distributions of the three groups are right-skewed (not normal).
As 4-EP and 4-EG are the main indicators of "Brett character", 4-EG data also show a possible separation of "Brett character" wines from any samples that are finally not qualified as such. It is also clear that other sensory defects do not interfere with the perception of "Brett character", since the "Good" and "Other defects" categories cannot be distinguished from each other. However, the separation of the categories by the concentration of 4-EG is not as clear as in the case of 4-EP. There is a wide concentration range (30-450 µg/l) in which the "Brett" and "non-Brett" categories still overlap.

4-EG to 4-EP ratio
Petrozziello et al. [16] have found that the perception of "Brett character" is also influenced by the ratio of 4-EG / 4-EP. According to their results, the higher this ratio, the less apparent the "Brett character" is. We analyzed our data from this point of view, but our data do not support this observation.
While the majority of samples in all three categories have a low 4-EG / 4-EP ratio (below 0.5), samples with an extremely high ratio (above 2) were still qualified as having "Brett character" even at 4-EP concentrations well under 1000 µg/l (Fig. 5).

Determining threshold for "Brett character"
The aim of the present study is to determine an analytical threshold supporting sensory evaluation of the "Brett character". As shown above, both 4-EP and 4-EG concentrations have a certain distinguishing power between "Brett character" and "non-Brett" wines. It was also shown that other sensory defects do not interfere significantly with the separation of these two groups, so for the determination of a threshold value, the groups of "Good" and "Other defects" wine samples were treated as one (i.e. "non-Brett"). In order to increase selective power, these two parameters might be combined. Threshold values for 4-EP and 4-EG concentrations as well as for their sum (4-EP + 4-EG) were calculated.
If the condition of normal distribution is fulfilled, it is possible, using the z distribution, to determine a threshold concentration above which the probability of mistakenly classifying a good wine in the "Brett character" group (error of the first kind) is at most 5 %. This is highly important from an authority's point of view when rejecting or allowing a certain product for the market. With the above-mentioned threshold, the probability of declaring a wine good when it is defective (error of the second kind) can be calculated as well. The smaller the overlapping area of the "Brett character" and "non-Brett" categories, the lower the probability of the error of the second kind. Three different threshold values (based on the dependent variables 4-EP, 4-EG or 4-EP + 4-EG, respectively) belonging to a 5 % probability of the error of the first kind are given in Table 1. For these upper threshold values (above which the wine is qualified as having "Brett character") the probabilities of the error of the second kind were calculated and are also given in Table 1. (Since the normality of the distributions is a requirement for this calculation and in the case of 4-EG the distributions are right-skewed (Fig. 4), log-transformed 4-EG data were used for the calculations.)
As seen from the data in Table 1, a 4-EP concentration of 968.3 µg/l could be used as the minimum concentration (upper threshold) to qualify a wine as having "Brett character". Using this value as the basis for decision would mean that the probability of declaring a good wine as having "Brett character" is 5 %, while the probability of classifying a defective wine as good would be 31.52 %. The 410.0 and 1242.0 µg/l thresholds for 4-EG and for the sum of 4-EP and 4-EG, respectively, would result in the same 5 % error in qualifying a good wine as defective, but would increase the probability of the opposite false decision, although the difference between the 31.52 % and 32.05 % errors of the second kind for 4-EP and for the sum of 4-EP + 4-EG is practically negligible.
Naturally, it is also desirable to limit the risk of improperly classifying "Brett character" wines into the "non-Brett" group. A lower threshold concentration can be determined under which the risk of this mistake is limited. With the threshold established for a given probability of the error of the second kind (e.g. 5 %), the probability of the error of the first kind can be calculated (Table 2).
As seen from the data in Table 2, a 4-EP concentration of 244.8 µg/l could be used as the maximum concentration (lower threshold) to qualify a wine as good. Using this value as the basis for decision would mean that the probability of declaring a defective wine as good is 5 %. The probability of classifying a good wine at this limit as "Brett character" would be 87.28 %. The 66.7 and 114.3 µg/l thresholds for 4-EG and for the sum of 4-EP and 4-EG, with the same 5 % probability of qualifying a defective wine as good, would increase the probability of the opposite false decision to 89.06 % and 96.87 %, respectively.
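
The two calculations described above can be sketched as follows, assuming that both groups are normally distributed. The means and standard deviations of the groups are not reported in the text, so the parameters below are purely hypothetical and the resulting numbers do not reproduce Tables 1 and 2; the function and variable names are likewise illustrative.

```python
from statistics import NormalDist


def upper_threshold(non_brett, brett, alpha=0.05):
    """Concentration above which a non-Brett wine is exceeded with probability alpha.

    Returns (threshold, beta), where beta is the probability that a
    "Brett character" wine falls below the threshold (error of the second kind).
    """
    threshold = non_brett.inv_cdf(1.0 - alpha)
    beta = brett.cdf(threshold)
    return threshold, beta


def lower_threshold(non_brett, brett, beta=0.05):
    """Concentration below which a Brett wine falls with probability beta.

    Returns (threshold, alpha), where alpha is the probability that a
    non-Brett wine lies above the threshold (error of the first kind).
    """
    threshold = brett.inv_cdf(beta)
    alpha = 1.0 - non_brett.cdf(threshold)
    return threshold, alpha


# Hypothetical group parameters in µg/l 4-EP (assumptions, not study results).
non_brett = NormalDist(mu=400.0, sigma=300.0)
brett = NormalDist(mu=1400.0, sigma=600.0)

upper, beta_at_upper = upper_threshold(non_brett, brett)
lower, alpha_at_lower = lower_threshold(non_brett, brett)
print(f"upper threshold {upper:.1f} µg/l, error of the second kind {beta_at_upper:.2%}")
print(f"lower threshold {lower:.1f} µg/l, error of the first kind {alpha_at_lower:.2%}")
```

For right-skewed data such as 4-EG, the same functions could be applied to distributions fitted to log-transformed concentrations, with the resulting threshold back-transformed by exponentiation.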
The comparison of the dependent variable alternatives thus shows that using only the 4-EP concentration as a threshold is equivalent to or safer than the two other alternatives. There is, however, a range above or below which it is relatively safe to determine whether a wine sample has "Brett character" or not. Below a 4-EP concentration of 244.8 µg/l samples can be classified as good, and above 968.3 µg/l as "Brett character", with only a 5 % error probability.
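
The resulting three-zone decision rule can be written down in a few lines. The sketch below uses the 4-EP thresholds reported above; the function name and the returned labels are illustrative choices, and concentrations exactly equal to a threshold are assumed to fall into the range that still requires sensory evaluation.

```python
LOWER_4EP_UG_L = 244.8  # below this: classified as good ("non-Brett")
UPPER_4EP_UG_L = 968.3  # above this: classified as "Brett character"


def classify_by_4ep(conc_ug_l):
    """Classify a wine sample by its 4-EP concentration in µg/l."""
    if conc_ug_l < 0:
        raise ValueError("concentration cannot be negative")
    if conc_ug_l < LOWER_4EP_UG_L:
        return "non-Brett"
    if conc_ug_l > UPPER_4EP_UG_L:
        return "Brett character"
    return "sensory evaluation required"


for conc in (100.0, 500.0, 1200.0):
    print(conc, "->", classify_by_4ep(conc))
```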
It is important to stress that these results, including all uncertainties, reflect the borderline cases only. As explained above, the 260 examined wine samples were all judged by at least one taster of the five-member jury to have a "Brett character". The jury decided about these suspect cases. Consequently, it is presumed that the real error of the second kind would be far lower than the outcome of the present calculations shows if the proposed threshold value were applied to all tested wines and not only to suspect cases. The 4-EP concentration range of approximately 245-968 µg/l, within which sensory evaluation is still needed, might be further narrowed by repeating the above calculations on a wider range of random wine samples.

Further results
We have also examined whether "Brett character" wines show any further difference in terms of a few other, more conventional analytical parameters. One of the reasons for sulphiting is its well-known effect against the spoilage of wine induced by Brettanomyces and other microbes. Consequently, free (and total) SO2 levels might negatively correlate with Brettanomyces count or 4-EP concentration [23]. Fig. 6(A) and (B) show the distribution of the free and total SO2 concentrations of the three examined groups.
Although far from being indicative of a possible Brettanomyces contamination, "Brett character" wines show somewhat lower average free and total SO2 concentrations ("Brett character": 14.7 and 89.1; "Other defects": 25.5 and 108.8; "Good": 20 and 102.2 mg/l). As Fig. 6 (A) shows, for the free SO2 concentration the distribution of the "Brett character" category is right-skewed and its most probable value is lower than in the case of the other two groups. According to the Kruskal-Wallis test and the post-hoc comparisons, the "Brett character" group is significantly different (p << 0.05) from the "Good" and "Other defects" groups. For the total SO2 concentration, this difference was not detectable.
On the one hand, this demonstrates the importance of sulphiting as a tool against Brettanomyces. On the other hand, the high SO2 levels of the "Other defects" group might indicate that winemakers recognized the start of some unwanted microbial processes and tried to stop them by intensive sulphiting. This latter thought is further supported by the fact that for the "Other defects" group a slight negative correlation between 4-EP and SO2 is observable, with R² = 0.3088 for free and 0.3233 for total SO2 (Fig. 7). In the case of the "Brett character" and "Good" groups of samples this correlation cannot be seen (R² < 0.029 in all cases).
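
For reference, the coefficient of determination of a simple linear fit equals the square of the Pearson correlation coefficient, whose sign gives the direction of the relationship. A minimal sketch follows; the paired values are hypothetical and are not measurements from this study, and the function name is an illustrative choice.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired values: free SO2 (mg/l) and 4-EP (µg/l).
free_so2 = [12.0, 18.0, 25.0, 31.0, 40.0]
four_ep = [820.0, 640.0, 700.0, 410.0, 350.0]

r = pearson_r(free_so2, four_ep)
r_squared = r ** 2
print(f"r = {r:.3f}, R² = {r_squared:.3f}")
```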
Volatile acidity data show a slightly different picture. While the distributions of the "Good" and "Brett character" groups are different, the distribution of the "Other defects" group cannot be distinguished statistically from either the "Good" or the "Brett character" group. This slight difference between the separable "Good" and "Brett character" groups is obviously connected to Brettanomyces activity, but is not characteristic enough to draw any further conclusions.
The distributions of ethanol content, total dry matter, total sugar content and total acidity of the three groups of samples did not show any significant difference, as Brettanomyces has a wide tolerance for substrates and environmental conditions.

Conclusions
The statistical examination of the sensory evaluation and analytical profile of 260 suspect wine samples has resulted in two thresholds. Wines above a 4-EP concentration of 968 μg/l can be classified as having "Brett character" with a 5 % probability of error. The lower 4-EP concentration threshold is 245 μg/l. Wine samples below this can be safely classified as good, with only a 5 % probability of error. Using these guiding limits, the workload of sensory evaluation can be significantly reduced while the objectivity of the judgement is improved.

Fig. 1 Vine varieties of the tested wines

Fig. 3 The cumulative histogram of the three categories of the 4-EP concentration

Fig. 6 Distribution of the free (A) and total (B) SO2 concentrations of the three examined groups

Fig. 7 Correlation between free (A) and total (B) SO2 concentrations and 4-ethylphenol concentration in the "Other defects" group of samples

Table 1 The minimum threshold concentration to qualify a sample as "Brett character" and the probability of error of the second kind