Comparison of the Rotating Source Identifier and the Virtual Rotating Array Method

The aim of this paper is to present two acoustic beamforming methods developed for rotating sources, namely the Rotating Source Identifier (ROSI) and the Virtual Rotating Array method (VRAM). These were applied to a series of simulated test cases, and their behaviour was analysed. Both methods were able to localise the source reliably. However, the source strength was found to depend on the number of microphones when VRAM was applied. This phenomenon was quantified, and an approximate formula was given that provides the minimum number of microphones required to stay within a given amplitude error. Beamwidth and side lobe suppression were found to agree between the two methods, meaning that the way rotation is handled does not significantly affect the point spread functions. The computational cost of ROSI was two to three orders of magnitude higher than that of VRAM. The results show that both methods are applicable for the beamforming analysis of rotating sound sources. However, in case of VRAM, the number of microphones has to be chosen carefully to obtain reliable amplitude results.


Introduction and objectives
The noise emitted by machinery is an increasingly important concern, due to its adverse effects on people. Noise reduction is therefore of paramount importance. However, the sources of noise are often unknown. Acoustic beamforming is a technology that localizes sound sources based on measurements carried out with an array of microphones [1]. It is widely applied for stationary sources to analyse and investigate noise generation mechanisms with the aim of reducing noise.
Rotating machines are also significant sources of noise, and are therefore often investigated using beamforming techniques. Turbomachines are traditionally of great interest due to the strict air traffic regulations [2,3]. Wind turbines are similarly important, as they are becoming more widespread [4]. More recently, low-speed axial fans have been studied as well [5-8]. These machines emit little noise compared to e.g. turbofans; however, they are often used in the close vicinity of people, so their noise should be reduced as well. It is therefore important to deal with rotating noise sources. Their investigation is made difficult by the continuously changing source-receiver distances, which result in an apparent change of the received frequency, termed the Doppler effect. Nevertheless, several beamforming methods have been developed for rotating sources. Approaches exist both in the time domain [9] and in the frequency domain [6,10-12]. Each of them has different advantages, application criteria and limitations.
Microphone array methods are often analysed and compared against each other [13,14]. The literature, however, rarely treats the comparison of methods for rotating sources, probably due to their sometimes mutually exclusive application criteria. Reference [15] reports such a work, where the virtual rotating array method [6] and the frequency domain rotation compensation technique [11,12] are used to characterise the sound sources in an axial fan. Time-domain de-Dopplerisation is not applied there, however, and duct modes are considered, as the case involves a ducted fan. The aforementioned three groups of methods are compared in [16], with interesting results regarding the performance of the methods combined with deconvolution; however, little information is available there about the original beamforming results.
The objective of this paper is to analyse some methods developed for rotating sources: to summarise their requirements, advantages and drawbacks, to apply them to the same test cases, and to examine the results and draw conclusions using quantitative parameters that characterise a beamform map. This is to provide some preliminary information to aid the design of beamforming experiments on rotating sound sources.
First, the applicable methods are briefly introduced. Second, the relevant methods are applied to localise a single rotating monopole source, whose angular velocity and signal frequency are varied. The number of microphones in the array is changed as well. The beamform maps are analysed first visually, then quantitatively, considering the main lobe level at the location of the source, the beamwidth, and the side lobe suppression. Computational costs are analysed, too. Special attention is paid to VRAM and the beamform level of the localised source. Finally, conclusions are drawn that highlight the differences between the methods and give practical advice regarding their application.

Methods for rotating sources
Three fundamentally different methods have been proposed to localise rotating sources. These are briefly summarised in this section, together with their application criteria and foreseeable advantages and disadvantages. Such special methods are required because conventional frequency-domain beamforming cannot handle moving sources: viewed from the source, the receiver positions change continuously.

Rotating Source Identifier
The Rotating Source Identifier (ROSI) method was proposed in [9]. It works in the time domain, and its general idea is to calculate the beamforming result on grid points that rotate together with the target.
First, a grid point is chosen. Then, a uniformly spaced emission time series is created, using the length and the sampling frequency of the measurement. For each emission time instant, the distance $r_{m,n}$ between the investigated grid point $n$ and each microphone $m$ is calculated, making use of the angular position of the target. Using the speed of sound $a$, this gives the times of flight of uniformly emitted sound pulses from the selected grid point to each microphone. The microphone signals are then interpolated at the corresponding reception times. In this step the amplitude variation is taken into account as well. The resulting sound pressure vector is then treated as the noise emitted from the investigated grid point at the original, uniformly spaced emission time instants $t$. This way the rotation effects are removed, and the beamform level at the investigated point can be determined. The procedure is repeated for all grid points.
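The steps above can be sketched for a single grid point as follows. This is a minimal illustration rather than the code used in the paper: the function name, the argument layout, the constant angular velocity, the linear interpolation via `np.interp`, the multiplication by $r$ to undo the amplitude decay, and the plain average over microphones are all assumptions made for the example.

```python
import numpy as np

def rosi_grid_point(p_mic, fs, mic_xyz, r_grid, phi0, z_grid, omega, a=343.0):
    """De-Dopplerised signal for one rotating grid point (hypothetical helper).

    p_mic   : (M, N) microphone pressure signals sampled at fs
    mic_xyz : (M, 3) microphone coordinates in metres
    r_grid, phi0, z_grid : radius, initial angle and axial position of the grid point
    omega   : angular velocity in rad/s (assumed constant in this sketch)
    a       : speed of sound in m/s
    """
    M, N = p_mic.shape
    t = np.arange(N) / fs                        # uniformly spaced emission times
    phi = phi0 + omega * t                       # angular position of the grid point
    pos = np.stack([r_grid * np.cos(phi),
                    r_grid * np.sin(phi),
                    np.full(N, z_grid)], axis=1)  # (N, 3) grid point trajectory
    out = np.zeros(N)
    for m in range(M):
        r = np.linalg.norm(pos - mic_xyz[m], axis=1)   # distance r_{m,n}(t)
        t_rec = t + r / a                              # reception times
        # Interpolate the recording at the reception times and undo the
        # 1/r amplitude decay (assumed monopole spreading).
        out += r * np.interp(t_rec, t, p_mic[m], left=0.0, right=0.0)
    return out / M                                     # average over microphones
```

The nested dependence on time step, grid point and microphone is visible here: the distance and the interpolation are recomputed for every combination, which is what makes the time-domain approach costly.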
ROSI is able to take rotation effects into account. It can be applied for non-uniform angular velocity and arbitrary array geometry. However, due to the time domain calculations, the method is very resource-consuming, and since there is no cross spectral matrix to describe the complete source field, a large number of deconvolution methods cannot be applied.

Virtual rotating array method
Another method, termed the Virtual Rotating Array method (VRAM), was proposed in [6]. Earlier, a similar algorithm was applied in [10] for a ducted fan. In these methods, the fundamental idea is to create a virtual array, rotating together with the target. These methods originally require the microphones to be arranged along one or more rings with uniform angular separation. (This requirement was removed in [17].) The axis of the array must also coincide with the axis of revolution of the target.
A virtual array of microphones is imagined as rotating together with the target. First, a virtual microphone of index $m$ is chosen. Its angular position in time is calculated, which is then used to determine the indices of the microphones in the real array preceding ($m^-$) and following ($m^+$) the virtual microphone $m$ at each time step. They are obtained using Eqs. (2) and (3), respectively. In these equations, the angular separation of the microphones is $\alpha = 2\pi/M$, $\lfloor \cdot \rfloor$ denotes the floor operation, and mod $M$ means the remainder after division by $M$.
Finally, the signal at the virtually rotating microphone $m$ is obtained by linearly interpolating the signals of microphones $m^-$ and $m^+$, weighted by the distances, as shown in Eq. (4).
The weights of the following and the preceding microphone signals are given in Eq. (5) and (6), respectively.
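The index selection and the linear interpolation described above can be sketched as follows. The sketch is an assumption-laden illustration: the function name, the convention that real microphone $k$ sits at angle $k\alpha$, the sense of rotation, and the exact form of the weights are chosen for the example and are not taken from Eqs. (2) to (6).

```python
import numpy as np

def virtual_rotating_array(p_mic, phi):
    """Interpolate ring-array signals onto a co-rotating virtual array (sketch).

    p_mic : (M, N) signals of M microphones equally spaced on a ring,
            microphone k assumed to be located at angle k * alpha
    phi   : (N,) rotation angle of the target at each sample, in radians
    """
    M, N = p_mic.shape
    alpha = 2.0 * np.pi / M                  # angular separation of the microphones
    cols = np.arange(N)
    p_virt = np.empty((M, N), dtype=float)
    for m in range(M):
        pos = m + phi / alpha                # fractional index of virtual microphone m
        lower = np.floor(pos)
        m_minus = lower.astype(int) % M      # preceding real microphone
        m_plus = (m_minus + 1) % M           # following real microphone
        s_plus = pos - lower                 # weight of the following microphone
        s_minus = 1.0 - s_plus               # weight of the preceding microphone
        p_virt[m] = (s_minus * p_mic[m_minus, cols]
                     + s_plus * p_mic[m_plus, cols])
    return p_virt
```

With `phi` equal to zero the virtual signals coincide with the real ones, and with `phi` equal to one microphone spacing each virtual microphone simply takes over the signal of its neighbour, which gives two quick sanity checks for an implementation.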
This procedure is repeated for each time instant $t$ and each microphone $m$. This way, following a Fourier transform, the acoustic pressure $p$ is obtained as if the microphones were rotating together with the target of beamforming. From then on, a cross spectral matrix $\mathbf{C}$ can be constructed, and conventional beamforming can be carried out, as shown in Eq. (7) [1].

$$z_n = \mathbf{w}_n^{\dagger} \mathbf{C} \mathbf{w}_n \tag{7}$$

In Eq. (7), $\mathbf{w}_n$ is the steering vector, describing the propagation of sound from grid point $n$ to each microphone, while $z_n$ is the result of beamforming at the $n$-th grid point. $\dagger$ denotes the complex conjugate transpose operator. In the present investigation, $z_n$ is written in a level form, denoted by $L_b$ in Eq. (8).
The normalisation factor used here is the threshold of hearing: $p_0 = 2 \times 10^{-5}$ Pa. The source map values are given in this quantity for both ROSI and VRAM. Using this algorithm, the effects of rotary motion can be removed. Afterwards, the problem is reduced to a simple case with steady sources. Therefore all the methods developed for steady beamforming can be applied, including deconvolution approaches. The procedure is suitable for varying angular velocity, too, similarly to the ROSI method, but it requires a uniformly distributed circular array, placed perpendicular to the source rotation axis. This method is investigated in detail, as well.
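The conventional beamforming step of Eq. (7) and its conversion to a level can be sketched as below. The quadratic form follows Eq. (7); the level definition $10 \log_{10}(z_n / p_0^2)$ is an assumption consistent with the stated reference pressure, and the function name and array layout are hypothetical.

```python
import numpy as np

P0 = 2e-5  # reference pressure in Pa (threshold of hearing)

def beamform_level(C, w):
    """Evaluate z_n = w_n^H C w_n for all grid points and return levels in dB.

    C : (M, M) cross spectral matrix at one frequency
    w : (G, M) steering vectors, one row per grid point
    """
    # Quadratic form w_n^H C w_n for every grid point n at once.
    z = np.einsum('gi,ij,gj->g', w.conj(), C, w).real
    # Guard against non-positive values before taking the logarithm.
    z = np.maximum(z, 1e-30)
    return 10.0 * np.log10(z / P0**2)   # assumed level definition
```

Because the virtual-array signals are summarised in a single matrix `C`, the same matrix can be reused for every grid point and passed on to deconvolution algorithms, which is not possible with the time-domain procedure.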

Series expansion method
For the sake of completeness, it should be noted that besides ROSI and VRAM, a third kind of method also exists for rotating sources, proposed in [12] based on the work in [11]. This method is based on writing the Green's function for the rotating source in a spherical coordinate system. Utilizing the symmetry, this method works completely in the frequency domain and allows the application of deconvolution methods. However, it can only be applied in case of an axially symmetric measurement setup, with a ring-shaped microphone array and a constant speed of revolution. Due to the complexity of the formulation, this method is excluded from the present investigations. Its comparative analysis may form the subject of future work.

Methodology
The selected methods were compared using synthetic sources. This is common practice when beamforming array methods are analysed [13,14,18]. An in-house algorithm was used to generate the required sound signals. First, the emitted noise vector was created, significantly oversampled compared to the sampling frequency of the final signal. Then, these signals were numerically propagated to each microphone using the time-dependent source-receiver distance obtained from the prescribed motion. The recorded pressure value for each of the oversampled source values was generated using a far-field, free space monopole propagation model. This resulted in a non-uniformly distributed pressure time history at each microphone, which was re-sampled to the uniform sampling frequency of the final signal using spline interpolation. Beamforming was carried out with an in-house ROSI code, reported in [19]. This program was extended by the authors to incorporate VRAM.
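The signal generation described above can be illustrated with the following sketch for one microphone. It is not the in-house algorithm: the function name, the oversampling factor of 8, the sinusoidal source signal, the default source position and the use of `scipy.interpolate.CubicSpline` for the spline re-sampling are assumptions. The angular velocity `omega` is given in rad/s.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def rotating_monopole_signal(mic_xyz, fs, duration, f_sig, omega,
                             r_src=0.2, z_src=1.0, a=343.0, oversample=8):
    """Pressure at one microphone from a rotating monopole (illustrative sketch)."""
    fs_os = oversample * fs
    t_emit = np.arange(int(round(duration * fs_os))) / fs_os   # oversampled emission times
    q = np.sin(2.0 * np.pi * f_sig * t_emit)                   # emitted source signal
    phi = omega * t_emit                                       # source angle in rad
    src = np.stack([r_src * np.cos(phi),
                    r_src * np.sin(phi),
                    np.full_like(t_emit, z_src)], axis=1)      # source trajectory
    r = np.linalg.norm(src - np.asarray(mic_xyz), axis=1)      # source-receiver distance
    t_rec = t_emit + r / a                                     # non-uniform arrival times
    p_rec = q / (4.0 * np.pi * r)                              # free-space monopole decay
    # Re-sample onto the uniform time base of the final signal.
    t_out = np.arange(int(round(duration * fs))) / fs
    valid = (t_out >= t_rec[0]) & (t_out <= t_rec[-1])
    p_out = np.zeros_like(t_out)
    p_out[valid] = CubicSpline(t_rec, p_rec)(t_out[valid])
    return t_out, p_out
```

The arrival times stay strictly increasing as long as the source moves subsonically, which the spline constructor requires; samples before the first arrival are left at zero.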
The applied weight vector was chosen to conform with previous investigations reported in the literature [5,20,21]. One element of this vector is shown in Eq. (9). For our investigation, test cases relevant for industrial measurements on low-speed axial fans were considered. The ambient pressure was $10^5$ Pa, the ambient temperature 293.15 K, the ratio of specific heats 1.4, and the specific gas constant 287 J kg$^{-1}$ K$^{-1}$.
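For orientation, the speed of sound implied by these ambient values follows from the ideal-gas relation. The worked value below is an illustration and is not quoted in the text; the symbols $\kappa$, $R$ and $T$ for the ratio of specific heats, the specific gas constant and the temperature are assumed notation, and the wavelength is evaluated at the nominal 3 kHz.

```latex
a = \sqrt{\kappa R T}
  = \sqrt{1.4 \cdot 287 \cdot 293.15}\ \mathrm{m/s}
  \approx 343.2\ \mathrm{m/s},
\qquad
\lambda = \frac{a}{f} \approx \frac{343.2}{3000}\ \mathrm{m} \approx 0.114\ \mathrm{m}
```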
One source was investigated, its location being representative of the fan tip radius, placed at the coordinates x = 1, y = 0.2 m. The reason for investigating just one source is the high computational demand of ROSI. This was deemed acceptable, as the source represents a point on the tip radius, often of interest in an axial fan [21]. Furthermore, the angular location has little relevance, if the number of microphones M is large enough, being a further reason for investigating only one source position.
Due to the requirements of VRAM, all arrays consisted of microphones distributed along a circle whose centre lies on the axis of revolution of the source. The diameter of the microphone array was 1 m.
Beamforming expressions were evaluated on a rectangular grid with 5 mm spacing. The distance between the source plane and the array plane was Z = 1 m. The final sampling frequency was 44 100 Hz. For the calculation of the cross spectral matrix, the Welch method [22] was followed: blocks consisting of 1024 data points were used with von Hann windows, overlapping by 50 %. In each case, the diagonal elements of the matrix were removed, similarly to the method in [18]. This is to conform with usual measurement settings, however, it was found to have negligible effect in the present case due to the presence of only one synthetic source.
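The cross spectral matrix estimation described above can be sketched as follows. The block length, window type, overlap and diagonal removal follow the text; the function name, the amplitude scaling (chosen so that a bin-centred sine of amplitude $A$ yields an auto-power of $A^2/2$) and the periodic form of the von Hann window are assumptions of this example.

```python
import numpy as np

def cross_spectral_matrix(p, nfft=1024, overlap=0.5, remove_diagonal=True):
    """Welch-averaged cross spectral matrix (sketch).

    p : (M, N) time signals; returns C with shape (nfft // 2 + 1, M, M)
    """
    M, N = p.shape
    step = int(nfft * (1.0 - overlap))
    n = np.arange(nfft)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / nfft)   # periodic von Hann window
    scale = 2.0 / np.sum(window) ** 2                     # assumed amplitude scaling
    n_blocks = (N - nfft) // step + 1
    C = np.zeros((nfft // 2 + 1, M, M), dtype=complex)
    for b in range(n_blocks):
        seg = p[:, b * step: b * step + nfft] * window
        X = np.fft.rfft(seg, axis=1)                      # (M, nfft // 2 + 1)
        C += np.einsum('if,jf->fij', X, X.conj())         # X_i * conj(X_j) per bin
    C *= scale / n_blocks
    if remove_diagonal:
        idx = np.arange(M)
        C[:, idx, idx] = 0.0                              # discard auto-spectra
    return C
```

With 0.1 s of data at 44 100 Hz, blocks of 1024 samples and 50 % overlap give only seven averages, which is one reason why the sample length was checked in preliminary tests.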
Preliminary tests were carried out to determine the applied sample lengths. 0.1 s was found acceptable, providing the same results as a sample of 1 s in case of both methods. Thus 0.1 s samples were applied for ROSI in order to reduce the computational cost. When the computational requirements were considered, VRAM was also evaluated on samples of 0.1 s length, to get comparable results. However, when the amplitude of the source was investigated, samples of 1 s were used for VRAM.
The investigated nominal signal frequencies were f = 1, 2, 3, 4, 5 kHz, always modified to fall onto a discrete Fourier frequency. The speed of revolution of the rotor was varied between Ω = 10 1/s and 30 1/s in steps of 5 1/s. The number of microphones M was also varied between 12 and 80 in steps of 8 for VRAM, while for ROSI 24, 56, 64 and 80 were used, again to reduce the computational demand. The source strength was set in a way that in the absence of rotation, a beamform level of $L_b$ = 0 dB was obtained in the map.
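The adjustment of a nominal frequency to a discrete Fourier frequency can be illustrated as below. The sketch assumes that the relevant frequency grid is that of the 1024-point blocks at 44 100 Hz; the text does not state which transform length was used for this adjustment, and the function name is hypothetical.

```python
def snap_to_fourier_bin(f_nominal, fs=44100.0, nfft=1024):
    """Return the bin index and frequency of the nearest discrete Fourier bin."""
    df = fs / nfft                  # frequency resolution of the assumed transform
    k = round(f_nominal / df)       # nearest bin index
    return k, k * df

k, f = snap_to_fourier_bin(3000.0)  # k = 70, f = 3014.6484375 Hz
```

Placing the signal on a bin centre avoids spectral leakage into neighbouring bins, so the level read from the map is not reduced by the windowing.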
Beamform maps were created, and from those, the main lobe width, the side lobe suppression, and the maximum strength were extracted and compared between the two methods. The results are presented and discussed below.

Beamforming results
Fig. 1 depicts a source map from each method with equal dynamic ranges. These were obtained with the same parameters: f = 3 kHz, Ω = 20 1/s, M = 64. Comparing the figures, one can see that both methods have localised the source successfully, and the strength estimates are very similar as well, meaning that both methods are applicable. The point spread functions are also very similar, as also reported in [16]. This observation was found true for all the investigated cases. Thus removing the rotation effects does not influence the beamforming part significantly; the point spread functions are similar to what would be obtained with conventional beamforming in case of stationary sources. This means that results on stationary sources could be used to judge the suitability of microphone layouts for ROSI or VRAM as well.
Due to the similarity between the point spread functions, the width of the main lobe and the side lobe strength agree well, meaning that the achievable spatial resolution and the dynamic range are similar, too. This result was observed in all cases, meaning that the two methods give equivalent results from the point of view of beamforming. It should be noted, however, that since VRAM uses a frequency-domain beamforming formulation, a multitude of deconvolution methods can be applied there to enhance the beamforming map, e.g. CLEAN-SC [23]. A comparison between some of these methods can be found in [16].
As noted before, the maximum $L_b$ of the maps agrees well, too, in the reference case M = 64. However, VRAM was found to underestimate $L_b$ when M was low. This probably results from the fact that as M decreases, the angular separation $\alpha$ of the microphones increases, and thus the results of the interpolation in Eq. (4) may degrade.
In case of ROSI, as noted before, only some practically relevant microphone numbers of M = 24, 56, 64 and 80 were investigated. In these cases, a constant deviation of about −0.07 dB from the ideal $L_b$ was observed. This, however, did not depend on the number of microphones. Its origins were not investigated, since it is small enough to be unlikely to matter in measurement scenarios. Fig. 2 shows the error $\Delta L_b$ relative to the ideal $L_b$ value in case of VRAM. It can be seen that for increasing M values, $\Delta L_b$ tends towards zero, as expected. For decreasing M, however, the magnitude of $\Delta L_b$ increases, and at a very low M of 12, an error above 2 dB is possible. The degradation of the beamform maps is shown in [16], but no numerical results are presented there. It should be noted that the location of the source was successfully determined even with M = 16, while at M = 12, only a negligible position error was obtained. The above result can be important when designing a microphone array: one has to consider the permissible uncertainty in $L_b$ when choosing M. This can be achieved based on Fig. 3, showing the required M as a function of the amplitude error.
These results can be approximated well with a remarkably simple function, shown in expression (10), and in Fig. 3 with a dashed line. Using this approximate formula, the required M can readily be determined.
It should be noted that the numerical results are only valid for the present case, with f = 3 kHz, Z = 1 m, Ω = 20 1/s, a source radius of 0.2 m and an array diameter of 1 m. Nevertheless, since these parameters were chosen based on real-life axial fan measurements, they can be applied as guidelines for other similar cases. Investigating this phenomenon with other parameters forms the subject of future work.
Interestingly, varying Ω was not found to affect the $L_b$ values in the maps. Changing the frequency was found to have a minor effect on both ROSI and VRAM results. Increasing f from 1 kHz to 5 kHz resulted in $L_b$ being decreased by about 0.25 dB with M = 64. In case of VRAM, this effect can probably be attributed to the degradation in interpolation as the wavelengths decrease relative to the microphone separations; however, its magnitude is small and would not be noticeable in a real measurement.

Beamforming spectra
Fig. 4 depicts the spectral distribution of $L_b$ at the location of the source. One can observe that both methods have successfully identified the source at 3 kHz, with the amplitude reaching very close to the real one. All other spectral peaks are significantly weaker than the main source, indicating that the methods perform as expected. Nevertheless, the trends of the methods appear different: ROSI has a wider range with nearly constant values surrounding the real signal frequency, outside which the results fall. VRAM, on the other hand, has lower overall values, with some secondary peaks: a lower frequency one appearing at about $f_l$ = 1700 Hz and a higher frequency one close to $f_h$ = 4300 Hz. The aforementioned peaks were found in all VRAM results and some of their properties were analysed. The relationships $f_l \approx f - 1300$ Hz and $f_h \approx f + 1300$ Hz were found regardless of the signal frequency f. When M increases, $f_l$ was found to decrease, while $f_h$ was found to increase, both in a linear manner. They exhibited a similar trend when Ω was varied.
At these $f_l$ and $f_h$ frequencies, the beamforming maps show secondary sources localised in the same direction from the origin as the real source, but on a different radius. These radii were found to be linear in terms of f, M, and Ω, as well. They exhibited no dependence on Z or the array diameter.
In the present work, these secondary sources were not investigated in more detail.

Computational cost
Fig. 5 shows the processing time T required by each algorithm as a function of M. This was measured in the GNU Octave environment running on a personal computer with an Intel Core i5 processor.
The most important conclusion is that ROSI has two to three orders of magnitude higher computational cost in the present range of parameters than VRAM. In Ref. [16], VRAM was found to be 350 times faster than ROSI, which is in good agreement with the present findings. The exact result depends on implementation details, but it is nevertheless clear that VRAM is significantly faster than ROSI. Furthermore, as the number of microphones increases, the advantage of VRAM grows further, since T is approximately proportional to M for VRAM, while for ROSI the dependence is of higher order.
ROSI was also found to have T directly proportional to the number of grid points, while in case of VRAM a more moderate dependence was discovered. The reason for the speed of VRAM is that interpolation of the sound pressure signals has to be carried out only once for each microphone, which is followed by conventional beamforming. For ROSI, however, the distances have to be determined for each time step, each grid point and each microphone, followed by the interpolation, which takes considerable time.

Conclusions
The paper has demonstrated two of the available methods for beamforming on rotating sources: ROSI and VRAM. It contributes to the beamforming literature by providing some results of numerical experiments, aiding in determining the appropriate method for microphone array measurements on rotating sources. Besides visual observation of the beamform map, the beamform level, the beamwidth, and the side lobe suppression were investigated. These parameters were generally found to be in good agreement between the methods.
A previously unreported dependence on the number of microphones was found: as M decreases, the beamform level $L_b$ obtained by VRAM falls increasingly below the real value. This may be an important concern when designing an array for rotating sources. Therefore, an approximate formula was provided that allows the required number of microphones M to be determined for a prescribed allowable amplitude uncertainty.
The computational cost of the two methods was examined, as well. Results similar to those reported in [16] were obtained, indicating that ROSI has a two to three orders of magnitude higher computational requirement than VRAM.
The results show that the application of VRAM is beneficial, as it has low computational requirements, but identifies the sources well. Its results are acceptable if the number of microphones is large enough. The required ring shaped array geometry may be a drawback, as it usually has to be manufactured specially for the method.
ROSI is able to localize and quantify the sources as well. It is very robust, giving accurate results even for a small number of microphones. However, due to the time-domain procedure, the majority of deconvolution methods cannot be used. Furthermore, its computational demands are significantly higher than those of VRAM. It should be noted, however, that ROSI can be applied for any array geometry, and therefore does not require manufacturing an array specifically for the analysis of rotating sources.
Some array geometries are known to reduce side lobes and provide better source maps, even without deconvolution algorithms [24]. In the present case, a well-optimised array could reduce side lobe strength by about 10 dB. The application of ROSI to data obtained with a ring array is therefore not practical in a real-world scenario; it is only presented herein to provide a basis on which the methods can appropriately be compared.