Extraction of Voiced Regions of Speech from Emotional Speech Signals Using Wavelet-Pitch Method

Authors

  • Lakshmi Srinivas Dendukuri
    Affiliation
    Department of Electronics and Communication Engineering, Vignan's Foundation for Science, Technology and Research, Vadlamudi, Guntur 522213, Andhra Pradesh, India
  • Shaik Jakeer Hussain
    Affiliation
    Department of Electronics and Communication Engineering, Vignan's Foundation for Science, Technology and Research, Vadlamudi, Guntur 522213, Andhra Pradesh, India
https://doi.org/10.3311/PPee.15373

Abstract

Extraction of voiced regions is an active topic in speech processing, with uses in many speech applications. In emotional speech signals, most of the information is carried in the voiced regions. In this work, voiced regions are extracted from emotional speech signals using a wavelet-pitch method. The Daubechies wavelet (Db4) is applied to the speech frames after the speech signals are downsampled. The autocorrelation function is computed on the approximation coefficients extracted from each speech frame, and the corresponding pitch values are obtained. A local threshold is defined on the obtained pitch values to extract voiced regions. The threshold values differ for male and female speakers because male pitch values are generally lower than female pitch values. The obtained pitch values are scaled down and compared with the thresholds to extract the voiced frames. A transition frame between voiced and unvoiced frames is also extracted if the previous frame is voiced, to preserve the emotional content of the extracted frames. The extracted frames are reshaped to form the desired emotional speech signal. Signal-to-Noise Ratio (SNR), Normalized Root Mean Square Error (NRMSE), and statistical parameters are used as evaluation metrics. The proposed method yields better SNR and NRMSE values than the zero crossing-energy and residual signal based methods for voiced region extraction. Within the wavelet-pitch method, the Db4 wavelet gives better results than the Haar and Db2 wavelets for extracting voiced regions from emotional speech signals.
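
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the frame length, the decomposition level, the pitch search range, the threshold values, or the pitch scaling factor, so the following are assumptions of this sketch: a single-level Db4 decomposition, non-overlapping frames, a 60-400 Hz search range, and the hypothetical parameters `pitch_threshold` and `min_strength`. The periodicity check via `min_strength` is an addition of this sketch, and the pitch scaling step is omitted because its factor is not stated.

```python
import numpy as np

# Db4 scaling (low-pass) filter, 8 taps; the coefficients sum to sqrt(2).
DB4_LO = np.array([
    0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
    -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
    0.032883011666982945, -0.010597401784997278,
])


def approx_coeffs(frame):
    """Single-level approximation: low-pass filter, then keep every 2nd sample."""
    return np.convolve(frame, DB4_LO, mode="full")[1::2]


def frame_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Return (pitch in Hz, normalized autocorrelation peak) for one frame.

    fmin and fmax are assumed search limits, not values from the paper.
    """
    a = approx_coeffs(frame)
    a = a - a.mean()
    fs_a = fs / 2.0  # the approximation coefficients are decimated by 2
    ac = np.correlate(a, a, mode="full")[len(a) - 1:]
    if ac[0] <= 0:  # silent frame: no energy, no pitch
        return 0.0, 0.0
    ac = ac / ac[0]
    lo = max(1, int(fs_a / fmax))
    hi = min(int(fs_a / fmin), len(ac) - 1)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs_a / lag, float(ac[lag])


def extract_voiced(signal, fs, frame_len, pitch_threshold, min_strength=0.5):
    """Keep voiced frames plus the frame that directly follows a voiced frame.

    pitch_threshold is the speaker-dependent threshold (hypothetical value
    supplied by the caller); min_strength is an assumption of this sketch.
    """
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[:n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    pitches = np.zeros(n_frames)
    for i, frame in enumerate(frames):
        pitch, strength = frame_pitch(frame, fs)
        pitches[i] = pitch if strength >= min_strength else 0.0
    voiced = pitches >= pitch_threshold
    keep = voiced.copy()
    keep[1:] |= voiced[:-1]  # transition frame: previous frame was voiced
    return frames[keep].reshape(-1), pitches, keep
```

A caller would pass a lower `pitch_threshold` for male speakers than for female speakers, in line with the abstract; the specific values have to come from the paper or from the data at hand.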

Keywords:

emotional speech, wavelets, autocorrelation, pitch, thresholding

Published Online

2021-07-13

How to Cite

Dendukuri, L. S., Hussain, S. J. “Extraction of Voiced Regions of Speech from Emotional Speech Signals Using Wavelet-Pitch Method”, Periodica Polytechnica Electrical Engineering and Computer Science, 65(3), pp. 262–278, 2021. https://doi.org/10.3311/PPee.15373

Issue

Section

Articles