Extraction of Voiced Regions of Speech from Emotional Speech Signals Using Wavelet-Pitch Method
Abstract
Extraction of voiced regions of speech is an active research topic in the speech domain, with relevance to a variety of speech applications. In emotional speech signals, most of the information is carried in the voiced regions. In this work, voiced regions of speech are extracted from emotional speech signals using a wavelet-pitch method. The Daubechies wavelet (Db4) is applied to the speech frames after downsampling the speech signals. The autocorrelation function is applied to the extracted approximation coefficients of each speech frame, and the corresponding pitch values are obtained. A local threshold is defined on the obtained pitch values to extract the voiced regions. The threshold values differ for male and female speakers, since male pitch values are generally lower than female pitch values. The obtained pitch values are scaled down and compared with the thresholds to extract the voiced frames. Transition frames between voiced and unvoiced frames are also extracted when the preceding frame is voiced, to preserve the emotional content of the extracted frames. The extracted frames are reshaped to obtain the desired emotional speech signal. Signal-to-Noise Ratio (SNR), Normalized Root Mean Square Error (NRMSE), and statistical parameters are used as evaluation metrics. The proposed method provides better SNR and NRMSE values than the zero-crossing-energy and residual-signal-based methods for voiced-region extraction. The Db4 wavelet also gives better results than the Haar and Db2 wavelets when extracting voiced regions from emotional speech signals with the wavelet-pitch method.
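The following is a minimal sketch of the pipeline summarized above (downsampling, Db4 decomposition, autocorrelation-based pitch estimation, and pitch-threshold voicing decision with transition-frame retention), written in Python with NumPy, SciPy, and PyWavelets. The frame length, downsampling factor, pitch search range, threshold value, and helper names are illustrative assumptions and not taken from the paper.

```python
# Illustrative sketch only; parameter values are assumptions, not the paper's settings.
import numpy as np
import pywt
from scipy.signal import decimate


def frame_pitch(frame, fs):
    """Estimate the pitch of one frame from the Db4 approximation
    coefficients using the autocorrelation function."""
    approx, _ = pywt.dwt(frame, 'db4')        # approximation coefficients
    fs_a = fs / 2                             # single-level DWT halves the effective rate
    ac = np.correlate(approx, approx, mode='full')[len(approx) - 1:]
    # Search for the autocorrelation peak inside an assumed 50-400 Hz pitch range.
    lo, hi = int(fs_a / 400), min(int(fs_a / 50), len(ac) - 1)
    lag = lo + np.argmax(ac[lo:hi + 1])
    return fs_a / lag if lag > 0 else 0.0


def extract_voiced(signal, fs, frame_ms=30, down=2, thresh_hz=75):
    """Return the concatenation of frames classified as voiced.
    `thresh_hz` would be chosen differently for male and female speakers."""
    x = decimate(signal, down)                # downsample the speech signal
    fs_d = fs / down
    flen = int(fs_d * frame_ms / 1000)
    frames = [x[i:i + flen] for i in range(0, len(x) - flen + 1, flen)]

    voiced, prev_voiced = [], False
    for f in frames:
        is_voiced = frame_pitch(f, fs_d) > thresh_hz
        # Also keep the transition frame when the previous frame was voiced,
        # to preserve emotional content at voiced/unvoiced boundaries.
        if is_voiced or prev_voiced:
            voiced.append(f)
        prev_voiced = is_voiced
    return np.concatenate(voiced) if voiced else np.array([])
```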