Feature Extraction Using Empirical Mode Decomposition of Speech Signal

Nikil V Davis

doi:https://doi.org/10.14445/22315381/IJETT-V3I2P203

Volume 3 | Issue 2 | Year 2012 | Article Id. IJETT-V3I2P203 | DOI : https://doi.org/10.14445/22315381/IJETT-V3I2P203

Citation

Nikil V Davis, "Feature Extraction Using Empirical Mode Decomposition of Speech Signal," International Journal of Engineering Trends and Technology (IJETT), vol. 3, no. 2, pp. 77-80, 2012. Crossref, https://doi.org/10.14445/22315381/IJETT-V3I2P203

Abstract

A speech signal carries information not only about the message to be conveyed but also about the speaker, the language, the speaker's emotional state, the environment, and so on. Speech is produced by exciting the time-varying vocal tract system with a time-varying excitation. Each sound is produced by a specific combination of excitation and vocal tract dynamics. This paper presents a speaker identification system that uses the empirical mode decomposition (EMD) feature extraction method. EMD is an adaptive multiresolution decomposition technique that appears to be suitable for non-linear, non-stationary data analysis. EMD sifts a complex time-series signal without losing its original properties and yields a set of useful intrinsic mode function (IMF) components. The FFT is the most useful method for frequency-domain feature extraction, and the wavelet transform (WT) is yet another method for feature extraction.
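The sifting idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`local_extrema`, `envelope`, `sift_imf`, `emd`), the fixed number of sifting passes, and the piecewise-linear envelopes are all assumptions made to keep the example dependency-free. Standard EMD fits the envelopes with cubic splines and stops sifting on a convergence criterion.

```python
import math

def local_extrema(x):
    """Return indices of interior local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx, held flat at both ends.

    Assumption: linear interpolation stands in for the cubic splines of standard EMD.
    """
    n = len(x)
    env = [0.0] * n
    for i in range(idx[0]):
        env[i] = x[idx[0]]
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            env[i] = x[a] + t * (x[b] - x[a])
    for i in range(idx[-1], n):
        env[i] = x[idx[-1]]
    return env

def sift_imf(x, n_sift=10):
    """Extract one IMF candidate by repeatedly subtracting the mean envelope."""
    h = list(x)
    for _ in range(n_sift):  # fixed pass count: a simplification of the usual stopping rule
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [hi - 0.5 * (u + l) for hi, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=5, n_sift=10):
    """Decompose x into a list of IMF candidates plus a residue."""
    imfs = []
    residue = list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to sift
        imf = sift_imf(residue, n_sift)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue

# Usage: a synthetic two-tone signal standing in for a speech frame.
fs = 1000
signal = [math.sin(2 * math.pi * 50 * i / fs) + 0.5 * math.sin(2 * math.pi * 5 * i / fs)
          for i in range(fs)]
imfs, residue = emd(signal)
```

By construction, the IMF candidates and the residue sum back to the input signal, which reflects the abstract's point that the decomposition does not lose the signal's original properties. Per-IMF statistics could then serve as features for a classifier, although the abstract does not specify which features the paper uses.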

Keywords

Speaker identification, Empirical mode decomposition, Intrinsic Mode Function
