Time Delay Neural Networks and Speech Recognition: Context Independence of Stops in Different Vowel Environments
Date of Award
Master of Science
Masters Thesis-Open Access
A series of experiments was conducted to investigate time-dynamic recognition of stop consonants independent of vowel environment, using data from six talkers. The speech preprocessing was based on previous studies of acoustic characteristics that correlate with place of articulation (Blumstein and Stevens 1979). The place-of-articulation features were summarized statistically using four moments and the energy level of the speech sample.
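The abstract does not say what the four moments are computed over. As a rough illustration only, the sketch below assumes they are the mean, variance, skewness, and kurtosis of a frame's normalized power spectrum, with the frame energy in dB as the fifth feature. The function name `moment_features`, the Hamming window, and the 10 kHz sample rate are hypothetical choices, not details taken from the thesis.

```python
import numpy as np

def moment_features(frame, sample_rate=10000.0):
    """Return [mean, variance, skewness, kurtosis, energy_db] for one frame.

    Assumption: the moments describe the power spectrum treated as a
    probability distribution over frequency. The thesis may differ.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    total = power.sum()
    if total == 0.0:
        # Silent frame: no spectral shape to describe.
        return np.zeros(5)

    p = power / total                      # normalize to sum to 1
    mean = np.sum(freqs * p)               # first moment (Hz)
    var = np.sum((freqs - mean) ** 2 * p)  # second central moment (Hz^2)
    std = np.sqrt(var)
    skew = np.sum((freqs - mean) ** 3 * p) / std ** 3
    kurt = np.sum((freqs - mean) ** 4 * p) / std ** 4

    # Mean-square energy in dB; the small constant avoids log(0).
    energy_db = 10.0 * np.log10(np.mean(frame ** 2) + 1e-12)
    return np.array([mean, var, skew, kurt, energy_db])

# Usage: a 20 ms frame of a 1 kHz tone sampled at 10 kHz.
t = np.arange(200) / 10000.0
features = moment_features(np.sin(2 * np.pi * 1000.0 * t))
```

Each frame then reduces to a five-element vector, which suits both the discriminant-function classifiers and a network input layer.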
Both statistical and neural network pattern recognition methods were used. Statistical methods included linear and quadratic discriminant functions, a maximum likelihood estimator (MLE), and K-nearest neighbors (KNN). The neural network approach was the Time Delay Neural Network (TDNN), a time-shift-invariant variant of the backpropagation network (Waibel 1989). The classification error rates were 6.1% for quadratic resubstitution, 15.6% for KNN, 18.0% for MLE, 19.0% for the TDNN, and 19.1% for linear resubstitution.
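The time-shift invariance of a TDNN comes from applying the same weights to every window of consecutive frames. The sketch below shows one such time-delay layer in NumPy. It is a minimal illustration of the general idea, not the architecture used in the thesis: the function name `time_delay_layer`, the layer sizes, and the random weights are all hypothetical.

```python
import numpy as np

def time_delay_layer(x, weights, bias):
    """Apply one time-delay layer with sigmoid units.

    x:       (T, F) sequence of T frames with F features each
    weights: (D, F, H) shared weights spanning D consecutive frames
    bias:    (H,) one bias per hidden unit
    Returns an array of shape (T - D + 1, H).
    """
    n_delays, _, n_hidden = weights.shape
    n_out = x.shape[0] - n_delays + 1
    out = np.empty((n_out, n_hidden))
    for t in range(n_out):
        window = x[t:t + n_delays]  # (D, F) slice of consecutive frames
        # Same weights at every time step: this sharing gives shift invariance.
        out[t] = np.tensordot(window, weights, axes=([0, 1], [0, 1])) + bias
    return 1.0 / (1.0 + np.exp(-out))

# Usage: the same 6-frame pattern placed at two different offsets.
rng = np.random.default_rng(0)
pattern = rng.normal(size=(6, 5))
early = np.zeros((15, 5))
early[2:8] = pattern
late = np.zeros((15, 5))
late[5:11] = pattern

w = rng.normal(size=(3, 5, 4))
b = rng.normal(size=4)
out_early = time_delay_layer(early, w, b)
out_late = time_delay_layer(late, w, b)
```

Shifting the input by three frames shifts the layer output by three frames and leaves the activations otherwise unchanged, so summing the output over time gives the same score for both placements.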
Makowski, Gregory Andrew, "Time Delay Neural Networks and Speech Recognition: Context Independence of Stops in Different Vowel Environments" (1991). Masters Theses. 646.