From 2008 until 2012 I worked as a speech processing researcher at the Institute for Human-Machine Communication at Technische Universität München (TUM). Our research group focussed on automatic speech and emotion recognition for intelligent dialog systems.
In my PhD thesis, I developed techniques for context-sensitive classification of speech signals. More specifically, I examined how so-called Long Short-Term Memory (LSTM) neural networks can be used to improve automatic speech and emotion recognition.
LSTM is a recurrent neural network architecture that learns how much contextual information to exploit when classifying a given speech fragment. It was invented in 1997 by Sepp Hochreiter and Jürgen Schmidhuber at TUM. I was the first to apply LSTM to continuous speech recognition and presented my research results at several international conferences.
With the breakthrough of "Deep Learning" methods, the use of neural networks for speech classification tasks has become increasingly popular in the last few years. Since 2016, the big players in speech processing have also been using LSTM for speech recognition: the technology is now used by Google and Microsoft in new products and can also be found in Apple's Siri and Amazon's Alexa (see Wikipedia).