Force Alignment using HMM 2

This is a continuation of the previous post. This post discusses: how accuracy and correctness vary with the number of observation mixtures; how adding noise affects recognition accuracy (the experiment in our last post used clean test input); and the confusion matrix of our test results […]


GMM-Based Speaker Recognition

Introduction: GMM vs K-Means. First, we need to understand the difference between hard decisions and soft decisions. Hard decision: a data point is assigned to a single cluster, and the result is final. Soft decision: a data point is modeled by a distribution over clusters, so its membership is defined probabilistically and there’s no definite […]
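To make the hard/soft distinction concrete, here is a minimal sketch in Python with NumPy. It is not taken from the post itself: the function names `hard_assign` and `soft_assign`, the 1-D toy data, and the fixed means, standard deviations, and weights are all assumptions chosen for illustration. A real K-Means or GMM would estimate these parameters from data.

```python
import numpy as np

def hard_assign(x, centers):
    """K-Means-style hard decision: each point gets exactly one cluster index."""
    # Distance from every point to every center, shape (n_points, n_clusters)
    dist = np.abs(x[:, None] - centers[None, :])
    return dist.argmin(axis=1)

def soft_assign(x, means, stds, weights):
    """GMM-style soft decision: each point gets a probability for every cluster."""
    # Gaussian density of each point under each component
    pdf = np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    joint = weights * pdf
    # Normalize per point so the memberships sum to 1
    return joint / joint.sum(axis=1, keepdims=True)

# Hypothetical toy setup: three 1-D points, two clusters
x = np.array([0.0, 1.0, 2.0])
means = np.array([0.0, 2.0])
stds = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

hard = hard_assign(x, means)
soft = soft_assign(x, means, stds, weights)
```

The point at 1.0 sits exactly between the two means. The hard decision still has to commit to one cluster, while the soft decision gives it equal membership in both.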


LPC & Cepstrum & MFCC

As the title suggests, there are several ways to extract important information from speech signals. We’ll dive into all of them. All speech signals are first passed through a pre-emphasis filter. As we know, the whole process of LPC coefficient extraction can be divided into the following stages (source: https://www.mathworks.com/help/dsp/examples/lpc-analysis-and-synthesis-of-speech.html). First, we would like to find […]
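As a rough illustration of the pre-emphasis step, here is a minimal sketch assuming the common first-order form y[n] = x[n] - alpha * x[n-1]. The filter actually used in the post is not shown in this excerpt, so both the form and the default `alpha=0.97` are assumptions, as is the function name `pre_emphasize`.

```python
import numpy as np

def pre_emphasize(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is an assumed, commonly used value, not one taken from the post.
    """
    signal = np.asarray(signal, dtype=float)
    # The first sample has no predecessor, so it is passed through unchanged
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

# A constant (purely low-frequency) signal is strongly attenuated after the first sample
flat = pre_emphasize([1.0, 1.0, 1.0], alpha=0.5)
```

The filter boosts high frequencies relative to low ones, which is why a constant input shrinks after the first sample.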
