Introduction: In this experiment, we will fine-tune VGG19, pre-trained on ImageNet, on the CIFAR-10 dataset. We will be using PyTorch for this experiment (a Keras version is also available). VGG19 is well known for producing strong results thanks to its depth. The “19” comes from the number of weight layers it has. […] Read more "Transfer Learning of VGG19 on Cifar-10 Dataset using PyTorch"
Introduction: In this lab, we will implement Network In Network, whose purpose is to enhance model discriminability for local patches within the receptive field. Conventional convolutional layers use linear filters followed by a nonlinear activation function. The downside of the conventional method is that the local receptors are too simple and don’t project local […] Read more "Network-in-Network Implementation using TensorFlow"
This post continues the previous one. It discusses: how accuracy and correctness vary with the number of observation mixtures; how adding noise affects our recognition accuracy (the experiment in our last post used clean test input); the confusion matrix of our test results […] Read more "Force Alignment using HMM 2"
Disclaimer: Most of the content is from the HTKBook; this is just a summary based on my own Chinese Digit Recognition Dataset Development Platform OS: Linux 4.9.27-1. Tools: Wavesurfer, HTK, Python2.7. Data Set Segregation: The data set that we will be using is provided by the NCTUDS-100 database. The files are stored in the following format: md010101.pcm […] Read more "Force Alignment using Hidden Markov Model"
Introduction: When collecting EEG data from the brain, we sometimes need to know which frequency band a signal falls in, so that we have more features and information for later tasks. In this experiment, we analyze a signal using the Fast Fourier Transform (FFT) and Power Spectral Density (PSD). There […] Read more "Spectrum Analysis of EEG Signal"
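As a rough illustration of the idea in the excerpt above, here is a minimal sketch of finding the dominant frequency of a signal from its power spectral density. It is not the code from the post: the function names `dft` and `periodogram`, the 128 Hz sampling rate and the synthetic 10 Hz sine are all assumptions made for this example. To stay dependency-free it uses a direct DFT, which gives the same spectrum as an FFT but in O(n²) time, and a plain periodogram as the PSD estimate.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (same result as an FFT, just slower)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def periodogram(x, fs):
    """One-sided periodogram PSD estimate of a real signal sampled at fs Hz."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2
    freqs = [k * fs / n for k in range(half + 1)]
    psd = []
    for k in range(half + 1):
        power = abs(spectrum[k]) ** 2 / (fs * n)
        is_nyquist = (n % 2 == 0 and k == half)
        # Fold the negative frequencies onto the positive ones,
        # except for DC and the Nyquist bin, which have no mirror image.
        if k != 0 and not is_nyquist:
            power *= 2
        psd.append(power)
    return freqs, psd


# Synthetic stand-in for an EEG channel: one second of a 10 Hz sine at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]

freqs, psd = periodogram(signal, fs)
peak_freq = freqs[psd.index(max(psd))]
print("dominant frequency (Hz):", peak_freq)
```

With real recordings one would normally use an FFT routine from a numerical library and average over windows to reduce the variance of the estimate; the sketch only shows how the frequency axis and the PSD scaling fit together.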
Introduction: The advantage of neural networks over other methods comes from their non-linearity. The non-linearity comes from the activation functions applied to linear combinations of the inputs. The activation functions that we will be using here are Sigmoid and ReLU. Let be the output of a neuron after a linear combination of its input neurons […] Read more "Neural Network for Multiclass Classification"
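To make the two activation functions named in the excerpt above concrete, here is a small sketch in plain Python. The helper `neuron` and its argument names are hypothetical and chosen only for this illustration; the post itself may define things differently.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps any real z into the interval (0, 1)."""
    # Use the form that never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Linear combination of the inputs followed by a non-linear activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


print(neuron([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid))
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0, relu))
```

Without the activation, stacking such neurons would only ever produce another linear function of the inputs, which is why the non-linearity matters.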
Introduction: Dynamic Time Warping is an algorithm used to match two speech sequences that have the same content but may differ in the length of certain parts of speech (phones, for example). Here, we will not use phones as the basic unit but frames of MFCC features obtained from feature extraction […] Read more "Dynamic Time Warping for Speech Recognition"
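The frame-by-frame matching described in the excerpt above can be sketched with the textbook dynamic-programming recurrence. This is a minimal illustration, not the post's implementation: the names `dtw_distance` and `euclidean` are assumptions, and the one-dimensional toy "frames" stand in for real MFCC vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Accumulated cost of the best warping path between two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: advancing only seq_a, advancing only seq_b,
            # or advancing both sequences together.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy frames: the second sequence holds the middle frame for longer.
template = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(template, stretched))
```

Because the path may repeat a frame of either sequence, the stretched utterance aligns with the template at no extra cost, which is the property that makes DTW useful for speech of varying duration.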