The time complexity for decoding and evaluating canonical forms of the hidden Markov model for N states and T observations is O(N2T). The training of the HMM using the Baum-Welch algorithm is O(N2TM), where M is the number of iterations.
There are several options to improve the performance of the HMM:
Avoid unnecessary multiplication by 0 in the emission probabilities matrix by either using sparse matrices or tracking the null entries.
Train the HMM on the most relevant subset of the training data. This technique can be particularly effective in the case of tagging of words or a bag of words in natural language processing.
The training of the linear chain conditional random fields is implemented using the same dynamic programming techniques as the HMM implementation (Viterbi, forward-backward passes, and so on). Its time complexity for training T data sequences, N labels (or expected outcomes), and M weights/features λ is O(MTN2).
The time complexity of the training...