## Performance consideration

The time complexity for decoding and evaluating canonical forms of the hidden Markov model for *N* states and *T* observations is *O(N²T)*. The training of the HMM using the Baum-Welch algorithm is *O(N²TM)*, where *M* is the number of iterations.
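To make the *O(N²T)* bound concrete, here is a minimal sketch of the forward (evaluation) pass in Python with NumPy. The variable names (`pi`, `A`, `B`, `obs`) are illustrative, not taken from any particular library:

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward pass of an HMM: evaluates P(observations | model).

    pi:  (N,) initial state probabilities
    A:   (N, N) state transition matrix
    B:   (N, K) emission probabilities matrix
    obs: sequence of T observation indices

    Each of the T steps multiplies an N-vector by an N x N matrix,
    which is where the O(N^2 * T) time complexity comes from.
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # O(N^2) work per observation
    return alpha.sum()
```

Baum-Welch repeats this forward pass (plus a symmetric backward pass) once per iteration, giving the *O(N²TM)* training cost.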

There are several options to improve the performance of the HMM:

- Avoid unnecessary multiplications by 0 in the emission probabilities matrix, either by using sparse matrices or by tracking the null entries.
- Train the HMM on the most *relevant* subset of the training data. This technique can be particularly effective for tagging words or bags of words in natural language processing.
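The first option can be sketched as follows: precompute, for each observation symbol, the set of states with a nonzero emission probability, and restrict each forward step to those states. This is one possible way to "track the null entries", not the book's reference implementation:

```python
import numpy as np

def forward_sparse(pi, A, B, obs):
    """Forward pass that skips states whose emission probability for the
    current symbol is zero, avoiding multiplications by 0 in B.
    """
    # For each symbol k, the states that can actually emit it.
    support = [np.flatnonzero(B[:, k]) for k in range(B.shape[1])]
    alpha = np.zeros(len(pi))
    s0 = support[obs[0]]
    alpha[s0] = pi[s0] * B[s0, obs[0]]
    for o in obs[1:]:
        s = support[o]                       # only states with B[s, o] > 0
        nxt = np.zeros_like(alpha)
        nxt[s] = (alpha @ A[:, s]) * B[s, o]
        alpha = nxt
    return alpha.sum()
```

The savings grow with the sparsity of the emission matrix; for a dense matrix the index bookkeeping adds overhead instead.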

The training of the linear chain conditional random fields is implemented using the same dynamic programming techniques as the HMM implementation (Viterbi, forward-backward passes, and so on). Its time complexity for training *T* data sequences, *N* labels (or expected outcomes), and *M* weights/features *λ* is *O(MTN²)*.
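The *O(MTN²)* term comes from building, at each of the *T* positions, an *N* × *N* table of label-pair potentials, each of which sums *M* weighted features. A minimal sketch, assuming a hypothetical `feature_fn(x, t, y_prev, y)` interface that returns an *M*-vector (this is not a specific library's API):

```python
import numpy as np

def crf_potentials(weights, feature_fn, x, N):
    """Builds the T x N x N log-potential tensor of a linear-chain CRF.

    For each of the T positions and each label pair (y_prev, y), we take
    an O(M) dot product of the weights with the feature vector -- hence
    the O(M * T * N^2) cost per evaluation during training.
    """
    T = len(x)
    psi = np.zeros((T, N, N))
    for t in range(T):
        for yp in range(N):
            for y in range(N):
                f = feature_fn(x, t, yp, y)  # M feature values
                psi[t, yp, y] = weights @ f  # O(M) dot product
    return psi
```

The forward-backward and Viterbi passes then run over this tensor exactly as in the HMM case.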

The time complexity of the training...