# Hidden Markov Models

The Hidden Markov Model (HMM) is a state-based statistical model for sequence modelling.

When used for classification, an HMM represents an **individual** class of observation sequences.
For example, if we were recognizing spoken digits from the Free Spoken Digit Dataset, we would train a separate HMM for each digit,
and use it to recognize observation sequences belonging to that digit's class.

HMMs can be used to classify both categorical and numerical sequences.

See also

See [1] for a detailed introduction to HMMs.

## Parameters and training

An HMM is composed of:

- a **Markov chain**, which models the probability of transitioning between hidden states, and
- an **emission model**, which models the probability of emitting an observation from a hidden state.

An HMM \(\lambda\) is defined by the following parameters:

- **Initial state distribution** \(\boldsymbol{\pi}\): A probability distribution that dictates the probability of the HMM starting in each state.

- **Transition probability matrix** \(A\): A matrix whose rows are probability distributions that determine how likely the HMM is to transition to each state, given some current state.

  Note

  Sequentia HMMs are time homogeneous.

- **Emission probability distributions** \(B\): A collection of \(M\) probability distributions (one for each state) that specify the probability of the HMM emitting an observation given some current state.

  - For categorical sequences, the emission distribution \(b_m(o^{(t)})\) at state \(m\) is a univariate discrete distribution giving the probability of the observation \(o^{(t)}\) at time \(t\) being one of the possible symbols \(\mathcal{S}=\{s_0,s_1,\ldots,s_{K-1}\}\).
    This collection of state emission distributions can be modelled as an \(M \times K\) emission matrix over all states and symbols in \(\mathcal{S}\).

  - For numerical sequences, the emission distribution \(b_m(\mathbf{o}^{(t)})\) at state \(m\) is a multivariate continuous distribution of the probability of the observation \(\mathbf{o}^{(t)}\) at time \(t\).
    Numerical sequence support in Sequentia assumes unbounded real-valued emissions, which are modelled according to a multivariate Gaussian mixture distribution.
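As a concrete illustration, the three parameter sets of a small categorical HMM with \(M = 2\) hidden states and \(K = 3\) symbols might look like the following sketch. The values are made up for illustration and are not produced by Sequentia:

```python
# Hypothetical parameters for a categorical HMM with
# M = 2 hidden states and K = 3 observation symbols.

# Initial state distribution (pi): probability of starting in each state.
pi = [0.6, 0.4]

# Transition matrix (A): A[i][j] = P(state j at time t+1 | state i at time t).
# Each row is a probability distribution over the M states.
A = [
    [0.7, 0.3],
    [0.4, 0.6],
]

# Emission matrix (B): B[m][k] = P(symbol k | state m).
# Each row is a probability distribution over the K symbols.
B = [
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
]

# Sanity check: every row of each parameter set sums to 1.
for row in [pi] + A + B:
    assert abs(sum(row) - 1.0) < 1e-9
```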

HMMs are fitted to observation sequences using the Baum-Welch algorithm, an Expectation-Maximization (EM) procedure based on the forward-backward recursions, which learns all of the parameters described above.
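The core building block of these recursions is the forward pass, which computes the likelihood \(P(O \mid \lambda)\) of a sequence under a given model. A minimal sketch for a categorical HMM, with hypothetical parameter values chosen purely for illustration:

```python
def forward(pi, A, B, obs):
    """Forward algorithm: computes P(obs | model) for a categorical HMM.

    pi:  initial state distribution, length M
    A:   M x M transition matrix
    B:   M x K emission matrix
    obs: sequence of symbol indices in range(K)
    """
    M = len(pi)
    # Initialisation: alpha[m] = pi[m] * b_m(o_1)
    alpha = [pi[m] * B[m][obs[0]] for m in range(M)]
    # Induction: alpha'[j] = (sum_i alpha[i] * A[i][j]) * b_j(o_t)
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(M)) * B[j][o]
            for j in range(M)
        ]
    # Termination: P(obs | model) = sum over states of the final alphas.
    return sum(alpha)

# Example: likelihood of the symbol sequence [0, 2, 1] under hypothetical
# parameters with M = 2 states and K = 3 symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
likelihood = forward(pi, A, B, [0, 2, 1])
```

In practice, implementations scale or work in log space to avoid underflow on long sequences; this sketch omits that for clarity.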

## Topologies

The nature of the transition matrix determines the **topology** of the HMM.

Three common types of topology used in sequence modelling are **ergodic**, **left-right** and **linear**.

- **Ergodic topology**: All states have a non-zero probability of transitioning to any state.
- **Left-right topology**: States are arranged in a way such that any state may only transition to itself or any state ahead of it, but not to any previous state.
- **Linear topology**: Same as left-right, but states are only permitted to transition to the next state.

Left-right topologies are particularly useful for modelling sequences where ordering must be respected.
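The three topologies correspond to different zero patterns in the transition matrix. A sketch of how uniformly-initialized transition matrices for each topology could be constructed (the uniform values are an illustrative choice, not how Sequentia initializes them):

```python
def ergodic(M):
    # Every state can transition to every state: no zero entries.
    return [[1.0 / M] * M for _ in range(M)]

def left_right(M):
    # A state may transition to itself or any later state:
    # rows are upper-triangular.
    return [
        [1.0 / (M - i) if j >= i else 0.0 for j in range(M)]
        for i in range(M)
    ]

def linear(M):
    # A state may only transition to itself or the next state;
    # the final state can only self-transition.
    return [
        [1.0 if j == i else 0.0 for j in range(M)]
        if i == M - 1 else
        [0.5 if j in (i, i + 1) else 0.0 for j in range(M)]
        for i in range(M)
    ]
```

For example, `linear(3)` produces a banded matrix where state 0 can never jump straight to state 2, whereas `ergodic(3)` allows every transition.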

## Making predictions

Multiple HMMs, each trained to recognize an individual observation sequence class, can be combined to form a single multi-class classifier that makes predictions according to posterior maximization.
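In other words, the classifier scores a sequence under each class's HMM and picks the class maximizing \(P(\text{class} \mid O) \propto P(O \mid \text{class}) \, P(\text{class})\). A minimal self-contained sketch, using the standard forward recursion for the per-class likelihoods and hypothetical single-state models named `"zero"` and `"one"` (none of this is Sequentia's API):

```python
def forward(pi, A, B, obs):
    # Forward recursion: computes P(obs | model) for a categorical HMM.
    M = len(pi)
    alpha = [pi[m] * B[m][obs[0]] for m in range(M)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(M)) * B[j][o]
            for j in range(M)
        ]
    return sum(alpha)

def classify(models, priors, obs):
    # Unnormalized posterior for each class:
    # P(class | obs) is proportional to P(obs | class) * P(class).
    scores = {
        c: priors[c] * forward(pi, A, B, obs)
        for c, (pi, A, B) in models.items()
    }
    # Posterior maximization: return the highest-scoring class.
    return max(scores, key=scores.get)

# Two hypothetical single-state HMMs over a binary alphabet:
# "zero" strongly emits symbol 0, "one" strongly emits symbol 1.
models = {
    "zero": ([1.0], [[1.0]], [[0.9, 0.1]]),
    "one":  ([1.0], [[1.0]], [[0.1, 0.9]]),
}
priors = {"zero": 0.5, "one": 0.5}

prediction = classify(models, priors, [0, 0, 0])  # -> "zero"
```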

See HMM Classifier for more detail on how HMMs can be used for classification.

References