38.3 Shannon-Fano Codes

Shannon-Fano codes are an approach to create prefix-free codes based on a set of symbols/characters and their probabilities. The main idea is that we want shorter prefix-free codes for more popular characters, and longer codes for lesser used characters.

The algorithm is:

  • Count relative frequencies of all characters in a text.

  • Split into ‘left’ and ‘right halves’ of roughly equal frequency.

    • Left half gets a leading zero. Right half gets a leading one.

    • Repeat.

At the end, you will get a tree as shown below, with shorter paths for characters with a higher frequency, and longer paths for characters with a lower frequency.

However, Shannon-Fano coding is NOT optimal, so it is not used very often.

Last updated