Replace consecutive repeated values with a (count, value) pair. Effective when data has many repeating sequences.
Example
Original: AAABBBBBCCA
RLE output: 3A5B2C1A
Original: 11 chars × 8 bits = 88 bits
Compressed: 4 pairs × 16 bits = 64 bits
Saving: 27%
Best & Worst Cases
Best case — highly repetitive data
AAAAAAAAAAAAAAAA → 16A (huge saving)
Use case: fax transmission, simple bitmap images
Worst case — random/varied data
ABCDEFG → 1A1B1C1D1E1F1G (file GROWS)
RLE is ineffective for natural language text or photographs.
A variable-length prefix coding scheme where more frequent characters receive shorter codes. It is lossless — the original data can be perfectly reconstructed.
Frequency Table
| Char | Freq | Code | Bits |
|---|---|---|---|
| A | 2 | 0 | 1 |
| B | 2 | 10 | 2 |
| C | 1 | 11 | 2 |
Without Huffman (3 chars → 2 bits each): 5 × 2 = 10 bits
With Huffman: (2×1)+(2×2)+(1×2) = 8 bits → 20% saving
Tree (simplified)
A=0, B=10, C=11 — no code is a prefix of another (prefix-free property).
Type a string of characters and see the RLE output with space saving calculation.
Enter a short string to see its frequency table and Huffman codes calculated.
1. The string 'AABBBCCCCDDDDDD' is compressed using run-length encoding. (a) Show the compressed output. (b) The original uses 8 bits per character. Each RLE pair uses 16 bits (8 for count, 8 for character). Calculate the compression ratio. [4 marks]
Mark scheme:
2. Explain the process of building a Huffman tree for the string 'AABBC'. Include the frequency table and describe the key steps. [5 marks]
Mark scheme:
3. Explain why Huffman coding is described as lossless, and explain one property of a valid Huffman code that ensures unambiguous decoding. [3 marks]
Mark scheme: