Data Compression

J277 · 1.2 Memory & Storage · GCSE Computer Science

Component 01

Why Compress Data?

Faster Transfer

Smaller files transfer more quickly over a network — important for streaming, email attachments, and web content.

Less Storage Space

Compressed files take up less space on disk or in the cloud — more content fits on the same device.

Lossy vs Lossless Compression

Lossy Compression

  • Permanently removes data, typically detail that is hard to see or hear
  • Original file cannot be restored from the compressed version
  • Results in smaller file size than lossless
  • Quality is reduced — visible/audible artefacts at high compression

Common formats:

MP3, JPEG, AAC, H.264 video
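To see why lossy compression cannot be undone, here is a toy sketch in Python. It is not how MP3 or JPEG actually work in full; it only imitates the step where fine detail is thrown away, by rounding each value down to a multiple of 16. The sample values and the `quantise` function are made up for illustration.

```python
# Hypothetical 8-bit samples (0-255); not taken from a real audio or image file.
samples = [200, 201, 203, 198, 57, 58, 60, 61]


def quantise(values, step):
    """Lossy step: round each value down to the nearest multiple of `step`."""
    return [(v // step) * step for v in values]


lossy = quantise(samples, 16)

print(lossy)             # [192, 192, 192, 192, 48, 48, 48, 48]
print(lossy == samples)  # False: the small differences have been discarded
```

After rounding, 200, 201, 203 and 198 all become 192, so there is no way to tell which original value each 192 came from. The data now contains long runs of identical values, which is why it can then be stored in far less space.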

Lossless Compression

  • No data is lost — all original data is retained
  • Original file can be perfectly restored
  • File size reduction is smaller than lossy
  • No quality loss — identical to original

Common formats:

PNG, ZIP, FLAC, GIF

Factor               | Lossy                         | Lossless
File size reduction  | Large reduction               | Moderate reduction
Quality              | Reduced permanently           | No change
Data recovery        | Impossible                    | Full recovery possible
Best for             | Media (audio, video, photos)  | Text, code, archives
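The "full recovery" property of lossless compression can be checked directly. The sketch below uses Python's built-in `zlib` module, which implements DEFLATE, the method most ZIP files use. The sample data is invented and deliberately repetitive so that it compresses well.

```python
import zlib

# Hypothetical sample data with lots of repetition.
original = b"AAAABBBBCCCC" * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original))         # 1200 bytes before compression
print(len(compressed))       # far fewer bytes after compression
print(restored == original)  # True: every byte comes back unchanged
```

Because the restored data is identical byte for byte, lossless compression is the safe choice for text, program code and archives, where changing even one character could break the file.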

Run-Length Encoding (RLE)

RLE is a simple lossless compression method. Each "run" (a sequence of consecutive repeated values) is replaced with a count and the value.

Original data:

A A A B B B B A

RLE encoded:

3A 4B 1A

Original: 8 characters
Compressed: 6 values (3 counts + 3 characters)
Saving: 25%
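The worked example above can be reproduced with a short Python sketch. The function names `rle_encode` and `rle_decode` are chosen here for illustration, and the saving assumes that each count and each character takes the same amount of storage.

```python
def rle_encode(text):
    """Return a list of (count, value) pairs, one pair per run."""
    runs = []
    for ch in text:
        if runs and runs[-1][1] == ch:
            runs[-1][0] += 1       # same value as the previous one: extend the run
        else:
            runs.append([1, ch])   # different value: start a new run
    return [(count, ch) for count, ch in runs]


def rle_decode(pairs):
    """Rebuild the original text by repeating each value `count` times."""
    return "".join(ch * count for count, ch in pairs)


original = "AAABBBBA"
pairs = rle_encode(original)
encoded_values = 2 * len(pairs)  # each run stores one count and one value
saving = (len(original) - encoded_values) / len(original) * 100

print(pairs)                            # [(3, 'A'), (4, 'B'), (1, 'A')]
print(encoded_values)                   # 6
print(saving)                           # 25.0
print(rle_decode(pairs) == original)    # True: RLE is lossless
```

Decoding gives back exactly the original string, which is what makes RLE a lossless method.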

RLE works well when:

  • Data has many consecutive repeated values
  • e.g. a mostly blue sky in an image
  • Simple black & white graphics

RLE works poorly when:

  • Data is highly varied (no repeats)
  • Can actually increase file size if no runs exist
  • e.g. a photographic image with millions of colours
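The difference between good and bad cases for RLE can be measured by counting runs. This is a minimal sketch with made-up 8-character strings, again assuming each run costs two values (one count and one value).

```python
def count_runs(text):
    """Count how many runs of consecutive identical characters the text has."""
    runs = 0
    previous = None
    for ch in text:
        if ch != previous:
            runs += 1
            previous = ch
    return runs


def rle_values(text):
    """Number of values RLE needs: one count plus one value for every run."""
    return 2 * count_runs(text)


for sample in ["AAAAAAAA", "AAABBBBA", "ABCDEFGH"]:
    print(sample, len(sample), rle_values(sample))

# AAAAAAAA 8 2    one long run: 75% smaller
# AAABBBBA 8 6    a few runs: 25% smaller
# ABCDEFGH 8 16   no repeats: twice the original size
```

With no repeats, every character becomes its own run of length 1, so the "compressed" version needs twice as many values as the original.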

Key Terms

Lossy

Compression that permanently removes data. Smaller files but reduced quality.

Lossless

Compression that retains all data. Original file can be perfectly restored.

RLE

Run-Length Encoding. Replaces repeated values with a count + value pair. Lossless.

Artefact

Visual or audio distortion caused by lossy compression (e.g. blurry JPEG edges).

Bit rate

For audio/video: the number of bits used per second of playback. A lower bit rate gives a smaller file, but more data has been discarded, so quality is lower.

Archive

A compressed file (e.g. ZIP) containing one or more files. Usually lossless.
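The link between bit rate and file size in the Bit rate entry above can be worked through as file size = bit rate × duration. The sketch below assumes a hypothetical 3-minute (180-second) track, takes 1 kbps as 1,000 bits per second and 1 MB as 1,000,000 bytes, and ignores any extra data stored in the file such as titles or artwork.

```python
def file_size_bytes(bit_rate_bps, duration_s):
    """File size in bytes = bits per second x seconds, divided by 8 bits per byte."""
    return bit_rate_bps * duration_s // 8


DURATION_S = 180  # hypothetical 3-minute track

for kbps in (320, 128, 64):
    size = file_size_bytes(kbps * 1000, DURATION_S)
    print(kbps, "kbps ->", size / 1_000_000, "MB")

# 320 kbps -> 7.2 MB
# 128 kbps -> 2.88 MB
# 64 kbps -> 1.44 MB
```

Halving the bit rate halves the file size, which is why streaming services lower the bit rate on slow connections at the cost of sound or picture quality.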