Arithmetic coding increases sequence length

Giuseppe Esposito on 25 Jun 2018
Edited: Michael Montouchet on 2 Oct 2025 at 15:59
Hi all, I'm using the function "arithenco(seq,counts)" to compress a sequence of 65536 symbols, each of which is 1, 2, 3 or 4. The corresponding counts (number of occurrences of each symbol) are [1991,7759,52117,3669], so symbol 3 has a high probability of occurring and I would expect a compression gain from arithmetic coding. But this doesn't happen: the function outputs a code of length 66424 (longer than the original). How is this possible? Thank you for your attention.

Answers (1)

Michael Montouchet on 2 Oct 2025 at 15:57
Edited: Michael Montouchet on 2 Oct 2025 at 15:59
To write your sequence of 1's, 2's, 3's and 4's in binary with a fixed-length code, you need at least 2 bits per symbol, so the original sequence occupies 2 * 65536 = 131072 bits.
The arithmetic code consists only of 0's and 1's, so each of its elements is 1 bit and the code occupies 1 * 66424 = 66424 bits.
So you did compress the input: the code is about 51% of the original size, i.e. nearly half. The two lengths are counted in different units (4-ary symbols versus bits), which is why the code only looks longer.
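The comparison above can be checked numerically. The following is a minimal sketch in Python (chosen only so that it runs without the toolbox; it does not call arithenco). It takes the counts and the reported code length from the question and compares three sizes: the fixed-length binary size, the reported code size, and the Shannon entropy bound for these counts. The variable names are illustrative assumptions, not part of any MATLAB API.

```python
import math

# Symbol counts for 1, 2, 3, 4, as given in the question.
counts = [1991, 7759, 52117, 3669]
n_symbols = sum(counts)  # total sequence length
coded_bits = 66424       # code length reported in the question

# Fixed-length binary encoding: ceil(log2(alphabet size)) bits per symbol.
bits_per_symbol = math.ceil(math.log2(len(counts)))
raw_bits = bits_per_symbol * n_symbols

# Shannon entropy of the empirical distribution, in bits per symbol.
# No lossless code driven by these counts can average fewer bits than this.
entropy = -sum(c / n_symbols * math.log2(c / n_symbols) for c in counts)
entropy_bound_bits = entropy * n_symbols

print(f"raw size:      {raw_bits} bits")
print(f"coded size:    {coded_bits} bits")
print(f"ratio:         {coded_bits / raw_bits:.3f}")
print(f"entropy:       {entropy:.4f} bits/symbol")
print(f"entropy bound: {entropy_bound_bits:.0f} bits")
```

For these counts the entropy is roughly 1.01 bits per symbol, so the reported 66424 bits sits only slightly above the theoretical minimum. The encoder is therefore doing about as well as any coder based on these counts can.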
