danieleades
What is the optimum precision to use in an arithmetic encoder?
I've implemented an arithmetic coder here: https://github.com/danieleades/arithmetic-coding
I'm struggling to understand a general way to choose an optimal number of bits for representing integers within the encoder. I'm using a model where probabilities are represented as rationals.
I know that to prevent underflows/overflows, the number of bits used to represent integers within the encoder/decoder must be at least 2 bits greater than the maximum number of bits used to represent the denominator of the probabilities.
For example, if I use a maximum of 10 bits to represent the denominator of the probabilities, then to ensure the encoding/decoding works, I need to use at least MAX_DENOMINATOR_BITS + 2 = 12 bits to represent the integers.
If I were to use 32-bit integers to store these values, I would have another 10 bits up my sleeve (right?).
I've seen a couple of examples that use 12 bits for integers and 8 bits for probabilities, with a 32-bit integer type. Is this somehow optimal, or is this just a fairly generic choice?
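To make sure I've understood the two constraints, here's how I'm reasoning about them in a small sketch (the names are mine, not from my crate, and I'm assuming from the article that the overflow constraint is precision bits + denominator bits <= word bits):

```rust
// Both constraints, as I currently understand them.
// Assumption: `range * count` must fit in the integer word, i.e.
// precision_bits + denominator_bits <= word_bits.

/// Returns the (min, max) usable precision for a given word size and
/// denominator width, or `None` if the word is too small.
fn precision_range(word_bits: u32, denominator_bits: u32) -> Option<(u32, u32)> {
    let min = denominator_bits + 2;                      // avoid underflow of the interval
    let max = word_bits.saturating_sub(denominator_bits); // avoid overflow of range * count
    (min <= max).then_some((min, max))
}

fn main() {
    // My case: 32-bit integers, 10-bit denominator.
    // Minimum precision is 12, maximum is 22, i.e. 10 bits of headroom.
    assert_eq!(precision_range(32, 10), Some((12, 22)));

    // The "12 bits of precision, 8-bit probabilities" examples sit well
    // inside both bounds (the valid range there is 10..=24).
    assert_eq!(precision_range(32, 8), Some((10, 24)));
}
```

If that's right, then maximising precision on a 32-bit word with a 10-bit denominator means using 22 bits, which is where the "10 bits up my sleeve" figure above comes from.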
I've found that increasing the precision above the minimum improves the compression ratio slightly (but the improvement saturates quickly). Given that increasing the precision improves compression, what is the optimum choice? Should I simply aim to maximise the number of bits I use to represent the integers for a given denominator? Performance is a non-goal for my application, in case that's a consideration.
Is it possible to quantify the benefit of moving to, say, a 64-bit internal representation to provide a greater number of precision bits?
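For what it's worth, here's the arithmetic for how many extra precision bits a 64-bit word would buy me, under the same (assumed) overflow constraint; what I can't work out is how to translate those extra bits into an expected improvement in compression ratio:

```rust
/// Maximum precision under the assumed overflow constraint
/// (precision_bits + denominator_bits <= word_bits).
fn max_precision_bits(word_bits: u32, denominator_bits: u32) -> u32 {
    word_bits.saturating_sub(denominator_bits)
}

fn main() {
    // With a 10-bit denominator: 22 precision bits for a u32 vs 54 for a u64.
    assert_eq!(max_precision_bits(32, 10), 22);
    assert_eq!(max_precision_bits(64, 10), 54);
}
```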
I've based my implementation on this (excellent) article - https://marknelson.us/posts/2014/10/19/data-compression-with-arithmetic-coding.html
Tags: encoding, compression, integer-arithmetic, lossless-compression