When using LZ77, why is the compressed message longer than the original (uncompressed) version?
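Dictionary coders (such as LZ77) replace repeated segments of the
message with short references to an earlier copy. In LZ77 the
"dictionary" is not stored separately: it is a sliding window over
the most recently processed data, and each reference is an
(offset, length) pair pointing back into that window. This works
very well for repeated values, or for messages that use a small
portion of the possible values (such as 26 letters out of the 256
possible values in a byte), where matches are more likely. However,
if the data does not repeat, or uses almost all of the possible
values randomly, the encoder finds few or no matches and has to
emit nearly every byte as a literal. Each literal still carries the
bookkeeping that distinguishes it from a reference (a flag bit, or
an empty offset/length pair), so you pay for the reference
machinery PLUS the original data. This can make the message longer.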
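To make the overhead concrete, here is a minimal sketch of a toy LZ77
coder. It is not the format used by any real tool: it assumes the
textbook (offset, length, next byte) triple, a 255-byte window, and a
fixed cost of 3 bytes per triple. The names lz77_encode, lz77_decode
and encoded_size are made up for this illustration.

```python
def lz77_encode(data: bytes, window: int = 255, max_len: int = 255):
    """Toy LZ77: emit (offset, length, next_byte) triples."""
    triples = []
    i = 0
    while i < len(data):
        best_off, best_len = 0, 0
        # Search the sliding window for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one byte early so every triple has a next_byte.
            while (length < max_len
                   and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        triples.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return triples


def lz77_decode(triples) -> bytes:
    out = bytearray()
    for off, length, nxt in triples:
        for _ in range(length):
            out.append(out[-off])  # byte-by-byte copy allows overlap
        out.append(nxt)
    return bytes(out)


def encoded_size(triples) -> int:
    # Assumption: offset, length and next_byte take one byte each.
    return 3 * len(triples)


repetitive = b"abc" * 60       # 180 bytes, highly repetitive
unique = bytes(range(256))     # 256 bytes, no value ever repeats

print(len(repetitive), encoded_size(lz77_encode(repetitive)))  # 180 12
print(len(unique), encoded_size(lz77_encode(unique)))          # 256 768
```

With no repeats, every input byte turns into a full triple with an
empty (0, 0) match, so under these assumptions the output is three
times the size of the input. Real formats use cheaper literals (for
example a single flag bit), but the cost never drops to zero.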
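In general terms, if the entropy of the message is already close to
the maximum (as with random, encrypted, or already compressed
data), a lossless compression scheme cannot be expected to shorten
it, and the scheme's own overhead makes the output slightly longer.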
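As a rough way to check this on your own data, the sketch below
estimates the zero-order Shannon entropy in bits per byte. This is
only a proxy: it looks at byte frequencies and ignores repetition of
longer sequences, which is exactly what LZ77 exploits. The function
name entropy_bits_per_byte is made up for this illustration.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Zero-order Shannon entropy of the byte frequencies in data."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


text = b"the quick brown fox jumps over the lazy dog " * 20
flat = bytes(range(256))  # every byte value exactly once

print(entropy_bits_per_byte(text))  # well below 8 bits per byte
print(entropy_bits_per_byte(flat))  # 8.0, the maximum for bytes
```

A value near 8 bits per byte means a coder that treats bytes
independently has essentially nothing to remove. A low value does not
guarantee that LZ77 in particular will do well, and a high value does
not rule out long repeated sequences, so treat the number as a hint
rather than a verdict.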