COST (GBP)

4.00

0.00

Computing › Io › Compression ›

Variable LZW

Implements the variable compression size LZW algorithm

Controller: CodeCogs

Contents

Interface
Class VariableLZW
Page Comments

Interface

C++

Class VariableLZW

This class implements a modified version of the Lempel-Ziv-Welch (LZW) compression method, allowing for the symbols in the compressed and uncompressed data streams to contain a variable number of bits. On contrast, when using the classic LZW algorithm available as IO/Compression/LZW, each symbol contains a fixed number of bits.

Using a variable number of bits per symbol allows for higher compression rates, which is why this version of the algorithm is better than the original one. It is also the algorithm used by the GIF file format to encode and decode its image data.

The example below shows how to encode a random string of 1000 characters, only consisting of the capital letters in the English alphabet.

Example 1

#include <codecogs/computing/io/compression/variablelzw.h>
#include <iostream>
#include <stdlib.h>
#include <time.h>
 
// the number of characters to generate in the sample string
#define N 1000
 
using namespace Computing::IO::Compression;
 
int main()
{
  // initialize random seed
  srand(time(0));
 
  // set the alphabet to include the letters A, B, C, D
  std::string alphabet = "ABCD";
 
  // create a new variable LZW coder
  // with the alphabet set to the one defined above
  IO::Compression::VariableLZW coder(alphabet);
 
  // generate a random string of N capital letters
  // from the set {A, B, C, D} to allow more repetitions
  std::string sample;
  for (int i = 0; i < N; ++i)
  {
    sample += 'A' + rand() % 4;
  }
 
  // compress the sample string
  std::string compressed = coder.encode(sample);
 
  // uncompress the compressed data
  std::string uncompressed = coder.decode(compressed);
 
  // compare the sizes of the compressed and uncompressed strings
  std::cout << "      Size of the sample string: " << N << std::endl;
  std::cout << "  Size of the compressed string: " << compressed.size() << std::endl;
  std::cout << "Size of the uncompressed string: " << uncompressed.size() << std::endl;
 
  std::cout << std::endl;
 
  // test if the sample and the uncompressed strings are identical
  // (this proves that the LZW compression algorithm does not affect the initial data)
  bool identical = (N == uncompressed.size());
  for (size_t i = 0; identical && i < uncompressed.size(); ++i)
    if (sample[i] != uncompressed[i])
      identical = false;
 
  if (identical)
    std::cout << "The sample and uncompressed strings are identical." << std::endl;
  else
    std::cout << "Error! The sample and uncompressed strings are distinct." << std::endl;
 
  return 0;
}

Output:

Size of the sample string: 1000
  Size of the compressed string: 342
Size of the uncompressed string: 1000
 
The sample and uncompressed strings are identical.

Authors

Lucian Bentea (August 2009)

References

LZW and GIF explained by Steve Blackstock, http://www.cis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html
"Compressed image file formats: JPEG, PNG, GIF, XBM, BMP" by John Miano, Chapter 12, http://books.google.com/books?id=_nJLvY757dQC&dq=Compressed+image+file+formats&printsec=frontcover&source=bn&hl=ro&ei=ImaPSuGxIsiK_Abyh5SvAg&sa=X&oi=book_result&ct=result&resnum=4#v=onepage&q=&f=false
Implementation of the original LZW algorithm using various programming languages, http://rosettacode.org/wiki/LZW_compression

Source Code

Source code is available when you agree to a GP Licence or buy a Commercial Licence.

Not a member, then Register with CodeCogs. Already a Member, then Login.

Members of VariableLZW

VariableLZW

VariableLZW(

)

The implicit constructor sets the default alphabet to the extended ASCII table (codes 0 through 255) and also sets the initial code size to 9 bits, correspondingly. It also sets the maximum code size to 12, allowing the dictionary to hold at most 4096 entries.

VariableLZW

VariableLZW( std::string& alphabet )[constructor]

The alphabet is the set of values that the symbols in your raw or uncompressed data can take. For instance, if you would like to encode sequences like ABAABBAAABBB, only containing the letters A and B, then the length two alphabet "AB" is sufficient.

This version of the constructor sets the alphabet to the one given through the alphabet argument.

alphabet

the alphabet to use when encoding and decoding

SetAlphabet

voidsetAlphabet( std::string& alphabet )

Sets the alphabet to the one specified by the alphabet argument.

alphabet

the alphabet to use when encoding and decoding

GetAlphabet

std::stringgetAlphabet(

)

Returns the current alphabet used by the encoding and decoding methods.

Returns

the currently used alphabet

SetInitCodeSize

voidsetInitCodeSize( int initCodeSize )

Sets the initial code size used when encoding and decoding data.

initCodeSize

the value to set the initial code size to

GetInitCodeSize

intgetInitCodeSize(

)

Returns the current value of the initial code size.

CalculateInitCodeSize

voidcalculateInitCodeSize(

)

This method calculates the initial code size to use when encoding and decoding data, based on the specified alphabet size.

SetMaxCodeSize

voidsetMaxCodeSize( int maxCodeSize )

Sets the maximum code size to the value specified by maxCodeSize.

The maximum code size gives the maximum number of bits used when encoding a symbol in the raw data. This value is important in the encoding algorithm because, as soon as the dictionary has a number of entries equal to two to the power of maximum code size, a clear code is added to the compressed data stream and the dictionary is reset.

maxCodeSize

the maximum code size to use when encoding and decoding data

GetMaxCodeSize

intgetMaxCodeSize(

)

Returns the current maximum code size used by the encoding and decoding methods.

Returns

the currently used maximum code size

Encode

std::stringencode(	std::string&	`input`
	bool	`calcInitCodeSize` `= true`	)

This method compresses the raw data given through input using the variable code size LZW encoding algorithm.

If the calcInitCodeSize parameter is set to true, the initial code size is calculated prior to compressing the data, using the calculateInitCodeSize method.

Assuming that sample is a string and coder is an instance of the VariableLZW class, the following line encodes sample and stores the result in the compressed string:

std::string compressed = coder.encode(sample);

input	the raw data
calcInitCodeSize	specifies whether the initial code size should be calculated prior to encoding the data; this parameter is set to true by default

Returns

the compressed data

Decode

std::stringdecode( std::string& input )

This method uncompresses the compressed data given through input using the variable code size LZW decoding algorithm.

Make sure that the alphabet and the initial code size are initialized prior to calling this method.

Assuming that compressed is a string resulted from an encoding operation, and coder is an instance of the VariableLZW class, the following line decodes the compressed string and stores the result in the uncompressed string:

std::string uncompressed = coder.decode(compressed);

input

the compressed data

Returns

the uncompressed data