Variable LZW
Implements the variable compression size LZW algorithm
Controller: CodeCogs
Interface
C++
Class VariableLZW
This class implements a modified version of the Lempel-Ziv-Welch (LZW) compression method, allowing for the symbols in the compressed and uncompressed data streams to contain a variable number of bits. On contrast, when using the classic LZW algorithm available as IO/Compression/LZW, each symbol contains a fixed number of bits. Using a variable number of bits per symbol allows for higher compression rates, which is why this version of the algorithm is better than the original one. It is also the algorithm used by the GIF file format to encode and decode its image data. The example below shows how to encode a random string of 1000 characters, only consisting of the capital letters in the English alphabet.Example 1
#include <codecogs/computing/io/compression/variablelzw.h> #include <iostream> #include <stdlib.h> #include <time.h> // the number of characters to generate in the sample string #define N 1000 using namespace Computing::IO::Compression; int main() { // initialize random seed srand(time(0)); // set the alphabet to include the letters A, B, C, D std::string alphabet = "ABCD"; // create a new variable LZW coder // with the alphabet set to the one defined above IO::Compression::VariableLZW coder(alphabet); // generate a random string of N capital letters // from the set {A, B, C, D} to allow more repetitions std::string sample; for (int i = 0; i < N; ++i) { sample += 'A' + rand() % 4; } // compress the sample string std::string compressed = coder.encode(sample); // uncompress the compressed data std::string uncompressed = coder.decode(compressed); // compare the sizes of the compressed and uncompressed strings std::cout << " Size of the sample string: " << N << std::endl; std::cout << " Size of the compressed string: " << compressed.size() << std::endl; std::cout << "Size of the uncompressed string: " << uncompressed.size() << std::endl; std::cout << std::endl; // test if the sample and the uncompressed strings are identical // (this proves that the LZW compression algorithm does not affect the initial data) bool identical = (N == uncompressed.size()); for (size_t i = 0; identical && i < uncompressed.size(); ++i) if (sample[i] != uncompressed[i]) identical = false; if (identical) std::cout << "The sample and uncompressed strings are identical." << std::endl; else std::cout << "Error! The sample and uncompressed strings are distinct." << std::endl; return 0; }
Output:Size of the sample string: 1000 Size of the compressed string: 342 Size of the uncompressed string: 1000 The sample and uncompressed strings are identical.
Authors
- Lucian Bentea (August 2009)
References
- LZW and GIF explained by Steve Blackstock, http://www.cis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html
- "Compressed image file formats: JPEG, PNG, GIF, XBM, BMP" by John Miano, Chapter 12, http://books.google.com/books?id=_nJLvY757dQC&dq=Compressed+image+file+formats&printsec=frontcover&source=bn&hl=ro&ei=ImaPSuGxIsiK_Abyh5SvAg&sa=X&oi=book_result&ct=result&resnum=4#v=onepage&q=&f=false
- Implementation of the original LZW algorithm using various programming languages, http://rosettacode.org/wiki/LZW_compression
Source Code
Source code is available when you agree to a GP Licence or buy a Commercial Licence.
Not a member, then Register with CodeCogs. Already a Member, then Login.
Members of VariableLZW
VariableLZW
The implicit constructor sets the default alphabet to the extended ASCII table (codes 0 through 255) and also sets the initial code size to 9 bits, correspondingly. It also sets the maximum code size to 12, allowing the dictionary to hold at most 4096 entries.VariableLZW( ) VariableLZW
The alphabet is the set of values that the symbols in your raw or uncompressed data can take. For instance, if you would like to encode sequences like ABAABBAAABBB, only containing the letters A and B, then the length two alphabet "AB" is sufficient. This version of the constructor sets the alphabet to the one given through the alphabet argument.VariableLZW( std::string& alphabet )[constructor] alphabet the alphabet to use when encoding and decoding
SetAlphabet
Sets the alphabet to the one specified by the alphabet argument.voidsetAlphabet( std::string& alphabet ) alphabet the alphabet to use when encoding and decoding
GetAlphabet
Returns the current alphabet used by the encoding and decoding methods.std::stringgetAlphabet( ) Returns
- the currently used alphabet
SetInitCodeSize
Sets the initial code size used when encoding and decoding data.voidsetInitCodeSize( int initCodeSize ) initCodeSize the value to set the initial code size to
GetInitCodeSize
Returns the current value of the initial code size.intgetInitCodeSize( ) CalculateInitCodeSize
This method calculates the initial code size to use when encoding and decoding data, based on the specified alphabet size.voidcalculateInitCodeSize( ) SetMaxCodeSize
Sets the maximum code size to the value specified by maxCodeSize. The maximum code size gives the maximum number of bits used when encoding a symbol in the raw data. This value is important in the encoding algorithm because, as soon as the dictionary has a number of entries equal to two to the power of maximum code size, a clear code is added to the compressed data stream and the dictionary is reset.voidsetMaxCodeSize( int maxCodeSize ) maxCodeSize the maximum code size to use when encoding and decoding data
GetMaxCodeSize
Returns the current maximum code size used by the encoding and decoding methods.intgetMaxCodeSize( ) Returns
- the currently used maximum code size
Encode
This method compresses the raw data given through input using the variable code size LZW encoding algorithm. If the calcInitCodeSize parameter is set to true, the initial code size is calculated prior to compressing the data, using the calculateInitCodeSize method. Assuming that sample is a string and coder is an instance of the VariableLZW class, the following line encodes sample and stores the result in the compressed string:std::stringencode( std::string& input bool calcInitCodeSize = true
) std::string compressed = coder.encode(sample);
input the raw data calcInitCodeSize specifies whether the initial code size should be calculated prior to encoding the data; this parameter is set to true by default
Returns
- the compressed data
Decode
This method uncompresses the compressed data given through input using the variable code size LZW decoding algorithm. Make sure that the alphabet and the initial code size are initialized prior to calling this method. Assuming that compressed is a string resulted from an encoding operation, and coder is an instance of the VariableLZW class, the following line decodes the compressed string and stores the result in the uncompressed string:std::stringdecode( std::string& input ) std::string uncompressed = coder.decode(compressed);
input the compressed data
Returns
- the uncompressed data