Greedy Algorithm: A Brief Overview of Huffman Coding

Sagar Chauhan
Apr 11, 2022

When did it start?

In 1951, David A. Huffman developed the Huffman coding algorithm while he was a graduate student at MIT. The algorithm was published in his 1952 paper “A Method for the Construction of Minimum-Redundancy Codes”.

What is Huffman Coding?

Huffman Coding is a technique for reducing the size of data or a message by assigning codes to characters based on how frequently they occur. It is a lossless compression method: the original data can be recovered exactly from the compressed form.

The idea behind the Huffman algorithm is to use a variable-length encoding scheme: it assigns binary codes of different lengths to characters depending on their frequencies.

The characters that occur most frequently are assigned the shortest codes, while the characters that occur least frequently are assigned the longest codes.

Huffman Coding also follows the prefix rule: “the binary code assigned to any character cannot be a prefix of the code of any other character”. This rule is what lets a decoder split a stream of bits back into characters without any separators.
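The prefix rule can be checked mechanically. The sketch below is only an illustration: `is_prefix_free` is a hypothetical helper name, and the code table is an assumed one in which B is coded as 100, as in the decoding example later in this article.

```python
def is_prefix_free(codes):
    """Return True if no code in the table is a prefix of another code."""
    values = sorted(codes.values())
    # After sorting, a code that is a prefix of some other code is always
    # immediately followed by a code that starts with it.
    return all(not b.startswith(a) for a, b in zip(values, values[1:]))


# Assumed code table for illustration (B = 100 matches the decoding example).
valid_codes = {"C": "0", "A": "11", "B": "100", "D": "101"}
# Here "0" is a prefix of "01", so the prefix rule is broken.
invalid_codes = {"C": "0", "A": "01", "B": "100", "D": "101"}

print(is_prefix_free(valid_codes))    # True
print(is_prefix_free(invalid_codes))  # False
```

With the invalid table, the bits 01 could be read either as C followed by the start of another character or as A, which is exactly the ambiguity the prefix rule prevents.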

Where is Huffman Coding used?

Huffman Coding is used in many real-world systems. As a well-known lossless compression algorithm, it serves as a coding stage inside multimedia formats such as JPEG, PNG, and MP3. General-purpose compression formats such as GZIP and ZIP also use it, in combination with other techniques. It is also used in fax and text message transmission.

What is a Huffman Tree?

Huffman Coding is built on a Huffman tree. A Huffman tree is a full binary tree (every node has either zero or two children) in which each leaf node holds one character.

While creating a Huffman tree, we repeatedly take the two nodes that have the minimum weight among all the remaining nodes and add their weights to create a new parent node.

For example, we first take the characters B and D and add them up to make a new node, because B and D have the smallest weights among all the given characters.

By always merging the minimum-weight nodes in this way, we get the full binary tree mentioned above.

How to create a Huffman Tree?

A Huffman tree can be built by following a few major steps. Let’s understand Huffman Coding with the help of an example. Suppose we have a string message that we have to send over a network.

Analysis: The above string contains 15 characters, and each character takes 8 bits. Therefore, without Huffman Coding, the size of our string is 15 (characters) * 8 (bits per character) = 120 bits. Hence we need 120 bits to send this data to the receiver.

Now let’s apply Huffman encoding to this string.

Step 1 —

First of all, we have to calculate the frequency of each character in the given string.

Frequency of characters in a string
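A minimal sketch of this step in Python is shown here. The message string is an assumption: it is one 15-character string that matches the B(1) and D(3) frequencies used in this article, and the frequencies of A and C follow from that choice.

```python
from collections import Counter

# Assumed example message: 15 characters, with B once and D three times.
message = "BCAADDDCCACACAC"

# Counter maps each character to the number of times it occurs.
frequencies = Counter(message)

print(len(message) * 8)   # 120 bits at 8 bits per character
print(dict(frequencies))  # {'B': 1, 'C': 6, 'A': 5, 'D': 3}
```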

Step 2 —

Now we have to sort the characters in increasing order of their frequencies.

Frequency based sorting of a given string
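The sorting step could look like the following sketch, again using the assumed 15-character message (the original string is not reproduced here, so the counts for A and C are assumptions).

```python
from collections import Counter

# Assumed example message: 15 characters, with B once and D three times.
message = "BCAADDDCCACACAC"
frequencies = Counter(message)

# Sort (character, frequency) pairs by frequency, smallest first.
ordered = sorted(frequencies.items(), key=lambda item: item[1])

print(ordered)  # [('B', 1), ('D', 3), ('A', 5), ('C', 6)]
```

In practice a priority queue (min-heap) is often used instead of a full sort, because the two smallest items have to be found again after every merge.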

Step 3 —

Create a binary tree by making each character a leaf node. Now create a blank node, let’s say ‘a’. Select the character with the minimum frequency, i.e. B, and make it the left child of ‘a’; then select the character with the second minimum frequency, i.e. D, and make it the right child of ‘a’.

Set the value of ‘a’ to the sum of the two least frequent nodes. The sum of B(1) and D(3) is 4, so assign the value 4 to ‘a’.
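One way to sketch this single merge is with Python's `heapq` module. The helper name `merge_two_smallest` and the `children` dictionary are hypothetical, and the frequencies of A and C are assumed values; only B(1) and D(3) come from the text.

```python
import heapq


def merge_two_smallest(heap, children, new_label):
    """Pop the two lightest nodes and push their parent back on the heap."""
    left = heapq.heappop(heap)    # node with the minimum weight
    right = heapq.heappop(heap)   # node with the second minimum weight
    parent = (left[0] + right[0], new_label)
    children[new_label] = (left[1], right[1])  # (left child, right child)
    heapq.heappush(heap, parent)
    return parent


# Each entry is (weight, label). A=5 and C=6 are assumed frequencies.
heap = [(5, "A"), (1, "B"), (6, "C"), (3, "D")]
heapq.heapify(heap)
children = {}

print(merge_two_smallest(heap, children, "a"))  # (4, 'a')
print(children)                                 # {'a': ('B', 'D')}
print(sorted(heap))                             # [(4, 'a'), (5, 'A'), (6, 'C')]
```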

Step 4 —

Now treat this sum as a single node alongside the remaining characters. Again select the two minimum frequencies and add them together to make a new node whose value is the sum of the two.

Step 5 —

By repeating steps 3 and 4 until only one node remains, we finally get a full binary tree: the complete Huffman tree.
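Repeating the merge until one node is left can be sketched as a loop over a min-heap. This is a minimal sketch rather than a definitive implementation: `build_huffman_tree` is a hypothetical name, the nested-tuple node layout is an assumption made for brevity, and the message is the assumed 15-character example string. It expects a non-empty message.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_tree(message):
    """Return the root of a Huffman tree as nested tuples.

    A leaf is (weight, char); an internal node is (weight, left, right).
    """
    tie_breaker = count()  # keeps heap entries comparable when weights tie
    heap = [(freq, next(tie_breaker), (freq, char))
            for char, freq in Counter(message).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # minimum weight
        w2, _, right = heapq.heappop(heap)   # second minimum weight
        parent = (w1 + w2, left, right)
        heapq.heappush(heap, (w1 + w2, next(tie_breaker), parent))
    return heap[0][2]


root = build_huffman_tree("BCAADDDCCACACAC")
print(root[0])  # 15, the total number of characters
print(root)     # (15, (6, 'C'), (9, (4, (1, 'B'), (3, 'D')), (5, 'A')))
```

The intermediate weights 4, 9, and 15 correspond to the internal nodes of the tree described in this article.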

Step 6 —

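At last, mark each left edge with 0 and each right edge with 1. The code of a character is the sequence of edge labels on the path from the root to its leaf.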

Complete Huffman Tree
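Reading the codes off the tree can be sketched as a recursive walk. The tree below is written out by hand as nested tuples and is an assumed reconstruction of the figure: its nodes (15), (9), (4), and the leaf B(1) follow the decoding example in this article, while `assign_codes` is a hypothetical helper name.

```python
def assign_codes(node, prefix="", codes=None):
    """Walk the tree, adding 0 for a left edge and 1 for a right edge."""
    if codes is None:
        codes = {}
    if len(node) == 2:                 # leaf: (weight, char)
        _, char = node
        codes[char] = prefix or "0"    # a lone symbol still needs one bit
    else:                              # internal node: (weight, left, right)
        _, left, right = node
        assign_codes(left, prefix + "0", codes)
        assign_codes(right, prefix + "1", codes)
    return codes


# Assumed tree: leaves are (weight, char), internal nodes (weight, left, right).
tree = (15, (6, "C"), (9, (4, (1, "B"), (3, "D")), (5, "A")))

print(assign_codes(tree))  # {'C': '0', 'B': '100', 'D': '101', 'A': '11'}
```

The most frequent character, C, ends up with a 1-bit code, while the least frequent characters, B and D, get 3-bit codes.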

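Analysis: Earlier, to send this text data over a network we required 120 bits, while after applying Huffman Coding we need only 32 + 15 + 28 = 75 bits. Here 28 bits is the encoded message itself, and the remaining 32 + 15 bits account for the table of characters and frequencies sent along with it.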
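The size of the encoded message can be checked with a few lines of Python. Both the message and the code table are assumptions chosen to match the frequencies and the tree used in this article; the sketch only counts the bits of the encoded message, not the extra bits needed for the table.

```python
# Assumed message and code table, consistent with the tree in this article.
message = "BCAADDDCCACACAC"
codes = {"C": "0", "A": "11", "B": "100", "D": "101"}

# Replace every character with its binary code.
encoded = "".join(codes[char] for char in message)

print(len(message) * 8)  # 120 bits without compression
print(len(encoded))      # 28 bits for the encoded message
```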

How to Decode with a Huffman Tree?

To decode a Huffman code, we traverse from the root node to the required leaf node, following the binary digits marked on the left and right edges of the Huffman tree.

Complete Huffman Tree

For example, suppose we have to decode the code 100. We start at the root node and first follow the edge 1 to reach node (9). Then we follow the edge 0 to reach node (4), and follow the edge 0 again to finally reach node (1), which is the leaf node for character B. Hence, the code 100 decodes to the character B.
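The same traversal can be sketched in Python. The nested-tuple tree is an assumed reconstruction of the figure (its nodes 9, 4, and B(1) match the walk described above), and `decode` is a hypothetical helper that expects a tree with at least one internal node.

```python
def decode(bits, tree):
    """Follow 0 to the left child and 1 to the right child until a leaf."""
    decoded = []
    node = tree
    for bit in bits:
        # Internal nodes are (weight, left, right).
        node = node[1] if bit == "0" else node[2]
        if len(node) == 2:            # reached a leaf: (weight, char)
            decoded.append(node[1])
            node = tree               # start again from the root
    return "".join(decoded)


# Assumed tree: leaves are (weight, char), internal nodes (weight, left, right).
tree = (15, (6, "C"), (9, (4, (1, "B"), (3, "D")), (5, "A")))

print(decode("100", tree))        # B
print(decode("100011101", tree))  # BCAD
```

Because of the prefix rule, the decoder never needs to look ahead: as soon as it reaches a leaf, that character is complete.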

Conclusion

From the above, we can conclude that Huffman Coding is a simple and efficient algorithm for encoding data to compress its size without any data loss, and for decoding it back to the original. Huffman Coding is used in many real-world scenarios.
