Write a program that implements Huffman coding

Reference no: EM131404694

Write a program that implements the "Huffman coding" compression algorithm using priority queues and binary trees. Huffman coding is an algorithm devised by David A. Huffman at MIT and published in 1952 for compressing text data so that a file occupies fewer bytes. Text data is normally stored in a fixed-width format of 8 bits per character, commonly using ASCII or one of its 8-bit extensions, which map each character to an integer value from 0 to 255 (standard ASCII itself defines only the values 0 to 127). The idea of Huffman coding is to abandon the rigid 8-bits-per-character requirement and use binary encodings of different lengths for different characters. If a character occurs frequently in the file, such as the letter "e", it can be given a shorter encoding (fewer bits), which makes the file smaller.
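To make the size saving concrete, here is a small sketch that compares a fixed 8-bit encoding with a variable-length one. The code table and the sample text are hypothetical and chosen by hand for illustration; they are not produced by the Huffman algorithm itself.

```python
# Hypothetical prefix-free code table (an assumption for illustration only):
# the most frequent character gets the shortest code.
codes = {"e": "0", "t": "10", "a": "110", "x": "111"}

text = "eeeeeettax"  # hypothetical sample text in which "e" dominates

# Fixed-width storage: 8 bits for every character.
fixed_bits = 8 * len(text)

# Variable-length storage: sum the code length of each character.
variable_bits = sum(len(codes[ch]) for ch in text)

print(fixed_bits, variable_bits)  # prints "80 16"
```

No code in the table is a prefix of another, which is what allows the bit stream to be decoded without separators.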
The steps involved in Huffman coding a given text source file into a destination compressed file are the following:

a. Examine the source file's contents and count the number of occurrences of each character (consider using a map).

b. Place each character and its frequency (count of occurrences) into a priority queue ordered by ascending frequency, so that the least frequent characters are removed first.

c. Convert the contents of this priority queue into a binary tree with a particular structure. Create this tree by repeatedly removing the two front elements from the priority queue (the two nodes with the lowest frequencies) and combining them into a new node that has these two nodes as its children and the sum of their frequencies as its frequency. Then insert this combined node into the priority queue. Repeat until the priority queue contains a single node, which is the root of the tree.

d. Traverse the tree to discover the binary encoding of each character. Each left branch represents a "0" in the character's encoding and each right branch represents a "1".

e. Reexamine the source file's contents, and for each character, output the encoded binary version of that character to the destination file to compress it.
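Steps a through e can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: the names `Node`, `build_tree`, `build_codes`, and `encode` are hypothetical, the input is an in-memory string rather than a file, and the output is a string of "0" and "1" characters rather than packed bytes. A counter is used as a tie-breaker in the heap, which is an implementation choice of this sketch and not part of the assignment.

```python
import heapq
from collections import Counter
from itertools import count


class Node:
    """Huffman tree node: leaves hold a character, internal nodes hold children."""

    def __init__(self, freq, char=None, left=None, right=None):
        self.freq = freq
        self.char = char
        self.left = left
        self.right = right


def build_tree(text):
    # Step a: count the occurrences of each character (a map).
    freqs = Counter(text)
    # Step b: priority queue ordered by ascending frequency.
    # The counter breaks ties so Node objects are never compared (assumption).
    tiebreak = count()
    heap = [(f, next(tiebreak), Node(f, ch)) for ch, f in freqs.items()]
    heapq.heapify(heap)
    # Step c: merge the two lowest-frequency nodes until one node remains.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = Node(f1 + f2, None, left, right)
        heapq.heappush(heap, (merged.freq, next(tiebreak), merged))
    return heap[0][2] if heap else None


def build_codes(root):
    # Step d: left branch appends "0", right branch appends "1".
    codes = {}
    if root is None:
        return codes
    if root.left is None:
        # Only one distinct character: give it a one-bit code (assumption).
        codes[root.char] = "0"
        return codes
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node.left is None:  # leaf
            codes[node.char] = prefix
        else:
            stack.append((node.left, prefix + "0"))
            stack.append((node.right, prefix + "1"))
    return codes


def encode(text, codes):
    # Step e: re-read the text and emit each character's code.
    return "".join(codes[ch] for ch in text)


if __name__ == "__main__":
    sample = "aaaabbc"  # hypothetical input
    table = build_codes(build_tree(sample))
    print(table)
    print(encode(sample, table))  # 10 bits instead of 7 * 8 = 56
```

A complete solution would also need to pack the bits into bytes when writing the destination file and store enough information (for example, the frequency table) for a decoder to rebuild the same tree.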

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd