Write a program that implements Huffman coding

Reference no: EM131404694

Write a program that implements the "Huffman coding" compression algorithm using priority queues and binary trees. Huffman coding is an algorithm devised by David A. Huffman of MIT in 1952 for compressing text data so that a file occupies fewer bytes. Text data is normally stored in a fixed format of 8 bits per character, commonly using ASCII or one of its 8-bit extensions, which map every character to an integer value from 0 to 255. The idea of Huffman coding is to abandon the rigid 8-bits-per-character requirement and use binary encodings of different lengths for different characters. The advantage is that a character that occurs frequently in the file, such as the letter "e", can be given a shorter encoding (fewer bits), making the file smaller.
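To make the size argument concrete, here is a minimal sketch that compares a fixed 8-bits-per-character encoding with a variable-length one. The sample text and the code table are hypothetical and chosen by hand for illustration; they are not produced by the Huffman algorithm itself.

```python
# Hypothetical sample text in which "e" dominates.
text = "e" * 8 + "t" * 3 + "a" * 2 + "o"

# Hypothetical hand-picked code table: the frequent "e" gets the shortest code.
# No code is a prefix of another, so a bit stream can be decoded unambiguously.
codes = {"e": "0", "t": "10", "a": "110", "o": "111"}

fixed_bits = 8 * len(text)                          # 8 bits for every character
variable_bits = sum(len(codes[ch]) for ch in text)  # code length per character

print(fixed_bits, variable_bits)
```

The shorter total comes entirely from assigning the one-bit code to the most frequent character.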
The steps involved in Huffman coding a given text source file into a destination compressed file are the following:

a. Examine the source file's contents and count the number of occurrences of each character (consider using a map).

b. Place each character and its frequency (count of occurrences) into a priority queue ordered by ascending character frequency.

c. Convert the contents of this priority queue into a binary tree with a particular structure. Create this tree by repeatedly removing the two front elements from the priority queue (the two nodes with the lowest frequencies) and combining them into a new node that has these two nodes as its children and the sum of their frequencies as its frequency. Then insert this combined node back into the priority queue. Repeat until the priority queue contains a single node, which is the root of the tree.

d. Traverse the tree to discover the binary encodings of each character. Each left branch represents a "0" in the character's encoding and each right branch represents a "1".

e. Reexamine the source file's contents, and for each character, output the encoded binary version of that character to the destination file to compress it.
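Steps a through e can be sketched roughly as follows. This is one possible outline, not a reference solution: the names `Node`, `build_tree`, `build_codes`, and `encode` are hypothetical, the input is an in-memory string rather than a file, and the output is a string of "0" and "1" characters. Packing those bits into bytes and storing the code table in the destination file are left out.

```python
import heapq
from collections import Counter
from itertools import count


class Node:
    """A tree node; leaves carry a character, internal nodes carry None."""

    def __init__(self, freq, char=None, left=None, right=None):
        self.freq = freq
        self.char = char
        self.left = left
        self.right = right


def build_tree(text):
    """Steps a-c: count characters, fill a priority queue, merge into one tree."""
    frequencies = Counter(text)        # step a: map of character -> count
    tie_breaker = count()              # unique index so tied entries never compare Nodes
    heap = [(freq, next(tie_breaker), Node(freq, char))
            for char, freq in frequencies.items()]
    heapq.heapify(heap)                # step b: min-heap ordered by frequency
    while len(heap) > 1:               # step c: merge the two lowest-frequency nodes
        freq1, _, first = heapq.heappop(heap)
        freq2, _, second = heapq.heappop(heap)
        merged = Node(freq1 + freq2, left=first, right=second)
        heapq.heappush(heap, (merged.freq, next(tie_breaker), merged))
    return heap[0][2] if heap else None


def build_codes(root):
    """Step d: walk the tree; a left branch adds '0', a right branch adds '1'."""
    codes = {}

    def walk(node, prefix):
        if node is None:
            return
        if node.char is not None:
            # Assumption: an input with one distinct character gets the code "0".
            codes[node.char] = prefix or "0"
            return
        walk(node.left, prefix + "0")
        walk(node.right, prefix + "1")

    walk(root, "")
    return codes


def encode(text):
    """Step e: replace every character with its code (as a bit string)."""
    codes = build_codes(build_tree(text))
    bits = "".join(codes[ch] for ch in text)
    return bits, codes


bits, codes = encode("abracadabra")
print(len(bits), "bits instead of", 8 * len("abracadabra"))
```

The counter used as a tie breaker is a design choice of this sketch: Python's `heapq` compares tuples element by element, so without a unique second element two entries with equal frequency would try to compare `Node` objects and fail. The exact codes depend on how ties are broken, but the total encoded length does not.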

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd