Creating a solution to achieve Huffman Coding

Assignment Help Python Programming
Reference no: EM133039633

CPSC 231 Introduction to Computer Science for Computer Science Majors - University of Calgary

Assignment: Huffman Coding

Description

In this assignment you are given a partially complete program designed to take an ASCII file and compress it. Compression is when you take information that occupies a certain quantity of space and through an algorithmic method reduce it to occupy less space by ‘encoding' it. To be useful, this process should contain a method to ‘decode' the data back from the compressed form to be read.

Compression is used across electronic media from simple text compression to image compression like jpg and video compression like mp4. In this assignment you will be complete two Python classes to finish the compression program. Instructions in the assignment will lead you through each of the classes to complete parts of each so that the program operates correctly at the end. Similar to how assignment 2 required you to complete functions to make that program complete, this assignment will have you complete objects and their internal methods (object functions) to make it work. This is a common software engineering challenge where a ‘class' must be completed according to the specifications given to you.

You will be creating a solution to achieve Huffman Coding. Huffman Coding is optimal when encoding ASCII symbols one at a time. Optimal here is defined as the resulting text is reduced from the original byte quantity to the smallest byte quantity possible. Later courses in Computer Science discuss the math that supports this, however for this assignment we will follow a simple greedy algorithm that gives us a Huffman Code without needing to know why it is correct or optimal.

You do not have to deal with command line arguments, opening files, exceptions, in this assignment. The challenges will be completing two python files each that contain object methods used by the main program. When completing these files you must implement the Huffman Tree and Encoding Table with your own code and are not allowed to import libraries to accomplish this.

Part 1 HuffmanTree: Constructor/make_trees

You have been provided with a HuffmanTree.py file. Edit this file to complete this part.
Create the constructor init ()
The constructor will take 5 parameters (of course remember the first parameter of self to make 6 total parameters):
1. the char to be stored in the tree
2. the count of how many times that char occurred in the text
3. a link to a left HuffmanTree (default None)
4. a link to a right HuffmanTree (default None)
5. a boolean bit that stores if this tree was reached with a 0 (False) or a 1 (True) (default None)

Create a class function make_trees(dictionary):
The class function will take 1 parameter. This is class function so it should not be indented inside the class (the method does not have self as a parameter). Put it at end of HuffmanTree.py file.
1. Dictionary dictionary which stores each symbol from the input file as a key char such that
dictionary[char] = count, where count is how many times that symbol occurred in the file
This method should loop through all symbols (keys) in the dictionary and create a default HuffmanTree for each symbol (left,right,bit will all be None). These HuffmanTrees should be collected in a list and returned. Note, you will need to use the constructor from earlier to create a HuffmanTree storing each char and associated count. Ten keys would produce ten Huffman Trees storing one char/count.
Once Part 1 is complete the first part of the program should operate. At this point the program doesn't do anything interesting but successfully read the input file and create an initial list of HuffmanTrees.

Part 2 HuffmanTree: Comparison/sort
To start the algorithm of Huffman Coding we need to order the list of trees we created earlier such that the trees storing a symbol which occurred the most occur first and are followed by those that occurred less. Those that occurred the least at the end. (We will select these least from end using pop() to merge to make Huffman Tree.)
To accomplish this you need to add a method to your HuffmanTree class. This method will compare one HuffmanTree to another HuffmanTree and allow you to sort a list of HuffmanTrees using list.sort().
Create a method lt (other): [also known as < operator]
The method will take 1 parameter. This is a method so it should be indented inside the class. (of course remember the first parameter of self to make 2 total parameters)
1. HuffmanTree other that is being compared to the current HuffmanTree self.
• This method should return True if the count for self is greater than that of other.

• This method should return False if the count for self is less than that of other.
• If the counts are the same, then it should return True if self's char is greater than that of other, otherwise False.
With this complete, the initial list of HuffmanTrees can be sorted. However, things are still challenging because we can't see (print) what the order is in a useful way. You'll notice the print statements in the code aren't very informative. Part 3 will fix that.

Part 3 HuffmanTree: String/Representation

To accomplish this you need to two methods to your HuffmanTree class. These methods will create a printable short form of each HuffmanTree that can be produced using str() and a detailed formed using repr().
Create a method str (): (default output)
The method will take 0 parameters. This is a method so it should be indented inside the class. (of course remember the first parameter of self to make 1 total parameters)
This method should produce a string of the form "(char,count,left char,right char,bitstring)"
Note that if left or right is None then it should use "None". For bit=True it should use "1" and bit=False it should use "0".
Use repr(char) for char because it will show the space symbol as ‘ ‘, <enter> as ‘\n', and <tab> as ‘\t'.
For example a new HuffmanTree would look like ('l',15,None,None,None)
A left child at the bottom might look like ('d',5,None,None,0)
Or a tree that is a right child in middle the would look like ('lwde',30,'l','wde',1)
The top of the tree at the end would look like ('hrolwde',50,'hro','lwde',None)
Create a method repr (): (-V output)
The method will take 0 parameters. This is a method so it should be indented inside the class. (of course remember the first parameter of self to make 1 total parameters)
This method should produce a string of the form "HuffmanTree(char,count,left,right,bit)" where each value in it like char,count,etc. is the representation form repr(char),repr(count),etc.. Note that repr(left), repr(right) is a recursive call on the left HuffmanTree and the right HuffmanTree.
For example a new HuffmanTree would look like HuffmanTree('l',15,None,None,None)
A left child at the bottom might look like HuffmanTree('d',5,None,None,False)
Or a tree that is a right child in middle the would look like HuffmanTree('lwde',30,...,...,True)
The top of the tree at the end would look like HuffmanTree('hrolwde',50,...,...,None)

... represents the complicated repr() on a left or right child that would be printed here but has been omitted for space. Check the example output files for what this would look like.
With this complete the program will now print the sorted list of HuffmanTrees. Now the program can attempt to make the final HuffmanCoding tree. The algorithm is already written in the program, but it relies on a merge operation to combine HuffmanTrees. Part 4 will create that.

Part 4 HuffmanTree: Merge/Recursive Structure
To complete part 4 you need to create another class function.
Create a class function merge(t1, t2)
The method will take 2 parameters. This is class function so it should not be indented inside the class, and the method does not have self as a parameter
1. Huffman tree t1
2. Huffman tree t2
This method should return a new HuffmanTree. This new HuffmanTree should be created using the previous constructor. This HuffmanTree will store
char = A.char + B.char count = A.count + B.count left=A
right=B bit = None
To store the encoding of bits, assign the bit 0 to A and the bit 1 to B using the bit boolean for each of left/right.
To determine which of t1/t2 is A and B we want A to be tree with a smaller count. If count's are equal, then put tree with smaller char (ASCII value) in A.
With this complete your program creates a final HuffmanTree. We have one more method to add to the HuffmanTree class. This method will let us compare the HuffmanTrees create when we are given two input files <input_filename1> <input_filename2>.

Part 5 HuffmanTree: Equality/Recursive Function
To accomplish this you need to add a method to your HuffmanTree class. This method will compare one HuffmanTree to another HuffmanTree and allow you decided if they are equal (==).
Create a method eq (other):

The method will take 1 parameter. This is a method so it should be indented inside the class. (of course remember the first parameter of self to make 2 total parameters)
1. HuffmanTree other that is being compared to the current HuffmanTree self. This method should return False if other is None.
This method should return True if the char, left, and right of self/other are all ==. (Note, we don't ignore char in eq , but we do ignore count.)
Otherwise, if should return False.

This program should now be able to run for usages that involve giving two input files.
Ex. sampletree2.dat and sampletree2-1.dat should produce "Trees are the same in structure!" Tree 1:
-1-> ('c',3)
-1->
-0-> ('b',2)
-->
-0-> ('a',4)

Tree 2:
-1-> ('c',3)
-1->
-0-> ('b',3)
-->
-0-> ('a',4)

sampletree2.dat and sampletree3.dat should produce "Trees are the different in structure!" Tree 1:
-1-> ('c',3)
-1->
-0-> ('b',2)
-->
-0-> ('a',4)

Tree 2:
-1-> ('c',6)

-->

-1-> ('b',3)
-0->
-0-> ('a',2)

We still need to make the program encode our input file into and output file. This will need us to complete a second class. This class is partially complete. It already has a encode method that will take text in text form and turn it into a binary string. However, this EncodingTable still needs to be constructed.

Part 6 Encoding Table: String
You have been provided with a EncodingTree.py file. Edit this file to complete this part. Please note until you complete Part 7, there will be nothing in the object to print.
We want to be able to visualize our encoding table in a consistent way. To accomplish this you need to add a method to your EncodingTable class. This method will create a printable short form of each EncodingTable that can be produced using str().
Create a method str ():
The method will take 0 parameters. This is a method so it should be indented inside the class. (of course remember the first parameter of self to make 1 total parameters)
This method should produce a string from the dictionary encode storing the encoding where each line in the string is an entry such as
'd':1110
'e':1111
'h':000
'l':10
'o':01
'r':001
'w':110

These entries should be sorted by character.
Use repr(char) for char because it will show the space symbol as ‘ ‘ and enter as ‘\n'.

Part 7 Encoding Table: Constructor/Recursive Method

You have been provided with a EncodingTree.py file. Edit this file to complete this part.
Examine the constructor init ()
The constructor will take 1 parameter (of course remember the first parameter of self to make 2 total parameters):
1. The HuffmanTree tree to be converted into an encode dictionary
This method is actually complete for you, it is the second recursive function recurse that needs to be completed.

Complete the method recurse(tree, code)
This method will take 2 parameters (of course remember the first parameter of self to make 3 total parameters):
1. the HuffmanTree tree to use to determine the codes
2. the current string code created so far
This is a recursive function. Which means I will define this function recursively for you.
Every recursive call: If the bit for the tree is not None, then modify the code to add a "1" if the bit was True or "0" if the bit was False.
Base Case: If you reach a HuffmanTree tree which has a None left and right sub-tree. Then store the current parameter code in the dictionary self.encode at the symbol tree.char.
Recursive Case: If you aren't at the base case, then recursively call this method on tree.left with the current code, and on tree.right with the current code.

Idea of code:

We start at top of tree with code = "". We will go down left side first, but eventually will also explore down right side. We add "1" or "0" to this code based on if we have bit==True or False stored in the sub-tree we move into.

If we reach a bottom node and can't go further. We'll have tracked the code from the top to that point. We can now store this string into our dictionary of encodings for the character store at this point at the bottom of the tree.

Once we store a code our function will fall out to the previous location up the tree, and we will explore the right side like we did with the left.

For example, we would start at the top of tree. We have a bit==None so we don't add (0/1). Then we recurse on left side with code="". We have bit==False so we add "0". Then we recurse on left side with code="0". We have bit==False so we add "0". Then we recurse on left side with code="00". We have bit==False so we add "0". Our left=right=None so we store for ‘h' the code="000" and return to the node above. Then we recurse on right side with code="00". We have bit==True so we add "1". Our left=right=None so we store for ‘r' the code="001".

Bonus:
For half of the bonus write another program called CPSC231F21A4-Name-Bonus.py that is able to read an encoded file like samplehello.dat.txt and an encoding table like samplehello.dat.tbl and will decode that file.
Ex. CPSC231F21A4-Name-Bonus.py samplehello.dat.txt samplehello.dat.tbl Will print helloworldhelloworldhelloworldhelloworldhelloworld to terminal.
For the full bonus, expand the previous program that when using a "-b" flag will read a binary file like samplehello.dat.bin and decode that file as well using the encoding table like samplehello.dat.tbl.
Ex. CPSC231F21A4-Name-Bonus.py samplehello.dat.bin samplehello.dat.tbl -b Will print helloworldhelloworldhelloworldhelloworldhelloworld to terminal.

Attachment:- Huffman Coding.rar

Reference no: EM133039633

Questions Cloud

How sn can be used as an it asset : Explain how SN can be used as an IT Asset and how the sills squired by corporate staff can be viewed as an IT Capability.
Bus adm 600 management analysis-eharmony case : Each of the three new markets eHarmony is considering (Broaden to Medium-Term, Life Stage Websites, or Geographic Expansion) are attractive, but eHarmony is arg
What is the present value of all cash flows associated : What is the present value of all cash flows associated with the maintenance costs under option 2 (rounded to the nearest hundred pesos)
Question on truss beverages : Truss beverages is the company Your Managment Team is looking for you to provide insights on: The Hemp & CBD landscape in the US.
Creating a solution to achieve Huffman Coding : Creating a solution to achieve Huffman Coding. Huffman Coding is optimal when encoding ASCII symbols one at a time. Optimal here is defined as the resulting
Calculate the five profitability ratios : Problem: The 2021 income statement of Adrian Express reports sales of $20,910,000, cost of goods sold of $12,650,000, and net income of $2,020,000.
What is the carrying amount of the accounts receivable : At the end of 2020, Cullumber Co. has an allowance for doubtful accounts of $36,000. What is the carrying amount of the accounts receivable
Determine the maximum amount of mortgage they can get : Determine the maximum amount of mortgage they can get. How much will their house cost when jack and rose but it in 5 years
Calculate the budgeted cost of direct materials purchases : Begin by preparing the direct materials budget for January and February through total direct materials needed line and then complete the budget

Reviews

len3039633

12/1/2021 10:58:31 PM

This is an assignment in Python. Please READ the assignment description CAREFULLY, and don''t use too advanced code or method in this assignment. Because this is an assignment for python beginners. There are a few Excample files and output files for this assignment, I will pass them to you later. Don''t forget to add comments in the code.

Write a Review

Python Programming Questions & Answers

  Write a python program to implement the diff command

Without using the system() function to call any bash commands, write a python program that will implement a simple version of the diff command.

  Write a program for checking a circle

Write a program for checking a circle program must either print "is a circle: YES" or "is a circle: NO", appropriately.

  Prepare a python program

Prepare a Python program which evaluates how many stuck numbers there are in a range of integers. The range will be input as two command-line arguments.

  Python atm program to enter account number

Write a simple Python ATM program. Ask user to enter their account number, and print their initail balance. (Just make one up). Ask them if they wish to make deposit or withdrawal.

  Python function to calculate two roots

Write a Python function main() to calculate two roots. You must input a,b and c from keyboard, and then print two roots. Suppose the discriminant D= b2-4ac is positive.

  Design program that asks user to enter amount in python

IN Python Design a program that asks the user to enter the amount that he or she has budget in a month. A loop should then prompt the user to enter his or her expenses for the month.

  Write python program which imports three dictionaries

Write a Python program called hours.py which imports three dictionaries, and uses the data in them to calculate how many hours each person has spent in the lab.

  Write python program to create factors of numbers

Write down a python program which takes two numbers and creates the factors of both numbers and displays the greatest common factor.

  Email spam filter

Analyze the emails and predict whether the mail is a spam or not a spam - Create a training file and copy the text of several mails and spams in to it And create a test set identical to the training set but with different examples.

  Improve the readability and structural design of the code

Improve the readability and structural design of the code by improving the function names, variables, and loops, as well as whitespace. Move functions close to related functions or blocks of code related to your organised code.

  Create a simple and responsive gui

Please use primarily PHP or Python to solve the exercise and create a simple and responsive GUI, using HTML, CSS and JavaScript.Do not use a database.

  The program is to print the time

The program is to print the time in seconds that the iterative version takes, the time in seconds that the recursive version takes, and the difference between the times.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd