Compare the efficiency of different data structures

Reference no: EM133753833

Algorithms and Data Structures

Purpose

The purpose of this assignment is for you to:
Improve your proficiency in C programming and your dexterity with dynamic memory allocation.
Demonstrate understanding of a concrete data structure (radix tree) and implement a set of algorithms.
Practice multi-file programming and improve your proficiency in using UNIX utilities.
Practice writing reports which systematically compare the efficiency of different data structures.

Background
In Assignment 1 you implemented a dictionary using a linked list. Many applications of dictionaries cannot assume that the user will query in the correct format. Take Google search as an example. Someone eager to check the latest Olympics updates may accidentally type a misspelled query.

Thankfully, there is enough context in the original query that the intended search result is recommended back to the user. In Assignment 2 you will be extending your implementation to account for misspelled keys when querying your dictionary.

Your Task

Assignment:

You will be using the same dataset as Assignment 1.
Users will be able to query the radix tree and will get either the expected key, or the closest recommended key.
You will then write a report to analyse the time and memory complexity of your Assignment 1 linked list compared to your radix tree implementation.

Implementation:
Your programs will build the dictionary by reading data from a file. They will insert each suburb into the dictionary (either the linked list (Stage 3) or radix tree (Stage 4)).
Your programs will handle the search for keys. There are three situations that your programs must handle:
Handle exact matches: output all records that match the key (Stage 3 and 4).
Handle similar matches: if there are no exact matches, find the most similar key and output its associated records (Stage 4 only).
Handle no matches being found: if neither exact nor similar matches are found, indicate that there is no match or recommendation (Stage 3 and 4).
Your programs should also be able to populate and query the dictionary with the Assignment 1 linked list approach and the radix tree approach.
Your programs should also provide enough information to compare the efficiency of the linked list with that of the radix tree, using an operation-based measure (e.g. comparison count) that ignores less important run-time factors.
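As an illustration of an operation-based measure, the following is a minimal sketch of a linked-list search that counts its own work. The struct layouts, the counter names and the functions counted_strcmp and count_matches are all hypothetical; the assignment does not prescribe them, and the real Assignment 1 record holds more fields than a key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: a real record would carry more fields. */
struct node {
    const char *key;
    struct node *next;
};

/* Illustrative counters for operation-based measures. */
struct search_stats {
    unsigned long string_comparisons; /* whole-key comparisons started */
    unsigned long char_comparisons;   /* character pairs examined */
    unsigned long node_accesses;      /* list nodes visited */
};

/* Behaves like strcmp, but counts every character pair it examines
 * (including the terminating '\0' when both strings reach it). */
static int counted_strcmp(const char *a, const char *b,
                          struct search_stats *stats)
{
    stats->string_comparisons++;
    for (;;) {
        stats->char_comparisons++;
        if (*a != *b)
            return (unsigned char)*a - (unsigned char)*b;
        if (*a == '\0')
            return 0;
        a++;
        b++;
    }
}

/* Scans the whole list and returns how many nodes have a key equal to query. */
static size_t count_matches(const struct node *head, const char *query,
                            struct search_stats *stats)
{
    size_t matches = 0;
    assert(query != NULL && stats != NULL);
    for (const struct node *cur = head; cur != NULL; cur = cur->next) {
        stats->node_accesses++;
        if (counted_strcmp(cur->key, query, stats) == 0)
            matches++;
    }
    return matches;
}
```

Because the counters depend only on the data and the query, they give the same numbers on every machine, which is what makes them usable for comparing the two structures in the report.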

An Introduction to Tries and Radix Trees
First, it is important to establish the difference between a Trie and a Tree. This is best illustrated with an example. One example of a tree is a binary search tree (BST), where each node in the tree stores an entire string, as illustrated below. The nodes are ordered and allow easy searching. When searching in the BST, we compare the query with the entire string at each node, deciding whether to switch to the left or right subtree or stop (if the subtree is empty) based on the result of the comparison.

A trie is slightly different. It has multiple names, such as retrieval tree or prefix tree. In a trie, the traversal of the tree determines the corresponding key. For the same strings as above with one letter per node, it would look like:

Tries allow for quick lookup of strings. Querying this trie with the key "hey" would find no valid path after the "e" node. Therefore, you can determine that the key "hey" does not exist in this trie.
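The lookup described above can be sketched with a one-letter-per-node trie. This is an illustration only: the node layout, the function names and the restriction to lowercase a-z are assumptions, and the keys used in the usage note ("hello", "hat", "have") are hypothetical stand-ins, since the original strings are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 26 /* assumption: keys use lowercase a-z only */

struct trie_node {
    struct trie_node *child[ALPHABET];
    bool is_key; /* true if the path from the root to here spells a stored key */
};

static struct trie_node *trie_new(void)
{
    struct trie_node *n = malloc(sizeof *n);
    if (n == NULL)
        abort(); /* sketch only: real code should report the failure */
    for (int i = 0; i < ALPHABET; i++)
        n->child[i] = NULL;
    n->is_key = false;
    return n;
}

/* Walks down one node per letter, creating nodes that do not exist yet. */
static void trie_insert(struct trie_node *root, const char *key)
{
    struct trie_node *cur = root;
    for (; *key != '\0'; key++) {
        int i = *key - 'a';
        assert(i >= 0 && i < ALPHABET);
        if (cur->child[i] == NULL)
            cur->child[i] = trie_new();
        cur = cur->child[i];
    }
    cur->is_key = true;
}

/* Follows one child per letter; fails as soon as a letter has no path. */
static bool trie_contains(const struct trie_node *root, const char *key)
{
    const struct trie_node *cur = root;
    for (; *key != '\0'; key++) {
        int i = *key - 'a';
        if (i < 0 || i >= ALPHABET || cur->child[i] == NULL)
            return false;
        cur = cur->child[i];
    }
    return cur->is_key;
}

static void trie_free(struct trie_node *n)
{
    if (n == NULL)
        return;
    for (int i = 0; i < ALPHABET; i++)
        trie_free(n->child[i]);
    free(n);
}
```

With "hello", "hat" and "have" inserted, trie_contains(root, "hey") follows "h" and "e", finds no "y" child under the "e" node, and returns false. A query for "he" also returns false: the path exists, but the node is not marked as the end of a stored key.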

A radix trie is a subtype of trie, sometimes called a compressed trie. Radix trees do not store a single letter at each node, but compress multiple letters into a single node to save space. At the character level, it would look like this:

Radix tries can again be broken down into different types depending on how many bits are used to define the branching factor. In this case, we are using a radix (r) of 2, which means every node has 2 children. This is accomplished by using 1 bit of precision, so each branch would be either a 0 or 1. This type of radix trie (with r=2) is called a Patricia trie. Another example of a radix trie with r=4 would have 4 children, determined by the binary numbers 00, 01, 10, 11. The radix is related to the binary precision by r = 2^x, where x is the number of bits used for branching.
Radix trees benefit from lower memory usage and quicker search times when compared to a regular trie.
While these visual representations are at the character level, a Patricia trie is implemented using the binary representation of each character. Each node in the trie stores the binary prefix at the current node, rather than the character prefix. For example, we can insert 3 binary numbers into a Patricia trie: 0000, 0010 and 0011.
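To see where the three numbers would share nodes, it helps to compute how many leading bits each pair has in common. The helpers below are a small sketch for that purpose; bit_at and common_prefix_bits are hypothetical names, and the values are treated as fixed-width unsigned integers rather than as a real node layout.

```c
#include <assert.h>

/* Returns bit i of a width-bit value, where i = 0 is the most significant bit. */
static unsigned bit_at(unsigned value, unsigned width, unsigned i)
{
    assert(i < width);
    return (value >> (width - 1 - i)) & 1u;
}

/* Returns the number of leading bits that two width-bit values share. */
static unsigned common_prefix_bits(unsigned a, unsigned b, unsigned width)
{
    unsigned n = 0;
    while (n < width && bit_at(a, width, n) == bit_at(b, width, n))
        n++;
    return n;
}
```

With a width of 4, 0000 and 0010 share 2 leading bits, while 0010 and 0011 share 3. All three keys therefore share the prefix 00, and 0010 and 0011 diverge only at their last bit.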

Combining the previous worded example with the binary representation gives us a Patricia trie of the form:

You should trace along each path and validate that the stored strings are the same as the example above. Each character is represented over 8 bits, and the last character is followed by an 8-bit end-of-string character, 00000000, i.e. the null terminator ('\0').
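The bit-level view of a key, including its terminator, can be sketched as follows. The function names are hypothetical, and the bit patterns mentioned in the note afterwards assume 8-bit chars and ASCII encoding.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Number of bits in a key, counting the terminator after the last character. */
static size_t key_bit_length(const char *key)
{
    assert(key != NULL);
    return (strlen(key) + 1) * CHAR_BIT;
}

/* Writes the key as '0'/'1' text, most significant bit first, terminator
 * included. The buffer out must hold key_bit_length(key) + 1 bytes. */
static void key_to_bit_string(const char *key, char *out)
{
    size_t nbytes = strlen(key) + 1; /* + 1 to include the '\0' byte */
    size_t pos = 0;
    for (size_t i = 0; i < nbytes; i++) {
        unsigned char c = (unsigned char)key[i];
        for (int b = CHAR_BIT - 1; b >= 0; b--)
            out[pos++] = ((c >> b) & 1u) ? '1' : '0';
    }
    out[pos] = '\0';
}
```

For the one-character key "a" (ASCII 0x61), key_bit_length returns 16 and key_to_bit_string produces 0110000100000000: eight bits for the letter followed by eight zero bits for the terminator.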
It is important to note that these representations only show the prefix at each node. An actual implementation will require more information within a node struct. To see this, look at the "extra insertion example" slide.

Implementation Tips

Get a particular bit from a given character.
Extract a stem (a given number of bits) from a given key.
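One possible shape for these two helpers is sketched below. The names get_bit and extract_stem, their signatures, and the choice to number bits from the most significant bit of the first character are assumptions, not the assignment's required interface.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Returns bit n of the key, where bit 0 is the most significant bit of
 * key[0]. The caller must keep n within the key's bit length. */
static int get_bit(const char *key, size_t n)
{
    assert(key != NULL);
    unsigned char byte = (unsigned char)key[n / CHAR_BIT];
    return (byte >> (CHAR_BIT - 1 - n % CHAR_BIT)) & 1;
}

/* Copies count bits, starting at bit start, into a newly allocated buffer.
 * The stem is left-aligned and any unused trailing bits are zero.
 * Returns NULL if allocation fails; the caller frees the result. */
static unsigned char *extract_stem(const char *key, size_t start, size_t count)
{
    size_t nbytes = (count + CHAR_BIT - 1) / CHAR_BIT;
    unsigned char *stem = calloc(nbytes > 0 ? nbytes : 1, 1);
    if (stem == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        if (get_bit(key, start + i))
            stem[i / CHAR_BIT] |=
                (unsigned char)(1u << (CHAR_BIT - 1 - i % CHAR_BIT));
    }
    return stem;
}
```

Assuming ASCII, the key "ab" is 01100001 01100010 in binary, so extract_stem("ab", 4, 8) yields the single byte 00010110 (0x16), and extract_stem("ab", 1, 3) yields 110 padded with zeros, i.e. 0xC0. Copying bit by bit is slow but easy to verify; a byte-wise version with shifts would be a later optimisation.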


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd