Reference no: EM132780941
Computer organization - Homework
Exercise 1: run length coding
Run-length encoding is a simple, lossless data compression algorithm. A byte string is to be translated into a sequence of byte pairs: The first byte of a pair specifies the character; the second byte the number of repetitions of this character. A series of several identical consecutive characters can thus be coded as a single byte pair to save space. Examples:
Input output
"aaaaaaaaaa" ('a', 10)
"Hello" ('H', 1)('a', 1)('l', 2) ('O', 1)
"BBBBZZABB" ('B', 4) ('Z', 2) ('A', 1)('B', 2)
"abBBBBBBBB" ('a', 1)('b', 1)('B', 8th)
Implement the function rle, which transfers an input string str_in run-length-coded to an output buffer buf_out. str_in is a zero-terminated ASCII character sequence (string). The compression result is to be stored in buf_out in the form of consecutive byte pairs
The ASCII character is to be saved in the first byte, the number of repetitions as a number (not ASCII-coded) in the second byte. You can assume that no sequences of more than 127 identical consecutive characters appear in the input, so that overflows in the repetition value do not have to be taken into account. After the input string has been processed successfully, the number of written byte pairs should be returned.
For the input "BBBBZZABB", for example, the following byte sequence should be written to the output buffer:
Byte sequence hexadecimal: 42 04 5A 02 41 01 42 02
(ASCII characters: 'B' - 'Z' - 'A' - 'B' -)
In the following you can see the C signature of the desired function and the MIPS register for parameters and return value, which result from the MIPS register convention:
intrle (char * str_in, char * buf_out);
$ v0 $ a0 $ a1
Assistance
• Implement a loop that iterates over the string. Load the individual characters using lb (or lbu).
• If a zero is read, the end of the string has been reached.
• To determine the repetition value, the same consecutive characters must be counted.
• As an intermediate step, you can note the function as pseudocode or in C / Java.
Task 2: Animal identification using a bloom filter
The bloom filter is a probabilistic data structure that can be used to quickly check whether a given element is included in a detection set. This can lead to false positive results, but not false negative results.
The basis of our bloom filter is a bit map with a length of 256 bits. Originally all bits in the row are set to 0.
To add a new value to the recognition set of the Bloom filter, the hash function hash_str is executed k times on the input value. The start values (seed) 0, 1,. . . k 1 used. The return values of the hash function are values in the range 0 to 255. These can be interpreted as a bit index in the bit sequence. After each execution of the hash function, the bit of the bit sequence corresponding to the return value as a bit index is set to 1.
In order to check within the scope of an evaluation function whether an input belongs to the recognition set, the hash function hash_str is executed k times on the input value, just like when adding. The start values (seed) 0, 1,. . . k 1 used. If the bit of the bit sequence corresponding to the return value as a bit index is set (1) after each execution of the hash function, the result is positive / true (the input is very likely to be part of the recognition set). Is at least if one of the bits is not set (0), the result is negative / false (input is definitely not part of the recognition set).
Use the given function hash_str with the following signature:
inthash_str (char * str_in, int seed);
$ v0 $ a0 $ a1
In addition, the following two functions are available for reading (bitmap_getbit) and setting (bitmap_setbit) individual bits in the bitmap:
Part 2.1: Evaluation function
In this function you should create the bloom filter evaluation function bf_evaluate. The argument str_in contains the input to be checked as a zero-terminated character string. The bitmap argument points to the bit sequence of the Bloom filter, which encodes the recognition set. The argument k specifies the number of hash function calls. With a positive recognition result (true) the function should return the value 1, with a negative recognition result (false) the value 0.
intbf_evaluate (char * str_in, char * bitmap, int k);
$ v0 $ a0 $ a1 $ a2
The default file already contains a bit sequence that encodes a series of animal names as a recognition set. The main function of the default carries out its evaluation function for a series of inputs. In our example, the inputs "cat", "dog", "mole", "mouse" and "goldfish" should produce a positive result. The inputs "tree", "lamp", "lake" and "boat" should give a negative result.
Part 2.2: Extending the recognition set
Unfortunately, the detection set does not yet include all animals: "Earthworm", for example, gives a negative result. Therefore, now implement the function bf_add, which should add an input str_in to the recognition set. The bitmap argument continues to point to the bit sequence of the bloom filter. The argument k specifies the number of hash function calls.
voidbf_add (char * str_in, char * bitmap, int k);
$ a0 $ a1 $ a2
To test the function, the default function bf_add is executed with "earthworm". Once added, the new word should be recognized as positive by bf_evaluate. "Tree", "Lamp" etc. should still be recognized as negative.
Assistance
• A loop must be used for both subtasks to execute the hash function hash_str k times. Make sure that the correct values for the function parameter seed (start value) are transferred as described above.
• To work on subtask 2.1, leave the default empty function bf_add unchanged at first. This is called by the main function of the default, but empty has no effect. For the processing of subtask 2.2 you then add the functionality of bf_add.
• After completing subtask 2.1, all animals except earthworms should be correctly identified with the given test words and no false positive results should occur. After completing subtask 2.2, all animals should be correctly identified and no false positive results should occur.
• If you add other non-animal test words to test_words (this is not required), your function should most likely give a negative result (false). Since it is a probabilistic algorithm, there is a possibility of false positives.
• In spite of Due to the probabilistic properties of the Bloom filter, we have ensured that the given test words must not give false positive results!
• Do not access the bitmap bitmap directly with load / store commands, but use the bitmap_getbit and bitmap_setbit functions.
• As an intermediate step, you can note the functions as pseudocode or in C / Java.
Attachment:- Computer organization.rar