Reference no: EM133008894
UFCFFL-15-M Parallel Computing - University of the West of England
1. Overview
This assignment assesses the following module learning outcomes:
• Critically evaluate the effectiveness of parallel computation in homogenous and heterogeneous environments;
• Develop programs for parallel systems, e.g. using OpenMP and MPI for multicores
• Understand and develop a parallel design and algorithm.
This is an individual assignment, worth 60% of the overall mark for the module. It is split into two components, a research logbook and a development project. It requires students to design and implement a selection of parallel algorithms for computing the problem described in Section 2. In addition to implementing an optimized set of parallel implementations for the problem, it is expected that a successful completion will include a benchmarking analysis of the implemented algorithm, compared against a "golden" serial version.
20% of the 60% marks obtainable from this assignment is for the companion logbook. The logbook is expected to be between 1500 - 3000 words and is a reflective account of the development process, research into different approaches to parallelising sequential algorithms, both from material taught as part of the course, but also from research material, e.g. papers.
Working on this assignment will help understand modern multi-core and many-core architectures, parallelise sequential algorithms, thinking concurrently, and apply academic research in a practical context.
Task Specification
This section introduces the details of the problem chosen for implementing parallel algorithms as part of this assessment point. The chosen problem is how to compute frequencies of itemsets in a text file. In this problem, a set of keywords is given as a text file and your program should work out the frequency for each keyword in the text file. For example, three keywords, A, B and C are given and as well as a text file containing a number of words. The program should return the results like that the keyword A occurs 10 times, B occurs 5 times and C happens 2 times.
The serial version of the program is very simple, and it reads the text file line by line until the end of the file, and it searches the frequency of each keywords sequentially and stores the counting for next line. Clearly, a brute-force search algorithm can be implemented, and it will take very long time to finish the search if the text file is very big.
This creates a good motivation to go for parallelisation of Brute-force search algorithms. The main requirement of this assignment is to design parallel Brute-force search algorithms for computing the frequencies of the keywords using both shared memory and distributed memory paradigms with OpenMP and MPI parallelisation tools, respectively.
Requirements
A serial version of the program is provided as well as the text file and the keywords. You can modify the serial program for the parallel implementation, which can also be considered as a benchmark for the performance review and comparison.
a. Problem set 1: OpenMP
Implement a parallelised version of Brute-force search, which behaves the same as the serial solution for computing the frequencies of the keywords. The implementation should use OpenMP for parallelization of the algorithm.
Your solution should be implemented using C/C++, compiled with gcc. Your final solution should include a Makefile that can be used by makers to build and test your work.
Your solution should be submitted to the blackboard.
b. Problem set 2: MPI
Implement a parallelised version of Brute-force search, which behaves the same as the serial solution for computing the frequencies of the keywords. The implementation should use MPI for parallelization of the algorithm.
Your solution should be implemented using C/C++. Open MPI is installed on own PCs as illustrated during the lab session, or PCs in the Q block labs. Your final solution should include a Makefile that can be used by makers to build and test your work.
c. Problem set 3: Benchmarking and performance analysis (20%)
For this component you are expected to perform an analysis of the serial solution and compare and contrast this against your solution problem set 1 and 2.
The analysis should be reproducible and you should include instructions on how to execute the solutions, including the serial one provided, to produce the performance results. Of course, it is not expected that during testing the same performance numbers will be produced, but they should be within a reasonable margin of error.
You are free to include this in your logbook, but it does not count against the word count, and it might be easier to produce a separate document with the presentation of the performance numbers and analysis of the results.
Deliverables
a. Each problem solution, OpenMP and MPI implementations, should be committed in its own directory in the BB submission, including a README.md documenting what the program does, how to build it, and benchmark the resulting executable(s).
b. Additionally, you are encouraged to produce a video demo to present your code, implementation and the solutions to help markers understand your program.
c. Logbook of 1500 - 3000 words. The logbook is for you to document your design and development process, along with a performance analysis of your solutions to Brute- Force search. It is not specified what shape your logbook should take, but it should be academically rigorous, including academic references supporting any claims made, as well as to provide context, and so on.
Attachment:- Parallel Computing.rar