Operating Systems Parallel Programming Assignment
Objective
This assignment aims to deepen your understanding of CUDA programming by having you explore CUDA's architecture and theoretical performance benefits without needing GPU access. You will select a real-world computational problem, propose a CUDA-based solution, analyze its theoretical performance, and reflect on your findings. The goal is to synthesize the knowledge you have gained about parallel programming frameworks and apply it to GPU programming concepts.
Assignment Overview
You will:
Select a computational problem suitable for CUDA parallelization.
Research CUDA-specific techniques for solving the problem and justify your approach.
Design a CUDA kernel for the problem, focusing on thread and block organization as well as memory optimization strategies (a minimal indexing sketch follows this list).
Theoretically evaluate the kernel's performance, including execution time, scalability, and bottlenecks.
Reflect on your work, challenges faced, and lessons learned.
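
As a starting point for the kernel-design step above, here is a minimal sketch of the indexing pattern most CUDA kernels share. The kernel name scale_kernel, the element-wise scaling operation, and the block size of 256 are illustrative assumptions, not part of the assignment; adapt them to whatever problem you choose.

#include <cuda_runtime.h>

// Minimal sketch: one thread per output element of a simple element-wise operation.
// The global index is built from the block index, the block size, and the thread index.
__global__ void scale_kernel(const float *in, float *out, float alpha, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard threads past the end of the data
        out[i] = alpha * in[i];
}

// Host-side launch: enough 256-thread blocks to cover all n elements.
void launch_scale(const float *d_in, float *d_out, float alpha, int n)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_kernel<<<blocks, threads>>>(d_in, d_out, alpha, n);
}
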
Deliverables
Your submission will consist of a detailed report with the following structured sections (titles required):
Problem Selection and Justification (20%)
What to Include:
Select a computational problem that benefits from parallelism (e.g., image convolution, matrix multiplication, scientific simulation).
Justify your selection by explaining:
Why the problem is parallelizable.
Why CUDA is a suitable framework for solving it.
Provide at least two references (see acceptable types below) supporting your problem choice and its relevance to CUDA.
Tips for Depth:
Discuss specific aspects of the problem that align with GPU parallelism, such as repetitive computations or large datasets.
Compare the potential benefits of CUDA with other frameworks (e.g., MPI, OpenMP) for the selected problem.
Kernel Design and Memory Optimization (30%)
What to Include:
Provide detailed pseudocode for your CUDA kernel.
Clearly annotate how threads and blocks are indexed.
Explain how the kernel distributes work across threads and blocks.
Propose at least two memory optimization strategies (e.g., using shared memory, minimizing global memory accesses). Justify your strategies with references to CUDA documentation or technical resources; an illustrative shared-memory sketch appears at the end of this section.
Tips for Depth:
Highlight how the kernel design maximizes GPU utilization (e.g., balancing threads, minimizing memory contention).
Discuss how the memory hierarchy (global, shared, constant) influences your design choices.
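
To make the shared-memory discussion concrete, the sketch below shows a tiled matrix-multiply kernel, one of the example problems listed in the previous section. The tile width of 16 and the assumption of square, row-major N x N matrices are illustrative choices; if you select a different problem, the same pattern (stage a tile in shared memory, synchronize, compute, synchronize) still applies.

#include <cuda_runtime.h>

#define TILE 16  // tile width; one thread per output element of the tile

// Sketch of a tiled matrix-multiply kernel (C = A * B, all N x N, row-major).
// Each block stages a TILE x TILE tile of A and B in shared memory, so each
// element of global memory is read roughly N/TILE times instead of N times.
__global__ void matmul_tiled(const float *A, const float *B, float *C, int N)
{
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;   // output row handled by this thread
    int col = blockIdx.x * TILE + threadIdx.x;   // output column handled by this thread
    float acc = 0.0f;

    for (int t = 0; t < (N + TILE - 1) / TILE; ++t) {
        // Cooperative load of one tile of A and one tile of B (zero-pad the edges).
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        As[threadIdx.y][threadIdx.x] = (row < N && aCol < N) ? A[row * N + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < N && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();                          // tile fully loaded before it is used

        for (int k = 0; k < TILE; ++k)            // partial dot product from shared memory
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();                          // finish with this tile before reloading
    }

    if (row < N && col < N)
        C[row * N + col] = acc;
}
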
Theoretical Performance Analysis (30%)
What to Include:
Estimate the execution time of your kernel on a hypothetical GPU (e.g., assume a GPU with 2048 cores and 256 KB of shared memory).
Calculate metrics such as throughput (operations per second) or speedup relative to a serial CPU implementation; a worked estimate follows this list.
Identify potential bottlenecks (e.g., warp divergence, memory bandwidth).
Analyze the scalability of your kernel for larger datasets or increased computational complexity.
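
As a worked example of the kind of estimate asked for above, the small host-side program below models the hypothetical 2048-core GPU from this section. The clock speed, operations per core per cycle, achieved-efficiency fraction, problem size, and CPU rate are all assumptions chosen only to illustrate the arithmetic; substitute and justify your own figures in the report.

#include <cstdio>

// Back-of-the-envelope estimate for the hypothetical GPU described in this section.
// Every constant below is an assumption; replace it with the figure you justify in your report.
int main()
{
    const double cores = 2048;          // core count taken from the assignment's hypothetical GPU
    const double clock_ghz = 1.5;       // assumed clock speed
    const double ops_per_cycle = 2;     // assumed fused multiply-add = 2 FLOPs per core per cycle
    const double efficiency = 0.30;     // assumed fraction of peak throughput actually achieved

    const double n = 4096;              // example problem size (N x N matrix multiply)
    const double work = 2.0 * n * n * n;                           // roughly 2*N^3 floating-point operations

    const double peak = cores * clock_ghz * 1e9 * ops_per_cycle;   // peak FLOP/s
    const double gpu_seconds = work / (peak * efficiency);

    const double cpu_gflops = 10.0;     // assumed sustained rate of a serial CPU implementation
    const double cpu_seconds = work / (cpu_gflops * 1e9);

    printf("Estimated GPU time: %.4f s\n", gpu_seconds);
    printf("Estimated CPU time: %.2f s\n", cpu_seconds);
    printf("Estimated speedup : %.0fx\n", cpu_seconds / gpu_seconds);
    return 0;
}
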
Tips for Depth:
Use references to support your performance assumptions (e.g., published benchmarks for similar tasks).
Include hypothetical scenarios to illustrate how increasing thread or block counts impacts performance.
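
One way to frame those hypothetical scenarios is as a launch-configuration sweep. The sketch below times the same toy kernel under a range of block sizes; the kernel, problem size, and block-size range are illustrative assumptions. The assignment does not require GPU access, so treat this as the structure of the experiment you are reasoning about rather than something you must run.

#include <cstdio>
#include <cuda_runtime.h>

// Toy kernel used only to illustrate a launch-configuration sweep.
__global__ void saxpy(const float *x, float *y, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 24;                        // example problem size (assumed)
    float *x, *y;                                 // contents left uninitialized; only timing matters here
    cudaMalloc((void **)&x, n * sizeof(float));
    cudaMalloc((void **)&y, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Sweep the block size; the grid size is derived so every element is covered.
    for (int threads = 64; threads <= 1024; threads *= 2) {
        int blocks = (n + threads - 1) / threads;
        cudaEventRecord(start);
        saxpy<<<blocks, threads>>>(x, y, 2.0f, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("threads/block = %4d, blocks = %7d, time = %.3f ms\n", threads, blocks, ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
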
Reflection and Lessons Learned (20%)
What to Include:
Reflect on the challenges you faced while designing the kernel or analyzing performance.
Discuss any trade-offs you made in kernel design or memory usage.
Compare your experience with insights from at least one external reference that addresses similar challenges.