Implement a memory bandwidth benchmark

Reference no: EM133105280

Assignment: Intro to HPC: implement a memory bandwidth benchmark

For this assignment, you are required to implement, in x86-64 assembly, a simplified (i.e., single-core) version of the STREAM benchmark. This benchmark measures a computer's memory bandwidth using simple array operations (kernels).

Given arrays A, B, and C of length N and a scalar q, the four STREAM kernels are:
• COPY: A[i] = B[i], for 0 ≤ i < N
• SCALE: A[i] = q · B[i], for 0 ≤ i < N
• ADD: A[i] = B[i] + C[i], for 0 ≤ i < N
• TRIAD: A[i] = B[i] + q · C[i], for 0 ≤ i < N
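
The four kernels can be written in C as a reference for checking the output of the assembly versions. This is a minimal sketch and not part of the assignment text: the function names (stream_copy, stream_scale, stream_add, stream_triad) and the parameter order are assumptions chosen for illustration. Unsigned 64-bit arithmetic in C wraps modulo 2^64, which matches the low 64 bits produced by the x86-64 add and imul instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference kernels on 64-bit unsigned integers.
 * Function names and signatures are hypothetical. */

/* COPY: a[i] = b[i] */
void stream_copy(uint64_t *a, const uint64_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i];
    }
}

/* SCALE: a[i] = q * b[i] */
void stream_scale(uint64_t *a, const uint64_t *b, uint64_t q, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = q * b[i];
    }
}

/* ADD: a[i] = b[i] + c[i] */
void stream_add(uint64_t *a, const uint64_t *b, const uint64_t *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}

/* TRIAD: a[i] = b[i] + q * c[i] */
void stream_triad(uint64_t *a, const uint64_t *b, const uint64_t *c,
                  uint64_t q, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + q * c[i];
    }
}
```

Running both the assembly kernel and the matching reference on the same small input, then comparing the output arrays element by element, is one way to catch indexing mistakes before any timing is done.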

Exercise: Your goal is to implement the four STREAM array operations in x86-64 assembly and compare their performance with the default (single-core) STREAM implementation. The input arrays should consist of 64-bit unsigned integers. Is the performance better or worse? Why?

Basic requirements:

• Implement the four array operations.

• Find N large enough that the data does not fit into the CPU cache (if the arrays are too small, the bandwidth you measure may be the cache bandwidth rather than the memory bandwidth).

• Measure the running time of each operation (for this you may use a system call).

• Run each operation 20 times and report the average running time.

• Based on the runtime, report the best achieved memory bandwidth.
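
The measurement requirements above can be sketched in C as a harness around the kernels. Everything named here is an assumption for illustration: the kernel_fn callback type, the timing struct, and the helpers time_kernel, bandwidth_gb_per_s and min_elements. The sketch assumes a POSIX system and uses clock_gettime with CLOCK_MONOTONIC; in a pure assembly solution the same clock can be read through the corresponding system call (number 228 on x86-64 Linux). The byte count follows the STREAM convention of counting every array read or written once per element, so COPY and SCALE touch two arrays while ADD and TRIAD touch three. The factor of four in min_elements mirrors the STREAM guideline that each array be at least four times the size of the last-level cache; the cache size itself has to be looked up for the machine in question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: runs one kernel once on arrays reachable via ctx. */
typedef void (*kernel_fn)(void *ctx);

/* Average and best (minimum) wall-clock time over all repetitions. */
typedef struct {
    double avg_s;
    double best_s;
} timing;

/* Current monotonic time in seconds. */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run the kernel reps times (the assignment asks for 20) and time each run. */
timing time_kernel(kernel_fn run, void *ctx, int reps) {
    assert(reps > 0);
    timing t = {0.0, 0.0};
    double total = 0.0;
    for (int r = 0; r < reps; r++) {
        double t0 = now_seconds();
        run(ctx);
        double dt = now_seconds() - t0;
        total += dt;
        if (r == 0 || dt < t.best_s) {
            t.best_s = dt;
        }
    }
    t.avg_s = total / reps;
    return t;
}

/* Bandwidth in GB/s (10^9 bytes per second).
 * arrays_touched is 2 for COPY and SCALE, 3 for ADD and TRIAD. */
double bandwidth_gb_per_s(int arrays_touched, size_t n, double seconds) {
    assert(seconds > 0.0);
    double bytes = (double)arrays_touched * (double)n * (double)sizeof(uint64_t);
    return bytes / seconds / 1e9;
}

/* Smallest element count so that one array is 4x the last-level cache. */
size_t min_elements(size_t llc_bytes) {
    return 4 * llc_bytes / sizeof(uint64_t);
}
```

The best achieved bandwidth comes from the minimum time, bandwidth_gb_per_s(arrays, N, t.best_s), while t.avg_s is the average running time to report. For example, a COPY over 1,000,000 elements that takes 1 ms moves 16,000,000 bytes, which is about 16 GB/s.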

Extra requirements:

• Vectorize the previously designed code using Intel AVX, or your SIMD instruction set of choice.

• Compare its achieved performance with the basic version. Does it perform better? Why?
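
As a starting point for the vectorized version, the ADD kernel can be sketched with compiler intrinsics in C before it is translated to hand-written assembly. This is an illustration under stated assumptions, not the required solution: it assumes GCC or Clang on x86-64, and the names add_scalar, add_avx2 and stream_add_simd are hypothetical. The 64-bit integer addition used here (_mm256_add_epi64, the vpaddq instruction) belongs to AVX2 rather than the original AVX, so the sketch checks for AVX2 at run time and falls back to the scalar loop otherwise. AVX2 has no packed 64-bit integer multiply, which makes SCALE and TRIAD on 64-bit integers harder to vectorize than COPY and ADD.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar fallback; also handles machines without AVX2. */
static void add_scalar(uint64_t *a, const uint64_t *b, const uint64_t *c,
                       size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* AVX2 path: four 64-bit elements per 256-bit register.
 * The target attribute enables AVX2 for this function only. */
__attribute__((target("avx2")))
static void add_avx2(uint64_t *a, const uint64_t *b, const uint64_t *c,
                     size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads and stores, so no alignment is assumed. */
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        __m256i vc = _mm256_loadu_si256((const __m256i *)(c + i));
        _mm256_storeu_si256((__m256i *)(a + i), _mm256_add_epi64(vb, vc));
    }
    /* Remaining 0 to 3 elements. */
    for (; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}
#endif

/* ADD: a[i] = b[i] + c[i], using AVX2 when the CPU supports it. */
void stream_add_simd(uint64_t *a, const uint64_t *b, const uint64_t *c,
                     size_t n) {
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2")) {
        add_avx2(a, b, c, n);
        return;
    }
#endif
    add_scalar(a, b, c, n);
}
```

Because these kernels are limited by memory traffic rather than arithmetic, the vectorized version may show little or no gain once the arrays exceed the cache; the comparison asked for above is the place to check that on the actual machine.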

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd