We will compare the performance of a vector processor, Basic Computer Science

In this problem we will compare the performance of a vector processor with a system
that contains a scalar processor and a GPU-based coprocessor. In the hybrid system,
the host processor has superior scalar performance to the GPU, so in this case all scalar
code is executed on the host processor while all vector code is executed on the GPU.
We will refer to the first system as the vector computer and the second system as the
hybrid computer.
Assume your target application contains a kernel with an arithmetic intensity of 0.5
FLOPs per DRAM byte accessed. However, the application also has a scalar component
which must be performed before and after the kernel in order to prepare the input
vectors and output vectors, respectively.
For a sample dataset, the scalar portion of the code requires 400 ms of execution time
on both the vector processor and the host processor in the hybrid system. The kernel
reads input vectors consisting of 200 MB and has output data consisting of 100 MB.
The vector processor has a peak memory bandwidth of 30 GB/s and the GPU has a
peak memory bandwidth of 150 GB/s. The hybrid system has an additional overhead
that requires all input vectors to be transferred between the host memory and GPU
local memory before and after the kernel is invoked. The hybrid system has a DMA
bandwidth of 10 GB/s and an average latency of 10 ms.
Assume that the vector processor and the GPU are both performance bound by memory
bandwidth. Compute the execution time for both computers for this application.
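One way to work through the arithmetic is sketched below. It is not an official solution and rests on several assumptions that the problem statement does not spell out: the kernel's DRAM traffic equals the 200 MB of input plus the 100 MB of output; 1 GB is taken as 1000 MB; the DMA latency of 10 ms is paid once per transfer; and the hybrid system makes two transfers (inputs to the GPU before the kernel, outputs back to the host after it). Because both machines are stated to be memory-bandwidth bound, the arithmetic intensity of 0.5 FLOPs per byte does not enter the timing. All function and constant names are illustrative.

```python
# Sizes in MB, bandwidths in GB/s, times in ms.
# Assumption: 1 GB = 1000 MB, so MB / (GB/s) yields milliseconds directly.

SCALAR_MS = 400.0        # scalar portion, same on both systems
INPUT_MB = 200.0         # kernel input vectors
OUTPUT_MB = 100.0        # kernel output data
VECTOR_BW_GBPS = 30.0    # vector processor peak memory bandwidth
GPU_BW_GBPS = 150.0      # GPU peak memory bandwidth
DMA_BW_GBPS = 10.0       # host <-> GPU DMA bandwidth
DMA_LATENCY_MS = 10.0    # assumed to be paid once per DMA transfer


def transfer_ms(size_mb, bandwidth_gbps, latency_ms=0.0):
    """Time to move size_mb at bandwidth_gbps, plus a fixed latency."""
    return latency_ms + size_mb / bandwidth_gbps


def vector_time_ms():
    """Scalar code plus a memory-bound kernel on the vector processor."""
    kernel = transfer_ms(INPUT_MB + OUTPUT_MB, VECTOR_BW_GBPS)
    return SCALAR_MS + kernel


def hybrid_time_ms():
    """Scalar code on the host, kernel on the GPU, plus two DMA copies."""
    kernel = transfer_ms(INPUT_MB + OUTPUT_MB, GPU_BW_GBPS)
    copy_in = transfer_ms(INPUT_MB, DMA_BW_GBPS, DMA_LATENCY_MS)
    copy_out = transfer_ms(OUTPUT_MB, DMA_BW_GBPS, DMA_LATENCY_MS)
    return SCALAR_MS + copy_in + kernel + copy_out


if __name__ == "__main__":
    print(f"vector computer: {vector_time_ms():.0f} ms")  # 410 ms
    print(f"hybrid computer: {hybrid_time_ms():.0f} ms")  # 452 ms
```

Under these assumptions the vector computer takes 400 + 10 = 410 ms, while the hybrid computer takes 400 + 30 + 2 + 20 = 452 ms: the GPU kernel is five times faster, but the DMA transfers cost more than the kernel time saved.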


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd