Estimate the total number of SIMD instruction cycles needed

Reference no: EM131937116

Problem

Devise a minimum-time algorithm to multiply two 64 x 64 matrices, A = (a_ij) and B = (b_ij), on an SIMD machine consisting of 64 PEs with local memory. The 64 PEs are interconnected by a 2D 8 x 8 torus with bidirectional links.

(a) Show the initial distribution of the input matrix elements (a_ij) and (b_ij) on the PE memories.
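One possible layout for part (a), sketched here under assumptions the problem does not state: each PE holds one 8 x 8 block of A and one 8 x 8 block of B, pre-skewed as in a Cannon-style block algorithm so that no alignment shifts are needed before the first multiply. The function names `owner_of_a` and `owner_of_b` are hypothetical, and PE(r, c) denotes the PE in torus row r and column c.

```python
P = 8          # PEs per torus dimension (8 x 8 torus, from the problem)
S = 64 // P    # side of the square block each PE holds (8)

def owner_of_a(i, j, p=P, s=S):
    """Return (r, c) of the PE that initially stores a_ij.

    Assumed pre-skewed layout: PE(r, c) holds A block (r, (r + c) mod p).
    """
    r = i // s
    c = (j // s - r) % p
    return r, c

def owner_of_b(i, j, p=P, s=S):
    """Return (r, c) of the PE that initially stores b_ij.

    Assumed pre-skewed layout: PE(r, c) holds B block ((r + c) mod p, c).
    """
    c = j // s
    r = (i // s - c) % p
    return r, c
```

Under this assumed layout every PE starts with exactly 64 elements of A and 64 elements of B, and no element is stored twice.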

(b) Specify the SIMD instructions needed to carry out the matrix multiplication. Assume that each PE can perform one multiply, one add, or one shift (shifting data to one of its four neighbors) operation per cycle. You should first compute all the multiply and add operations on local data before starting to route data to neighboring PEs. The SIMD shift operations can be either east, west, south, or north with wraparound connections on the torus.
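As a rough illustration of the compute-then-route pattern that part (b) describes, the sketch below simulates a Cannon-style block algorithm on a p x p torus in plain Python. This is one candidate approach, not necessarily the intended minimum-time answer. The function name `cannon_multiply`, the pre-skewed starting layout, and the choice to shift A west and B north are all assumptions made for illustration.

```python
def cannon_multiply(a, b, p):
    """Multiply n x n matrices a and b on a simulated p x p torus of PEs."""
    n = len(a)
    s = n // p  # side of the block held by each PE

    def block(m, bi, bj):
        return [row[bj * s:(bj + 1) * s] for row in m[bi * s:(bi + 1) * s]]

    # Assumed pre-skewed layout: PE(r, c) starts with A block (r, r + c)
    # and B block (r + c, c), with block indices taken mod p.
    a_loc = [[block(a, r, (r + c) % p) for c in range(p)] for r in range(p)]
    b_loc = [[block(b, (r + c) % p, c) for c in range(p)] for r in range(p)]
    c_loc = [[[[0] * s for _ in range(s)] for _ in range(p)] for _ in range(p)]

    for step in range(p):
        # Local phase: every PE multiplies its current blocks and accumulates.
        for r in range(p):
            for c in range(p):
                x, y, z = a_loc[r][c], b_loc[r][c], c_loc[r][c]
                for i in range(s):
                    for j in range(s):
                        z[i][j] += sum(x[i][k] * y[k][j] for k in range(s))
        if step < p - 1:
            # Routing phase: A blocks move one PE west and B blocks one PE
            # north, both using the torus wraparound links.
            a_loc = [[a_loc[r][(c + 1) % p] for c in range(p)] for r in range(p)]
            b_loc = [[b_loc[(r + 1) % p][c] for c in range(p)] for r in range(p)]

    # PE(r, c) ends up holding C block (r, c), with no duplication.
    result = [[0] * n for _ in range(n)]
    for r in range(p):
        for c in range(p):
            for i in range(s):
                for j in range(s):
                    result[r * s + i][c * s + j] = c_loc[r][c][i][j]
    return result
```

For the machine in the problem, the call would be `cannon_multiply(a, b, 8)` with 64 x 64 inputs. In the simulation each routing phase moves whole blocks at once, whereas on the real machine each element shift would cost one cycle.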

(c) Estimate the total number of SIMD instruction cycles needed to compute the matrix multiplication. The time includes all arithmetic and data-routing operations. The final product elements C = A x B = (c_ij) end up in various PE memories without duplication.
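A back-of-the-envelope count for part (c) can be written as a small function. It assumes a Cannon-style block algorithm with a pre-skewed initial layout (so no alignment shifts), one element moved per shift cycle, and no routing after the final multiply round. The name `estimate_cycles` and these assumptions are illustrative only; a different algorithm would give a different total.

```python
def estimate_cycles(n=64, p=8):
    """Estimate SIMD cycles for an n x n product on a p x p torus.

    Assumes one multiply, one add, or one single-element shift per cycle,
    and a block algorithm with p compute rounds and p - 1 routing rounds.
    """
    s = n // p                       # block side held by each PE
    multiplies = p * s ** 3          # p block products, s^3 multiplies each
    adds = s * s * (n - 1)           # n - 1 adds for each of s^2 results per PE
    shifts = (p - 1) * 2 * s * s     # each routing round moves one A and one B block
    return {
        "multiplies": multiplies,
        "adds": adds,
        "shifts": shifts,
        "total": multiplies + adds + shifts,
    }
```

With the default arguments this gives 4096 multiply cycles, 4032 add cycles, and 896 shift cycles, for a total of 9024 cycles under the stated assumptions.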

(d) Suppose data duplication is allowed initially by loading the same data element into multiple PE memories. Devise a new algorithm to further reduce the SIMD execution time. The initial data duplication time, using either data broadcast instructions or data routing (shifting) instructions, must be counted. Again, each result element c_ij ends up in only one PE memory.




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd