Reference no: EM133788651
Operating Systems Multithreaded Programming Assignment
This assignment will test your understanding of parallel programming using MPI and OpenMP by implementing two computationally intensive tasks. Each question combines practical programming with profiling, analysis, and reporting. Follow the requirements carefully and ensure that your submission meets the outlined criteria.
Requirements
This is an open-book, take-home assignment. You may use any resources, including textbooks, and online references, but all code must be your own.
Programs must be compatible with the Rushmore virtual cluster environment.
Submit a well-organized report in PDF format, including code, profiling data, answers to analysis questions, and any required tables or screenshots.
Question 1: Prime Cluster Search Using MPI
Problem Description
A prime cluster is a set of three consecutive prime numbers that appear within a short interval. For example, {101, 103, 107} and {149, 151, 157} are clusters of three consecutive primes. Write a parallel program using Open MPI to search for and analyze prime clusters in a large range of integers.
Program Requirements
Range Specification
Your program should find all prime clusters of size three within the range [2, 1000000]. The range should be divided among processes so each one works on a separate sub-range.
Parallelization
Use MPI to parallelize the program, ensuring each process independently searches for primes in its assigned sub-range. Carefully handle boundary conditions to ensure accurate detection of clusters that may span across sub-range boundaries.
Expected Console Output
At the start, print the range being processed and the number of MPI processes used, along
with each process's assigned sub-range.
Example:
Searching for prime clusters in range [2, 1000000] using 4 processes.
Process 0 handling range [2, 250000]
...
...
Total prime clusters found: 548 Smallest-sum cluster: {101, 103, 107}
Timing Measurement
Use a manual timing function, such as clock() in C/C++, to measure the execution time of your program and include the results in your report.
Analysis Questions
Explain your approach to parallelizing the prime cluster search. Discuss how you handled data distribution and boundary conditions. Describe any observed performance considerations. Did you encounter any load imbalance? If so, explain potential causes and solutions.
Implementation Notes
Use the Sieve of Eratosthenes (or another efficient algorithm) to generate primes within each sub-range. Minimize inter-process communication to maximize efficiency.
Point Breakdown
Accurate identification of prime clusters and correct output formatting. 15 points Effective distribution of work and correct boundary handling. 10 points
Clear explanations of parallelization, data handling, and performance observations. 5 points
Question 2: Merge Sort with OpenMP
Problem Description
Write a program to implement and parallelize the merge sort algorithm for sorting a large array of integers. Start with a sequential version, then use OpenMP to parallelize it, and finally profile and analyze the performance of both versions.
Program Requirements
Sequential Merge Sort
Implement a sequential version of merge sort for sorting an array of integers (at least 1,000,000 elements). This will serve as your baseline implementation.
Parallel Merge Sort Using OpenMP
Use OpenMP to parallelize the merge sort algorithm. Due to merge sort's recursive nature, determine the best way to apply parallelism. Consider parallelizing the merging process or dividing tasks among threads at specific recursion depths. Ensure the parallelized version produces the same sorted output as the sequential version.
Profiling and Performance Measurement
Use omp_get_wtime() or an equivalent timing function to measure running time for both the sequential and parallel versions. Run each version with different OpenMP thread counts (e.g., 1, 2, 4, 8) and record results in the following table format:
Threads Sequential Time (s) OpenMP Time (s) Speedup
1
2
4
8
Analysis Questions
Discuss challenges faced when parallelizing merge sort with OpenMP, focusing on the recursive nature of the algorithm. Explain the observed performance changes as thread count increased. Did parallelism yield expected improvements? Discuss any deviations.
Point Breakdown
Correct implementation of a working merge sort.
Effective parallelization strategy, maintaining correct output.
Complete and accurate timing and speedup data.
Thoughtful answers discussing parallelization challenges and observed performance.
Report Structure and Requirements
Submit a report (PDF format) that includes the following sections, formatted consistently for clarity:
Introduction
Provide an overview of each problem, including your approach and any key design decisions.
Program Description
For Question 1, explain your algorithm for prime detection, parallelization approach, data distribution strategy, and boundary handling. For Questions 2, describe the sequential implementation, then detail your parallelization strategy for OpenMP, including which sections were parallelized and why.
Profiling Summary
Summarize timing data from manual timing functions for both MPI and OpenMP programs. Include tables or graphs to visualize timing and speedup results for the OpenMP section.
Performance Analysis
For the MPI program, include observations on data distribution efficiency and any load imbalance noted during execution. For the OpenMP program, discuss observed speedup as thread count increased and the impact of recursive parallelization.
Screenshots
Include screenshots for each command used to run your programs and display outputs. Ensure output screenshots confirm correct final results.
Reflection
Reflect on the challenges and insights gained from implementing and profiling parallel algorithms using MPI and OpenMP. Discuss potential (future) improvements for both programs.
Readme
Include a README section with any special instructions for compiling and running the programs.
File Requirements
Code Files
Submit your source code files, clearly named, e.g., prime_clusters_mpi.c, merge_sort_sequential.c, and merge_sort_openmp.c. Ensure each file is well-documented, with comments explaining key functions and logic.