Performance of two distinct sorting algorithms

Reference no: EM13915090

This lab assignment requires you to compare the performance of two distinct sorting algorithms, so that you gain some appreciation for the parameters to consider when selecting an appropriate sort. Write a Heap Sort and a Shell Sort. They should both be recursive or both be iterative, so that the overhead of recursion will not be a factor in your comparisons. In this case, iteration is recommended. Be sure to justify your choice. Also, consider how your code would have differed if you had made the other choice.

The strategy behind a Shell Sort is to create a more nearly optimal environment for a simple, relatively inefficient sort technique, namely Simple Insertion Sort. This environment, in which the data are already nearly in order, allows the simple strategy to be efficient. Use increments of 1, 4, 13, 40, 121, 364 and 1093; then increments of 1, 5, 17, 53, 149, 373 and 1123; then increments of 1, 10, 30, 60, 120, 360 and 1080; and then one more set of increments of your choice. Please note that the increment sets will need to be supplemented if you use larger data sets. You will have four different Shell Sorts to run.
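As a rough illustration of the idea, the following Python sketch runs a gapped insertion sort once per increment, from the largest increment down to 1. The function name `shell_sort` and the constant `INCREMENTS_A` are hypothetical names chosen for this example, and the sketch assumes every increment set contains 1 so that the final pass is a plain insertion sort.

```python
def shell_sort(values, increments):
    """Sort `values` in place with one gapped insertion pass per increment.

    Assumes `increments` contains 1, so the last pass is a plain
    insertion sort over nearly ordered data.
    """
    n = len(values)
    # Work from the largest increment down to 1, skipping any that
    # are too large for this data set.
    for gap in sorted(increments, reverse=True):
        if gap >= n:
            continue
        for i in range(gap, n):
            item = values[i]
            j = i
            # Shift larger elements of this gap-separated chain to the right.
            while j >= gap and values[j - gap] > item:
                values[j] = values[j - gap]
                j -= gap
            values[j] = item
    return values


# First increment set from the assignment (hypothetical constant name).
INCREMENTS_A = [1, 4, 13, 40, 121, 364, 1093]

data = [23, 5, 17, 42, 8, 15, 4, 16]
print(shell_sort(data, INCREMENTS_A))  # [4, 5, 8, 15, 16, 17, 23, 42]
```

Passing the increments as a parameter means the same function can serve all four Shell Sort variants, so any timing differences come from the increments rather than from differences in the code.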

Heap Sort is a practical sort to know and is based on the concept of a heap. It has two phases: build the heap, then extract the elements from it in sorted order. Altogether, counting the four Shell Sorts, you will have five sorts.
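The two phases can be sketched iteratively as follows. This is a minimal in-place version that assumes an array-based max-heap in which the children of index i sit at 2i + 1 and 2i + 2; the names `heap_sort` and `sift_down` are illustrative only.

```python
def heap_sort(values):
    """Sort `values` in place using an array-based max-heap (no recursion)."""
    n = len(values)

    def sift_down(root, end):
        # Move values[root] down until the heap property holds in values[:end].
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            # Pick the larger of the two children.
            if child + 1 < end and values[child + 1] > values[child]:
                child += 1
            if values[root] >= values[child]:
                return
            values[root], values[child] = values[child], values[root]
            root = child

    # Phase 1: build the heap, starting from the last parent node.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n)

    # Phase 2: repeatedly move the maximum to the end and restore the heap.
    for end in range(n - 1, 0, -1):
        values[0], values[end] = values[end], values[0]
        sift_down(0, end)
    return values


print(heap_sort([23, 5, 17, 42, 8, 15, 4, 16]))  # [4, 5, 8, 15, 16, 17, 23, 42]
```

Because the heap lives inside the array being sorted, this version needs only constant extra space, which is relevant to the space-efficiency part of the analysis.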

Create input files of five sizes: 50, 500, 1000, 2000, and 5000 integers. For each file size, make three versions. On the first, use a randomly ordered data set. On the second, use the integers in reverse order. On the third, use the integers in normal ascending order. (You may use a random number generator or shuffle function to create the randomly ordered file. It is important to avoid too many duplicates. Keep them to about 1%.) This means you have an input set of 15 files plus whatever you deem necessary and reasonable. Files are available on the course web page if you want to copy them. Your data should be formatted so that each number is on a separate line with no leading blanks. There should be no blank lines in the file.
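One possible way to generate the three orderings for a given size is sketched below. The function names, the `duplicate_fraction` parameter and the fixed seed are assumptions made for this example, not part of the assignment; the duplicate rate is controlled by repeating a small sample of the distinct values.

```python
import random


def make_data_sets(size, duplicate_fraction=0.01, seed=0):
    """Return random, reverse and ascending versions of one data set.

    About `duplicate_fraction` of the values are repeats of other values.
    """
    rng = random.Random(seed)
    n_dupes = int(size * duplicate_fraction)
    distinct = list(range(1, size - n_dupes + 1))
    # Repeat a few of the distinct values to reach the requested size.
    values = distinct + rng.sample(distinct, n_dupes)
    ascending = sorted(values)
    shuffled = list(values)
    rng.shuffle(shuffled)
    return {
        "random": shuffled,
        "reverse": ascending[::-1],
        "ascending": ascending,
    }


def write_file(path, values):
    """Write one integer per line, with no leading blanks or blank lines."""
    with open(path, "w") as f:
        f.write("\n".join(str(v) for v in values) + "\n")
```

Calling `make_data_sets` once for each of the five sizes and writing each of the three returned lists with `write_file` yields the 15 input files. With the default 1% rate, the size 50 set contains no duplicates because 1% of 50 rounds down to zero.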

Each sort must be run against all the input files. This will give you at least 75 runs. For grading purposes, for each sort, generate output only from the files of size 50. You will have 15 sets of output to turn in for the size 50 files. Your code needs to print out the sorted values and the times for each of the Shell Sorts and the Heap Sort for each of the three orders for size 50.

Your program should access the system clock to get some time values for the different runs. The calls to the clock should be placed as close as possible to the beginning and the end of each sort. If other code is included, it may have a large fixed cost, which would tend to drown out the differences between the runs, if any. Why take a chance? If you get too many zero time values or any negative time values, then you must fix the problem. One way to do this is to use larger files than those specified. Another solution is to perform the sorting in a loop, N times, and calculate an average value. You would need to be careful to start over with unsorted data each time through the loop.
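A minimal sketch of the averaging approach, assuming Python 3.7 or later for `time.perf_counter_ns`. The helper name `time_sort` and the default of 100 repeats are hypothetical. The copy of the input is made outside the timed region, so only the sort itself is measured, and a monotonic clock avoids negative readings.

```python
import time


def time_sort(sort_fn, data, repeats=100):
    """Return the average time in nanoseconds for `sort_fn` to sort `data`."""
    total_ns = 0
    for _ in range(repeats):
        working = list(data)  # fresh unsorted copy, made outside the timed region
        start = time.perf_counter_ns()
        sort_fn(working)
        total_ns += time.perf_counter_ns() - start
    return total_ns / repeats


# Usage with the built-in list sort standing in for one of the five sorts.
average_ns = time_sort(list.sort, [5, 3, 9, 1, 7], repeats=1000)
print(f"average: {average_ns:.0f} ns")
```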

Turn in an analysis comparing the two sorts and their performance. Be sure to comment on the relative runtimes of the various runs, the effect of the order of the data, the effect of different file sizes, and the effect of different increment sizes for the Shell Sort. Which factor has the most effect on the efficiency? Be sure to consider both time and space efficiency. Be sure to justify your data structures. As time permits, consider implementing a Straight Insertion Sort to compare with Shell Sort. Also, consider files of size 10,000 or additional random files, perhaps with 15-20% duplicates. Your write-up must include a table of the times obtained.
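For the optional comparison, a Straight Insertion Sort can be sketched as below. The shift counter is an addition made for this example, not something the assignment asks for; it gives a clock-independent way to see how strongly the initial order of the data affects the work done.

```python
def insertion_sort(values):
    """Sort `values` in place and return the number of element shifts."""
    shifts = 0
    for i in range(1, len(values)):
        item = values[i]
        j = i
        # Shift larger elements one position right to make room for `item`.
        while j > 0 and values[j - 1] > item:
            values[j] = values[j - 1]
            j -= 1
            shifts += 1
        values[j] = item
    return shifts


print(insertion_sort(list(range(1, 51))))     # ascending input: 0 shifts
print(insertion_sort(list(range(50, 0, -1)))) # reverse input: 1225 shifts
```

For n distinct values in reverse order the count is n(n - 1)/2, which is the quadratic worst case that the larger Shell Sort increments are meant to reduce.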

In developing this assignment, please keep in mind that you will be turning in your source code to be run against my input. This is in addition to the runs you will need to make for analysis purposes. It needs to print out the sorted values. For grading purposes, it does not need to print the times, but the times should be printed in the sample runs you turn in.

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd