Determine how long the match goes on

Assignment Help C/C++ Programming
Reference no: EM131102994

Project

Delta Force

Introduction

Small­Mart, one of the world's largest retailers, wants to hire you to help them manage their inventory. A centralized file contains information about every item Small­Mart sells. This file is updated each day. All around the world, there are devices that need to receive a copy of the file each day: computers and cash registers in the stores, vending machines, handheld devices, etc., so each day, Small­Mart sends a copy of the inventory file to all these devices. There are 10 million such devices, and the inventory file is about 100 kilobytes long, so about 1 terabyte (i.e., 1000 gigabytes = 100KB * 10 million) is sent each day. Given that their internet provider charges Small­Mart roughly $1 per gigabyte, it costs about $1,000 per day to provide up­to­date inventory information to the devices. In addition to the inventory files, Small­Mart occasionally sends updates of other files to the devices as well (for example, new versions of applications as they are updated).

Marty Small, one of Small­Mart's smartest managers (and no relation to the company founder), recently had an idea. He realized that the file that Small­Mart sends each day changes very little from the file it sent the day before. For example, consider these two sample inventory files sent by Small­Mart to the devices on April 10 and April 11. Each file contains multiple records; each record has an item number, an item name, and the number of those items in the inventory:

April 10 version of the inventory file:

81609,Feather Duster,198,92246,Lawn Chair Set,50,03854,Carrano C++ book,183,27408,Monsters, Inc. DVD,89
April 11 version of the inventory file: 66284,Screwdriver,1000,81609,Feather Duster,195,92246,Lawn Chair
Set,50,03490,Bedspread,87,27408,Monsters, Inc. DVD,89,40411,Hair Spray,380

As you can see, both the earlier and later version of the inventory file have many similarities. There are some new additions in the April 11 version of the file, such as the entry

03490,Bedspread,87

There are also some changes to existing records between the two versions: 81609,Feather Duster,198
changes to

81609,Feather Duster,195

And some of the entries in the original file have been removed (e.g., it looks like Small­Mart sold all of the Carrano C++ books in one day, since there is no entry for it in the second file).

Given the significant similarities between successive versions of the inventory file, Marty realized he could save Small­Mart hundreds of dollars every day. How? Marty realized that since a given day's inventory file is so similar to the previous day's inventory file, that it doesn't make sense to ship the entire 100K inventory file to each salesperson every day. Instead, Small­Mart could just send a smaller file containing the list of changes (called a delta file) to each device to indicate what has changed from yesterday's inventory file to today's file. Each device could then run an incremental update program to use this delta file to convert their existing copy of the inventory file computer to the new version of the inventory file.

Let's number each character in both of inventory files and then see an example. April 10 version of the inventory file:
1 2 3 4
01234567890123456789012345678901234567890123456789
0 81609,Feather Duster,198,92246,Lawn Chair Set,50,0
50 3854,Carrano C++ book,183,27408,Monsters, Inc. DVD 100 ,89

April 11 version of the inventory file:

1 2 3 4
01234567890123456789012345678901234567890123456789
0 66284,Screwdriver,1000,81609,Feather Duster,195,92
50 246,Lawn Chair Set,50,03490,Bedspread,87,27408,Mon
100 sters, Inc. DVD,89,40411,Hair Spray,380

Given these two versions of the inventory file, let's see what a delta file might look like. Remember, the delta file contains a set of instructions on how to convert a first version of the file (e.g., the April 10 version of the file) into a later version of the file (e.g., the April 11 version). This delta file would normally be created at Small­Mart headquarters and then sent to each of the 10 million devices, which already have the full April 10 version of the inventory file:

1. Create a new, empty output file
2. Add 23 characters to the output file: 66284,Screwdriver,1000,
3. Copy 23 characters from offset 0 in the source file to the output file.
4. Add 1 character to the output file: 5
5. Copy 27 characters from offset 24 in the source file to the output file.
6. Add 16 characters to the output file: 490,Bedspread,87
7. Copy 28 characters from offset 75 in the source file to the output file.
8. Add 21 characters to the output file: ,40411,Hair Spray,380

Given this set of instructions, an incremental update program on a device could follow the above set of instructions and convert their April 10 version of the file (which is already on their computer) into the April 11 version of the file. How would this work?

After following the first instruction, the incremental update program would create an empty output file. After following the second instruction, that output file would contain
66284,Screwdriver,1000,

After following the third instruction, the output file would contain

66284,Screwdriver,1000,81609,Feather Duster,19

After following the fourth instruction, the output file would contain

66284,Screwdriver,1000,81609,Feather Duster,195

After following the fifth instruction, the output file would contain 66284,Screwdriver,1000,81609,Feather Duster,195,92246,Lawn Chair Set,50,03 And so on.
To perform this conversion, only two different commands are required: a Copy command and an Add command. The Copy command specifies that a specific number of characters should be copied from a particular offset in the first version file to the end of the output file. The Add command is used to add entirely new content to the output file when this content cannot be located and copied from the original (earlier version of the) file.

Of course, the eight delta instructions shown above are in a human­readable format that is quite wordy. We can make our delta file much smaller by removing all of the human­readable text and using a more compact encoding. Shown below is a 93­character delta file containing all of the instructions above; note that this delta file is 33% smaller than the full April 11 version of the file:

A23:66284,Screwdriver,1000,C23,0A1:5C27,24A16:490,Bedspread,87C28,75A21:,40411,Hair Spray,380

In the example above, each add command is represented by an A, followed by the number of characters to add, followed by a colon character and the actual characters to append. Each copy command is represented by a C followed by the number of characters to copy, a comma, and the offset in the original file from which to start copying the characters. It is possible to convert any file to any other file using just these two types of commands.

So to recap, each day, in its corporate office, Small­Mart can create the latest version of the full inventory file (e.g., the April 11 version). Then, Small­Mart can use a special tool called a delta creator to create a delta file (made up of Add and Copy commands) that contains the instructions for converting yesterday's version (the April 10 version) of the file to today's version (the April 11 version) of the file. Then, Small­ Mart can send this delta file to the 10 million devices instead of sending the full (much larger) April 11 inventory file.

Upon receiving the delta file, each of the 10 million devices then runs a program called an incremental updater that takes yesterday's full April 10 inventory file (which is already on the device, produced yesterday) and the just­received delta file, and follows the instructions in the delta file to create today's April 11 full inventory file. On April 12, the device will receive a delta file that it will use along with this full April 11 file to produce a full April 12 inventory file. Since the delta files are just a fraction of the full inventory files' sizes, this saves Small­Mart considerable networking costs.

Of course, this delta approach can be used to update all types of files, not just inventory files. For instance, consider the following two files A and A', where A' is a derived version of A.

1 2 3
01234567890123456789012345678901234
A : ABCDEFGHIJBLAHPQRSTUVPQRSTUV
A': XYABCDEFGHIJBLETCHPQRSTUVPQRSTQQELF

Here's a delta file to convert A into A'. Verify for yourself that it works:

A2:XYC12,0A3:ETCC13,13A5:QQELF

Now it may not be obvious, but there are actually many possible correct delta files that can be created to convert a first version of a file A into a second version A'. For example, here's another correct delta file for

the example above:

A3:XYAC9,1A6:BLETCHC12,14A5:QQELF

And here's another correct, but not very useful, delta file:

A35:XYABCDEFGHIJBLETCHPQRSTUVPQRSTQQELF

This delta file simply contains the entire contents of A' and instructs the incremental updater to write all 35 characters of this data out to the output file. It's a bad solution because it's actually larger in size than A'! It'd be cheaper to just send A' instead of this delta file.

Your Assignment

You have been hired by the Chief Frugality Officer (CFO) of Small­Mart to create two functions:

createDelta: This function takes the contents of two files, A and A', and produces a delta containing instructions for converting A into A'. Each day, Small­Mart will use this function at their corporate headquarters to create a new delta file that they will then send to all the devices.
applyDelta: This function takes the content of a file A and a delta file, and will apply the instructions in the delta file to produce a new file A'. Each day, every device will use this function to update the previous day's inventory file.

The createDelta function has the following interface:

void createDelta(istream& oldf, istream& newf, ostream& deltaf);

You may name the parameters something else, but there must be three parameters of the indicated types in the indicated order. (The File I/O tutorial explains istream and ostream.) The parameters are

an already­opened input source (for yesterday's full file, say) an already­opened input source (for today's full file, say)
an already­opened output destination to which you will write the instructions for converting the first input to the second.

The applyDelta function has the following interface:

bool applyDelta(istream& oldf, istream& deltaf, ostream& newf);

You may name the parameters something else, but there must be three parameters of the indicated types in the indicated order. The parameters are

an already­opened input source (for yesterday's full file, say) an already­opened input source (the delta file)
an already­opened output destination to which you will write the file resulting from applying the delta file to the first input.

The function returns true if the operation succeeds. If it fails because the delta file is malformed (e.g., there's a character other than an A or C where a command is expected, or an offset or length is invalid), the function returns false. If the function returns false, the caller can make no assumption about what may or may not have been written to the output destination (so you're free to, for example, just return false as soon as you detect an error, without regard to what you may have already written).

Other than the output required to be written to its third parameter, neither function may write to any

destination other than cerr (presumably for debugging purposes). Our testing program will ignore anything you write to cerr.

Your functions must be able to create deltas and apply deltas for files up to 100 kilobytes (102,400 bytes) in length, and if you wish, may be able to handle longer files as well.

Your program must build and run successfully on the SEASnet Linux server. Of course, you can do your primary development in Visual C++ or Xcode, but be sure you check that it works on the Linux server.

Your createDelta function must not run for longer than 15 seconds on the SEASnet Linux server for any file of up to 100 kilobytes. Your applyDelta function must not run for longer than 5 seconds on the SEASnet Linux server for any file of up to 100 kilobytes.

Here's an example of a main routine that performs a test of the functions:

#include <iostream>
#include <fstream>
#include <string>
#include <cassert> using namespace std;

bool runtest(string oldname, string newname, string deltaname, string newname2)
{
ifstream oldfile(oldname); if (!oldfile)
{
cerr << "Cannot open " << oldname << endl; return false;
}
ifstream newfile(newname); if (!newfile)
{
cerr << "Cannot open " << newname << endl; return false;
}
ofstream deltafile(deltaname); if (!deltafile)
{
cerr << "Cannot create " << deltaname << endl; return false;
}
createDelta(oldfile, newfile, deltafile); deltafile.close();

oldfile.clear(); // clear the end of file condition oldfile.seekg(0); // reset back to beginning of the file ifstream deltafile2(deltaname);
if (!deltafile2)
{
cerr << "Cannot read the " << deltaname << " that was just created!" << endl; return false;
}
ofstream newfile2(newname2); if (!newfile2)
{
cerr << "Cannot create " << newname2 << endl; return false;
}
assert(applyDelta(oldfile, deltafile2, newfile2));
cout << "You must compare " << newname << " and " << newname2 << endl;

cout << "If they are not identical, you failed this test." << endl; return true;
}

int main()
{
assert(runtest("myoldfile.txt", "mynewfile.txt", "mydeltafile.txt", "mynewfile2.txt"));
}

(Note: We close the deltafile output stream to ensure that all output to the file is completed before we open it for input. Because createDelta reached the end of file on the oldfile input stream, we need to clear the end of file condition on the stream and reset the stream so that we read from the beginning again.)

Here's another main routine that tests the functions. It uses the types istringstream (an input source where the characters read come from a string instead of a file), and ostringstream (an output destination where the characters written are later available as a string).

#include <iostream>
#include <sstream> // for istringstream and ostringstream
#include <string>
#include <cassert> using namespace std;

void runtest(string oldtext, string newtext)
{
istringstream oldfile(oldtext); istringstream newfile(newtext); ostringstream deltafile; createDelta(oldfile, newfile, deltafile); string result = deltafile.str();
cout << "The delta length is " << result.size()
<< " and its text is " << endl; cout << result << endl;

oldfile.clear(); // clear the end of file condition oldfile.seekg(0); // reset back to beginning of the stream istringstream deltafile2(result);
ostringstream newfile2;
assert(applyDelta(oldfile, deltafile2, newfile2)); assert(newtext == newfile2.str());
}

int main()
{
runtest("There's a bathroom on the right.", "There's a bad moon on the rise.");
runtest("ABCDEFGHIJBLAHPQRSTUVPQRSTUV", "XYABCDEFGHIJBLETCHPQRSTUVPQRSTQQELF");
cout << "All tests passed" << endl;
}

We will provide you with several different old and new files for you to test your program with, in a Windows version and a Mac and Linux version. For each of pair of old and new files, your program must create delta files that are at least 5% smaller than the new version of the file. We will test your program on other files as well, so it can't work just for these sample files. Below we show how big our solution's delta file is for each of these sample pairs of files:

Windows file sizes in bytes Mac and Linux file sizes in bytes old new our delta old new our delta

Windows file sizes in bytes    Mac and Linux file sizes in bytes   

 Old  New  Our delta
 Old  New  Our Delta
 Small­Mart inventory

105

141

95

104

140

94

 Weird Al's Dr.
Seuss

533

606

69

510

579

69

 War and Peace

77285

77333

399

76634

76682

393

 Some strange files

100764

100764

8746

100440

100440

8746

Small­Mart inventory

Weird Al's Dr.

Seuss War and Peace

Some strange files

Algorithm Requirements

You may be wondering how to write a function to create a delta file. This is definitely a non­trivial problem, and most senior job interview candidates can't come up with a viable algorithm (i.e., a conceptual approach) within a one­hour interview.

Here's a general, high­level algorithm that can be used to build a delta file. It's not ideal, however, and you'll have to come up with improvements of your own to make it work well:

1. Read in the entire contents of the old file into a string. Read the entire contents of the new file into another string.

2. For all consecutive N­character sequences in the old file's string, insert that N­character sequence and the offset F where it was found in the old file's string, into a table (e.g. hash table or binary search tree). You might use 8 for N, or maybe 16.

3. Once you have filled up your table with all N­byte sequences from the source file, start processing the new file's string, starting from offset j=0, until j reaches the end of the string.

a. Look up the N­byte sequence which starts at offset j ([j,j+N­1]) in the new file's string in the table you created from the old file's string.

b. If you find this sequence in the table, you know that that sequence is present in both the old and new versions of the file.

i. Determine how long the match goes on (it might be just N bytes long, or it might extend past the first N matching bytes to be many bytes longer).

ii. Once you have determined how long the match is (call this L), write a Copy instruction to the delta file to copy L bytes from offset F from the source file.

iii. Go back to step 3a, continuing at offset j = j + L in the new file's string.

c. If you don't find the current sequence (new file's string [j,j+N­1]) in the table you created, you know that the first version of the file doesn't contain the current N byte sequence.

i. Write an instruction to the delta file to Add the current character.

ii. Increment j by one to continue past the character used in the add instruction.

iii. Go back to step 3a, where we'll try to find the next N­byte sequence in our table.

Of course, this is a simple version of the algorithm, and many improvements are possible. For example, this version of the algorithm will result in only single­character add instructions: If the new file contains the new text BLAH (not present in the old text), then the above algorithm would produce four Add instructions (A1:BA1:LA1:AA1:H) instead of a single (and more compact) Add instruction that adds all four new characters at once (A4:BLAH).

To obtain the highest score possible and create the smallest delta files, you'll have to improve on the algorithm substantially. Be creative! In addition, this naïve algorithm also has troubles with certain types of files (such as the strange*.txt sample files that we provide). You'll have to figure out why and make sure this doesn't cause problems in your solution.

For this project, we are constraining your implementation:

You are required to use either a hash table or a binary search tree (your choice of which) for your table data structure.
The only containers from the C++ standard library you may use are vector, list, stack, and queue (and string). You'll thus have to implement your BST or hash table yourself instead of using a map or unordered_map, but you can, say, use the list type to help you implement a hash table. Although we're limiting your use of containers from the library, you are free to use library algorithms (e.g., sort).

Compared to creating a delta file, the algorithm for applying a delta file is much easier. You may use the following functions to aid you in extracting commands from the delta file:

bool getInt(istream& inf, int& n)
{
char ch;
if (!inf.get(ch) || !isascii(ch) || !isdigit(ch)) return false;
inf.unget(); inf >> n; return true;
}

bool getCommand(istream& inf, char& cmd, int& length, int& offset)
{
if (!inf.get(cmd) || (cmd == '\n' && !inf.get(cmd)) )
{
cmd = 'x'; // signals end of file return true;
}
char ch; switch (cmd)
{
case 'A':
return getInt(inf, length) && inf.get(ch) && ch == ':'; case 'C':
return getInt(inf, length) && inf.get(ch) && ch == ',' && getInt(inf, offset);
}
return false;
}

Attachment:- Project.rar

Reference no: EM131102994

Questions Cloud

Why is it a good idea to use a manager as a trainer : Susan has to decide which of her employees can go on vacation during certain times of the year. Which of the following decision roles of a manager is Susan using?
What was the initial size of the culture : The count in a bacteria culture was 600 after 10 minutes and 2000 after 40 minutes. Assuming the count grows exponentially,
Planners lab presentation on walmart or target : Download and learn how to use the Planners' Lab. You will be using this tool for this project. Go to your Yahoo Finance and choose any of the publicly traded companies, for example, Walmart or Target. Using their financials, and applying the Plann..
When the company sells a new issue of stock : Is it true that the "flatter," or more nearly horizontal, the demand curve for a particular firm's stock, and the less important investors regard the signaling effect of the offering, the more important the role of investment bankers when the company..
Determine how long the match goes on : Determine how long the match goes on (it might be just N bytes long, or it might extend past the first N matching bytes to be many bytes longer).
Create one junit test for dots.java : Some test cases have been provided for you, but none of these test a full game. So: In MyUnitTests.java, create one JUnit test for Dots.java. This test MUST test a full game, from dropping the first dot to being unable to drop any more dots because t..
An original business research project : The final version of your Research Project should be 10 to 12 double-spaced pages (not including the title page, reference page, tables, appendices, etc.).
Discuss the roles played by financial ratios in general : Select a local hospital and compare its financial ratios for the most recent three years against the national norms for this type of institution. Include analytical comments and how the organization compares to the national norms as well as any su..
Different movie soundtracks : Jenny White is shopping for CDs. She decides to purchase 3 movie soundtracks. The music store has 10 different movie soundtracks in stock. How many different selections of movie soundtracks are possible?

Reviews

Write a Review

C/C++ Programming Questions & Answers

  Use the generate dropdown to create c++ code

use the generate dropdown to create example C++ code based on your working logical flow chart to see what the code would look like.

  Application which will read a file of daily payments

C++ application which will read a file of daily payments, calculate the total as well as the average payment, display the results to the screen and write the results to a file. The input and output file names should be provided as command line argume..

  Write a program to evaluate infix expressions

Write a program to evaluate infix expressions. An infix expression looks like the following:   9 * (5 - 4) + 2 / 6

  Program that converts kilograms into pounds

Write a program that prompts the user to enter the weight of a person in kilograms and outputs the equivalent weight in pounds output both the weights rounded to two decimal places.

  Write function to accept character array

Write down the C++ function which will accept the character array of at most 30 cells. Your function must return true if string and its reverse are identical;

  Your program will read the array

Your program will read the array from the file, check the diagonal and print a report containing the size and the status of the diagonal. A typical report would look like: The matrix is 3x3 and all the numbers on the main diagonal are the same.

  Design a class named employeerecord

Design a class named EmployeeRecord that holds an employee's ID number, name, and payrate. Include mutator methods to set the values for each data field and output the values for each data field. Create the class diagram and write the code that

  Design a class that has a static method named writearray

Design a class that has a static method named writeArray.  THe method should take two arguments: the name of a file and a reference to an int array.

  Writer a program that allows the user to enter

Writer a program that allows the user to enter an unknown number of characters, stores those characters in a data structure (a vector) and then prints the values to the screen.

  Write cpp pthreads program to discover shortest path in maze

The objective of this assignment is to apply the concepts of threading by developing a simple C++ Pthreads program to discover the surroundings and the shortest path in a 2D maze.

  Linear program was developed to help

The client has stipulated that the money must be put into either a stock fund or a money market fund, and the annual return should be at least $14,000. Other conditions related to risk have also been specified, and the following linear program was..

  How a base version of this assignment works

For this assignment you are to create an interactive moving sign in the context of a cityscape street scene. Click the link below to see how a base version of this assignment works. Type a message in the long blank slot at the top left, and then clic..

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd