Problem1 assumethefollowingsimpleb-treewithn4this tree

Assignment Help Database Management System
Reference no: EM13351181

Problem 1

Assume the following simple B+-tree with n=4:

This tree consists of only a root node and three leaf nodes. Recall that the root node must have between 2 and n child pointers (basically RIDs) and between 1 and n-1 key values that separate the sub-trees. In this case, the values 9 and 19 mean that to search for a value strictly less than 9 you visit the left-most child, for at least 9 and strictly less than 19, you visit the second child, and otherwise the third child. Each internal node (none in the figure) has between 2 and 4 child pointers and between 1 and 3 key values (always one less than the number of pointers), and each leaf node has between 2 and 3 key values and associated RIDs pointing to a record with that key value in the indexed table. Sketch the state of the tree after each step in the following sequence of insertions and deletions:

Insert 15, Delete 19, Delete 9, Insert 27, Insert 20, Insert 22, Delete 20, Delete 16, Delete 12.

Note that for insertions, there are two algorithms, one that splits a full node without trying to off-load data to a direct neighbor, and one that first tries to balance with a direct neighbor in the case of a full node. Please use the first algorithm!

Problem 2

You are given a sequence of 8 key values and their 8-bit hash values that need to be inserted into an extendible hash table where each hash bucket holds at most two entries. The sequence is presented in Table 1 below. (You do not need to know what function was used to compute the hashes, since the resulting hashes are already given.) In Figure 1 you can see the state of the hash table after inserting

the first two keys, where we only use the first (leftmost) bit of each hash to organize the buckets. Now insert the remaining six keys (k2 to k7) in the order given. Sketch the bucket address table and the buckets after each insertion.

Problem 3:

In the following, assume the latency/transfer-rate model of disk performance, where we estimate disk access times by allowing blocks that are consecutive on disk to be fetched with a single seek time and rotational latency cost (as shown in class). Also, we use the term RID (Record ID) to refer to an 8-byte "logical pointer" that can be used to locate a record (tuple) in a table.

You are given the following very simple schema for a credit card payment database. In this schema, people can make payments to merchants (stores) with their credit cards. All the payment records are stored in the CardCharge table with a unique identifier and a timestamp. (We are only concerned with the time of the charge, not the time the merchant receives the money from the credit card company or the time the customer pays his credit card bill, which both are much later.) The database stores for every person a name, city, state, and SSN. A customer may have several credit cards, for which we store the customer's SSN, the time the credit card was issued,, and the expiration time. Each merchant has a name, city, state and a unique identifier. The details of the schema are shown below:

Assume there are 50 million customers, 200 million cards, 1 million merchants and 10 billion charge records. Each tuple is of size 200 bytes, and each ID requires 16-bytes. Consider the following queries:

SELECT chargeid
FROM CardCharge CC, Card C, Person P
WHERE CC.ccn = C.ccn and C.ssn = P.ssn and P.pname = "Cathy Crowbar"

SELECT P.ssn
FROM CardCharge CC, Card C, Person P
WHERE CC.ccn = C.ccn and C.ssn = P.ssn and P.pcity = "Chicago"

SELECT P.ssn
FROM CardCharge CC, Card C, Merchant M, Person P
WHERE CC.mid = M.mid and CC.ccn = C.ccn and C.ssn = P.ssn and CC.camount > 1000 and M.mcity =
"Elko" and P.pcity = "New York City"

SELECT C.ssn
FROM CardCharge CC, Card C, Merchant M, Person P
WHERE CC.ccn = C.ccn and CC.mid = M.mid and C.ssn = P.ssn and M.mcity = P.pcity

(a) For each query, describe in one sentence what it does. (That is, what task does it perform?)

In the following, to describe how a query is executed, draw a query plan tree and state what algorithms should be used for the various selections and joins. Provide estimates of the running times, assuming these are dominated by disk accesses.

(b) Assume that there are no indexes on any of the relations, and that all relations are unclustered (not sorted in any way). Describe how a database system would best execute all four queries in this case, given that 2GB of main memory are available for query processing, and assuming a hard disk with 10 ms for seek time plus rotational latency (i.e., a random access requires 10 ms to find the right position on disk) and a maximum transfer rate of 60 MB/s.

Assume that 2% of all customers live in Chicago and 5% live in New York City, that there are only 5 customers named "Cathy Crowbar", that there are 200 Merchants Elko, and that 1% of all charges are for more than $1000. Also, if nothing is stated, assume independence (e.g.., customers in Chicago have on average the same number of cards and same spending patterns as the average customer, and if 1% of all people live in Cleveland and 20% of all charges were done in 2011, then 0.2% of all charges were made during 2011 by people living in Cleveland.)

(c) Consider a sparse clustered B+-tree index on chargeid in the CardCharge table, and a dense unclustered B+-tree index on mname in the Merchant table, where mname has a (fixed) size of 16 bytes. For each index, what is the height and the size of the tree? How long does it take to fetch a record with a particular key value value using these indexes?

(d) Suppose that for each query, you could create up to two index structures to make the query faster. What index structures would you create, and how would this change the evaluation plans and running times? (In other words, redo (b) for each query using your best choice of indexes for that query.)

Problem 4:

(a) Consider a hard disk with 6000 RPM and 3 single-sided platters. Each surface has 400,000 tracks and 2000 sectors per track. (For simplicity, we assume that the number of sectors per track does not vary between the outer and inner area of the disk.) Each sector has 1024 bytes. What is the capacity of the disk? What is the maximum rate at which data can be read from disk, assuming that we can only read data from one surface at a time? What is the average rotational latency?

(b) Suppose we have another disk, different from the one in part (a), with average seek time 4 ms, average rotational latency 6 ms, and maximum transfer rate 60 MB/s. How long does it take to read a file of size 8 KB? How about a file of 80 KB? How about a file of 8 MB? Use both the block model (4KB per block) and the latency/transfer-rate model, and compare.

(c) Suppose you have a file of size 81 GB that must be sorted, and you have only 1 GB of main memory to do the sort (plus unlimited disk space). Estimate the running time of the I/O-efficient merge sort algorithm from the class on this data, using the hard disk from part (b). Use the latency/transfer-rate model of disk performance, and ignore CPU performance. Assume that in the merge phase, all sorted runs from the initial phase are merged together in a single merge pass.

(d) Suppose you use two (instead of one) merge phases in the scenario in (c). What would be the degree of the merges, and how would this change the running time of the sort?

Reference no: EM13351181

Questions Cloud

Q1 a train at rest emits a sound at a frequency of 1045 hz : q1. a train at rest emits a sound at a frequency of 1045 hz. an observer in a car travels away from the sound source at
Q1 a ball in flight is recorded and analyzed using digital : q1. a ball in flight is recorded and analyzed using digital video analysis software. the analysis produced the result
Parent manufacturing inc is negotiating a merger with one : parent manufacturing inc. is negotiating a merger with one of its major competitors targetsub manufacturing inc.nbsp
1 one of the sidebands of an am dsb-lc broadcast station is : 1 one of the sidebands of an am dsb-lc broadcast station is measured to have a power of 63 dbm.nbspif the modulation is
Problem1 assumethefollowingsimpleb-treewithn4this tree : problem1 assumethefollowingsimpleb-treewithn4this tree consists of only a root node and three leaf nodes. recall that
Q1 if the weight of a room full of water is so great why : q1. if the weight of a room full of water is so great why dont the first floors of houses built over basements all
Marketing scenarios1arimount a well-known beauty and : marketing scenarios1.arimount a well-known beauty and grooming company wants to launch a new deodorant product. the
You will implement a simplified version of the set class : you will implement a simplified version of the set class. you must implement all functions defined in the provided file
Q patricia is researching venues for a restaurant business : q. patricia is researching venues for a restaurant business. she is estimating 3 chief features that she considers

Reviews

Write a Review

Database Management System Questions & Answers

  Design an relational model

Choose 2 enterprises(businesses, corporations, companies, institutions) from one or more economic, organizational, institutional or industrial sector of interest to you. See the Appendix for suggestions and examples.

  What do you think is the wisdom of maintaining 2 models

what do you think is the wisdom of maintaining 2 models? The W3C has a Document Object Model as a recommendation. Do you think browsers should implement this model? If not, propose a model which you think would be suitable.

  Describe a minimum of three heuristics to optimize queries

Your supervisors and customers are very impressed with the database you have put together. Describe a minimum of three heuristics to optimize Queries.

  Develop basic tools to expedite use of oracles dictionary

Write a script that provides all of the information in, and duplicates the formatting of, Oracle's SQL*Plus describe command. Additionally, the output should add the comments on the rows. Input: owner and table name. Output: columns for Name, Null..

  Computing functional dependencies

Compute the functional dependencies which exist in following table. After determining the functional dependencies, transform this table to an equivalent collection of the tables which are in third normal form.

  Write down advantage in storing metadata in tables

What is meant by a Metadata? How does this term pertain to a database? Write down advantage is there in storing metadata in tables?

  Use embedded sql and c or any programming language

Convert the example of GEOMETRY_OBJECTS given in section 11.1.5 from the functional notation to the notation given in Figure 11.2 that distinguishes between attributes and operations. Use the keyword INHERIT to show that one class inherits from an..

  Convert er diagram into relational schema

Suppose we are to design a registrar's database to store information about students, courses, the courses students have taken. Convert the E/R diagram into a relational schema.

  Write problems and issues related with internet databases

Write down some of problems and issues related with internet databases? Consider security, performance, architecture issues.

  Description of the relationship represented by scatterplot

Produce a scatterplot of Rent vs. Size (square meters of the apartment) for the rental data in rent.

  Problem on relational algebra

The database used for this question is a very simple one with the following schema: (Primary keys are bold, foreign keys are underlined)

  Justify a question on database management

When a student has not chosen a major at a university, the university often enters a value of "Undecided" for the major field. Is "Undecided" a way to represent the null value? Should it be used as a default value? Justify your answer carefully.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd