Reference no: EM131220688
In this exercise we consider the problem of finding squares in a graph. That is, we want to find quadruples of nodes a, b, c, d such that the four edges (a, b), (b, c), (c, d), and (a, d) exist in the graph. Assume the graph is represented by a relation E as in Section 10.7.4. It is not possible to write a single join of four copies of E that expresses all possible squares in the graph, but we can write three such joins. Moreover, in some cases, we need to follow the join by a selection that eliminates "squares" where one pair of opposite corners are really the same node. We can assume that node a is numerically lower than its neighbors b and d, but there are three cases, depending on whether c is
(i) Also lower than b and d,
(ii) Between b and d, or
(iii) Higher than both b and d.
(a) Write the natural joins that produce squares satisfying each of the three conditions above. You can use four different attributes W, X, Y , and Z, and assume that there are four copies of relation E with different schemas, so the joins can each be expressed as natural joins.
(b) For which of these joins do we need a selection to assure that opposite corners are really different nodes?
(c) Assume we plan to use k Reduce tasks. For each of your joins from (a), into how many buckets should you hash each of W, X, Y , and Z in order to minimize the communication cost?
(d) Unlike the case of triangles, it is not guaranteed that each square is produced only once, although we can be sure that each square is produced by only one of the three joins. For example, a square in which the two nodes at opposite corners are each lower numerically than each of the other two nodes will only be produced by the join (i). For each of the three joins, how many times does it produce any square that it produces at all?