Reference no: EM13906567
1. Which methodology is used to group products that customers purchase together?
A. Classification analysis
B. Market basket analysis
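For reference, the idea behind market basket analysis (option B) can be sketched as counting how often pairs of items appear in the same transaction. This is a minimal illustration only: the `pair_counts` helper and the sample baskets are hypothetical and do not come from the question set.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical transactions, invented for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "eggs"],
]
counts = pair_counts(baskets)
print(counts.most_common(3))
```

Pairs with high counts are candidates for products that customers tend to purchase together.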
2. Mya is investigating the factors that impact soda consumption. She examines a host of variables that help explain the amount consumed. Which type of data mining methodology is she most likely to use?
B. Classification analysis
D. Market basket analysis
3. A facts table has:
A. few rows and many columns
B. many rows and many columns
C. many rows and few columns
D. few rows and few columns
4. In the facts table of a supermarket database, which item is likely to be listed in rows?
A. The number of units sold
B. Revenue generated from a particular unit
C. The department in which the unit was purchased
D. The individual items purchased
5. The testing set in data partitioning is:
A. The first subset of data, which usually contains 70% of the records
B. The second subset of data, which usually contains less than 70% of the records
C. The initial data set from which subsets are created
D. The first subset of data, which usually contains 30% of the records
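A data partition of the kind described in these options could be sketched as follows. The 70/30 split mirrors the percentages quoted in the options; the `partition` function, its parameters, and the integer "records" are assumptions made for illustration.

```python
import random


def partition(records, train_fraction=0.7, seed=0):
    """Shuffle the records, then split them into a training and a testing subset."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 100 hypothetical records, represented here by their integer IDs.
training, testing = partition(range(100))
print(len(training), len(testing))
```

The first subset is used to fit the model and the second, smaller subset is held back to check how well the fitted model performs on records it has not seen.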
6. The predicted value from a logistic regression will be:
A. between 0 and 1
B. between -1 and 1
C. less than 0
D. greater than 1
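As a quick check on the range of a logistic regression prediction, the sketch below evaluates the standard logistic function at a few values of the linear predictor. The function name `logistic` and the sample inputs are chosen for illustration only.

```python
import math


def logistic(z):
    """Standard logistic function: maps any real z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# z stands for the linear predictor b0 + b1*x1 + ... from the fitted model.
for z in (-5.0, 0.0, 5.0):
    print(z, logistic(z))
```

However large or small the linear predictor becomes, the output stays between 0 and 1, and it rises as the predictor rises.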
7. Suppose the odds of Team A winning are 5 to 1. Then, the odds ratio is:
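The relationship between odds and probability can be worked through with exact fractions. This sketch assumes "5 to 1" means five wins for every one loss (odds in favor); the helper names are hypothetical.

```python
from fractions import Fraction


def odds_to_probability(wins, losses):
    """Convert odds of 'wins to losses' in favor into a probability."""
    return Fraction(wins, wins + losses)


def probability_to_odds(p):
    """Convert a probability p into odds in favor, p / (1 - p)."""
    return p / (1 - p)


p_win = odds_to_probability(5, 1)
print(p_win, probability_to_odds(p_win))
```

Under that assumption, odds of 5 to 1 correspond to a winning probability of 5/6, and converting the probability back gives odds of 5.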
8. If the regression coefficient estimate from a logistic regression is positive, the probability of the dependent variable taking on a value of 1:
A. approaches zero
C. remains constant
9. A data mart is typically smaller than a data warehouse.
1. When you try to find the most appropriate input probability distribution in a simulation model, you first have to choose the most appropriate family, and then you have to select the most appropriate member of that family.
2. Assume that x is a random number between 0 and 1, and that the number of units expected to be sold is uniformly distributed between 300 and 500. Then, sales are given by the expression:
A. 300 + x
B. 500 - x
C. 300 + 200 x
D. 500 - 200 x
E. 300 + 500 x
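The general rule for turning a random number x in [0, 1] into a uniform value on [low, high] is low + (high - low) * x. A minimal sketch, with the function name `simulated_sales` chosen for illustration:

```python
import random


def simulated_sales(x, low=300.0, high=500.0):
    """Scale a random number x in [0, 1] to a uniform value on [low, high]."""
    return low + (high - low) * x


# The endpoints and midpoint of x map to the endpoints and midpoint of the range.
print(simulated_sales(0.0), simulated_sales(0.5), simulated_sales(1.0))

# One random draw, playing the role of a spreadsheet random-number cell.
print(simulated_sales(random.random()))
```

Checking each candidate expression at x = 0 and x = 1 is a quick way to see whether it covers the interval from 300 to 500.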
3. Which of the following statements is correct regarding the graph of a discrete probability distribution?
A. It is a series of spikes.
B. The height of each spike is the probability of the corresponding value.
C. There is an empty space between adjacent spikes.
D. All of these options
4. The RAND() function in Excel models which of the following probability distributions?
5. Suppose you run a simulation model several times with different order quantities. What can we infer about the order quantity that maximizes the output (the company's profit)?
A. This quantity is the optimal order quantity.
B. This quantity might be the optimal order quantity, but we also need to consider the company's attitude toward risk.
C. This is not necessarily the optimal order quantity, because it may have produced the largest profit by luck.
D. We can't infer anything
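The situation in this question can be sketched as a small ordering simulation. Everything specific here is an assumption for illustration: the unit cost, the selling price, the uniform demand between 300 and 500, and the function names are not taken from the question.

```python
import random


def mean_profit(order_qty, replications, rng, unit_cost=5.0, unit_price=10.0):
    """Average simulated profit for one order quantity (hypothetical cost and price)."""
    total = 0.0
    for _ in range(replications):
        demand = 300 + 200 * rng.random()   # assumed uniform demand on [300, 500]
        sold = min(order_qty, demand)       # cannot sell more than was ordered
        total += unit_price * sold - unit_cost * order_qty
    return total / replications


def compare(quantities, replications, seed):
    """Estimate mean profit for each candidate order quantity."""
    rng = random.Random(seed)
    return {q: mean_profit(q, replications, rng) for q in quantities}


results = compare([300, 350, 400, 450, 500], replications=1000, seed=1)
best = max(results, key=results.get)
print(best, results)
```

Because each mean profit is an estimate based on random demand draws, rerunning with a different seed or fewer replications can change which quantity appears best, and a mean alone says nothing about the spread of outcomes around it.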
6. Which of the following functions is often required in simulations where we must model a process over multiple time periods and must deal with uncertain timing of events?
E. None of these options
7. Which of the following is not among the financial applications where simulation can be applied?
A. Future stock prices
B. Customer preferences for different attributes of products
C. Future interest rates
D. Future cash flows
E. None of these options
8. In a manufacturing model, we might simulate the number of days to produce a batch and the yield from each batch. The number of days would typically be a ___________ distribution and the yield would be a ___________ distribution.
A. Continuous, discrete
B. Continuous, continuous
C. Discrete, continuous
D. Discrete, discrete
9. The value at risk (VAR) is typically defined as the:
A. 5th percentile of NPV distribution
B. 10th percentile of NPV distribution
C. 50th percentile of NPV distribution
D. 90th percentile of NPV distribution
E. 95th percentile of NPV distribution
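A percentile of a simulated NPV distribution can be computed directly from the simulation output. The sketch below assumes the 5th-percentile convention listed in option A; the `value_at_risk` helper and the simulated NPV figures are made up for illustration.

```python
import random
import statistics


def value_at_risk(npvs, percentile=5):
    """Return the given percentile of a list of simulated NPVs."""
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cut_points = statistics.quantiles(npvs, n=100)
    return cut_points[percentile - 1]


# Hypothetical simulation output: 10,000 NPVs drawn from a normal distribution.
rng = random.Random(42)
npvs = [rng.gauss(1000.0, 400.0) for _ in range(10_000)]
print(value_at_risk(npvs))
```

Under this convention, about 5% of the simulated NPVs fall below the reported value, so it summarizes how bad the outcome could plausibly be.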