Construct the appropriate dummy variable

Reference no: EM131376203

Question 1:

The production manager of American Tool and Castings Company is conducting a study regarding the relationship between the number of alloy caps milled on a lathe and the distance from specification of the outside cap diameters. The lathe uses a sharp steel cutting tool in a milling process to cut and shape raw alloy bars into caps. The lathe tool turns at a high speed while cutting into the alloy, in essence cutting the alloy down to size and shaping it to resemble a round cap. A similar lathe tool cuts into the inside of the cap. The caps are later fitted with interior gaskets and permanently sealed onto airtight canisters.

After the steel cutting tool is used repeatedly, the tool begins to wear and cuts a larger outside cap diameter than desired. If the outside cap diameter is too large, the cap can't be properly affixed and sealed to the canister. The production manager would like to build a model to estimate/predict how many caps a tool can mill before it wears down so much that it mills caps that are too large in diameter and unusable. Each cap costs approximately $400 to mill, so defective caps are expensive. The main variable of interest (y) is "distance from specification" of outside cap diameter.

To conduct the study, 62 lathe tools were randomly sampled. Each lathe operator keeps a record of the number of caps milled by a particular tool. Each cap milled is measured to see how close to specification the outside diameter is. According to specification, each cap should be 6 inches in diameter. For example, the measure in record one is 0.36, meaning the cap was 0.36 inches larger than specification. When each tool was sampled, the number of caps milled by the tool was recorded, as well as the distance from specification of the diameter of the last cap milled by that particular tool. The data for each cutting tool sampled, including the distance from specification of the outside cap diameter of the last cap milled, is in the spreadsheet labeled American.

1. Scatter plot

Construct a scatter plot revealing the relationship between the number of caps milled by a tool and the distance from specification of the outside cap diameter.  Make sure the x variable is on the x-axis and the y-variable is on the y-axis.  Move the chart so that it starts in cell E3.  Do not resize the chart beyond the red shaded region.

2. Correlation

Using a built-in Excel function in cell F22, calculate the correlation (r) between the number of caps milled by a tool and the distance from specification of the outside cap diameter.

In cell F23, indicate the strength of the linear relationship as very strong, relatively strong, very weak, relatively weak, or no relationship.

In cell F24, indicate if the relationship is positive or negative.
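For readers who want to check the spreadsheet result outside Excel, the correlation in this step can be sketched in Python. This is a minimal illustration, not the assignment's required method: the function name `pearson_r` and the five data points are placeholders invented for the example and are not taken from the American spreadsheet. The formula itself is the Pearson correlation, which is what Excel's CORREL function computes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values only; NOT the data from the American spreadsheet.
nm = [5, 10, 15, 20, 25]              # number of caps milled (x)
ds = [0.10, 0.22, 0.29, 0.41, 0.50]   # distance from specification (y)

r = pearson_r(nm, ds)
# The sign of r gives the direction of the linear relationship.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 4), direction)
```

The sign of `r` answers the positive/negative question; how the magnitude maps onto labels such as "very strong" depends on the cutoffs used in the course, which the assignment does not state.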

3. Anchoring the output in cell P3, generate the regression output.  Make sure you select an appropriate "Residual Plot," and place the residual plot in the designated area near cell E32.

4. Output

In cells J23 and J24, enter the value of the intercept and slope (respectively) by referencing the appropriate cells in the regression output.

In cell K24, enter the value of the t test statistic for testing the slope significance by referencing the appropriate cell from the regression output.

In cell L24, enter the p-value regarding the slope significance by referencing the appropriate cell from the regression output.

In cell M24, indicate with the word "Yes" or "No" if the slope coefficient is significant.  Assume α = 0.01.

5. In cell F29, provide the predictive power (a.k.a. the coefficient of determination) of the model by referencing the appropriate cell from the regression output.
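The quantities requested in steps 4 and 5 all come from the same least-squares fit. The sketch below shows how they relate, assuming a simple one-predictor model; the function name `simple_ols` and the data are hypothetical placeholders, not the spreadsheet values or Excel's internal implementation.

```python
import math

def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x with basic diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals and total sum of squares.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    std_err = math.sqrt(sse / (n - 2))     # standard error of the estimate
    se_slope = std_err / math.sqrt(sxx)    # standard error of the slope
    return {
        "intercept": intercept,
        "slope": slope,
        "t_slope": slope / se_slope,       # t statistic for H0: slope = 0
        "r_squared": 1 - sse / sst,        # coefficient of determination
    }

# Placeholder values only; NOT the data from the American spreadsheet.
nm = [5, 10, 15, 20, 25]
ds = [0.10, 0.22, 0.29, 0.41, 0.50]

fit = simple_ols(nm, ds)
print(fit)
```

The p-value is the two-sided tail probability of `t_slope` under a t distribution with n - 2 degrees of freedom; it is omitted here because the Python standard library has no t distribution. The slope is declared significant when that p-value is below the stated α.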

6. In cell J29, write the prediction equation relating NM to DS using the intercept and slope values.  This is a text input that starts with a number, so you must start the input with a space to trick Excel into interpreting the input as text.  For example, if a = 4 and b = 10, enter 4 + 10(NM), placing a space before the value 4.

7. Cell E32 should contain the residual plot.  Keep the plot within the red shaded area.

In cell F48, comment on the assumption of linearity as interpreted using this residual plot.

In cell F49, comment on the assumption of constant variance as interpreted using this residual plot.

8. Prediction and Residual

In cell F53, predict the distance from specification of a cap milled by a tool when the cap is the 20th cap to be milled.

In cells F54 and F55, calculate the lower and upper values for the range of definition for this data set.
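As a rough illustration of this step, the sketch below computes a point prediction and checks it against the observed x range. It assumes "range of definition" means the smallest and largest NM values in the sample, which is the usual reading but is not spelled out in the assignment. The intercept, slope, and observations are placeholders, not results from the actual regression output.

```python
def predict_ds(intercept, slope, nm):
    """Point prediction of distance from specification for a given cap count."""
    return intercept + slope * nm

# Placeholder values only; substitute the intercept, slope, and NM column
# from the actual regression output and spreadsheet.
a, b = 0.007, 0.0198
nm_observed = [5, 10, 15, 20, 25]

# Assumed meaning of "range of definition": min and max of the observed x values.
lower, upper = min(nm_observed), max(nm_observed)

x_new = 20
y_hat = predict_ds(a, b, x_new)
inside_range = lower <= x_new <= upper  # predictions outside this range are extrapolation
print(y_hat, (lower, upper), inside_range)
```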

9. Prediction Interval

Using the table in cells J52:K53 as the Prediction Data Set and StatTools, calculate the lower limit and upper limit for a 95% prediction interval for the DS of a cap that is the 20th cap milled.  Anchor your StatTools Regression output in cell A1 of the Regression Worksheet.  Place the values in cells J58 and K58 by referencing the appropriate cells in the StatTools output.  Note that this will shift the columns of your worksheet.
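For reference, the textbook form of a prediction interval for a single new observation in simple linear regression is given below. The symbols are notation introduced here, not StatTools output labels: x_0 is the new NM value (20), n is the sample size (62), s_e is the standard error of the estimate, and α = 0.05 for a 95% interval. Software output can be checked against this form.

```latex
\hat{y}_0 \;\pm\; t_{\alpha/2,\;n-2}\; s_e
\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
\hat{y}_0 = a + b\,x_0,
\qquad
s_e = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}
```

The leading 1 under the square root is what distinguishes a prediction interval for an individual cap from the narrower confidence interval for the mean DS at x_0.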

Question 2:

A mental health agency measured the self-esteem score for randomly selected individuals with disabilities who were involved in some work activity within the past year.  The spreadsheet named Self Esteem provides the data, including each individual's self-esteem measure (y), years of education (YrsEdu), age, months worked in the last year (MonWork), marital status dummy variables (MS2, MS3, MS4) indicating if the individual is single, married, separated, or divorced, and a support level (SL) dummy variable indicating if the level of job support (counseling, etc.) was provided directly (1) or indirectly (0).  Regarding marital status, if single all MS indicators are 0, while MS2 = 1 indicates married, MS3 = 1 indicates separated, and MS4 = 1 indicates divorced.
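The marital status coding described above uses three dummy variables for four categories, with single as the baseline. A minimal sketch of that scheme follows; the function name and the lowercase category labels are assumptions made for illustration, since the source names the categories but not how they are spelled in the spreadsheet.

```python
# Assumed category labels; only the MS2/MS3/MS4 meanings come from the text.
DUMMY_FOR_LEVEL = {"married": "MS2", "separated": "MS3", "divorced": "MS4"}

def marital_dummies(status):
    """Return the MS2, MS3, MS4 indicators; single is the all-zero baseline."""
    label = status.strip().lower()
    if label != "single" and label not in DUMMY_FOR_LEVEL:
        raise ValueError(f"unexpected marital status: {status!r}")
    active = DUMMY_FOR_LEVEL.get(label)  # None for the baseline category
    return {name: int(name == active) for name in ("MS2", "MS3", "MS4")}

print(marital_dummies("Single"))     # all indicators 0
print(marital_dummies("Separated"))  # only MS3 set to 1
```

Using one fewer dummy than there are categories avoids perfect collinearity among the indicators when an intercept is in the model.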

a. In cell N4, use Excel's "Correlation" Data Analysis tool to construct a correlation matrix for all the variables.   Note that the categories in columns I and J should not be included since the data are already represented as dummy variables in columns E through H.

b. Considering the correlation between self esteem and each x variable, identify the three variables that, based on correlation with y alone, should be considered as best candidates for inclusion in the model.  Shade the appropriate cells containing the correlation values in yellow.  Ignore any multicollinearity concerns for this part.

c. Considering the correlation between each pair of x variables, identify the variables that would possibly cause multicollinearity problems if included in the model.  Shade the appropriate cells containing the correlation values in green.

d. Based on your conclusions in parts b and c, shade in red color the names of any variables that should not be included in the initial model because of possible multicollinearity problems.

e. With cell N19 as the upper left hand corner of the output, fit the full regression model. (Do not include a residual plot)

f. Considering the regression output from part e, shade (in yellow) the name of any x variable that appears significant and should remain in the model.  Also shade the t stat and p-value.  Consider the p-value small if it is less than 0.05.

g. Partial Regression Model: With cell N51 as the upper left hand corner of the output, fit the model including only the x variable(s) that were found to be significant in part f.  (Do not include a residual plot)
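The multicollinearity screen in Question 2 (comparing each pair of x variables) can be sketched as follows. The 0.7 cutoff, the function names, and all data values are assumptions for illustration only; the assignment does not state a threshold, and the columns below are invented placeholders that merely reuse some variable names from the Self Esteem sheet.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag_collinear_pairs(columns, response, threshold):
    """List predictor pairs whose absolute correlation meets the threshold."""
    predictors = [name for name in columns if name != response]
    flagged = []
    for a, b in combinations(predictors, 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Placeholder columns only; NOT the Self Esteem data.
data = {
    "y":       [20, 31, 28, 35, 19, 33],
    "YrsEdu":  [10, 12, 14, 16, 12, 18],
    "Age":     [25, 30, 35, 40, 30, 45],
    "MonWork": [3, 12, 6, 9, 1, 7],
}

# 0.7 is an assumed cutoff, chosen only to make the example concrete.
flagged = flag_collinear_pairs(data, response="y", threshold=0.7)
for a, b, r in flagged:
    print(a, b, round(r, 3))
```

In this toy data Age is an exact linear function of YrsEdu, so that pair is the only one flagged; one variable of such a pair would normally be left out of the initial model.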

Question 3:

A bank must prepare for a discrimination suit filed on behalf of female employees who claim that females are paid less than male employees.  The bank manager sampled employee files to see if he could build a useful model for predicting salary as a function of gender and other characteristics.  For each employee, the data includes salary (y, in thousands of dollars), years of experience (YrsExp), years of prior experience (YrsPrior), and Gender.  The data is in the spreadsheet named Bank.

1. Since Gender is a categorical variable, construct the appropriate dummy variable in column E to indicate gender as female = 1 and male = 0.  You must use an "IF" statement in the appropriate cell(s) to indicate the correct dummy value based on gender.

2. With cell H7 the upper left hand corner of the output, fit the full model.  (Do not include a residual plot).

3. Based on the regression output from part 2, shade (in yellow) the name of any x variable that appears significant and should remain in the model.  Also shade the t stat and p-value.
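The dummy coding requested in step 1 of this question is a single conditional. In Excel it would look something like `=IF(D2="Female",1,0)` filled down column E, assuming Gender sits in column D and is spelled "Female"/"Male"; neither detail is confirmed by the assignment text. The same logic as a Python sketch, with a hypothetical function name and placeholder labels:

```python
def gender_dummy(gender):
    """Return 1 for female and 0 for male, as the assignment specifies."""
    label = gender.strip().lower()
    if label == "female":
        return 1
    if label == "male":
        return 0
    # Anything else is treated as a data-entry problem rather than silently coded.
    raise ValueError(f"unexpected gender label: {gender!r}")

# Placeholder labels only; the Bank spreadsheet's exact spelling is an assumption.
genders = ["Female", "Male", "Female"]
dummies = [gender_dummy(g) for g in genders]
print(dummies)
```

With this coding, the fitted coefficient on the dummy estimates the salary difference (in thousands of dollars) for females relative to males, holding YrsExp and YrsPrior fixed.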

Attachment:- Assignment.rar


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd