Briefly describe the shape of the distribution

Reference no: EM131151373

Detailed Question:

Idea of how to do the computer-generated aspects of this study guide.

Data Exploration and Descriptive Statistics

For the final you will pick 4 variables to work with. At least one of them has to be an interval-ratio variable; please consult me if you are having trouble finding one. Your other variables can be nominal, ordinal, or interval-ratio.

Part 1.

Fill in this chart:

             | Variable Name | Variable Label | Type of Variable
Variable 1   |               |                |
Variable 2   |               |                |
Variable 3   |               |                |
Variable 4   |               |                |

Part 2. Data Exploration

a) Run a histogram for each interval-ratio variable. Cut and paste those into your Word file. Briefly describe each distribution, noting its overall shape and any outliers.

b) Build a scatter plot if you have two or more interval-ratio variables. What type of relationship, if any, can you observe between the variables?

Now, turn to your categorical variables (if you have any).

c) Run a frequency distribution for each of your categorical variables; you can use either the tab or fre command. Cut and paste the output into your Word file and briefly describe the distribution of the variable. Note which category has the most observations, and note any categories that have very few observations.

d) If you have two or more categorical variables, run a cross-tab to examine the relationship between two of them using the tab command and either column or row percentages. Briefly describe the relationship between the variables. Cut and paste your output into your Word file.
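To make clear what a frequency distribution and a cross-tab with column percentages actually contain, here is a minimal Python sketch. It is only an illustration of the arithmetic, not a replacement for the tab or fre output the assignment asks for. The data and the function names (frequency_table, crosstab_column_percent) are made up for this example.

```python
from collections import Counter

# Hypothetical data: each tuple is (row variable, column variable) for one respondent.
records = [
    ("employed", "urban"), ("employed", "urban"), ("employed", "rural"),
    ("unemployed", "urban"), ("unemployed", "rural"), ("unemployed", "rural"),
    ("employed", "urban"), ("unemployed", "rural"),
]


def frequency_table(values):
    """Return {category: (count, percent of all observations)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


def crosstab_column_percent(pairs):
    """Return {(row, col): percent of that column's total} for two variables."""
    cell_counts = Counter(pairs)
    col_totals = Counter(col for _, col in pairs)
    return {(r, c): 100.0 * n / col_totals[c] for (r, c), n in cell_counts.items()}


freq = frequency_table([row for row, _ in records])
xtab = crosstab_column_percent(records)

for category, (count, percent) in freq.items():
    print(f"{category}: n={count}, {percent:.1f}%")
for (row, col), percent in sorted(xtab.items()):
    print(f"{row} within {col}: {percent:.1f}%")
```

With column percentages, each column sums to 100, so you compare across the columns within a row. In this made-up example, 75% of urban respondents are employed versus 25% of rural respondents.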

Part 3. Descriptive Statistics

Now, calculate descriptive statistics for your variables using the sum command. You can do all four variables at once.

a) Make sure you describe the mean of each variable. If the mean is not a good measure of central tendency for a particular variable, please explain why. Do the same for the standard deviation and the range.
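A small Python sketch can show why the mean is sometimes a poor measure of central tendency. The variable below is hypothetical (hours worked per week, with one extreme value), and the code only mirrors the kind of numbers the sum command reports; it is not the assignment's required output.

```python
import statistics

# Hypothetical interval-ratio variable; the value 90 is an outlier.
hours = [35, 38, 40, 40, 42, 45, 90]

mean = statistics.mean(hours)
median = statistics.median(hours)
sd = statistics.stdev(hours)           # sample standard deviation (n - 1 denominator)
value_range = max(hours) - min(hours)

print(f"mean={mean:.2f} median={median} sd={sd:.2f} range={value_range}")
```

Here the single outlier pulls the mean above the median and inflates both the standard deviation and the range. That is the kind of observation part a) asks you to explain.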

Of course, the type of correlation that you should calculate depends upon the type of variables you are working with. At a minimum you should calculate three correlations, but you must be very careful to use the right type of correlation coefficient for your data.

Just a few reminders:

1) To correlate interval-ratio variables, use Pearson's r. Make sure you display a scatter plot before you run your correlation.

2) To correlate ordinal variables, use Spearman's rho. Make sure you display a cross-tab before you run your correlation.

3) To correlate nominal variables, use lambda. Make sure you display a cross-tab before you run your correlation.

For each of your three correlations make sure you describe the size, statistical significance and, when applicable, the direction of the relationship.
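The three coefficients in the reminders above can be sketched in plain Python to show what each one measures. This is an illustration under assumptions: the data are made up, the function names are hypothetical, lambda is computed in its asymmetric Goodman-Kruskal form (predicting y from x), and no significance tests are included, so you still need your statistics package for the p-values.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two interval-ratio variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def goodman_kruskal_lambda(x, y):
    """Proportional reduction in error when predicting nominal y from nominal x.

    Assumes y has more than one category (otherwise lambda is undefined).
    """
    errors_without_x = len(y) - max(Counter(y).values())
    errors_with_x = 0
    for category in set(x):
        y_in_cat = [b for a, b in zip(x, y) if a == category]
        errors_with_x += len(y_in_cat) - max(Counter(y_in_cat).values())
    return (errors_without_x - errors_with_x) / errors_without_x


# Hypothetical examples
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
print(goodman_kruskal_lambda(list("aaabbb"), ["yes", "yes", "no", "no", "no", "yes"]))
```

Note that lambda has no direction, because nominal categories have no order. That is why the direction of the relationship is only described "when applicable".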

Notice that the three correlations you report could be very different depending on the types of variables you are working with.

Regression and Multiple Regression

Now you will build a series of regression models. Before you begin keep the following in mind:
-Your outcome variable MUST be interval-ratio.
-The interpretation of the regression coefficients depends upon the type of variable you are using and its coding.
-For categorical predictors you might want to do some recoding. If you recode any variables, make sure that you SAVE your data and that you describe how you recoded in your homework.

Part 1.
Estimate a regression model with a single predictor. Interpret the regression coefficient and its p-value, the intercept, and the R².
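As a reminder of what the single-predictor output means, the sketch below computes the intercept, slope, and R² by hand in Python. The data (years of education and hourly wage) and the function name simple_ols are hypothetical, and the p-value is left to your regression output.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((b - my) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return intercept, slope, 1 - ss_resid / ss_total


# Hypothetical data: years of education (x) and hourly wage (y).
education = [10, 12, 12, 14, 16, 16, 18]
wage = [11.0, 14.5, 13.0, 17.0, 21.5, 19.0, 24.0]

intercept, slope, r_squared = simple_ols(education, wage)
print(f"intercept={intercept:.2f} slope={slope:.2f} R2={r_squared:.3f}")
```

The slope is the predicted change in the outcome for a one-unit increase in the predictor. The intercept is the predicted outcome when the predictor equals zero. R² is the share of the outcome's variance that the model accounts for.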

Part 2.
Add another predictor variable to the model you estimated in Part 1. Describe any change in the coefficient of the original variable (and its associated p-value) and interpret the coefficient and p-value of the new variable. Note any change in the R².

Part 3.
Add a third predictor variable. Describe any changes in the coefficients and p-values for the variables you entered into the previous models. Interpret the coefficient and p-value for your new variable. Note any change in the R².

Effect size, prediction, and diagnostics

There are multiple ways to think about "effect size" in a multiple regression context. In this section:

1) Use the listcoef command after your regression models to obtain standardized coefficients. Briefly interpret the standardized coefficients using 2-3 sentences.

2) Now, calculate some "effect sizes" as I show in the video and the notes. The way that you do this section will depend upon the types of variables that you have. For categorical variables it probably makes most sense to calculate an effect at the mode. For interval-ratio variables you might want to use the 25th and 75th percentiles. Do whatever makes sense for the type of data you are working with.

Take a few sentences to describe what you have calculated. Which variable appears to be the most important now?
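One way to read the 25th/75th percentile suggestion is: multiply the variable's unstandardized coefficient by the distance between its 75th and 25th percentiles. The Python sketch below does that. It is an assumption about the intended calculation, not the method from the video or notes; the data and the coefficient b_age are made up, and percentile definitions differ slightly between packages.

```python
import statistics

# Hypothetical values of an interval-ratio predictor and a hypothetical
# unstandardized regression coefficient for it.
age = [22, 25, 31, 34, 38, 41, 47, 52, 58, 63, 70]
b_age = 0.15

# With n=4, statistics.quantiles returns the three quartile cut points.
p25, _, p75 = statistics.quantiles(age, n=4, method="inclusive")

# Predicted change in the outcome when age moves from its 25th to its
# 75th percentile, holding the other predictors fixed.
effect_iqr = b_age * (p75 - p25)
print(f"p25={p25} p75={p75} effect={effect_iqr:.3f}")
```

Because each effect is expressed in units of the outcome, effects computed this way can be compared across predictors measured on very different scales.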

Predicted values:

1) Create 2-4 "archetypes" or "representative cases" and calculate predicted values for those cases. How you do this will depend upon what types of variables you are working with. Please show your work and explain your archetypes.
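A predicted value for an archetype is the intercept plus each coefficient multiplied by the archetype's value on that predictor. The Python sketch below shows the arithmetic with made-up coefficients and made-up archetypes; substitute the coefficients from your own final model.

```python
# Hypothetical coefficients from a fitted model:
# wage = _cons + b_education * education + b_female * female + b_union * union
coefficients = {"_cons": 2.0, "education": 1.2, "female": -1.5, "union": 2.5}

# Hypothetical archetypes: each gives a value for every predictor in the model.
archetypes = {
    "non-union man, 12 years of education": {"education": 12, "female": 0, "union": 0},
    "union woman, 16 years of education": {"education": 16, "female": 1, "union": 1},
}


def predict(coefs, case):
    """Intercept plus the sum of coefficient * value over the predictors."""
    return coefs["_cons"] + sum(coefs[name] * value for name, value in case.items())


predictions = {label: predict(coefficients, case) for label, case in archetypes.items()}
for label, value in predictions.items():
    print(f"{label}: predicted wage = {value:.2f}")
```

Writing out each archetype's values this way also serves as the "show your work" part of the answer.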

Residuals

1) Calculate the residuals from your final regression model. Plot those residuals on a histogram and cut and paste the histogram into your word file. Do the residuals follow a normal distribution?

Heteroscedasticity

1) Plot your residuals against your fitted values using a scatter plot (paste the scatter plot into your word file). Do you see visual evidence of heteroscedasticity?

2) Test for heteroscedasticity using the Breusch-Pagan / Cook-Weisberg test. What does this test tell you? Make sure you paste the output of the test into your Word file.
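To see what the test is doing, here is a rough Python sketch of the classic score form of the Breusch-Pagan statistic: scale the squared residuals by their mean, regress them on the fitted values, and take half the explained sum of squares as a chi-square statistic with 1 degree of freedom. This is a sketch under assumptions. Your software's defaults may differ in detail, the function name is hypothetical, and the fitted values and residuals are made-up numbers rather than output from a real model.

```python
from math import erfc, sqrt


def breusch_pagan(fitted, residuals):
    """Score test of constant variance against variance that depends on the
    fitted values. Returns (chi2_statistic, p_value) with 1 degree of freedom."""
    n = len(residuals)
    sigma2 = sum(e ** 2 for e in residuals) / n
    g = [e ** 2 / sigma2 for e in residuals]   # scaled squared residuals, mean 1
    mean_f = sum(fitted) / n
    mean_g = sum(g) / n
    sff = sum((f - mean_f) ** 2 for f in fitted)
    sfg = sum((f - mean_f) * (v - mean_g) for f, v in zip(fitted, g))
    explained_ss = sfg ** 2 / sff              # ESS from regressing g on fitted
    statistic = explained_ss / 2
    p_value = erfc(sqrt(statistic / 2))        # upper tail of chi-square, 1 df
    return statistic, p_value


# Made-up numbers in which the residual spread grows with the fitted values.
fitted = [1, 2, 3, 4, 5, 6]
residuals = [0.1, -0.1, 0.6, -0.6, 2.0, -2.0]

stat, p = breusch_pagan(fitted, residuals)
print(f"chi2(1) = {stat:.2f}, p = {p:.3f}")
```

The null hypothesis is constant variance, so a small p-value is evidence of heteroscedasticity.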

Multicollinearity

1) Calculate VIFs for your model and paste the output into your Word file. Does your model have multicollinearity problems?
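The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors in the model. The Python sketch below shows the formula and applies it to the simplest case of two predictors, where that R² is just their squared correlation. The data and function names are hypothetical; for a model with three predictors, rely on the VIF output from your software.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two interval-ratio variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_from_r_squared(r_squared_j):
    """VIF for predictor j, given the R-squared from regressing predictor j
    on all of the other predictors."""
    return 1.0 / (1.0 - r_squared_j)


# Two-predictor special case with made-up data: the auxiliary R-squared
# equals the squared correlation between the two predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

r = pearson_r(x1, x2)
vif = vif_from_r_squared(r ** 2)
print(f"r = {r:.3f}, VIF = {vif:.2f}")
```

A VIF of 1 means the predictor is uncorrelated with the others. Larger values mean more of its variance is shared with them. Use the threshold from your course notes to decide whether that is a problem.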
