Explain relationship between price and variables

Reference no: EM131031747

Statistical Models and Methods

The Data

Data are available on the recommended prices of used cars in the United States. All cars are the same age, but have done different mileages and have different specifications. You have recently been employed by a used car dealership to build models to describe the dependence of recommended prices on potential explanatory variables, in order to use these models to price your own used cars. The data, which come in two parts, are available on Moodle. They are:

- TrainData.txt: Training data, which will be used to build models.

- TestData.txt: Test data, which will be used to assess predictions from the models built.

They can be read into R (after saving the files in your working directory) using

Train = read.table("TrainData.txt", header = TRUE)

Test = read.table("TestData.txt", header = TRUE)

A description of the variables can be found in the file description.txt.

After reading in the data, you can look at the structure of the data (number of observations, variable types, etc.) using the str() command, e.g. str(Train). For both data sets, you should treat the covariates Cylinder, Doors, Cruise, Sound and Leather as factors (they are treated as integers by default). This can be done using, for example,

Train$Cylinder = factor(Train$Cylinder)
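
The same conversion has to be applied to all five covariates in both data sets. A minimal sketch of one way to do this is given below. The helper as_factor_cols and the small data frame toy are hypothetical and exist only so that the example runs on its own; with the real files you would call the helper on Train and Test instead.

```r
# Factor covariates named in the assignment text.
factor_cols <- c("Cylinder", "Doors", "Cruise", "Sound", "Leather")

# Hypothetical helper: convert the listed columns of a data frame to factors.
as_factor_cols <- function(df, cols = factor_cols) {
  df[cols] <- lapply(df[cols], factor)
  df
}

# Tiny made-up data frame, for illustration only. With the real data use
#   Train <- as_factor_cols(Train); Test <- as_factor_cols(Test)
toy <- data.frame(
  Price    = c(17000, 21500, 30250, 15800),
  Cylinder = c(4L, 6L, 8L, 4L),
  Doors    = c(4L, 4L, 2L, 4L),
  Cruise   = c(0L, 1L, 1L, 0L),
  Sound    = c(1L, 1L, 0L, 1L),
  Leather  = c(0L, 1L, 1L, 1L)
)
toy <- as_factor_cols(toy)

# Check that the five covariates are now factors.
str(toy)
```

If a factor level occurs in only one of the two files, predict() will stop with an error about new levels, so it may be worth comparing levels(Train$Cylinder) with levels(Test$Cylinder) (and likewise for the other factors) before fitting.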

The Task

(a) Using the TRAINING data, investigate models to explain the relationship between Price and the other variables. That is, Price (or transformations of it) is to be the response variable, and all other variables are potential explanatory variables.

(b) Use your fitted model(s) from (a) to predict the responses for the observations in the TEST data set. That is, for each of the observations in the Test data, use the values of the explanatory variables as input to your model(s) from (a) to obtain fitted/predicted responses for these observations. Compare your predicted responses with the known observed responses from the observations in the Test data, using suitable plots/numerical summaries.
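
Steps (a) and (b) can be sketched as follows. This is only one possible workflow, not the expected answer. The data are simulated so that the example runs without the course files; the column name Mileage, the function make_toy and the choice of log(Price) as the response are all assumptions made for illustration.

```r
set.seed(1)  # simulated data stand in for TrainData.txt / TestData.txt

# Hypothetical generator for a small data set with an assumed Mileage column.
make_toy <- function(n) {
  Mileage  <- runif(n, 5000, 50000)
  Cylinder <- factor(sample(c(4, 6, 8), n, replace = TRUE), levels = c(4, 6, 8))
  Price    <- exp(9.5 - 1e-5 * Mileage + 0.2 * as.integer(Cylinder) +
                  rnorm(n, sd = 0.1))
  data.frame(Price, Mileage, Cylinder)
}
Train <- make_toy(200)
Test  <- make_toy(50)

# (a) One candidate model: log(Price) regressed on all other variables.
fit <- lm(log(Price) ~ ., data = Train)

# (b) Predict the test responses and transform back to the price scale.
pred <- exp(predict(fit, newdata = Test))

# Numerical summaries of the prediction error on the test set.
rmse <- sqrt(mean((Test$Price - pred)^2))  # root mean squared error
mae  <- mean(abs(Test$Price - pred))       # mean absolute error

# Observed versus predicted prices; points close to the line y = x
# correspond to accurate predictions.
plot(pred, Test$Price, xlab = "Predicted price", ylab = "Observed price")
abline(0, 1)
```

Note that exponentiating a prediction made on the log scale estimates the median rather than the mean price, which is one of the points a write-up could comment on when comparing models.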

Notes

- As with any analysis, the first step should be to do some exploratory analysis using any relevant plots and numerical summaries.

- For the model fitting, you can/should use any of the techniques we have covered this semester to investigate potential models. The task is deliberately open-ended, as would be the case in real situations working with real data. As this is a realistic situation with real data, there is not necessarily one single correct answer. Your job is to investigate potential models, and provide a summary of what they tell us about the problem we are trying to solve. The important point is that you correctly use the relevant techniques in a logical and principled manner, and provide a concise but insightful summary of your findings and reasoning. (Note however that you do not have to produce a report in a formal "report" format.)

- You should pay attention to whether the model assumptions are being met, for example using suitable diagnostic plots, and consider any transformations of the numerical variables if appropriate. Also consider whether your conclusions depend on a few outlying or influential points.

- You should (briefly and concisely) interpret your model(s) and consider whether they make sense in the context of the problem, for example via interpreting the fitted parameters.

- You do not need to include all your R output, as you will generate lots of output when experimenting with the model fitting. However, you should include the output which is relevant to the arguments that you make when describing the logical developments of your model fitting, and any diagnostic plots which justify changes you make in order to meet the modelling assumptions. Finally, at all stages please remember to explain your reasoning and describe (concisely but accurately) the action you take and why, along with the relevant output.
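
The checks described in these notes might look roughly like the sketch below. It uses simulated data with an assumed numeric column Mileage, and the 4/n cut-off for Cook's distance is a common rule of thumb rather than something prescribed by the assignment.

```r
set.seed(2)  # simulated data, for illustration only
n <- 150
Mileage <- runif(n, 5000, 50000)  # assumed column name
Price   <- exp(10 - 1.5e-5 * Mileage + rnorm(n, sd = 0.15))
Train   <- data.frame(Price, Mileage)

# Exploratory step: numerical summary and a scatter plot.
summary(Train)
plot(Price ~ Mileage, data = Train)

# Candidate model on the log scale.
fit_log <- lm(log(Price) ~ Mileage, data = Train)

# Standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage.
par(mfrow = c(2, 2))
plot(fit_log)

# Cook's distance flags potentially influential observations.
cd <- cooks.distance(fit_log)
influential <- which(cd > 4 / n)

# Refit without the flagged points to see whether the estimates change much.
keep <- setdiff(seq_len(n), influential)
fit_drop <- lm(log(Price) ~ Mileage, data = Train[keep, ])
rbind(all = coef(fit_log), dropped = coef(fit_drop))

# Interpretation: estimated percentage change in price per extra 10,000 miles.
pct_per_10k <- 100 * (exp(10000 * coef(fit_log)[["Mileage"]]) - 1)
pct_per_10k
```

If the two rows of coefficients are close, the conclusions do not hinge on the flagged observations; a large difference would be worth reporting and investigating.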

