Calculating the least squares regression equation

Reference no: EM131990257

STATISTICAL ANALYSIS PROJECT -

This project leads you through a statistical analysis of residential property data from a given non-capital city or town in Australia. This property data is also compared with property data from another non-capital city or town.

Project Situation

To analyse the real estate market in non-capital cities and towns, Safe-As-Houses Real Estate, a large national real estate company, has collected data from random samples of residential properties for sale in a selection of non-capital cities and towns in States A, B and C.

As a research assistant for Safe-As-Houses Real Estate, you are analysing this data for the town or city specified by your sample. In addition, you compare the price data for this location with price data from another town or city. For example, if your student ID number ends in 8 your sample is Sample 8. That is, you will be analysing the real-estate market in Regional City 1, State B. You will also compare the residential property price data in Regional City 1, State B with the price data for Regional City 2, State A.

In each part of the project, you are required to analyse your sample data in response to given questions and provide a written answer. You can assume that the written answers are components of a longer report on the real estate market in your given city or town.

Data Analysis Project Part A -

Purpose: To

  • introduce you to the project data, situation and Excel
  • use Excel to graph data and calculate summary statistics
  • interpret and communicate Excel results.

Part A Question -

From past research, Safe-As-Houses Real Estate is aware that the majority of first homebuyers purchase properties with three bedrooms.

You are asked to provide information on the price of three-bedroom residential properties for sale in the location and state specified by your sample. In particular, information on the minimum, maximum and average price is required, as is an estimated price range for a three-bedroom property.

Complete the following tasks

1) Download and save your data.

2) Download the Project Part A cover sheets, name and save this file as

"Family Name_First Name_Part_A_Campus".

3) Enter your Sample Number on page 2 of the Part A coversheets.

4) Statistical Tasks

Using Price $000 (first column of your data), explore the prices of three-bedroom residential properties by using Excel to:

  • Construct a frequency histogram or polygon
  • Calculate descriptive statistics

Note: The required data for three bedrooms is in the first rows of your sample. 
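The project itself is to be completed in Excel. As an optional cross-check of the Excel output, the two statistical tasks above can be sketched in Python using only the standard library. The prices below are hypothetical illustrative values, not project data, and the class start, class width and function names are assumptions chosen for this sketch.

```python
import statistics

# Hypothetical prices in $000 for three-bedroom properties (NOT project data).
prices = [310, 325, 340, 355, 360, 375, 390, 410, 425, 480]

def describe(values):
    """Return the summary statistics the task asks for."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

def frequency_table(values, start, width, n_bins):
    """Count values in n_bins equal-width classes [lower, upper)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(n_bins)]

summary = describe(prices)
table = frequency_table(prices, start=300, width=50, n_bins=4)

for lower, upper, count in table:
    print(f"{lower} to under {upper}: {count}")
print(summary)
```

The classes here include their lower limit and exclude their upper limit. Excel's histogram tool may treat class limits differently, so counts for values that fall exactly on a limit should be checked against the Excel output.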

5) Written Task - Component of a Longer Report

Using the instructions given on page four of the Part A coversheets, introduce your data and the results of your investigation of prices of three bedroom properties for sale in the location and state specified by your sample.

This should be one to three pages and 300 to 500 words.

Use an appropriate style, without statistical jargon and equations, to clearly communicate your results.

6) Complete Coversheets 1 and 2, save and submit Part A of the project online using Project Part A link in Submit Project by the due date Tuesday 20 March 2018.

Data Analysis Project Part B -

Tasks

Task 1 Part A Self-Marking -

When directed to do so during Week 5 complete the following tasks

1) Open your saved copy of your submission for Part A.

2) Replace the Part A coversheets (three pages) with the Part B coversheets (first four pages).

3) Rename and save this file as

"Family Name_First Name_Part_B_Campus".

4) Use the solution template and marking guide provided to mark your submission for Part A. Enter recommended marks on the self-marking sheet for Part A, page 3 of the file in 3) above.

5) Write a short (approximately 200 words) reflection/feedback on your submission and marking of Part A. In particular:

  • consider the good aspects of your submission, what did you do well
  • identify where you made mistakes, and how you would avoid them in the future
  • consider what you learnt from submitting and marking Part A.

This is to be entered in the space at the bottom of the self-marking sheet for Part A.

6) Save file. This is to be submitted with Part B - due Sunday 29 April 2018.

Task 2 Part B Appendix - Statistical Inference Tasks

The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.

These appendices should come after your written answer within your single Word document for Part B.

In preparing your appendices you may use one of the following formats:

  • Word with Excel output added.
  • Handwritten with Excel output added. This will then need to be scanned and added to your Word document.

Statistical Inference

Choose a level of significance for any hypothesis tests and a level of confidence for any confidence intervals. Enter these values on page 2 of the Part B coversheets along with the sample number from Part A.

Question 1 - Topic 5

Older buyers are often looking to downsize, moving from a four or more bedroom house to a smaller two or three bedroom unit.

Explore if older buyers wishing to downsize have a reasonable choice of units to choose from by using the Type data (6th column of your data) for ALL 125 residential properties for sale and an appropriate statistical inference technique to answer the following question

  • What proportion of residential properties for sale, in the location and state specified by your sample, are units?
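One technique that may suit this question is a confidence interval for a population proportion. The following is a minimal Python sketch of the usual large-sample (normal approximation) interval, offered as a cross-check of Excel work rather than a replacement for it. The count of 40 units out of 125 properties is hypothetical, and the 95% level of confidence is an assumption; use the level you entered on the coversheets.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# Hypothetical count (NOT project data): 40 units among 125 properties.
units, n = 40, 125

# The normal approximation is usually considered reasonable when both
# n * p_hat and n * (1 - p_hat) are at least 5.
assert units >= 5 and n - units >= 5

p_hat, lower, upper = proportion_ci(units, n, confidence=0.95)
print(f"Sample proportion: {p_hat:.3f}")
print(f"95% interval: {lower:.3f} to {upper:.3f}")
```

In a written answer, the interval would be reported in plain language, for example as a range of percentages of properties for sale that are units.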

Question 2 - Topic 6

From past research, Safe-As-Houses Real Estate is aware that many potential buyers consider a non-capital city or town too expensive if the average house price is more than half a million dollars.

Explore if potential buyers would consider house prices in the location and state specified by your sample too expensive by using the Price $000 data (first column of your data) for ALL houses for sale and an appropriate statistical inference technique to answer the following question

  • In the location and state specified by your sample, is the mean house price more than $500,000?
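A question of this form is commonly handled with a one-sample, right-tailed t-test of H0: mean price ≤ 500 against H1: mean price > 500 (in $000). The sketch below, in Python, computes the test statistic and applies the decision rule. The ten house prices are hypothetical, the 5% level of significance is an assumption, and the critical value is taken from t-tables (Excel's T.INV(0.95, 9) should give the same figure).

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses n - 1
    return (mean(values) - mu0) / standard_error, n - 1

# Hypothetical house prices in $000 (NOT project data).
house_prices = [520, 480, 610, 550, 495, 530, 575, 505, 640, 545]

t_stat, df = one_sample_t(house_prices, mu0=500)

# Right-tailed critical value for alpha = 0.05 and df = 9, from t-tables.
t_crit = 1.833
reject = t_stat > t_crit

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

With the full sample of houses the degrees of freedom, and therefore the critical value, will differ; the assumption that prices are approximately normally distributed (or that the sample is large) should also be stated.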

Task 3 - Part B Written Task - Components of a report

For each question, present the results of your calculations, with your interpretation and conclusion as components of a longer report on the residential property market.

Use the instructions given on page five of the Part B coversheets.

This should be one to three pages and 200 to 400 words.

It should be submitted as a Word file with Excel output included.

Make sure you:

  • Introduce each question and put it in context.
  • Answer the question in non-statistical language
  • Present the results of your intervals or tests without unnecessary statistical jargon
  • Include conclusions which answer the given questions.

Data Analysis Project Part C -

Task 1 Part C - Appendix Statistical Inference and Regression and Correlation Tasks

The following statistical tasks should appear as appendices to your written answer. This should include all necessary steps and appropriate Excel output.

These appendices should come after your written answer within your single Word document for Part C.

In preparing your appendices you may use one of the following formats:

  • Word with Excel output added.
  • Handwritten with Excel output added. This will then need to be scanned and added to your Word document.

Choose a level of significance for any hypothesis tests. Enter this value on page 2 of the Part C cover sheets along with the sample number from Part A.

Use your sample and appropriate statistical inference and regression and correlation techniques to answer the following questions.

Question 1 Statistical Inference Topic 7

Safe-As-Houses Real Estate is comparing residential property prices in different locations. In particular, they are interested in whether there is a difference in average price between two given locations.

You are required to decide if there is a difference in average price between the residential properties for sale in the location and state specified by your sample and those in the location and state specified in the last column of your data.

For example, if your student ID number ends in 2 you will be comparing residential property prices in Coastal City 1 State A with those in Coastal City 1 State B.

To provide a justified decision use Price $000 (first column of your data) and Location X State Y Price $000 (last column of your data) for ALL 125 residential properties for sale in each sample, with an appropriate statistical inference technique to answer the following question.

  • Is there a difference in the mean price of residential properties for sale in the two locations?
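Because the two samples come from different locations, a two-tailed test for the difference between two independent means is a natural candidate. The Python sketch below uses the unequal-variances (Welch) form of the t statistic; that choice is an assumption made here, and your study guide may instead call for the pooled-variance form. The two short price lists are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """t statistic and approximate df for H0: mean(a) = mean(b),
    without assuming equal population variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical prices in $000 for two locations (NOT project data).
location_1 = [400, 420, 440, 460, 480]
location_2 = [300, 350, 400, 450, 500]

t_stat, df = welch_t(location_1, location_2)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

For a two-tailed test, the absolute value of t is compared with the two-tailed critical value for the chosen level of significance (for example from Excel's T.INV.2T), and H0 is rejected if it is larger.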

Questions 2 and 3 Simple and Multiple Linear Regression

Safe-As-Houses Real Estate is interested in developing a model to predict the price of a residential property for sale.

To develop such a model, first develop a simple linear regression model to predict price from internal area and then a multiple linear regression model to predict price from internal area, number of bedrooms and whether the property is a unit or a house. Finally choose, or construct, and then interpret the linear model that best fits your data.

Question 2 Simple Linear Regression Model Topic 8

To explore the relationship between the internal area of a residential property and its price, use Internal Area m^2 (independent variable - second column of your data) and Price $000 (dependent variable - first column of your data) for all 125 residential properties for sale in your sample. Using this data, develop and then explore a simple linear relationship between the two variables by:

  • Plotting the data with a scatter plot.
  • Calculating the least squares regression equation, correlation coefficient and coefficient of determination.
  • Interpreting the gradient and vertical intercept of the simple linear regression equation.
  • Interpreting the correlation coefficient and coefficient of determination. Are these values consistent with your scatter plot?
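The calculations in the second bullet point can be sketched in Python from the usual sums of squares, as a cross-check of Excel's regression output. The five (area, price) pairs are hypothetical and the function name is an assumption made for this sketch.

```python
from math import sqrt

def simple_linear_regression(x, y):
    """Least squares line y = intercept + slope * x, with r and r squared."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r, r ** 2

# Hypothetical data (NOT project data): internal area in m^2, price in $000.
area = [80, 100, 120, 150, 200]
price = [300, 340, 390, 450, 560]

slope, intercept, r, r2 = simple_linear_regression(area, price)
print(f"Price = {intercept:.2f} + {slope:.3f} x Area")
print(f"r = {r:.4f}, r squared = {r2:.4f}")
```

The gradient is read as the estimated change in price ($000) for each additional square metre of internal area, and the vertical intercept as the estimated price at an internal area of zero, which may have no practical meaning if zero lies outside the range of the data.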

Question 3 Multiple Linear Regression Model Topic 9

To explore what other factors may influence the price of a residential property for sale, use Internal Area m^2, Bedrooms and Type (three independent variables - second, third and sixth columns of your data) and Price $000 (dependent variable - first column of your data) for all 125 residential properties for sale in your sample. Using this data, develop and then explore the relationship between these four variables by:

  • Calculating the multiple regression equation, multiple correlation coefficient, and coefficient of multiple determination.
  • Interpreting the values of the multiple regression coefficients.
  • Interpreting the values of the multiple correlation coefficient and coefficient of multiple determination. Compare these values with the corresponding values for the simple linear regression model.

Then determine the best model to predict the price of a residential property for sale by:

  • Using appropriate tests to determine which independent variables make a significant contribution to the regression model.
  • Using the results of the above tests to give or calculate the simple or multiple regression equation which best fits the data.
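Type is a categorical variable, so it has to be coded as a number before it can enter a regression. The Python sketch below assumes the coding House = 1, Unit = 0 and fits the multiple regression by solving the normal equations. The eight rows are hypothetical, and their prices were generated from a known equation (100 + 2 x area + 10 x bedrooms + 50 x house) purely so the fit can be checked; real data will not fit exactly. The significance tests for individual coefficients are not reproduced here and would be read from Excel's regression output.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple_regression(rows, y):
    """Least squares coefficients (intercept first) and R squared."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(X[0])
    xtx = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, xi)) for xi in X]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return coef, 1 - sse / sst

# Hypothetical rows (NOT project data): (area m^2, bedrooms, type).
raw = [(80, 2, "Unit"), (100, 2, "Unit"), (120, 3, "Unit"), (150, 3, "House"),
       (180, 4, "House"), (200, 4, "House"), (110, 3, "House"), (160, 4, "Unit")]
rows = [(a, b, 1 if t == "House" else 0) for a, b, t in raw]
price = [280, 320, 370, 480, 550, 590, 400, 460]  # $000, from the known equation

coef, r2 = fit_multiple_regression(rows, price)
print("intercept, area, bedrooms, house:", [round(c, 3) for c in coef])
print(f"R squared = {r2:.4f}")
```

With this coding, the coefficient of the Type variable is interpreted as the estimated difference in price ($000) between a house and a unit with the same internal area and number of bedrooms.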

Task 2 - Written Answer - Components of a report

For Question 1, and for Questions 2 and 3 combined, present the results of your calculations, with your interpretation and conclusions, as components of a longer report on the residential property market.

Use the instructions given on page four of the Part C coversheets.

This should be 300 to 700 words and three to six pages.

It should be submitted as a Word file with Excel output embedded.

Make sure you:

  • Introduce each question and put it in context
  • Answer the questions in non-statistical language.
  • Present the result of your calculations and tests without unnecessary statistical jargon
  • Include conclusions which answer the given questions.

In particular, for Question 2

  • Explain the choice of independent and dependent variables
  • Include your scatter plot and discuss any apparent relationship between internal area and price. Comment on the strength, shape and sign of the relationship.

In particular, for Questions 2 and 3

  • Include and justify the best model.
  • Discuss and interpret the values of the regression and correlation coefficients of the best model.

Note - Only Part C needs to be done.

Attachment:- Assitgnment File.rar

Project Preparation - You are expected to use Excel when completing the project. Your written answers presenting your findings and conclusions should be considered as part of a larger report on the real estate market in your given city or town. Each written answer should be a Word document into which your Excel output has been copied. In addition, your statistical workings for Parts B and C should appear as appendices to your written answers. These should include all necessary steps and appropriate Excel output. Each part of the project should be submitted as a single Word document.

In preparing your appendices you may use one of the following formats: Word with Excel output added, or handwritten with Excel output added. A handwritten appendix will then need to be scanned and added to your Word document.

Notes - You should not need to read beyond the study guide and textbook to complete the project.

Referencing - You are not required to reference. However, as your written answer is formatted as a component of a longer report, it may be appropriate to reference. In this case, use any consistent referencing style.

Project Submission - Each part of the project should be a SINGLE Word file with Excel output included. The given cover sheets should be the first pages of your submitted project and are not part of the page limit. DO NOT submit your appendices, which are not part of the page limit, for either Part B or C as separate files. Ensure that the page setup of your submitted document is A4 Portrait, with an appropriate format so that it is easily readable if printed. Use line spacing of at least 1.5. Please name your file "Family Name_First Name_Part_A/B/C_Campus", for example, Jayne_Nicola_Part_A_Lismore.

Penalties

Incorrect Sample - If you use a sample that does not correspond to the last digit of your student ID number, to be entered on the cover sheet, a maximum of two marks may be deducted, as this causes the marker extra work and frustration.

Incorrect Format - If the page setup of your submitted Word file is not as required (that is, A4 Portrait, with an appropriate format so that it is easily readable if printed, and at least 1.5 line spacing), or your project is not submitted as a single Word document, a maximum of two marks may be deducted, as this causes the marker extra work and frustration.

In addition, if your file is not named as requested, or the required cover sheets are not included or correctly completed, a maximum of two marks may also be deducted, as this can cause the marker extra work and frustration.

Part C Preparation - While the submission date for Part C is Sunday 20, you should be working on Part C during Weeks 9 to 11. It is recommended that you follow this timetable: Question 1 covering Topic 7 should be attempted in Week 9, Question 2 covering Topic 8 in Week 10, and Question 3 covering Topic 9 in Week 11.

Notes: You may need to transform or manipulate the given data, before using Excel for the corresponding statistical calculations. Use Excel for the statistical calculations. You do not need to repeat any Excel calculations by hand. However, make sure that you define your random variables and include any steps not given by Excel. For example, in a hypothesis test include the null and alternative hypotheses, along with the decision to reject or not reject the null hypothesis. Mention any assumptions you need to make.

In Question 2 fit a linear model even if from your scatter plot you decide that a non-linear relationship better fits the data or that no apparent relationship exists. However, mention this in your written answer and/or corresponding appendix. In Question 3 while there may be interaction between independent variables, you are not required to add interaction terms to your model or test for interaction. Similarly, in Question 3 while there may be collinearity of pairs of independent variables, you are not required to consider this or calculate a variance inflation factor (VIF). Comment on why a test has been chosen. Make sure you write conclusions to hypothesis tests.
