Reference no: EM133490395
Data Analytics for Business
Case Study: Analysis of NSW Valuer General Land Value Information
Task
This assignment will use data similar to our Case Study introduced in Topic 2 and will assess your understanding of the content in Topics 1-5.
You will recall that the NSW Valuer General has a database containing information on all registered properties in New South Wales. The Department provides access to this information by allowing downloads of information extracted from its database. You will be required to investigate and answer questions on a subset of this data, which is held in the file "Landvalues QBM100 202360 Assignment 1". You will also need to refer to the User Guide for the Valuer General's database: "202360 Land Value data file User Guide.pdf".
In completing this task you will be using Excel to manage, extract and interpret information from a relatively large file similar to situations you may well encounter in a professional environment.
Rationale
This assessment task will assess your progress toward achieving the following learning outcomes:
• be able to discuss the basic concepts relating to data analytics (including big data);
• be able to recognise the various types and sources of data;
• be able to extract value from data sets for informed decision making;
• be able to identify and determine the appropriate statistical technique to analyse data;
• be able to apply industry software tools to summarise and analyse data;
• be able to present and communicate relevant decisions arising from the analysis of data.
Presentation
The assignment may be typed or neatly handwritten with any Excel output inserted where required at the appropriate place in the assignment not in an Appendix at the back of the assignment. Marks will be deducted for assignments which do not follow these guidelines.
The assignment is to be uploaded to EASTS as a single PDF file. Assignments submitted in non-printable formats such as a ZIP file or as a collection of images will not be marked.
Pages must be numbered, and your name and student number must be included on every page.
Question 1: Initial Investigation of the data set and variables
As the business manager of a large real estate agency you are deciding if this data set is of value to the business in describing and appraising properties in the Albury district, and by extension, other districts across NSW. Use the data set and the "User Guide" to answer these preliminary questions:
a. State the number of variables in the dataset.
b. State the number of records in the data set.
c. Categorise the data set as being either cross-sectional or time series and justify your decision.
d. List all the values taken by the variable "AUTHORITY 2"
e. Find the number of properties that have no value entered for the variable "HOUSE NUMBER". Show the Excel function you used to find this.
f. The variable "DISTRICT CODE" takes the value "50" (or "050") because all properties are in the Albury district. Find the value "DISTRICT CODE" would have if the "DISTRICT NAME" was "Snowy River".
g. In a 2010 data set a particular property was identified as having a ZONE CODE of "O". If the zoning for that property has not changed since that time, then state the possible "ZONE CODE" that could now be allocated to the property.
h. State the variable type and scale of measurement of:
• "LANDVALUE 1"
• "ZONE CODE"
i. The Variable PROPERTY ID is described as a unique numeric identifier applied to every property in NSW. Explain in statistical terms why this variable should not be treated as a number, even though it is described as "numeric".
Question 2: Investigating the variable "LAND VALUE 1"
You are interested in the extent to which the data can be used to understand and communicate the assessed value of properties in the Albury district. As such you focus on the variable "LAND VALUE 1", where the large number of values make it difficult to "understand" the situation. Use the data set to answer the following questions, which are central this task:
a. State the date at which this value was established.
b. Create a Descriptive Statistics table for "LAND VALUE 1".
c. Use upper class limits of $50,000, $100,000, $150,000, , $500,000 to create a histogram for "LAND VALUE 1". Include a sensible title and axes labels.
d. Describe the shape of the histogram and justify this by referring to the histogram (one aspect) and the Descriptive Statistics table. (two aspects)
e. Give the name of the statistic (and its value) that would best represent the "typical value" of these properties. Justify your choice.
f. Calculate the Coefficient of Variation for "LAND VALUE 1". What does this statistic say about the values taken by the variable?
Question 3: Understanding the Location of Business Zones
An investor from Sydney is considering establishing a manufacturing business in Albury that requires land with an "Industrial" zoning. The first important step to inform this decision is to get a better understanding of the location of the Industrial Zones within the Albury LGA. Answer the following questions, which are the basis for providing relevant information to the investor.
a. Use the pivot table functionality in Excel to generate a pivot table of "SUBURB NAME" within "POSTCODE" against "ZONE CODE". [Check you have a grand total of 21,927 properties]
b. Identify the three suburbs with the highest number of properties zoned "industrial".
c. For the two suburbs with the highest number of "industrial" zoned properties, state the suburb with the highest proportion of these properties zoned as "Light Industry"? Show the calculation of this percentage.
d. The investor would like to purchase an "industrial" property that is not in close proximity to environmentally sensitive areas. Based upon this (admittedly scant) criteria determine a suburb/s you might recommend. Justify this choice.
e. The investor is also looking to purchase a home on a relatively large block and preferably in a village setting. Use the information in your pivot table to identify two suburbs that would be of potential interest. Determine which suburb might be the best place to begin the search, justifying your choice.
Question 4: Investigating the relationship between property area and land value
A strategic property investor is seeking your advice about the relationship between the area of properties and their assessed value. You undertake some background research in preparation for your meeting with the client as outlined in the questions below - after firstly creating a subset of the data to have only the variables "AREA" and "LAND VALUE 1".
a. Use this subset to find the correlation between these two variables and show your Excel command.
b. Construct a scatterplot of "LAND VALUE 1" (vertical) against "AREA" (horizontal). Insert a sensible title and axis labels.
c. Describe what the scatterplot is saying about firstly, the values taken by these two variables (individually) and secondly, the relationship between them.