Evaluate and apply data mining software

Reference no: EM133743747, Length: 1600 words

Data Acquisition and Management

Assessment - Sampling and data mining project

Learning outcome 1: Create analysis-ready data sets by applying and exploring basic validation, preprocessing, filtering and cleaning techniques

Learning outcome 2: Evaluate and apply data mining software

Assessment Description

Business Problem: Airbnb is a U.S. company that provides an online marketplace for short-term and/or holiday accommodation. Airbnb collects large volumes of data to gain insight into its clients and associated customers, such as review scores, host acceptance rates, 'superhosts', popular accommodation types and the density of listings in particular locations.

Data sets: We have obtained data on Airbnb listings in Melbourne with a variety of variables. Sampled datasets, the original data and data dictionary will be available from Week 4.

Assessment Instruction

Recall the sampling methods below that you have learnt about in lectures.

A data dictionary file and datasets (as .csv files) containing sample data generated using quota, systematic, simple random, and stratified sampling will be available from Week 4 (see section c. below). You will also need to access the original population dataset cleansed_listings_dec_18.csv from the source (see sections a. and e. below).
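For orientation, the four sampling methods named above could be sketched as follows. This is a minimal illustration in plain Python, not the procedure used to generate the supplied sample files. The helper names, the toy records and the use of `room_type` as the stratification and quota field are all assumptions made for the example.

```python
import random
from collections import defaultdict

def simple_random_sample(rows, n, seed=0):
    """Draw n rows uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)

def systematic_sample(rows, n, seed=0):
    """Take every k-th row after a random start, where k = len(rows) // n."""
    k = len(rows) // n
    start = random.Random(seed).randrange(k)
    return rows[start::k][:n]

def stratified_sample(rows, n, key, seed=0):
    """Sample each stratum at random, roughly in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for group in strata.values():
        share = max(1, round(n * len(group) / len(rows)))
        sample.extend(rng.sample(group, min(share, len(group))))
    return sample

def quota_sample(rows, quotas, key):
    """Non-random: take rows in file order until each group's quota is filled."""
    counts = defaultdict(int)
    sample = []
    for row in rows:
        group = row[key]
        if counts[group] < quotas.get(group, 0):
            sample.append(row)
            counts[group] += 1
    return sample

# Toy population (hypothetical values, not the Airbnb data):
# every fourth record is a private room, the rest are entire homes.
population = [
    {"id": i,
     "room_type": "Entire home/apt" if i % 4 else "Private room",
     "price": 50 + i}
    for i in range(100)
]

srs = simple_random_sample(population, 20)
sys_s = systematic_sample(population, 20)
strat = stratified_sample(population, 20, key="room_type")
quota = quota_sample(population, {"Entire home/apt": 5, "Private room": 5},
                     key="room_type")
```

Note that the quota sample depends entirely on the order of the rows, which is one reason its summary statistics can drift from those of the population.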

Create a report and include your response to the following questions:

a. Access the data file cleansed_listings_dec_18.csv by going to the link provided on MyKBS under the Assessment 1 tab. You will initially download a zip folder from the Melbourne Airbnb Open Data project on Kaggle. Extract all the files within the folder and then choose the file cleansed_listings_dec_18.csv. Browse the columns and comment on which variables appear to be the most useful for gaining insight into current listings. Document this in your report. (150 words)

b. List an advantage, a possible disadvantage and the limitations of each of the sampling methods. (150 words)

c. Access the sampled datasets on MyKBS. Choose a number of different variables, as in part (a), then for each of the sampled datasets create summary statistics for each of those variables. Make sure that the selected variables are the same for each of the four datasets, and document them in your report. (300 words)

d. Interpret and compare the results of the summary statistics across all four sample datasets. What conclusions can you draw from the comparison? Document your findings in your report. (500 words)

e. Repeat the above for the original dataset cleansed_listings_dec_18.csv. Explain, with statistical examples, which sampling method's summary statistics (across all chosen variables) were nearest in value to the summary statistics of the original dataset.

Explain the variations in your report and include the supporting data. Explain possible ethical issues that could arise from the use of sampled data.

Briefly evaluate the software that you have used to produce the summaries. (500 words)
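The comparison of sample and population summary statistics described above could be organised along the lines of the sketch below. It computes the same statistics for one variable in each sample and in the population, then ranks the samples by how far their mean lies from the population mean. It uses only the Python standard library. The `summarise` and `closest_sample` helpers and all price values are hypothetical, so substitute the variables you chose and whichever software you are evaluating.

```python
import statistics

def summarise(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def closest_sample(population_stats, sample_stats, stat="mean"):
    """Rank samples by absolute difference from the population on one statistic."""
    gaps = {
        name: abs(stats[stat] - population_stats[stat])
        for name, stats in sample_stats.items()
    }
    return sorted(gaps.items(), key=lambda item: item[1])

# Hypothetical nightly prices; in practice, read one column from each .csv file.
population_prices = [80, 95, 100, 120, 150, 160, 200, 250, 300, 545]
samples = {
    "simple_random": [95, 150, 250, 545],
    "systematic": [80, 120, 200, 300],
    "stratified": [100, 150, 200, 300],
    "quota": [80, 95, 100, 120],
}

population_stats = summarise(population_prices)
sample_stats = {name: summarise(values) for name, values in samples.items()}
ranking = closest_sample(population_stats, sample_stats)

for name, gap in ranking:
    print(f"{name}: mean differs from population by {gap:.2f}")
```

Because the values are made up, the resulting order says nothing about which method performs best on the Airbnb data. Repeating the ranking for the median or standard deviation (via the `stat` argument) and for each chosen variable gives the "across all chosen variables" view the brief asks for.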





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd