CSCE 822 - Data Mining and Warehousing Assignment

Reference no: EM132400011

CSCE 822 - Data Mining & Warehousing Assignment - University of South Carolina, USA

The attached melb_data.csv file is a snapshot of Tony Pino's Melbourne Housing Dataset. Perform the following data preprocessing steps, then apply the KNN and RandomForest algorithms to classify the property prices.

1. Fill the missing values in the dataset using the imputation approaches discussed in class. You can use the SimpleImputer class from scikit-learn's impute module:

from sklearn.impute import SimpleImputer

my_imputer = SimpleImputer()

data_with_imputed_values = my_imputer.fit_transform(original_data)

The default imputer uses the mean of each column to fill the missing values. You can try other imputation methods as well.
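As a rough illustration of trying more than one imputation method, the sketch below compares the mean and median strategies of SimpleImputer. The toy DataFrame, its column names, and the helper `impute` are assumptions made for the example; they stand in for the numeric columns of melb_data.csv and are not part of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy numeric data (assumed for illustration; not taken from melb_data.csv)
original_data = pd.DataFrame({
    "Rooms": [2, 3, np.nan, 4],
    "BuildingArea": [80.0, np.nan, 120.0, 400.0],
})

def impute(df, strategy="mean"):
    """Hypothetical helper: fill NaNs column by column with the given strategy."""
    imputer = SimpleImputer(strategy=strategy)
    values = imputer.fit_transform(df)
    # fit_transform returns a NumPy array, so restore the column labels
    return pd.DataFrame(values, columns=df.columns, index=df.index)

mean_filled = impute(original_data, "mean")
median_filled = impute(original_data, "median")
```

With skewed columns such as building area, the median is less sensitive to a few very large values than the mean, which is one reason the two strategies can lead to different classification results.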

2. Replace the categorical/nominal attributes with a one-hot encoding.

You can use the Category Encoders package, which is designed for use with scikit-learn in Python.

For more data encoding approaches, read the blog post "Smarter Ways to Encode Categorical Data for Machine Learning".
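A minimal sketch of one-hot encoding follows. It uses pandas.get_dummies as a stand-in for the Category Encoders package, which is a swap made only to keep the example dependency-free. The column name "Type" and its values are assumed for illustration.

```python
import pandas as pd

# Toy frame with one nominal column (assumed example data)
df = pd.DataFrame({
    "Type": ["h", "u", "t", "h"],
    "Rooms": [3, 2, 3, 4],
})

# Each category of "Type" becomes its own 0/1 indicator column;
# the original "Type" column is dropped from the result.
encoded = pd.get_dummies(df, columns=["Type"], dtype=int)
```

Note that one-hot encoding a column with many distinct values (for example, a suburb or address column) can add a very large number of features, so it may be worth checking the cardinality of each nominal attribute first.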

3. Install the Weka system on your computer.

Sort all the property samples by price and divide them equally into 5 categories/classes: top value, high value, medium value, low value, bottom value.

Apply the KNN algorithm of Weka with K = 5 to 10 to classify the property instances into the 5 classes. Calculate the accuracy for each value of K.
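One possible way to build the five equally sized price classes before loading the data into Weka is quantile binning in pandas. The sketch below assumes a price column held in a Series; the sample prices are invented for illustration.

```python
import pandas as pd

# Class names ordered from cheapest to most expensive fifth
labels = ["bottom value", "low value", "medium value", "high value", "top value"]

# Invented prices standing in for the Price column of the dataset
prices = pd.Series([300, 1200, 450, 900, 700, 1500, 650, 800, 1100, 500], name="Price")

# Ranking with method="first" breaks ties, so identical prices cannot
# make the bin edges collide and every class gets the same number of samples.
price_class = pd.qcut(prices.rank(method="first"), q=5, labels=labels)
```

The resulting `price_class` column would replace the raw price as the target attribute, and the raw price should then be left out of the input features.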

Apply RandomForest algorithm of Weka and report the performance.

You need to split the whole dataset into a training set (66% of the samples) and a testing set (34% of the samples). Repeat the random split 10 times and calculate the average accuracy.

from sklearn.model_selection import train_test_split

# test_size = 0.34 gives the 66%/34% split; use a different random_state for each of the 10 repetitions
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size = 0.34, random_state = 0)
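The evaluation protocol (a 66%/34% random split repeated 10 times, with accuracy averaged over the repetitions) could be sketched in scikit-learn as follows. This is not the Weka workflow the assignment asks for: KNeighborsClassifier and RandomForestClassifier are used here as stand-ins, the data are synthetic, and the helper `average_accuracy` is a hypothetical name introduced for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def average_accuracy(make_model, x, y, n_repeats=10, test_size=0.34):
    """Hypothetical helper: mean test accuracy over repeated random splits."""
    scores = []
    for seed in range(n_repeats):
        # A different seed per repetition yields a different random split
        x_train, x_test, y_train, y_test = train_test_split(
            x, y, test_size=test_size, random_state=seed
        )
        model = make_model().fit(x_train, y_train)
        scores.append(model.score(x_test, y_test))  # accuracy on the test part
    return float(np.mean(scores))

# Synthetic 5-class data standing in for the preprocessed housing features
x, y = make_classification(
    n_samples=300, n_features=8, n_informative=5,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# One averaged accuracy per K from 5 to 10
knn_results = {
    k: average_accuracy(lambda k=k: KNeighborsClassifier(n_neighbors=k), x, y)
    for k in range(5, 11)
}
rf_result = average_accuracy(
    lambda: RandomForestClassifier(n_estimators=100, random_state=0), x, y
)
```

Because KNN relies on distances, scaling the numeric features before fitting may change its accuracy noticeably, while RandomForest is largely insensitive to feature scale.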

 

              K=5               K=6   K=7   K=8   K=9   K=10
KNN           Average accuracy  ...
RandomForest  Average accuracy

 

 

 

 

 

Write a report discussing the performance of KNN and RandomForest. You are encouraged to compare the performance of different missing-value imputation methods or categorical encoding methods.

Attachment:- Data Mining & Warehousing Assignment Files.rar



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd