Define and explain data mining

Reference no: EM131293275

You will be using the Real Estate data set that you used for last week's descriptive statistics assignment.

You will be using the Real Estate data set to build a model to predict what a house should sell for. This model will be used by a real estate agency to help their clients understand what their house should sell for so they can make an educated decision about listing price. Secondarily, the model will be used by a home contractor. S/he would like to be able to tell clients the selling value of adding an additional bathroom.

Part 1 of the project involves the first three steps in the data mining process: sample, explore and modify. You will be preparing the data for model building, which will be done in Part 2 of the project.

You will need to make decisions regarding data that is in text form, missing data, potentially incorrect data, the inclusion of potential outliers, binning strategy and variable transformation. Please make sure your decisions are justified. Note that the specific requirements and relative weights are outlined in the rubric.

These are some rules and information from my professor for this assignment.

What is my first step?

You can't have text in the Excel file. The first step should be deciding which columns you want to delete (like Chester, because all of the homes are in Chester County). YOU NEED A GOOD REASON FOR DELETING DATA.
Next, you have to decide how to code the text data. Is it continuous or categorical? If it is categorical, are you coding it as binary, nominal or ordinal?
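The column-deletion step described above can also be checked mechanically. The assignment itself is done in Excel, so the following is only a minimal Python sketch of the same logic; the rows, the column names (`county`, `style`, `sqft`) and the helper `constant_columns` are hypothetical and not taken from the actual data set. A column whose value never changes carries no information for a model, which is one defensible reason for deleting it.

```python
# Hypothetical rows; the column names and values are illustrative only.
rows = [
    {"county": "Chester", "style": "Colonial", "sqft": 2400},
    {"county": "Chester", "style": "Farmhouse", "sqft": 1800},
    {"county": "Chester", "style": "Traditional", "sqft": 3100},
]


def constant_columns(rows):
    """Return the names of columns whose value is the same in every row."""
    if not rows:
        return []
    return [
        name
        for name in rows[0]
        if len({row[name] for row in rows}) == 1
    ]


# Columns that never vary are candidates for deletion.
to_drop = constant_columns(rows)

# Build a cleaned copy of the data without those columns.
cleaned = [
    {name: value for name, value in row.items() if name not in to_drop}
    for row in rows
]

print(to_drop)
```

A constant column is only one possible justification; any other deletion still needs its own reason.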

You can get creative! There are different styles for homes. There is no "order" to traditional, colonial, farmhouse, etc., so you can NOT code it 1, 2, 3. Coding it as nominal would mean adding LOTS of new columns. Which style is most prevalent - colonial, I think? You could code it as binary with 1=colonial and 0=all other styles.
Keep as much data as you can but don't make yourself nuts!
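To make the difference between binary and nominal coding concrete, here is a small Python sketch. It is not the Excel procedure the assignment expects, and the list of styles is made up for illustration. It first counts which style is most prevalent rather than assuming it, then builds the single binary column, and finally shows the one-column-per-style nominal coding for comparison.

```python
from collections import Counter

# Hypothetical style column; the real data set will differ.
styles = ["Colonial", "Farmhouse", "Colonial", "Traditional", "Colonial"]

# Find the most prevalent style instead of guessing it.
most_common_style, _ = Counter(styles).most_common(1)[0]

# Binary coding: 1 = most common style, 0 = all other styles (one new column).
is_most_common = [1 if s == most_common_style else 0 for s in styles]

# Nominal (dummy) coding for comparison: one 0/1 column per distinct style.
categories = sorted(set(styles))
dummies = [
    {f"style_{c}": int(s == c) for c in categories}
    for s in styles
]

print(most_common_style, is_most_common)
```

With three styles the nominal coding already needs three columns, while the binary coding needs one. That is the trade-off between keeping detail and keeping the sheet manageable.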

What comes after coding all of the text data? Now you have to deal with missing values. Sometimes 0 was filled in when the information wasn't known. I can promise that square foot and taxes are not 0!
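The idea of a 0 standing in for "unknown" can be sketched in Python as follows. The values are invented, and filling with the median is only one possible choice that you would still need to justify; deleting the rows or using another estimate are alternatives. Note that a 0 may be legitimate in other columns, so this treatment is assumed to apply only to fields such as square feet or taxes that cannot really be zero.

```python
from statistics import median

# Hypothetical square-foot column where 0 was entered for unknown values.
sqft = [2400, 0, 1800, 3100, 0, 2000]

# Step 1: mark 0 as missing for a field that cannot really be zero.
sqft_missing = [None if v == 0 else v for v in sqft]

# Step 2 (one possible choice): fill missing entries with the median
# of the known values.
known = [v for v in sqft_missing if v is not None]
fill = median(known)
sqft_filled = [fill if v is None else v for v in sqft_missing]

print(sqft_filled)
```

Marking the zeros as missing before computing any statistics matters, because leaving them in would pull the mean and median down.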

What should I bin? How should I do it?

Binning is taking a continuous variable (like age or acres or square foot) and turning it into a categorical (ordinal) variable (1=0-15, 2=16-30, 3=31-45, etc.).
How would I do that? I would probably add a column and then use Filters to do the coding. Sorting the data would be another way. But you are adding a NEW column that will have the ordinal 1, 2, 3, etc. value.
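The binning scheme above (1=0-15, 2=16-30, 3=31-45, etc.) can be written as a small rule. This Python sketch is only an illustration of the coding logic, not the Filters or Sorting approach in Excel; the function name `bin_code`, the bin width of 15 and the sample ages are assumptions made for the example.

```python
def bin_code(value, width=15):
    """Map a non-negative whole number to an ordinal bin code.

    With width=15: 0-15 -> 1, 16-30 -> 2, 31-45 -> 3, and so on.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return 1  # 0 belongs to the first bin (0-15)
    return (value - 1) // width + 1


# Hypothetical ages chosen to sit on the bin edges.
ages = [0, 15, 16, 30, 31, 45, 46]

# The NEW column holding the ordinal codes; the original ages are kept.
age_bins = [bin_code(a) for a in ages]

print(age_bins)
```

As in the Excel approach, the codes go into a new column so the original continuous values are still available.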

What are we turning in for Data Mining - Part 1?

You are turning in a Word (or PDF) file that will contain your discussions and any relevant Excel output (descriptive statistics, frequencies and correlation) in labeled, professional tables. I do NOT need the raw data in the Word doc.
You are asked to post the Excel file but I will only be looking at it if there is a problem. Anything you want me to grade should be in the Word doc.
Please turn in a printed copy of the Word doc. It is easier for me to give feedback in that form.
