Reference no: EM131293275
You will be using the Real Estate data set that you used for last week's descriptive statistics assignment.
You will be using the Real Estate data set to build a model to predict what a house should sell for. This model will be used by a real estate agency to help their clients understand what their house should sell for so they can make an educated decision about listing price. Secondarily, the model will be used by a home contractor. S/he would like to be able to tell clients the selling value of adding an additional bathroom.
Part 1 of the project involves the first three steps in the data mining process: sample, explore and modify. You will be preparing the data for model building, which will be done in Part 2 of the project.
You will need to make decisions regarding data that is in text form, missing data, potentially incorrect data, the inclusion of potential outliers, binning strategy and variable transformation. Please make sure your decisions are justified. Note that the specific requirements and relative weights are outlined in the rubric.
These are some rules and information from my professor for this assignment.
What is my first step?
You can't have text in the Excel file. The first step should be deciding which columns you want to delete (like Chester, because all of the homes are in Chester County). YOU NEED A GOOD REASON FOR DELETING DATA.
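As a rough illustration outside Excel, a column that holds the same value in every row (like the county) carries no information for a model and can be flagged automatically. The column names and rows below are hypothetical stand-ins, not the actual Real Estate data set.

```python
# Hypothetical sample rows; the real data set's column names may differ.
rows = [
    {"County": "Chester", "Style": "Colonial", "SqFt": 2400},
    {"County": "Chester", "Style": "Farmhouse", "SqFt": 1800},
    {"County": "Chester", "Style": "Colonial", "SqFt": 3100},
]


def constant_columns(records):
    """Return the names of columns that hold a single distinct value."""
    columns = records[0].keys()
    return [c for c in columns if len({r[c] for r in records}) == 1]


# Columns with no variation are candidates for deletion.
to_drop = constant_columns(rows)
cleaned = [{k: v for k, v in r.items() if k not in to_drop} for r in rows]
print(to_drop)
```

A constant column is only one possible justification for deletion; any other column you drop still needs its own stated reason.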
Next, you have to decide how to code the text data. Is it continuous or categorical? If it is categorical, are you coding it as binary, nominal or ordinal?
You can get creative! There are different styles for homes. There is no "order" to traditional, colonial, farmhouse, etc., so you can NOT code it 1, 2, 3. Coding it as nominal would mean adding LOTS of new columns. Which style is most prevalent - colonial, I think? You could code it as binary with 1=colonial and 0=all other styles.
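The binary coding idea can be sketched in a few lines of Python. The style values below are made up for illustration; the point is to find the most prevalent style first, then code it as 1 and everything else as 0.

```python
from collections import Counter

# Hypothetical style column; the real data will have more rows and styles.
styles = ["Colonial", "Traditional", "Colonial", "Farmhouse", "Colonial", "Traditional"]

# Find the most prevalent style instead of assuming it.
most_common_style, count = Counter(styles).most_common(1)[0]

# Binary coding: 1 = most prevalent style, 0 = all other styles.
is_most_common = [1 if s == most_common_style else 0 for s in styles]

# For comparison, nominal coding with one 0/1 column per style would need this many columns.
n_nominal_columns = len(set(styles))

print(most_common_style, is_most_common, n_nominal_columns)
```

Checking the frequency count before coding also gives you the justification for the choice, since the decision rests on the data and not on a guess.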
Keep as much data as you can but don't make yourself nuts!
What comes after coding all of the text data? Now you have to deal with missing values. Sometimes 0 was filled in when the information wasn't known. I can promise that square foot and taxes are not 0!
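One way to handle these placeholder zeros is sketched below: treat 0 as missing, then decide what to do with the gaps. Filling with the median is an assumption made here for illustration, not something the assignment prescribes; deleting the affected rows is another defensible option if you justify it. The values are hypothetical.

```python
from statistics import median

# Hypothetical square-footage column where 0 means "not known".
sqft = [2400, 0, 1800, 3100, 0, 2000]

# Step 1: mark placeholder zeros as missing.
sqft_missing = [None if v == 0 else v for v in sqft]

# Step 2 (one option among several): fill missing values with the median
# of the known values, so the zeros no longer distort the statistics.
known = [v for v in sqft_missing if v is not None]
fill_value = median(known)
sqft_filled = [fill_value if v is None else v for v in sqft_missing]

print(sqft_missing.count(None), fill_value, sqft_filled)
```

Apply this only to columns where 0 is impossible, such as square footage and taxes; in a column where 0 is a legitimate value, a zero is real data and should stay.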
What should I bin? How should I do it?
Binning is taking a continuous variable (like age or acres or square footage) and turning it into a categorical (ordinal) variable (1=0-15, 2=16-30, 3=31-45, etc.).
How would I do that? I would probably add a column and then use Filters to do the coding. Sorting the data would be another way. But you are adding a NEW column that will have the ordinal 1, 2, 3, etc. value.
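The same coding can be sketched in Python as a new column built from the existing one. The function name `bin_code` and the sample ages are hypothetical; the bin edges follow the 1=0-15, 2=16-30, 3=31-45 pattern described above and assume whole-number values.

```python
import math


def bin_code(value, width=15):
    """Map a non-negative whole number to an ordinal bin code.

    Bins follow the pattern 1 = 0-15, 2 = 16-30, 3 = 31-45, and so on.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    # max(1, ...) puts 0 in the first bin along with 1-15.
    return max(1, math.ceil(value / width))


# Hypothetical ages in years; the new list plays the role of the NEW column.
ages = [0, 15, 16, 30, 31, 45, 46]
age_bins = [bin_code(a) for a in ages]
print(age_bins)
```

Equal-width bins are only one strategy; whatever edges you choose, the original continuous column stays in place and the codes go in a new column.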
What are we turning in for Data Mining - Part 1?
You are turning in a Word (or PDF) file that will contain your discussions and any relevant Excel output (descriptive statistics, frequencies and correlation) in labeled, professional tables. I do NOT need the raw data in the Word doc.
You are asked to post the Excel file but I will only be looking at it if there is a problem. Anything you want me to grade should be in the Word doc.
Please turn in a printed copy of the Word doc. It is easier for me to give feedback in that form.