Reference no: EM131068782
BHV Estimation project proposal
Member 4 recently dropped, need one more member
B. Which predictors are most important for estimating Boston housing values, and how do these predictors affect those values? We want to characterize the relationship between the predictor variables and the response variable (home value).
C. Description of the dataset:
The dataset has 506 observations of 14 variables: 13 continuous attributes and 1 binary-valued attribute (CHAS).
1) CRIM refers to per capita crime rate by town
2) ZN refers to the proportion of residential land zoned for lots over 25,000 square feet
3) INDUS refers to the proportion of non-retail business acres per town
4) CHAS is a qualitative variable: the Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5) NOX refers to the nitric oxides concentration (parts per 10 million)
6) RM refers to average number of rooms per dwelling
7) AGE refers to the proportion of owner-occupied units built prior to 1940
8) DIS refers to weighted distances to five Boston employment centres
9) RAD refers to the index of accessibility to radial highways
10) TAX refers to the full-value property-tax rate per $10,000
11) PTRATIO refers to pupil-teacher ratio by town
12) B refers to 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13) LSTAT refers to % lower status of the population
14) MEDV refers to Median value of owner-occupied homes in $1000's
D. The techniques we think would be useful are cross-validation, PCA and LDA. For cross-validation, we first split the data into a training set and a test set in suitable proportions, calling set.seed() beforehand so that the random split is reproducible. We then estimate the test error with K-fold cross-validation: mean squared error if MEDV is kept continuous, or misclassification error if it is converted into classes. We can then compare the test errors for each value of K and choose the number of folds that gives the smallest test error.
For PCA, we draw histograms of the variables to check their skewness and normality. If a variable is approximately normal, we scale it using its mean and standard deviation. If it is skewed, we scale it using its median and median absolute deviation.
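The cross-validation step described above can be sketched roughly as follows. This is only an illustration in Python, not the proposal's own code: `random.Random(seed)` stands in for R's set.seed(), the single-predictor least-squares model is an assumed placeholder for whatever model is eventually chosen, and the data are synthetic rather than the Boston dataset. Mean squared error is used because MEDV is continuous.

```python
import random

def k_fold_indices(n, k, seed=1):
    """Shuffle 0..n-1 reproducibly and split into k nearly equal folds."""
    rng = random.Random(seed)  # plays the role of set.seed() in R
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b*x for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def cv_mse(x, y, k, seed=1):
    """Average held-out mean squared error over k folds."""
    folds = k_fold_indices(len(x), k, seed)
    fold_mse = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_simple_ols([x[i] for i in train], [y[i] for i in train])
        errs = [(y[i] - (a + b * x[i])) ** 2 for i in fold]
        fold_mse.append(sum(errs) / len(errs))
    return sum(fold_mse) / k

# Synthetic stand-in data (NOT the Boston dataset): y falls roughly linearly with x.
data_rng = random.Random(0)
x = [data_rng.uniform(0, 30) for _ in range(100)]
y = [35 - 0.9 * xi + data_rng.gauss(0, 2) for xi in x]

# Compare the cross-validated error for a few candidate numbers of folds.
for k in (5, 10):
    print(k, round(cv_mse(x, y, k), 3))
```

Fixing the seed makes the fold assignment, and therefore the reported error, repeatable from run to run.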
We can then look at the loadings of each principal component to see which predictor variables contribute most to it, and decide which PCs to keep. For further investigation, we can use a biplot to visualize the PCs. We can then apply LDA (linear discriminant analysis) to the data; because LDA needs a categorical response, MEDV would first have to be grouped into classes. Finally, we can compare the LDA and PCA results to identify the most useful predictors. These are the techniques we have learned in class so far. We may learn other useful techniques later in the course, so our choice of techniques for this data might change.
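The scaling rule this section proposes as preparation for PCA (mean and standard deviation for roughly normal variables, median and median absolute deviation for skewed ones) might look like the sketch below. It is a Python illustration under stated assumptions: the `skew_cutoff` threshold of 1.0 is a hypothetical choice that the proposal does not specify, and the two short lists are made-up values, not columns of the Boston data.

```python
import statistics

def skewness(values):
    """Moment-based skewness: mean cubed deviation over the cubed standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

def standard_scale(values):
    """Center by the mean and divide by the sample standard deviation."""
    m = statistics.fmean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

def robust_scale(values):
    """Center by the median and divide by the raw median absolute deviation.

    Note: this uses the unscaled MAD and assumes it is non-zero.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [(v - med) / mad for v in values]

def scale_column(values, skew_cutoff=1.0):
    """Pick a scaling method from the skewness; skew_cutoff is an assumed threshold."""
    if abs(skewness(values)) > skew_cutoff:
        return "robust", robust_scale(values)
    return "standard", standard_scale(values)

# Made-up example columns: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed = [1, 1, 1, 2, 2, 3, 4, 50]

for name, column in (("symmetric", symmetric), ("skewed", skewed)):
    method, scaled = scale_column(column)
    print(name, method, [round(v, 2) for v in scaled])
```

In practice the histogram check described in the text would inform the decision; the numeric cutoff here only makes the rule explicit enough to run.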
E. How many PCs should we select or use?
What should we do if there is more than one indicator variable?
How should we treat the outliers?
Do we need to apply a Box-Cox transformation to the data?
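For reference on the last question, the Box-Cox family of transformations is usually written as below. This is the standard textbook form rather than anything stated in the proposal; it applies only to strictly positive values such as MEDV, and the parameter lambda would have to be estimated from the data.

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\ln y, & \lambda = 0,
\end{cases}
\qquad y > 0.
```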