What would be the error of the training set

Reference no: EM131926033

Problem

Predicting Housing Median Prices. The file BostonHousing.csv contains information on over 500 census tracts in Boston, where for each tract multiple variables are recorded. The last column (CAT.MEDV) was derived from MEDV, such that it obtains the value 1 if MEDV > 30 and 0 otherwise. Consider the goal of predicting the median value (MEDV) of a tract, given the information in the first 12 columns. Partition the data into training (60%) and validation (40%) sets.

a. Perform a k-NN prediction with all 12 predictors (ignore the CAT.MEDV column), trying values of k from 1 to 5. Make sure to normalize the data, and choose the function knn() from package class rather than package FNN. To make sure R uses the class package when both packages are loaded, call class::knn(). What is the best k? What does it mean?
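
The workflow in part (a) can be sketched as follows. This is a minimal illustration in Python rather than R, so it does not call class::knn(); it runs on synthetic data because BostonHousing.csv is not reproduced here. All function and variable names are hypothetical, and the use of validation RMSE to pick k is an assumption, since the problem does not name an error measure.

```python
import math
import random

def zscore_params(rows):
    """Column means and standard deviations, computed from the given rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) or 1.0
           for c, m in zip(cols, means)]
    return means, sds

def normalize(rows, means, sds):
    """Rescale every column with the supplied means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def knn_predict(train_x, train_y, query, k):
    """Average the outcome of the k training rows closest to query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(target for _, target in nearest) / k

def rmse(train_x, train_y, xs, ys, k):
    """Root mean squared error of k-NN predictions for the rows in xs."""
    sq = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq) / len(sq))

# Synthetic stand-in for the housing data (assumption): two predictors on
# very different scales and a numeric outcome.
rng = random.Random(1)
X = [[rng.uniform(0, 1), rng.uniform(0, 1000)] for _ in range(200)]
y = [10 * a + 0.01 * b + rng.gauss(0, 0.5) for a, b in X]

# Partition into training (60%) and validation (40%) sets.
idx = list(range(len(X)))
rng.shuffle(idx)
cut = (len(idx) * 60) // 100
train_idx, valid_idx = idx[:cut], idx[cut:]
train_x = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
valid_x = [X[i] for i in valid_idx]
valid_y = [y[i] for i in valid_idx]

# Normalize both partitions with statistics from the training set only.
means, sds = zscore_params(train_x)
train_z = normalize(train_x, means, sds)
valid_z = normalize(valid_x, means, sds)

# Try k = 1..5 and keep the k with the lowest validation error.
valid_rmse = {k: rmse(train_z, train_y, valid_z, valid_y, k) for k in range(1, 6)}
best_k = min(valid_rmse, key=valid_rmse.get)
print(valid_rmse, best_k)
```

Taking the normalization statistics from the training partition alone keeps the validation rows from influencing the distance scale.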

b. Predict the MEDV for a tract with the following information, using the best k:

[Image: 1692_information-table.jpg, a table of predictor values for the tract; the values are not reproduced here]

c. If we used the above k-NN algorithm to score the training data, what would be the error of the training set?
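
For the special case k = 1, the effect of scoring the training data with a model built from that same data can be checked directly. The sketch below is a Python illustration on synthetic data, not the assignment's R code; names such as training_rmse are hypothetical, and it assumes no two training rows have identical predictor values.

```python
import math
import random

def knn_predict(train_x, train_y, query, k):
    """Average the outcome of the k training rows closest to query (Euclidean)."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in ranked[:k]) / k

# Synthetic training data (assumption): two predictors, random numeric outcome.
rng = random.Random(7)
train_x = [[rng.random(), rng.random()] for _ in range(50)]
train_y = [rng.uniform(5, 50) for _ in train_x]

def training_rmse(k):
    """Score the training rows against the training set itself."""
    sq = [(knn_predict(train_x, train_y, x, k) - y) ** 2
          for x, y in zip(train_x, train_y)]
    return math.sqrt(sum(sq) / len(sq))

# With k = 1 every training row is its own nearest neighbor (distance 0),
# so the prediction reproduces the row's own outcome.
rmse_k1 = training_rmse(1)
# With k > 1 the row's own outcome is averaged with other neighbors.
rmse_k3 = training_rmse(3)
print(rmse_k1, rmse_k3)
```

Whether this matches the answer to part (c) depends on which k turns out to be best in part (a).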

d. Why is the validation data error overly optimistic compared to the error rate when applying this k-NN predictor to new data?

e. If the purpose is to predict MEDV for several thousand new tracts, what would be the disadvantage of using k-NN prediction? List the operations that the algorithm goes through in order to produce each prediction.
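
One way to lay out the per-prediction work asked about in part (e) is a brute-force search, sketched below in Python. This is an assumption about the procedure, not a description of how class::knn() is implemented; the function score_new_record, the counter, and the tiny one-predictor training set are all hypothetical.

```python
import heapq
import math

def score_new_record(train_z, train_y, means, sds, raw_record, k, counter):
    """Predict the outcome for one new record with brute-force k-NN."""
    # 1. Normalize the new record with the training means and standard deviations.
    z = [(v - m) / s for v, m, s in zip(raw_record, means, sds)]
    # 2. Compute the distance from the new record to every training record.
    dists = []
    for row, target in zip(train_z, train_y):
        dists.append((math.dist(row, z), target))
        counter["distances"] += 1
    # 3. Keep the k training records with the smallest distances.
    nearest = heapq.nsmallest(k, dists, key=lambda d: d[0])
    # 4. Average the outcome of those k neighbors.
    return sum(target for _, target in nearest) / k

# Tiny illustrative training set (assumption): one predictor, already normalized
# with mean 5.0 and standard deviation 2.0.
train_z = [[0.0], [1.0], [2.0], [3.0]]
train_y = [10.0, 20.0, 30.0, 40.0]
means, sds = [5.0], [2.0]

counter = {"distances": 0}
new_records = [[7.4], [5.2], [10.8]]
predictions = [score_new_record(train_z, train_y, means, sds, r, 2, counter)
               for r in new_records]
print(predictions, counter["distances"])
```

In this layout every new record is compared with every training record, so the number of distance computations is the number of training records times the number of records to score.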





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd