Classify malignant vs benign breast tumors

Reference no: EM131869519

R programming

In this assignment you will use the lda algorithm from the MASS package, the knn algorithm from the class package, and the glm function for logistic regression in R to classify malignant vs. benign breast tumors. The dataset you will use can be found at the UCI Machine Learning Repository and is labeled Breast Cancer Wisconsin (Diagnostic).
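As a starting point, the data could be read roughly as follows. This is only a sketch: read_wdbc is a hypothetical helper, it assumes a local copy of the repository's raw data file (comma-separated, no header row), and the column names are assigned here from the dataset's feature descriptions rather than read from the file.

```r
# Hypothetical helper (not part of the assignment): read a local copy of the
# raw data file. The file is assumed to have no header row and 32 columns:
# ID, diagnosis ("M" or "B"), then 30 numeric features.
read_wdbc <- function(path) {
  measures <- c("radius", "texture", "perimeter", "area", "smoothness",
                "compactness", "concavity", "concave_points", "symmetry",
                "fractal_dimension")
  # Assumed column order: 10 means, then 10 standard errors, then 10 "worst"
  feature_names <- paste(rep(measures, times = 3),
                         rep(c("mean", "se", "worst"), each = 10),
                         sep = "_")
  wdbc <- read.csv(path, header = FALSE,
                   col.names = c("id", "diagnosis", feature_names))
  # Store the outcome as a factor with benign ("B") as the reference level
  wdbc$diagnosis <- factor(wdbc$diagnosis, levels = c("B", "M"))
  wdbc
}
```

With benign as the first factor level, a logistic regression fitted later with glm models the probability of the malignant class.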

This dataset contains the ID, diagnosis, and 30 real-valued input features. You are to split the dataset into two pieces: a training set consisting of the first 75% of the observations, and a test set consisting of the remaining observations.
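The ordered split described above might be sketched like this. split_by_order is a hypothetical helper name, and rounding the training size down when 75% of the rows is not a whole number is an assumption, since the assignment does not say how to round.

```r
# Hypothetical helper: the first `train_frac` of the rows (in file order, no
# shuffling) become the training set; the remaining rows become the test set.
split_by_order <- function(data, train_frac = 0.75) {
  n_train <- floor(train_frac * nrow(data))  # assumption: round down
  in_train <- seq_len(nrow(data)) <= n_train
  list(train = data[in_train, , drop = FALSE],
       test  = data[!in_train, , drop = FALSE])
}

# Toy illustration on a stand-in data frame (not the real dataset)
toy <- data.frame(x = 1:8)
parts <- split_by_order(toy)
nrow(parts$train)  # 6
nrow(parts$test)   # 2
```

Because the split follows row order rather than random sampling, it is worth checking that both diagnosis classes appear in each piece, for example with table(parts$train$diagnosis) on the real data.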

Please use your best judgment to preprocess your data appropriately and employ all three algorithms (lda, knn, and glm) to predict the correct diagnosis for as many test instances as possible. Report your findings along with a written explanation of your process and results. Be sure to compare and contrast the results obtained with each algorithm.
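One possible way to run and compare the three classifiers is sketched below. It is not a prescribed solution: compare_classifiers is a hypothetical helper, the column names id and diagnosis are assumptions, and standardizing the features, k = 5, and the 0.5 probability cutoff are illustrative choices only.

```r
library(MASS)   # lda()
library(class)  # knn()

# Hypothetical helper: fits lda, knn and glm on `train` and returns the
# test-set accuracy of each. Assumes a two-level factor column `diagnosis`
# and treats every column other than `id` and `diagnosis` as a numeric feature.
compare_classifiers <- function(train, test, k = 5) {
  features <- setdiff(names(train), c("id", "diagnosis"))
  x_train <- as.matrix(train[, features, drop = FALSE])
  x_test  <- as.matrix(test[, features, drop = FALSE])

  # Standardize with training-set means and standard deviations only, so
  # that no information from the test set leaks into preprocessing
  ctr <- colMeans(x_train)
  scl <- apply(x_train, 2, sd)
  x_train <- scale(x_train, center = ctr, scale = scl)
  x_test  <- scale(x_test,  center = ctr, scale = scl)
  y_train <- train$diagnosis
  y_test  <- test$diagnosis

  # Linear discriminant analysis
  lda_fit  <- lda(x_train, grouping = y_train)
  lda_pred <- predict(lda_fit, x_test)$class

  # k-nearest neighbours (distance based, hence the standardization above)
  knn_pred <- knn(x_train, x_test, cl = y_train, k = k)

  # Logistic regression: glm models the probability of the second factor level
  glm_fit  <- glm(diagnosis ~ ., family = binomial,
                  data = data.frame(diagnosis = y_train, x_train))
  glm_prob <- predict(glm_fit, newdata = data.frame(x_test), type = "response")
  glm_pred <- factor(ifelse(glm_prob > 0.5,
                            levels(y_train)[2], levels(y_train)[1]),
                     levels = levels(y_train))

  # Fraction of test instances classified correctly by each method
  sapply(list(lda = lda_pred, knn = knn_pred, glm = glm_pred),
         function(pred) mean(pred == y_test))
}
```

Given training and test data frames, compare_classifiers(train, test, k = 5) returns a named vector of accuracies for lda, knn, and glm. With many strongly correlated features, lda may warn about collinear variables and glm may warn that fitted probabilities of 0 or 1 occurred. In that case, fitting on a smaller set of features is one option worth discussing in the write-up. A confusion matrix from table(predicted, actual) for each method would complement the accuracy figures.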

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd