Reference no: EM132234453
Project: Fit a Logistic Regression Model to the Thoracic Surgery Binary Dataset - Part 1
For this project, you will be working with the thoracic surgery data set from the University of California, Irvine (UCI) Machine Learning Repository. This dataset contains information on life expectancy in lung cancer patients after surgery.
The underlying thoracic surgery data is in ARFF format. This is a text-based format with information on each of the attributes. You can load this data using a package such as foreign or by cutting and pasting the data section into a CSV file.
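As a rough illustration of the first loading option, the sketch below writes a tiny, made-up ARFF file to a temporary location and reads it back with read.arff() from the foreign package. The three data rows and the two attributes shown are invented for the example and are not taken from the real file; for the assignment, you would point read.arff() at the downloaded ARFF file instead.

```r
library(foreign)

# Hypothetical miniature ARFF file; the rows below are invented for
# illustration and are NOT from the thoracic surgery data set.
arff_text <- c(
  "@relation demo",
  "@attribute AGE numeric",
  "@attribute Risk1Y {T,F}",
  "@data",
  "60,F",
  "51,T",
  "72,F"
)
path <- tempfile(fileext = ".arff")
writeLines(arff_text, path)

# read.arff() parses the attribute header and returns a data frame.
demo <- read.arff(path)
str(demo)
```

If the data section is pasted into a CSV file instead, read.csv() would do the same job, but the column names then have to be supplied by hand because a CSV has no attribute header.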
Instructions: Include all of your answers in an R Markdown report.
a. Fit a binary logistic regression model to the data set that predicts whether or not the patient survived for one year (the Risk1Y variable) after the surgery. Use the glm() function to perform the logistic regression. See Generalized Linear Models for an example. Include a summary using the summary() function in your results.
b. According to the summary, which variables had the greatest effect on the survival rate?
c. To compute the accuracy of your model, use the dataset to predict the outcome variable. The percentage of correct predictions is the accuracy of your model. What is the accuracy of your model?
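The workflow in steps a and c could look roughly like the sketch below. It runs on simulated data so that it is self-contained; the data frame patients, its columns age, fev1 and risk1y, and the 0.5 probability cutoff are all assumptions made for the example, not part of the assignment or the real data set.

```r
set.seed(42)

# Simulated stand-in for the real data (hypothetical columns and values).
n <- 200
patients <- data.frame(
  age  = round(rnorm(n, mean = 62, sd = 9)),
  fev1 = rnorm(n, mean = 2.5, sd = 0.7)
)

# Simulate a binary outcome whose log-odds depend on both predictors.
log_odds <- -4 + 0.05 * patients$age - 0.3 * patients$fev1
patients$risk1y <- factor(
  ifelse(runif(n) < plogis(log_odds), "T", "F"),
  levels = c("F", "T")
)

# family = binomial makes glm() fit a logistic regression; with a factor
# response, the first level ("F") is treated as failure and "T" as success.
model <- glm(risk1y ~ age + fev1, data = patients, family = binomial)
print(summary(model))

# Predict on the same data, as the assignment asks, and apply an
# assumed 0.5 cutoff to turn probabilities into class labels.
prob <- predict(model, newdata = patients, type = "response")
predicted <- factor(ifelse(prob > 0.5, "T", "F"), levels = c("F", "T"))

# Accuracy is the share of rows where prediction and observation agree.
accuracy <- mean(predicted == patients$risk1y)
print(table(observed = patients$risk1y, predicted = predicted))
cat(sprintf("Accuracy: %.1f%%\n", 100 * accuracy))
```

Because the model is scored on the same rows it was fitted to, this figure is an in-sample accuracy and will tend to be optimistic compared with accuracy on unseen data.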
Project: Fit a Logistic Regression Model to Previous Dataset - Part 2
Include all of your answers in an R Markdown report.
Fit a logistic regression model to the binary-classifier-data.csv dataset from the previous assignment.
Dataset (use previous data): binary-classifier-data.csv is attached.
a. What is the accuracy of the logistic regression classifier?
b. How does the accuracy of the logistic regression classifier compare to the nearest neighbors algorithm?
c. Why is the accuracy of the logistic regression classifier different from that of the nearest neighbors algorithm?
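One way to set up the comparison in questions a through c is sketched below. It does not use binary-classifier-data.csv; it simulates two-feature data with a curved class boundary, and the column names x, y and label, the 75/25 train/test split, and k = 5 are all assumptions chosen for illustration. The nearest neighbors classifier comes from knn() in the class package.

```r
library(class)
set.seed(1)

# Simulated data with a circular class boundary (hypothetical columns).
n <- 400
pts <- data.frame(x = runif(n, -1, 1), y = runif(n, -1, 1))
pts$label <- factor(ifelse(pts$x^2 + pts$y^2 < 0.4, 1, 0))

# Hold out 25% of the rows for testing (an assumed split).
test_idx <- sample(n, n / 4)
train <- pts[-test_idx, ]
test  <- pts[test_idx, ]

# Logistic regression: the decision boundary is a straight line in (x, y).
logit <- glm(label ~ x + y, data = train, family = binomial)
logit_prob <- predict(logit, newdata = test, type = "response")
logit_pred <- ifelse(logit_prob > 0.5, "1", "0")
logit_acc  <- mean(logit_pred == as.character(test$label))

# k-nearest neighbors: each test point takes the majority label of its
# k closest training points, so the boundary can follow local structure.
knn_pred <- knn(train[, c("x", "y")], test[, c("x", "y")],
                cl = train$label, k = 5)
knn_acc  <- mean(as.character(knn_pred) == as.character(test$label))

cat(sprintf("Logistic regression accuracy: %.3f\n", logit_acc))
cat(sprintf("Nearest neighbors accuracy:   %.3f\n", knn_acc))
```

The design point is that logistic regression with only linear terms can separate classes with a single straight line, while nearest neighbors makes no such assumption. On data whose classes are not linearly separable, one would therefore expect the two accuracies to differ, which is the kind of reasoning question c is asking for.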
Attachment: Assignment Files.rar