Build logistic regression model that includes all attributes

Assignment Help Applied Statistics
Reference no: EM132365228

Statistics Assignment - Predicting Wine Quality

Descriptions: This dataset is related to red vinho verder wine samples from the north of Portugal. The goal is to model wine quality (v12) based on physicochemical attributes (v1-v11). Each row of the data set corresponds to a wine sample.

Descriptions for the data follow: v1 fixed acidity v2 volatile acidity v3 critic acid v4 residual sugar v5 chlorides v6 free sulfur dioxide v7 total sulfur dioxide v8 density v9 pH v10 sulphates v11 alcohol v12 quality (score from 1-10) Please download the train.csv and test.csv from BBlearn.

A wine sample is identified as good wine if the quality v12>=6. The research goal is, therefore, to predict if a wine sample is a good wine. Please create a new dependent variable that takes the values of 1 if a sample is good wine and 0 otherwise.

(i) Please build a logistic regression model that includes all the attributes from the training set to predict the wine quality (model 1) in the testing set. Show the confusion table. What are precision, recall, and accuracy?

(ii) From model 1, what will happen to the odds of good wine if the concentration of sulphates is increased by 1 unit?

(iii) Please build a logistic regression model that v1, v2, v3, v4, and v5 from the training set to predict the wine quality (model 2) in the testing set. Show the confusion table. What are precision, recall, and accuracy?

(iv) Please compare model 1 and model 2 using the ROC curve on the testing set. Which one performs better?

(v) Suppose that a merchant will lose $10 if he misclassifies a bad wine as a good wine. He will also lose $1 if he misclassifies a good wine as a bad wine. Please compare the total cost of model 1 and model 2.

(vi) We can also use a linear regression model for classification. Please use v12 as the dependent variable and build a linear regression model on all the attributes from the training set. Please show the Mean Square Error (MSE): the average of the squared differences between the predicted and actual wine quality score.

(vii) We can also use a linear regression model for classification. Please use v12 as the dependent variable and build a linear regression model on all the attributes from the training set. To make decisions, a merchant will classify a sample as good wine if the predicted V12 is greater than or equal to 6. Show the confusion table. 2. It's very important to inspect the dataset before conducting analysis.

Please download the final.csv from BBlearn. Use v10 as the dependent variable and build a logistic regression model on all the attributes. You may find that the algorithm does not converge as the warning message indicates. Can you find out why?

Reference no: EM132365228

Questions Cloud

Analyze key historical events in field of health informatics : Analyze key historical events in the field of health informatics for how technology has been used that could inform the management of health information.
Describe the use of cryptocurrency and altcoin : With the advances of technology, organizations are being challenged to examine how they conduct business to maintain relevancy in their processes and systems.
Discuss the challenges management faces : The traditional approach of conducting business has evolved from the brick-and-mortar mode to more electronic forms. Identify an organization that offers their.
How you plan to maintain an acceptable academic standing : Please describe how you plan to maintain an acceptable academic standing at LAPU by describing the methods you will exercise to improve your grades.
Build logistic regression model that includes all attributes : Statistics Assignment - Predicting Wine Quality. Please build a logistic regression model that includes all the attributes from the training set
Describe entitlement and block grant programs : The government provides health insurance to low-income individuals, the elderly, and the disabled. These individuals cannot afford private health insurance.
Explain policy analysis and its three purposes : Policy analysis is an analysis that provides informed advice to a client that relates to a public policy decision, includes a recommended course of action.
Find the cost of providing the health care service : Write a 700- to 1,050-word paper that considers the cost of providing the health care service and running the facility you have proposed. Include an evaluation.
Write overview of what a complete webography looks like : A generic Webography example titled "Webography Example" that will give you a good overview of what a complete Webography looks like.

Reviews

Write a Review

Applied Statistics Questions & Answers

  What is the probability that xy will be even

If x is chosen at random from the set {1,2,3,4} and y is to be chosen at random from the set {5,6,7}, what is the probability that xy will be even?

  Law of universal regression

1)The term regression was originally used in 1885 by Sir Francis Galton in his analysis of the relationship between the heights of children and parents. He formulated the “law of universal regression,” which specifies that “each peculiarity in a man ..

  Calculate the binomial probability

Calculate the binomial probability by either using one of the binomial probability tables, software, or a calculator using the formula below

  Research and compare various techniques for organizing data

ITEC874 Big Data Technologies Assignment - Data Lake Architecture, Macquarie University, Australia. Research and compare various techniques for organizing data

  Compare variance and standard deviation

BB108 Business Statistics Assignment, Melbourne Institute of Technology, Australia. Compare variance and standard deviation

  Create an examination bed for the clinic in nicaragua

Critique the proposed 14-week design process to create an examination bed for the clinic in Nicaragua. This process is displayed in the Gantt chart of Figure 1.

  What is the probability of hitting the target exactly twice

MST-003: Probability Theory Assignment - The probability that a player hits a target is 0.24. He fires 6 times. What is probability of hitting target twice

  Forecasting with time series analysis objective crusty

forecasting with time series analysis objective crusty pizza executives have to forecast december sales for the 10

  What is the value of the critical value of t

If you were constructing a 99% confidence interval of the population mean based on a sample of n where the standard deviation of the sample s = 0.05, what is the value of the critical value of t that you should use? Explain/show how you obtain your a..

  Responsible for assuring statistical control. in one proces

A statistical process analyst is responsible for assuring statistical control. In one process, a machine is supposed to drop 11.4 ounces of mints into a bag. (Assume that this process can be approximated by a normal distribution). The acceptable rang..

  Examine the lift chart and classification table for the tree

Examine the lift chart and classification table for the tree. What can you say about the predictive performance of the model

  An astronomer at the mount palomar observatory notes

An astronomer at the Mount Palomar Observatory notes that during the Geminid meteor shower, an average (?) of 50 meteors appears each hour, with a variance (σ2) of 9. The Geminid meteor shower will occur next week.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd