Supervised machine learning classifiers


Reference no: EM133944278

Question - Supervised machine learning classifiers

Data: The zip file "hw2.q2.data.zip" contains 3 CSV files:
"hw2.q2.train.csv" contains 8,000 rows and 11 columns. The first column 'y' is the output variable with 4 classes: 0, 1, 2, 3. The remaining 10 columns contain input features: x1, ..., x10.
"hw2.q2.test.csv" contains 2,000 rows and 11 columns. The first column 'y' is the output variable with 4 classes: 0, 1, 2, 3. The remaining 10 columns contain input features: x1, ..., x10.
"hw2.q2.new.csv" contains 30 rows and 11 columns. The first column 'ID' is an identifier for 30 unlabeled samples. The remaining 10 columns contain input features: x1, ..., x10.
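
The file layout described above can be read with pandas and split into features and target. The following is a minimal sketch, assuming the CSV headers are literally 'y' and x1, ..., x10; the helper name split_features_target and the two-row in-memory sample are hypothetical and stand in for the real files so the snippet runs on its own.

```python
import io

import pandas as pd

# Feature column names x1, ..., x10, as described for all three files.
FEATURES = [f"x{i}" for i in range(1, 11)]


def split_features_target(df, target="y"):
    """Return (X, y) from a labeled frame laid out like the train/test files."""
    return df[FEATURES], df[target]


# Tiny in-memory stand-in with the same header as the labeled files
# (assumption: the real files use exactly these column names).
header = "y," + ",".join(FEATURES)
rows = [
    "0," + ",".join(["0.1"] * 10),
    "3," + ",".join(["0.9"] * 10),
]
demo = pd.read_csv(io.StringIO("\n".join([header] + rows)))

X_demo, y_demo = split_features_target(demo)
print(X_demo.shape, y_demo.tolist())

# With the real data, the same helper would be applied to
# pd.read_csv("hw2.q2.train.csv") and pd.read_csv("hw2.q2.test.csv").
```

The unlabeled file has an 'ID' column instead of 'y', so only the feature columns would be selected from it.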

Task 1.
Use 4-fold cross-validation with the 8,000 labeled examples from "hw2.q2.train.csv" to identify a classifier that achieves a mean cross-validation accuracy of at least 0.96. You should try several Scikit-Learn classifiers, including: GaussianNB, DecisionTreeClassifier, RandomForestClassifier, ExtraTreesClassifier, KNeighborsClassifier, LogisticRegression, SVC, and MLPClassifier. Try different hyper-parameter values for the better performing classifiers to obtain a good set of hyper-parameter values. Then select the best performing model. Report the following:
Selected model with hyper-parameter values:

Mean cross-validation accuracy: ............................ (rounded to 4 decimal places)
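
One possible way to approach Task 1 is sketched below: score each listed classifier with 4-fold cross-validation, then run a small grid search on one of them. This is an illustration under stated assumptions, not the expected solution. The data is a synthetic stand-in from make_classification (the real X and y would come from "hw2.q2.train.csv"), the scaling pipelines and the parameter grid are hypothetical choices, and the scores it prints say nothing about the actual data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data (assumption): 10 features, 4 classes.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6, n_classes=4, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline so that scaling is
# re-fitted inside each fold rather than on the full data.
candidates = {
    "GaussianNB": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "KNeighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "LogisticRegression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVC": make_pipeline(StandardScaler(), SVC()),
    "MLP": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)
    ),
}

# 4-fold stratified splits, shuffled with a fixed seed for repeatability.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in candidates.items()
}
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {score:.4f}")

# Hyper-parameter search for one candidate; the grid values are hypothetical.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 8]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=cv, scoring="accuracy"
)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print(f"Mean cross-validation accuracy: {search.best_score_:.4f}")
```

Reusing the same cv object for every model keeps the folds identical, so the mean accuracies are directly comparable.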

Task 2.
Train the classifier with the hyper-parameter values determined in Task 1 on all 8,000 training samples and use it to predict the output class 'y' for the 2,000 examples in "hw2.q2.test.csv". Report the following:
Accuracy on 2,000 test examples: ........................ (rounded to 4 decimal places)
Classification report for the 2,000 test examples:

Confusion matrix for the 2,000 test examples:
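
The three items requested in Task 2 map onto accuracy_score, classification_report, and confusion_matrix from sklearn.metrics. The sketch below shows those calls under assumptions: the data is a synthetic stand-in split into train and test parts (the real sets would come from the two labeled CSV files), and RandomForestClassifier is only a placeholder for whichever model Task 1 selects.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the labeled train and test files (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Placeholder model; substitute the classifier and values chosen in Task 1.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)  # fit on the full training set
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.4f}")

# digits=4 matches the 4-decimal rounding requested for the accuracy.
print(classification_report(y_test, y_pred, digits=4))

# Rows are true classes, columns are predicted classes, in label order 0..3.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2, 3])
print(cm)
```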

Task 3.
Use the model trained in Task 2 to predict the output class 'y' for the 30 examples in "hw2.q2.new.csv".
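
For Task 3, the predictions can be paired with the 'ID' column so each of the 30 samples is traceable. The following is a rough sketch under assumptions: the frames are synthetic stand-ins for the training file and the unlabeled file, the model is a placeholder, and the output column name y_pred is hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

FEATURES = [f"x{i}" for i in range(1, 11)]

# Synthetic stand-in (assumption): 300 labeled rows and 30 unlabeled rows.
X, y = make_classification(
    n_samples=330, n_features=10, n_informative=6, n_classes=4, random_state=0
)
train = pd.DataFrame(X[:300], columns=FEATURES)
y_train = y[:300]

new = pd.DataFrame(X[300:], columns=FEATURES)
new.insert(0, "ID", range(1, 31))  # mimic the identifier column

# Placeholder for the model trained in Task 2.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(train[FEATURES], y_train)

# Predict from the feature columns only; 'ID' is never used as an input.
predictions = pd.DataFrame(
    {"ID": new["ID"], "y_pred": model.predict(new[FEATURES])}
)
print(predictions.to_string(index=False))
```

Selecting the features by name, rather than by position, guards against the 'ID' column being passed to the model by mistake.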
