
10-701 Machine Learning - Spring 2012 - Problem Set 2

Q1. Logistic regression

1.1 Logistic vs linear regression

In both logistic regression (LR) and linear regression (R), given an input X, the goal is to predict the response Y. The difference is that LR is typically used for classification, whereas R is used for regression.

1. Propose a simple modification to R that makes it amenable to classification (instead of regression) tasks. Comment on whether such a proposal is superior to LR and briefly explain why.

2. Recall that in R, where y = wx (w being the estimated linear coefficient for a one-dimensional variable x), a unit change in x induces an additive change of w in y. In LR, y and wx are linked by the sigmoid function. Explain how you would interpret the w coefficients in logistic regression. Suppose w = 2; calculate the change in the odds of the classes induced by a unit change in x, assuming there are two available classes.
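As a small numerical illustration of the odds interpretation in part 2, the sketch below assumes the usual sigmoid link P(Y = 1|x) = σ(wx) with no intercept; the problem does not fix the parameterization, so treat that as an assumption. The helper names `sigmoid` and `odds` are hypothetical and chosen only for this example.

```python
import math

def sigmoid(u):
    """Logistic sigmoid, 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def odds(w, x):
    """Odds P(Y=1|x) / P(Y=0|x), assuming P(Y=1|x) = sigmoid(w * x)."""
    p = sigmoid(w * x)
    return p / (1.0 - p)

w = 2.0
for x in (0.0, 1.0, 2.0):
    # Ratio of the odds at x + 1 to the odds at x.
    ratio = odds(w, x + 1.0) / odds(w, x)
    print(f"x = {x}: odds ratio for a unit increase in x = {ratio:.4f}")
print(f"exp(w) = {math.exp(w):.4f}")
```

Under this assumed model the odds equal exp(wx), so the printed ratio does not depend on the starting value of x.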

1.2 Logistic vs naive Bayes

Suppose in a binary classification problem, the input variable X = [x1, ..., xM] is M-dimensional and the response variable Y is a class indicator (0 or 1). In this section, you will work in steps to establish a connection between logistic regression and Gaussian naive Bayes.

1. Write down expressions of the class conditional probability for each class, P(Y = 1|X) and P(Y = 0|X), for logistic regression.

2. Using the Bayes rule, derive the posterior probabilities for each class, P(Y = 1|X) and P(Y = 0|X), for naive Bayes.

3. Assuming a Gaussian likelihood function in each of the M dimensions, write down the full likelihood f(X|Y) for naive Bayes.

4. Assuming a uniform prior on the two classes and using the results from parts 2 and 3, derive a full expression for P(Y = 1|X) for naive Bayes.

5. Show that with appropriate manipulation and parameterization, P(Y = 1|X) in naive Bayes from part 4 is equivalent to P(Y = 1|X) for logistic regression in part 1.
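A numerical check can help confirm the algebra in part 5 once you have done it by hand. The sketch below is not the derivation itself: it compares the naive Bayes posterior with a sigmoid of a linear function of x on one test point. It assumes a uniform class prior and, as an extra assumption not stated in the problem, a per-dimension standard deviation shared by both classes. All function names and parameter values are hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def nb_posterior(x, mu0, mu1, sigma):
    """P(Y=1|X=x) for Gaussian naive Bayes with a uniform class prior.

    Assumption: dimension i has one standard deviation sigma[i]
    shared by both classes.
    """
    lik0 = math.prod(gaussian_pdf(xi, m, s) for xi, m, s in zip(x, mu0, sigma))
    lik1 = math.prod(gaussian_pdf(xi, m, s) for xi, m, s in zip(x, mu1, sigma))
    return lik1 / (lik0 + lik1)

def lr_posterior(x, mu0, mu1, sigma):
    """The same posterior written as a sigmoid of w0 + sum_i w_i x_i."""
    w = [(m1 - m0) / s**2 for m0, m1, s in zip(mu0, mu1, sigma)]
    w0 = sum((m0**2 - m1**2) / (2.0 * s**2) for m0, m1, s in zip(mu0, mu1, sigma))
    u = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-u))

# Hypothetical parameters for a 3-dimensional example.
mu0, mu1, sigma = [0.0, 1.0, -1.0], [1.0, 0.5, 0.0], [1.0, 2.0, 0.5]
x = [0.3, -0.2, 0.1]
print(nb_posterior(x, mu0, mu1, sigma), lr_posterior(x, mu0, mu1, sigma))
```

The two printed values should agree up to floating-point rounding; if class-specific variances were allowed instead, quadratic terms in x would remain and the linear form would no longer apply.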

1.3 Loss function

Write down the loss function, or the negative log likelihood, for logistic regression. Denote y as the class indicator, x as the predictor vector, w as the coefficient vector and N as the number of data points. Derive the derivative of the loss function with respect to w (hint: first derive the derivative of the sigmoid function σ(u) with respect to a generic input u.).
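One way to sanity-check a hand-derived gradient is to compare it with finite differences. The sketch below assumes the common form of the loss, L(w) = -Σ_n [y_n log σ(w·x_n) + (1 - y_n) log(1 - σ(w·x_n))], and the gradient Σ_n (σ(w·x_n) - y_n) x_n that follows from σ'(u) = σ(u)(1 - σ(u)). The function names and the toy data are hypothetical.

```python
import math

def sigmoid(u):
    """Logistic sigmoid, 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def nll(w, xs, ys):
    """Negative log likelihood of labels ys given predictors xs and weights w."""
    total = 0.0
    for x, y in zip(xs, ys):
        s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return total

def nll_grad(w, xs, ys):
    """Analytic gradient: sum over data points of (sigmoid(w . x) - y) * x."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            grad[j] += (s - y) * xj
    return grad

def numeric_grad(w, xs, ys, h=1e-6):
    """Central finite differences, used only to check nll_grad."""
    grad = []
    for j in range(len(w)):
        up = list(w)
        up[j] += h
        dn = list(w)
        dn[j] -= h
        grad.append((nll(up, xs, ys) - nll(dn, xs, ys)) / (2.0 * h))
    return grad

# Hypothetical toy data: N = 4 points, 2 features each (first feature is a constant 1).
xs = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
ys = [1, 0, 1, 0]
w = [0.1, -0.3]
print(nll_grad(w, xs, ys))
print(numeric_grad(w, xs, ys))
```

The two printed vectors should match to several decimal places; a mismatch usually points to a sign error or a dropped factor in the derivation.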

Q2. Learning theory

2.1 PAC learning

Imagine yourself as an apprentice chef in a restaurant. Your first task is to figure out how to make a salad. The rules are supposedly simple: 1) you are free to combine any of the ingredients as they are; 2) you can also slice any of the ingredients into two distinct pieces before mixing them. Since you have learnt PAC learning theory, you wonder how much effort you would need to figure out the makeup of a salad.

1. Suppose that a naive chef makes salads following only rule 1. Given N available ingredients, and given that each salad made out of these constitutes a distinct hypothesis, how large would the hypothesis space be? Explain how you arrive at your answer.

2. Suppose that a more experienced chef follows both rules when making a salad. How large is the hypothesis space now? Explain.

3. An experienced chef decides to train you to discern the makeup of a salad by showing you the salad samples he has made. There are 6 available ingredients. If you would like to learn any salad at 0.01 error with probability 99%, how many sample salads would you want to see? Show your workings in clear steps.
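For part 3, the arithmetic can be checked with a short helper. The sketch below assumes the standard sample-complexity bound for a finite hypothesis space and a consistent learner, m ≥ (1/ε)(ln|H| + ln(1/δ)); whether this is the exact bound intended by the course is an assumption. The size of the hypothesis space is left as an input because it depends on your answers to parts 1 and 2; the value used here is an arbitrary placeholder, and the function name is hypothetical.

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Smallest integer m with m >= (ln|H| + ln(1/delta)) / epsilon."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# Placeholder |H| for illustration only; substitute the size you derive.
h_size = 1000
# Error 0.01 with probability 99% corresponds to epsilon = 0.01, delta = 0.01.
print(pac_sample_bound(h_size, epsilon=0.01, delta=0.01))
```

Because the bound grows only logarithmically in |H|, even a large change in the hypothesis count shifts the required number of samples modestly, while halving ε doubles it.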

2.2 VC dimensions

Consider a 2D space, the x1-x2 plane. What is the VC dimension of circles, where points inside a circle are labeled 1 and those outside are labeled 0? Draw an example scenario with the minimal number of points that these circles fail to shatter.

Q3. Mistake bounds

Suppose you have a team of N robot agents and you wish to train them to help you make predictions in life. As a simple start, the prediction will be based on majority votes from these agents. To ensure that they won't fail you in crucial tasks, you went on to analyze their prediction mistakes. Soon you find that their mistakes are curiously related to those of the best agent...

Your strategy of training these agents on a binary classification problem is as follows:

1. Initialize all robots with equal weight wi = 1 for i = 1...N.

2. Since each robot makes a prediction of either class (yi = 0 or 1), the ensemble prediction follows the weighted majority and predicts 1 if

$$\sum_{i=1}^{N} w_i \, I(y_i = 1) \;\ge\; \sum_{i=1}^{N} w_i \, I(y_i = 0) \tag{1}$$

and otherwise 0, where I(·) is the indicator function, which equals 1 if its argument is true and 0 otherwise.

3. If any robot makes a mistake, you penalize it by halving its weight.

4. Go to step 2.
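The four steps above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not a reference implementation: ties in the vote go to class 1, as inequality (1) implies, and every robot that errs in a round is penalized regardless of whether the ensemble erred. The function name and the toy predictions are hypothetical.

```python
def weighted_majority(predictions, truths):
    """Run the weight-halving scheme.

    predictions[t][i] is robot i's 0/1 prediction in round t;
    truths[t] is the correct 0/1 label for round t.
    Returns (final weights, ensemble mistakes, per-robot mistakes).
    """
    n = len(predictions[0])
    weights = [1.0] * n                      # step 1: equal initial weights
    ensemble_mistakes = 0
    robot_mistakes = [0] * n
    for preds, truth in zip(predictions, truths):
        vote1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0   # step 2: weighted majority, ties go to 1
        if guess != truth:
            ensemble_mistakes += 1
        for i, p in enumerate(preds):        # step 3: halve each wrong robot's weight
            if p != truth:
                weights[i] *= 0.5
                robot_mistakes[i] += 1
    return weights, ensemble_mistakes, robot_mistakes

# Hypothetical run with N = 3 robots over 3 rounds.
predictions = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
truths = [1, 1, 0]
weights, m_e, mistakes = weighted_majority(predictions, truths)
print(weights, m_e, mistakes)
```

In this toy run the first robot never errs and keeps weight 1, while the ensemble errs only in the first round, before the weights have had a chance to separate the robots.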

You discover that your best agent makes Mb prediction mistakes while the ensemble agent makes Me mistakes. You are now going to figure out how these numbers are related. In other words, you are going to find an upper bound for Me in terms of Mb.

1. What would the weight of the best agent be after making Mb prediction errors and why? Let's denote it as wb.

2. What is the maximal total ensemble weight (∑wi) after the ensemble has made Me errors? Let's denote it as Wmax.

3. Write a simple inequality that relates part 1 (wb) and 2 (Wmax).

4. Using the inequality in part 3 and your solutions to parts 1 and 2, derive an upper bound for the number of ensemble mistakes Me in terms of the number of mistakes Mb made by the best agent.

Q4. Guess the lean animal

In the animal kingdom, there are lean candidates such as the monkey and chubby ones such as the giant panda. In this section, you will develop a classifier that predicts whether an animal is lean or not given some of its properties.

1. Load "LeanAnimals.mat" in MATLAB. You should see the following: "names" for animals, "properties" for properties of animals, "labs" for indicators of leanness and "D" for the relational matrix (animals vs properties). Sort "labs" into groups of 0s and 1s and sort the rows in D accordingly. How many animals are lean? Plot the sorted matrix D in black and white using the imagesc command. Formulate the problem using logistic regression given that the goal is to predict whether an animal is lean or not. Specify clearly your inputs and outputs, and write down the expressions for the class conditional probabilities.

2. Write a generic logistic regression classifier. Attach your MATLAB code for the LR classifier ONLY, in compact format, in the writeup.

3. Apply your LR classifier to the data and perform leave-one-out cross-validated (LOOCV) predictions on the animals. In other words, at each round you would first train the classifier on 49 animals, and then predict whether the held-out animal is lean or not. Report your classification accuracies for the lean and non-lean classes separately, as percentages.

4. Now, instead of LOOCV, fit your classifier on the entire dataset "D". This should return a single set of weights. List them and interpret them for properties 2 to 6: do they make sense, and why or why not? The annotation for property 1 is missing; can you guess what property it might be given your estimated weight? [Note: no credit will be deducted or granted here, so don't agonize if you are stuck.]

5. Using the "corr" function in MATLAB, compute the correlation coefficients between each of the properties and "labs", and tabulate these. Do they match well with your estimated weights from part 4? Explain briefly why or why not.

6. From your outputs in part 4, you should be able to compute p(lean|animal) for each animal. Rank the animals by sorting their class conditionals in descending order. You should produce a table that consists of two columns: Animal Name (sorted) and Conditional Probability (p(lean|animal)). Is this how you would sort these animals?
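The leave-one-out procedure in part 3 follows a simple pattern that is independent of the classifier. The sketch below is written in Python rather than the MATLAB the assignment asks for, and it uses a deliberately trivial stand-in classifier (nearest class mean on one feature) in place of logistic regression, purely to show the bookkeeping for per-class accuracy. All function names and the toy data are hypothetical.

```python
def loocv_class_accuracy(rows, labels, fit, predict):
    """Leave-one-out cross-validation with per-class accuracy.

    fit(train_rows, train_labels) returns a model;
    predict(model, row) returns a 0/1 label.
    Returns {label: percentage of held-out items of that class predicted correctly}.
    """
    correct = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]        # every item except item i
        train_labels = labels[:i] + labels[i + 1:]
        model = fit(train_rows, train_labels)
        total[labels[i]] += 1
        if predict(model, rows[i]) == labels[i]:
            correct[labels[i]] += 1
    return {c: 100.0 * correct[c] / total[c] for c in total if total[c] > 0}

# Stand-in classifier for demonstration only (NOT logistic regression):
# it predicts the class whose mean of feature 0 is closer to the held-out item.
def fit_nearest_mean(train_rows, train_labels):
    means = {}
    for c in (0, 1):
        vals = [r[0] for r, y in zip(train_rows, train_labels) if y == c]
        means[c] = sum(vals) / len(vals)
    return means

def predict_nearest_mean(model, row):
    return min(model, key=lambda c: abs(row[0] - model[c]))

# Hypothetical toy data with one feature and two well-separated classes.
rows = [[0.1], [0.2], [0.3], [0.9], [1.0], [1.1]]
labels = [0, 0, 0, 1, 1, 1]
print(loocv_class_accuracy(rows, labels, fit_nearest_mean, predict_nearest_mean))
```

Passing `fit` and `predict` as arguments keeps the cross-validation loop separate from the model, so the same loop could wrap a logistic regression fit once you have written one.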

Attachment:- Data.rar
