What is the highest accuracy you are able to achieve

Assignment Help Microeconomics
Reference no: EM131960409

Assignment

1 Lalonde NSW Data

A. Load the Lalonde experimental dataset with the lalonde_data method from the module causalinference.utils. The outcome variable is earnings in 1978, and the covariates are, in order:

Black       Indicator variable; 1 if Black, 0 otherwise.
Hispanic    Indicator variable; 1 if Hispanic, 0 otherwise.
Age         Age in years.
Married     Marital status; 1 if married, 0 otherwise.
Nodegree    Indicator variable; 1 if no degree, 0 otherwise.
Education   Years of education.
E74         Earnings in 1974.
U74         Unemployment status in 1974; 1 if unemployed, 0 otherwise.
E75         Earnings in 1975.
U75         Unemployment status in 1975; 1 if unemployed, 0 otherwise.

Using CausalModel from the module causalinference, provide summary statistics for the outcome variable and the covariates. Which covariate has the largest normalized difference?
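As a rough illustration of what a normalized difference measures, the sketch below uses one common definition: the difference in group means divided by the square root of the average of the two group variances. The helper name normalized_difference and the toy ages are assumptions made for illustration; they are not values from the Lalonde data, and the summary statistics reported by CausalModel should be treated as the reference.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(treated, control):
    """(mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    pooled = (variance(treated) + variance(control)) / 2
    return (mean(treated) - mean(control)) / sqrt(pooled)


# Made-up ages for illustration only (not taken from the Lalonde data).
age_treated = [24, 26, 25, 30, 22]
age_control = [25, 27, 31, 29, 28]

nd_age = normalized_difference(age_treated, age_control)
print(round(nd_age, 3))
```

Because the measure is scaled by the spread of the covariate, it can be compared across covariates with different units, which is what makes "largest normalized difference" a meaningful question.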

B. Estimate the propensity score using the selection algorithm est_propensity_s. In selecting the basic covariates set, specify E74, U74, E75, and U75. What are the additional linear terms and second-order terms that were selected by the algorithm?

C. Trim the sample using trim_s to get rid of observations with extreme propensity score values. What is the cut-off that is selected? How many observations are dropped as a result?
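To make the idea of trimming concrete, here is a minimal sketch that keeps only observations whose propensity score lies in [cutoff, 1 - cutoff]. The fixed cutoff of 0.1, the helper name trim, and the scores are all assumptions for illustration; trim_s chooses its cut-off from the data, so its value will generally differ.

```python
def trim(scores, cutoff):
    """Keep scores inside [cutoff, 1 - cutoff]; return kept scores and number dropped."""
    kept = [p for p in scores if cutoff <= p <= 1 - cutoff]
    return kept, len(scores) - len(kept)


# Hypothetical propensity scores, not estimated from any real data.
pscores = [0.02, 0.15, 0.40, 0.55, 0.93, 0.97]

kept, dropped = trim(pscores, cutoff=0.1)
print(kept, dropped)
```

The number of dropped observations is simply the sample size before trimming minus the sample size after.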

D. Stratify the sample using stratify_s. How many propensity bins are created? Report the summary statistics for each bin.

E. Estimate the average treatment effect using OLS, blocking, and matching. For matching, set the number of matches to 2 and adjust for bias. How much do the estimates differ?
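For intuition about the blocking estimator, the sketch below shows its simplest form: take the treated-minus-control difference in mean outcomes inside each propensity bin, then average those differences weighted by bin size. This version omits the within-bin regression adjustment that a library implementation may apply, and the bins and outcomes are made up for illustration.

```python
from statistics import mean


def blocking_ate(bins):
    """Size-weighted average of within-bin mean differences (no regression adjustment)."""
    total = sum(len(treated) + len(control) for treated, control in bins)
    ate = 0.0
    for treated, control in bins:
        weight = (len(treated) + len(control)) / total
        ate += weight * (mean(treated) - mean(control))
    return ate


# Hypothetical outcomes per bin: (treated outcomes, control outcomes).
toy_bins = [
    ([10, 12], [8, 9, 10]),
    ([20, 22, 24], [18, 19]),
]

ate = blocking_ate(toy_bins)
print(ate)
```

Comparing within bins means treated and control units with similar propensity scores are contrasted with each other, which is why this estimate can differ from a plain OLS or matching estimate.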

2 Document Classification

A. From the module sklearn.datasets, load the training data set using the method fetch_20newsgroups. This dataset comprises around 18,000 newsgroup posts on 20 topics. Print out a couple of sample posts and list all the topic names.

B. Convert the posts (blobs of text) into bag-of-words vectors. What is the dimensionality of these vectors? That is, what is the number of distinct words that have appeared in this data set?
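A tiny hand-rolled sketch of the bag-of-words idea follows. It assumes lowercase whitespace tokenization and two made-up posts; a real vectorizer such as sklearn's CountVectorizer tokenizes differently, so the vocabulary size it reports on the newsgroups data will not follow from this toy rule.

```python
from collections import Counter

# Two made-up posts standing in for newsgroup messages.
posts = ["the cat sat", "the dog sat down"]

# Tokenize naively: lowercase and split on whitespace (an assumption for this sketch).
tokenized = [post.lower().split() for post in posts]

# The vocabulary is the sorted set of distinct words; its size is the vector dimensionality.
vocabulary = sorted({word for tokens in tokenized for word in tokens})

# Each post becomes a vector of word counts, one entry per vocabulary word.
vectors = []
for tokens in tokenized:
    counts = Counter(tokens)
    vectors.append([counts[word] for word in vocabulary])

print(len(vocabulary))
print(vectors)
```

Every vector has the same length as the vocabulary, so the dimensionality asked for in this step is the count of distinct tokens across all posts.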

C. Use your favorite dimensionality reduction technique to compress these vectors into ones of K = 30 dimensions.

D. Use your favorite supervised learning model to train a model that tries to predict the topic of a post from the vectorized representation of the post you obtained in the previous step.

E. Use the test data to tune your model. Make sure to include K as a hyperparameter as well. Use accuracy_score from sklearn.metrics as your evaluation metric. What is the highest accuracy you are able to achieve?
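The selection logic in this step can be sketched without any model: accuracy is the fraction of posts whose predicted topic equals the true topic, and tuning K means keeping the K whose predictions score highest. The helper accuracy below is a stand-in written for illustration, and the labels and per-K predictions are made up rather than produced by a trained classifier.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up true topics and hypothetical predictions for two candidate values of K.
y_true = ["sci.space", "rec.autos", "sci.space", "talk.politics.misc"]
preds_by_k = {
    10: ["sci.space", "sci.space", "rec.autos", "talk.politics.misc"],
    30: ["sci.space", "rec.autos", "rec.autos", "talk.politics.misc"],
}

# Score every candidate K and keep the one with the highest accuracy.
scores = {k: accuracy(y_true, preds) for k, preds in preds_by_k.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In the actual exercise each entry of preds_by_k would come from refitting the dimensionality reduction and the classifier at that value of K.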




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd