We will now look at changes in the income distribution of Canadians between 1991 and 2001. Use the census data for these years provided on the course web page. Download the data into a directory on your computer.

Each row in the dataset corresponds to a person. Each column corresponds to a variable (for example, each entry in the first column states the province where the person represented in that row lives). The columns are in the following order:

Prov = province of residence

Ecfamsz = size of the economic family (number of people living under the same roof)

Age = the age of the person responding to the survey

Totinc = total annual income from all sources

Wages = total annual income from wages and salaries

Selfempl = total annual income from self-employment

Totgvinc = total annual income from government transfers (e.g., unemployment insurance, income assistance, public pensions)

Year = census year

Female = 1 if the individual is female, 0 if male

Canborn = 1 if the person was born in Canada, 0 if born elsewhere

Married = 1 if the individual is married, 0 if single, separated or divorced

Lesshs = 1 if the person's highest education level is at most some time in high school without graduating, 0 otherwise

Hsgr = 1 if the person's highest education level is high school graduation, 0 otherwise

Postsec = 1 if the person's highest education level is some postsecondary but less than a BA (largely, colleges and trades), 0 otherwise

Ba = 1 if the person's highest education level is a BA, 0 otherwise

Grad = 1 if the person's highest education level is an MA or a PhD, 0 otherwise

Fulltime = 1 if the person worked mainly full time in the previous year, 0 otherwise

Parttime = 1 if the person worked mainly part time in the previous year, 0 otherwise

a. Construct frequency tables and plot histograms to describe the frequency distribution of income for each year, using deciles to define the bins. Calculate the kernel density estimate for each year.
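
As a rough illustration of part a, the sketch below builds decile-based bins, counts observations per bin, and evaluates a Gaussian kernel density estimate on a grid. It runs on synthetic lognormal numbers standing in for Totinc, not on the census file. The helper names (decile_edges, frequency_table, gaussian_kde), the Gaussian kernel, and the normal-reference bandwidth are all assumptions made for the example; other kernels and bandwidths are equally legitimate.

```python
import math
import random
import statistics

def decile_edges(values):
    """Return 11 bin edges: the minimum, the nine decile cut points, the maximum."""
    cuts = statistics.quantiles(values, n=10, method="inclusive")
    return [min(values)] + cuts + [max(values)]

def frequency_table(values, edges):
    """Count observations per bin; bins are [lo, hi) except the last, which is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts

def gaussian_kde(values, grid, bandwidth):
    """Evaluate f(x) = 1/(n*h) * sum_i K((x - x_i)/h) with a Gaussian kernel K."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]

# Synthetic, right-skewed stand-in for Totinc in one census year (not real data).
random.seed(0)
income = [random.lognormvariate(10, 0.8) for _ in range(500)]

edges = decile_edges(income)
counts = frequency_table(income, edges)

# Normal-reference rule-of-thumb bandwidth (an assumed default, not prescribed here).
h = 1.06 * statistics.stdev(income) * len(income) ** (-1 / 5)
grid = [edges[0] + i * (edges[-1] - edges[0]) / 50 for i in range(51)]
density = gaussian_kde(income, grid, h)
```

Because the bins are deciles, each holds roughly a tenth of the sample, so the information is in the bin widths: narrow bins mark where incomes are concentrated. Repeating the steps for each census year gives the two tables and density curves to compare.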

b. Produce the main statistics to describe the distribution of total income (mean, variance, standard deviation and skewness).
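
The statistics in part b can be computed along the following lines. This is a minimal sketch on a small made-up sample; the function name describe is hypothetical, and the skewness used is the moment-based coefficient m3 / m2^(3/2), which is one of several common definitions.

```python
import math
import statistics

def describe(values):
    """Return the mean, sample variance, standard deviation and skewness."""
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.variance(values)  # sample variance, divides by n - 1
    # Moment-based skewness from the second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "mean": mean,
        "variance": var,
        "std_dev": math.sqrt(var),
        "skewness": m3 / m2 ** 1.5,
    }

# Made-up incomes with one large value, so the right tail is long.
sample = [10.0, 20.0, 20.0, 30.0, 120.0]
stats = describe(sample)
```

A positive skewness, as in this sample, indicates a long right tail, which is the usual shape of an income distribution.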

c. What can you say about the changes in total income between the two years? Have incomes improved generally? Has inequality gone up or down?

d. Do parts a) to c) again using the wage distribution.

e. How does the distribution of wages compare to the distribution of total income?

f. Construct a variable for years of education and for experience and experience squared. Regress the natural logarithm of wages on gender, place of residence (you will have to construct the dummies for each province), marital status, Canadian born status, full time job, years of education, experience and experience squared. Interpret the results. What is the most important source of wage differentials according to this regression?
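
One possible way to construct the variables in part f is sketched below. The years assigned to each education level are illustrative assumptions, not values given in this assignment, and the province labels are hypothetical because the coding of Prov is not described here. Experience is taken as potential experience, age minus years of education minus 6, floored at zero.

```python
# Hypothetical mapping from the education dummies to years of schooling.
# These year counts are assumptions for illustration only.
YEARS_BY_LEVEL = {"Lesshs": 10, "Hsgr": 12, "Postsec": 14, "Ba": 16, "Grad": 18}

def add_human_capital(person):
    """Return a copy of the record with yrsed, exper and exper_sq added."""
    level = next(name for name in YEARS_BY_LEVEL if person[name] == 1)
    yrsed = YEARS_BY_LEVEL[level]
    # Potential experience: age - schooling - 6, never negative.
    exper = max(person["Age"] - yrsed - 6, 0)
    return {**person, "yrsed": yrsed, "exper": exper, "exper_sq": exper ** 2}

def province_dummies(prov, provinces, base):
    """One 0/1 indicator per province, omitting the base category."""
    return {f"prov_{p}": int(prov == p) for p in provinces if p != base}

# Hypothetical record; "ON", "QC" and "BC" are placeholder province labels.
person = {"Age": 40, "Prov": "ON",
          "Lesshs": 0, "Hsgr": 0, "Postsec": 0, "Ba": 1, "Grad": 0}
row = add_human_capital(person)
dummies = province_dummies(person["Prov"], provinces=["ON", "QC", "BC"], base="ON")
```

Omitting one province as the base category avoids perfect collinearity with the intercept; each remaining province coefficient is then read relative to that base.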

g. Run the above regression for males and females separately. Does this change the estimates you obtained before? What about running separate regressions for Canadian-born males and foreign-born males?

h. Perform an Oaxaca decomposition on the wage differences of males and females. First use a very simple regression where you only include marital status, age and province of residence and compute the Oaxaca decomposition. Next substitute years of education for age and add experience, experience squared, Canadian born status and full time job and compute the Oaxaca decomposition again. Have the explained and unexplained components of the wage difference changed much? Discuss.
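
The decomposition in part h can be sketched as follows, assuming NumPy is available. This is a two-fold decomposition that takes the male coefficients as the reference, which is only one of several conventions: the gap in mean log wages is split into (xbar_m - xbar_f)' b_m (explained by characteristics) and xbar_f' (b_m - b_f) (unexplained, differences in returns). The data are simulated, and the names ols, oaxaca and simulate are hypothetical helpers written for this example.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients; X must already contain a column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def oaxaca(X_m, y_m, X_f, y_f):
    """Two-fold decomposition using the male coefficients as the reference."""
    b_m, b_f = ols(X_m, y_m), ols(X_f, y_f)
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    gap = y_m.mean() - y_f.mean()
    explained = (xbar_m - xbar_f) @ b_m    # differences in characteristics
    unexplained = xbar_f @ (b_m - b_f)     # differences in coefficients
    return gap, explained, unexplained

rng = np.random.default_rng(0)

def simulate(n, mean_educ, intercept, slope):
    """Simulated log wages that depend on one regressor (years of education)."""
    educ = rng.normal(mean_educ, 2.0, n)
    lwage = intercept + slope * educ + rng.normal(0.0, 0.3, n)
    X = np.column_stack([np.ones(n), educ])
    return X, lwage

# Simulated groups; the parameter values are arbitrary, not estimates.
X_m, y_m = simulate(400, 13.0, 1.0, 0.08)
X_f, y_f = simulate(400, 12.5, 0.9, 0.08)
gap, explained, unexplained = oaxaca(X_m, y_m, X_f, y_f)
```

Because each regression includes an intercept, the fitted line passes through the group means, so the two components add up exactly to the raw gap. Using the female coefficients as the reference instead generally changes the split, which is worth keeping in mind when discussing the results.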

We will use the male panel data to understand the differences between cohort and age/entry effects.

Quebec = individual lives in Quebec (same for the other provinces)

Canada = individual was born in Canada (same for other places of birth, USA, UK, Europe, Africa, Asia)

samer = individual was born in South America

othpob = individual was born in another place

IMM55 = individual immigrated to Canada before 1955

IMM5660 = individual immigrated to Canada between 1956 and 1960 (similar for the other indicators IMM6165, IMM6670, IMM7175, IMM7680, IMM8185, IMM8690, IMM9195, IMM199600)

Ysm = years since migration (number of years since the individual immigrated to Canada)

imm = immigrant status indicator. 1 = born outside Canada

year = census year (1981-2001)

vancouvr = Individual lives in census metropolitan area of Vancouver (similar for Toronto and Montreal)

othercma = Individual lives in other census metropolitan area

married = Individual is married or living common law

yrsed = number of years of education

exp = number of years of experience (age - years of education - 6)

fulltime = individual works full time

parttime = individual works part time

fullyear = individual works more than 40 weeks a year

lwg = natural log of weekly wages

age = age of individual

a. Run a regression of the natural logarithm of wages on province, census metropolitan area (the dummies are already created), marital status, immigrant status, full time job, years of education and age. Add an interaction between age and census years. The interaction coefficients provide the age cohort effect. Interpret these numbers.

b. Let us now look at the entry effect for immigrants. Run a regression of log wages on province, census metropolitan area, marital status, full time job, years of education, experience and experience squared, immigrant status, and years since migration and its square (ysm and ysm2). Interpret the coefficients. Now substitute the immigrant status variable with the cohort dummies (IMM55, IMM5660, ..., IMM9600). Interpret the coefficients and discuss the entry effects for immigrants. Do any coefficients change when you include the entry effects?
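
The extra regressors used in these two parts, the age-by-census-year interactions and the squared years-since-migration term, could be built as in the sketch below. It takes one reading of "interaction between age and census years", namely age multiplied by a dummy for each census year other than a base year. The list of census years and the function name interaction_terms are assumptions; the years actually present in the panel may differ.

```python
def interaction_terms(age, year, ysm, census_years, base_year):
    """Build age-by-census-year interactions and squared years since migration."""
    terms = {
        f"age_x_{y}": age * int(year == y)
        for y in census_years
        if y != base_year  # the base year is absorbed by the main age effect
    }
    terms["ysm2"] = ysm ** 2
    return terms

# Assumed census years and a hypothetical record, for illustration only.
CENSUS_YEARS = [1981, 1986, 1991, 1996, 2001]
terms = interaction_terms(age=35, year=1991, ysm=10,
                          census_years=CENSUS_YEARS, base_year=1981)
```

With this construction, the coefficient on each interaction measures how the age profile of wages in that census year differs from the profile in the base year.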