10 Presenting Results
So far, we have discussed how to use regression, descriptive statistics, and so on to describe the relationships between quantitative variables and to estimate the impacts of policy. However, none of this is meaningful if we cannot present our findings clearly. This chapter covers the presentation of results using tables and graphics.
10.1 User Written Commands in Stata
Stata is powerful econometrics software in its own right. However, users often need to move beyond Stata's built-in commands and use commands written by other users. One such command is esttab, which ships with the estout package. To install it, type ssc install estout, replace into Stata's Command window. esttab is a wrapper for estout.
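A minimal sketch of a guarded install, assuming an internet connection and that SSC is reachable. It checks whether esttab is already on the ado-path and installs estout only if it is missing. The capture/which pattern is one common idiom, not the only way to do this.

```stata
// Check whether esttab is already installed; capture suppresses the error
capture which esttab
// _rc is nonzero when the command was not found
if _rc {
    ssc install estout, replace
}
// Confirm the installation by printing where esttab lives
which esttab
```

Placing this at the top of a do-file lets collaborators run the file without installing the package by hand first.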
10.2 Using esttab
Consider the regression model
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
regress price i.foreign weight
which returns the output
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     35.35
       Model |   316859273         2   158429637   Prob > F        =    0.0000
    Residual |   318206123        71  4481776.38   R-squared       =    0.4989
-------------+----------------------------------   Adj R-squared   =    0.4848
       Total |   635065396        73  8699525.97   Root MSE        =      2117

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
    Foreign  |   3637.001    668.583     5.44   0.000     2303.885    4970.118
      weight |   3.320737   .3958784     8.39   0.000     2.531378    4.110096
       _cons |  -4942.844   1345.591    -3.67   0.000    -7625.876   -2259.812
------------------------------------------------------------------------------
A key question, though, is how to present these results. The answer is NOT simply to paste Stata's raw output. Instead, we can use esttab. First, we store the model using the eststo prefix. We then use esttab to present the results in a professional manner.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
cls
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1
produces the output
----------------------------
                      (1)
                    price
----------------------------
0.foreign               0
                      (.)
1.foreign          3637.0***
                   (5.44)
weight              3.321***
                   (8.39)
_cons             -4942.8***
                  (-3.67)
----------------------------
N                      74
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Much more organized than the default output, no? We can do more. Let's replace the variable names on the left-hand side of the table with their labels.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1, l
The l option (short for label) uses the variable labels instead of the variable names. It returns the output
------------------------------------
                              (1)
                            Price
------------------------------------
Domestic                        0
                              (.)
Foreign                    3637.0***
                           (5.44)
Weight                      3.321***
                           (8.39)
Constant                  -4942.8***
                          (-3.67)
------------------------------------
Observations                   74
------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
What if we wish to report confidence intervals instead of the default t statistics?
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1, l ci
returns the output
----------------------------------------------
                                        (1)
                                      Price
----------------------------------------------
Domestic                                  0
                                      [0,0]
Foreign                              3637.0***
                            [2303.9,4970.1]
Weight                                3.321***
                              [2.531,4.110]
Constant                            -4942.8***
                          [-7625.9,-2259.8]
----------------------------------------------
Observations                             74
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
What if we wish to omit the constant from the table?
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1, l ci nocon
returns the output
----------------------------------------------
                                        (1)
                                      Price
----------------------------------------------
Domestic                                  0
                                      [0,0]
Foreign                              3637.0***
                            [2303.9,4970.1]
Weight                                3.321***
                              [2.531,4.110]
----------------------------------------------
Observations                             74
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
All of this is simply a start. It still does not address the important part: how do we get the table into a paper? We can save the output to a rich text file, which can be opened in Microsoft Word.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1 using table1.rtf, l ci nocon replace
returns the output
esttab m1 using table1.rtf, l ci nocon replace
(output written to table1.rtf)
We can click the table1.rtf link to open the file, and the table can then be copied and pasted into Word.
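Storing models with eststo pays off most when several models go into one table. The sketch below is a hedged illustration, assuming estout is installed: it stores two specifications and writes them side by side. The file name table2.rtf and the column titles passed to mtitles() are illustrative choices, not part of the original example.

```stata
// Using Stata's pre-installed car data
sysuse auto, clear
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
eststo clear
// m1: origin only; m2: origin plus weight
qui eststo m1: regress price i.foreign
qui eststo m2: regress price i.foreign weight
// One column per stored model; nobase drops the omitted base category
esttab m1 m2 using table2.rtf, label ci noconstant nobase replace ///
    mtitles("Origin only" "Origin and weight")
```

Each stored model gets its own column, so a reader can see how the coefficient on Foreign changes once weight is controlled for.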
10.3 DID with esttab
Let’s do an example using Difference-in-Differences.
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
Again, the “California” coefficient in the table below is the average difference between California and the control group in the pre-policy period. We can verify this by comparing the two pre-period means:
. su cigsale if year < 1989 & id ==3
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |         19    116.2105    11.68303       90.1        128
. su cigsale if year < 1989 & id !=3
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2
“Post” is the coefficient for the case where California = 0 and post = 1, that is, the difference between the control group's average outcome in the post period and its average in the pre period. This can be confirmed by
. su cigsale if year < 1989 & id !=3
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2
. su cigsale if year >= 1989 & id !=3
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        456    102.0581    23.58887       40.7      186.8
DID is the ATT of interest. It is the difference between the average outcomes of the treated group and the control group in the post period, minus the same difference in the pre period. See how the coefficient is about -27? We can get the same result like this
su cigsale if year < 1989 & id==3, mean
loc y1pre = r(mean)
su cigsale if year < 1989 & id != 3, mean
loc Y0pre = r(mean)
su cigsale if year >= 1989 & id==3, mean
loc y1post = r(mean)
su cigsale if year >= 1989 & id != 3, mean
loc Y0post = r(mean)
di (`y1post'-`Y0post') - (`y1pre'-`Y0pre')
We can get this result with regression:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
regress cigsale i.treated##i.post, vce(cl id)
//mat l r(table)
lab var cigsale "Cigarette Consumption"
lab var treated "California"
lab var post "Post"
cls
esttab, ci ///
l /// Uses variable labels
nobase /// from estout, whose options you may use with esttab
nocons /// omits the constant
ren(1.treated#1.post DID) /// renames the interaction term
nostar /// no stars for statistical significance
title(Table 1)
which returns the output
-------------------------------------------
                                        (1)
                      Cigarette Consumption
-------------------------------------------
California=1                         -14.36
                            [-24.40,-4.315]
Post=1                               -28.51
                            [-34.12,-22.90]
DID                                  -27.35
                            [-32.96,-21.74]
-------------------------------------------
Observations                           1209
-------------------------------------------
95% confidence intervals in brackets
Again, we can save this table to a rich text file as we did above and use it in Word or Google Drive.
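As a worked check, the three coefficients in the table can be recovered by hand from the group means reported earlier. The superscript notation below is an assumption made for this illustration. The California post-period mean of roughly 60.35 is not printed in the text; it is backed out here from the DID coefficient, so treat it as implied rather than reported.

\[
\begin{aligned}
\text{California} &= \bar{y}_{1}^{\text{pre}} - \bar{y}_{0}^{\text{pre}} = 116.2105 - 130.5695 = -14.3590 \\
\text{Post} &= \bar{y}_{0}^{\text{post}} - \bar{y}_{0}^{\text{pre}} = 102.0581 - 130.5695 = -28.5114 \\
\text{DID} &= \left(\bar{y}_{1}^{\text{post}} - \bar{y}_{0}^{\text{post}}\right) - \left(\bar{y}_{1}^{\text{pre}} - \bar{y}_{0}^{\text{pre}}\right) \approx (60.35 - 102.0581) - (-14.3590) \approx -27.35
\end{aligned}
\]

Here the subscript 1 denotes California and 0 denotes the control group, matching the locals y1pre, Y0pre, y1post, and Y0post used in the code.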
10.4 Graphics
So far we’ve used graphics in the text, and we’ve even made some graphs on the fly. But how can we present them so that they can be understood? Again, NOT by simply posting screenshots of graphs. Instead, Stata can export graphs in a form we can use in documents. Say we wish to plot the results of the DID model above, that is, the DID predictions versus California's observed values.
Before we do that, though, a short word on Stata’s stored results. Most commands in Stata store results after you run them. Remember how we discussed that the DID intercept (the alpha parameter that shifts our control group average) is just \(\hat\alpha_{\mathcal{N}_0} \coloneqq T_{1}^{-1}\sum_{t \in \mathcal{T}_{1}}\left(y_{1t}-\bar{y}_{\mathcal{N}_0t}\right)\)? Well, we can obtain this with regression as we did above. When you run the regress command, you get the standard output table, but Stata ALSO stores a table of all the underlying results in the r(table) matrix. So,
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
mat l r(table)
returns the matrix
r(table)[9,9]
                0b.           1.          0b.           1.  0b.treated#  0b.treated#  1o.treated#   1.treated#
            treated      treated         post         post      0b.post      1o.post      0b.post       1.post        _cons
     b            0   -14.359003            0   -28.511415            0            0            0   -27.349111    130.56953
    se            .    4.9612676            .    2.7696277            .            .            .    2.7696277    4.9612676
     t            .   -2.8942206            .   -10.294313            .            .            .   -9.8746526    26.317776
pvalue            .    .00626458            .    1.515e-12            .            .            .    4.842e-12    5.320e-26
    ll            .   -24.402564            .   -34.118233            .            .            .   -32.955929    120.52597
    ul            .   -4.3154416            .   -22.904597            .            .            .   -21.742293    140.61309
    df           38           38           38           38           38           38           38           38           38
  crit    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942
 eform            0            0            0            0            0            0            0            0            0
In this case, we care about row 1, column 2: the coefficient on 1.treated, about -14.36. This is the alpha parameter, which we can also compute directly:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
order cigsale3, a(year)
tempvar ymeangood ytilde ymean cf te rss tss
egen `ymean' = rowmean(cigsale1-cigsale39)
g `ytilde' = cigsale3-`ymean'
reg `ytilde' if year < 1989
loc alpha = e(b)[1,1] // here is the intercept shift
di `alpha'
So now, we can simply do
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
// Here is our alpha
loc alpha = r(table)[1,2]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
Okay, this returns our observed values versus our predicted values. On one hand we have California, and on the other we have California without Proposition 99. The way to plot this, from start to finish, is
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
tsset year
// sort by year
twoway (tsline cigsale3, ///
lcolor(black) lwidth(medthick)) ///
(tsline DID_Cali, recast(connected) ///
mcolor(gs11) msize(small) msymbol(smsquare) lcolor(gs11) lpattern(solid) lwidth(thin)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
xti(Year)
We can also save this plot, like so:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
tsset year
// sort by year
twoway (tsline cigsale3, ///
lcolor(black) lwidth(medthick)) ///
(tsline DID_Cali, recast(connected) ///
mcolor(gs11) msize(small) msymbol(smsquare) lcolor(gs11) lpattern(solid) lwidth(thin)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
xti(Year) note(Prop 99 took effect in 1989)
graph export "DIDCali.png", as(png) name("Graph") replace
where, after the plot is created, we export it to the current working directory under the name “DIDCali.png”. This plot simply displays the observed values versus \(\hat{y}_{1t}^0=\hat\alpha_{\mathcal{N}_0}+\bar{y}_{\mathcal{N}_0t}\), or the counterfactual.
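A hedged sketch of a few export variations that are often useful for papers. It uses a throwaway scatterplot from the auto data so that it runs on its own; the graph name demo and the file names are illustrative assumptions. The width() option sets the pixel width of raster formats such as PNG, while PDF is a vector format that scales without loss. Which formats are available can depend on your Stata version and platform.

```stata
// A stand-in graph so the example is self-contained
sysuse auto, clear
twoway scatter price weight, name(demo, replace)
// Higher-resolution PNG: width() is measured in pixels
graph export "demo.png", as(png) name(demo) width(2400) replace
// Vector output, which tends to look sharper in print
graph export "demo.pdf", as(pdf) name(demo) replace
```

Naming the graph explicitly with name() avoids exporting the wrong window when several graphs are open.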
We may also plot the treatment effect itself, that is, the difference over time between the treated unit’s observed values and the counterfactual:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
generate te = cigsale3 - DID_Cali
// treatment effect at time t
generate eventtime = year - 1989
twoway (connected te eventtime, ///
lcolor(black) lwidth(medthick) mcol(blue)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(0, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Treatment Effect") ///
xti(Years Relative to 1989) note(Time point 0 is 1989)
We can also export this plot as a .png file.
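As a hedged check on the treatment-effect series, the average of the post-period gaps should reproduce the DID coefficient of about -27.35 from the regression table. The sketch below rebuilds the series from scratch. The variable names ctrl_mean, gap, and te_check are illustrative assumptions, not names used elsewhere in the chapter.

```stata
clear *
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// Move California (id 3) next to year so the range below skips it
order cigsale3, a(year)
// Control group average in each year
egen ctrl_mean = rowmean(cigsale1-cigsale39)
// Raw gap between California and the control average
g gap = cigsale3 - ctrl_mean
// alpha is the average pre-period gap
su gap if year < 1989, mean
loc alpha = r(mean)
// Treatment effect at time t, net of the pre-period shift
g te_check = gap - `alpha'
// Average post-period effect, to compare with the DID coefficient
su te_check if year >= 1989, mean
di "Average post-period effect: " %6.2f r(mean)
```

Because the panel is balanced, averaging the yearly gaps over the post period corresponds to the same difference of group means that the regression interaction term estimates.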
11 Summary
Personally, I prefer to let graphics tell my main story. Graphics are a powerful, simple way of communicating basic results.