10 Presenting Results
So far, we have discussed how to use regression, descriptive statistics, and so on to expound on the relationship between quantitative variables as well as estimate the impacts of policy. However, none of this is meaningful if we cannot present our findings in the correct way. This chapter covers the presentation of results, using tables and graphics.
10.1 User Written Commands in Stata
Stata is powerful econometrics software in its own right. However, users often need to move beyond Stata's built-in commands and use commands written by other users. One such command is esttab. To use esttab, install the Stata package estout by typing ssc install estout, replace into Stata's Command window. esttab is a wrapper for estout.
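If you are not sure whether the package is already on your machine, a small check like the following avoids reinstalling it every time a do-file runs. This is only a sketch: it relies on the built-in which command, which returns a nonzero return code when a command cannot be found.

```stata
// Install estout (which provides esttab and eststo) only if it is missing
capture which esttab
if _rc {
    ssc install estout, replace
}
// Show where the installed command lives
which esttab
```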
10.2 Using esttab
Consider the regression model
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
regress price i.foreign weight
which returns the output
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     35.35
       Model |   316859273         2   158429637   Prob > F        =    0.0000
    Residual |   318206123        71  4481776.38   R-squared       =    0.4989
-------------+----------------------------------   Adj R-squared   =    0.4848
       Total |   635065396        73  8699525.97   Root MSE        =      2117

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
    Foreign  |   3637.001    668.583     5.44   0.000     2303.885    4970.118
      weight |   3.320737   .3958784     8.39   0.000     2.531378    4.110096
       _cons |  -4942.844   1345.591    -3.67   0.000    -7625.876   -2259.812
------------------------------------------------------------------------------
A key question, though, is how to present these results. The answer is NOT to paste the raw Stata output. Instead, we can use esttab. First, we store the model by using a prefix command called eststo. We then use esttab to present the results in a professional manner.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
cls
// drop any previously stored estimates
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
esttab m1
produces the output
----------------------------
(1)
price
----------------------------
0.foreign 0
(.)
1.foreign 3637.0***
(5.44)
weight 3.321***
(8.39)
_cons -4942.8***
(-3.67)
----------------------------
N 74
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Much more organized than the default output, no? We can do more. Let’s modify the variable names on the left-hand side of our table.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
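lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
// drop any previously stored estimates
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
// label: show variable labels instead of variable names
esttab m1, label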
With the label option, esttab displays the variable labels instead of the variable names. It returns the output
------------------------------------
(1)
Price
------------------------------------
Domestic 0
(.)
Foreign 3637.0***
(5.44)
Weight 3.321***
(8.39)
Constant -4942.8***
(-3.67)
------------------------------------
Observations 74
------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
What if we wish to use the confidence interval instead of the default t-statistics?
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
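lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
// drop any previously stored estimates
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
// ci: report confidence intervals instead of t-statistics
esttab m1, label ci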
returns the output
----------------------------------------------
(1)
Price
----------------------------------------------
Domestic 0
[0,0]
Foreign 3637.0***
[2303.9,4970.1]
Weight 3.321***
[2.531,4.110]
Constant -4942.8***
[-7625.9,-2259.8]
----------------------------------------------
Observations 74
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
What if we wish to omit the constant from the table?
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
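lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
// drop any previously stored estimates
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
// noconstant: leave the intercept out of the table
esttab m1, label ci noconstant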
returns the output
----------------------------------------------
(1)
Price
----------------------------------------------
Domestic 0
[0,0]
Foreign 3637.0***
[2303.9,4970.1]
Weight 3.321***
[2.531,4.110]
----------------------------------------------
Observations 74
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
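The same options carry over when several models sit side by side, which is how regression tables usually appear in papers. The sketch below is illustrative only: the second model (adding mpg) and the column titles are assumptions made for the example, not models discussed in this chapter.

```stata
sysuse auto, clear
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
lab var mpg "Mileage (mpg)"
eststo clear
// m1: the model from the text
qui eststo m1: regress price i.foreign weight
// m2: hypothetical extension that also controls for mileage
qui eststo m2: regress price i.foreign weight mpg
// one column per model, standard errors in parentheses, R-squared below
esttab m1 m2, label se r2 nobaselevels mtitles("Baseline" "With mpg")
```

Each stored model becomes one column, and a variable missing from a model is simply left blank in that column.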
All of this is simply a start. But it still does not address the important part: how do we get the table into a paper? We can save the output to a rich text file, which opens in Microsoft Word.
clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price
// price is in dollars, weight is in pounds
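lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
cls
// drop any previously stored estimates
eststo clear
qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored
// using: write the table to a file; replace: overwrite it if it exists
esttab m1 using table1.rtf, label ci noconstant replace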
returns the output
(output written to table1.rtf)
We can click the table1.rtf link in the Results window to open the file, and the table may then be copied and pasted into Word.
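esttab picks the output format from the file extension, so the same call can target other tools. A minimal sketch, assuming you want a LaTeX and a spreadsheet version of the same table; the file names table1.tex and table1.csv are placeholders chosen for this example.

```stata
sysuse auto, clear
lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"
eststo clear
qui eststo m1: regress price i.foreign weight
// LaTeX table with booktabs-style rules
esttab m1 using table1.tex, label ci noconstant booktabs replace
// comma-separated file that opens in a spreadsheet
esttab m1 using table1.csv, label ci noconstant replace
```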
10.3 DID with esttab
Let’s do an example using Difference-in-Differences.
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
Again, the “California” coefficient in the table below is the average difference between California and the control group in the pre-policy period. We can confirm this with
. su cigsale if year < 1989 & id ==3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |         19    116.2105    11.68303       90.1        128

. su cigsale if year < 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2
“Post” is the coefficient for the case where California = 0 and post = 1: the average change in the control group’s outcomes from the pre period to the post period. This can be confirmed by
. su cigsale if year < 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2

. su cigsale if year >= 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        456    102.0581    23.58887       40.7      186.8
DID is the ATT of interest. It is the difference between the treated group’s and the control group’s average outcomes in the post period, minus the same difference in the pre period. See how the coefficient is about -27? We can get the same result like this
// treated group, pre period
su cigsale if year < 1989 & id==3, mean
loc y1pre = r(mean)
// control group, pre period
su cigsale if year < 1989 & id != 3, mean
loc Y0pre = r(mean)
// treated group, post period
su cigsale if year >= 1989 & id==3, mean
loc y1post = r(mean)
// control group, post period
su cigsale if year >= 1989 & id != 3, mean
loc Y0post = r(mean)
di (`y1post'-`Y0post') - (`y1pre'-`Y0pre')
We can get this result with regression:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
regress cigsale i.treated##i.post, vce(cl id)
//mat l r(table)
lab var cigsale "Cigarette Consumption"
lab var treated "California"
lab var post "Post"
cls
esttab, l /// uses variable labels
ci /// reports confidence intervals
nobase nocons /// omits base levels and the constant (estout options also work with esttab)
ren(1.treated#1.post DID) /// renames the interaction term
nostar title(Table 1) // no stars for statistical significance
which returns the output
-------------------------------------------
(1)
Cigarette Consumption
-------------------------------------------
California=1 -14.36
[-24.40,-4.315]
Post=1 -28.51
[-34.12,-22.90]
DID -27.35
[-32.96,-21.74]
-------------------------------------------
Observations 1209
-------------------------------------------
95% confidence intervals in brackets
Again, we can save this table to a rich text file as we did above and use it in Word/drive.
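As a check on how the coefficients relate to the group means, the arithmetic can be written out using the summary statistics reported earlier. The California post-period mean, written \(\bar{y}_{1,\text{post}}\) here, is not printed in the text; the value below is only what the reported coefficients imply.

\[
\begin{aligned}
\hat\beta_{\text{California}} &= 116.2105 - 130.5695 = -14.3590 \\
\hat\beta_{\text{Post}} &= 102.0581 - 130.5695 = -28.5114 \\
\hat\beta_{\text{DID}} &= \left(\bar{y}_{1,\text{post}} - 102.0581\right) - \left(116.2105 - 130.5695\right) = -27.3491 \\
\Rightarrow \quad \bar{y}_{1,\text{post}} &= 102.0581 - 14.3590 - 27.3491 \approx 60.35
\end{aligned}
\]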
10.4 Graphics
So far we’ve used graphics in the text, and we’ve even made some graphs on the fly. But how can we present them so that they can be understood? Again, NOT by simply posting screenshots of graphs. Instead, Stata can produce graphs that we can use directly in documents. Say we wish to plot the results of the DID model above, that is, the DID predictions versus California.
Before we do that though, a short word on Stata’s stored results. Most commands in Stata store results after you run them. Remember how we discussed that the DID intercept (the alpha parameter that shifts our control group average) is just \(\hat\alpha_{\mathcal{N}_0} \coloneqq T_{1}^{-1}\sum_{t \in \mathcal{T}_{1}}\left(y_{1t}-\bar{y}_{\mathcal{N}_0t}\right)\)? Well, we can obtain this with regression as we did above. When you run the regress command, you get the standard output table, but Stata ALSO stores the underlying results in the r(table) matrix. So,
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
mat l r(table)
returns the matrix
r(table)[9,9]
                0b.          1.         0b.          1. 0b.treated# 0b.treated# 1o.treated#  1.treated#
            treated     treated        post        post     0b.post     1o.post     0b.post      1.post       _cons
      b           0  -14.359003           0  -28.511415           0           0           0  -27.349111   130.56953
     se           .   4.9612676           .   2.7696277           .           .           .   2.7696277   4.9612676
      t           .  -2.8942206           .  -10.294313           .           .           .  -9.8746526   26.317776
 pvalue           .   .00626458           .   1.515e-12           .           .           .   4.842e-12   5.320e-26
     ll           .  -24.402564           .  -34.118233           .           .           .  -32.955929   120.52597
     ul           .  -4.3154416           .  -22.904597           .           .           .  -21.742293   140.61309
     df          38          38          38          38          38          38          38          38          38
   crit   2.0243942   2.0243942   2.0243942   2.0243942   2.0243942   2.0243942   2.0243942   2.0243942   2.0243942
  eform           0           0           0           0           0           0           0           0           0
In this case, we care about row 1, column 2: the coefficient on 1.treated, -14.36 (remember how -14 was the alpha parameter from the code).
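Counting columns by hand is easy to get wrong, so it may help to know that the same numbers can be pulled by coefficient name. A minimal sketch using Stata's _b[] notation next to a saved copy of r(table); the matrix name T is just a name chosen for this example.

```stata
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post = cond(year >=1989,1,0)
qui regress cigsale i.treated##i.post, vce(cl id)
// save the results matrix right away, before another command overwrites r()
mat T = r(table)
// by position: row 1 holds the coefficients
di T[1,2]                  // alpha, the pre-period gap
di T[1,8]                  // the DID coefficient
// by name: no column counting needed
di _b[1.treated]           // alpha again
di _b[1.treated#1.post]    // the DID coefficient again
```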
clear *
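cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
keep id year cigsale
// one row per year, one column per state
qui reshape wide cigsale, j(id) i(year)
// move California (id 3) next to year, so that the range
// cigsale1-cigsale39 below covers only the control states
order cigsale3, a(year)
tempvar ymeangood ytilde ymean cf te rss tss
egen `ymean' = rowmean(cigsale1-cigsale39)
// gap between California and the control group average
g `ytilde' = cigsale3-`ymean'
// a constant-only regression returns the pre-period mean of the gap
reg `ytilde' if year < 1989
loc alpha = e(b)[1,1] // here is the intercept shift
di `alpha'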
So now, we can simply do
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
// Here is our alpha
loc alpha = r(table)[1,2]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
This gives us our observed values alongside our predicted values. On one hand we have California, and on the other we have California without Proposition 99. The way to plot this, from start to finish, is
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
// Here is our ATT (the interaction term)
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
tsset year
// sort by year
twoway (tsline cigsale3, ///
lcolor(black) lwidth(medthick)) ///
(tsline DID_Cali, recast(connected) ///
msymbol(smsquare) mcolor(gs11) msize(small) ///
lcolor(gs11) lpattern(solid) lwidth(thin)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
xti(Year)
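One cosmetic point about the title: the ATT local holds the coefficient at full precision, so the title shows every decimal place. A possible tweak, sketched here under the assumption that two decimals are enough, is to round the value with the string() function before it goes into the title. The matrix name T is a placeholder chosen for this example.

```stata
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
g post = cond(year >=1989,1,0)
qui regress cigsale i.treated##i.post, vce(cl id)
// keep a copy of the results matrix
mat T = r(table)
// format the DID coefficient to two decimal places
loc ATT = string(T[1,8], "%6.2f")
// this is the text the title() option would receive
di "Difference-in-Differences: ATT: `ATT'"
```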
We can also save this plot, like so
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
// Here is our ATT (the interaction term)
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average
// plus the average of the pre-intervention differences
// between the control group and treated group
tsset year
// sort by year
twoway (tsline cigsale3, ///
lcolor(black) lwidth(medthick)) ///
(tsline DID_Cali, recast(connected) ///
msymbol(smsquare) mcolor(gs11) msize(small) ///
lcolor(gs11) lpattern(solid) lwidth(thin)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
note(Prop 99 took effect in 1989) ///
xti(Year)
graph export "DIDCali.png", as(png) name("Graph") replace
After the plot is created, graph export saves it to the current working directory under the name “DIDCali.png”. This plot simply displays the observed values versus \(\hat{y}_{1t}^0=\hat\alpha_{\mathcal{N}_0}+\bar{y}_{\mathcal{N}_0t}\), or the counterfactual.
We may also plot the actual treatment effect, or the differences over time between the treated unit’s observed values and the counterfactual:
clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
//1 if state is California
replace treated = cond(id==3,1,0)
// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)
keep id year treated post cigsale
qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)
// Here is our alpha
loc alpha = r(table)[1,2]
// Here is our ATT (the interaction term)
loc ATT = r(table)[1,8]
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state
order cigsale3, a(year)
// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)
drop cigsale1-cigsale39
// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// treatment effect at time t: observed California minus the counterfactual
generate te = cigsale3 - DID_Cali
// years relative to 1989
generate eventtime = year - 1989
twoway (connected te eventtime, ///
lcolor(black) lwidth(medthick) mcol(blue)), ///
ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xline(0, lwidth(medium) lpattern(solid) lcolor(black)) ///
xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
xsize(7.5) ///
ysize(4.5) ///
graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
title(Difference-in-Differences: ATT: `ATT') yti("Treatment Effect") ///
note(Time point 0 is 1989) xti(Year)
We can export this as a .png file too.
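A compact sketch of that export step follows. It rebuilds the treatment-effect series in a few lines so that it runs on its own, with the styling options left out. The graph name DIDEffect, the file name DIDEffect.png, and the variable names ctrl and gap are placeholders chosen for this example.

```stata
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
keep id year cigsale
qui reshape wide cigsale, j(id) i(year)
// California moves next to year, so the range below holds only control states
order cigsale3, a(year)
egen ctrl = rowmean(cigsale1-cigsale39)
g gap = cigsale3 - ctrl
// alpha is the pre-period mean of the gap
qui su gap if year < 1989, mean
g te = gap - r(mean)
g eventtime = year - 1989
// name the graph so the export does not depend on which graph is active
twoway connected te eventtime, xline(0) name(DIDEffect, replace)
// width() sets the pixel width of the exported .png
graph export "DIDEffect.png", name(DIDEffect) width(2000) replace
```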
11 Summary
Personally, I prefer to let graphics tell my main story. Graphics are a powerful, simple way of communicating basic results.