10  Presenting Results

So far, we have discussed how to use regression, descriptive statistics, and so on to expound on the relationship between quantitative variables as well as estimate the impacts of policy. However, none of this is meaningful if we cannot present our findings in the correct way. This chapter covers the presentation of results, using tables and graphics.

10.1 User Written Commands in Stata

Stata is powerful econometrics software in its own right. However, users often need to move beyond Stata's built-in commands and use commands written by other users. One such command is esttab, which ships with the estout package. You can install the package by typing ssc install estout, replace into Stata's Command window. esttab is a wrapper for estout.
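As a minimal sketch, the installation and a quick check that it worked might look like the following. It assumes an internet connection, since ssc install downloads the package from the SSC archive.

```stata
// Install estout (which provides eststo and esttab) from SSC
ssc install estout, replace

// Confirm that Stata can find the command, then open its help file
which esttab
help esttab
```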

10.2 Using esttab

Consider the regression model

clear *
cls
// Using Stata's pre-installed car data
sysuse auto, clear
// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

regress price i.foreign weight

which returns the output


      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     35.35
       Model |   316859273         2   158429637   Prob > F        =    0.0000
    Residual |   318206123        71  4481776.38   R-squared       =    0.4989
-------------+----------------------------------   Adj R-squared   =    0.4848
       Total |   635065396        73  8699525.97   Root MSE        =      2117

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     foreign |
    Foreign  |   3637.001    668.583     5.44   0.000     2303.885    4970.118
      weight |   3.320737   .3958784     8.39   0.000     2.531378    4.110096
       _cons |  -4942.844   1345.591    -3.67   0.000    -7625.876   -2259.812
------------------------------------------------------------------------------

A key question, though, is how to present these results. The answer is NOT to paste Stata's raw output. Instead, we can use esttab. First, we store the model using the eststo prefix. We then call esttab to present the results in a professional manner.

clear *

cls

// Using Stata's pre-installed car data

sysuse auto, clear


// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

cls

eststo clear

qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored

esttab m1

produces the output


----------------------------
                      (1)   
                    price   
----------------------------
0.foreign               0   
                      (.)   

1.foreign          3637.0***
                   (5.44)   

weight              3.321***
                   (8.39)   

_cons             -4942.8***
                  (-3.67)   
----------------------------
N                      74   
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Much more organized than the default output, no? We can do more. Let’s modify the variable names on the left-hand side of our table.

clear *

cls

// Using Stata's pre-installed car data

sysuse auto, clear


// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"

cls

eststo clear

qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored

esttab m1, l

The l option (short for label) tells esttab to use the variable labels instead of the variable names. It returns the output


------------------------------------
                              (1)   
                            Price   
------------------------------------
Domestic                        0   
                              (.)   

Foreign                    3637.0***
                           (5.44)   

Weight                      3.321***
                           (8.39)   

Constant                  -4942.8***
                          (-3.67)   
------------------------------------
Observations                   74   
------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
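Labels can also be set for individual coefficients inside the table, without touching the dataset's variable labels. The sketch below assumes esttab's coeflabels() option; the label text itself is purely illustrative.

```stata
sysuse auto, clear
eststo clear
quietly eststo m1: regress price i.foreign weight

// coeflabels() relabels coefficients in the table only;
// the labels chosen here are illustrative, not part of the dataset
esttab m1, nobaselevels ///
    coeflabels(1.foreign "Foreign (vs. domestic)" ///
               weight "Weight (lbs.)" ///
               _cons "Intercept")
```

Here nobaselevels drops the empty row for the reference category (domestic cars).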

What if we wish to use the confidence interval instead of the default t-statistics?

clear *

cls

// Using Stata's pre-installed car data

sysuse auto, clear


// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"

cls

eststo clear

qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored

esttab m1, l ci

returns the output



----------------------------------------------
                                        (1)   
                                      Price   
----------------------------------------------
Domestic                                  0   
                                      [0,0]   

Foreign                              3637.0***
                            [2303.9,4970.1]   

Weight                                3.321***
                              [2.531,4.110]   

Constant                            -4942.8***
                          [-7625.9,-2259.8]   
----------------------------------------------
Observations                             74   
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
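Standard errors are another common choice for the second row of each cell. A short sketch, assuming esttab's se and b() options, where the number in parentheses sets the number of decimal places:

```stata
sysuse auto, clear
eststo clear
quietly eststo m1: regress price i.foreign weight

// se(2): standard errors in parentheses, two decimal places
// b(2):  point estimates with two decimal places
esttab m1, label b(2) se(2)
```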

What if we wish to omit the constant from the table?

clear *

cls

// Using Stata's pre-installed car data

sysuse auto, clear


// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"

cls

eststo clear

qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored

esttab m1, l ci nocon

returns the output


----------------------------------------------
                                        (1)   
                                      Price   
----------------------------------------------
Domestic                                  0   
                                      [0,0]   

Foreign                              3637.0***
                            [2303.9,4970.1]   

Weight                                3.321***
                              [2.531,4.110]   
----------------------------------------------
Observations                             74   
----------------------------------------------
95% confidence intervals in brackets
* p<0.05, ** p<0.01, *** p<0.001
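The options used so far also carry over to tables with several models, which is where esttab tends to be most useful. The following is a sketch rather than a recommended specification: the second model name (m2) and the column titles are made up for illustration.

```stata
sysuse auto, clear
eststo clear

// store two nested models under hypothetical names m1 and m2
quietly eststo m1: regress price weight
quietly eststo m2: regress price i.foreign weight

// one column per model, with R-squared reported below the coefficients
esttab m1 m2, label ci noconstant nobaselevels ///
    mtitles("Weight only" "Add origin") ///
    r2
```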

All of this is simply a start. It still does not address the important part: how do we get the table into a paper? We can save the output to a rich text file, which can be opened in Microsoft Word.

clear *

cls

// Using Stata's pre-installed car data

sysuse auto, clear


// uses a car being foreign and its weight to predict price

// price is in dollars, weight is in pounds

lab var weight "Weight"
lab var price "Price"
lab var foreign "Foreign"

cls

eststo clear

qui eststo m1: regress price i.foreign weight
// where m1 is the model we've stored

esttab m1 using table1.rtf, l ci nocon replace

returns the output

esttab m1 using table1.rtf, l ci nocon replace
(output written to table1.rtf)

We can click the table1.rtf link to open the file, whose contents can then be copied and pasted into Word.
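Rich text is not the only target. As a sketch, the same table could be written as LaTeX or CSV, assuming esttab's booktabs and csv format options. The file names are arbitrary.

```stata
sysuse auto, clear
eststo clear
quietly eststo m1: regress price i.foreign weight

// LaTeX table; booktabs assumes \usepackage{booktabs} in the paper's preamble
esttab m1 using table1.tex, label ci noconstant booktabs replace

// comma-separated values, e.g. for a spreadsheet
esttab m1 using table1.csv, label ci noconstant csv replace
```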

10.3 DID with esttab

Let’s do an example using Difference-in-Differences.

u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

replace treated = cond(id==3,1,0)

g post= cond(year >=1989,1,0)

keep id year treated post cigsale

Here, the “California” coefficient in the table below is the average difference between California and the control group in the pre-policy period. We can verify this with


. su cigsale if year < 1989 & id ==3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |         19    116.2105    11.68303       90.1        128

. su cigsale if year < 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2

“Post” is the coefficient for the case where California = 0 and post = 1: the average change in the control group’s outcomes from the pre period to the post period. This can be confirmed by


. su cigsale if year < 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        722    130.5695    32.69778         55      296.2

. su cigsale if year >= 1989 & id !=3

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     cigsale |        456    102.0581    23.58887       40.7      186.8

DID is the ATT of interest. It is the difference between the average outcomes of the treated group and the control group in the post period, minus the same difference in the pre period. See how the coefficient in the table is about -27? We can get the same result like this

su cigsale if year < 1989 & id==3, mean
loc y1pre = r(mean)
su cigsale if year < 1989 & id != 3, mean
loc Y0pre = r(mean)
su cigsale if year >= 1989 & id==3,  mean
loc y1post = r(mean)
su cigsale if year >= 1989 & id != 3, mean
loc Y0post = r(mean)

di (`y1post'-`Y0post') - (`y1pre'-`Y0pre')

We can get this result with regression:

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

replace treated = cond(id==3,1,0)

g post= cond(year >=1989,1,0)

keep id year treated post cigsale

regress cigsale i.treated##i.post, vce(cl id)

//mat l r(table)

lab var cigsale "Cigarette Consumption"
lab var treated "California"
lab var post "Post"
cls
esttab, ci ///
    l /// Uses variable labels
    nobase /// from estout, whose options you may use with esttab
    nocons /// omits the constant
    ren(1.treated#1.post DID) /// renames the interaction term
    nostar /// no stars for statistical significance
    title(Table 1)

which returns the output


-------------------------------------------
                                        (1)
                      Cigarette Consumption
-------------------------------------------
California=1                         -14.36
                            [-24.40,-4.315]

Post=1                               -28.51
                            [-34.12,-22.90]

DID                                  -27.35
                            [-32.96,-21.74]
-------------------------------------------
Observations                           1209
-------------------------------------------
95% confidence intervals in brackets
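To make the link between the table and the group means explicit, the three coefficients can be written out as follows. The notation is an assumption introduced here for illustration: \(\bar{y}_{1,\cdot}\) is California's average and \(\bar{y}_{0,\cdot}\) is the control group's average in the indicated period. The numbers are the means from the summarize output earlier in this section.

\[
\begin{aligned}
\hat\beta_{\text{California}} &= \bar{y}_{1,\text{pre}} - \bar{y}_{0,\text{pre}} = 116.2105 - 130.5695 = -14.3590,\\
\hat\beta_{\text{Post}} &= \bar{y}_{0,\text{post}} - \bar{y}_{0,\text{pre}} = 102.0581 - 130.5695 = -28.5114,\\
\hat\beta_{\text{DID}} &= \left(\bar{y}_{1,\text{post}} - \bar{y}_{0,\text{post}}\right) - \left(\bar{y}_{1,\text{pre}} - \bar{y}_{0,\text{pre}}\right).
\end{aligned}
\]

The first two lines reproduce the “California” and “Post” rows of the table up to rounding, and the omitted constant is simply \(\bar{y}_{0,\text{pre}} = 130.5695\).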

Again, we can save this table to a rich text file as we did above and use it in Word or Google Drive.

10.4 Graphics

So far we’ve used graphics in the text, and we’ve even made some graphs on the fly. But how can we present them so that they can be understood? Again, NOT by simply posting screenshots of graphs. Instead, Stata can export graphs in a form we can use in documents. Say we wish to plot the results of the DID model above, that is, the DID predictions versus California.

Before we do that, though, a short word on Stata’s stored results. Most commands in Stata store results after you run them. Remember how we discussed that the DID intercept (the alpha parameter that shifts our control group average) is just \(\hat\alpha_{\mathcal{N}_0} \coloneqq T_{1}^{-1}\sum_{t \in \mathcal{T}_{1}}\left(y_{1t}-\bar{y}_{\mathcal{N}_0t}\right)\)? Well, we can obtain this with regression as we did above. When you run the regress command, you get the standard output table, but the underlying results are ALSO stored in the r(table) matrix. So,

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

replace treated = cond(id==3,1,0)

g post= cond(year >=1989,1,0)

keep id year treated post cigsale

qui regress cigsale i.treated##i.post, vce(cl id)

mat l r(table)

returns the matrix


r(table)[9,9]
                 0b.           1.          0b.           1.  0b.treated#  0b.treated#  1o.treated#   1.treated#             
            treated      treated         post         post      0b.post      1o.post      0b.post       1.post        _cons
     b            0   -14.359003            0   -28.511415            0            0            0   -27.349111    130.56953
    se            .    4.9612676            .    2.7696277            .            .            .    2.7696277    4.9612676
     t            .   -2.8942206            .   -10.294313            .            .            .   -9.8746526    26.317776
pvalue            .    .00626458            .    1.515e-12            .            .            .    4.842e-12    5.320e-26
    ll            .   -24.402564            .   -34.118233            .            .            .   -32.955929    120.52597
    ul            .   -4.3154416            .   -22.904597            .            .            .   -21.742293    140.61309
    df           38           38           38           38           38           38           38           38           38
  crit    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942    2.0243942
 eform            0            0            0            0            0            0            0            0            0

In this case, we care about row 1, column 2 (remember how -14 was the alpha parameter from the code). For comparison, the same alpha can be computed by hand:


clear *


cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

keep id year cigsale

qui reshape wide cigsale, j(id) i(year)

// move California (id 3) next to year, so that the range
// cigsale1-cigsale39 below covers only the control states
order cigsale3, a(year)

tempvar ytilde ymean
egen `ymean' = rowmean(cigsale1-cigsale39)

g `ytilde' = cigsale3-`ymean'
reg `ytilde' if year < 1989


loc alpha = e(b)[1,1] // here is the intercept shift

di `alpha'

So now, we can simply do

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

//1 if state is California
replace treated = cond(id==3,1,0)

// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)

keep id year treated post cigsale

qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID


// Here is our alpha
loc alpha = r(table)[1,2]


keep id year cigsale

qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state

order cigsale3, a(year)

// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)

drop cigsale1-cigsale39

// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average 
// plus the average of the pre-intervention differences
// between the control group and treated group
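Indexing r(table) by position works, but it depends on the column order of the matrix. As a hedged alternative, Stata also lets us refer to coefficients by name through _b[] after estimation, which may be easier to read:

```stata
use "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
replace treated = cond(id==3,1,0)
generate post = cond(year>=1989,1,0)
quietly regress cigsale i.treated##i.post, vce(cluster id)

// the same numbers as r(table)[1,2] and r(table)[1,8], referenced by name
display _b[1.treated]          // alpha, the pre-period gap
display _b[1.treated#1.post]   // the DID estimate (ATT)
```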

This gives us our observed values alongside our predicted values. On one hand we have California, and on the other we have California without Proposition 99. The way to plot this, from start to finish, is

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

//1 if state is California
replace treated = cond(id==3,1,0)

// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)

keep id year treated post cigsale

qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)

// Here is our alpha
loc alpha = r(table)[1,2]

loc ATT = r(table)[1,8]


keep id year cigsale

qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state

order cigsale3, a(year)

// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)

drop cigsale1-cigsale39

// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average 
// plus the average of the pre-intervention differences
// between the control group and treated group

tsset year
// declare year as the time variable, which tsline requires


twoway (tsline cigsale3, ///
        lcolor(black) lwidth(medthick)) ///
    (tsline DID_Cali, recast(connected) ///
        mcolor(gs11) msize(small) msymbol(smsquare) lcolor(gs11) lpattern(solid) lwidth(thin)), ///
        ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
        xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
        xsize(7.5) ///
        ysize(4.5) ///
        graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
        xti(Year)

We can also save this plot to a file, like so:

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

//1 if state is California
replace treated = cond(id==3,1,0)

// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)

keep id year treated post cigsale

qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)

// Here is our alpha
loc alpha = r(table)[1,2]

loc ATT = r(table)[1,8]


keep id year cigsale

qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state

order cigsale3, a(year)

// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)

drop cigsale1-cigsale39

// our DID prediction
replace DID_Cali = DID_Cali +`alpha'
// again, the DID model is simply the control group average 
// plus the average of the pre-intervention differences
// between the control group and treated group

tsset year
// declare year as the time variable, which tsline requires


twoway (tsline cigsale3, ///
        lcolor(black) lwidth(medthick)) ///
    (tsline DID_Cali, recast(connected) ///
        mcolor(gs11) msize(small) msymbol(smsquare) lcolor(gs11) lpattern(solid) lwidth(thin)), ///
        ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        xline(1989, lwidth(medium) lpattern(solid) lcolor(black)) ///
        xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        legend(region(lcolor(none)) order(1 "Real Cali" 2 "DID Cali") cols(1) position(3)) ///
        xsize(7.5) ///
        ysize(4.5) ///
        graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        title(Difference-in-Differences: ATT: `ATT') yti("Cigarette Consumption") ///
        xti(Year) note(Prop 99 took effect in 1989)
        
graph export "DIDCali.png", as(png) name("Graph") replace

After the plot is created, graph export writes it to our current working directory under the name “DIDCali.png”. This plot simply displays the observed values versus \(\hat{y}_{1t}^0=\hat\alpha_{\mathcal{N}_0}+\bar{y}_{\mathcal{N}_0t}\), or the counterfactual.
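PNG is not the only option, and the default resolution may be too low for print. The sketch below uses a throwaway scatter plot from the auto data (the graph name demo and the file names are hypothetical) to show two variations of graph export: a wider PNG and a vector PDF.

```stata
sysuse auto, clear

// a small stand-in graph, stored in memory under the name "demo"
twoway scatter price weight, name(demo, replace)

// higher-resolution PNG: width() sets the width in pixels
graph export "demo.png", as(png) name(demo) width(2400) replace

// vector PDF, which scales without pixelation
graph export "demo.pdf", as(pdf) name(demo) replace
```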

We may also plot the treatment effect itself, that is, the differences over time between the treated unit’s observed values and the counterfactual:

clear *
cls
u "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear

//1 if state is California
replace treated = cond(id==3,1,0)

// 1 if year is greater than or equal to 1989
g post= cond(year >=1989,1,0)

keep id year treated post cigsale

qui regress cigsale i.treated##i.post, vce(cl id)
// Estimating DID
mat l r(table)

// Here is our alpha
loc alpha = r(table)[1,2]

loc ATT = r(table)[1,8]


keep id year cigsale

qui reshape wide cigsale, j(id) i(year)
// one row per year, one column per state

order cigsale3, a(year)

// our control group average
egen DID_Cali = rowmean(cigsale1-cigsale39)

drop cigsale1-cigsale39

// our DID prediction
replace DID_Cali = DID_Cali +`alpha'

generate te = cigsale3 - DID_Cali

// treatment effect at time t

generate eventtime = year - 1989


twoway (connected te eventtime, ///
        lcolor(black) lwidth(medthick) mcol(blue)), ///
        ylabel(#5, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        xline(0, lwidth(medium) lpattern(solid) lcolor(black)) ///
        xlabel(#10, grid glwidth(vthin) glcolor(gs4%20) glpattern(dash)) ///
        xsize(7.5) ///
        ysize(4.5) ///
        graphregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        plotregion(fcolor(white) lcolor(white) ifcolor(white) ilcolor(white)) ///
        title(Difference-in-Differences: ATT: `ATT') yti("Treatment Effect") ///
        xti("Years relative to 1989") note(Time point 0 is 1989)

We can export this as a .png file too.
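A useful sanity check on the per-year treatment effects is that their post-period average should reproduce the DID coefficient, while their pre-period average should be zero by construction. The sketch below recomputes everything from the raw data without regression. The variable names ctrl_mean and gap are made up for illustration, and the check relies on the panel being balanced (the same number of control states in every year).

```stata
use "https://github.com/jgreathouse9/FDIDTutorial/raw/main/smoking.dta", clear
keep id year cigsale
quietly reshape wide cigsale, j(id) i(year)

// move California (id 3) out of the cigsale1-cigsale39 range
order cigsale3, after(year)
egen ctrl_mean = rowmean(cigsale1-cigsale39)

// alpha: average pre-1989 gap between California and the control mean
generate gap = cigsale3 - ctrl_mean
summarize gap if year < 1989, meanonly
local alpha = r(mean)

// per-year treatment effect: observed minus counterfactual
generate te = gap - `alpha'

// post-period average; should be close to the DID coefficient
summarize te if year >= 1989, meanonly
display r(mean)

// pre-period average; zero by construction, up to rounding
summarize te if year < 1989, meanonly
display r(mean)
```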

11 Summary

Personally, I prefer graphics to tell my main story. Graphics are a powerful, simple way of communicating basic results.