This is a Synthetic World: Three Essays on Causal Inference, Applied Econometrics, and Machine Learning for Policy Analysis

Jared Greathouse

Introduction

Jared Amani Greathouse
- PhD Candidate, Public Policy, Georgia State University
- Advisor: Jason Coupet

Policy Challenges/Motivation

  • Public policy increasingly relies on causal inference to evaluate complex interventions
  • Many high-stakes policies create data environments where:
    • Donor pools/sets of untreated units are high dimensional
    • Spillovers are present, or
    • Valid geographic controls do not exist
  • These settings strain or invalidate traditional empirical tools, which typically assume away one or more of these complications.

Thesis:
Machine learning augmentations of traditional causal methods (e.g., ITS, SCM) can overcome these challenges, providing more credible inference in high-stakes policy settings

Technical Challenges/Motivation

  • Many cutting-edge causal inference methods (e.g., SHC, RESCM, augmented SCM) do not have ready-to-use software

  • Existing tools (SCM packages in R/Stata/Python) are limited in scope and cannot implement novel designs.

  • To fill this gap, I developed mlsynth, an open-source Python package

mlsynth enables:

  • Synthetic control methods and panel data approaches in a simple and flexible manner

  • Machine learning methods that address spillovers/SUTVA violations, high-dimensional donor pools, etc.

Paper 1: Hawaii’s COVID-19 Quarantine

Question:
What were the short-run economic costs of Hawaii’s March 2020 mandatory quarantine?

Method:
Synthetic Historical Control (SHC)

Outcomes:
- Tourism demand
- Labor market conditions
- Statewide economic activity

Why Hawaii?

  • Most extreme U.S. COVID containment policy
  • Introduced a mandatory 14-day quarantine on all arrivals (March 26, 2020)
  • Effectively sealed borders and shut down tourism
  • No comparable U.S. state intervention

Core Policy Question

What are the economic costs of decisive early action under uncertainty?

  • Early containment may prevent long-run damage
  • But imposes large short-run economic losses
  • Critical for future crises:
    • Pandemics
    • Climate shocks
    • Geopolitical disruptions
  • A key question is how much the intervention affected the economy, compared with places like Florida that imposed no such restrictions.

Difficulty in studying this case

  • Hawaii is:

    • Geographically isolated

    • Highly tourism-dependent

    • Economically distinct

  • No credible untreated comparison units, meaning standard DiD and SCM designs fail

The Identification Problem

  • COVID-19 is a global shock
  • Nearly all regions experienced treatment
  • No clean donor pool exists
  • Cross-sectional comparisons break down

Methodological Solution

Synthetic Historical Control (SHC)

  • Uses the treated unit’s own past
  • Constructs counterfactuals from historical segments
  • Requires no geographic control units
  • Designed for:
    • One-off shocks
    • System-wide interventions

SHC Intuition

  • Match the last pre-treatment window
  • Using weighted combinations of:
    • Earlier historical segments where intervention was not active.
  • Learn how Hawaii typically evolves, absent either pandemic or lockdown
  • Extrapolate forward absent treatment
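The matching idea above can be sketched in a few lines. This is a minimal illustration on simulated data, not the estimator used in the paper or the mlsynth API: the helper names (`historical_segments`, `simplex_weights`), the window and horizon lengths, and the choice of non-negative weights that sum to one are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def historical_segments(y, window, horizon):
    """Split a pre-treatment series into a target window and donor segments.

    Each donor is a `window`-long slice that ends before the target begins,
    paired with the `horizon` observations that followed it.
    """
    T = len(y)
    target = y[T - window:]                                  # last pre-treatment window
    last_start = min(T - 2 * window, T - window - horizon)   # no overlap, full continuation
    starts = range(last_start + 1)
    X = np.column_stack([y[s:s + window] for s in starts])
    F = np.column_stack([y[s + window:s + window + horizon] for s in starts])
    return target, X, F


def simplex_weights(target, X):
    """Least-squares weights constrained to be non-negative and sum to one."""
    J = X.shape[1]
    res = minimize(
        lambda w: np.sum((target - X @ w) ** 2),
        x0=np.full(J, 1.0 / J),
        jac=lambda w: -2.0 * X.T @ (target - X @ w),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,
        constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
    )
    return res.x


# Simulated monthly growth series standing in for the pre-treatment outcome.
rng = np.random.default_rng(0)
t = np.arange(120)
y = 2.0 + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, size=t.size)

target, X, F = historical_segments(y, window=12, horizon=6)
w = simplex_weights(target, X)
counterfactual = F @ w          # projected path over the next 6 months
```

The same weights that reproduce the final pre-treatment window are applied to what followed each historical segment, which is what "extrapolate forward absent treatment" amounts to in this sketch.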

Outcomes

Tourism Demand
- Visitor Days
- Hotel Occupancy

Economic Conditions
- Accommodation Employment

All measured as:
- Annualized year-over-year growth rates

Data

  • Monthly time series
  • ~30 years of pre-treatment data (1990 to 2020).
  • Sources:
    • Hawaii DBEDT
    • FRED
  • Treatment begins: March 2020

Empirical Design

  1. Smooth pre-treatment series
  2. Construct historical donor segments
  3. Match evaluation window
  4. Predict post-treatment counterfactual, under no lockdown
  5. Compare observed outcomes to predictions

Inference

  • Placebo tests in time
  • Conformal prediction intervals
  • Distribution-free
  • 90% out-of-sample coverage
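The slides do not say which conformal procedure is used, so the following is only a sketch of one common variant: split conformal intervals built from absolute held-out errors. The function name and the simulated inputs are hypothetical, and the coverage guarantee rests on exchangeability of the errors, which time-series data can violate.

```python
import numpy as np


def split_conformal_interval(residuals, predictions, alpha=0.10):
    """Symmetric prediction band from held-out absolute errors."""
    scores = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))      # finite-sample rank of the cutoff
    if k > n:
        raise ValueError("too few calibration residuals for this alpha")
    q = scores[k - 1]
    predictions = np.asarray(predictions, dtype=float)
    return predictions - q, predictions + q


# Simulated held-out pre-treatment errors and post-treatment predictions.
rng = np.random.default_rng(1)
calibration_errors = rng.normal(0.0, 1.5, size=60)
post_predictions = np.array([2.1, 2.4, 1.9])

lower, upper = split_conformal_interval(calibration_errors, post_predictions, alpha=0.10)
```

With `alpha=0.10` the band targets 90% coverage; observed post-treatment outcomes falling outside it would be read as evidence of an effect.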

Contribution

  • First causal estimate of Hawaii’s quarantine effects
  • Demonstrates SHC in an extreme policy setting
  • Provides evidence on economic costs of near-zero-COVID policies
  • Broad relevance for crisis policymaking

Paper 2: Impact of National Lockdown on Air Quality in India

Motivation

  • Air pollution is one of India’s most severe public health challenges
  • Fine particulate matter (PM2.5) is the primary pollutant of concern
  • Chronic exposure affects over 1.4 billion people
  • Persistent exceedance of WHO and national standards

Sources of PM2.5 in India

  • Vehicular emissions
  • Industrial production
  • Construction activity
  • Biomass burning (household and agricultural)
  • Seasonal and meteorological factors

Severity and Spatial Variation

  • National average PM2.5 far exceeds WHO guidelines
  • Large regional heterogeneity:
    • Indo-Gangetic Plain: extreme winter pollution
    • Major metros consistently among the world’s most polluted
  • Pollution is persistent, not episodic

Health Impacts

  • Long-term exposure increases:
    • Cardiovascular disease
    • Respiratory illness
    • Stroke and lung cancer
  • Air pollution causes ~1.5 million premature deaths annually
  • Substantial reductions in life expectancy, especially in Delhi

Economic Impacts

  • Significant healthcare expenditures
  • Reduced labor productivity
  • Losses in tourism and agriculture
  • Estimated costs ≈ 1–1.5% of GDP annually
  • Policy response: National Clean Air Programme (NCAP)

The COVID-19 Lockdown

  • Nationwide lockdown began March 25, 2020
  • Among the strictest globally
  • Sudden halt to:
    • Transportation
    • Industrial activity
    • Construction
  • A large, abrupt emissions shock

Lockdown as a Natural Experiment

  • Policy applied uniformly across the country
  • Sharp timing and clear intervention onset
  • A rare opportunity to study large-scale emissions reductions
  • Particularly relevant for national-level air quality

Observed Changes During Lockdown

  • PM2.5 declined sharply across monitored locations
  • Reductions often in the range of 30–50%
  • Major metros experienced large AQI improvements
  • Effects strongest in early lockdown phases

What Existing Studies Show

  • Consistent evidence of pollution declines during lockdowns
  • Findings documented across:
    • India
    • China
    • Europe
  • Effects observed for PM2.5, NO₂, and PM10

Dominant Empirical Approaches

  • Pre–post comparisons
  • Year-over-year comparisons
  • Graphical trend analysis
  • Satellite imagery
  • Simple regression models

Limits of Existing Evidence

  • Largely descriptive
  • No explicit counterfactual construction
  • Confounding from seasonality and weather
  • Limited policy relevance

Causal Approaches in the Literature

  • Interrupted Time Series (ITS)
  • Synthetic Control and augmented SCM
  • Causal machine learning methods
  • Mixed findings across contexts

Why National Causal Inference Is Hard

  • Lockdowns affect all units simultaneously
  • No viable untreated donor pool
  • Parallel trends assumptions are implausible
  • Covariates cannot fully capture meteorology
  • Traditionally, practitioners would rely on t-tests or other simple methods.

Methodological Gap

  • Few credible causal designs for national-scale interventions
  • Existing tools struggle when:
    • Treatment is universal
    • Meaningful geographic controls do not exist
  • Limits inference for major policy shocks

Data

  • Source: SHRUG – Socioeconomic High-resolution Rural-Urban Geographic Data Platform for India
  • Spatial coverage: 635 districts across India
  • Temporal coverage: Jan 1999 – Dec 2020 (monthly observations)
  • Outcome: PM2.5 concentrations, population-weighted at district level

National-level analysis:
- Population-weighted average across all districts → single monthly time series

Treated districts (city-level analysis):
- Delhi, Mumbai, Bangalore, Kolkata (major tech hubs/metropolitan areas)
- Aggregated subdistricts to district-level averages
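As an illustration of the national aggregation step, the sketch below collapses a district-by-month panel into one population-weighted series with pandas. The column names and the toy numbers are invented for the example and are not taken from SHRUG.

```python
import pandas as pd

# Toy district-month panel; values and column names are illustrative only.
panel = pd.DataFrame({
    "month": ["2019-01", "2019-01", "2019-02", "2019-02"],
    "district": ["A", "B", "A", "B"],
    "pm25": [80.0, 40.0, 60.0, 30.0],
    "population": [3_000_000, 1_000_000, 3_000_000, 1_000_000],
})


def population_weighted_mean(df, value="pm25", weight="population", by="month"):
    """Weighted average of `value` within each `by` group."""
    weighted_sum = (df[value] * df[weight]).groupby(df[by]).sum()
    total_weight = df[weight].groupby(df[by]).sum()
    return (weighted_sum / total_weight).rename(value)


national = population_weighted_mean(panel)
```

District A carries three times the weight of district B here, so the national value sits closer to A's reading in every month.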

Data Transformation

  • Compute year-over-year (YoY) growth rates of PM2.5:

    • Compares each month to same month in previous year
    • Controls for seasonality through the 12-month comparison
  • Final outcome series for analysis:

    • National: YoY growth of PM2.5 across India
    • Cities: YoY growth for each treated district
  • Restricts analysis to 1999–2020, months 13–312 of the dataset
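A short sketch of the year-over-year transformation, using a simulated monthly series rather than the actual PM2.5 data. The series is built with a fixed seasonal pattern and a level that rises 5% each year, so the transformation should return a flat 5% once the first 12 months are dropped.

```python
import numpy as np
import pandas as pd

# Simulated monthly series: seasonal pattern times a level that grows 5% per year.
idx = pd.period_range("1999-01", periods=36, freq="M")
months = np.arange(36)
seasonal = 50 + 10 * np.cos(2 * np.pi * months / 12)
level = 1.05 ** (months // 12)
pm25 = pd.Series(seasonal * level, index=idx)

# Percent change relative to the same calendar month one year earlier.
yoy = 100 * (pm25 / pm25.shift(12) - 1)
yoy = yoy.dropna()               # the first 12 months have no prior-year value
```

Because each month is divided by the same calendar month of the previous year, the seasonal component cancels and only the annual growth remains. This is also why the usable sample starts at month 13.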

Contribution of This Chapter

  • First causal evaluation of India’s nationwide lockdown on PM2.5
  • Uses SHC:
    • No contemporaneous control units
    • Counterfactual built from historical patterns
  • Extends SHC to an augmented framework

What This Enables

  • National-level causal estimates
  • Policy-relevant counterfactual inference

Paper 3: Locking Away Prosperity? Evaluating the Labor Impacts of Vaccine Mandates

Motivation

  • COVID-19 NPIs initially focused on lockdowns and masks
  • Early research emphasized timing and public health benefits
  • Subsequent work documented economic and labor market costs
  • Less is known about later-stage pandemic policies

From Lockdowns to Mandates

  • Vaccines became widely available by early 2021
  • Policymakers sought alternatives to broad lockdowns
  • Vaccine mandates emerged as a targeted intervention
  • Goal: protect public health while sustaining economic activity

Why Focus on Restaurants?

  • High-contact, indoor settings
  • Central to urban leisure economies
  • Sensitive to both consumer confidence and labor supply
  • A leading concern during mandate debates

Policy Question

How did city-level vaccine mandates affect restaurant employment?

  • Did mandates slow employment recovery?
  • Or did they support demand by improving perceived safety?
  • Are effects consistent across cities?

Treated Cities

  • New York City
  • San Francisco
  • New Orleans
  • Los Angeles

Each implemented vaccine mandates for indoor dining in 2021. Other cities (e.g., Chicago) did too, but only once a new COVID variant (Omicron) was spreading, a critical confounder for clean identification.

Why These Cities Matter

  • Large, economically important metros
  • Early adopters of vaccine mandates
  • Clear policy onsets and enforcement timelines
  • Urban outliers in the national policy landscape

Expected Employment Channels

  • Supply side:
    • Worker compliance or exit
    • Staffing frictions
  • Demand side:
    • Consumer confidence
    • Willingness to dine indoors

Net effects are theoretically ambiguous.

Contribution of This Chapter

  • First causal analysis of vaccine mandates on restaurant employment
  • Focuses on labor market outcomes, not health metrics
  • Evaluates mandates as an economic policy tool

Empirical Strategy

  • Compare treated cities to synthetic counterfactuals
  • Use MSAs without similar mandates as donors
  • Focus on year-over-year employment growth
  • Separate analysis by city
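The slides do not specify the estimator used for this chapter, so the sketch below shows only the generic donor-based logic for one treated city: fit non-negative donor weights on the pre-period, then read the post-period gap between the observed and synthetic series. The function name, the penalty-row device for pushing weights toward summing to one, and the simulated data are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def synthetic_gap(y_treated, donors, n_pre, penalty=1e3):
    """Fit non-negative donor weights on the pre-period; return weights and post-period gaps."""
    # A heavily weighted extra row pushes the weights to sum to one.
    A = np.vstack([donors[:n_pre], penalty * np.ones(donors.shape[1])])
    b = np.append(y_treated[:n_pre], penalty)
    w, _ = nnls(A, b)
    counterfactual = donors @ w
    return w, (y_treated - counterfactual)[n_pre:]


# Simulated growth rates: 60 months, 5 donor MSAs, treatment after month 48.
rng = np.random.default_rng(0)
donors = rng.normal(2.0, 1.0, size=(60, 5))
y_treated = donors @ np.array([0.6, 0.4, 0.0, 0.0, 0.0])
y_treated[48:] -= 3.0            # built-in effect of -3 points

weights, gaps = synthetic_gap(y_treated, donors, n_pre=48)
```

Running the function once per treated city, each with its own donor matrix, mirrors the "separate analysis by city" design.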

Data

  • Monthly MSA-level employment data
  • Source: BLS / FRED
  • Time span: 1991–2022
  • Outcome:
    • Full-service restaurant employment
    • Leisure and hospitality employment

Why This Design Works

  • Sharp policy timing
  • No national mandate during study period
  • Rich pre-treatment histories
  • Comparable untreated metropolitan areas