The MLSYNTH App

Synthetic Control Methods Made Accessible

2025-09-17

👋 Welcome

  • The MLSYNTH App = point-and-click interface for causal inference
  • Wraps my mlsynth Python package
  • Designed for:
    • Policy analysts
    • Economists
    • Business owners
    • Data scientists

Motivation

  • Policy evaluation needs credible counterfactuals
  • Synthetic control (SCM) provides a flexible framework
  • But today:
    • Fragmented implementations
    • Steep learning curve
    • Reproducibility challenges

Why Use the MLSYNTH App?

What are you really getting when you use MLSYNTH?

Convenience.

The Problem

  1. Papers are hard to digest
    • Some take days of rereading (e.g. PCR, RPCA)
    • Even straightforward ones (e.g. FDID) demand hours
  2. Code is fragmented
    • Some estimators exist only in replication files
    • Others are in fragile, unpackaged R/Python scripts
    • Often tied to Stata or MATLAB (paid software)

Time Costs

Rough estimates of time saved by MLSYNTH:

Estimator   Hours to Learn + Code   MLSYNTH Time
FDID        ~40                     Minutes
FSCM        ~75                     Minutes
RPCA        ~90                     Minutes
SI          ~110                    Minutes

Who Suffers Most?

  • Even trained researchers spend dozens of hours per estimator
  • For the average policy analyst or master’s student:
    • Multiply those hours by 2–4×
    • Steep learning curve across stats, math, code, causal inference

MLSYNTH = a lifeline for non-technical users.

Two Ways to Pay

  1. Pay in Time
    • Learn Python + read the papers
    • Use mlsynth source code directly
  2. Pay in Money
    • Use the app
    • Skip the coding, focus on insights

Either way → you don’t pay in both.

How It Works

  1. Upload a panel dataset (unit, time, outcome, treatment)
  2. Choose an estimator
  3. Run the model → results visualized automatically
  4. Download outputs for further analysis
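Step 1 assumes a long-format panel: one row per unit and time period, with an outcome and a 0/1 treatment indicator. The sketch below shows the kind of sanity checks worth running before upload. The column names (unit, time, outcome, treatment) and the check_panel helper are hypothetical, chosen for illustration; they are not the app's required schema or part of mlsynth.

```python
import pandas as pd

# Hypothetical column names for illustration; not the app's documented schema.
REQUIRED = ["unit", "time", "outcome", "treatment"]

def check_panel(df: pd.DataFrame) -> dict:
    """Run basic sanity checks on a long-format panel before estimation."""
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Each (unit, time) pair should appear exactly once.
    if df.duplicated(subset=["unit", "time"]).any():
        raise ValueError("Duplicate (unit, time) rows found")
    # Balanced panel: every unit is observed in every period.
    n_units = df["unit"].nunique()
    n_periods = df["time"].nunique()
    if len(df) != n_units * n_periods:
        raise ValueError("Panel is unbalanced")
    # Treatment should be a 0/1 indicator.
    if not set(df["treatment"].unique()) <= {0, 1}:
        raise ValueError("Treatment must be coded 0/1")
    treated = df.loc[df["treatment"] == 1]
    return {
        "n_units": n_units,
        "n_periods": n_periods,
        "treated_units": sorted(treated["unit"].unique()),
        "first_treated_period": treated["time"].min() if not treated.empty else None,
    }

# Toy data: 3 units, 4 periods, unit "A" treated from period 3 onward.
panel = pd.DataFrame({
    "unit": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "time": [1, 2, 3, 4] * 3,
    "outcome": [10.0, 11.0, 9.0, 8.5, 9.5, 10.5, 11.5, 12.0, 10.2, 11.1, 12.3, 12.9],
    "treatment": [0, 0, 1, 1] + [0] * 8,
})
summary = check_panel(panel)
print(summary)
```

Catching duplicates, gaps, and miscoded treatment flags early is cheaper than debugging an estimator that fails or fits silently on a malformed panel.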

Sample Datasets

  • California Tobacco Law (1988) → Cigarette sales
  • German Reunification (1990) → GDP
  • Basque Terrorism (1975) → GDP per capita
  • Hubei Lockdown (2020) → GDP growth
  • Hong Kong Integration (2003) → GDP

Each dataset is a classic case in the SCM literature.

Available Estimators

  • FDID: Forward Difference-in-Differences
  • FSCM: Forward-Selected Synthetic Control
  • CLUSTERSC: Cluster donor pools (PCR, RPCA)
  • SRC: Synthetic Regressing Control
  • PDA: Panel Data Approach (forward selection, L2, LASSO)
  • FMA: Factor Model Approach
  • NSC: Nonlinear SCM
  • SI: Synthetic Interventions
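To give a flavor of what the forward-selection estimators in this list do, here is a toy sketch in NumPy. It builds a difference-in-differences counterfactual from the mean of a donor set, then greedily adds the donor that most improves pre-treatment R². It is a simplified illustration on made-up data, not mlsynth's implementation and not the full published FDID procedure; the function names did_fit and forward_select are hypothetical.

```python
import numpy as np

def did_fit(y, donors, t0):
    """DID with a fixed donor set: counterfactual = donor mean + pre-period gap."""
    donor_mean = donors.mean(axis=1)
    alpha = (y[:t0] - donor_mean[:t0]).mean()      # intercept from pre-period only
    y0_hat = alpha + donor_mean                    # counterfactual path
    ss_res = ((y[:t0] - y0_hat[:t0]) ** 2).sum()
    ss_tot = ((y[:t0] - y[:t0].mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot                     # pre-treatment fit
    att = (y[t0:] - y0_hat[t0:]).mean()            # average post-period gap
    return r2, att

def forward_select(y, X, t0):
    """Greedily add the donor that most improves pre-period R^2; keep the best set."""
    remaining = list(range(X.shape[1]))
    chosen, best = [], (-np.inf, None, [])
    while remaining:
        scores = [(did_fit(y, X[:, chosen + [j]], t0), j) for j in remaining]
        (r2, att), j = max(scores, key=lambda s: s[0][0])
        chosen.append(j)
        remaining.remove(j)
        if r2 > best[0]:
            best = (r2, att, list(chosen))
    return best

# Made-up data: 8 periods, treatment starts at index 5, true effect is -2.
t = np.arange(8, dtype=float)
d0 = 10 + t                                        # donor that tracks the treated unit
d1 = 10 + np.array([0, 3, 1, 4, 2, 5, 3, 6], dtype=float)  # noisy donor
d2 = 20 - t                                        # donor with the opposite trend
X = np.column_stack([d0, d1, d2])
y = 3 + d0
y[5:] -= 2.0

r2, att, donors = forward_select(y, X, t0=5)
print(f"selected donors: {donors}, pre-period R^2: {r2:.3f}, ATT: {att:.3f}")
```

In this toy case the procedure keeps only the donor that shares the treated unit's trend. Real data rarely separate so cleanly, which is why pre-treatment fit should be inspected rather than assumed.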

Coming Soon

  • Additional estimators (SCMO, SDID)
  • Robustness checks (placebo tests)
  • Interactive tutorials

Using the MLSYNTH App

A Word of Caution

  • MLSYNTH ≠ black box
  • Estimators rely on assumptions
  • Misuse → misleading results

Use MLSYNTH to save coding time, but rely on econometric intuition.

Acknowledgements

  • Jason Coupet
  • Kathy Li
  • Mani Bayani
  • Zhentao Shi
  • Jaume Vives-i-Bastida
  • Andrew Wheeler

Closing

  • MLSYNTH = one-stop shop for SCM methods
  • Bridges methodological rigor with practical use
  • Ready for research, policy, and applied evaluation