The MLSYNTH App

Synthetic Control Methods Made Accessible

Jared Greathouse

2025-11-11

👋 Welcome

The MLSYNTH App = point-and-click interface for causal inference
Wraps my mlsynth Python package
Designed for:
- Policy analysts
- Economists
- Business owners
- Data scientists

Motivation

Policy evaluation needs credible counterfactuals
Synthetic control (SCM) provides a flexible framework, especially when the parallel trends assumption of DID does not hold.
But today:
- Implementations of the different SCMs are fragmented
- Each implementation has its own steep learning curve
- Not all researchers know R or Python, where many of these developments live

Why Use the MLSYNTH App?

What are you really getting when you use the MLSYNTH App?

Convenience.

The Problem

Many papers are hard to digest
- Some take days of rereading (e.g. Agarwal, Anish, Devavrat Shah, Dennis Shen, and Dogyoon Song. “On Robustness of Principal Component Regression.” pp. 1731-1745)
- Even more straightforward ones (e.g. Li, Kathleen T. (2023). “Frontiers: A Simple Forward Difference-in-Differences Method.” Marketing Science 43(2):267-279.) take hours to read
Code is fragmented
- Some estimators exist only in replication files or in fragile, unpackaged R/Python scripts
- Some may be tied to Stata or MATLAB (paid software)

Time Costs

Rough estimates of time saved by MLSYNTH:

Estimator    Hours to Learn + Code    Time with MLSYNTH
FDID         ~40                      Minutes
FSCM         ~75                      Minutes
RPCA         ~90                      Minutes
SI           ~110                     Minutes

Who Suffers Most?

Even trained researchers spend dozens of hours per estimator to understand and code it.
For the average policy analyst or master’s student:
- Multiply those hours by 2–4
- The steep learning curve demands a solid background in statistics, math, coding, and causal inference

The MLSYNTH App simplifies this process.

Two Ways to Pay

Pay in Time
- Learn Python and read the papers
- Use mlsynth source code directly
Pay in Money
- Use the app
- Skip the coding, focus on analysis

Either way, with the MLSYNTH App, you never pay both.

How It Works

Upload a panel dataset (unit, time, outcome, treatment)
Choose an estimator
Run the model → results visualized automatically
Download outputs for further analysis
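
Before uploading, it can help to confirm that the file really is a long-format panel with one row per unit and period. The sketch below is a minimal illustration using only the Python standard library. The column names (unit, time, outcome, treatment), the integer time index, and the balanced-panel requirement are assumptions made for this example, not the app's documented schema, and check_panel is a hypothetical helper rather than part of mlsynth.

```python
import csv
import io

# Hypothetical column names for illustration; the app's actual schema may differ.
REQUIRED = ("unit", "time", "outcome", "treatment")


def check_panel(csv_text):
    """Run basic sanity checks on a long-format panel and return a summary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)

    # Each (unit, time) pair should appear exactly once.
    keys = [(r["unit"], r["time"]) for r in rows]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate (unit, time) rows")

    # Balanced panel (assumed here): every unit is observed in every period.
    # Time is assumed to be an integer index such as a year.
    units = sorted({r["unit"] for r in rows})
    periods = sorted({int(r["time"]) for r in rows})
    if len(rows) != len(units) * len(periods):
        raise ValueError("panel is unbalanced")

    # Treatment should be a 0/1 indicator; outcome should be numeric.
    for r in rows:
        if r["treatment"] not in ("0", "1"):
            raise ValueError("treatment must be 0 or 1")
        float(r["outcome"])  # raises ValueError if not numeric

    treated = [r for r in rows if r["treatment"] == "1"]
    return {
        "n_units": len(units),
        "n_periods": len(periods),
        "treated_units": sorted({r["unit"] for r in treated}),
        "first_treated_period": min(
            (int(r["time"]) for r in treated), default=None
        ),
    }


# Toy data: three units, four years, unit "A" treated from 2003 onward.
lines = ["unit,time,outcome,treatment"]
for i, unit in enumerate(["A", "B", "C"]):
    for year in range(2001, 2005):
        treated_flag = int(unit == "A" and year >= 2003)
        outcome = 10.0 + i + 0.5 * (year - 2001)
        lines.append(f"{unit},{year},{outcome},{treated_flag}")

summary = check_panel("\n".join(lines))
print(summary)
```

The summary reports how many units and periods were found, which units are ever treated, and when treatment first switches on, which is a quick way to catch a mislabeled treatment column before choosing an estimator.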

Sample Datasets

California Tobacco Law (1988) → Cigarette sales
German Reunification (1990) → GDP
Basque Terrorism (1975) → GDP per capita
Hubei Lockdown (2020) → GDP growth
Hong Kong Integration (2003) → GDP

Available Estimators

FDID: Forward Difference-in-Differences
FSCM: Forward-Selected Synthetic Control
CLUSTERSC: Cluster SCM (PCR, RPCA)
SRC: Synthetic Regression Control
PDA: Panel Data Approaches
FMA: Factor Model Approach
NSC: Nonlinear SCM
SI: Synthetic Interventions

Coming Soon

Robustness checks (placebo tests)

Using the MLSYNTH App

A Word of Caution

MLSYNTH ≠ black box
Estimators rely on assumptions
Misuse → misleading results

Use MLSYNTH to save coding time, but still understand what each method does.