Causal Inference and Market Experiment Design with MLSYNTH

Introduction

Jared Amani Greathouse
- PhD Candidate, Public Policy, Georgia State University
- Advisor: Jason Coupet
- Specialization: Econometrics and Causal Inference
- Focus: Synthetic Control Methods, High-Dimensional Panel Data, Machine Learning for Treatment Effects

My Work

  • Developed mlsynth, the largest Python SCM package to date
  • Suite of tools for policy evaluation using panel data
  • Supports dozens of estimators: vanilla SCM, cluster SCM, factor models, PDA, proximal SCM, and more
  • Handles complex, high-dimensional causal inference settings

What Makes MLSYNTH Special

  • Consolidates decades of non-experimental SCM and panel methods from multiple software ecosystems into Python under a single, consistent syntax
  • Easy installation, automation, and scaling
  • Supports a broad set of estimators for diverse research questions
  • MAREX: Market Experiment Design feature for selecting treated and control units under constraints
  • Decision-making tool: explore multiple scenarios with different penalties, clusters, costs, and budgets; simulations can run overnight

Challenges of Standard Experiments

  • Randomized trials are often impractical at scale
    • High coordination and cost
    • Ethical or logistical constraints
  • Simple randomization ignores pre-existing heterogeneity
  • Cannot easily accommodate budgets, costs, or cluster structures
  • Slow to iterate: each new design requires manual planning

Why Prefer MAREX

  • Optimizes both treated and control selection simultaneously
  • Enforces cluster structure, budgets, and cardinality constraints
  • Penalizes units far from cluster means to ensure representativeness
  • Allows overnight simulation of multiple experimental scenarios
  • Treats synthetic control as a planning tool, not just an estimator
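
To make the constraint ideas above concrete, here is a minimal toy sketch in plain Python. It is not MAREX's algorithm and does not use the mlsynth API: the function name pick_treated, the data layout, and the brute-force search are all assumptions made for illustration. It covers only the treated side of the problem: within one cluster, it picks k treated units whose average pre-period path is closest to the cluster mean, subject to a budget.

```python
from itertools import combinations


def mean_path(rows):
    """Element-wise mean of equal-length outcome series."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_treated(units, k, budget):
    """Hypothetical helper, not part of mlsynth.

    units maps a unit name to (cost, pre_period_series).
    Exhaustively searches all k-unit subsets whose total cost fits the
    budget and returns the subset whose mean pre-period path is closest
    to the cluster mean, together with its squared distance.
    Returns (None, inf) when no subset is affordable.
    """
    cluster_mean = mean_path([series for _, series in units.values()])
    best, best_score = None, float("inf")
    for combo in combinations(sorted(units), k):  # cardinality constraint
        cost = sum(units[u][0] for u in combo)
        if cost > budget:  # budget constraint
            continue
        score = sq_dist(mean_path([units[u][1] for u in combo]), cluster_mean)
        if score < best_score:  # representativeness penalty
            best, best_score = combo, score
    return best, best_score


# Made-up cluster: name -> (cost, pre-period outcomes)
cluster = {
    "A": (5, [10, 11, 12]),
    "B": (3, [20, 21, 22]),
    "C": (4, [14, 15, 16]),
    "D": (2, [16, 17, 18]),
}

# Scenario loop: the same search under different budgets
for budget in (7, 5):
    design, score = pick_treated(cluster, k=2, budget=budget)
    print(budget, design, score)
```

With a budget of 7 the search selects C and D, whose average path matches the cluster mean exactly. With a budget of 5 only B and D are affordable, so the design is less representative. Exhaustive search is only workable for small clusters; it stands in here for whatever optimizer a real design tool would use.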

MAREX vs. Standard Experiment

Feature | Standard Experiment | MAREX
Validity in small/clustered marketing tests | Often compromised | Maintains validity via synthetic control optimization
Budget-aware | Sometimes requires ad hoc adjustments | Built into the optimization
Cluster-aware | Possible but may reduce power | Enforced and balanced by design
Ethical / feasibility constraints | Manual adjustments | Integrated constraints
Scenario testing / iteration | Limited by randomization and manual setup | Overnight simulations with multiple configurations
Representativeness of treated/control | Needs careful design, can fail | Ensured via cluster-weighted synthetic controls

Takeaway

  • MAREX is a planning tool for experiments where randomization is expensive, infeasible, or restricted
  • Produces treatment assignments that are representative, feasible, and comparable
  • Analysts can simulate, optimize, and justify experimental decisions before implementation
  • Bridges econometric theory and practical business experiment design