Synthetic Controls for Marketing Experiments

Experiments
Econometrics
Author: Jared Greathouse

Published: September 20, 2025

The Difficulties of Market Experimentation

Suppose the government of Curaçao wishes to implement a new “Green Stay” initiative that encourages hotels to adopt sustainability measures such as reducing water usage, improving waste management, and shifting toward renewable energy. From a methodological standpoint, the most rigorous way to measure the impact of such a program would be through a randomized controlled trial, where some hotels or neighborhoods are randomly assigned to implement the policy while others serve as controls. Randomization ensures that, in expectation, the treated and control groups are balanced and that differences in outcomes can be attributed to the policy. It also provides a clear framework for statistical inference and lends credibility to the findings.

However, despite these advantages, there are serious obstacles to conducting a true RCT in this context. Ethically, assigning certain hotels to receive the benefits of the program while denying them to others may be perceived as unfair, particularly if the policy boosts reputation, attracts eco-conscious travelers, or leads to financial advantages. Guests might also unknowingly experience different standards, raising questions about fairness and transparency.

Politically, implementing randomization across hundreds of hotels or entire market zones would require coordination among the government, hotel associations, and local businesses, all of whom may resist being “experimented on.” Because tourism is central to the island’s economy, public perception would also be sensitive, and randomized assignment could easily be framed as a risk to one of Curaçao’s most important industries.

The logistical burden is equally daunting: monitoring and enforcing different sustainability requirements on a hotel-by-hotel basis would be complex and costly, with significant overhead required for compliance and enforcement. In practice, coordinating hundreds of independent hotels is unlikely to succeed. In other words, while an RCT is desirable in theory, the ethical, political, and feasibility barriers make it impractical in the context of Curaçao’s hotel sector.

In light of these challenges, a more practical approach is to first consider the market at a localized level, grouping hotels or neighborhoods into clusters that share similar characteristics, such as geographic location, customer demographics, or historical patterns of tourist activity. By focusing on clusters rather than individual units, we can mitigate the ethical concerns of treating some individual hotels differently than others, since the selection occurs within naturally similar groups rather than arbitrarily across the whole market.

Cluster analysis also reduces logistical complexity, because interventions can be coordinated at the cluster level rather than monitored and enforced across hundreds of disparate hotels. Within each cluster, we can then select treated units that closely resemble the local market and control units that reflect the cluster’s underlying dynamics. This ensures that the observed effects of the intervention are not driven by outliers or idiosyncratic behavior, but instead capture the typical response within a given market segment.

But how would we select the treated and control units in the first place? Even within a single cluster, we still could not treat every unit, for a host of reasons. This blog post covers how synthetic control methods can be used to design market experiments.

Notation and Synthetic Control Designs

Let \(\mathcal{J} = \{1, \dots, J\}\) denote neighborhoods observed over \(T_0\) pre-treatment periods \(\mathcal{T}_0 = \{1, \dots, T_0\}\), nested within \(K\) clusters \(\mathcal{K} = \{1, \dots, K\}\). Let \(I_k \subseteq \mathcal{J}\) denote the neighborhoods in cluster \(k\), with \(j \in I_{k(j)}\) indicating cluster membership. Observed outcomes are collected in \(\mathbf{Y} \in \mathbb{R}^{J \times T_0}\), where \(Y_{j,t}\) represents the outcome for neighborhood \(j\) at time \(t\). Potential outcomes are \(Y_{j,t}^I\) under treatment and \(Y_{j,t}^N\) without treatment, with cluster-weighted average treatment effect

\[ \tau_t = \sum_{j=1}^J f_j \, (Y_{j,t}^I - Y_{j,t}^N), \quad t > T_0, \]

where \(f_j \ge 0\) and \(\sum_{j=1}^J f_j = 1\). Predictor vectors \(\mathbf{x}_j \in \mathbb{R}^r\) have cluster mean

\[ \mathbf{\bar{x}}_k = \frac{\sum_{j \in I_k} f_j \mathbf{x}_j}{\sum_{j \in I_k} f_j}. \]

Distances are defined as

\[ D_{1,j,k} = \|\mathbf{x}_j - \mathbf{\bar{x}}_k\|_2^2, \quad D_{2,j,j',k} = \|\mathbf{x}_j - \mathbf{x}_{j'}\|_2^2. \]
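As a minimal sketch of the two definitions above, the following computes the weighted cluster mean and both distances for a single cluster with NumPy. The function names and the toy predictor matrix are hypothetical and chosen only for illustration; they are not part of any library used in this post.

```python
import numpy as np

def cluster_mean(X, f):
    """Weighted cluster mean: sum_j f_j x_j / sum_j f_j."""
    X = np.asarray(X, dtype=float)   # shape (n_units, r)
    f = np.asarray(f, dtype=float)   # shape (n_units,)
    return (f[:, None] * X).sum(axis=0) / f.sum()

def cluster_distances(X, f):
    """Return the cluster mean, D_1 (unit to mean) and D_2 (unit to unit)."""
    X = np.asarray(X, dtype=float)
    xbar = cluster_mean(X, f)
    # D_1[j]: squared Euclidean distance from unit j to the cluster mean.
    d1 = ((X - xbar) ** 2).sum(axis=1)
    # D_2[j, j']: squared Euclidean distance between units j and j'.
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    return xbar, d1, d2

# Hypothetical toy cluster: 3 neighborhoods, 2 predictors.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
f = [0.5, 0.25, 0.25]
xbar, d1, d2 = cluster_distances(X, f)
print(xbar, d1, d2, sep="\n")
```

Here the neighborhood with the largest \(f_j\) pulls the cluster mean toward itself, so it also has the smallest \(D_1\) value.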

Weights \(w_j \ge 0\) represent synthetic treated contributions, \(v_{i,j} \ge 0\) represent unit-level control contributions, and \(z_j \in \{0,1\}\) encodes treatment assignment. Cluster budgets \(B_k\), an overall budget \(B_{\text{total}}\), and a cost vector \(\mathbf{c}\) may constrain feasible treatments, while \(m_{\text{min},k}\) and \(m_{\text{max},k}\) bound the number of treated units in cluster \(k\). All constraints are collected in \(\mathcal{F}\):

\[ \begin{aligned} \mathcal{F} = \Big\{ (w,v) \,\Big| \; & w_j, v_j \ge 0, \quad \sum_{j \in I_k} w_j = \sum_{j \in I_k} v_j = 1, \\ & w_j \le z_j, \; v_j \le 1 - z_j, \quad \sum_{j \in I_k} z_j \in [m_{\text{min},k}, m_{\text{max},k}], \\ & \sum_{j \in I_k} c_j w_j \le B_k, \quad \sum_{k=1}^K \sum_{j \in I_k} c_j w_j \le B_{\text{total}}, \\ & \sum_{k=1}^K z_{j,k} \le 1 \Big\}. \end{aligned} \]
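To make the per-cluster constraints concrete, here is a small pure-Python check of whether a candidate \((w, v, z)\) is feasible for one cluster. It is only a sketch: the function name, the tolerance, and the toy costs and budgets are assumptions for illustration, and the global budget and cross-cluster assignment constraints are left out.

```python
def is_feasible(w, v, z, c, budget, m_min, m_max, tol=1e-9):
    """Check the per-cluster constraints in F for a single cluster."""
    n = len(z)
    # Non-negativity of both weight vectors.
    nonneg = all(w[j] >= -tol and v[j] >= -tol for j in range(n))
    # Treated and control weights each sum to one within the cluster.
    sums_to_one = abs(sum(w) - 1.0) <= tol and abs(sum(v) - 1.0) <= tol
    # w_j <= z_j and v_j <= 1 - z_j: weights live only on their own group.
    support = all(w[j] <= z[j] + tol and v[j] <= 1 - z[j] + tol for j in range(n))
    # Number of treated units stays within [m_min, m_max].
    count_ok = m_min <= sum(z) <= m_max
    # Weighted treatment cost stays within the cluster budget B_k.
    cost_ok = sum(c[j] * w[j] for j in range(n)) <= budget + tol
    return nonneg and sums_to_one and support and count_ok and cost_ok

# Hypothetical toy cluster: unit 0 is treated, units 1 and 2 are controls.
z = [1, 0, 0]
w = [1.0, 0.0, 0.0]
v = [0.0, 0.6, 0.4]
c = [2.0, 1.0, 1.0]  # made-up unit costs

print(is_feasible(w, v, z, c, budget=3.0, m_min=1, m_max=2))  # within budget
print(is_feasible(w, v, z, c, budget=1.0, m_min=1, m_max=2))  # cost 2.0 exceeds budget
```

The second call fails only on the budget constraint, which shows how \(B_k\) can rule out an otherwise valid design.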

For weights \((w,v) \in \mathcal{F}\), the cluster-level loss can be written with underbraces labeling each term:

\[ \begin{aligned} \mathcal{L}(w,v) = \sum_{k=1}^K \Big( \sum_{j \in I_k} f_j \Big) \Big[ & \underbrace{\|\mathbf{\bar{x}}_k - X_{I_k}^\top w_{I_k}\|_2^2}_{\text{match synthetic treated to cluster mean}} + \underbrace{\|\mathbf{\bar{x}}_k - X_{I_k}^\top v_{I_k}\|_2^2}_{\text{match synthetic control to cluster mean}} \\ & + \underbrace{\beta \| X_{I_k}^\top w_{I_k} - X_{I_k}^\top v_{I_k} \|_2^2}_{\text{encourage treated/control similarity}} + \underbrace{\lambda_1 w_{I_k}^\top D_{1,k} w_{I_k}}_{\text{penalize treated distance from cluster mean}} + \underbrace{\lambda_2 v_{I_k}^\top D_{1,k} v_{I_k}}_{\text{penalize control distance from cluster mean}} \\ & + \underbrace{\xi \sum_{j \in I_k} w_j \| \mathbf{x}_j - X_{I_k}^\top V_{k,j} \|_2^2}_{\text{unit-level treated vs. synthetic control}} + \underbrace{\lambda_{2,\text{unit}} \mathrm{tr}(W_{I_k}^\top D_{2,k} V_k)}_{\text{pairwise unit-level penalty}} \Big]. \end{aligned} \]

The synthetic control optimization is then

\[ \min_{(w,v) \in \mathcal{F}} \mathcal{L}(w,v). \]
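The sketch below evaluates the first five terms of the loss for one cluster, given fixed weights. It makes one assumption that the text does not state explicitly: \(D_{1,k}\) is taken to be the diagonal matrix with entries \(D_{1,j,k}\). The unit-level terms are omitted, and the function name and toy data are hypothetical.

```python
import numpy as np

def cluster_loss(X, f, w, v, beta=0.0, lam1=0.0, lam2=0.0):
    """Cluster-level loss without the unit-level terms (xi, lambda_{2,unit})."""
    X, f, w, v = (np.asarray(a, dtype=float) for a in (X, f, w, v))
    xbar = (f[:, None] * X).sum(axis=0) / f.sum()
    # Assumption: D_{1,k} = diag(D_{1,j,k}), the squared distances to the mean.
    D1 = np.diag(((X - xbar) ** 2).sum(axis=1))
    syn_treated = X.T @ w   # synthetic treated predictor vector
    syn_control = X.T @ v   # synthetic control predictor vector
    loss = np.sum((xbar - syn_treated) ** 2) + np.sum((xbar - syn_control) ** 2)
    loss += beta * np.sum((syn_treated - syn_control) ** 2)
    loss += lam1 * (w @ D1 @ w) + lam2 * (v @ D1 @ v)
    # Scale by the cluster's total population weight, sum_{j in I_k} f_j.
    return float(f.sum() * loss)

# Hypothetical toy cluster: four units on the corners of a square.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
f = [0.25, 0.25, 0.25, 0.25]
w = [0.5, 0.0, 0.0, 0.5]   # treated weights on opposite corners
v = [0.0, 0.5, 0.5, 0.0]   # control weights on the other two corners

print(cluster_loss(X, f, w, v))            # both synthetic units hit the mean
print(cluster_loss(X, f, w, v, lam1=1.0))  # distance penalty is now active
```

In this toy case both synthetic units reproduce the cluster mean exactly, so the matching terms vanish and only the \(\lambda_1\) penalty contributes once it is switched on.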

Variants of the design correspond to selecting which terms are active:

  • Base: only match synthetic treated and control to cluster mean.
  • Weak: adds \(\beta\) penalty for treated/control similarity (ATT-focused).
  • Penalized: adds \(\lambda_1, \lambda_2\) for cluster fidelity.
  • Unit-level: adds \(\xi, \lambda_{1,\text{unit}}, \lambda_{2,\text{unit}}\) for fine-grained matching at the individual unit level.
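For intuition, the Base variant can be sketched for a single cluster once the assignment \(z\) is held fixed: the treated and control weights then solve two separate least-squares problems on the simplex. The code below uses projected gradient descent, which is a technique swapped in here for illustration and not necessarily how any particular package solves the problem. Choosing \(z\) itself is a harder combinatorial step that this sketch does not attempt, and all names and data are hypothetical.

```python
import numpy as np

def project_simplex(y):
    """Euclidean projection of y onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(y)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(y) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(y - theta, 0.0)

def fit_simplex_weights(X, target, n_iter=2000):
    """Minimize ||target - X^T w||^2 over the simplex by projected gradient."""
    X = np.asarray(X, dtype=float)
    target = np.asarray(target, dtype=float)
    w = np.full(X.shape[0], 1.0 / X.shape[0])
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * X @ (X.T @ w - target)
        w = project_simplex(w - step * grad)
    return w

def base_design(X, f, z):
    """Base variant for one cluster with a fixed treatment assignment z."""
    X, f = np.asarray(X, dtype=float), np.asarray(f, dtype=float)
    treated = np.asarray(z).astype(bool)
    xbar = (f[:, None] * X).sum(axis=0) / f.sum()
    w, v = np.zeros(len(f)), np.zeros(len(f))
    w[treated] = fit_simplex_weights(X[treated], xbar)    # match treated to mean
    v[~treated] = fit_simplex_weights(X[~treated], xbar)  # match controls to mean
    return w, v

# Hypothetical toy cluster: units 0 and 3 are treated.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
f = [0.1, 0.2, 0.3, 0.4]
w, v = base_design(X, f, z=[1, 0, 0, 1])
print(np.round(w, 4), np.round(v, 4))
```

The other variants add terms to the objective inside `fit_simplex_weights`; the \(\beta\) term in particular couples \(w\) and \(v\), so the two problems can no longer be solved separately.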

A Simple Example

To illustrate, suppose we have 120 pre-treatment periods of Net Revenue per Available Room across two clusters of neighborhoods in Curaçao. The non-Willemstad cluster includes Barber, Lagún, Oostpunt, Santa Rosa, Sint Willibrordus, Soto, Spaanse Water, Tera Corá, and Westpunt. In this cluster, the treated units that received weight were Oostpunt with 0.39, Soto with 0.28, and Tera Corá with 0.33. The controls receiving weight were Barber 0.17, Lagún 0.10, Santa Rosa 0.18, Sint Willibrordus 0.17, Spaanse Water 0.15, and Westpunt 0.23. The RMSE between the synthetic treated unit and the synthetic control was 0.30, indicating a reasonably close match.

The Willemstad cluster consists of Brievengat, Groot Kwartier, Groot Piscadera, Hato, Koraal Partir, Otrobanda, Pietermaai, Piscadera Bay, Saliña, Scharloo, Sint Michiel, and Steenrijk. Among the treated units, Groot Kwartier received 0.24, Koraal Partir 0.02, Saliña 0.33, and Scharloo 0.41. The control units with weight were Groot Piscadera 0.13, Otrobanda 0.47, Piscadera Bay 0.34, and Steenrijk 0.06. The RMSE between the synthetic treated and synthetic control units in this cluster was 0.53, a looser fit than in the non-Willemstad cluster.
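As a quick sanity check on the numbers reported above, the following pure-Python snippet confirms that the treated and control weights each sum to one within a cluster, that no neighborhood appears on both sides, and which neighborhoods are not listed with any weight. The weights are copied from the text; the dictionary layout and function name are just one convenient way to hold them.

```python
# Weights as reported in the text, keyed by cluster and role.
reported = {
    "non-Willemstad": {
        "units": ["Barber", "Lagún", "Oostpunt", "Santa Rosa", "Sint Willibrordus",
                  "Soto", "Spaanse Water", "Tera Corá", "Westpunt"],
        "treated": {"Oostpunt": 0.39, "Soto": 0.28, "Tera Corá": 0.33},
        "control": {"Barber": 0.17, "Lagún": 0.10, "Santa Rosa": 0.18,
                    "Sint Willibrordus": 0.17, "Spaanse Water": 0.15,
                    "Westpunt": 0.23},
    },
    "Willemstad": {
        "units": ["Brievengat", "Groot Kwartier", "Groot Piscadera", "Hato",
                  "Koraal Partir", "Otrobanda", "Pietermaai", "Piscadera Bay",
                  "Saliña", "Scharloo", "Sint Michiel", "Steenrijk"],
        "treated": {"Groot Kwartier": 0.24, "Koraal Partir": 0.02,
                    "Saliña": 0.33, "Scharloo": 0.41},
        "control": {"Groot Piscadera": 0.13, "Otrobanda": 0.47,
                    "Piscadera Bay": 0.34, "Steenrijk": 0.06},
    },
}

def summarize(cluster):
    """Sum the weights per role and list overlapping and unweighted units."""
    treated, control = cluster["treated"], cluster["control"]
    return {
        "treated_sum": sum(treated.values()),
        "control_sum": sum(control.values()),
        "overlap": sorted(set(treated) & set(control)),
        "unweighted": sorted(set(cluster["units"]) - set(treated) - set(control)),
    }

summaries = {name: summarize(c) for name, c in reported.items()}
for name, s in summaries.items():
    print(name, s)
```

Every non-Willemstad neighborhood carries weight on one side or the other, while four Willemstad neighborhoods (Brievengat, Hato, Pietermaai, and Sint Michiel) are not listed with weight in either group.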

The plot below gives our results from the penalized synthetic experiment. We can see that the penalized estimator gives good pretreatment fit.
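Pretreatment fit of the kind reported above can be summarized by the RMSE between the synthetic treated and synthetic control series. A minimal sketch, assuming an outcome matrix with one row per neighborhood and one column per pre-treatment period, follows. The function name and the toy numbers are hypothetical and are not the Curaçao data.

```python
import numpy as np

def pretreatment_rmse(Y, w, v):
    """RMSE between the synthetic treated and synthetic control series."""
    Y = np.asarray(Y, dtype=float)                # shape (n_units, T0)
    syn_treated = np.asarray(w, dtype=float) @ Y  # weighted treated series
    syn_control = np.asarray(v, dtype=float) @ Y  # weighted control series
    return float(np.sqrt(np.mean((syn_treated - syn_control) ** 2)))

# Hypothetical outcomes for 3 units over 3 pre-treatment periods.
Y = [[1.0, 2.0, 3.0],
     [3.0, 2.0, 1.0],
     [2.0, 2.0, 2.0]]
w = [1.0, 0.0, 0.0]   # unit 0 is the only treated unit
v = [0.0, 0.5, 0.5]   # controls split weight evenly

rmse = pretreatment_rmse(Y, w, v)
print(rmse)
```

A smaller value means the two synthetic series track each other more closely before treatment, which is what makes post-treatment gaps interpretable.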


Next, I solve a quadratic relaxation of the above objective.
