Propensity Scores

基本的概念
重要的结果
应用
评估

Rosenbaum P. and Rubin D. The Central Role of the Propensity Score in Observational Studies For Causal Effects. Biometrika, 1983, 70(1): 41-55.

Propensity score matching, wiki.

Austin P. An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies.Multivariate behavioral research, 2011, 46(3): 399-424.

基本的概念

符号	说明
X	covariate, 用于决策何种treatment
\(Z \in \{0, 1 \}\)	Treatment
\(r_{ni}\)	第\(n\)个实例, \(z_n=i\) 下的反应(outcome)

Strongly ignorable treatment assignment:

即满足条件可交换性:

\[(r_0, r_1) \perp Z | X.
\]

Balancing Score:

一个关于随机变量\(X\)的函数\(b(X)\)被称为balancing score, 若:

\[X \perp Z | b(X).
\]

Propensity Score:

\[e(x) := P(Z=1|X=x).
\]

重要的结果

\(X \perp Z | b(X)\)

一个函数\(b(X)\)是balancing score, 当且仅当存在一个映射\(f\)使得\(e(X) = f((b(X))\).

\(\Leftarrow\)

当, \(b(x) \not= b\)的时候, 显然\(P(Z=z, X=x|b(X)=b)=0\), 此时满足条件独立性, 故只需考虑\(b(x) = b\)的情况.

\[\begin{array}{ll}
P(Z=z, X=x|b(X)=b)
&= P(Z=z| X=x, b(X) = b) \: P(X=x|b(X)=b) \\
&= P(Z=z| X=x) \: P(X=x|b(X)=b) \\
&\mathop{=}\limits^{?}P(Z=z|b(X)=b) \: P(X=x|b(X)=b).
\end{array}
\]

显然最后一个等式成立, 只需满足:

\[P(Z=z|b(X)=b) = P(Z=z|X=x) = e(x')^z \cdot (1 - e(x'))^{1-z}, \quad \forall x' \in \{x| b(X) = b\}
\]

注: 最后一个等式成立, 是因为\(e(x') = f(b(x')) = f(b)\).

又

\[\begin{array}{ll}
P(Z=z|b(X)=b)
&= \sum_{x' \in \{b(X) = b\}}P(Z=z|X=x', b(X)=b)\: P(X=x'|b(X)=b) \\
&= \sum_{x' \in \{b(X) = b\}}P(Z=z|X=x')\: P(X=x'|b(X)=b) \\
&= \sum_{x' \in \{b(X) = b\}} e(x')^z \cdot (1 - e(x'))^{1-z} \: P(X=x'|b(X)=b) \\
&= e(x')^z \cdot (1 - e(x'))^{1-z}\\
&= P(Z=z|X=x).
\end{array}
\]

注: 显然上面的证明是要求\(Z \in \{0, 1\}\)的, 即二元的treatment.

除非有额外的条件, 比如:

\[P(Z=z|X=x) = P(Z=z|X=x') = p(z, b)
\]

对所有的\(x, x' \in \{x| b(X) = b\}\).

\(\Rightarrow\)

首先, 如果\(b(X)\)本身从\(X\)的一个单射, 那么显然存在这样的\(f\).

若\(b()\)不是单射, 且不存在\(f\)使得\(e(X) = f(b(X))\), 则一定存在\(x, x'\)使得

\[e(x) \not= e(x'), \quad b(x) = b(x').
\]

此时:

\[P(Z=1|X=x) \not= P(Z=z|X=x') \rightarrow
P(Z=1|X=x, b(x)) \not= P(Z=z|X=x', b(x')).
\]

故\(b(X)\)不是balancing score, 矛盾.

注: 显然\(e(X)\)以及\(b(X) = X\)均为balancing score.

\((r_0, r_1) \perp Z | b(X)\)

若:

\[(r_0, r_1) \perp Z | X, \quad 0 < P(Z=1|X) < 1,
\]

则:

\[(r_0, r_1) \perp Z | b(X), \quad 0 < P(Z=1|b(X)) < 1.
\]

不等式的证明是显然的.

只需证明:

\[P(Z=1|r_0, r_1, b(X)=b) = P(Z=1|b(X)=b) = e(X).
\]

\[\begin{array}{ll}
P(Z=1|r_0, r_1, b(X)=b)
&= \mathbb{E}_{x} \mathbb{E}_{z} [[Z|X=x, r_0, r_1,b(X)=b)] |r_0, r_1,b(X)=b] \\
&= \mathbb{E}_{x} \mathbb{E}_{z} [[Z|X=x)] |r_0, r_1,b(X)=b] \\
&= \mathbb{E}_{x} [e(X) |r_0, r_1,b(X)=b] \\
&= e(X)
\end{array}
\]

最后一个等式成立, 是因为, \(b(X)=b \rightarrow e(X) = f(b)\).

倘若上面的额外的条件成立, 即

\[P(Z=z|X=x) = P(Z=z|X=x') = p(z).
\]

则有:

\[\begin{array}{ll}
P(Z=z|r_0, r_1, b(X)=b)
&= \sum_{x' \in \{b(X) = b\}} P(Z=z, X=x'|r_0, r_1, b(X)=b) \\
&= \sum_{x'} P(Z=z|r_0, r_1, X=x', b(X)=b)\: P(X=x'|r_0, r_1, b(X)=b) \\
&= \sum_{x'} P(Z=z|X=x')\: P(X=x'|r_0, r_1, b(X)=b) \\
&= \sum_{x'} p(z)\: P(X=x'|r_0, r_1, b(X)=b) \\
&= p(z) = P(Z=z|X=x) = P(Z=z|b(X)=b).
\end{array}
\]

总结为:

若:

\[(r_0, r_1) \perp Z | X, \quad 0 < P(Z=z|X) < 1,
\]

且:

\[P(Z=z|X=x) = P(Z=z|X=x') = p(z;b), \quad \forall x, x' \in \{b(x)=b\}.
\]

则:

\[(r_0, r_1) \perp Z | b(X), \quad 0 < P(Z=z|b(X)) < 1.
\]

应用

假设\(X\)包含所有地confounders, 即

\[r \perp Z | X.
\]

Propensity Score Matching

既然, 在\(e(x)\)下:

\[r \perp Z | e(x),
\]

那么:

\[\mathbb{E}[r_1 - r_0] = \mathbb{E}_{e(x)} \: \{\mathbb{E} [r|e(x), Z=1] - \mathbb{E}[r|e(x), Z=0]\}.
\]

这个期望的过程可以分解为:

随机采样\(e(x)\);
在所有\(e(X)=e(x)\)的样本中, 随机选择\(Z=0\)和\(Z=1\)的样本;

通过此过程构造的新的数据集, 显然只需要将treated group中的群体对\(r\)取平均减去control group中的平均就能得到最后的treatment effect的估计了.

通过 propensity score matching 重采样构造的数据集满足:

\[Z \perp e(X).
\]

因为对于每一个treated group 中有一个样本\(e(x) = e\), 在control group中就有一个对应的\(e(x') = e\).

propensity score matching 重采样的实际方式可以简化为:

从treated group 中随机采样一个样本\((x,z,r)\);
计算其propensity score \(e(x)\);
从control group 中找到一个对应的\((x',z',r')\) 满足\(e(x')=e(x)\);
若存在多个\(x'\), 在其中随机采样一个.

上述采样过程中, 会遇到的问题:

不存在\(x'\), 这种情况是很容易遇到的, 一般, 我们可以选取\(x'\)使得\(e(x')\)最接近\(e(x)\), 这种方式一般称为greedy matching; 或者, 我们可以指定一个threshold, 在threshold内的\(\{x'\}\)中采样, 若一个都没有, 则舍弃\(x\).
\(x, x'\)被选中之后, 是否仍有机会被采样, 这是俩种策略;

Stratification on the Propensity Score

即将\(e(X)\)的值域分割成互斥的K个部分, 每个部分所包含的样本数量相近.

然后对每一个部分计算treatment effect, 最后再平均(加权平均, 权重为样本数量).

一般情况下, \(K=5\), 就能使得每一个stratum内的\(e(X)\)的值非常接近, 这就能够近似保证:

\[X \perp Z
\]

在每一个stratum内成立.

那么, 此时我们只需通过取平均就能直接计算出每一个stratum的treatment effect.

Inverse Probability of Treatment Weighting Using the Propensity Score

这个实际上就是普通的 IP weighting.

评估

显然, 我们多半需要从已有的数据中估计出 propensity score, 比如用常见的逻辑斯蒂回归模型. 自然地, 我们需要判断我们拟合的模型是否正确.

既然propensity score 也是一个 balancing score, 那么如果拟合的比较正确, 就应该有:

\[X \perp Z | e(X).
\]

也就是说, 我们需要判断, 在每一个\(e(x)\)下, \(X, Z\)是否独立.

对于matching, 若条件独立满足, 则有:

\[\mathbb{E}_{e(x)}\{\mathbb{E}[X|Z=1, e(x)] | Z=1 \}
=\mathbb{E}_{e(x)}\{\mathbb{E}[X|Z=0, e(x)] | Z=0 \}
\]

一个期望里用了条件独立, 第二个条件期望相等是因为matching 保证:

\[e(X) | Z.
\]

故, 我们只需要比较treated group 和 control group的一阶矩的差别:

\[\mathbb{E}[X|Z=1] - \mathbb{E}[X|Z=0].
\]

在实际中, 比较的是如下的标准化的:

\[d = \frac{|\bar{x}_{treated} - \bar{x}_{control}|}{\sqrt{(s_{treated}^2 + s_{control}^2) / 2}}.
\]

一般\(d < 0.1\)就可以认为这个propensity score拟合的不错.

对于stratification, 我们只需对每一个strata判断上面的结果.

对于IP weighing, 说实话没读懂:

For IPTW this assessment involves comparing treated and untreated subjects in the sample weighted by the inverse probability of treatment.