We consider two recent suggestions for how to perform an empirically motivated Monte Carlo study to help select a treatment effect estimator under unconfoundedness. We show theoretically that neither is likely to be informative except under restrictive conditions that are unlikely to be satisfied in many contexts. To test empirical relevance, we also apply the approaches to a real‐world setting where estimator performance is known. Both approaches are worse than random at selecting estimators that minimize absolute bias. They perform better when selecting estimators that minimize mean squared error, but a simple bootstrap is at least as good and often better. For now, researchers would be best advised to use a range of estimators and compare estimates for robustness.