Institute for Fiscal Studies | Working Papers. ISSN: 1742-0415. Working papers undergo an informal review process and are edited by Ian Preston. http://www.ifs.org.uk Sun, 26 Mar 2017 07:14:15 +0000
<![CDATA[Discretizing unobserved heterogeneity]]> We study panel data estimators based on a discretization of unobserved heterogeneity when individual heterogeneity is not necessarily discrete in the population. We focus on two-step grouped fixed-effects estimators, where individuals are classified into groups in a first step using kmeans clustering, and the model is estimated in a second step allowing for group-specific heterogeneity. We analyze the asymptotic properties of these discrete estimators as the number of groups grows with the sample size, and we show that bias reduction techniques can improve their performance. In addition to reducing the number of parameters, grouped fixed-effects methods provide effective regularization. When allowing for time-varying unobserved heterogeneity, we show that they enjoy fast rates of convergence that depend on the underlying dimension of heterogeneity. Finally, we document the finite-sample properties of two-step grouped fixed-effects estimators in two applications: a structural dynamic discrete choice model of migration, and a model of wages with worker and firm heterogeneity.
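For readers who want the mechanics, the following is a minimal sketch of the two-step recipe described above for a generic static panel: classify individuals with kmeans in a first step, then re-estimate with group-specific effects in a second step. The choice of clustering moment (time-averaged outcomes), the use of scikit-learn, and all function and variable names are illustrative assumptions, not the paper's implementation.

```python
# Sketch of a two-step grouped fixed-effects estimator (illustrative only).
# Step 1: classify individuals into K groups by kmeans on an individual-level moment.
# Step 2: re-estimate the model with group-specific intercepts.
import numpy as np
from sklearn.cluster import KMeans

def grouped_fixed_effects(y, X, ids, n_groups=10, seed=0):
    """y: (NT,) outcomes; X: (NT, p) regressors; ids: (NT,) individual labels."""
    units = np.unique(ids)
    # Step 1: cluster individuals on their time-averaged outcomes (a simple moment).
    unit_means = np.array([y[ids == i].mean() for i in units])
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed)
    labels = km.fit_predict(unit_means.reshape(-1, 1))
    group_of_obs = labels[np.searchsorted(units, ids)]
    # Step 2: least squares with group dummies standing in for individual heterogeneity.
    G = np.eye(n_groups)[group_of_obs]               # group indicator matrix
    coefs, *_ = np.linalg.lstsq(np.hstack([X, G]), y, rcond=None)
    return coefs[:X.shape[1]], labels                # common slopes, group assignment
```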

]]>
http://www.ifs.org.uk/publications/9088 Wed, 22 Mar 2017 00:00:00 +0000
<![CDATA[Estimation of random coefficients logit demand models with interactive fixed effects]]> We extend the Berry, Levinsohn and Pakes (BLP, 1995) random coefficients discrete-choice demand model, which underlies much recent empirical work in IO. We add interactive fixed effects in the form of a factor structure on the unobserved product characteristics. The interactive fixed effects can be arbitrarily correlated with the observed product characteristics (including price), which accommodates endogeneity and, at the same time, captures strong persistence in market shares across products and markets. We propose a two-step least squares-minimum distance (LS-MD) procedure to calculate the estimator. Our estimator is easy to compute, and Monte Carlo simulations show that it performs well. We provide an empirical illustration using US automobile demand.

]]>
http://www.ifs.org.uk/publications/8911 Wed, 22 Feb 2017 00:00:00 +0000
<![CDATA[Equality-minded treatment choice]]> The goal of many randomized experiments and quasi-experimental studies in economics is to inform policies that aim to raise incomes and reduce economic inequality. A policy maximizing the sum of individual incomes may not be desirable if it magnifies economic inequality and post-treatment redistribution of income is infeasible. This paper develops a method to estimate the optimal treatment assignment policy based on observable individual covariates when the policy objective is to maximize an equality-minded rank-dependent social welfare function, which puts higher weight on individuals with lower-ranked outcomes. We estimate the optimal policy by maximizing a sample analog of the rank-dependent welfare over a properly constrained set of policies. Although an analytical characterization of the optimal policy under a rank-dependent social welfare function is not available even with knowledge of the potential outcome distributions, we show that the average social welfare attained by our estimated policy converges to the maximal attainable welfare at the n^{-1/2} rate uniformly over a large class of data distributions. We also show that this rate is minimax optimal. We provide an application of our method using data from the National JTPA Study.

]]>
http://www.ifs.org.uk/publications/8909 Wed, 22 Feb 2017 00:00:00 +0000
<![CDATA[Nonparametric identification of random coefficients in endogenous and heterogeneous aggregate demand models]]> This paper studies nonparametric identification in market-level demand models for differentiated products with heterogeneous consumers. We consider a general class of models that allows the individual-specific coefficients to vary continuously across the population and give conditions under which the density of these coefficients, and hence also functionals such as welfare measures, is identified. Building on earlier work by Berry and Haile (2013), we show that key identifying restrictions are provided by (i) a set of moment conditions generated by instrumental variables together with an inversion of aggregate demand in unobserved product characteristics; and (ii) variation in product characteristics across markets that is exogenous to the individual heterogeneity. We further show that two leading models, the BLP model (Berry, Levinsohn, and Pakes, 1995) and the pure characteristics model (Berry and Pakes, 2007), require considerably different conditions on the support of the product characteristics.

]]>
http://www.ifs.org.uk/publications/8910 Wed, 22 Feb 2017 00:00:00 +0000
<![CDATA[Optimal sup-norm rates and uniform inference on nonlinear functionals of nonparametric IV regression]]> This paper makes several important contributions to the literature on nonparametric instrumental variables (NPIV) estimation and inference on a structural function h0 and its functionals. First, we derive sup-norm convergence rates for computationally simple sieve NPIV (series 2SLS) estimators of h0 and its derivatives. Second, we derive a lower bound that describes the best possible (minimax) sup-norm rates of estimating h0 and its derivatives, and show that the sieve NPIV estimator can attain the minimax rates when h0 is approximated via a spline or wavelet sieve. Our optimal sup-norm rates surprisingly coincide with the optimal root-mean-squared rates for severely ill-posed problems, and are only a logarithmic factor slower than the optimal root-mean-squared rates for mildly ill-posed problems. Third, we use our sup-norm rates to establish uniform Gaussian process strong approximations and score bootstrap uniform confidence bands (UCBs) for collections of nonlinear functionals of h0 under primitive conditions, allowing for mildly and severely ill-posed problems. Fourth, as applications, we obtain the first asymptotic pointwise and uniform inference results for plug-in sieve t-statistics of exact consumer surplus (CS) and deadweight loss (DL) welfare functionals under low-level conditions when demand is estimated via sieve NPIV. Our real-data application of UCBs for exact CS and DL functionals of gasoline demand reveals interesting patterns and is applicable to other markets.
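As background for the rates discussed above, a minimal numerical sketch of the series 2SLS (sieve NPIV) estimator follows. The polynomial bases and all names are simplifications we introduce for illustration; the paper's optimality results are stated for spline and wavelet sieves.

```python
# Sieve NPIV (series two-stage least squares), illustrative sketch.
# h0(x) is approximated by a linear combination of sieve terms psi_0..psi_{J-1};
# b_0..b_{K-1} are instrument sieve terms, with K >= J.
import numpy as np

def poly_basis(v, J):
    return np.column_stack([v**j for j in range(J)])

def sieve_npiv(y, x, w, J=5, K=8):
    """Series 2SLS estimate of the sieve coefficients of h0; returns the fitted h."""
    Psi = poly_basis(x, J)                          # sieve for the endogenous regressor
    B = poly_basis(w, K)                            # sieve for the instrument
    Pb = B @ np.linalg.pinv(B.T @ B) @ B.T          # projection onto the instrument space
    c = np.linalg.solve(Psi.T @ Pb @ Psi, Psi.T @ Pb @ y)
    return lambda x0: poly_basis(np.atleast_1d(x0), J) @ c
```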

]]>
http://www.ifs.org.uk/publications/8896 Mon, 13 Feb 2017 00:00:00 +0000
<![CDATA[An econometric model of network formation with degree heterogeneity]]> I introduce a model of undirected dyadic link formation which allows for assortative matching on observed agent characteristics (homophily) as well as unrestricted agent-level heterogeneity in link surplus (degree heterogeneity). As in fixed-effects panel data analyses, the joint distribution of observed and unobserved agent-level characteristics is left unrestricted. Two estimators for the (common) homophily parameter, β, are developed and their properties studied under an asymptotic sequence involving a single network growing large. The first, the tetrad logit (TL) estimator, conditions on a sufficient statistic for the degree heterogeneity. The second, the joint maximum likelihood (JML) estimator, treats the degree heterogeneity {A_i0}_{i=1}^N as additional (incidental) parameters to be estimated. The TL estimate is consistent under both sparse and dense graph sequences, whereas consistency of the JML estimate is shown only under dense graph sequences.
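For concreteness, the dyadic link rule behind the two estimators can be written schematically as follows (a sketch in generic notation, assuming, as the 'logit' in tetrad logit suggests, i.i.d. logistic errors; this is not a transcription of the paper's equations):

```latex
D_{ij} \;=\; \mathbf{1}\left\{\, W_{ij}'\beta \;+\; A_i \;+\; A_j \;-\; U_{ij} \,\ge\, 0 \right\},
\qquad U_{ij} \ \text{i.i.d. logistic},
```

where W_{ij} collects observable dyad characteristics driving homophily and the A_i are the degree-heterogeneity terms: TL conditions the A_i out via a sufficient statistic, while JML estimates them jointly with β.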

Supplement for CWP 08/17

]]>
http://www.ifs.org.uk/publications/8894 Fri, 10 Feb 2017 00:00:00 +0000
<![CDATA[Instrumental variables estimation for nonparametric models]]> Whitney Newey and James Powell, founding CeMMAP Fellows, wrote an influential paper on instrumental variable estimation of an additive-error, non-parametrically specified structural equation, presented at the December 1988 North American Winter Meetings of the Econometric Society. A version circulated in 1989, but the results were not published until 14 years later: “Instrumental Variable Estimation of Nonparametric Models”, Econometrica, 71(5), September 2003, 1565-1578. The 1989 working paper is often referred to but hard to find. For the sake of completeness, CeMMAP is pleased to publish the 1989 paper.

]]>
http://www.ifs.org.uk/publications/8874 Fri, 03 Feb 2017 00:00:00 +0000
<![CDATA[Design of optimal corrective taxes in the alcohol market]]> Alcohol consumption is associated with costs to society due to its impact on crime and health. Tax can lead consumers to internalise these externalities. We study optimal corrective taxation in the alcohol market. We allow for the fact that the externality-generating commodity (ethanol) is available in many differentiated products, over which consumers might have heterogeneous preferences, and that there may also be heterogeneity in marginal externalities across consumers. We show that, if there is correlation in preferences and marginal externalities, setting different tax rates across products can improve welfare relative to a single tax rate on ethanol. We estimate a model of demand in the UK alcohol market and numerically solve for the optimal tax rates. Moving to an optimal system that taxes alcohol types at different rates would close half of the welfare gap between the current UK system and the first best.

]]>
http://www.ifs.org.uk/publications/8868 Tue, 31 Jan 2017 00:00:00 +0000
<![CDATA[A coupled component GARCH model for intraday and overnight volatility]]> We propose a semi-parametric coupled component GARCH model for intraday and overnight volatility that allows the two periods to have different properties. To capture the very heavy tails of overnight returns, we adopt a dynamic conditional score model with t innovations. We propose a multi-step estimation procedure that captures the nonparametric slowly moving components by kernel estimation and the dynamic parameters by t maximum likelihood. We establish the consistency and asymptotic normality of our estimation procedures. We extend the modelling to the multivariate case. We apply our model to the study of the Dow Jones industrial average component stocks over the period 1991-2016 and the CRSP cap-based portfolios over the period 1992-2015. We show that the ratio of overnight to intraday volatility has increased in importance for large stocks over the last 20 years. In addition, our model provides better intraday volatility forecasts, since it takes account of the full dynamic consequences of the overnight shock and of previous shocks.

]]>
http://www.ifs.org.uk/publications/8861 Thu, 26 Jan 2017 00:00:00 +0000
<![CDATA[The influence function of semiparametric estimators]]> There are many economic parameters that depend on nonparametric first steps. Examples include games, dynamic discrete choice, average consumer surplus, and treatment effects. Often estimators of these parameters are asymptotically equivalent to a sample average of an object referred to as the influence function. The influence function is useful in formulating regularity conditions for asymptotic normality, for bias reduction, in efficiency comparisons, and for analyzing robustness. We show that the influence function of a semiparametric estimator is the limit of a Gateaux derivative with respect to a smooth deviation as the deviation approaches a point mass. This result generalizes the classic Von Mises (1947) and Hampel (1974) calculation to apply to estimators that depend on smooth nonparametric first steps. We characterize the influence function of M- and GMM-estimators. We apply the Gateaux derivative to derive the influence function when the first step is a nonparametric two-stage least squares estimator based on orthogonality conditions. We also use the influence function to analyze high-level and primitive regularity conditions for asymptotic normality. We give primitive regularity conditions for linear functionals of series regression that are the weakest known, except for a log term, when the regression function is smooth enough.
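Schematically, with θ(·) a functional of the data distribution F_0 and G_{x,h} a smooth deviation, the calculation described above is (generic notation, a sketch rather than the paper's exact statement):

```latex
\psi(x) \;=\; \lim_{h \to 0}\ \frac{\partial}{\partial \tau}\, \theta\big((1-\tau)\,F_0 + \tau\, G_{x,h}\big)\Big|_{\tau = 0},
```

where G_{x,h} approaches a point mass at x as h → 0; the classical Von Mises-Hampel calculation differentiates toward the point mass δ_x directly, which is exactly what fails when θ involves smooth nonparametric first steps.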

]]>
http://www.ifs.org.uk/publications/8862 Thu, 26 Jan 2017 00:00:00 +0000
<![CDATA[Inference in linear regression models with many covariates and heteroskedasticity]]> The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity-consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity-consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is also provided. The proposed methods are also illustrated with an empirical application.
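For reference, a sketch of the conventional Eicker-White ('HC') variance estimator whose inconsistency under many covariates is established above; the paper's corrected, fully automatic formula is not reproduced here, and the code below is only the textbook benchmark.

```python
# Conventional heteroskedasticity-consistent (Eicker-White) variance estimator for OLS;
# this is the type of estimator shown above to be inconsistent when the number of
# covariates grows proportionally with the sample size.
import numpy as np

def eicker_white_se(y, X, hc1=True):
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta                         # OLS residuals
    meat = X.T @ (u[:, None] ** 2 * X)       # sum_i u_i^2 x_i x_i'
    V = XtX_inv @ meat @ XtX_inv             # sandwich variance
    if hc1:                                  # HC1 degrees-of-freedom scaling
        V *= n / (n - k)
    return beta, np.sqrt(np.diag(V))
```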

]]>
http://www.ifs.org.uk/publications/8855 Fri, 20 Jan 2017 00:00:00 +0000
<![CDATA[Likelihood inference and the role of initial conditions for the dynamic panel data model]]> Lancaster (2002) proposes an estimator for the dynamic panel data model with homoskedastic errors and zero initial conditions. In this paper, we show this estimator is invariant to orthogonal transformations, but is inefficient because it ignores additional information available in the data. The zero initial condition is trivially satisfied by subtracting initial observations from the data. We show that differencing out the data further erodes efficiency compared to drawing inference conditional on the first observations. Finally, we compare the conditional method with standard random effects approaches for unobserved data. Standard approaches implicitly rely on normal approximations, which may not be reliable when unobserved data are very skewed with some mass at zero values. For example, panel data on firms naturally depend on the first period in which the firm enters a new state. It seems unreasonable then to assume that the process determining unobserved data is known or stationary. We can instead make inference on structural parameters by conditioning on the initial observations.

]]>
http://www.ifs.org.uk/publications/8856 Fri, 20 Jan 2017 00:00:00 +0000
<![CDATA[Two decades of income inequality in Britain: the role of wages, household earnings and redistribution]]> We study earnings and income inequality in Britain over the past two decades, including the period of relatively “inclusive” growth from 1997-2004 and the Great Recession. We focus on the middle 90%, where trends have contrasted strongly with the “new inequality” at the very top. Household earnings inequality has risen, driven by male earnings – although a ‘catch-up’ of female earnings did hold down individual earnings inequality and reduce within-household inequality. Nevertheless, net household income inequality fell due to deliberate increases in redistribution, the tax and transfer system’s insurance role during the Great Recession, falling household worklessness, and rising pensioner incomes.

]]>
http://www.ifs.org.uk/publications/8833 Fri, 13 Jan 2017 00:00:00 +0000
<![CDATA[A discrete choice model for large heterogeneous panels with interactive fixed effects with an application to the determinants of corporate bond issuance]]> What is the effect of funding costs on the conditional probability of issuing a corporate bond? We study this question in a novel dataset covering 5610 issuances by US firms over the period from 1990 to 2014. Identification of this effect is complicated because of unobserved, common shocks such as the global financial crisis. To account for these shocks, we extend the common correlated effects estimator to settings where outcomes are discrete. Both the asymptotic properties and the small sample behavior of this estimator are documented. We find that for non-financial firms, yields are negatively related to bond issuance but that effect is larger in the pre-crisis period.

]]>
http://www.ifs.org.uk/publications/8848 Thu, 12 Jan 2017 00:00:00 +0000
<![CDATA[A bootstrap method for constructing pointwise and uniform confidence bands for conditional quantile functions]]> This paper is concerned with inference about the conditional quantile function in a nonparametric quantile regression model. Any method for constructing a confidence interval or band for this function must deal with the asymptotic bias of nonparametric estimators of the function. In estimation methods such as local polynomial estimation, this is usually done through undersmoothing or explicit bias correction. The latter usually requires oversmoothing. However, there are no satisfactory empirical methods for selecting bandwidths that under- or oversmooth. This paper extends the bootstrap method of Hall and Horowitz (2013) for conditional mean functions to conditional quantile functions. The paper also shows how the bootstrap method can be used to obtain uniform confidence bands. The bootstrap method uses only bandwidths that are selected by standard methods such as cross validation and plug-in. It does not use under- or oversmoothing. The results of Monte Carlo experiments illustrate the numerical performance of the bootstrap method.

]]>
http://www.ifs.org.uk/publications/8847 Thu, 12 Jan 2017 00:00:00 +0000
<![CDATA[Identifying preferences in networks with bounded degree]]> This paper provides a framework for identifying preferences in a large network where links are pairwise stable. Network formation models present difficulties for identification, especially when links can be interdependent: e.g., when indirect connections matter. We show how one can use the observed proportions of various local network structures to learn about the underlying preference parameters. The key assumption for our approach restricts individuals to have bounded degree in equilibrium, implying a finite number of payoff-relevant local structures. Our main result provides necessary conditions for parameters to belong to the identified set. We then develop a quadratic programming algorithm that can be used to construct this set. With further restrictions on preferences, we show that our conditions are also sufficient for pairwise stability and therefore characterize the identified set precisely. Overall, the use of both the economic model along with pairwise stability allows us to obtain effective dimension reduction.

]]>
http://www.ifs.org.uk/publications/8834 Fri, 16 Dec 2016 00:00:00 +0000
<![CDATA[Explaining low employment rates among older women in urban China]]> In China, the employment rate among middle-aged and older urban residents is exceptionally low. For example, 27% of 55-64-year-old urban women were in work in 2013, compared to more than 50% in the UK, Thailand and the Philippines. This paper investigates potential explanations for this low level of employment in urban China. I document the stylized fact that a majority of individuals stop working as soon as they qualify for a public pension, which most often happens at age 50 for women. I also highlight the presence of significant financial and time transfers between generations. I provide descriptive evidence that transfers from children are responsive to parental incomes, and that a mother's labour supply is affected by the expectation of transfers from her children. I then build and calibrate a life-cycle model of labour supply and saving. I find that both the pension system and transfers from children have large effects on female labour supply. Increasing the female pension age from the status quo to 60 would raise the employment rate of 50-59-year-old women by 28 percentage points.

]]>
http://www.ifs.org.uk/publications/8801 Mon, 05 Dec 2016 00:00:00 +0000
<![CDATA[Free childcare and parents’ labour supply: is more better?]]> Despite the introduction of childcare subsidies in many countries, the cost of childcare is still thought to hinder parental employment. Many governments are considering increasing the generosity of their childcare subsidies, but the a priori effect of such a policy is ambiguous and little is known empirically about its likely impact. This paper compares the effects on parents’ labour supply of offering free part-time childcare and of expanding this offer to the whole school day in England using an empirical strategy which, unlike previous studies, exploits both date of birth discontinuities and panel data. We find that the provision of free part-time childcare has little, if any, causal impact on the labour market outcomes of mothers or fathers. Increasing the number of hours of free childcare to cover a full school day, however, leads to significant increases in the labour supply of mothers whose youngest child is eligible, with impacts emerging immediately and increasing over the months following entitlement.

]]>
http://www.ifs.org.uk/publications/8728 Fri, 02 Dec 2016 00:00:00 +0000
<![CDATA[Asymptotic properties of a Nadaraya-Watson type estimator for regression functions of infinite order]]> We consider a class of nonparametric time series regression models in which the regressor takes values in a sequence space and the data are stationary and weakly dependent. We propose an infinite-dimensional Nadaraya-Watson type estimator with a bandwidth sequence that shrinks the effects of long lags. We investigate its asymptotic properties in detail in both static and dynamic regression contexts. First, we show pointwise consistency of the estimator under a set of mild regularity conditions. We establish a CLT for the estimator at a point under stronger conditions, as well as for a feasibly studentized version of the estimator, thereby allowing pointwise inference to be conducted. We establish uniform consistency over a compact set of logarithmically increasing dimension. We specify the explicit rates of convergence in terms of the Lambert W function, and show that the optimal rate that balances the squared bias and variance is of logarithmic order, the precise rate depending on the smoothness of the regression function and the dependence of the data in a non-trivial way.
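A toy version of the estimator may help fix ideas: the kernel weight of each observation is a product over lags, with per-lag bandwidths that grow in the lag index so that long lags contribute little to the distance. The geometric bandwidth rule, the Gaussian kernel, and the truncation at finitely many lags are our simplifications, not the paper's assumptions.

```python
# Toy Nadaraya-Watson regression with a lagged (sequence-valued) regressor,
# truncated at L lags; per-lag bandwidths h_j = h0 * rho^j, rho > 1, shrink the
# effect of long lags on the kernel distance.
import numpy as np

def nw_infinite_order(y, X, x0, h0=0.5, rho=1.5):
    """y: (T,) outcomes; X: (T, L) matrix whose column j holds lag j of the regressor;
       x0: (L,) evaluation point in the (truncated) sequence space."""
    T, L = X.shape
    h = h0 * rho ** np.arange(L)                     # bandwidths grow with the lag index
    z = (X - x0) / h                                 # coordinate-wise scaled distances
    weights = np.exp(-0.5 * np.sum(z ** 2, axis=1))  # product Gaussian kernel
    return np.sum(weights * y) / np.sum(weights)
```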

]]>
http://www.ifs.org.uk/publications/8756 Wed, 23 Nov 2016 00:00:00 +0000
<![CDATA[‘Randomisation bias’ in the medical literature: a review]]> Randomised controlled or clinical trials (RCTs) are generally viewed as the most reliable method to draw causal inference as to the effects of a treatment, as they should guarantee that the individuals being compared differ only in terms of their exposure to the treatment of interest. This ‘gold standard’ result however hinges on the requirement that the randomisation device determines the random allocation of individuals to the treatment without affecting any other element of the causal model. This ‘no randomisation bias’ assumption is generally untestable but if violated would undermine the causal inference emerging from an RCT, both in terms of its internal validity and in terms of its relevance for policy purposes. This paper offers a concise review of how the medical literature identifies and deals with such issues.

]]>
http://www.ifs.org.uk/publications/8746 Mon, 21 Nov 2016 00:00:00 +0000
<![CDATA[What happens when employers are obliged to nudge? Automatic enrolment and pension saving in the UK]]> This paper studies the first nationwide introduction of automatic enrolment, in which employers in the United Kingdom are obliged to enrol employees into a workplace pension scheme, which employees can then choose to leave if they wish. We exploit the phased rollout of automatic enrolment since 2012 to estimate its effect on pension saving. As a result of automatic enrolment, participation in workplace pensions among eligible private sector workers is estimated to have increased by 37 percentage points, and workplace pension membership reached 88% amongst those affected by April 2015. Automatic enrolment significantly increased the average pension contribution rate, in part because some newly-enrolled employees received an employer contribution well above the minimum mandated by the government. Furthermore, many employees who did not have to be automatically enrolled were nonetheless brought into a workplace pension scheme as a result of the policy. We find no evidence of employers reducing employer contributions for newly-hired employees or existing members of workplace pensions.

]]>
http://www.ifs.org.uk/publications/8723 Thu, 17 Nov 2016 00:00:00 +0000
<![CDATA[Choice in the presence of experts: the role of general practitioners in patients' hospital choice]]> This paper considers the micro-econometric analysis of patients' hospital choice for elective medical procedures when their choice set is pre-selected by a general practitioner (GP). It proposes a two-stage choice model that encompasses both patient- and GP-level optimization, and it discusses identification. The empirical analysis demonstrates the biases and inconsistencies that arise when strategic pre-selection is not properly taken into account. We find that patients defer to GPs when assessing hospital quality and focus on tangible attributes, such as hospital amenities; and that GPs, in turn, as patients' agents present choice options based on quality, but as agents of health authorities also consider their financial implications.

]]>
http://www.ifs.org.uk/publications/8727 Mon, 14 Nov 2016 00:00:00 +0000
<![CDATA[The Right to Buy public housing in Britain: a welfare analysis]]> We investigate the impact on social welfare of the United Kingdom (UK) policy introduced in 1980 by which public housing tenants (council housing in UK parlance) had the right to purchase their houses at heavily discounted prices. This was known as the Right to Buy (RTB) policy. Although this internationally unique policy was the largest source of public privatization revenue in the UK and raised home ownership as a share of housing tenure by around 15%, the policy has been little analyzed by economists. We investigate the equilibrium housing policy of the public authority in terms of quality and quantity of publicly provided housing both in the absence and presence of a RTB policy. We find that RTB can improve the aggregate welfare of low-income households only if the council housing quality is sufficiently low that middle-wealth households have no incentive to exercise RTB. We also explore the welfare effects of various adjustments to the policy, in particular (i) reducing discounts on RTB sales; (ii) loosening restrictions on resale; (iii) returning the proceeds from RTB sales to local authorities to construct new public properties; and (iv) replacing RTB with rent subsidies in cash.

]]>
http://www.ifs.org.uk/publications/8726 Mon, 14 Nov 2016 00:00:00 +0000
<![CDATA[Estimation of a multiplicative covariance structure in the large dimensional case]]> We propose a Kronecker product structure for large covariance or correlation matrices. One feature of this model is that it scales logarithmically with dimension, in the sense that the number of free parameters increases logarithmically with the dimension of the matrix. We propose an estimation method for the parameters based on a log-linear property of the structure, and also a quasi-maximum likelihood estimation (QMLE) method. We establish the rate of convergence of the estimated parameters when the size of the matrix diverges. We also establish a central limit theorem (CLT) for our method. We derive the asymptotic distributions of the estimators of the parameters of the spectral distribution of the Kronecker product correlation matrix, of the extreme logarithmic eigenvalues of this matrix, and of the variance of the minimum variance portfolio formed using this matrix. We also develop tools of inference, including a test for over-identification. We apply our methods to portfolio choice for S&P500 daily returns and compare with sample covariance-based methods and with the recent Fan, Liao, and Mincheva (2013) method.
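A small numerical illustration of the logarithmic scaling claimed above: a correlation matrix built as the Kronecker product of k 2x2 factors has dimension 2^k but only k free parameters. The construction below is generic and is not the paper's estimator.

```python
# Kronecker-product correlation structure: a (2^k x 2^k) matrix built from k small
# factors, so the number of free parameters grows with k = log2(dimension).
import numpy as np
from functools import reduce

def kronecker_correlation(rhos):
    """Each rho parameterizes one 2x2 correlation factor; len(rhos) = log2(dimension)."""
    factors = [np.array([[1.0, r], [r, 1.0]]) for r in rhos]
    return reduce(np.kron, factors)

rhos = [0.2, 0.5, -0.1, 0.3]
R = kronecker_correlation(rhos)
print(R.shape, "free parameters:", len(rhos))   # (16, 16) built from 4 parameters
```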

]]>
http://www.ifs.org.uk/publications/8724 Wed, 09 Nov 2016 00:00:00 +0000
<![CDATA[Nonlinear panel data methods for dynamic heterogeneous agent models]]> Recent developments in nonlinear panel data analysis allow the identification and estimation of general dynamic systems. In this review we describe some results and techniques for nonparametric identification and flexible estimation in the presence of time-invariant and time-varying latent variables. This opens the possibility of estimating nonlinear reduced forms in a large class of structural dynamic models with heterogeneous agents. We show how such reduced forms may be used to document policy-relevant derivative effects, and to improve the understanding and facilitate the implementation of structural models.

]]>
http://www.ifs.org.uk/publications/8712 Tue, 01 Nov 2016 00:00:00 +0000
<![CDATA[Spillovers of community based health interventions on consumption smoothing]]> Community-based interventions, particularly group-based ones, are considered to be a cost-effective way of delivering interventions in low-income settings. However, design features of these programs could also influence dimensions of household and community behaviour beyond those targeted by the intervention. This paper studies spillover effects of a participatory community health intervention in rural Malawi, implemented through a cluster randomised control trial, on an outcome not directly targeted by the intervention: household consumption smoothing after crop losses. We find that while crop losses reduce consumption growth in the absence of the intervention, households in treated areas are able to compensate for this loss and perfectly insure their consumption. Asset decumulation also falls in treated areas. We provide suggestive evidence that these effects are driven by increased social interactions, which could have alleviated contracting frictions; and rule out that they are driven by improved health or reductions in the incidence of crop losses.

]]>
http://www.ifs.org.uk/publications/8700 Tue, 18 Oct 2016 00:00:00 +0000
<![CDATA[On cross-validated Lasso]]> In this paper, we derive a rate of convergence of the Lasso estimator when the penalty parameter λ is chosen using K-fold cross-validation. In particular, we show that in the model with Gaussian noise and under fairly general assumptions on the candidate set of values of λ, the prediction norm of the estimation error of the cross-validated Lasso estimator is, with high probability, bounded from above up to a constant by (s log p / n)^{1/2} log^{7/8} n as long as p log n / n = o(1) and some other mild regularity conditions are satisfied, where n is the sample size, p is the number of covariates, and s is the number of non-zero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence up to the logarithmic factor log^{7/8} n. In addition, we derive a sparsity bound for the cross-validated Lasso estimator; in particular, we show that under the same conditions as above, the number of non-zero coefficients of the estimator is, with high probability, bounded from above up to a constant by s log^5 n. Finally, we show that our proof technique generates non-trivial bounds on the prediction norm of the estimation error of the cross-validated Lasso estimator even if p is much larger than n and the assumption of Gaussian noise fails; in particular, the prediction norm of the estimation error is, with high probability, bounded from above up to a constant by (s log^2(pn) / n)^{1/4} under mild regularity conditions.
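The estimator analysed above can be reproduced in a few lines, for example with scikit-learn's LassoCV; note that scikit-learn scales the penalty slightly differently from the theoretical formulation, so this is a sketch of the procedure rather than of the exact constants in the bounds.

```python
# Lasso with the penalty level chosen by K-fold cross-validation,
# the estimator whose prediction-norm rate is analysed above.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5                        # sample size, covariates, non-zero coefficients
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 1.0
y = X @ beta + rng.standard_normal(n)        # Gaussian noise, as in the main result

fit = LassoCV(cv=5, n_alphas=50).fit(X, y)   # 5-fold CV over a 50-point penalty grid
print("chosen penalty:", fit.alpha_, "non-zero coefficients:", int(np.sum(fit.coef_ != 0)))
```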

]]>
http://www.ifs.org.uk/publications/8509 Tue, 27 Sep 2016 00:00:00 +0000
<![CDATA[An economic theory of statistical testing]]> This paper models the use of statistical hypothesis testing in regulatory approval. A privately informed agent proposes an innovation. Its approval is beneficial to the proponent, but potentially detrimental to the regulator. The proponent can conduct a costly clinical trial to persuade the regulator. I show that the regulator can screen out all ex-ante undesirable proponents by committing to use a simple statistical test. Its level is the ratio of the trial cost to the proponent's benefit from approval. In application to new drug approval, this level is around 15% for an average Phase III clinical trial.

]]>
http://www.ifs.org.uk/publications/8512 Tue, 27 Sep 2016 00:00:00 +0000
<![CDATA[Nonparametric instrumental variable estimation under monotonicity]]> The ill-posedness of the inverse problem of recovering a regression function in a nonparametric instrumental variable (NPIV) model leads to estimators that may suffer from poor statistical performance. In this paper, we explore the possibility of imposing shape restrictions to improve the performance of the NPIV estimators. We assume that the regression function is monotone and consider sieve estimators that enforce the monotonicity constraint. We define a restricted measure of ill-posedness that is relevant for the constrained estimators and show that under the monotone IV assumption and certain other conditions, our measure of ill-posedness is bounded uniformly over the dimension of the sieve space, in stark contrast with a well-known result that the unrestricted sieve measure of ill-posedness that is relevant for the unconstrained estimators grows to infinity with the dimension of the sieve space. Based on this result, we derive a novel non-asymptotic error bound for the constrained estimators. The bound gives a set of data-generating processes where the monotonicity constraint has a particularly strong regularization effect and considerably improves the performance of the estimators. The bound shows that the regularization effect can be strong even in large samples and for steep regression functions if the NPIV model is severely ill-posed, a finding that is confirmed by our simulation study. We apply the constrained estimator to the problem of estimating gasoline demand from U.S. data.

]]>
http://www.ifs.org.uk/publications/8510 Tue, 27 Sep 2016 00:00:00 +0000
<![CDATA[Double machine learning for treatment and causal parameters]]> Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimators of such causal parameters obtained via naively plugging ML estimators into estimating equations for such parameters can behave very poorly. For example, the resulting estimators may formally have inferior rates of convergence with respect to the sample size n caused by regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an efficient score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The efficient score may then be used to build an efficient estimator of the target parameter which typically will converge at the fastest possible 1/√n rate and be approximately unbiased and normal, allowing simple construction of valid confidence intervals for parameters of interest. The resulting method thus could be called a "double ML" method because it relies on estimating primary and auxiliary predictive models. Such double ML estimators achieve the fastest rates of convergence and exhibit robust good behavior with respect to a broader class of probability distributions than naive "single" ML estimators. In order to avoid overfitting, following [3], our construction also makes use of K-fold sample splitting, which we call cross-fitting. The use of sample splitting allows us to use a very broad set of ML predictive methods in solving the auxiliary and main prediction problems, such as random forests, lasso, ridge, deep neural nets, boosted trees, as well as various hybrids and aggregates of these methods (e.g. a hybrid of a random forest and lasso). We illustrate the application of the general theory through application to the leading cases of estimation and inference on the main parameter in a partially linear regression model and estimation and inference on average treatment effects and average treatment effects on the treated under conditional random assignment of the treatment. These applications cover randomized control trials as a special case. We then use the methods in an empirical application which estimates the effect of 401(k) eligibility on accumulated financial assets.
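The following is a compact sketch of the double ML recipe for the leading partially linear case Y = Dθ0 + g0(X) + U with 2-fold cross-fitting, using random forests as the ML learners; the learner choice, the number of folds and all names are illustrative assumptions rather than the paper's preferred configuration.

```python
# Double/debiased ML for the partially linear model Y = D*theta0 + g0(X) + U:
# fit E[Y|X] and E[D|X] on one fold, residualize on the other (cross-fitting),
# then regress the Y-residuals on the D-residuals to estimate theta0.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def double_ml_plr(y, d, X, n_folds=2, seed=0):
    n = len(y)
    res_y, res_d = np.zeros(n), np.zeros(n)
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        ml_y = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        ml_d = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        res_y[test] = y[test] - ml_y.predict(X[test])   # Y - E[Y|X] on the held-out fold
        res_d[test] = d[test] - ml_d.predict(X[test])   # D - E[D|X] on the held-out fold
    theta = np.sum(res_d * res_y) / np.sum(res_d ** 2)  # orthogonalized final regression
    u = res_y - theta * res_d
    se = np.sqrt(np.sum(res_d ** 2 * u ** 2)) / np.sum(res_d ** 2)
    return theta, se
```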

]]>
http://www.ifs.org.uk/publications/8511 Tue, 27 Sep 2016 00:00:00 +0000
<![CDATA[Life-cycle consumption patterns at older ages in the US and the UK: can medical expenditures explain the difference?]]> In this paper we document significantly steeper declines in nondurable expenditures in the UK compared to the US, in spite of income paths being similar. We explore several possible causes, including different employment paths, housing ownership and expenses, levels and paths of health status, number of household members, and out-of-pocket medical expenditures. Among all the potential explanations considered, we find that those to do with healthcare—differences in levels and age paths in medical expenses—can fully account for the steeper declines in nondurable consumption in the UK compared to the US.

]]>
http://www.ifs.org.uk/publications/8466 Fri, 09 Sep 2016 00:00:00 +0000
<![CDATA[Mobility and the lifetime distributional impact of tax and transfer reforms]]> The distributional impact of proposed reforms plays a central role in public debates around tax and transfer policy. We show that accounting for realistic patterns of mobility in employment, earnings and household circumstances over the life-cycle greatly affects our assessment of the distributional effects of tax and transfer reforms. We focus on three reforms modelled in the UK context: (i) changes to out-of-work versus in-work benefits, (ii) adjustments to income tax rates, and (iii) reforms to indirect taxation. In all three cases, the long-run distributional impact differs from that implied by a standard cross-section analysis in important ways.

]]>
http://www.ifs.org.uk/publications/8468 Fri, 09 Sep 2016 00:00:00 +0000
<![CDATA[Conditional quantile processes based on series or many regressors]]> Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop a nonparametric QR-series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the QR-series coefficient process, namely we obtain uniform strong approximations to the QR-series coefficient process by conditionally pivotal and Gaussian processes. Based on these two strong approximations, or couplings, we develop four resampling methods (pivotal, gradient bootstrap, Gaussian, and weighted bootstrap) that can be used for inference on the entire QR-series coefficient function. We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence and show how to use the four resampling methods mentioned above for inference on the functionals. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and the covariate value, and covering the pointwise case as a by-product. We demonstrate the practical utility of these results with an empirical example, where we estimate the price elasticity function and test the Slutsky condition of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.
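Stripped of the inference machinery, the QR-series idea amounts to fitting a separate quantile regression on the same series terms at each quantile index, which yields the function-valued coefficients β(u). The polynomial sieve and the use of statsmodels' QuantReg below are illustrative choices, not the paper's implementation.

```python
# Quantile regression on series terms: Q_{Y|X}(u|x) is approximated by Z(x)'beta(u),
# with a separate quantile regression at each quantile index u.
import numpy as np
from statsmodels.regression.quantile_regression import QuantReg

def qr_series(y, x, quantiles=(0.1, 0.25, 0.5, 0.75, 0.9), n_terms=4):
    Z = np.column_stack([x ** j for j in range(n_terms)])   # simple polynomial sieve
    coefs = {u: QuantReg(y, Z).fit(q=u).params for u in quantiles}
    return Z, coefs    # the series terms and the coefficient vector beta(u) for each u
```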

]]>
http://www.ifs.org.uk/publications/8462 Tue, 30 Aug 2016 00:00:00 +0000
<![CDATA[Testing many moment inequalities]]> This paper considers the problem of testing many moment inequalities where the number of moment inequalities, denoted by p, is possibly much larger than the sample size n. There is a variety of economic applications where the problem of testing many moment inequalities appears; a notable example is the market structure model of Ciliberto and Tamer (2009), where p = 2^(m+1) with m being the number of firms. We consider the test statistic given by the maximum of p Studentized (or t-type) statistics, and analyze various ways to compute critical values for the test statistic. Specifically, we consider critical values based upon (i) the union bound combined with a moderate deviation inequality for self-normalized sums, (ii) the multiplier and empirical bootstraps, and (iii) two-step and three-step variants of (i) and (ii) that incorporate selection of uninformative inequalities that are far from being binding and a novel selection of weakly informative inequalities that are potentially binding but do not provide first-order information. We prove the validity of these methods, showing that under mild conditions they lead to tests with error in size decreasing polynomially in n while allowing p to be much larger than n; indeed, p can be of order exp(n^c) for some c > 0. Importantly, all these results hold without any restriction on the correlation structure between the p Studentized statistics, and also hold uniformly with respect to suitably large classes of underlying distributions. Moreover, when p grows with n, we show that all of our tests are (minimax) optimal in the sense that they are uniformly consistent against alternatives whose "distance" from the null is larger than the threshold (2 log(p)/n)^{1/2}, while any test can only have trivial power in the worst case when the distance is smaller than the threshold. Finally, we show the validity of a test based on the block multiplier bootstrap in the case of dependent data under some general mixing conditions.
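A schematic implementation of the basic ingredients, the max-of-studentized-statistics test with a Gaussian-multiplier bootstrap critical value, is sketched below; the moderate-deviation, empirical-bootstrap and inequality-selection refinements discussed above are omitted, so this is an illustration rather than the paper's procedure.

```python
# Test of H0: E[X_j] <= 0 for j = 1,...,p via the maximum of p studentized statistics,
# with a one-step Gaussian-multiplier bootstrap critical value.
import numpy as np

def many_moments_test(X, alpha=0.05, n_boot=2000, seed=0):
    """X: (n, p) matrix of moment functions evaluated at the data."""
    n, p = X.shape
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    stat = np.sqrt(n) * np.max(mu / sd)              # max of studentized sample means
    rng = np.random.default_rng(seed)
    centered = (X - mu) / sd
    e = rng.standard_normal((n_boot, n))             # Gaussian multipliers
    W = np.max(e @ centered, axis=1) / np.sqrt(n)    # bootstrap maxima
    crit = np.quantile(W, 1 - alpha)
    return stat, crit, stat > crit                   # reject H0 if the statistic exceeds the critical value
```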

]]>
http://www.ifs.org.uk/publications/8458 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Anti-concentration and honest, adaptive confidence bands]]> Modern construction of uniform confidence bands for nonparametric densities (and other functions) often relies on the classical Smirnov-Bickel-Rosenblatt (SBR) condition; see, for example, Giné and Nickl (2010). This condition requires the existence of a limit distribution of an extreme value type for the supremum of a studentized empirical process (equivalently, for the supremum of a Gaussian process with the same covariance function as that of the studentized empirical process). The principal contribution of this paper is to remove the need for this classical condition. We show that a considerably weaker sufficient condition is derived from an anti-concentration property of the supremum of the approximating Gaussian process, and we derive an inequality leading to such a property for separable Gaussian processes. We refer to the new condition as a generalized SBR condition. Our new result shows that the supremum does not concentrate too fast around any value.

We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Finally, of independent interest is our introduction of a new, practical version of Lepski’s method, which computes the optimal, non-conservative resolution levels via a Gaussian multiplier bootstrap method.

]]>
http://www.ifs.org.uk/publications/8459 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Characterizations of identified sets delivered by structural econometric models]]> This paper develops characterizations of identified sets of structures and structural features for complete and incomplete models involving continuous or discrete variables. Multiple values of unobserved variables can be associated with particular combinations of observed variables. This can arise when there are multiple sources of heterogeneity, censored or discrete endogenous variables, or inequality restrictions on functions of observed and unobserved variables. The models generalize the class of incomplete instrumental variable (IV) models in which unobserved variables are single-valued functions of observed variables. Thus the models are referred to as Generalized IV (GIV) models, but there are important cases in which instrumental variable restrictions play no significant role. Building on a definition of observational equivalence for incomplete models, the development uses results from random set theory which guarantee that the characterizations deliver sharp bounds, thereby dispensing with the need for case-by-case proofs of sharpness. The use of random sets defined on the space of unobserved variables allows identification analysis under mean and quantile independence restrictions on the distributions of unobserved variables conditional on exogenous variables as well as under a full independence restriction. The results are used to develop sharp bounds on the distribution of valuations in an incomplete model of English auctions, improving on the pointwise bounds available till now. Application of many of the results of the paper requires no familiarity with random set theory.

]]>
http://www.ifs.org.uk/publications/8460 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Central limit theorems and bootstrap in high dimensions]]> In this paper, we derive central limit and bootstrap theorems for probabilities that centered high-dimensional vector sums hit rectangles and sparsely convex sets. Specifically, we derive Gaussian and bootstrap approximations for the probability that a root-n rescaled sample average of the X_i lies in A, where X_1, ..., X_n are independent random vectors in R^p and A is a rectangle or, more generally, a sparsely convex set, and we show that the approximation error converges to zero even if p = p_n → ∞ and p >> n; in particular, p can be as large as O(e^(Cn^c)) for some constants c, C > 0. The result holds uniformly over all rectangles, or more generally, sparsely convex sets, and does not require any restrictions on the correlation among components of X_i. Sparsely convex sets are sets that can be represented as intersections of many convex sets whose indicator functions depend nontrivially only on a small subset of their arguments, with rectangles being a special case.

]]>
http://www.ifs.org.uk/publications/8455 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Comparison and anti-concentration bounds for maxima of Gaussian random vectors]]> Slepian and Sudakov-Fernique type inequalities, which compare expectations of maxima of Gaussian random vectors under certain restrictions on the covariance matrices, play an important role in probability theory, especially in empirical process and extreme value theories. Here we give explicit comparisons of expectations of smooth functions and distribution functions of maxima of Gaussian random vectors without any restriction on the covariance matrices. We also establish an anti-concentration inequality for the maximum of a Gaussian random vector, which derives a useful upper bound on the Lévy concentration function for the Gaussian maximum. The bound is dimension-free and applies to vectors with arbitrary covariance matrices. This anti-concentration inequality plays a crucial role in establishing bounds on the Kolmogorov distance between maxima of Gaussian random vectors. These results have immediate applications in mathematical statistics. As an example of application, we establish a conditional multiplier central limit theorem for maxima of sums of independent random vectors where the dimension of the vectors is possibly much larger than the sample size.

]]>
http://www.ifs.org.uk/publications/8456 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Gaussian approximation of suprema of empirical processes]]> This paper develops a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the sup-norm. We prove an abstract approximation theorem applicable to a wide variety of statistical problems, such as construction of uniform confidence bands for functions. Notably, the bound in the main approximation theorem is nonasymptotic and the theorem does not require uniform boundedness of the class of functions. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein's method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local and series empirical processes arising in nonparametric estimation via kernel and series methods, where the classes of functions change with the sample size and are non-Donsker. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.

]]>
http://www.ifs.org.uk/publications/8457 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Partial identification in applied research: benefits and challenges]]> Advances in the study of partial identification allow applied researchers to learn about parameters of interest without making assumptions needed to guarantee point identification. We discuss the roles that assumptions and data play in partial identification analysis, with the goal of providing information to applied researchers that can help them employ these methods in practice. To this end, we present a sample of econometric models that have been used in a variety of recent applications where parameters of interest are partially identified, highlighting common features and themes across these papers. In addition, in order to help illustrate the combined roles of data and assumptions, we present numerical illustrations for a particular application, the joint determination of wages and labor supply. Finally, we discuss the benefits and challenges of using partially identifying models in empirical work and point to possible avenues of future research.

]]>
http://www.ifs.org.uk/publications/8461 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[New Joints: Private providers and rising demand in the English National Health Service]]> Reforms to public services have extended consumer choice by allowing for the entry of private providers. The aim is to generate competitive pressure to improve quality when consumers choose between providers. However, for many services new entrants could also affect whether a consumer demands the service at all. We explore this issue by considering how demand for elective surgery responds following the entry of private providers into the market for publicly funded health care in England. For elective hip replacements, we find that demand shifts account for at least 7% of public procedures conducted by private hospitals. These results are robust to instrumenting for location using the presence of existing healthcare facilities. Exploiting rarely used clinical audit data, we show that these additional procedures are not substitutions from privately funded procedures, and represent new surgeries, at least within a given year. The increase in volumes resulting from a demand shift improves consumer welfare, but imposes fiscal costs, and does not contribute to the original aim of the reforms to stimulate competition.

This is an updated version of W15/22 New joints: private providers and rising demand in the English National Health Service.

]]>
http://www.ifs.org.uk/publications/8451 Fri, 26 Aug 2016 00:00:00 +0000
<![CDATA[Valid post-selection and post-regularization inference: An elementary, general approach]]> Here we present an expository, general analysis of valid post-selection or post-regularization inference about a low-dimensional target parameter in the presence of a very high-dimensional nuisance parameter which is estimated using selection or regularization methods. Our analysis provides a set of high-level conditions under which inference for the low-dimensional parameter based on testing or point estimation methods will be regular despite selection or regularization biases occurring in estimation of the high-dimensional nuisance parameter. The results may be applied to establish uniform validity of post-selection or post-regularization inference procedures for low-dimensional target parameters over large classes of models. The high-level conditions allow one to clearly see the types of structure needed for achieving valid post-regularization inference and encompass many existing results. A key element of the structure we employ and discuss in detail is the use of orthogonal or "immunized" estimating equations that are locally insensitive to small mistakes in estimation of the high-dimensional nuisance parameter. As an illustration, we use the high-level conditions to provide readily verifiable sufficient conditions for a class of affine-quadratic models that include the usual linear model and linear instrumental variables model as special cases. As a further application and illustration, we use these results to provide an analysis of post-selection inference in a linear instrumental variables model with many regressors and many instruments. We conclude with a review of other developments in post-selection inference and note that many of the developments can be viewed as special cases of the general encompassing framework of orthogonal estimating equations provided in this paper.
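The orthogonality condition at the heart of this framework can be stated schematically as follows, with θ the low-dimensional target, η the high-dimensional nuisance parameter and ψ the estimating equation (generic notation, a sketch rather than the paper's exact conditions):

```latex
E\big[\psi(W;\,\theta_0,\eta_0)\big] = 0,
\qquad
\frac{\partial}{\partial r}\, E\big[\psi\big(W;\,\theta_0,\;\eta_0 + r\,(\eta - \eta_0)\big)\big]\Big|_{r=0} = 0,
```

so that small errors in estimating η have no first-order effect on the moment condition identifying θ0.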

]]>
http://www.ifs.org.uk/publications/8448 Thu, 25 Aug 2016 00:00:00 +0000
<![CDATA[hdm: High-Dimensional Metrics]]> In this article the package High-dimensional Metrics (hdm) is introduced. It is a collection of statistical methods for estimation and quantification of uncertainty in high-dimensional approximately sparse models. It focuses on providing confidence intervals and significance testing for (possibly many) low-dimensional subcomponents of the high-dimensional parameter vector. Efficient estimators and uniformly valid confidence intervals for regression coefficients on target variables (e.g., treatment or policy variables) in a high-dimensional approximately sparse regression model, for the average treatment effect (ATE) and the average treatment effect for the treated (ATET), as well as for extensions of these parameters to the endogenous setting, are provided. Theory-grounded, data-driven methods for selecting the penalization parameter in Lasso regressions under heteroscedastic and non-Gaussian errors are implemented. Moreover, joint/simultaneous confidence intervals for regression coefficients of a high-dimensional sparse regression are implemented. Data sets which have been used in the literature and might be useful for classroom demonstration and for testing new estimators are included.

]]>
http://www.ifs.org.uk/publications/8449 Thu, 25 Aug 2016 00:00:00 +0000
<![CDATA[Empirical and multiplier bootstraps for suprema of empirical processes of increasing complexity, and related Gaussian couplings]]> We derive strong approximations to the supremum of the non-centered empirical process indexed by a possibly unbounded VC-type class of functions by the suprema of the Gaussian and bootstrap processes. The bounds of these approximations are non-asymptotic, which allows us to work with classes of functions whose complexity increases with the sample size. The construction of couplings is not of the Hungarian type and is instead based on the Slepian-Stein methods and Gaussian comparison inequalities. The increasing complexity of classes of functions and non-centrality of the processes make the results useful for applications in modern nonparametric statistics (Giné and Nickl [14]), in particular allowing us to study the power properties of nonparametric tests using Gaussian and bootstrap approximations.

]]>
http://www.ifs.org.uk/publications/8450 Thu, 25 Aug 2016 00:00:00 +0000
<![CDATA[Generic inference on quantile and quantile effect functions for discrete outcomes]]> This paper provides a method to construct simultaneous confidence bands for quantile and quantile effect functions for possibly discrete or mixed discrete-continuous random variables. The construction is generic and does not depend on the nature of the underlying problem. It works in conjunction with parametric, semiparametric, and nonparametric modeling strategies and does not depend on the sampling schemes. It is based upon projection of simultaneous confidence bands for distribution functions.

We apply our method to analyze the distributional impact of insurance coverage on health care utilization and to provide a distributional decomposition of the racial test score gap. Our analysis generates new interesting findings, and complements previous analyses that focused on mean effects only. In both applications, the outcomes of interest are discrete, rendering standard inference methods invalid for obtaining uniform confidence bands for quantile and quantile effect functions.
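
A minimal sketch of the projection step described above, in generic notation (ours): given a simultaneous level 1−α band [L(y), U(y)] for the distribution function F, a band for the quantile function follows by inverting the envelope,

\[
P\big( L(y) \le F(y) \le U(y)\ \ \forall y \big) \ge 1-\alpha
\;\;\Longrightarrow\;\;
P\big( U^{-1}(\tau) \le F^{-1}(\tau) \le L^{-1}(\tau)\ \ \forall \tau \big) \ge 1-\alpha,
\]

with generalized (left-continuous) inverses, which is what keeps the construction valid for discrete and mixed outcomes.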

]]>
http://www.ifs.org.uk/publications/8447 Thu, 25 Aug 2016 00:00:00 +0000
<![CDATA[A quantile correlated random coefficients panel data model]]> We propose a generalization of the linear quantile regression model to accommodate possibilities afforded by panel data. Specifically, we extend the correlated random coefficients representation of linear quantile regression (e.g., Koenker, 2005; Section 2.6). We show that panel data allows the econometrician to (i) introduce dependence between the regressors and the random coefficients and (ii) weaken the assumption of comonotonicity across them (i.e., to enrich the structure of allowable dependence between different coefficients). We adopt a “fixed effects” approach, leaving any dependence between the regressors and the random coefficients unmodelled. We motivate different notions of quantile partial effects in our model and study their identification. For the case of discretely-valued covariates we present analog estimators and characterize their large sample properties. When the number of time periods (T) exceeds the number of random coefficients (P), identification is regular, and our estimates are √N-consistent. When T = P, our identification results make special use of the subpopulation of stayers - units whose regressor values change little over time - in a way which builds on the approach of Graham and Powell (2012). In this just-identified case we study asymptotic sequences which allow the frequency of stayers in the population to shrink with the sample size. One purpose of these “discrete bandwidth asymptotics” is to approximate settings where covariates are continuously-valued and, as such, there is only an infinitesimal fraction of exact stayers, while keeping the convenience of an analysis based on discrete covariates. When the mass of stayers shrinks with N, identification is irregular and our estimates converge at a slower than √N rate, but continue to have limiting normal distributions. We apply our methods to study the effects of collective bargaining coverage on earnings using the National Longitudinal Survey of Youth 1979 (NLSY79). Consistent with prior work (e.g., Chamberlain, 1982; Vella and Verbeek, 1998), we find that using panel data to control for unobserved worker heterogeneity results in sharply lower estimates of union wage premia. We estimate a median union wage premium of about 9 percent but, in a more novel finding, with substantial heterogeneity across workers. The 0.1 quantile of union effects is insignificantly different from zero, whereas the 0.9 quantile effect is over 30 percent. Our empirical analysis further suggests that, on net, unions have an equalizing effect on the distribution of wages.
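
For context, the correlated random coefficients representation of linear quantile regression that is extended above can be written, in generic notation (ours, loosely following Koenker, 2005, Section 2.6), as

\[
Y_{it} = X_{it}'\beta(U_{it}), \qquad U_{it} \sim \mathcal{U}(0,1), \quad \tau \mapsto x'\beta(\tau) \ \text{increasing},
\]

with the cross-sectional model additionally requiring U independent of X; the panel extension relaxes both that independence (allowing "fixed effects" style dependence) and the requirement that all coefficients move with the single index U (comonotonicity).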

Supplement for CWP34/16

]]>
http://www.ifs.org.uk/publications/8446 Thu, 25 Aug 2016 00:00:00 +0000
<![CDATA[The effect of gender-targeted conditional cash transfers on household expenditures: Evidence from a randomized experiment]]> This paper studies the differential effect of targeting cash transfers to men or women on the structure of household expenditures on non-durables. We study a policy intervention in the Republic of Macedonia, offering cash transfers to poor households, conditional on having their children attending secondary school. The recipient of the transfer is randomized across municipalities to be either the household head or the mother. Using data collected to evaluate the conditional cash transfer program, we show that the gender of the recipient has an effect on the structure of expenditure shares. Targeting transfers to women increases the expenditure share on food by about 4 to 5%. To study the allocation of expenditures within the food basket, we estimate a demand system for food and we find that targeting payments to mothers induces, for different food categories, not only a significant intercept shift, but also a change in the slope of the Engel curve.

]]>
http://www.ifs.org.uk/publications/8434 Fri, 19 Aug 2016 00:00:00 +0000
<![CDATA[Approximate permutation tests and induced order statistics in the regression discontinuity design]]> In the regression discontinuity design, it is common practice to asses the credibility of the design by testing whether the means of baseline covariates do not change at the cuto ff (or threshold) of the running variable. This practice is partly motivated by the stronger implication derived by Lee (2008), who showed that under certain conditions the distribution of baseline covariates in the RDD must be continuous at the cuto ff. We propose a permutation test based on the so-called induced ordered statistics for the null hypothesis of continuity of the distribution of baseline covariates at the cutoff ; and introduce a novel asymptotic framework to analyze its properties. The asymptotic framework is intended to approximate a small sample phenomenon: even though the total number n of observations may be large, the number of eff ective observations local to the cuto ff is often small. Thus, while traditional asymptotics in RDD require a growing number of observations local to the cuto ff as n → ∞ , our framework keeps the number q of observations local to the cutoff fixed as n → ∞. The new test is easy to implement, asymptotically valid under weak conditions, exhibits finite sample validity under stronger conditions than those needed for its asymptotic validity, and has favorable power properties relative to tests based on means. In a simulation study, we fi nd that the new test controls size remarkably well across designs. We then use our test to evaluate the validity of the design in Lee (2008), a well-known application of the RDD to study incumbency advantage.
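
The following is a minimal sketch (ours, not the authors' code) of the kind of permutation test described above: take the q observations closest to the cutoff on each side, compare the empirical distributions of a baseline covariate in the two groups with a Cramér-von Mises-type statistic, and recompute the statistic over random relabellings of the two sides. The function names and the particular choice of statistic are illustrative assumptions, not the paper's exact construction.

    import numpy as np

    def cvm_statistic(a, b):
        """Cramer-von Mises-type distance between the empirical CDFs of a and b."""
        pooled = np.sort(np.concatenate([a, b]))
        Fa = np.searchsorted(np.sort(a), pooled, side="right") / len(a)
        Fb = np.searchsorted(np.sort(b), pooled, side="right") / len(b)
        return np.mean((Fa - Fb) ** 2)

    def rdd_covariate_permutation_test(running, covariate, cutoff, q=25,
                                       n_perm=999, seed=0):
        """Permutation p-value for continuity of a baseline covariate at the cutoff,
        using the q observations closest to the cutoff on each side."""
        rng = np.random.default_rng(seed)
        below = covariate[running < cutoff]
        above = covariate[running >= cutoff]
        dist_below = np.abs(running[running < cutoff] - cutoff)
        dist_above = np.abs(running[running >= cutoff] - cutoff)
        left = below[np.argsort(dist_below)[:q]]
        right = above[np.argsort(dist_above)[:q]]
        observed = cvm_statistic(left, right)
        pooled = np.concatenate([left, right])
        perm_stats = np.empty(n_perm)
        for b in range(n_perm):
            perm = rng.permutation(pooled)
            perm_stats[b] = cvm_statistic(perm[:q], perm[q:])
        # p-value includes the observed statistic in the reference set
        return (1 + np.sum(perm_stats >= observed)) / (1 + n_perm)

In practice one would apply this to each baseline covariate and, as the abstract suggests, examine sensitivity to the fixed number q of local observations.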

]]>
http://www.ifs.org.uk/publications/8432 Fri, 19 Aug 2016 00:00:00 +0000
<![CDATA[Money or fun? Why students want to pursue further education]]> We study students’ motives for educational attainment in a unique survey of 885 secondary school students in the UK. As expected, students who perceive the monetary returns to education to be higher are more likely to intend to continue in full-time education. However, the main driver is the perceived consumption value, which alone explains around half of the variation of the intention to pursue higher education. Moreover, the perceived consumption value can account for a substantial part of both the socio-economic gap and the gender gap in intentions to continue in full-time education.

]]>
http://www.ifs.org.uk/publications/8408 Mon, 08 Aug 2016 00:00:00 +0000
<![CDATA[Fixed-effect regressions on network data]]> This paper studies inference on fixed effects in a linear regression model estimated from network data. We derive bounds on the variance of the fixed-effect estimator that uncover the importance of the smallest non-zero eigenvalue of the (normalized) Laplacian of the network and of the degree structure of the network. The eigenvalue is a measure of connectivity, with smaller values indicating less-connected networks. These bounds yield conditions for consistent estimation and convergence rates, and allow one to evaluate the accuracy of first-order approximations to the variance of the fixed-effect estimator.
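
The connectivity measure referred to above can be computed directly from an adjacency matrix; the sketch below (ours, using only numpy) forms the symmetric normalized Laplacian and returns its smallest non-zero eigenvalue.

    import numpy as np

    def smallest_nonzero_laplacian_eigenvalue(adjacency, tol=1e-10):
        """Smallest non-zero eigenvalue of the symmetric normalized Laplacian
        L = I - D^{-1/2} A D^{-1/2} of an undirected network."""
        A = np.asarray(adjacency, dtype=float)
        degrees = A.sum(axis=1)
        d_inv_sqrt = np.zeros_like(degrees)
        d_inv_sqrt[degrees > 0] = degrees[degrees > 0] ** -0.5
        L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
        eigenvalues = np.linalg.eigvalsh(L)   # ascending order
        nonzero = eigenvalues[eigenvalues > tol]
        return nonzero[0] if nonzero.size else 0.0

    # Example: a sparsely connected "path" network has a small eigenvalue,
    # signalling imprecisely estimated fixed effects; extra links raise it.
    path = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]])
    print(smallest_nonzero_laplacian_eigenvalue(path))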

Supplement for CWP32/16

]]>
http://www.ifs.org.uk/publications/8410 Mon, 08 Aug 2016 00:00:00 +0000
<![CDATA[Housing equity, saving and debt dynamics over the Great Recession]]> This paper uses the large and heterogeneous house price shocks in Denmark from 2006-2009 to provide new evidence on the contested determinants of the correlation between house prices and saving. Crucially, to compare the savings behaviour of home-owners who experienced different house price shocks but similar shocks to income expectations, we exploit the structure of the wage setting process in the Danish public sector. We find strong evidence of a causal link between changes in house prices and saving for young and old home-owners, both through a direct wealth effect and through housing equity serving as collateral or precautionary wealth.

]]>
http://www.ifs.org.uk/publications/8402 Tue, 02 Aug 2016 00:00:00 +0000
<![CDATA[Locally robust semiparametric estimation]]> This paper shows how to construct locally robust semiparametric GMM estimators, meaning, equivalently, that the moment conditions have zero derivative with respect to the first step and that the first step does not affect the asymptotic variance. They are constructed by adding to the moment functions the adjustment term for first-step estimation. Locally robust estimators have several advantages. They are vital for valid inference with machine learning in the first step, see Belloni et al. (2012, 2014), and are less sensitive to the specification of the first step. They are doubly robust for affine moment functions, where moment conditions continue to hold when one first-step component is incorrect. Locally robust moment conditions also have smaller bias that is flatter as a function of first-step smoothing, leading to improved small-sample properties. Series first-step estimators confer local robustness on any moment conditions and are doubly robust for affine moments, in the direction of the series approximation. Many new locally and doubly robust estimators are given here, including for economic structural models. We give simple asymptotic theory for estimators that use cross-fitting in the first step, including machine learning.
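
A familiar example of the kind of adjusted, doubly robust moment function discussed above (our illustration, not the paper's general construction) is the augmented inverse-probability-weighted moment for an average treatment effect, with outcome regression m and propensity score p as first steps:

\[
\psi(W;\theta,m,p) = m(1,X) - m(0,X)
+ \frac{D\,[Y - m(1,X)]}{p(X)}
- \frac{(1-D)\,[Y - m(0,X)]}{1 - p(X)} - \theta,
\]

whose expectation is zero at the true θ if either the outcome regression or the propensity score is correct, and which has zero derivative with respect to both first steps at the truth.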

]]>
http://www.ifs.org.uk/publications/8401 Tue, 02 Aug 2016 00:00:00 +0000
<![CDATA[A simple parametric model selection test]]> We propose a simple model selection test for choosing between two parametric likelihoods which can be applied in the most general setting without any assumptions on the relation between the candidate models and the true distribution. That is, both, one, or neither candidate is allowed to be correctly specified or misspecified, and the models may be nested, non-nested, strictly non-nested or overlapping. Unlike in previous testing approaches, no pre-testing is needed, since in each case the same test statistic together with a standard normal critical value can be used. The new procedure controls asymptotic size uniformly over a large class of data generating processes. We demonstrate its finite sample properties in a Monte Carlo experiment and its practical relevance in an empirical application comparing Keynesian versus new classical macroeconomic models.

]]>
http://www.ifs.org.uk/publications/8400 Tue, 02 Aug 2016 00:00:00 +0000
<![CDATA[Can’t work or won’t work: quasi-experimental evidence on work search requirements for single parents]]> Increasing the labour market participation of single parents, whether to boost incomes or reduce welfare spending, is a major policy objective in a number of countries. This paper presents causal evidence on the impact of work search requirements on single parents’ transitions into work and onto other benefits. We use rich administrative data on all single parent welfare recipients, and apply a difference-in-differences approach that exploits the staggered roll-out of a reform in the UK that gradually decreased the age of the youngest child at which single parents lose the right to an unconditional cash benefit. Consistent with the predictions of a simple search model, the work search requirements have heterogeneous impacts, leading some single parents to move into work (especially those with strong previous labour market attachments), but leading some (especially those with weak previous labour market attachments) to move onto disability benefits (with no search conditionalities) or non-claimant unemployment.

]]>
http://www.ifs.org.uk/publications/8399 Thu, 28 Jul 2016 00:00:00 +0000
<![CDATA[Nonparametric estimation and inference under shape restrictions]]> Economic theory often provides shape restrictions on functions of interest in applications, such as monotonicity, convexity, non-increasing (non-decreasing) returns to scale, or the Slutsky inequality of consumer theory; but economic theory does not provide finite-dimensional parametric models. This motivates nonparametric estimation under shape restrictions. Nonparametric estimates are often very noisy. Shape restrictions stabilize nonparametric estimates without imposing arbitrary restrictions, such as additivity or a single-index structure, that may be inconsistent with economic theory and the data. This paper explains how to estimate and obtain an asymptotic uniform confidence band for a conditional mean function under possibly nonlinear shape restrictions, such as the Slutsky inequality. The results of Monte Carlo experiments illustrate the finite-sample performance of the method, and an empirical example illustrates its use in an application.
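
To fix ideas, the Slutsky restriction mentioned above, written for a scalar demand function q(p, y) of own price p and income y, is the nonlinear shape constraint

\[
\frac{\partial q(p,y)}{\partial p} + q(p,y)\,\frac{\partial q(p,y)}{\partial y} \;\le\; 0 \qquad \text{for all } (p,y),
\]

which involves the function and its derivatives jointly and is therefore not a simple monotonicity or convexity constraint.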

]]>
http://www.ifs.org.uk/publications/8389 Mon, 25 Jul 2016 00:00:00 +0000
<![CDATA[Consumption during the Great Recession in Italy]]> We use Italian micro data to investigate how consumers reacted to the Great Recession. In particular, we study the age profiles of non-durable consumption, durable purchases and wealth over the 2008-2012 period for different year-of-birth cohorts, and how they departed from the way they would have been had consumer behavior been the same as it was over the 1995-2006 period. We find that consumption dropped most for younger households - only part of these drops can be explained by the increase in unemployment. We also investigate whether the crisis had an impact on the way consumers allocate their spending among broad consumption bundles. We find that the budget elasticity of the demand for food changed during the recession period, particularly among the young.

]]>
http://www.ifs.org.uk/publications/8380 Wed, 20 Jul 2016 00:00:00 +0000
<![CDATA[The marriage market, labour supply and education choice]]> We develop an equilibrium lifecycle model of education, marriage, labor supply and consumption in a transferable utility context. Individuals start by choosing their investments in education anticipating returns in the marriage market and the labor market. They then match based on the economic value of marriage and on preferences. Equilibrium in the marriage market determines the intrahousehold allocation of resources. Following marriage, households (married or single) save, supply labor and consume private and public commodities under uncertainty. Marriage thus has the dual role of providing public goods and offering risk sharing. The model is estimated using the British HPS.

]]>
http://www.ifs.org.uk/publications/8372 Mon, 18 Jul 2016 00:00:00 +0000
<![CDATA[MCMC confidence sets for identified sets]]> In complicated/nonlinear parametric models, it is generally hard to determine whether the model parameters are (globally) point identified. We provide computationally attractive procedures to construct confidence sets (CSs) for identified sets of parameters in econometric models defined through a likelihood or a vector of moments. The CSs for the identified set or for a function of the identified set (such as a subvector) are based on inverting an optimal sample criterion (such as likelihood or continuously updated GMM), where the cutoff values are computed via Monte Carlo simulations directly from a quasi posterior distribution of the criterion. We establish new Bernstein-von Mises type theorems for the posterior distributions of the quasi-likelihood ratio (QLR) and profile QLR statistics in partially identified models, allowing for singularities. These results imply that the Monte Carlo criterion-based CSs have correct frequentist coverage for the identified set as the sample size increases, and that they coincide with Bayesian credible sets based on inverting a LR statistic for point-identified likelihood models. We also show that our Monte Carlo optimal criterion-based CSs are uniformly valid over a class of data generating processes that include both partially- and point-identified models. We demonstrate good finite sample coverage properties of our proposed methods in four non-trivial simulation experiments: missing data, entry game with correlated payoff shocks, Euler equation and finite mixture models. Finally, our proposed procedures are applied in two empirical examples.

]]>
http://www.ifs.org.uk/publications/8351 Thu, 07 Jul 2016 00:00:00 +0000
<![CDATA[Partial independence in nonseparable models]]> We analyze identification of nonseparable models under three kinds of exogeneity assumptions weaker than full statistical independence. The first is based on quantile independence. Selection on unobservables drives deviations from full independence. We show that such deviations based on quantile independence require non-monotonic and oscillatory propensity scores. Our second and third approaches are based on a distance-from-independence metric, using either a conditional cdf or a propensity score. Under all three approaches we obtain simple analytical characterizations of identified sets for various parameters of interest. We do this in three models: the exogenous regressor model of Matzkin (2003), the instrumental variable model of Chernozhukov and Hansen (2005), and the binary choice model with nonparametric latent utility of Matzkin (1992).
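
For concreteness, the quantile independence condition referred to above can be stated (in generic notation, ours) for a scalar disturbance U normalized to be uniform: instead of full independence of U and the conditioning variable Z, one requires only

\[
P\big(U \le \tau \mid Z = z\big) = \tau \quad \text{for all } z \text{ and all } \tau \in \mathcal{T},
\]

for a finite or otherwise restricted set of quantile levels \mathcal{T}, which is strictly weaker than full statistical independence whenever \mathcal{T} is not the whole unit interval.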

]]>
http://www.ifs.org.uk/publications/8332 Tue, 21 Jun 2016 00:00:00 +0000
<![CDATA[The UK wage premium puzzle: how did a large increase in university graduates leave the education premium unchanged?]]> Since the early 1990s the UK experienced an unprecedented increase in university graduates. The proportion of people with a university degree by age 30 more than doubled from 16% for those born in 1965-69 to 33% for those born ten years later. At the same time the age profile of the graduate premium remained largely unchanged across cohorts. This paper first establishes the facts using a detailed analysis of micro-data on wage and employment patterns over the last two decades, benchmarked against the US economy. We then show that the stability of the age profile in the premium across different birth cohorts is unlikely to be explained by either composition changes or selection on unobservables. We also argue that it is inconsistent with skill-biased technical change affecting all advanced economies in the same way. We further rule out explanations based on factor price equalisation. Our resolution of the puzzle is a model in which increases in the level of education induce firms to transit toward a decentralised technology in which decision-making is spread more widely through the workforce. We provide empirical support for this view.

]]>
http://www.ifs.org.uk/publications/8322 Fri, 17 Jun 2016 00:00:00 +0000
<![CDATA[Selling daughters: age of marriage, income shocks and the bride price tradition]]> When markets are incomplete, cultural norms may play an important role in shaping economic behavior. In this paper, we explore whether income shocks increase the probability of child marriages in societies that engage in bride price payments – transfers from the groom to the bride’s parents at marriage. We develop a simple model in which households are exposed to income volatility and have no access to credit markets. If a daughter marries, the household obtains a bride price and has fewer members to support. In this framework, girls have a higher probability of marrying early when their parents have higher marginal utility of consumption because of adverse income shocks. We test the prediction of the model by exploiting variation in rainfall shocks over a woman’s life cycle, using a survey dataset from rural Tanzania. We find that adverse shocks during teenage years increase the probability of early marriages and early fertility among women.

]]>
http://www.ifs.org.uk/publications/8323 Fri, 17 Jun 2016 00:00:00 +0000
<![CDATA[A critical value function approach, with an application to persistent time-series]]> Researchers often rely on the t-statistic to make inference on parameters in statistical models. It is common practice to obtain critical values by simulation techniques. This paper proposes a novel numerical method to obtain an approximately similar test. This test rejects the null hypothesis when the test statistic is larger than a critical value function (CVF) of the data. We illustrate this procedure when regressors are highly persistent, a case in which commonly-used simulation methods encounter difficulties controlling size uniformly. Our approach works satisfactorily, controls size, and yields a test which outperforms the two other known similar tests.

Supplement for CWP24/16

]]>
http://www.ifs.org.uk/publications/8319 Tue, 14 Jun 2016 00:00:00 +0000
<![CDATA[Nonparametric analysis of random utility models]]> This paper develops and implements a nonparametric test of Random Utility Models. The motivating application is to test the null hypothesis that a sample of cross-sectional demand distributions was generated by a population of rational consumers. We test a necessary and sufficient condition for this that does not rely on any restriction on unobserved heterogeneity or the number of goods. We also propose and implement a control function approach to account for endogenous expenditure. An econometric result of independent interest is a test for linear inequality constraints when these are represented as the vertices of a polyhedron rather than its faces. An empirical application to the U.K. Household Expenditure Survey illustrates computational feasibility of the method in demand problems with 5 goods.
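
A stripped-down sketch of the geometric step described above (ours; the paper's statistic, weighting and tightening are more involved): test whether an estimated vector can be written as a nonnegative combination of the columns of a matrix A whose columns encode the rational (vertex) types, using nonnegative least squares.

    import numpy as np
    from scipy.optimize import nnls  # nonnegative least squares

    def vertex_representation_distance(pi_hat, A):
        """Squared Euclidean distance from pi_hat to the set { A @ nu : nu >= 0 },
        where the columns of A list the vertex (rational) types."""
        nu, resid_norm = nnls(A, pi_hat)
        return resid_norm ** 2, nu

    # Toy example with three choice patterns and two rational types (illustrative numbers).
    A = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
    pi_hat = np.array([0.6, 0.3, 0.1])
    distance, weights = vertex_representation_distance(pi_hat, A)  # distance > 0 here

A zero distance means the estimated vector is rationalizable by some mixture of the listed types; the actual test statistic scales and weights this distance and compares it to a simulated critical value.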

]]>
http://www.ifs.org.uk/publications/8336 Tue, 14 Jun 2016 00:00:00 +0000
<![CDATA[Optimal two-sided tests for instrumental variables regression with heteroskedastic and autocorrelated errors]]> This paper considers two-sided tests for the parameter of an endogenous variable in an instrumental variable (IV) model with heteroskedastic and autocorrelated errors. We develop the finite-sample theory of weighted-average power (WAP) tests with normal errors and a known long-run variance. We introduce two weights which are invariant to orthogonal transformations of the instruments; e.g., changing the order in which the instruments appear. While tests using the MM1 weight can be severely biased, optimal tests based on the MM2 weight are naturally two-sided when errors are homoskedastic. We propose two boundary conditions that yield two-sided tests whether errors are homoskedastic or not. The locally unbiased (LU) condition is related to the power around the null hypothesis and is a weaker requirement than unbiasedness. The strongly unbiased (SU) condition is more restrictive than LU, but the associated WAP tests are easier to implement. Several tests are SU in finite samples or asymptotically, including tests robust to weak IV (such as the Anderson-Rubin, score, conditional quasi-likelihood ratio, and I. Andrews' (2015) PI-CLC tests) and two-sided tests which are optimal when the sample size is large and instruments are strong. We refer to the WAP-SU tests based on our weights as MM1-SU and MM2-SU tests. Dropping the restrictive assumptions of normality and known variance, the theory is shown to remain valid at the cost of asymptotic approximations. The MM2-SU test is optimal under the strong IV asymptotics, and outperforms other existing tests under the weak IV asymptotics.

]]>
http://www.ifs.org.uk/publications/8320 Tue, 14 Jun 2016 00:00:00 +0000
<![CDATA[Estimation of a Multiplicative Covariance Structure]]> We consider a Kronecker product structure for large covariance matrices, which has the feature that the number of free parameters increases logarithmically with the dimensions of the matrix. We propose an estimation method of the free parameters based on the log linear property of this structure, and also a Quasi-Likelihood method. We establish the rate of convergence of the estimated parameters when the size of the matrix diverges. We also establish a CLT for our method. We apply the method to portfolio choice for S&P500 daily returns and compare with sample covariance based methods and with the recent Fan et al. (2013) method.
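
One way to see the logarithmic parameter count claimed above (our illustration, assuming for simplicity that every factor is 2 × 2): if the dimension is n = 2^v, a Kronecker structure

\[
\Sigma \;=\; \Sigma_1 \otimes \Sigma_2 \otimes \cdots \otimes \Sigma_v, \qquad \Sigma_j \in \mathbb{R}^{2\times 2},
\]

has on the order of 3v = 3 log_2 n free parameters (up to scale normalizations), against n(n+1)/2 for an unrestricted covariance matrix.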

]]>
http://www.ifs.org.uk/publications/8288 Tue, 17 May 2016 00:00:00 +0000
<![CDATA[The value of private schools: evidence from Pakistan]]> Using unique data from Pakistan we estimate a model of demand for differentiated products in 112 rural education markets with significant choice among public and private schools. Our model accounts for the endogeneity of school fees and the characteristics of students attending the school. As expected, central determinants of school choice are the distance to school, school fees, and the characteristics of peers. Families are willing to pay on average between 75% and 115% of the average annual private school fee for a 500 meter reduction in distance. In contrast, price elasticities are low: -0.5 for girls and -0.2 for boys. Both distance and price elasticities are consistent with other estimates in the literature, but at odds with a belief among policy makers that school fees deter enrollment and participation in private schooling. Using the estimates from the demand model we show that the existence of a low fee private school market is of great value for households in our sample, reaching about 25% to 100% of monthly per capita income for those choosing private schools. A voucher policy that reduces the fees of private schools to $0 (from an average annual fee of $13) increases private school enrollment by 7.5 percentage points for girls and 4.2 percentage points for boys. Our demand estimates and policy simulations, which account for key challenges specific to the schooling market, help situate ongoing debate around private schools within a larger framework of consumer choice and welfare.

]]>
http://www.ifs.org.uk/publications/8283 Fri, 13 May 2016 00:00:00 +0000
<![CDATA[Inference under Covariate-Adaptive Randomization]]> This paper studies inference for the average treatment effect in randomized controlled trials with covariate-adaptive randomization. Here, by covariate-adaptive randomization, we mean randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve "balance" within each stratum. Such schemes include, for example, Efron's biased-coin design and stratified block randomization. When testing the null hypothesis that the average treatment effect equals a pre-specified value in such settings, we first show that the usual two-sample t-test is conservative in the sense that it has limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level. In a simulation study, we find that the rejection probability may in fact be dramatically less than the nominal level. We show further that these same conclusions remain true for a naïve permutation test, but that a modified version of the permutation test yields a test that is non-conservative in the sense that its limiting rejection probability under the null hypothesis equals the nominal level for a wide variety of randomization schemes. The modified version of the permutation test has the additional advantage that it has rejection probability exactly equal to the nominal level for some distributions satisfying the null hypothesis and some randomization schemes. Finally, we show that the usual t-test (on the coefficient on treatment assignment) in a linear regression of outcomes on treatment assignment and indicators for each of the strata yields a non-conservative test as well under even weaker assumptions on the randomization scheme. In a simulation study, we find that the non-conservative tests have substantially greater power than the usual two-sample t-test.

]]>
http://www.ifs.org.uk/publications/8274 Tue, 10 May 2016 00:00:00 +0000
<![CDATA[Posterior distribution of nondifferentiable functions]]> This paper examines the asymptotic behavior of the posterior distribution of a possibly nondifferentiable function g(theta), where theta is a finite dimensional parameter. The main assumption is that the distribution of the maximum likelihood estimator theta_n, its bootstrap approximation, and the Bayesian posterior for theta all agree asymptotically. It is shown that whenever g is Lipschitz, though not necessarily differentiable, the posterior distribution of g(theta) and the bootstrap distribution of g(theta_n) coincide asymptotically. One implication is that Bayesians can interpret bootstrap inference for g(theta) as approximately valid posterior inference in a large sample. Another implication—built on known results about bootstrap inconsistency—is that the posterior distribution of g(theta) does not coincide with the asymptotic distribution of g(theta_n) at points of nondifferentiability. Consequently, frequentists cannot presume that credible sets for a nondifferentiable parameter g(theta) can be interpreted as approximately valid confidence sets (even when this relation holds true for theta).

]]>
http://www.ifs.org.uk/publications/8263 Mon, 09 May 2016 00:00:00 +0000
<![CDATA[Bias-corrected confidence intervals in a class of linear inverse problems]]> In this paper we propose a novel method to construct confidence intervals in a class of linear inverse problems. First, point estimators are obtained via a spectral cut-off method depending on a regularisation parameter that determines the bias of the estimator. Next, the proposed confidence interval corrects for this bias by explicitly estimating it based on a second regularisation parameter, which is asymptotically smaller than the first. The coverage error of the interval is shown to converge to zero. The proposed method is illustrated via two simulation studies, one in the context of functional linear regression, and the second one in the context of instrumental regression.

]]>
http://www.ifs.org.uk/publications/8262 Mon, 09 May 2016 00:00:00 +0000
<![CDATA[Updating ambiguous beliefs in a social learning experiment]]> We present a novel experimental design to study social learning in the laboratory. Subjects have to predict the value of a good in a sequential order. We elicit each subject’s belief twice: first (“prior belief”), after he observes his predecessors’ action; second (“posterior belief”), after he observes a private signal on the value of the good. We are therefore able to disentangle social learning from learning from a private signal. Our main result is that subjects update on their private signal in an asymmetric way. They weigh the private signal as a Bayesian agent would do when the signal confirms their prior belief; they overweight the signal when it contradicts their prior belief. We show that this way of updating, incompatible with Bayesianism, can be explained by ambiguous beliefs (multiple priors on the predecessor’s rationality) and a generalization of the Maximum Likelihood Updating rule.

]]>
http://www.ifs.org.uk/publications/8261 Mon, 09 May 2016 00:00:00 +0000
<![CDATA[Bounds On Treatment Effects On Transitions]]> This paper considers identification of treatment effects on conditional transition probabilities. We show that even under random assignment only the instantaneous average treatment effect is point identified. Because treated and control units drop out at different rates, randomization only ensures the comparability of treatment and controls at the time of randomization, so that long run average treatment effects are not point identified. Instead we derive informative bounds on these average treatment effects. Our bounds do not impose (semi)parametric restrictions, such as proportional hazards. We also explore various assumptions such as monotone treatment response, common shocks and positively correlated outcomes that tighten the bounds.

]]>
http://www.ifs.org.uk/publications/8243 Fri, 22 Apr 2016 00:00:00 +0000
<![CDATA[Taxing high-income earners: tax avoidance and mobility]]> The taxation of high-income earners is of importance to every country and is the subject of a considerable amount of recent academic research. Such high-income earners contribute substantial amounts of tax and generate significant positive spillovers, but are also highly mobile: a 1% increase in the top marginal income tax rate increases out-migrations by around 1.5 to 3%. We review research into taxation of high-income earners to provide a synthesis of existing theoretical and empirical understanding. We offer various avenues for potential future theoretical and empirical research.

]]>
http://www.ifs.org.uk/publications/8242 Fri, 22 Apr 2016 00:00:00 +0000
<![CDATA[Homophily and transitivity in dynamic network formation]]> In social and economic networks linked agents often share additional links in common. There are two competing explanations for this phenomenon. First, agents may have a structural taste for transitive links – the returns to linking may be higher if two agents share links in common. Second, agents may assortatively match on unobserved attributes, a process called homophily. I study parameter identifiability in a simple model of dynamic network formation with both effects. Agents form, maintain, and sever links over time in order to maximize utility. The return to linking may be higher if agents share friends in common. A pair-specific utility component allows for arbitrary homophily on time-invariant agent attributes. I derive conditions under which it is possible to detect the presence of a taste for transitivity in the presence of assortative matching on unobservables. I leave the joint distribution of the initial network and the pair-specific utility component, a very high dimensional object, unrestricted. The analysis is of the 'fixed effects' type. The identification result is constructive, suggesting an analog estimator, whose single large network properties I characterize.

]]>
http://www.ifs.org.uk/publications/8238 Fri, 15 Apr 2016 00:00:00 +0000
<![CDATA[How English domiciled graduate earnings vary with gender, institution attended, subject and socio-economic background]]> This paper uses tax and student loan administrative data to measure how the earnings of English graduates around 10 years into the labour market vary with gender, institution attended, subject and socio-economic background. The English system is competitive to enter, with some universities demanding very high entrance grades. Students specialise early, nominating their subject before they enter higher education (HE). We find subjects like Medicine, Economics, Law, Maths and Business deliver substantial premiums over typical graduates, while, disappointingly, Creative Arts delivers earnings which are roughly typical of non-graduates. Considerable variation in earnings is observed across different institutions. Much of this is explained by student background and subject mix. Based on a simple measure of parental income, we see that students from higher income families have median earnings which are around 25% more than those from lower income families. Once we control for institution attended and subject chosen this premium falls to around 10%.

]]>
http://www.ifs.org.uk/publications/8233 Wed, 13 Apr 2016 00:00:00 +0000
<![CDATA[Optimal data collection for randomized control trials]]> In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on an orthogonal greedy algorithm that is conceptually simple, easy to implement (even when the number of potential covariates is very large), and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.
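
A minimal sketch (ours) of the orthogonal greedy idea mentioned above: repeatedly add the candidate covariate most correlated with the current residual of the outcome regression, refitting by least squares after each addition, until the data-collection budget is exhausted. The budget handling and the mapping from covariates to costs are illustrative assumptions; the paper's procedure also chooses the sample size jointly.

    import numpy as np

    def orthogonal_greedy_selection(y, X, costs, budget):
        """Select covariate columns of X by orthogonal greedy (OMP-style) steps,
        subject to a total data-collection budget. Returns selected column indices."""
        n, p = X.shape
        selected, residual, spent = [], y - y.mean(), 0.0
        while True:
            # Correlation of each remaining affordable candidate with the residual.
            scores = np.full(p, -np.inf)
            for j in range(p):
                if j in selected or spent + costs[j] > budget:
                    continue
                xj = X[:, j]
                denom = np.linalg.norm(xj)
                if denom > 0:
                    scores[j] = abs(xj @ residual) / denom
            best = int(np.argmax(scores))
            if not np.isfinite(scores[best]):
                break                      # nothing affordable or informative left
            selected.append(best)
            spent += costs[best]
            # Refit on all selected covariates (the "orthogonal" step) and update residual.
            Xs = np.column_stack([np.ones(n), X[:, selected]])
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            residual = y - Xs @ beta
        return selected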

The original version of the working paper, posted on 01 April, 2016, is available here.

]]>
http://www.ifs.org.uk/publications/8223 Fri, 01 Apr 2016 00:00:00 +0000
<![CDATA[Estimating Matching Games with Transfers]]> I explore the estimation of transferable utility matching games, encompassing many-to-many matching, marriage and matching with trading networks (trades). I introduce a matching maximum score estimator that does not suffer from a computational curse of dimensionality in the number of agents in a matching market. I apply the estimator to data on the car parts supplied by automotive suppliers to estimate the returns from different portfolios of parts to suppliers and automotive assemblers.

]]>
http://www.ifs.org.uk/publications/8222 Thu, 24 Mar 2016 00:00:00 +0000
<![CDATA[Scotland’s fiscal framework: assessing the agreement]]> The Smith Commission Agreement, published on 27 November 2014, set out proposals for substantial fiscal devolution to the Scottish Parliament. The Scotland Bill – due to receive Royal Assent shortly – will enshrine these powers in law.

Both the Smith Commission Agreement and the UK Government’s subsequent Command Paper, ‘An Enduring Settlement’, recognised that the devolution of fiscal powers has to be accompanied by the development of a new Fiscal Framework for Scotland.

Without such a framework there could be no fiscal devolution. It is essential in order to set out rules such as: how the Scottish Government’s block grant will be calculated in light of its new fiscal powers; what level of borrowing powers Scotland will have to enable it to deal with the additional economic risks and revenue volatility that it will face; the extent and scope of fiscal rules governing Scottish Government deficits and debt; arrangements for independent fiscal scrutiny, including fiscal forecasting; and arrangements for governing the increasingly complex interactions between Scottish and UK fiscal policy, including dispute resolution.

The Fiscal Framework is not part of the Scotland Bill: it is instead an agreement between the UK and Scottish governments (and therefore does not have the same legal standing as the Bill). It was finally published on 25 February 2016 after many months of negotiations between the two governments. The process of reaching agreement was protracted, and there were a number of contentious areas. But it seems the most significant area of disagreement was how the Scottish Government’s block grant should be adjusted to reflect its new powers.

The Smith Commission Agreement established that Scotland’s underlying block grant funding would continue to be determined by the Barnett Formula. But the Barnett-determined block grant would then have to be adjusted to reflect the new powers. On the one hand, the grant would have to be reduced to reflect the transfer of tax revenues from the UK to the Scottish Government, while on the other, an addition would need to be made to reflect the transfer of new welfare spending responsibilities to the Scottish Government.

The Smith Commission Agreement also established a number of high-level principles which it felt the Fiscal Framework should adhere to, and which were expected to govern the development of a proposal to adjust Scotland’s block grant. But, as we showed in our previous report, it is not possible to design a method for adjusting Scotland’s block grant that meets all of the Smith Commission principles simultaneously.

This inconsistency between the Smith principles was the main cause of the protracted negotiations between the two governments, and for several months it seemed likely to undermine the progress of the Scotland Bill. Each government interpreted the principles somewhat differently and chose to prioritise them differently, with the result that each favoured an alternative approach to adjusting Scotland’s block grant. Compromise was finally reached in February 2016, with an agreement on how to adjust the block grant for the next five years. While the mechanism chosen is complex and seems to blend elements of the UK and Scottish governments’ preferred approaches, ultimately it is the Scottish government’s approach that will determine the block grant available to Scotland during this period. After five years, an independent assessment will be carried out and negotiations will take place on how to adjust the block grant in the years beyond 2022. 

This report reviews and appraises the Fiscal Framework Agreement, with a particular focus on this issue of block grant adjustment.

The work was carried out jointly with authors at the ESRC Centre on Constitutional Change, the hub for research on the UK’s changing constitutional relationships. Its fellows examine how the evolving relationships between governments and parliaments in London, Edinburgh, Cardiff, Belfast and Brussels impact on the polity, economy and society of the UK and its component nations.

]]>
http://www.ifs.org.uk/publications/8212 Tue, 22 Mar 2016 00:00:00 +0000
<![CDATA[Education policy and intergenerational transfers in equilibrium]]> This paper examines the equilibrium effects of alternative financial aid policies intended to promote college participation. We build an overlapping generations life-cycle, heterogeneous-agent, incomplete-markets model with education, labor supply, and consumption/saving decisions. Driven by both altruism and paternalism, parents make inter vivos transfers to their children. Both cognitive and non-cognitive skills determine the non-pecuniary cost of schooling. Labor supply during college, government grants and loans, as well as private loans, complement parental resources as means of funding college education. We find that the current financial aid system in the U.S. improves welfare, and removing it would reduce GDP by 4-5 percentage points in the long run. Further expansions of government-sponsored loan limits or grants would have no salient aggregate effects because of substantial crowding-out: every additional dollar of government grants crowds out 30 cents of parental transfers plus an equivalent amount through a reduction in student’s labor supply. However, a small group of high-ability children from poor families, especially girls, would greatly benefit from more generous federal aid.

]]>
http://www.ifs.org.uk/publications/8211 Mon, 21 Mar 2016 00:00:00 +0000
<![CDATA[Program evaluation and causal inference with high-dimensional data]]> In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment condition framework, which arises from structural equation models in econometrics. Here too the crucial ingredient is the use of orthogonal moment conditions, which can be constructed from the initial moment conditions. We provide results on honest inference for (function-valued) parameters within this general framework where any high-quality, modern machine learning methods can be used to learn the nonparametric/high-dimensional components of the model. These include a number of supporting auxiliary results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity-based estimation of regression functions for function-valued outcomes.

]]>
http://www.ifs.org.uk/publications/8209 Sat, 19 Mar 2016 00:00:00 +0000
<![CDATA[Simple Nonparametric Estimators for the Bid-Ask Spread in the Roll Model]]> We propose new methods for estimating the bid-ask spread from observed transaction prices alone. Our methods are based on the empirical characteristic function instead of the sample autocovariance function like the method of Roll (1984). As in Roll (1984), we have a closed form expression for the spread, but this is only based on a limited amount of the model-implied identification restrictions. We also provide methods that take account of more identification information. We compare our methods theoretically and numerically with the Roll method as well as with its best known competitor, the Hasbrouck (2004) method, which uses a Bayesian Gibbs methodology under a Gaussian assumption. Our estimators are competitive with Roll’s and Hasbrouck’s when the latent true fundamental return distribution is Gaussian, and perform much better when this distribution is far from Gaussian. Our methods are applied to the E-mini futures contract on the S&P 500 during the Flash Crash of May 6, 2010. Extensions to models allowing for unbalanced order flow or Hidden Markov trade direction indicators or trade direction indicators having general asymmetric support or adverse selection are also presented, without requiring additional data.
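
For reference, the autocovariance-based benchmark the paper starts from is Roll's (1984) estimator, which backs out the spread s from the first-order autocovariance of observed price changes Δp_t:

\[
\operatorname{Cov}(\Delta p_t, \Delta p_{t-1}) = -\frac{s^2}{4}
\quad\Longrightarrow\quad
\hat{s} = 2\sqrt{-\widehat{\operatorname{Cov}}(\Delta p_t, \Delta p_{t-1})},
\]

which is only usable when the sample autocovariance is negative; the characteristic-function approach described above exploits additional model-implied restrictions.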

]]>
http://www.ifs.org.uk/publications/8208 Fri, 18 Mar 2016 00:00:00 +0000
<![CDATA[Possibly Nonstationary Cross-Validation]]> Cross-validation is the most common data-driven procedure for choosing smoothing parameters in nonparametric regression. For the case of kernel estimators with iid or strong mixing data, it is well-known that the bandwidth chosen by cross-validation is optimal with respect to the average squared error and other performance measures. In this paper, we show that the cross-validated bandwidth continues to be optimal with respect to the average squared error even when the data-generating process is a recurrent Markov chain. This general class of processes covers stationary as well as nonstationary Markov chains. Hence, the proposed procedure adapts to the degree of recurrence, thereby freeing the researcher from the need to assume stationarity (or nonstationarity) before inference begins. We study finite sample performance in a Monte Carlo study. We conclude by demonstrating the practical usefulness of cross-validation in a highly-persistent environment, namely that of nonlinear predictive systems for market returns.
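
As a reminder of the baseline procedure being studied, the sketch below (ours) computes the leave-one-out least-squares cross-validation criterion for the bandwidth of a Nadaraya-Watson kernel regression estimator; the paper's contribution is the theory justifying this choice when the data form a possibly nonstationary recurrent Markov chain, not the algorithm itself.

    import numpy as np

    def loo_cv_criterion(x, y, h):
        """Leave-one-out least-squares CV criterion for the Nadaraya-Watson
        estimator with a Gaussian kernel and bandwidth h."""
        u = (x[:, None] - x[None, :]) / h          # pairwise scaled differences
        K = np.exp(-0.5 * u ** 2)                  # Gaussian kernel weights
        np.fill_diagonal(K, 0.0)                   # leave each observation out
        denom = K.sum(axis=1)
        m_loo = np.where(denom > 0, (K @ y) / np.maximum(denom, 1e-12), y.mean())
        return np.mean((y - m_loo) ** 2)

    def cross_validated_bandwidth(x, y, grid):
        """Return the bandwidth on the grid minimizing the CV criterion."""
        values = [loo_cv_criterion(x, y, h) for h in grid]
        return grid[int(np.argmin(values))]

    # Example: predicting the next value of a persistent (here stationary) AR(1) chain.
    rng = np.random.default_rng(0)
    x = np.zeros(500)
    for t in range(1, 500):
        x[t] = 0.9 * x[t - 1] + rng.standard_normal()
    h_cv = cross_validated_bandwidth(x[:-1], x[1:], np.linspace(0.1, 2.0, 20))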

]]>
http://www.ifs.org.uk/publications/8207 Sat, 12 Mar 2016 00:00:00 +0000
<![CDATA[Identification and efficiency bounds for the average match function under conditionally exogenous matching]]> Consider two heterogeneous populations of agents who, when matched, jointly produce an output, Y. For example, teachers and classrooms of students together produce achievement, parents raise children, whose life outcomes vary in adulthood, assembly plant managers and workers produce a certain number of cars per month, and lieutenants and their platoons vary in unit effectiveness. Let W ∈ 𝕎 = {ω1, . . . , ωJ} and X ∈ 𝕏 = {x1, . . . , xK} denote agent types in the two populations. Consider the following matching mechanism: take a random draw from the W = wj subgroup of the first population and match her with an independent random draw from the X = xk subgroup of the second population. Let β (wj, xk), the average match function (AMF), denote the expected output associated with this match. We show that (i) the AMF is identified when matching is conditionally exogenous, (ii) conditionally exogenous matching is compatible with a pairwise stable aggregate matching equilibrium under specific informational assumptions, and (iii) we calculate the AMF’s semiparametric efficiency bound.
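
Schematically (our notation, simplified, with R denoting the observed characteristics conditional on which matching is assumed exogenous), the identification result has the familiar selection-on-observables form, applied to matches rather than to treatments:

\[
\beta(w_j, x_k) \;=\; \mathbb{E}_R\Big[\, \mathbb{E}\big[\,Y \mid W=w_j,\, X=x_k,\, R\,\big] \Big].
\]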

]]>
http://www.ifs.org.uk/publications/8195 Fri, 11 Mar 2016 00:00:00 +0000
<![CDATA[Teacher Quality and Learning Outcomes in Kindergarten]]> We assigned two cohorts of kindergarten students, totaling more than 24,000 children, to teachers within schools with a rule that is as-good-as-random. We collected data on children at the beginning of the school year, and applied 12 tests of math, language and executive function (EF) at the end of the year. All teachers were filmed teaching for a full day, and the videos were coded using a well-known classroom observation tool, the Classroom Assessment Scoring System (or CLASS). We find substantial classroom effects: a one-standard-deviation increase in classroom quality results in 0.11, 0.11, and 0.07 standard deviation higher test scores in language, math, and EF, respectively. Teacher behaviors, as measured by the CLASS, are associated with higher test scores. Parents recognize better teachers, but do not change their behaviors appreciably to take account of differences in teacher quality.

]]>
http://www.ifs.org.uk/publications/8189 Thu, 03 Mar 2016 00:00:00 +0000
<![CDATA[Measuring and Changing Control: Women's Empowerment and Targeted Transfers]]> This paper studies how targeted cash transfers to women affect their empowerment. We use a novel identification strategy to measure women's willingness to pay to receive cash transfers instead of their partner receiving it. We apply this among women living in poor households in urban Macedonia. We match experimental data with a unique policy intervention (CCT) in Macedonia offering poor households cash transfers conditional on having their children attending secondary school. The program randomized whether the transfer was offered to household heads or mothers at municipality level, providing us with an exogenous source of variation in (offered) transfers. We show that women who were offered the transfer reveal a lower willingness to pay, and we show that this is in line with theoretical predictions.

]]>
http://www.ifs.org.uk/publications/8184 Wed, 02 Mar 2016 00:00:00 +0000
<![CDATA[Female labour supply, human capital and welfare reform]]> We estimate a dynamic model of employment, human capital accumulation (including education) and savings for women in the UK, exploiting tax and benefit reforms, and use it to analyze the effects of welfare policy. We find substantial labor supply elasticities, particularly for lone mothers. Returns to experience, which are important in determining the longer-term effects of policy, increase with education, but experience mainly accumulates when in full-time employment. Tax credits are welfare improving in the UK and increase lone-mother labor supply, but the employment effects do not extend beyond the period of eligibility. Marginal increases in tax credits improve welfare more than equally costly increases in income support or tax cuts.

]]>
http://www.ifs.org.uk/publications/8170 Fri, 19 Feb 2016 00:00:00 +0000
<![CDATA[A new model for interdependent durations with an application to joint retirement]]> This paper introduces a bivariate version of the generalized accelerated failure time model. It allows for simultaneity in the econometric sense that the two realized outcomes depend structurally on each other. Another feature of the proposed model is that it will generate equal durations with positive probability. The motivating example is retirement decisions by married couples. In that example it seems reasonable to allow for the possibility that each partner's optimal retirement time depends on the retirement time of the spouse.

Moreover, the data suggest that the wife and the husband retire at the same time for a nonnegligible fraction of couples. Our approach takes as a starting point a stylized economic model that leads to a univariate generalized accelerated failure time model. The covariates of that generalized accelerated failure time model act as utility-flow shifters in the economic model. We introduce simultaneity by allowing the utility flow in retirement to depend on the retirement status of the spouse. The econometric model is then completed by assuming that the observed outcome is the Nash bargaining solution in that simple economic model. The advantage of this approach is that it includes independent realizations from the generalized accelerated failure time model as a special case, and deviations from this special case can be given an economic interpretation. We illustrate the model by studying the joint retirement decisions in married couples using the Health and Retirement Study. We provide a discussion of relevant identifying variation and estimate our model using indirect inference. The main empirical finding is that the simultaneity seems economically important. In our preferred specification the indirect utility associated with being retired increases by approximately 5% when one's spouse retires. The estimated model also predicts that the marginal effect of a change in the husbands' pension plan on wives' retirement dates is about 3.3% of the direct effect on the husbands'.

]]>
http://www.ifs.org.uk/publications/8168 Wed, 17 Feb 2016 00:00:00 +0000
<![CDATA[Technology entry in the presence of patent thickets]]> We analyze the effect of patent thickets on entry into technology areas by firms in the UK. We present a model that describes incentives to enter technology areas characterized by varying technological opportunity, complexity of technology, and the potential for hold‐up in patent thickets. We show empirically that our measure of patent thickets is associated with a reduction of first time patenting in a given technology area controlling for the level of technological complexity and opportunity. Technological areas characterized by more technological complexity and opportunity, in contrast, see more entry. Our evidence indicates that patent thickets raise entry costs, which leads to less entry into technologies regardless of a firm’s size.

]]>
http://www.ifs.org.uk/publications/8163 Fri, 12 Feb 2016 00:00:00 +0000
<![CDATA[Practical and theoretical advances in inference for partially identified models]]> This paper surveys some of the recent literature on inference in partially identified models. After reviewing some basic concepts, including the definition of a partially identified model and the identified set, we turn our attention to the construction of confidence regions in partially identified settings. In our discussion, we emphasize the importance of requiring confidence regions to be uniformly consistent in level over relevant classes of distributions. Due to space limitations, our survey is mainly limited to the class of partially identified models in which the identified set is characterized by a finite number of moment inequalities or the closely related class of partially identified models in which the identified set is a function of such a set. The latter class of models most commonly arises when interest focuses on a subvector of a vector-valued parameter whose values are limited by a finite number of moment inequalities. We then rapidly review some important parts of the broader literature on inference in partially identified models and conclude by providing some thoughts on fruitful directions for future research.

]]>
http://www.ifs.org.uk/publications/8133 Fri, 29 Jan 2016 00:00:00 +0000
<![CDATA[Econometrics of network models]]> In this article I provide a (selective) review of the recent econometric literature on networks. I start with a discussion of developments in the econometrics of group interactions. I subsequently provide a description of statistical and econometric models for network formation and approaches for the joint determination of networks and interactions mediated through those networks. Finally, I give a very brief discussion of measurement issues in both outcomes and networks. My focus is on identification and computational issues, but estimation aspects are also discussed.

]]>
http://www.ifs.org.uk/publications/8134 Fri, 29 Jan 2016 00:00:00 +0000
<![CDATA[Dual regression]]> We propose an alternative ('dual regression') to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while largely avoiding the need for 'rearrangement' to repair the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach relies on a mathematical programming characterization of conditional distribution functions which, in its simplest form, provides a simultaneous estimator of location and scale parameters in a linear heteroscedastic model. The statistical properties of this estimator are derived.
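In the location-scale special case the target is a linear heteroscedastic model Y = X'beta + (X'gamma)*eps with eps standardised to mean zero and unit variance. As a rough illustration of that special case only, and not of the paper's mathematical-programming formulation, the corresponding sample moment conditions can be solved numerically; the function below is a hypothetical sketch and its starting values assume the first column of X is a constant.

    import numpy as np
    from scipy.optimize import least_squares

    def location_scale_fit(y, X):
        """Illustrative solver for the moments E[X*e] = 0 and E[X*(e^2 - 1)] = 0,
        where e = (y - X beta) / (X gamma); a sketch, not the dual program."""
        y, X = np.asarray(y, float), np.asarray(X, float)
        n, k = X.shape

        def moments(theta):
            beta, gamma = theta[:k], theta[k:]
            e = (y - X @ beta) / (X @ gamma)
            return np.concatenate([X.T @ e, X.T @ (e ** 2 - 1)]) / n

        beta0 = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start for beta
        gamma0 = np.zeros(k)
        gamma0[0] = (y - X @ beta0).std()                   # constant-scale start
        sol = least_squares(moments, np.concatenate([beta0, gamma0]))
        return sol.x[:k], sol.x[k:]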

]]>
http://www.ifs.org.uk/publications/8120 Wed, 13 Jan 2016 00:00:00 +0000
<![CDATA[Doubly robust uniform confidence band for the conditional average treatment effect function]]> In this paper, we propose a doubly robust method to present the heterogeneity of the average treatment effect with respect to observed covariates of interest. We consider a situation where a large number of covariates are needed for identifying the average treatment effect but the covariates of interest for analyzing heterogeneity are of much lower dimension. Our proposed estimator is doubly robust and avoids the curse of dimensionality. We propose a uniform confidence band that is easy to compute, and we illustrate its usefulness via Monte Carlo experiments and an application to the effects of smoking on birth weights.
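A hedged sketch of the two building blocks, assuming estimates of the propensity score and the two outcome regressions are already available from a first stage: form the doubly robust pseudo-outcome, whose conditional mean given the low-dimensional covariate of interest is the conditional average treatment effect, and smooth it nonparametrically. The uniform critical values derived in the paper are not computed here, and all function names below are placeholders.

    import numpy as np

    def dr_pseudo_outcome(y, d, p_hat, mu1_hat, mu0_hat):
        """Doubly robust pseudo-outcome; its conditional mean is the CATE."""
        return (mu1_hat - mu0_hat
                + d * (y - mu1_hat) / p_hat
                - (1 - d) * (y - mu0_hat) / (1 - p_hat))

    def cate_curve(v, psi, grid, h):
        """Nadaraya-Watson smoothing of the pseudo-outcome on a scalar covariate v."""
        v, psi, grid = (np.asarray(a, float) for a in (v, psi, grid))
        w = np.exp(-0.5 * ((grid[:, None] - v[None, :]) / h) ** 2)   # Gaussian kernel
        return (w * psi[None, :]).sum(axis=1) / w.sum(axis=1)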

]]>
http://www.ifs.org.uk/publications/8119 Sun, 10 Jan 2016 00:00:00 +0000
<![CDATA[Confidence intervals for projections of partially identified parameters]]> This paper proposes a bootstrap-based procedure to build confidence intervals for single components of a partially identified parameter vector, and for smooth functions of such components, in moment (in)equality models. The extreme points of our confidence interval are obtained by maximizing/minimizing the value of the component (or function) of interest subject to the sample analog of the moment (in)equality conditions properly relaxed. The novelty is that the amount of relaxation, or critical level, is computed so that the component (or function) of θ, instead of θ itself, is uniformly asymptotically covered with prespecified probability. Calibration of the critical level is based on repeatedly checking feasibility of linear programming problems, rendering it computationally attractive. Computation of the extreme points of the confidence interval is based on a novel application of the response surface method for global optimization, which may prove of independent interest also for applications of other methods of inference in the moment (in)equalities literature.

The critical level is by construction smaller (in finite sample) than the one used if projecting confidence regions designed to cover the entire parameter vector. Hence, our confidence interval is weakly shorter than the projection of established confidence sets (Andrews and Soares, 2010), if one holds the choice of tuning parameters constant. We provide simple conditions under which the comparison is strict. Our inference method controls asymptotic coverage uniformly over a large class of data-generating processes. Our assumptions and those used in the leading alternative approach (a profiling-based method) are not nested. We explain why we employ some restrictions that are not required by other methods and provide examples of models for which our method is uniformly valid but profiling-based methods are not.

]]>
http://www.ifs.org.uk/publications/8118 Tue, 05 Jan 2016 00:00:00 +0000
<![CDATA[Compactness of infinite dimensional parameter spaces]]> We provide general compactness results for many commonly used parameter spaces in nonparametric estimation. We consider three kinds of functions: (1) functions with bounded domains which satisfy standard norm bounds, (2) functions with bounded domains which do not satisfy standard norm bounds, and (3) functions with unbounded domains. In all three cases we provide two kinds of results, compact embedding and closedness, which together allow one to show that parameter spaces defined by a ||·||s-norm bound are compact under a norm ||·||c. We apply these results to nonparametric mean regression and nonparametric instrumental variables estimation.

]]>
http://www.ifs.org.uk/publications/8117 Sun, 03 Jan 2016 00:00:00 +0000
<![CDATA[The sorted effects method: discovering heterogeneous effects beyond their averages]]> The partial (ceteris paribus) effects of interest in nonlinear and interactive linear models are heterogeneous as they can vary dramatically with the underlying observed or unobserved covariates. Despite the apparent importance of heterogeneity, a common practice in modern empirical work is to largely ignore it by reporting average partial effects (or, at best, average effects for some groups, see e.g. Angrist and Pischke (2008)). While average effects provide very convenient scalar summaries of typical effects, by definition they fail to reflect the entire variety of the heterogeneous effects. To capture these effects more fully, we propose to estimate and report sorted effects: a collection of estimated partial effects sorted in increasing order and indexed by percentiles. By construction the sorted effect curves completely represent and help visualize all of the heterogeneous effects in one plot. They are as convenient and easy to report in practice as the conventional average partial effects. We also provide a quantification of uncertainty (standard errors and confidence bands) for the estimated sorted effects. We apply the sorted effects method to demonstrate several striking patterns of gender-based discrimination in wages, and of race-based discrimination in mortgage lending.

Using differential geometry and functional delta methods, we establish that the estimated sorted effects are consistent for the true sorted effects, and derive asymptotic normality and bootstrap approximation results, enabling construction of pointwise confidence bands (point-wise with respect to percentile indices). We also derive functional central limit theorems and bootstrap approximation results, enabling construction of simultaneous confidence bands (simultaneous with respect to percentile indices). The derived statistical results in turn rely on establishing Hadamard differentiability of the multivariate sorting operator, a result of independent mathematical interest.
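The core construction, sorting estimated partial effects and indexing them by percentiles, can be sketched in a few lines. This is a minimal numpy illustration that assumes the partial effects have already been estimated for each observation; it does not reproduce the paper's bootstrap confidence bands.

    import numpy as np

    def sorted_effects(partial_effects, percentiles=np.linspace(0.01, 0.99, 99)):
        """Sorted-effects curve: estimated partial effects in increasing order,
        indexed by percentiles of their distribution."""
        pe = np.sort(np.asarray(partial_effects, float))
        return percentiles, np.quantile(pe, percentiles)

    # hypothetical estimated partial effects, one per observation
    rng = np.random.default_rng(0)
    u, curve = sorted_effects(rng.normal(0.05, 0.02, size=1000))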

]]>
http://www.ifs.org.uk/publications/8098 Mon, 21 Dec 2015 00:00:00 +0000
<![CDATA[Quantile selection models: with an application to understanding changes in wage inequality]]> We propose a method to correct for sample selection in quantile regression models. Selection is modelled via the cumulative distribution function, or copula, of the percentile error in the outcome equation and the error in the participation decision. Copula parameters are estimated by minimizing a method-of-moments criterion. Given these parameter estimates, the percentile levels of the outcome are re-adjusted to correct for selection, and quantile parameters are estimated by minimizing a rotated “check” function. We apply the method to correct wage percentiles for selection into employment, using data for the UK for the period 1978-2000. We also extend the method to account for the presence of equilibrium effects when performing counterfactual exercises.

]]>
http://www.ifs.org.uk/publications/8100 Mon, 21 Dec 2015 00:00:00 +0000
<![CDATA[Income changes and their determinants over the lifecycle]]> What explains the variation in how income changes as people age? Using household panel data, we investigate the contribution of different time-varying factors in explaining variation in income changes over prime working-age life (between 35-44 and 50-59). We find that demographic changes, such as acquiring or losing a partner and the entry or exit of children to and from the household, account for a larger share of the variation in household income changes than shifts in employment status or occupation. This is particularly true for women, for whom demographic changes explain 82% of ex-post predictable variation in household income changes, compared to only 12% explained by employment status and occupation. We find a similar result when looking at the transition into retirement (between 50-59 and 66-75). These results illustrate an important limitation of the extensive literature examining consumption and savings behaviour over the lifecycle: focusing on earnings and income whilst ignoring changes in household composition excludes the largest source of ex-post predictable variation in income changes.  

 

Also available: Executive Summary

]]>
http://www.ifs.org.uk/publications/8097 Mon, 21 Dec 2015 00:00:00 +0000
<![CDATA[The housing stock, housing prices, and user costs: the roles of location, structure and unobserved quality]]> Using the English Housing Survey, we estimate a supply side selection model of the allocation of properties to the owner-occupied and rental sectors. We find that location, structure and unobserved quality are important for understanding housing prices, rents and selection. Structural characteristics and unobserved quality are important for selection. Location is not. Accounting for selection is important for estimates of rent-to-price ratios and can explain some puzzling correlations between rent-to-price ratios and homeownership rates. We interpret this as strong evidence in favor of contracting frictions in the rental market likely related to housing maintenance.

]]>
http://www.ifs.org.uk/publications/8091 Thu, 10 Dec 2015 00:00:00 +0000
<![CDATA[Identifying effects of multivalued treatments]]> Multivalued treatment models have so far been studied only under restrictive assumptions: ordered choice or, more recently, unordered monotonicity. We show how marginal treatment effects can be identified in a more general class of models. Our results rely on two main assumptions: treatment assignment must be a measurable function of threshold-crossing rules, and enough continuous instruments must be available. On the other hand, we do not require any kind of monotonicity condition. We illustrate our approach on several commonly used models, and we also discuss the identification power of discrete instruments.

]]>
http://www.ifs.org.uk/publications/8080 Tue, 08 Dec 2015 00:00:00 +0000
<![CDATA[Sanitation and child health in India]]> Our study contributes to the understanding of key drivers of stunted growth, a factor widely recognized as a major impediment to human capital development. Specifically, we examine the effects of sanitation coverage and usage on child height-for-age in a semi-urban setting in Northern India. We use instrumental variables to control for the endogeneity of sanitation coverage and usage. We find that sanitation coverage plays a significant and positive role in height growth during the first years of life.
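The estimation step can be illustrated with a generic two-stage least squares computation; the code below is a bare-bones sketch with placeholder arrays rather than the study's variables or instruments.

    import numpy as np

    def tsls(y, X, Z):
        """Two-stage least squares: X holds the regressors (including a constant
        and the endogenous sanitation measure), Z the instruments plus the
        included exogenous regressors."""
        y, X, Z = (np.asarray(a, float) for a in (y, X, Z))
        Pz_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)       # first-stage fitted values
        beta = np.linalg.solve(Pz_X.T @ X, Pz_X.T @ y)     # second-stage coefficients
        return beta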

]]>
http://www.ifs.org.uk/publications/8076 Thu, 03 Dec 2015 00:00:00 +0000
<![CDATA[A bias bound approach to nonparametric inference]]> The traditional approach to obtaining valid confidence intervals for nonparametric quantities is to select a smoothing parameter such that the bias of the estimator is negligible relative to its standard deviation. While this approach appears simple, it has two drawbacks: First, the question of optimal bandwidth selection is no longer well-defined, as it is not clear what ratio of bias to standard deviation should be considered negligible. Second, since the bandwidth choice necessarily deviates from the optimal (mean-squared-error-minimizing) bandwidth, such a confidence interval is very inefficient. To address these issues, we construct valid confidence intervals that account for the presence of a nonnegligible bias and thus make it possible to perform inference with MSE-optimal bandwidths. The key difficulty in achieving this involves finding a strict, yet feasible, bound on the bias of a nonparametric estimator. It is well known that it is not possible to consistently estimate the point-wise bias of an optimal nonparametric estimator (for otherwise, one could subtract it and obtain a faster convergence rate, violating Stone's bounds on the optimal convergence rate). Nevertheless, we find that, under minimal primitive assumptions, it is possible to consistently estimate an upper bound on the magnitude of the bias, which is sufficient to deliver a valid confidence interval whose length decreases at the optimal rate and which does not contradict Stone's results.

]]>
http://www.ifs.org.uk/publications/8069 Wed, 25 Nov 2015 00:00:00 +0000
<![CDATA[Bounding average treatment effects using linear programming]]> This paper presents a method of calculating sharp bounds on the average treatment effect using linear programming under identifying assumptions commonly used in the literature. This new method provides a sensitivity analysis of the identifying assumptions and missing data in an application regarding the effect of parent’s schooling on children’s schooling. Even a mild departure from identifying assumptions may substantially widen the bounds on average treatment effects. Allowing for a small fraction of the data to be missing also has a large impact on the results.
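A minimal sketch of the linear-programming idea for an outcome bounded in [0, 1]: the unidentified counterfactual means are decision variables, the objective is the ATE, and identifying assumptions enter as linear constraints. The monotone-treatment-response style option below is only an illustration of how assumptions tighten the bounds; the paper's specific assumptions and missing-data relaxations are not reproduced.

    import numpy as np
    from scipy.optimize import linprog

    def ate_bounds(y, d, mtr=False):
        """Bounds on the ATE for an outcome in [0, 1], posed as a linear program.
        Decision variables x = (E[Y1|D=0], E[Y0|D=1]) are the unidentified means;
        mtr=True adds monotone-treatment-response style linear constraints."""
        y, d = np.asarray(y, float), np.asarray(d, int)
        p = d.mean()
        ey1_d1, ey0_d0 = y[d == 1].mean(), y[d == 0].mean()   # identified pieces
        # ATE = const + c @ x, linear in x
        c = np.array([1 - p, -p])
        const = p * ey1_d1 - (1 - p) * ey0_d0
        A_ub = b_ub = None
        if mtr:
            # E[Y1|D=0] >= E[Y0|D=0] and E[Y0|D=1] <= E[Y1|D=1]
            A_ub = np.array([[-1.0, 0.0], [0.0, 1.0]])
            b_ub = np.array([-ey0_d0, ey1_d1])
        lo = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 2, method="highs")
        hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 2, method="highs")
        return const + lo.fun, const - hi.fun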

]]>
http://www.ifs.org.uk/publications/8048 Fri, 13 Nov 2015 00:00:00 +0000
<![CDATA[Fuzzy differences-in-differences]]> In many applications of the differences-in-differences (DID) method, the treatment increases more in the treatment group, but some units are also treated in the control group. In such fuzzy designs, a popular estimator of treatment effects is the DID of the outcome divided by the DID of the treatment, or OLS and 2SLS regressions with time and group fixed effects estimating weighted averages of this ratio across groups. We start by showing that when the treatment also increases in the control group, this ratio estimates a causal effect only if treatment effects are homogeneous in the two groups. Even when the distribution of treatment is stable, it requires that treatment effects be constant over time. As this assumption is not always applicable, we propose two alternative estimators. The first estimator relies on a generalization of common trends assumptions to fuzzy designs, while the second extends the changes-in-changes estimator of Athey and Imbens (2006). When the distribution of treatment changes in the control group, treatment effects are partially identified. Finally, we prove that our estimators are asymptotically normal and use them to revisit applied papers using fuzzy designs.
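The starting point, the DID of the outcome divided by the DID of the treatment (a 'Wald-DID'), is easy to compute directly; the sketch below assumes a simple two-group, two-period setting and does not implement the paper's proposed corrected or changes-in-changes estimators.

    import numpy as np

    def wald_did(y, d, group, post):
        """DID of the outcome divided by DID of the treatment in a 2x2 design.
        group: 1 for the treatment group, 0 for the control group; post: 1 after."""
        y, d, group, post = (np.asarray(a, float) for a in (y, d, group, post))

        def did(v):
            cell = lambda g, t: v[(group == g) & (post == t)].mean()
            return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

        return did(y) / did(d)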

]]>
http://www.ifs.org.uk/publications/8029 Mon, 26 Oct 2015 00:00:00 +0000
<![CDATA[Testing exogeneity in nonparametric instrumental variables identified by conditional quantile restrictions]]> This paper presents a test for exogeneity of explanatory variables in a nonparametric instrumental variables (IV) model whose structural function is identified through a conditional quantile restriction. Quantile regression models are increasingly important in applied econometrics. As with mean-regression models, an erroneous assumption that the explanatory variables in a quantile regression model are exogenous can lead to highly misleading results. In addition, a test of exogeneity based on an incorrectly specified parametric model can produce misleading results. This paper presents a test of exogeneity that does not assume the structural function belongs to a known finite-dimensional parametric family and does not require nonparametric estimation of this function. The latter property is important because, owing to the ill-posed inverse problem, a test based on a nonparametric estimator of the structural function has low power. The test presented here is consistent whenever the structural function differs from the conditional quantile function on a set of non-zero probability. The test has non-trivial power uniformly over a large class of structural functions that differ from the conditional quantile function by O(n^{-1/2}). The results of Monte Carlo experiments illustrate the usefulness of the test.

]]>
http://www.ifs.org.uk/publications/8028 Wed, 21 Oct 2015 00:00:00 +0000
<![CDATA[Nonparametric estimation and inference under shape restrictions]]> Economic theory often provides shape restrictions on functions of interest in applications, such as monotonicity, convexity, non-increasing (non-decreasing) returns to scale, or the Slutsky inequality of consumer theory; but economic theory does not provide finite-dimensional parametric models.  This motivates nonparametric estimation under shape restrictions.  Nonparametric estimates are often very noisy.  Shape restrictions stabilize nonparametric estimates without imposing arbitrary restrictions, such as additivity or a single-index structure, that may be inconsistent with economic theory and the data.  This paper explains how to estimate and obtain an asymptotic uniform confidence band for a conditional mean function under possibly nonlinear shape restrictions, such as the Slutsky inequality.  The results of Monte Carlo experiments illustrate the finite-sample performance of the method, and an empirical example illustrates its use in an application.

]]>
http://www.ifs.org.uk/publications/8024 Mon, 19 Oct 2015 00:00:00 +0000
<![CDATA[Group size and the efficiency of informal risk sharing]]> The objective of this paper is to understand and test empirically the relationship between group size and informal risk sharing. Models of informal risk sharing with limited commitment and grim-trigger punishments upon deviation imply that larger groups provide better informal insurance. However, when subgroups of households can credibly deviate, so that sustainable informal arrangements ought to be coalition-proof, the relationship between group size and the amount of insurance is unclear. Building on the framework of Genicot and Ray (2003), we show that this relationship is theoretically ambiguous. We then investigate it empirically using data on the size of the sibships of the household head and spouse in rural Malawi. To identify the relevant potential group within which risk is shared, we exploit a social norm among the main ethnic group in our sample under which the brothers of the wife are expected to play a key role in ensuring her household’s wellbeing. We find that households in which the wife has many brothers are not well-insured against crop loss events. Importantly, we fail to uncover a similar relationship for the sisters of the wife, ruling out that our findings are driven by wives with many siblings (e.g. brothers) having poorer extended family networks. Calibrating our theoretical framework using values similar to those in our sample produces a relationship between household risk sharing and group size that is similar to that uncovered in the data, indicating that the threat of coalitional deviations can explain our empirical findings.

]]>
http://www.ifs.org.uk/publications/8022 Fri, 16 Oct 2015 00:00:00 +0000
<![CDATA[Intergenerational mobility and the timing of parental income]]> We extend the standard intergenerational mobility literature by modelling individual outcomes as a function of the whole history of parental income, using data from Norway. We find that, conditional on permanent income, education is maximized when income is balanced between the early childhood and middle childhood years. In addition, there is an advantage to having income occur in late adolescence rather than in early childhood. These results are consistent with a model of parental investments in children with multiple periods of childhood, income shocks, imperfect insurance, dynamic complementarity and uncertainty about the production function and the ability of the child.

]]>
http://www.ifs.org.uk/publications/8016 Wed, 14 Oct 2015 00:00:00 +0000
<![CDATA[Robust confidence regions for incomplete models]]> Call an economic model incomplete if it does not generate a probabilistic prediction even given knowledge of all parameter values. We propose a method of inference about unknown parameters for such models that is robust to heterogeneity and dependence of unknown form. The key is a Central Limit Theorem for belief functions; robust confidence regions are then constructed in a fashion paralleling the classical approach. Monte Carlo simulations support tractability of the method and demonstrate its enhanced robustness relative to existing methods.

]]>
http://www.ifs.org.uk/publications/8015 Thu, 08 Oct 2015 00:00:00 +0000
<![CDATA[Partial identification in applied research: benefits and challenges]]> Advances in the study of partial identification allow applied researchers to learn about parameters of interest without making assumptions needed to guarantee point identification. We discuss the roles that assumptions and data play in partial identification analysis, with the goal of providing information to applied researchers that can help them employ these methods in practice. To this end, we present a sample of econometric models that have been used in a variety of recent applications where parameters of interest are partially identified, highlighting common features and themes across these papers. In addition, in order to help illustrate the combined roles of data and assumptions, we present numerical illustrations for a particular application, the joint determination of wages and labor supply. Finally, we discuss the benefits and challenges of using partially identified models in empirical work and point to possible avenues of future research.

]]>
http://www.ifs.org.uk/publications/8012 Wed, 07 Oct 2015 00:00:00 +0000
<![CDATA[Melting pot or salad bowl: the formation of heterogeneous communities]]> Relatively little is known about what determines whether a heterogeneous population ends up in a cooperative or divisive situation. This paper proposes a theoretical model to understand what social structures arise in heterogeneous populations. Individuals face a trade-off between cultural and economic incentives: an individual prefers to maintain his cultural practices, but doing so can inhibit interaction and economic exchange with those who adopt different practices. We find that a small minority group will adopt majority cultural practices and integrate. In contrast, minority groups above a certain critical mass may retain diverse practices and may also segregate from the majority. The size of this critical mass depends on the cultural distance between groups, the importance of culture in day-to-day life, and the costs of forming a social tie. We test these predictions using data on migrants to the United States in the era of mass migration, and find support for the existence of a critical mass of migrants above which social structure in heterogeneous populations changes discretely towards cultural distinction and segregation.

]]>
http://www.ifs.org.uk/publications/8011 Wed, 07 Oct 2015 00:00:00 +0000
<![CDATA[Characterizations of identified sets delivered by structural econometric models]]> This paper develops characterizations of identified sets of structures and structural features for complete and incomplete models involving continuous and/or discrete variables. Multiple values of unobserved variables can be associated with particular combinations of observed variables. This can arise when there are multiple sources of heterogeneity, censored or discrete endogenous variables, or inequality restrictions on functions of observed and unobserved variables. The models generalize the class of incomplete instrumental variable (IV) models in which unobserved variables are single-valued functions of observed variables. Thus the models are referred to as Generalized IV (GIV) models, but there are important cases in which instrumental variable restrictions play no significant role. The paper provides the first formal definition of observational equivalence for incomplete models. The development uses results from random set theory which guarantee that the characterizations deliver sharp bounds, thereby dispensing with the need for case-by-case proofs of sharpness. One innovation is the use of random sets defined on the space of unobserved variables. This allows identification analysis under mean and quantile independence restrictions on the distributions of unobserved variables conditional on exogenous variables as well as under a full independence restriction. It leads to a novel general characterization of identified sets of structural functions when the sole restriction on the distribution of unobserved and observed exogenous variables is that they are independently distributed. Illustrations are presented for a parametric random coefficients linear model and for a model with an interval censored outcome, in both cases with endogenous explanatory variables, and for an incomplete nonparametric model of English auctions. Numerous other applications are indicated.

]]>
http://www.ifs.org.uk/publications/8010 Mon, 05 Oct 2015 00:00:00 +0000
<![CDATA[Semiparametric model averaging of ultra-high dimensional time series]]> In this paper, we consider semiparametric model averaging of the nonlinear dynamic time series system where the number of exogenous regressors is ultra large and the number of autoregressors is moderately large. In order to accurately forecast the response variable, we propose two semiparametric approaches of dimension reduction among the exogenous regressors and auto-regressors (lags of the response variable). In the first approach, we introduce a Kernel Sure Independence Screening (KSIS) technique for the nonlinear time series setting which screens out the regressors whose marginal regression (or auto-regression) functions do not make significant contribution to estimating the joint multivariate regression function and thus reduces the dimension of the regressors from a possible exponential rate to a certain polynomial rate, typically smaller than the sample size; then we consider a semiparametric method of Model Averaging MArginal Regression (MAMAR) for the regressors and auto-regressors that survive the screening procedure, and propose a penalised MAMAR method to further select the regressors which have significant effects on estimating the multivariate regression function and predicting the future values of the response variable. In the second approach, we impose an approximate factor modelling structure on the ultra-high dimensional exogenous regressors and use a well-known principal component analysis to estimate the latent common factors, and then apply the penalised MAMAR method to select the estimated common factors and lags of the response variable which are significant. Through either of the two approaches, we can finally determine the optimal combination of the significant marginal regression and auto-regression functions. Under some regularity conditions, we derive the asymptotic properties for the two semiparametric dimension-reduction approaches. Some numerical studies including simulation and an empirical application are provided to illustrate the proposed methodology.

]]>
http://www.ifs.org.uk/publications/8009 Sun, 04 Oct 2015 00:00:00 +0000
<![CDATA[Nonparametric Euler equation identification and estimation]]> We consider nonparametric identification and estimation of pricing kernels, or equivalently of marginal utility functions up to scale, in consumption based asset pricing Euler equations. Ours is the first paper to prove nonparametric identification of Euler equations under low level conditions (without imposing functional restrictions or just assuming completeness). We also propose a novel nonparametric estimator based on our identification analysis, which combines standard kernel estimation with the computation of a matrix eigenvector problem. Our estimator avoids the ill-posed inverse issues associated with existing nonparametric instrumental variables based Euler equation estimators. We derive limiting distributions for our estimator and for relevant associated functionals. We provide a Monte Carlo analysis and an empirical application to US household-level consumption data.

]]>
http://www.ifs.org.uk/publications/8007 Thu, 01 Oct 2015 00:00:00 +0000
<![CDATA[Clinical trial design enabling ε-optimal treatment rules]]> Medical research has evolved conventions for choosing sample size in randomized clinical trials that rest on the theory of hypothesis testing. Bayesians have argued that trials should be designed to maximize subjective expected utility in settings of clinical interest. This perspective is compelling given a credible prior distribution on treatment response, but Bayesians have struggled to provide guidance on specification of priors. We use the frequentist statistical decision theory of Wald (1950) to study design of trials under ambiguity. We show that ε-optimal rules exist when trials have large enough sample size. An ε-optimal rule has expected welfare within ε of the welfare of the best treatment in every state of nature. Equivalently, it has maximum regret no larger than ε. We consider trials that draw predetermined numbers of subjects at random within groups stratified by covariates and treatments. The principal analytical findings are simple sufficient conditions on sample sizes that ensure existence of ε-optimal treatment rules when outcomes are bounded. These conditions are obtained by application of Hoeffding (1963) large deviations inequalities to evaluate the performance of empirical success rules.
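To give a flavour of the kind of condition involved (not the paper's exact statement), suppose there are two treatments, outcomes lie in [0, 1], and n subjects are assigned to each arm. A standard Hoeffding bound then caps the maximum regret of the empirical success rule at max over delta of delta*exp(-n*delta^2) = 1/sqrt(2*e*n), so any n of at least 1/(2*e*eps^2) guarantees an eps-optimal rule under these assumptions.

    import math

    def n_per_arm_for_epsilon(eps):
        """Smallest per-arm n with 1/sqrt(2*e*n) <= eps: an illustrative
        Hoeffding-based sufficient sample size for an eps-optimal empirical
        success rule with two treatments and outcomes in [0, 1]."""
        return math.ceil(1.0 / (2.0 * math.e * eps ** 2))

    print(n_per_arm_for_epsilon(0.05))   # about 74 subjects per arm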

]]>
http://www.ifs.org.uk/publications/7999 Mon, 28 Sep 2015 00:00:00 +0000
<![CDATA[Going beyond simple sample size calculations: a practitioner's guide]]> Basic methods to compute required sample sizes are well understood and supported by widely available software. However, the sophistication of the methods commonly used has not kept pace with the complexity of commonly employed experimental designs. We compile available methods for sample size calculations for continuous and binary outcomes with and without covariates, for both clustered and non-clustered RCTs. Formulae for both panel data and unbalanced designs are provided. Extensions include methods to: (1) optimise the sample when cost constraints are binding, (2) compute the power of a complex design by simulation, and (3) adjust calculations for multiple testing.
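The simplest calculation the paper builds on, the textbook per-arm sample size for detecting a difference in means with a continuous outcome, equal allocation and no clustering or covariates, can be sketched as follows; the paper's extensions (clustering, binary outcomes, unbalanced designs, multiple testing) are not covered by this snippet.

    import math
    from scipy.stats import norm

    def n_per_arm(effect, sd, alpha=0.05, power=0.80):
        """Per-arm sample size for a two-sided test of a difference in means."""
        z_alpha = norm.ppf(1 - alpha / 2)
        z_power = norm.ppf(power)
        return math.ceil(2 * sd ** 2 * (z_alpha + z_power) ** 2 / effect ** 2)

    print(n_per_arm(effect=0.2, sd=1.0))   # roughly 393 per arm for a 0.2 SD effect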

Sample size calculators accompanying this paper are available online.

]]>
http://www.ifs.org.uk/publications/7844 Mon, 28 Sep 2015 00:00:00 +0000
<![CDATA[Comparing sample survey measures of English earnings of graduates with administrative data during the Great Recession]]> This paper compares survey-based labour earnings data for English graduates, taken from the UK’s Labour Force Survey (LFS), with the UK Government’s administrative sources of official individual-level earnings data. This type of administrative data has few sample selection issues, has a substantial longitudinal dimension, and its large samples mean that the earnings of subpopulations can potentially be studied (e.g. those who study a specific subject at a specific university and graduate in a specific year). We find that, broadly, the LFS and administrative data show a similar distribution of graduates’ earnings. However, the administrative data has considerably less gender disparity, higher upper quantiles and more time-series persistence. We also report on how the distribution of graduate and non-graduate earnings fell during each year of the Great Recession.

]]>
http://www.ifs.org.uk/publications/7997 Thu, 24 Sep 2015 00:00:00 +0000
<![CDATA[Shopping around: how households adjusted food spending over the Great Recession]]> Over the Great Recession UK households reduced real food expenditure. We show that they were able to maintain the number of calories that they purchased, and the nutritional quality of these calories, by adjusting their shopping behaviour. We document the mechanisms that households used. We motivate our analysis with a model of shopping behaviour in which households adjust shopping effort and the characteristics of their shopping basket in response to economic shocks. We use detailed longitudinal data and focus on within household changes in basket characteristics and proxies for shopping effort.

]]>
http://www.ifs.org.uk/publications/7996 Wed, 23 Sep 2015 00:00:00 +0000
<![CDATA[Redistribution from a lifetime perspective]]> Most analysis of the effects of the tax and benefit system is based on snapshot information about a single cross-section of people. Such an approach gives only a partial picture because it cannot account for the fact that circumstances change over life. This paper investigates how our impression of redistribution undertaken by the tax and benefit system changes when viewed from a lifetime perspective. To do so, we simulate lifecycle data designed to be representative of the experiences of the baby-boom cohort, born 1945–54. We examine the properties of the current tax and benefit system as well as historical and hypothetical reforms from both a lifetime and a snapshot perspective. We find that much of what the tax and benefit system achieves is effectively to redistribute across periods of life and, as a result, it is much less effective at reducing lifetime inequality than inequality at a snapshot.

]]>
http://www.ifs.org.uk/publications/7986 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[A lava attack on the recovery of sums of dense and sparse signals]]> Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of non-zero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small non-zero parameters. We consider a generalization of these two basic models, termed here a “sparse + dense” model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimation methods, such as ridge estimation. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of penalty parameters, the proposed method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model. We also provide a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases, we provide Stein's unbiased estimator for lava's prediction risk. A simulation example compares the performance of lava to lasso, ridge, and elastic net in a regression example using data-dependent penalty parameters and illustrates lava's improved performance relative to these benchmarks.
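One hedged way to see the sparse + dense idea in code is to split the coefficient vector into a lasso-penalised sparse part and a ridge-penalised dense part and alternate between the two convex sub-problems. This is a sketch using off-the-shelf sklearn penalties (whose scalings differ from the paper's notation) rather than the authors' closed-form computation; the penalty levels and simulated data are placeholders.

    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    def lava_fit(X, y, lam_sparse=0.1, lam_dense=1.0, n_iter=50):
        """Alternating minimisation over a sparse part (lasso penalty) and a
        dense part (ridge penalty) of the coefficient vector."""
        p = X.shape[1]
        b_sparse, b_dense = np.zeros(p), np.zeros(p)
        lasso = Lasso(alpha=lam_sparse, fit_intercept=False)
        ridge = Ridge(alpha=lam_dense, fit_intercept=False)
        for _ in range(n_iter):
            b_dense = ridge.fit(X, y - X @ b_sparse).coef_
            b_sparse = lasso.fit(X, y - X @ b_dense).coef_
        return b_sparse + b_dense, b_sparse, b_dense

    # placeholder data: a few large coefficients plus many small ones
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))
    beta = rng.normal(scale=0.05, size=50)
    beta[:3] += 2.0
    y = X @ beta + rng.normal(size=200)
    b_hat, b_s, b_d = lava_fit(X, y)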

]]>
http://www.ifs.org.uk/publications/7989 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[Constrained conditional moment restriction models]]> This paper examines a general class of inferential problems in semiparametric and nonparametric models defined by conditional moment restrictions. We construct tests for the hypothesis that at least one element of the identified set satisfies a conjectured (Banach space) “equality” and/or (a Banach lattice) “inequality” constraint. Our procedure is applicable to identified and partially identified models, and is shown to control the level, and under some conditions the size, asymptotically uniformly in an appropriate class of distributions. The critical values are obtained by building a strong approximation to the statistic and then bootstrapping a (conservatively) relaxed form of the statistic. Sufficient conditions are provided, including strong approximations using Koltchinskii's coupling.

Leading important special cases encompassed by the framework we study include: (i) Tests of shape restrictions for infinite dimensional parameters; (ii) Confidence regions for functionals that impose shape restrictions on the underlying parameter; (iii) Inference for functionals in semiparametric and nonparametric models defined by conditional moment (in)equalities; and (iv) Uniform inference in possibly nonlinear and severely ill-posed problems.

]]>
http://www.ifs.org.uk/publications/7992 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[Monge-Kantorovich depth, quantiles, ranks and signs]]> We propose new concepts of statistical depth, multivariate quantiles, vector quantiles and ranks, and signs, based on canonical transportation maps between a distribution of interest on R^d and a reference distribution on the d-dimensional unit ball. The new depth concept, called Monge-Kantorovich depth, specializes to halfspace depth for d = 1 and in the case of spherical distributions, but, for more general distributions, differs from the latter in the ability of its contours to account for non-convex features of the distribution of interest. We propose empirical counterparts to the population versions of those Monge-Kantorovich depth contours, quantiles, ranks, signs, and vector quantiles and ranks, and show their consistency by establishing a uniform convergence property for empirical (forward and reverse) transport maps, which is the main theoretical result of this paper.
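A finite-sample counterpart of the transport maps can be computed by optimally matching the observations to a reference sample drawn uniformly on the unit ball; the assigned reference points then play the role of vector ranks. This is only an illustrative discrete-matching sketch, not the estimators whose uniform consistency is established in the paper.

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from scipy.spatial.distance import cdist

    def empirical_vector_ranks(Y, seed=0):
        """Match each observation to a point of a uniform reference sample on the
        unit ball via the optimal (squared-Euclidean) assignment."""
        Y = np.asarray(Y, float)
        n, d = Y.shape
        rng = np.random.default_rng(seed)
        g = rng.normal(size=(n, d))
        g /= np.linalg.norm(g, axis=1, keepdims=True)          # uniform directions
        U = g * rng.uniform(size=(n, 1)) ** (1.0 / d)          # uniform on the ball
        row, col = linear_sum_assignment(cdist(Y, U, "sqeuclidean"))
        ranks = np.empty_like(U)
        ranks[row] = U[col]
        return ranks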

]]>
http://www.ifs.org.uk/publications/7990 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[Vector quantile regression: an optimal transport approach]]> We propose a notion of conditional vector quantile function and a vector quantile regression. A conditional vector quantile function (CVQF) of a random vector Y, taking values in R^d given covariates Z = z, taking values in R^k, is a map u --> Q_{Y|Z}(u,z), which is monotone, in the sense of being a gradient of a convex function, and such that, given that the vector U follows a reference non-atomic distribution F_U, for instance the uniform distribution on a unit cube in R^d, the random vector Q_{Y|Z}(U,z) has the distribution of Y conditional on Z = z. Moreover, we have a strong representation, Y = Q_{Y|Z}(U,Z) almost surely, for some version of U. The vector quantile regression (VQR) is a linear model for the CVQF of Y given Z. Under correct specification, the notion produces a strong representation, Y = β(U)^T f(Z), for f(Z) denoting a known set of transformations of Z, where u --> β(u)^T f(Z) is a monotone map, the gradient of a convex function, and the quantile regression coefficients u --> β(u) have interpretations analogous to those of standard scalar quantile regression. As f(Z) becomes a richer class of transformations of Z, the model becomes nonparametric, as in series modelling. A key property of VQR is the embedding of the classical Monge-Kantorovich optimal transportation problem at its core as a special case. In the classical case, where Y is scalar, VQR reduces to a version of the classical QR, and the CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.

]]>
http://www.ifs.org.uk/publications/7991 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[Program evaluation with high-dimensional data]]> In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment condition framework, where we work with possibly a continuum of moments. We provide results on honest inference for (function-valued) parameters within this general framework where modern machine learning methods are used to fit the nonparametric/high-dimensional components of the model. These include a number of supporting new results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity-based estimation of regression functions for function-valued outcomes.
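A minimal cross-fitted, doubly robust (orthogonal moment) sketch for the unconditional ATE conveys the key ingredient; the learners, tuning and trimming below are arbitrary placeholders, and none of the paper's LATE/LQTE, functional-outcome or lasso-based machinery is reproduced.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
    from sklearn.model_selection import KFold

    def aipw_ate(y, d, X, n_splits=5, seed=0):
        """Cross-fitted augmented inverse-probability-weighted ATE with a
        plug-in standard error based on the orthogonal moment."""
        y, d, X = np.asarray(y, float), np.asarray(d, int), np.asarray(X, float)
        psi = np.zeros(len(y))
        for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
            ps = RandomForestClassifier(random_state=seed).fit(X[train], d[train])
            m1 = RandomForestRegressor(random_state=seed).fit(
                X[train][d[train] == 1], y[train][d[train] == 1])
            m0 = RandomForestRegressor(random_state=seed).fit(
                X[train][d[train] == 0], y[train][d[train] == 0])
            p = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)  # crude trimming
            mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
            psi[test] = (mu1 - mu0
                         + d[test] * (y[test] - mu1) / p
                         - (1 - d[test]) * (y[test] - mu0) / (1 - p))
        return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))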

]]>
http://www.ifs.org.uk/publications/7988 Tue, 22 Sep 2015 00:00:00 +0000
<![CDATA[Unemployment cycles]]> The labor market by itself can create cyclical outcomes, even in the absence of exogenous shocks. We propose a theory that shows that the search behavior of the employed has profound aggregate implications for the unemployed. There is a strategic complementarity between active on-the-job search and vacancy posting by firms: active search changes the number of searchers and the duration of a job, and in the presence of sorting, it improves the quality of the pool of searchers. More vacancy posting in turn makes costly on-the-job search more attractive, a self-fulfilling belief. The absence of on-the-job search discourages vacancy posting, rendering costly on-the-job search unattractive. This model of multiple equilibria can account for large fluctuations in vacancies, unemployment, and job-to-job transitions; it provides a rationale for the Jobless Recovery through a novel channel of the employed searchers crowding out the unemployed; and it gives rise to a shift in the Beveridge Curve (the unemployment-vacancy locus). Each of these phenomena is matched in the data.

]]>
http://www.ifs.org.uk/publications/7984 Wed, 16 Sep 2015 00:00:00 +0000
<![CDATA[Earnings and consumption dynamics: a nonlinear panel data framework]]> We develop a new quantile-based panel data framework to study the nature of income persistence and the transmission of income shocks to consumption. Log-earnings are the sum of a general Markovian persistent component and a transitory innovation. The persistence of past shocks to earnings is allowed to vary according to the size and sign of the current shock. Consumption is modeled as an age-dependent nonlinear function of assets and the two earnings components. We establish the nonparametric identification of the nonlinear earnings process and the consumption policy rule. Exploiting the enhanced consumption and asset data in recent waves of the Panel Study of Income Dynamics, we find nonlinear persistence and conditional skewness to be key features of the earnings process. We show that the impact of earnings shocks varies substantially across earnings histories, and that this nonlinearity drives heterogeneous consumption responses. The transmission of shocks is found to vary systematically with assets.

]]>
http://www.ifs.org.uk/publications/7981 Mon, 14 Sep 2015 00:00:00 +0000
<![CDATA[Revealed preferences over risk and uncertainty]]> Consider a finite data set where each observation consists of a bundle of contingent consumption chosen by an agent from a constraint set of such bundles. We develop a general procedure for testing the consistency of this data set with a broad class of models of choice under risk and under uncertainty. Unlike previous work, we do not require that the agent has a convex preference, so we allow for risk loving and elation seeking behavior. Our procedure can also be extended to calculate the magnitude of violations from a particular model of choice, using an index first suggested by Afriat (1972, 1973). We then apply this index to evaluate different models (including expected utility and disappointment aversion) in the data collected by Choi et al. (2007). We show that among those subjects exhibiting choice behavior consistent with the maximization of some increasing utility function, more than half are consistent with models of expected utility and disappointment aversion.
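The basic consistency check underlying this kind of exercise can be sketched for the textbook case of linear budgets: build the directly-revealed-preference relation, take its transitive closure, and look for GARP violations. The paper's setting of contingent consumption chosen from general constraint sets, its treatment of non-convex preferences and the Afriat-style violation index go well beyond this snippet.

    import numpy as np

    def satisfies_garp(prices, quantities):
        """GARP check for T observations with linear budgets: prices and
        quantities are (T, k) arrays of prices and chosen bundles."""
        P, Q = np.asarray(prices, float), np.asarray(quantities, float)
        own = np.einsum("tk,tk->t", P, Q)     # p_t' q_t
        cross = P @ Q.T                       # p_t' q_s
        R = own[:, None] >= cross             # t directly revealed preferred to s
        S = own[:, None] > cross              # strict version
        Rc = R.copy()                         # transitive closure (Floyd-Warshall)
        for k in range(len(P)):
            Rc = Rc | (Rc[:, [k]] & Rc[[k], :])
        # violation: t revealed preferred to s while s strictly directly preferred to t
        return not np.any(Rc & S.T)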

]]>
http://www.ifs.org.uk/publications/7982 Mon, 14 Sep 2015 00:00:00 +0000
<![CDATA[Inference for functions of partially identified parameters in moment inequality models]]> This paper introduces a bootstrap-based inference method for functions of the parameter vector in a moment (in)equality model. As a special case, our method yields marginal confidence sets for individual coordinates of this parameter vector. Our inference method controls asymptotic size uniformly over a large class of data distributions. The current literature describes only two other procedures that deliver uniform size control for this type of problem: projection-based and subsampling inference. Relative to projection-based procedures, our method presents three advantages: (i) it weakly dominates in terms of finite sample power, (ii) it strictly dominates in terms of asymptotic power, and (iii) it is typically less computationally demanding. Relative to subsampling, our method presents two advantages: (i) it strictly dominates in terms of asymptotic power (for reasonable choices of subsample size), and (ii) it appears to be less sensitive to the choice of its tuning parameter than subsampling is to the choice of subsample size.

]]>
http://www.ifs.org.uk/publications/7978 Tue, 08 Sep 2015 00:00:00 +0000
<![CDATA[Econometrics of network models]]> In this article I provide a (selective) review of the recent econometric literature on networks. I start with a discussion of developments in the econometrics of group interactions. I subsequently provide a description of statistical and econometric models for network formation and approaches for the joint determination of networks and interactions mediated through those networks. Finally, I give a very brief discussion of measurement issues in both outcomes and networks. My focus is on identification and computational issues, but estimation aspects are also discussed.

]]>
http://www.ifs.org.uk/publications/7969 Mon, 07 Sep 2015 00:00:00 +0000
<![CDATA[Earnings and consumption dynamics: a nonlinear panel data framework]]> We develop a new quantile-based panel data framework to study the nature of income persistence and the transmission of income shocks to consumption. Log-earnings are the sum of a general Markovian persistent component and a transitory innovation. The persistence of past shocks to earnings is allowed to vary according to the size and sign of the current shock. Consumption is modeled as an age-dependent nonlinear function of assets and the two earnings components. We establish the nonparametric identification of the nonlinear earnings process and the consumption policy rule. Exploiting the enhanced consumption and asset data in recent waves of the Panel Study of Income Dynamics, we find nonlinear persistence and conditional skewness to be key features of the earnings process. We show that the impact of earnings shocks varies substantially across earnings histories, and that this nonlinearity drives heterogeneous consumption responses. The transmission of shocks is found to vary systematically with assets.

]]>
http://www.ifs.org.uk/publications/7970 Mon, 07 Sep 2015 00:00:00 +0000
<![CDATA[Nonparametric identification of endogenous and heterogeneous aggregate demand models: complements, bundles and the market level]]> This paper studies nonparametric identification in market level demand models for differentiated products. We generalize common models by allowing for the distribution of heterogeneity parameters (random coefficients) to have a nonparametric distribution across the population and give conditions under which the density of the random coefficients is identified. We show that key identifying restrictions are provided by (i) a set of moment conditions generated by instrumental variables together with an inversion of aggregate demand in unobserved product characteristics; and (ii) an integral transform (Radon transform) that maps the random coefficient density to the aggregate demand. This feature is shown to be common across a wide class of models, and we illustrate this by studying leading demand models. Our examples include demand models based on the multinomial choice (Berry, Levinsohn, Pakes, 1995), the choice of bundles of goods that can be substitutes or complements, and the choice of goods consumed in multiple units.

]]>
http://www.ifs.org.uk/publications/7966 Thu, 03 Sep 2015 00:00:00 +0000
<![CDATA[Identification and estimation of preference distributions when voters are ideological]]> This paper studies the nonparametric identification and estimation of voters' preferences when voters are ideological. We establish that voter preference distributions and other parameters of interest can be identified from aggregate electoral data. We also show that these objects can be consistently estimated and illustrate our analysis by performing an actual estimation using data from the 1999 European Parliament elections.

]]>
http://www.ifs.org.uk/publications/7965 Wed, 02 Sep 2015 00:00:00 +0000
<![CDATA[Optimal bandwidth selection for the fuzzy regression discontinuity estimator]]> A new bandwidth selection method for the fuzzy regression discontinuity estimator is proposed. The method chooses two bandwidths simultaneously, one for each side of the cut-off point, using a criterion based on the estimated asymptotic mean squared error that takes into account a second-order bias term. A simulation study demonstrates the usefulness of the proposed method.

]]>
http://www.ifs.org.uk/publications/7964 Tue, 01 Sep 2015 00:00:00 +0000
<![CDATA[A tax micro-simulator for Mexico (MEXTAX) and its application to the 2010 tax reforms]]> We develop a tax micro-simulator model (MEXTAX) that can quantify the revenue and distributional impact of tax reforms in Mexico using micro-level data. We use MEXTAX to assess revenue-raising reforms to Mexico’s direct and indirect tax systems of 2010. Initial proposals by the Executive Power included the introduction of a uniform expenditure tax covering traditionally untaxed necessities (such as food). The reform approved by the Congress replaced this with an increase in the standard (non-uniform) rate of VAT, to avoid regressive impacts. Both reform packages included other minor changes to income tax and excise duties. We argue that, given that indirect taxes were changed the most in both reforms, expenditure should be used to measure living standards and proportional progressivity. We find that both the reform package proposed and the reform package approved are progressive if expenditure is used as a measure of living standards, although this is not the case for the proposed reform if income is used. However, the proposed reform would have raised more revenues than the approved reform, and we argue that the foregone revenues due to the amendments could have been used to target poorer households more effectively using more direct instruments for redistribution. We also find that alternative assumptions about missing income or the labor supply response affect the results quantitatively, but not qualitatively. The model can be extended to incorporate further behavioral margins and to other countries with similar tax structures.

]]>
http://www.ifs.org.uk/publications/7962 Wed, 26 Aug 2015 00:00:00 +0000
<![CDATA[New joints: private providers and rising demand in the English National Health Service]]> This paper investigates how changes in hospital choice sets affect levels of patient demand for elective hospital care. We exploit a set of reforms in England that opened up the market for publicly-funded patients to private hospitals. Impacts on demand are estimated using variation in distance to these private hospitals, within regions where supply constraints are fixed. We find that the reforms increased demand for publicly-funded procedures. For public hospitals, volumes remained unchanged but waiting times fell. Taken together, our results provide new insights into how individuals make choices about their care and the scope of competition between hospitals.

]]>
http://www.ifs.org.uk/publications/7961 Wed, 26 Aug 2015 00:00:00 +0000
<![CDATA[Public hospital spending in England: evidence from National Health Service administrative records]]> Health spending per capita in England has more than doubled since 1997, yet relatively little is known about how that spending is distributed across the population. This paper uses administrative National Health Service (NHS) hospital records to examine key features of public hospital spending in England. We describe how costs vary across the lifecycle, and the concentration of spending among people and over time. We find that costs per person start to increase after age 50 and escalate after age 70. Spending is highly concentrated in a small section of the population, with 32% of all hospital spending accounted for by 1% of the general population, and 18% of spending by 1% of all patients. There is persistence in spending over time: patients with high spending are more likely to have spending in subsequent years, and those with zero expenditure are more likely to remain out of hospital.

]]>
http://www.ifs.org.uk/publications/7960 Wed, 26 Aug 2015 00:00:00 +0000
<![CDATA[Identification in differentiated product markets]]> Empirical models of demand for, and often supply of, differentiated products are widely used in practice, typically employing parametric functional forms and distributions of consumer heterogeneity. We review some recent work studying identification in a broad class of such models. This work shows that parametric functional forms and distributional assumptions are not essential for identification. Rather, identification relies primarily on the standard requirement that instruments be available for the endogenous variables (here, typically, prices and quantities). We discuss the kinds of instruments needed for identification and how the reliance on instruments can be reduced by nonparametric functional form restrictions or better data. We also discuss results on discrimination between alternative models of oligopoly competition.

]]>
http://www.ifs.org.uk/publications/7950 Wed, 19 Aug 2015 00:00:00 +0000
<![CDATA[Identification of nonparametric simultaneous equations models with a residual index structure]]> We present new results on the identifiability of a class of nonseparable nonparametric simultaneous equations models introduced by Matzkin (2008). These models combine exclusion restrictions with a requirement that each structural error enter through a “residual index”. Our identification results encompass a variety of special cases allowing tradeoffs between the exogenous variation required of instruments and restrictions on the joint density of structural errors. Among these special cases are results avoiding any density restriction and results allowing instruments with arbitrarily small support.

]]>
http://www.ifs.org.uk/publications/7951 Wed, 19 Aug 2015 00:00:00 +0000
<![CDATA[Model averaging in semiparametric estimation of treatment effects]]> In the practice of program evaluation, the choice of covariates and of the functional form of the propensity score is an important one that researchers make when estimating treatment effects. This paper proposes a data-driven way of averaging the estimators over the candidate specifications in order to resolve the issue of specification uncertainty in the propensity score weighting estimation of the average treatment effect on the treated (ATT). The proposed averaging procedures aim to minimize the estimated mean squared error (MSE) of the ATT estimator in a local asymptotic framework. We formulate model averaging as a statistical decision problem in a limit experiment, and derive an averaging scheme that is Bayes optimal with respect to a given prior for the localization parameters. Analytical comparisons of the Bayes asymptotic MSE show that the averaging estimator outperforms post-model-selection estimators and the estimators in any of the candidate models. Our Monte Carlo studies confirm these theoretical results and illustrate the size of the MSE gains from averaging. We apply the averaging procedure to evaluate the effect of the labor market program analyzed in LaLonde (1986).
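For illustration only, here is a minimal sketch of inverse-propensity-weighted ATT estimation under two candidate propensity score specifications, combined with a fixed averaging weight. The data-generating process, variable names and the weight are hypothetical, and the paper's MSE-minimizing weight choice is not reproduced.

```python
# Hedged sketch: IPW estimates of the ATT under two candidate propensity-score
# specifications, then a simple weighted average. Everything here is illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
ps_true = 1 / (1 + np.exp(-(0.5 * x[:, 0] + 0.5 * x[:, 1] ** 2)))
d = rng.binomial(1, ps_true)                      # treatment indicator
y = 1.0 * d + x[:, 0] + rng.normal(size=n)        # outcome with true ATT = 1

def att_ipw(y, d, ps):
    # standard IPW formula for the average treatment effect on the treated
    w = ps * (1 - d) / (1 - ps)
    return y[d == 1].mean() - np.sum(w * y) / np.sum(w)

X1 = sm.add_constant(x)                                   # candidate 1: linear index
X2 = sm.add_constant(np.column_stack([x, x[:, 1] ** 2]))  # candidate 2: adds a square
ps1 = sm.Logit(d, X1).fit(disp=0).predict(X1)
ps2 = sm.Logit(d, X2).fit(disp=0).predict(X2)

att1, att2 = att_ipw(y, d, ps1), att_ipw(y, d, ps2)
w_avg = 0.5   # placeholder; the paper chooses weights to minimize an estimated MSE
print(att1, att2, w_avg * att1 + (1 - w_avg) * att2)
```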

]]>
http://www.ifs.org.uk/publications/7946 Thu, 13 Aug 2015 00:00:00 +0000
<![CDATA[Mutually consistent revealed preference bounds]]> Revealed preference restrictions are increasingly used to bound demand responses and as shape restrictions in nonparametric estimation exercises. However, the restrictions imposed are not sufficient for rationality when predictions are made at more than a single price regime. We highlight the nonlinearities in revealed preference restrictions and the nonconvexities in the set of predictions that arise when making multiple predictions. We develop a mixed integer programming characterisation of the problem that can be used to impose rationality on multiple predictions. The approach is applied to the UK Family Expenditure Survey to recover jointly rational nonparametric estimates of income expansion paths.

]]>
http://www.ifs.org.uk/publications/7939 Tue, 11 Aug 2015 00:00:00 +0000
<![CDATA[Wage regulation and the quality of police officer recruits]]> The paper analyses the impact of centrally regulated pay on the quality of applicants to be police officers in England and Wales using a unique dataset of individual test scores from the national assessment that is required of all applicants. It provides empirical evidence of two distinct channels through which centrally regulated pay induces variation in the quality of applicants. First, national wage setting implies that relative wages between the police and other occupations vary spatially. We show that higher outside wages are associated with lower quality applicants, using several spatially-varying measures of outside wages. Second, nationally-set wages cannot adjust to reflect spatial variation in the disamenity of an occupation. We demonstrate that a greater disamenity of policing (as measured primarily by area differences in crime rates and in the proportion of crime that is violent) is also associated with lower quality police applicants. 

]]>
http://www.ifs.org.uk/publications/7937 Tue, 11 Aug 2015 00:00:00 +0000
<![CDATA[The influence function of semiparametric estimators]]> Often semiparametric estimators are asymptotically equivalent to a sample average. The object being averaged is referred to as the influence function. The influence function is useful in formulating primitive regularity conditions for asymptotic normality, in efficiency comparisons, for bias reduction, and for analyzing robustness. We show that the influence function of a semiparametric estimator can be calculated as the limit of the Gateaux derivative of a parameter with respect to a smooth deviation as the deviation approaches a point mass. We also consider high level and primitive regularity conditions for validity of the influence function calculation. The conditions involve Frechet differentiability, nonparametric convergence rates, stochastic equicontinuity, and small bias conditions. We apply these results to examples.
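As a reminder of the object being computed (generic notation, not necessarily the paper's), the Gateaux-derivative characterization described above can be written as

\[
\psi(z) \;=\; \lim_{h \to 0}\, \frac{\partial}{\partial \tau}\, \theta\big((1-\tau)F_0 + \tau\, G_h^{z}\big)\Big|_{\tau = 0},
\qquad
\sqrt{n}\big(\hat\theta - \theta_0\big) \;=\; \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi(Z_i) + o_p(1),
\]

where $F_0$ is the true distribution, $G_h^{z}$ is a smooth deviation that approaches a point mass at $z$ as $h \to 0$, and $\psi$ is the influence function.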

]]>
http://www.ifs.org.uk/publications/7935 Fri, 07 Aug 2015 00:00:00 +0000
<![CDATA[Inference under covariate-adaptive randomization]]> This paper studies inference for the average treatment effect in randomized controlled trials with covariate-adaptive randomization. Here, by covariate-adaptive randomization, we mean randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve 'balance' within each stratum. Such schemes include, for example, Efron's biased-coin design and stratified block randomization. When testing the null hypothesis that the average treatment effect equals a pre-specified value in such settings, we first show that the usual two-sample t-test is conservative in the sense that it has limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level. In a simulation study, we find that the rejection probability may in fact be dramatically less than the nominal level. We show further that these same conclusions remain true for a naïve permutation test, but that a modified version of the permutation test yields a test that is non-conservative in the sense that its limiting rejection probability under the null hypothesis equals the nominal level. The modified version of the permutation test has the additional advantage that it has rejection probability exactly equal to the nominal level for some distributions satisfying the null hypothesis. Finally, we show that the usual t-test (on the coefficient on treatment assignment) in a linear regression of outcomes on treatment assignment and indicators for each of the strata yields a non-conservative test as well. In a simulation study, we find that the non-conservative tests have substantially greater power than the usual two-sample t-test.
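A toy sketch of the comparison described above, with a hypothetical data-generating process: stratified block randomization, the usual two-sample t-test, and the t-test on the treatment coefficient from a regression with strata indicators.

```python
# Illustrative only: compare the two-sample t-test with the strata-indicator
# regression t-test under covariate-adaptive (stratified) randomization.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(1)
n = 1000
stratum = rng.integers(0, 4, size=n)          # strata formed from baseline covariates
d = np.zeros(n, dtype=int)
for s in range(4):                            # stratified block randomization
    idx = np.where(stratum == s)[0]
    d[rng.permutation(idx)[: len(idx) // 2]] = 1
y = 1.0 * d + 0.5 * stratum + rng.normal(size=n)

print(stats.ttest_ind(y[d == 1], y[d == 0]))  # usual two-sample t-test (conservative here)

df = pd.DataFrame({"y": y, "d": d, "stratum": stratum})
res = smf.ols("y ~ d + C(stratum)", data=df).fit()
print(res.params["d"], res.tvalues["d"])      # strata-adjusted t-test (non-conservative)
```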

]]>
http://www.ifs.org.uk/publications/7936 Fri, 07 Aug 2015 00:00:00 +0000
<![CDATA[An econometric model of link formation with degree heterogeneity]]> I formulate and study a model of undirected dyadic link formation which allows for assortative matching on observed agent characteristics (homophily) as well as unrestricted agent-level heterogeneity in link surplus (degree heterogeneity). Similar to fixed effects panel data analyses, the joint distribution of observed and unobserved agent-level characteristics is left unrestricted. To motivate the introduction of degree heterogeneity, as well as its fixed effect treatment, I show how its presence can bias conventional homophily measures. Two estimators for the (common) homophily parameter, beta0, are developed and their properties studied under an asymptotic sequence involving a single network growing large. The first, the tetrad logit (TL) estimator, conditions on a sufficient statistic for the degree heterogeneity. The TL estimator is a fourth-order U-process minimizer. Although the fourth-order summation in the TL criterion function is over the i = 1...N agents in the network, due to a degeneracy property, the leading variance term of hat-beta_TL is of order 1/n, where n = N*(N-1)/2 equals the number of observed dyads. Using martingale theory, I show that the limiting distribution of hat-beta_TL (appropriately scaled and normalized) is normal. The second, the joint maximum likelihood (JML) estimator, treats the degree heterogeneity as additional (incidental) parameters to be estimated. The properties of hat-beta_JML are also non-standard due to a parameter space which grows with the size of the network. Adapting and extending recent results from random graph theory and non-linear panel data analysis (e.g., Chatterjee, Diaconis and Sly, 2011; Hahn and Newey, 2004), I show that the limit distribution of hat-beta_JML is also normal, but contains a bias term. Accurate inference necessitates bias correction. The TL estimate is consistent under sparse graph sequences, where the number of links per agent is small relative to the total number of agents, as well as dense graph sequences, where the number of links per agent is proportional to the total number of agents in the limit. Consistency of the JML estimate, in contrast, is shown only under dense graph sequences. The finite sample properties of hat-beta_TL and hat-beta_JML are explored in a series of Monte Carlo experiments.
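For concreteness, a link rule of the kind studied here, with logistic surplus shocks, can be sketched as follows (a generic statement in our notation, with $\beta_0$ the homophily parameter written beta0 above; it is not a verbatim reproduction of the paper's model):

\[
\Pr\big(D_{ij}=1 \mid W_{ij}, A_i, A_j\big)
\;=\;
\frac{\exp\big(W_{ij}'\beta_0 + A_i + A_j\big)}{1 + \exp\big(W_{ij}'\beta_0 + A_i + A_j\big)},
\]

where $W_{ij}$ collects observed dyad-level covariates (the homophily channel) and $A_i$, $A_j$ are the unrestricted agent-specific degree-heterogeneity terms.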

]]>
http://www.ifs.org.uk/publications/7930 Tue, 04 Aug 2015 00:00:00 +0000
<![CDATA[Simultaneous selection of optimal bandwidths for the sharp regression discontinuity estimator]]> A new bandwidth selection rule that uses different bandwidths for the local linear regression estimators on the left and the right of the cut-off point is proposed for the sharp regression discontinuity estimator of the mean program impact at the cut-off point. The asymptotic mean squared error of the estimator using the proposed bandwidth selection rule is shown to be smaller than that obtained with other bandwidth selection rules proposed in the literature. An extensive simulation study shows that the proposed method's performance for sample sizes of 500, 2000, and 5000 closely matches the theoretical predictions.
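Below is a minimal sketch of the estimator being tuned, namely a sharp RD estimate built from two local linear regressions with different bandwidths on each side of the cut-off. The bandwidth values, kernel and data are placeholders rather than the paper's selection rule.

```python
# Hedged sketch: sharp RD point estimate with separate left/right bandwidths.
import numpy as np

def local_linear_at_cutoff(x, y, h, cutoff=0.0):
    # triangular-kernel weighted linear fit, evaluated at the cut-off
    u = (x - cutoff) / h
    w = np.clip(1 - np.abs(u), 0, None)
    X = np.column_stack([np.ones_like(x), x - cutoff])
    Xw = X * w[:, None]
    beta = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return beta[0]

rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(-1, 1, n)                     # running variable, cut-off at 0
y = 0.5 * (x >= 0) + x + 0.3 * x**2 + rng.normal(scale=0.2, size=n)

h_left, h_right = 0.3, 0.4                    # different bandwidths on each side
tau_hat = (local_linear_at_cutoff(x[x >= 0], y[x >= 0], h_right)
           - local_linear_at_cutoff(x[x < 0], y[x < 0], h_left))
print(tau_hat)                                # estimated jump at the cut-off (true value 0.5)
```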

Supplementary material for this paper is available here.

]]>
http://www.ifs.org.uk/publications/7928 Thu, 30 Jul 2015 00:00:00 +0000
<![CDATA[Global engagement in R&D: a portrait of biopharmaceutical patenting firms]]> This paper provides a novel portrait of firms engaging in the international use of inventors. I focus on drug discovery activity of pharmaceutical and biotechnological firms headquartered in Europe, over the period 1996-2005. An important part of the highest value-added R&D activities is conducted by inventors, who are engaged in the creation of new technologies. I use a novel and particularly rich dataset that provides a comparable picture across host locations and over time of research activity of EU firms. The main results are that firm-level heterogeneity is a key feature in the internationalisation of inventors, and this is similar to patterns from data analysing goods and services traders and MNEs. Furthermore, host country distance characteristics are associated with the number of inventors in a similar fashion to patterns found in gravity models explaining goods and services trade.

]]>
http://www.ifs.org.uk/publications/7912 Fri, 24 Jul 2015 00:00:00 +0000
<![CDATA[Finite sample bias corrected IV estimation for weak and many instruments]]> This paper considers the finite sample distribution of the 2SLS estimator and derives bounds on its exact bias in the presence of weak and/or many instruments. We then contrast the behavior of the exact bias expressions and the asymptotic expansions currently popular in the literature, including a consideration of the no-moment problem exhibited by many Nagar-type estimators. After deriving a finite sample unbiased k-class estimator, we introduce a double k-class estimator based on Nagar (1962) that dominates k-class estimators (including 2SLS), especially in the cases of weak and/or many instruments. We demonstrate these properties in Monte Carlo simulations showing that our preferred estimators outperform Fuller (1977) estimators in terms of mean bias and MSE.
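For reference, the textbook k-class family mentioned above (standard definitions, not this paper's new estimators) is, in the model $y = X\beta + u$ with instrument matrix $Z$,

\[
\hat\beta(k) \;=\; \big(X'(I - k M_Z)X\big)^{-1} X'(I - k M_Z)\,y,
\qquad
M_Z \;=\; I - Z(Z'Z)^{-1}Z',
\]

so that $k = 0$ gives OLS and $k = 1$ gives 2SLS; double k-class estimators allow two different values of $k$ in the two places it appears.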

]]>
http://www.ifs.org.uk/publications/7907 Tue, 21 Jul 2015 00:00:00 +0000
<![CDATA[Nonlinear panel data estimation via quantile regressions]]> We introduce a class of quantile regression estimators for short panels. Our framework covers static and dynamic autoregressive models, models with general predetermined regressors, and models with multiple individual effects. We use quantile regression as a flexible tool to model the relationships between outcomes, covariates, and heterogeneity. We develop an iterative simulation-based approach for estimation, which exploits the computational simplicity of ordinary quantile regression in each iteration step. Finally, an application to measure the effect of smoking during pregnancy on children’s birthweights completes the paper.

]]>
http://www.ifs.org.uk/publications/7877 Wed, 15 Jul 2015 00:00:00 +0000
<![CDATA[Nonparametric instrumental variable estimation under monotonicity]]> The ill-posedness of the inverse problem of recovering a regression function in a nonparametric instrumental variable model leads to estimators that may suffer from a very slow, logarithmic rate of convergence. In this paper, we show that restricting the problem to models with monotone regression functions and monotone instruments significantly weakens the ill-posedness of the problem. In stark contrast to the existing literature, the presence of a monotone instrument implies boundedness of our measure of ill-posedness when restricted to the space of monotone functions. Based on this result we derive a novel non-asymptotic error bound for the constrained estimator that imposes monotonicity of the regression function. For a given sample size, the bound is independent of the degree of ill-posedness as long as the regression function is not too steep. As an implication, the bound allows us to show that the constrained estimator converges at a fast, polynomial rate, independently of the degree of ill-posedness, in a large, but slowly shrinking neighborhood of constant functions. Our simulation study demonstrates significant finite-sample performance gains from imposing monotonicity even when the regression function is rather far from being a constant. We apply the constrained estimator to the problem of estimating gasoline demand functions from U.S. data.
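As a reminder of the setting (standard NPIV notation, which may differ in detail from the paper's), the model is

\[
Y \;=\; h(X) + \varepsilon, \qquad \mathbb{E}\big[\varepsilon \mid W\big] = 0,
\]

with the regression function $h$ restricted to be monotone; the monotone-instrument condition additionally restricts how the conditional distribution of the endogenous regressor $X$ shifts with the instrument $W$.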

]]>
http://www.ifs.org.uk/publications/7875 Mon, 13 Jul 2015 00:00:00 +0000
<![CDATA[A discrete model for bootstrap iteration]]> In an attempt to free bootstrap theory from the shackles of asymptotic considerations, this paper studies the possibility of justifying, or validating, the bootstrap, not by letting the sample size tend to infinity, but by considering the sequence of bootstrap P values obtained by iterating the bootstrap. The main idea of the paper is that, if this sequence converges to a random variable that follows the uniform U(0, 1) distribution, then the bootstrap is valid. The idea is studied by making the model under test discrete and finite, so that it is characterised by a finite three-dimensional array of probabilities. This device, when available, renders bootstrap iteration to any desired order feasible. It is used for studying a unit-root test for a process driven by a stationary MA(1) process, where it is known that the unit-root test, even when bootstrapped, becomes quite unreliable when the MA(1) parameter is in the vicinity of -1. Iteration of the bootstrap P value to convergence achieves reliable inference except for a parameter value very close to -1. The paper then endeavours to see these specific results in a wider context, and tries to cast new light on where bootstrap theory may be going.
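For intuition, here is a schematic two-level (double) bootstrap p-value for a toy test of a zero mean. This is only the standard recursion, not the paper's discrete finite-model device for iterating to arbitrary order; the statistic, sample size and replication counts are illustrative.

```python
# Hedged sketch: first-level bootstrap p-value and its double-bootstrap adjustment.
import numpy as np

rng = np.random.default_rng(3)

def stat(sample):
    # toy statistic: studentized mean, for the null hypothesis that the mean is zero
    return np.sqrt(len(sample)) * sample.mean() / sample.std(ddof=1)

def boot_pvalue(sample, B=199):
    t0 = stat(sample)
    centred = sample - sample.mean()          # impose the null when resampling
    t_star = np.array([stat(rng.choice(centred, len(sample), replace=True))
                       for _ in range(B)])
    return np.mean(np.abs(t_star) >= np.abs(t0))

def double_boot_pvalue(sample, B=199, C=199):
    p0 = boot_pvalue(sample, B)
    centred = sample - sample.mean()
    # second level: bootstrap the distribution of the first-level p-value
    p_star = np.array([boot_pvalue(rng.choice(centred, len(sample), replace=True), C)
                       for _ in range(B)])
    return np.mean(p_star <= p0)              # adjusted (iterated) p-value

x = rng.normal(loc=0.3, size=50)
print(boot_pvalue(x), double_boot_pvalue(x))
```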

]]>
http://www.ifs.org.uk/publications/7874 Mon, 13 Jul 2015 00:00:00 +0000
<![CDATA[Alternative asymptotics and the partially linear model with many regressors]]> Many empirical studies estimate the structural effect of some variable on an outcome of interest while allowing for many covariates. We present inference methods that account for many covariates. The methods are based on asymptotics where the number of covariates grows as fast as the sample size. We find a limiting normal distribution with variance that is larger than the standard one. We also find that with homoskedasticity this larger variance can be accounted for by using degrees of freedom adjusted standard errors. We link this asymptotic theory to previous results for many instruments and for small bandwidth distributional approximations.

]]>
http://www.ifs.org.uk/publications/7856 Fri, 10 Jul 2015 00:00:00 +0000
<![CDATA[Treatment effects with many covariates and heteroskedasticity]]> The linear regression model is widely used in empirical work in Economics. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: (i) parametric linear models with many covariates; (ii) semiparametric semi-linear models with many technical regressors; and (iii) linear panel models with many fixed effects.

]]>
http://www.ifs.org.uk/publications/7857 Fri, 10 Jul 2015 00:00:00 +0000
<![CDATA[Identification and estimation of nonparametric panel data regressions with measurement error]]> This paper provides a constructive argument for identification of nonparametric panel data models with measurement error in a continuous explanatory variable. The approach point identifies all structural elements of the model using only observations of the outcome and the mismeasured explanatory variable; no further external variables such as instruments are required. In the case of two time periods, restricting either the structural or the measurement error to be independent over time allows past explanatory variables or outcomes to serve as instruments. Time periods have to be linked through serial dependence in the latent explanatory variable, but the transition process is left nonparametric. The paper discusses the general identification result in the context of a nonlinear panel data regression model with additively separable fixed effects. It provides a nonparametric plug-in estimator, derives its uniform rate of convergence, and presents simulation evidence for good performance in finite samples.

]]>
http://www.ifs.org.uk/publications/7833 Thu, 02 Jul 2015 00:00:00 +0000
<![CDATA[Variable selection and estimation in high-dimensional models]]> Models with high-dimensional covariates arise frequently in economics and other fields. Often, only a few covariates have important effects on the dependent variable. When this happens, the model is said to be sparse. In applications, however, it is not known which covariates are important and which are not. This paper reviews methods for discriminating between important and unimportant covariates with particular attention given to methods that discriminate correctly with probability approaching 1 as the sample size increases. Methods are available for a wide variety of linear, nonlinear, semiparametric, and nonparametric models. The performance of some of these methods in finite samples is illustrated through Monte Carlo simulations and an empirical example.
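One concrete instance of the methods surveyed here is Lasso-type selection in a sparse linear model; the sketch below uses scikit-learn's cross-validated Lasso on simulated data and is purely illustrative.

```python
# Hedged sketch: variable selection with the cross-validated Lasso.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(4)
n, p = 200, 100                               # many covariates, few with real effects
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                   # sparse truth: only three nonzero coefficients
y = X @ beta + rng.normal(size=n)

fit = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(fit.coef_)          # covariates retained by the Lasso
print(selected, np.round(fit.coef_[selected], 2))
```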

]]>
http://www.ifs.org.uk/publications/7837 Thu, 02 Jul 2015 00:00:00 +0000
<![CDATA[The triangular model with random coefficients]]> The triangular model is a very popular way to capture endogeneity. In this model, an outcome is determined by an endogenous regressor, which in turn is caused by an instrument in a first stage. In this paper, we study the triangular model with random coefficients and exogenous regressors in both equations. We establish a profound non-identification result: the joint distribution of the random coefficients is not identified, implying that counterfactual outcomes are also not identified in general. This result continues to hold if we confine ourselves to the joint distribution of coefficients in the outcome equation or any marginal, except the one on the endogenous regressor. Identification continues to fail even if we focus on means of random coefficients (implying that IV is generally biased), or let the instrument enter the first stage in a monotonic fashion. Based on this insight, we derive bounds on the joint distribution of random parameters, and suggest an additional restriction that allows us to point identify the distribution of random coefficients in the outcome equation. We extend this framework to cover the case where the regressors and instruments have limited support, and analyze semi- and nonparametric sample counterpart estimators in finite and large samples. Finally, we give an application of the framework to consumer demand.

]]>
http://www.ifs.org.uk/publications/7830 Tue, 30 Jun 2015 00:00:00 +0000
<![CDATA[A weak instrument F-test in linear IV models with multiple endogenous variables]]> We consider testing for weak instruments in a model with multiple endogenous variables. Unlike Stock and Yogo (2005), who considered a weak instruments problem where the rank of the matrix of reduced form parameters is near zero, here we consider a weak instruments problem of a near rank reduction of one in the matrix of reduced form parameters. For example, in a two-variable model, we consider weak instrument asymptotics of the form π1 = δπ2 + c/√n, where π1 and π2 are the parameters in the two reduced-form equations, c is a vector of constants and n is the sample size. We investigate the use of a conditional first-stage F-statistic along the lines of the proposal by Angrist and Pischke (2009) and show that, unless δ = 0, the variance in the denominator of their F-statistic needs to be adjusted in order to get a correct asymptotic distribution when testing the hypothesis H0 : π1 = δπ2. We show that a corrected conditional F-statistic is equivalent to the Cragg and Donald (1993) minimum eigenvalue rank test statistic, and is informative about the maximum total relative bias of the 2SLS estimator and the size distortions of Wald tests. When δ = 0 in the two-variable model, or when there are more than two endogenous variables, further information over and above the Cragg-Donald statistic can be obtained about the nature of the weak instrument problem by computing the conditional first-stage F-statistics.
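In equations, the two-variable case described above corresponds to reduced forms (our notation, following the abstract)

\[
x_{1i} = Z_i'\pi_1 + v_{1i}, \qquad x_{2i} = Z_i'\pi_2 + v_{2i},
\qquad \pi_1 = \delta\,\pi_2 + c/\sqrt{n},
\]

so the matrix of reduced-form coefficients $[\pi_1 \;\; \pi_2]$ is local to a rank-one matrix rather than to the zero matrix: the rank is reduced by one, not to zero.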

]]>
http://www.ifs.org.uk/publications/7828 Tue, 30 Jun 2015 00:00:00 +0000
<![CDATA[Optimal sup-norm rates, adaptivity and inference in nonparametric instrumental variables estimation]]> This paper makes several contributions to the literature on the important yet difficult problem of estimating functions nonparametrically using instrumental variables. First, we derive the minimax optimal sup-norm convergence rates for nonparametric instrumental variables (NPIV) estimation of the structural function h0 and its derivatives. Second, we show that a computationally simple sieve NPIV estimator can attain the optimal sup-norm rates for h0 and its derivatives when h0 is approximated via a spline or wavelet sieve. Our optimal sup-norm rates surprisingly coincide with the optimal L2-norm rates for severely ill-posed problems, and are only up to a [log(n)]^ε (with ε < 1/2) factor slower than the optimal L2-norm rates for mildly ill-posed problems. Third, we introduce a novel data-driven procedure for choosing the sieve dimension optimally. Our data-driven procedure is sup-norm rate-adaptive: the resulting estimators of h0 and its derivatives converge at their optimal sup-norm rates even though the smoothness of h0 and the degree of ill-posedness of the NPIV model are unknown. Finally, we present two non-trivial applications of the sup-norm rates to inference on nonlinear functionals of h0 under low-level conditions. The first is to derive the asymptotic normality of sieve t-statistics for exact consumer surplus and deadweight loss functionals in nonparametric demand estimation when prices, and possibly incomes, are endogenous. The second is to establish the validity of a sieve score bootstrap for constructing asymptotically exact uniform confidence bands for collections of nonlinear functionals of h0. Both applications provide new and useful tools for empirical research on nonparametric models with endogeneity.

]]>
http://www.ifs.org.uk/publications/7829 Tue, 30 Jun 2015 00:00:00 +0000
<![CDATA[Identification of the distribution of valuations in an incomplete model of English auctions]]> An incomplete model of English auctions with symmetric independent private values, similar to the one studied in Haile and Tamer (2003), is shown to fall in the class of Generalized Instrumental Variable Models introduced in Chesher and Rosen (2014). A characterization of the sharp identified set for the distribution of valuations is thereby obtained and shown to refine the bounds available until now.

]]>
http://www.ifs.org.uk/publications/7827 Mon, 29 Jun 2015 00:00:00 +0000
<![CDATA[Demand analysis with partially observed prices]]> In empirical demand, industrial organization, and labor economics, prices are often unobserved or unobservable since they may only be recorded when an agent transacts. In the absence of any additional information, this partial observability of prices is known to lead to a number of identification problems. However, in this paper, we show that theory-consistent demand analysis remains feasible in the presence of partially observed prices, and hence partially observed implied budget sets, even if we are agnostic about the nature of the missing prices. Our revealed preference approach is empirically meaningful and easy to implement. We illustrate using simple examples.

]]>
http://www.ifs.org.uk/publications/7813 Wed, 24 Jun 2015 00:00:00 +0000
<![CDATA[Please call me John: name choice and the assimilation of immigrants in the United States, 1900-1930]]> The vast majority of immigrants to the United States at the beginning of the 20th century adopted first names that were common among natives. The rate of adoption of an American name increases with time in the US, although most immigrants adopt an American name within the first year of arrival. Choice of an American first name was associated with a more successful assimilation, as measured by job occupation scores, marriage to a US native and take-up of US citizenship. We examine economic determinants of name choice, by studying the relationship between changes in the proportion of immigrants with an American first name and changes in the concentration of immigrants as well as changes in local labor market conditions, across different census years. We find that high concentrations of immigrants of a given nationality in a particular location discouraged members of that nationality from taking American names. Poor local labor market conditions for immigrants (and good local labor market conditions for natives) led to more frequent name changes among immigrants.

]]>
http://www.ifs.org.uk/publications/7814 Wed, 24 Jun 2015 00:00:00 +0000
<![CDATA[Identification of preferences in network formation games]]> This paper provides a framework for identifying preferences in a large network under the assumption of pairwise stability of network links. Network data present difficulties for identification, especially when links between nodes in a network can be interdependent: e.g., where indirect connections matter. Given a preference specification, we use the observed proportions of various possible payoff-relevant local network structures to learn about the underlying parameters. We show how one can map the observed proportions of these local structures to sets of parameters that are consistent with the model and the data. Our main result provides necessary conditions for parameters to belong to the identified set, and this result holds for a wide class of models. We also provide sufficient conditions - and hence a characterization of the identified set - for two empirically relevant classes of specifications. An interesting feature of our approach is the use of the economic model under pairwise stability as a vehicle for effective dimension reduction. The paper then provides a quadratic programming algorithm that can be used to construct the identified sets. This algorithm is illustrated with a pair of simulation exercises.

]]>
http://www.ifs.org.uk/publications/7815 Wed, 24 Jun 2015 00:00:00 +0000
<![CDATA[Approximate permutation tests and induced order statistics in the regression discontinuity design]]> This paper proposes an asymptotically valid permutation test for a testable implication of the identification assumption in the regression discontinuity design (RDD). Here, by testable implication, we mean the requirement that the distribution of observed baseline covariates should not change discontinuously at the threshold of the so-called running variable. This contrasts with the common practice of testing the weaker implication of continuity of the means of the covariates at the threshold. When testing our null hypothesis using observations that are “close” to the threshold, the standard requirement for the finite sample validity of a permutation test does not necessarily hold. We therefore propose an asymptotic framework where there is a fixed number of closest observations to the threshold with the sample size going to infinity, and propose a permutation test based on the so-called induced order statistics that controls the limiting rejection probability under the null hypothesis. In a simulation study, we find that the new test controls size remarkably well in most designs. Finally, we use our test to evaluate the validity of the design in Lee (2008), a well-known application of the RDD to study incumbency advantage.
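A rough sketch of a test in this spirit: take the q closest observations on each side of the threshold and permute their group labels, comparing the two empirical distributions of a baseline covariate with a Cramér-von Mises-type distance. The choice of q, the statistic and the simulated data are illustrative, not the paper's recommended implementation.

```python
# Hedged sketch: permutation test on a baseline covariate near the RD threshold.
import numpy as np

rng = np.random.default_rng(5)
n = 5000
running = rng.uniform(-1, 1, n)
covariate = 0.5 * running + rng.normal(size=n)    # baseline covariate, continuous at 0

q = 25                                            # number of closest observations per side
left = covariate[running < 0][np.argsort(-running[running < 0])[:q]]
right = covariate[running >= 0][np.argsort(running[running >= 0])[:q]]

def cvm(a, b):
    # Cramér-von Mises-type distance between two empirical CDFs
    pooled = np.concatenate([a, b])
    Fa = np.mean(a[:, None] <= pooled[None, :], axis=0)
    Fb = np.mean(b[:, None] <= pooled[None, :], axis=0)
    return np.mean((Fa - Fb) ** 2)

t_obs = cvm(left, right)
pooled = np.concatenate([left, right])
t_perm = np.array([cvm(*np.split(rng.permutation(pooled), 2)) for _ in range(999)])
print(np.mean(t_perm >= t_obs))                   # permutation p-value
```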

]]>
http://www.ifs.org.uk/publications/7788 Mon, 22 Jun 2015 00:00:00 +0000
<![CDATA[Breaking the curse of dimensionality in conditional moment inequalities for discrete choice models]]> This paper studies inference of preference parameters in semiparametric discrete choice models when these parameters are not point-identified and the identified set is characterized by a class of conditional moment inequalities. Exploring the semiparametric modeling restrictions, we show that the identified set can be equivalently formulated by moment inequalities conditional on only two continuous indexing variables. Such formulation holds regardless of the covariate dimension, thereby breaking the curse of dimensionality for nonparametric inference based on the underlying conditional moment inequalities. We also extend this dimension reducing characterization result to a variety of semi-parametric models under which the sign of conditional expectation of a certain transformation of the outcome is the same as that of the indexing variable.

]]>
http://www.ifs.org.uk/publications/7784 Thu, 18 Jun 2015 00:00:00 +0000
<![CDATA[Random coefficients on endogenous variables in simultaneous equations models]]> This paper considers a classical linear simultaneous equations model with random coefficients on the endogenous variables. Simultaneous equations models are used to study social interactions, strategic interactions between firms, and market equilibrium. Random coefficient models allow for heterogeneous marginal effects. I show that random coefficient seemingly unrelated regression models with common regressors are not point identified, which implies random coefficient simultaneous equations models are not point identified. Important features of these models, however, can be identified. For two-equation systems, I give two sets of sufficient conditions for point identification of the coefficients’ marginal distributions conditional on exogenous covariates. The first allows for small support continuous instruments under tail restrictions on the distributions of unobservables which are necessary for point identification. The second requires full support instruments, but allows for nearly arbitrary distributions of unobservables. I discuss how to generalize these results to many equation systems, where I focus on linear-in-means models with heterogeneous endogenous social interaction effects. I give sufficient conditions for point identification of the distributions of these endogenous social effects. I suggest a nonparametric kernel estimator for these distributions based on the identification arguments. I apply my results to the Add Health data to analyze peer effects in education.

]]>
http://www.ifs.org.uk/publications/7773 Mon, 15 Jun 2015 00:00:00 +0000
<![CDATA[Sanitation dynamics: toilet acquisition and its economic and social implications]]> Poor sanitation is an important policy issue facing India, which accounts for over half of the 1.1 billion people worldwide that defecate in the open [JMP, 2012]. Achieving global sanitation targets, and reducing the social and economic costs of open defecation, therefore requires effectively extending sanitation services to India's citizens. The Indian Government has shown strong commitment to improving sanitation. However, uptake and usage of safe sanitation remains low: almost 50% of Indian households do not have access to a private or public latrine (2011 Indian census). This highlights the need for novel approaches to foster the uptake and sustained usage of safe sanitation in this context. This study contributes to addressing this need in two ways. First, we use primary data collected in both rural and urban contexts in two states of India to understand determinants of toilet ownership and acquisition. A theoretical model is presented accompanying our empirical findings. Second, while ours is not a randomized controlled trial, we are able to offer a rich picture of the main determinants and potential outcomes of sanitation uptake. Contrary to many studies on sanitation, our focus is not primarily on health outcomes; rather, we emphasize economic and social status considerations. Further, toilet acquisition is analyzed in the context of an intervention that alleviated one of the major constraints to acquisition, financial resources, which allows us to highlight the importance of attending to this constraint. These contributions have important implications for the design of strategies to promote sanitation, a major focus of many governments of developing countries and international organizations at present.

]]>
http://www.ifs.org.uk/publications/7769 Tue, 09 Jun 2015 00:00:00 +0000
<![CDATA[Nonparametric stochastic discount factor decomposition]]> We introduce econometric methods to perform estimation and inference on the permanent and transitory components of the stochastic discount factor (SDF) in dynamic Markov environments. The approach is nonparametric in that it does not impose parametric restrictions on the law of motion of the state process. We propose sieve estimators of the eigenvalue-eigenfunction pair which are used to decompose the SDF into its permanent and transitory components, as well as estimators of the long-run yield and the entropy of the permanent component of the SDF, allowing for a wide variety of empirically relevant setups. Consistency and convergence rates are established. The estimators of the eigenvalue, yield and entropy are shown to be asymptotically normal and semiparametrically efficient when the SDF is observable. We also introduce nonparametric estimators of the continuation value under Epstein-Zin preferences, thereby extending the scope of our estimators to an important class of recursive preferences. The estimators are simple to implement, perform favorably in simulations, and may be used to numerically compute the eigenfunction and its eigenvalue in fully specified models when analytical solutions are not available.
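The object being estimated can be sketched as follows, in a standard Hansen-Scheinkman-type formulation (our notation; the paper's setup may differ in detail). With $m_{t+1}$ the one-period SDF and $X_t$ the Markov state, solve the eigenvalue-eigenfunction problem

\[
\mathbb{E}\big[\, m_{t+1}\, \phi(X_{t+1}) \mid X_t = x \,\big] \;=\; \rho\, \phi(x), \qquad \phi > 0,
\]

which yields the permanent-transitory factorization

\[
m_{t+1} \;=\; \frac{m_{t+1}\,\phi(X_{t+1})}{\rho\,\phi(X_t)} \;\times\; \frac{\rho\,\phi(X_t)}{\phi(X_{t+1})},
\]

with the first factor the growth of the permanent component, the second the growth of the transitory component, and $-\log\rho$ the long-run yield; the sieve estimators target the pair $(\rho, \phi)$.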

]]>
http://www.ifs.org.uk/publications/7770 Tue, 09 Jun 2015 00:00:00 +0000
<![CDATA[Counterfactual worlds]]> We study a generalization of the treatment effect model in which an observed discrete classifier indicates in which one of a set of counterfactual processes a decision maker is observed. The other observed outcomes are delivered by the particular counterfactual process in which the decision maker is found. Models of the counterfactual processes can be incomplete in the sense that even with knowledge of the values of observed exogenous and unobserved variables they may not deliver a unique value of the endogenous outcomes. We study the identifying power of models of this sort that incorporate (i) conditional independence restrictions under which unobserved variables and the classifier variable are stochastically independent conditional on some of the observed exogenous variables and (ii) marginal independence restrictions under which unobservable variables and a subset of the exogenous variables are independently distributed. Building on results in Chesher and Rosen (2014a), we characterize the identifying power of these models for fundamental structural relationships and probability distributions and for interesting functionals of these objects, some of which may be point identified. In one example of an application, we observe the entry decisions of firms that can choose which of a number of markets to enter and we observe various endogenous outcomes delivered in the markets they choose to enter.

]]>
http://www.ifs.org.uk/publications/7767 Mon, 08 Jun 2015 00:00:00 +0000
<![CDATA[Income effects and the welfare consequences of tax in differentiated product oligopoly]]> Random utility models are widely used to study consumer choice. The vast majority of applications make strong assumptions about the marginal utility of income, which restricts income effects, demand curvature and pass-through. We show that flexibly modeling income effects can be important, particularly if one is interested in the distributional effects of a policy change, even in a market in which, a priori, the expectation is that income effects will play a limited role. We allow for much more flexible forms of income effects than is common and we illustrate the implications by simulating the introduction of an excise tax.

Supplementary material for this paper is available here.

]]>
http://www.ifs.org.uk/publications/7768 Mon, 08 Jun 2015 00:00:00 +0000
<![CDATA[Individual heterogeneity, nonlinear budget sets and taxable income]]> Many studies have estimated the effect of taxes on taxable income. To account for nonlinear taxes these studies either use instrumental variables approaches that are not fully consistent or impose strong functional form assumptions. None allow for general heterogeneity in preferences. In this paper we derive the expected value and distribution of taxable income conditional on a nonlinear budget set, allowing general heterogeneity and optimization error in taxable income. We find an important dimension reduction and use that to develop nonparametric estimation methods. We show how to nonparametrically estimate the expected value of taxable income imposing all the restrictions of utility maximization and allowing for measurement errors. We characterize what can be learned nonparametrically from kinks about compensated tax effects. We apply our results to Swedish data and estimate for prime age males a significant net of tax elasticity of 0.21 and a significant nonlabor income effect of about -1. The income effect is substantially larger in magnitude than it is found to be in other taxable income studies.

]]>
http://www.ifs.org.uk/publications/7755 Tue, 12 May 2015 00:00:00 +0000
<![CDATA[Robust confidence regions for incomplete models]]> Call an economic model incomplete if it does not generate a probabilistic prediction even given knowledge of all parameter values. We propose a method of inference about unknown parameters for such models that is robust to heterogeneity and dependence of unknown form. The key is a Central Limit Theorem for belief functions; robust confidence regions are then constructed in a fashion paralleling the classical approach. Monte Carlo simulations support tractability of the method and demonstrate its enhanced robustness relative to existing methods.

]]>
http://www.ifs.org.uk/publications/7729 Fri, 24 Apr 2015 00:00:00 +0000
<![CDATA[Partial insurance and investments in children]]> This paper studies the impact of permanent and transitory shocks to income on parental investments in children. We use panel data on family income, and an index of investments in children in time and goods, from the Children of the National Longitudinal Survey of Youth. Consistent with the literature focusing on non-durable expenditure, we find that there is only partial insurance of parental investments against permanent income shocks, but the magnitude of the estimated responses is small. We cannot reject the hypothesis of full insurance against temporary shocks. Another interpretation of our findings is that there is very little insurance available, but the fact that skill is a non-separable function of parental investments over time results in small reactions of these investments to income shocks, especially at later ages.

]]>
http://www.ifs.org.uk/publications/7715 Wed, 15 Apr 2015 00:00:00 +0000
<![CDATA[A tale of three distributions: inheritances, wealth and lifetime income]]> This paper investigates the impact of inheritances and gifts received on the distribution of wealth among older households in England, and the implications for inequality in lifetime incomes. Whereas previous work has looked only at marketable wealth, we consider broader measures including public and private pensions. We find that once pension wealth is included, inheritances and gifts no longer have an equalising impact on the distribution of wealth. Without pension wealth, including transfers takes the wealth share of the top 10% from 40% to 38%; with pension wealth, the impact is near zero. This has important implications for the impact of inheritances and gifts on the distribution of lifetime incomes. Exploiting a link with administrative data on lifetime earnings, we show that savings rates are significantly increasing in lifetime incomes when pension wealth is excluded, but less so when it is included. Our results thus indicate that the impact of intergenerational transfers on the distribution of lifetime incomes among these individuals is likely to be negligible or inequality-increasing, rather than inequality-reducing.

]]>
http://www.ifs.org.uk/publications/7694 Wed, 08 Apr 2015 00:00:00 +0000
<![CDATA[Estimating private provision of public goods with heterogenous participants: a structural analysis]]> This paper estimates a structural model of private provision of public goods to provide some new empirical evidence on individuals' strategic contributing behaviors. In the model, individuals' contributing behaviors are allowed to be heterogenous and time-varying. We show that all the main components of the model including the number of different contributing strategies, functional form for each strategy, and how individuals adjust their strategies are identified from the revealed contribution choices of individuals. Further, the structural model is estimated using the data collected in a threshold public good experiment. The empirical results suggest that subjects in our experiment employ three contributing strategies, and they strategically respond to provision history by adjusting their preceding behavior. In addition, the response is heterogenous and dependent on subjects' contributing strategies.

]]>
http://www.ifs.org.uk/publications/7690 Thu, 02 Apr 2015 00:00:00 +0000
<![CDATA[Individual and time effects in nonlinear panel models with large <i>N</i>, <i>T</i>]]> Fixed effects estimators of nonlinear panel data models can be severely biased because of the incidental parameter problem. We develop analytical and jackknife bias corrections for nonlinear models with both individual and time effects. Under asymptotic sequences where the time-dimension (T) grows with the cross-sectional dimension (N), the time effects introduce additional incidental parameter bias. As the existing bias corrections apply to models with only individual effects, we derive the appropriate corrections for the case when both effects are present. The corrections are based on general asymptotic expansions of fixed effects estimators with incidental parameters in multiple dimensions. We apply the expansions to conditional maximum likelihood estimators with concave objective functions in parameters for panel models with additively separable individual and time effects. These estimators cover fixed effects estimators of the most popular limited dependent variable models such as logit, probit, ordered probit, Tobit and Poisson models. Our analysis therefore extends the use of large-T bias adjustments to an important class of models.

We also analyze the properties of fixed effects estimators of functions of the data, parameters and individual and time effects, including average partial effects. Here, we uncover that the incidental parameter bias is asymptotically of second order, because the rate of convergence of the fixed effects estimators is slower for average partial effects than for model parameters. The bias corrections are nevertheless effective in improving finite-sample properties.
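To fix ideas about what a jackknife correction of this kind looks like, the simpler half-panel jackknife of Dhaene and Jochmans (2015), which splits only the time dimension and targets the leading $O(1/T)$ bias arising from individual effects, is

\[
\tilde\theta \;=\; 2\,\hat\theta_{N,T} \;-\; \tfrac{1}{2}\Big(\hat\theta_{N,T/2}^{(1)} + \hat\theta_{N,T/2}^{(2)}\Big),
\]

where $\hat\theta_{N,T/2}^{(1)}$ and $\hat\theta_{N,T/2}^{(2)}$ are fixed effects estimates computed on the first and second halves of the sample period; the corrections developed in this paper address the harder case in which time effects generate an additional bias term.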

View the supplementary document for this paper here.

]]>
http://www.ifs.org.uk/publications/7689 Thu, 02 Apr 2015 00:00:00 +0000
<![CDATA[Identification and estimation in first-price auctions with risk-averse bidders and selective entry]]> We study identification and estimation in first-price auctions with risk-averse bidders and selective entry, building on a flexible entry and bidding framework we call the Affiliated Signal with Risk Aversion (AS-RA) model. This framework extends the AS model of Gentry and Li (2014) to accommodate arbitrary bidder risk aversion, thereby nesting a variety of standard models as special cases. It poses, however, a unique methodological challenge: existing results on identification with risk aversion fail in the presence of selection, while the selection-robust bounds of Gentry and Li (2014) fail in the presence of risk aversion. Motivated by this problem, we translate excludable variation in potential competition into identified sets for AS-RA primitives under various classes of restrictions on the model. We show that a single parametric restriction (on the copula governing selection into entry) is typically sufficient to restore point identification of all primitives. In contrast, a parametric form for utility yields point identification of the utility function but only partial identification of remaining primitives. Finally, we outline a simple semiparametric estimator combining Constant Relative Risk Aversion utility with a parametric signal-value copula. Simulation evidence suggests that this estimator performs very well even in small samples, underscoring the practical value of our identification results.

]]>
http://www.ifs.org.uk/publications/7686 Wed, 01 Apr 2015 00:00:00 +0000
<![CDATA[Decentralizing education resources: school grants in Senegal]]> The impact of school resources on the quality of education in developing countries may depend crucially on whether resources are targeted efficiently. In this paper we use a randomized experiment to analyze the impact of a school grants program in Senegal, which decentralized a portion of the country's education budget. We find large positive effects on test scores at younger grades that persist at least two years. We show that these effects are concentrated among schools that focused funds on human resources improvements rather than school materials, suggesting that teachers and principals may be a central determinant of school quality.

]]>
http://www.ifs.org.uk/publications/7681 Tue, 31 Mar 2015 00:00:00 +0000
<![CDATA[The marriage market, labor supply and education choice]]> We develop an equilibrium lifecycle model of education, marriage, labor supply and consumption in a transferable utility context. Individuals start by choosing their investments in education anticipating returns in the marriage market and the labor market. They then match based on the economic value of marriage and on preferences. Equilibrium in the marriage market determines intra-household allocation of resources. Following the marriage stage, households (married or single) save, supply labor and consume private and public goods under uncertainty. Marriage thus has the dual role of providing public goods and offering risk sharing. The model is estimated using the British HPS.

]]>
http://www.ifs.org.uk/publications/7677 Fri, 27 Mar 2015 00:00:00 +0000
<![CDATA[An investigation into multivariate variance ratio statistics and their application to stock market predictability]]> We propose several multivariate variance ratio statistics. We derive the asymptotic distribution of the statistics and scalar functions thereof under the null hypothesis that returns are unpredictable after a constant mean adjustment (i.e., under the weak form Efficient Market Hypothesis). We do not impose the no leverage assumption of Lo and MacKinlay (1988) but our asymptotic standard errors are relatively simple and in particular do not require the selection of a bandwidth parameter. We extend the framework to allow for a time varying risk premium through common systematic factors. We show the limiting behaviour of the statistic under a multivariate fads model and under a moderately explosive bubble process: these alternative hypotheses give opposite predictions with regards to the long run value of the statistics. We apply the methodology to five weekly size-sorted CRSP portfolio returns from 1962 to 2013 in three subperiods. We find evidence of a reduction of linear predictability in the most recent period, for small and medium cap stocks. The main findings are not substantially affected by allowing for a common factor time varying risk premium.
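For orientation, a simple univariate variance ratio in the spirit of Lo and MacKinlay (1988), computed on simulated i.i.d. weekly returns (so the ratio should be close to one); the multivariate statistics proposed in the paper are not reproduced here.

```python
# Hedged sketch: univariate variance ratio VR(q) on simulated weekly returns.
import numpy as np

def variance_ratio(r, q):
    r = r - r.mean()
    var1 = np.mean(r ** 2)                          # variance of one-period returns
    rq = np.convolve(r, np.ones(q), mode="valid")   # overlapping q-period returns
    return (np.mean(rq ** 2) / q) / var1

rng = np.random.default_rng(6)
returns = rng.normal(scale=0.02, size=2600)         # roughly 50 years of weekly returns
print({q: round(variance_ratio(returns, q), 3) for q in (2, 4, 8, 16)})
```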

]]>
http://www.ifs.org.uk/publications/7665 Tue, 24 Mar 2015 00:00:00 +0000
<![CDATA[Children, time allocation and consumption insurance]]> We consider the life-cycle problem of a household that in each period decides how much to consume and how to allocate spouses’ time to work, leisure, and childcare. In an environment with uncertainty, the allocation of goods and time over the life cycle plays the further role of providing insurance against shocks. We use longitudinal data on consumption, together with separate information for husband and wife on hourly wages, hours of work, and time spent with children, to estimate structural parameters measuring the sensitivity of consumption and time allocation choices to transitory and permanent wage shocks. These structural parameters provide a full picture regarding the ability of the household to smooth marginal utility in response to shocks. In addition, information on hours of work and hours spent on childcare allows us to decompose the overall Frisch response into two components, one reflecting the degree of complementarity between husband’s and wife’s leisure ("companionship" or "love") and another reflecting the degree of substitutability of their childcare time in the production of childcare services.

]]>
http://www.ifs.org.uk/publications/7663 Mon, 23 Mar 2015 00:00:00 +0000
<![CDATA[Prices versus preferences: taste change and revealed preference]]> A systematic approach for incorporating taste variation into a revealed preference framework for heterogeneous consumers is developed. We create a new methodology that enables the recovery of the minimal variation in tastes that is required to rationalise observed choice patterns. This approach is used to examine the extent to which changes in tobacco consumption have been driven by price changes or by taste changes, and whether the significance of these two channels varies across socioeconomic groups. A censored quantile approach is used to allow for unobserved heterogeneity and censoring of consumption. Statistically significant educational differences in the marginal willingness to pay for tobacco are recovered. More highly educated cohorts are found to have experienced a greater shift in their effective tastes away from tobacco.

]]>
http://www.ifs.org.uk/publications/7661 Mon, 23 Mar 2015 00:00:00 +0000
<![CDATA[Life-cycle consumption patterns at older ages in the US and the UK: can medical expenditures explain the difference?]]> In this paper we document significantly steeper declines in nondurable expenditures in the UK compared to the US, in spite of income paths being similar. We explore several possible causes, including different employment paths, housing ownership and expenses, levels and paths of health status, and out-of-pocket medical expenditures. Among all the potential explanations considered, we find that those to do with healthcare – differences in levels, age paths, and uncertainty in medical expenses – are the main factor accounting for the steeper declines in nondurable expenses in the UK compared to the US.

]]>
http://www.ifs.org.uk/publications/7662 Mon, 23 Mar 2015 00:00:00 +0000
<![CDATA[Quantile regression with panel data]]> We propose a generalization of the linear quantile regression model to accommodate possibilities afforded by panel data. Specifically, we extend the correlated random coefficients representation of linear quantile regression (e.g., Koenker, 2005; Section 2.6). We show that panel data allows the econometrician to (i) introduce dependence between the regressors and the random coefficients and (ii) weaken the assumption of comonotonicity across them (i.e., to enrich the structure of allowable dependence between different coefficients). We adopt a “fixed effects” approach, leaving any dependence between the regressors and the random coefficients unmodelled. We motivate different notions of quantile partial effects in our model and study their identification.

For the case of discretely-valued covariates we present analog estimators and characterize their large sample properties. When the number of time periods (T) exceeds the number of random coefficients (P), identification is regular, and our estimates are √N-consistent. When T = P, our identification results make special use of the subpopulation of stayers (units whose regressor values change little over time) in a way which builds on the approach of Graham and Powell (2012). In this just-identified case we study asymptotic sequences which allow the frequency of stayers in the population to shrink with the sample size. One purpose of these “discrete bandwidth asymptotics” is to approximate settings where covariates are continuously-valued and, as such, there is only an infinitesimal fraction of exact stayers, while keeping the convenience of an analysis based on discrete covariates. When the mass of stayers shrinks with N, identification is irregular and our estimates converge at a slower than √N rate, but continue to have limiting normal distributions.

We apply our methods to study the effects of collective bargaining coverage on earnings using the National Longitudinal Survey of Youth 1979 (NLSY79). Consistent with prior work (e.g., Chamberlain, 1982; Vella and Verbeek, 1998), we find that using panel data to control for unobserved worker heterogeneity results in sharply lower estimates of union wage premia. We estimate a median union wage premium of about 9 percent but, in a more novel finding, substantial heterogeneity across workers. The 0.1 quantile of union effects is insignificantly different from zero, whereas the 0.9 quantile effect is over 30 percent. Our empirical analysis further suggests that, on net, unions have an equalizing effect on the distribution of wages.

Supporting material is available in a supplementary appendix here.

]]>
http://www.ifs.org.uk/publications/7646 Tue, 17 Mar 2015 00:00:00 +0000
<![CDATA[The distribution of school funding and inputs in England: 1993-2013]]> School funding per pupil increased substantially between 1999-00 and 2012-13 in England. It also became more varied across schools, with higher levels of funds targeted at more deprived schools. Real-terms increases in funding per pupil were much larger for the most deprived group of primary and secondary schools (83% and 93%, respectively) as compared with the least deprived primary and secondary schools (56% and 59%). In this paper, we decompose these increases in funding per pupil into the amount explained by quantities of different types of staff per pupil, their price and changes in non-staffing costs. We find that some of these increases in funding per pupil translated into larger numbers of teachers per pupil and a higher real-terms cost per teacher (about 20-30% of the increase in funding per pupil). However, a much larger portion of the increases in funding can be accounted for by higher levels and increased variation in the use of teaching assistants (largely lower skilled staff), other non-teaching staff and non-staff inputs (such as learning resources, professional services and energy). Furthermore, there is evidence to suggest that differences in expenditure between the most and least deprived schools are smaller than differences in funding, with more deprived secondary schools running slightly larger surpluses. Increased use of non-teaching staff was partly an intended policy shift by policymakers at the time. However, we argue that the scale of the changes in inputs is more likely to reflect rigidities, the flexibility of contracts and uncertainty over future funding allocations.

]]>
http://www.ifs.org.uk/publications/7645 Tue, 17 Mar 2015 00:00:00 +0000
<![CDATA[Nonparametric testing for exogeneity with discrete regressors and instruments]]> This paper presents new approaches to testing for exogeneity in non-parametric models with discrete regressors and instruments. Our interest is in learning about an unknown structural (conditional mean) function. An interesting feature of these models is that under endogeneity the identifying power of a discrete instrument depends on the number of support points of the instruments relative to that of the regressors, a result driven by the discreteness of the variables. Observing that the simple nonparametric additive error model can be interpreted as a linear regression, we present two test-statistics. For the point identifying model, the test is an adapted version of the standard Wu-Hausman approach. This extends the work of Blundell and Horowitz (2007) to the case of discrete regressors and instruments. For the set identifying model, the Wu-Hausman approach is not available. In this case the test-statistic is derived from a constrained minimization problem. The asymptotic distributions of the test-statistics are derived under the null and fixed and local alternatives. The tests are shown to be consistent, and a simulation study reveals that the proposed tests have satisfactory finite-sample properties.

]]>
http://www.ifs.org.uk/publications/7636 Wed, 11 Mar 2015 00:00:00 +0000
<![CDATA[Who should be treated? Empirical welfare maximization methods for treatment choice]]> One of the main objectives of empirical analysis of experiments and quasi-experiments is to inform policy decisions that determine the allocation of treatments to individuals with different observable covariates. We propose the Empirical Welfare Maximization (EWM) method, which estimates a treatment assignment policy by maximizing the sample analog of average social welfare over a class of candidate treatment policies. The EWM approach is attractive in terms of both statistical performance and practical implementation in realistic settings of policy design. Common features of these settings include: (i) feasible treatment assignment rules are constrained exogenously for ethical, legislative, or political reasons, (ii) a policy maker wants a simple treatment assignment rule based on one or more eligibility scores in order to reduce the dimensionality of individual observable characteristics, and/or (iii) the proportion of individuals who can receive the treatment is a priori limited due to a budget or a capacity constraint. We show that when the propensity score is known, the average social welfare attained by EWM rules converges at least at n^(-1/2) rate to the maximum obtainable welfare uniformly over a minimally constrained class of data distributions, and this uniform convergence rate is minimax optimal. In comparison with this benchmark rate, we examine how the uniform convergence rate of the average welfare improves or deteriorates depending on the richness of the class of candidate decision rules, the distribution of conditional treatment effects, and the lack of knowledge of the propensity score. We provide an asymptotically valid inference procedure for the population welfare gain obtained by exercising the EWM rule. We offer easily implementable algorithms for computing the EWM rule and an application using experimental data from the National JTPA Study
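
For orientation only, the following is a minimal sketch of the EWM idea in the simplest setting: a known propensity score, a one-dimensional eligibility score, and a class of threshold rules. The data-generating process, variable names and grid search are illustrative assumptions, not the paper's implementation; the inference procedure and the richer constrained rule classes discussed above are omitted.

```python
# Minimal sketch of empirical welfare maximization (EWM) over threshold rules.
# Assumes a known propensity score and a one-dimensional eligibility score;
# variable names and the data-generating process are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
score = rng.normal(size=n)                      # eligibility score (observed covariate)
e = 0.5                                         # known propensity score
d = rng.binomial(1, e, size=n)                  # randomized treatment indicator
# outcome: treatment helps only when the score is positive (illustrative DGP)
y = 1.0 + d * np.where(score > 0, 1.0, -0.5) + rng.normal(size=n)

def sample_welfare(threshold):
    """IPW estimate of average welfare of the rule 'treat iff score >= threshold'."""
    g = (score >= threshold).astype(float)
    w = g * d * y / e + (1 - g) * (1 - d) * y / (1 - e)
    return w.mean()

# EWM step: maximize the sample-analog welfare over a grid of candidate thresholds.
grid = np.quantile(score, np.linspace(0.01, 0.99, 99))
welfare = np.array([sample_welfare(t) for t in grid])
t_hat = grid[welfare.argmax()]
print(f"estimated threshold: {t_hat:.2f}, estimated welfare: {welfare.max():.3f}")
```

The key point the sketch tries to convey is that each candidate rule's welfare is estimated by inverse-propensity weighting, so the maximization is over a sample analog of welfare rather than over a fitted outcome model.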

]]>
http://www.ifs.org.uk/publications/7624 Tue, 10 Mar 2015 00:00:00 +0000
<![CDATA[Disability benefit receipt and reform: reconciling trends in the United Kingdom]]> The UK has enacted a number of reforms to the structure of disability benefits, which has made it a major case study for other countries thinking of reform. The introduction of Incapacity Benefit in 1995 coincided with a strong decline in disability benefit expenditure, reversing previous sharp increases. From 2008 the replacement of Incapacity Benefit with Employment and Support Allowance was intended to reduce spending further. We bring together administrative and survey data over the period and highlight key differences in receipt of disability benefits by age, sex and health. These disability benefit reforms and the trends in receipt are also put into the context of broader trends in health and employment by education and sex. We document a growing proportion of claimants in any age group with mental and behavioural disorders as their principal health condition. We also show the decline in the number of older working age men receiving disability benefits to have been partially offset by growth in the number of younger women receiving these benefits. We speculate on the impact of disability reforms on employment.

]]>
http://www.ifs.org.uk/publications/7623 Fri, 06 Mar 2015 00:00:00 +0000
<![CDATA[Estimation of stochastic volatility models by nonparametric filtering]]> A two-step estimation method of stochastic volatility models is proposed: In the first step, we nonparametrically estimate the (unobserved) instantaneous volatility process. In the second step, standard estimation methods for fully observed diffusion processes are employed, but with the filtered/estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties of the proposed estimators.
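
As a rough illustration of the two-step logic (not the paper's estimator), the sketch below filters the spot variance with block averages of squared high-frequency returns and then fits a discretized mean-reverting drift to the filtered series by least squares. Jumps, microstructure noise and the first-step corrections are ignored, and all parameter values are made up for the simulation.

```python
# Stylized two-step estimation of a stochastic volatility model:
# step 1 filters the latent spot variance nonparametrically (daily averages of
# squared intraday returns); step 2 fits the drift to the filtered variance as
# if it were observed. Simplified: no jumps, no microstructure noise.
import numpy as np

rng = np.random.default_rng(1)
n_days, per_day = 252, 80
n = n_days * per_day
dt = 1.0 / n                               # one year of intraday observations
kappa, theta, xi = 5.0, 0.04, 0.3          # true mean-reverting variance dynamics

# simulate a square-root variance process and the corresponding returns
v = np.empty(n); v[0] = theta
for t in range(1, n):
    v[t] = np.abs(v[t-1] + kappa * (theta - v[t-1]) * dt
                  + xi * np.sqrt(v[t-1] * dt) * rng.normal())
r = np.sqrt(v * dt) * rng.normal(size=n)

# step 1: filtered daily spot variance = mean of squared intraday returns / dt
v_hat = (r ** 2).reshape(n_days, per_day).mean(axis=1) / dt

# step 2: treat v_hat as observed and fit the discretized drift by OLS,
# regressing daily changes in filtered variance on its lagged level
delta = per_day * dt                       # length of one day in years
dv = np.diff(v_hat)
X = np.column_stack([np.ones(len(dv)), v_hat[:-1]])
a, b = np.linalg.lstsq(X, dv, rcond=None)[0]
kappa_hat = -b / delta
theta_hat = a / (kappa_hat * delta)
# note: first-step filtering error adds bias and variance, as discussed above
print(f"kappa_hat = {kappa_hat:.2f}, theta_hat = {theta_hat:.4f}")
```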

]]>
http://www.ifs.org.uk/publications/7620 Thu, 05 Mar 2015 00:00:00 +0000
<![CDATA[Mean Ratio Statistic for measuring predictability]]> We propose an alternative Ratio Statistic for measuring predictability of stock prices. Our statistic is based on actual returns rather than logarithmic returns and is therefore better suited to capturing price predictability. It captures not only linear dependence in the same way as the variance ratio statistics of Lo and MacKinlay (1988) but also some nonlinear dependencies. We derive the asymptotic distribution of the statistics under the null hypothesis that simple gross returns are unpredictable after a constant mean adjustment. This represents a test of the weak form of the Efficient Market Hypothesis. We also consider the multivariate extension, in particular, we derive the restrictions implied by the EMH on multiperiod portfolio gross returns. We apply our methodology to test the gross return predictability of various financial series.
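
For orientation, the sketch below computes a classical Lo and MacKinlay variance ratio on simple, mean-adjusted returns rather than log returns; it is not the paper's Mean Ratio Statistic or its asymptotic test, only an illustration of the ratio-statistic idea applied to gross-return data.

```python
# Classical variance-ratio-type statistic computed on simple (not log) returns,
# for orientation only; the paper's Mean Ratio Statistic differs.
import numpy as np

def variance_ratio(returns, q):
    """Variance of overlapping q-period cumulated, mean-adjusted simple returns
    over q times the variance of one-period returns; 1 under no predictability."""
    r = np.asarray(returns, dtype=float)
    r = r - r.mean()                                # constant mean adjustment
    r_q = np.convolve(r, np.ones(q), mode="valid")  # overlapping q-period sums
    return r_q.var() / (q * r.var())

rng = np.random.default_rng(2)
iid = rng.normal(0.0005, 0.01, size=2500)           # unpredictable returns
ar1 = np.empty(2500); ar1[0] = 0.0
for t in range(1, 2500):                            # positively autocorrelated returns
    ar1[t] = 0.2 * ar1[t-1] + rng.normal(0.0005, 0.01)

for q in (2, 5, 10):
    print(q, round(variance_ratio(iid, q), 3), round(variance_ratio(ar1, q), 3))
```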

]]>
http://www.ifs.org.uk/publications/7602 Fri, 20 Feb 2015 00:00:00 +0000
<![CDATA[Classification of nonparametric regression functions in heterogeneous panels]]> We investigate a nonparametric panel model with heterogeneous regression functions. In a variety of applications, it is natural to impose a group structure on the regression curves. Specifically, we may suppose that the observed individuals can be grouped into a number of classes whose members all share the same regression function. We develop a statistical procedure to estimate the unknown group structure from the observed data. Moreover, we derive the asymptotic properties of the procedure and investigate its finite sample performance by means of a simulation study and a real-data example.
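
A minimal sketch of the grouping idea, on assumed simulated data: estimate each unit's regression curve by kernel smoothing on a common grid and then apply k-means to the fitted curves. The bandwidth, the number of groups and the data-generating process are illustrative; the paper's classification procedure and its asymptotic guarantees are more involved than this.

```python
# Sketch: estimate each unit's regression curve nonparametrically, then group
# units by k-means applied to the fitted curves evaluated on a common grid.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
n_units, t_obs = 60, 200
grid = np.linspace(0, 1, 50)
true_group = rng.integers(0, 2, size=n_units)      # two latent classes
m = [lambda x: np.sin(2 * np.pi * x), lambda x: x ** 2]

fitted = np.empty((n_units, grid.size))
for i in range(n_units):
    x = rng.uniform(0, 1, size=t_obs)
    y = m[true_group[i]](x) + 0.3 * rng.normal(size=t_obs)
    # Nadaraya-Watson estimate of unit i's regression curve on the grid
    h = 0.1
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    fitted[i] = (w @ y) / w.sum(axis=1)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(fitted)
# agreement with the true grouping (up to label switching)
agree = max(np.mean(labels == true_group), np.mean(labels != true_group))
print(f"classification agreement: {agree:.2f}")
```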

]]>
http://www.ifs.org.uk/publications/7597 Fri, 20 Feb 2015 00:00:00 +0000
<![CDATA[Semiparametric dynamic portfolio choice with multiple conditioning variables]]> Dynamic portfolio choice is a central objective for institutional investors in active asset management. In this paper, we study dynamic portfolio choice with multiple conditioning variables, where the number of conditioning variables can be either fixed or diverging to infinity at a certain polynomial rate relative to the sample size. We propose a novel data-driven method to estimate the nonparametric optimal portfolio choice, motivated by the model averaging marginal regression approach suggested by Li, Linton and Lu (2014). Specifically, to avoid the curse of dimensionality and to make the procedure practically implementable, we first estimate the optimal portfolio choice by maximising the conditional utility function for each individual conditioning variable, and then construct the dynamic optimal portfolio choice as a weighted average of these marginal optimal portfolios across all the conditioning variables. Under mild regularity conditions, we establish the large sample properties of the proposed portfolio choice procedure. Simulation studies and an empirical application demonstrate its finite-sample performance on real data.

]]>
http://www.ifs.org.uk/publications/7598 Fri, 20 Feb 2015 00:00:00 +0000
<![CDATA[Value Added Tax policy and the case for uniformity: empirical evidence from Mexico]]> Value added taxes (VAT) are an important, and in many cases increasing, source of revenue in both developed and developing countries. Unsurprisingly there is an intense academic and policy debate about the appropriate VAT rate structure, for both equity and efficiency reasons. In this paper we examine the distributional and efficiency case for VAT rate differentiation in Mexico, and analyse the effects of the 2010 reforms to Mexico’s tax system, making use of a tax micro-simulation model, MEXTAX.

The amendments to the initially proposed reforms were made to make the tax change more ‘progressive’. We find that, measured as a proportion of income or expenditure, poorer households did gain most from the amendments, but that the cash-terms gains were much larger for households with high levels of income and expenditure. In other words, the reduction in tax take from the amendments was weakly targeted at poorer households; even simple universal cash transfers would have been much more beneficial to poor households. This shows that the distributional case for zero rates of VAT on goods like food is weak – especially given the growing sophistication of cash transfer programmes, particularly in middle-income countries.

We then examine the efficiency implications of Mexico’s VAT rate structure. We find that deviations from uniformity have a notable effect on spending patterns, but very little effect on aggregate welfare and economic efficiency as estimated by a standard QUAIDS model of consumer demand. We then argue that economic informality may actually provide an efficiency reason for lower rates of tax on goods like food, for which informal production and transactions seem to be much more prevalent. This may turn the typical arguments about differential VAT rates on their head: rather than being justifiable on distributional grounds while entailing an efficiency cost, reduced rates may instead be justifiable on efficiency grounds while entailing a distributional cost.

]]>
http://www.ifs.org.uk/publications/7569 Wed, 18 Feb 2015 00:00:00 +0000
<![CDATA[The right to buy social housing in Britain: a welfare analysis]]> We investigate the impact on social welfare of the UK policy introduced in 1980 by which public housing tenants (council housing in UK parlance) had the right to purchase their houses at heavily discounted prices. This was known as the Right to Buy (RTB) policy. Although this internationally-unique policy was the largest source of public privatization revenue in the UK and raised home ownership as a share of housing tenure by around 15 percentage points, the policy has been little analyzed by economists. We analyze the equilibrium housing policy of the public authority in terms of quality and quantity of publicly provided housing both in the absence and in the presence of an RTB policy. We examine the incentives to purchase using RTB for households with different wealth trajectories and differing qualities of public housing. We investigate the welfare effects of various adjustments to the policy, in particular (i) tighter restrictions on resale; (ii) reduced discounts on RTB sales; and (iii) returning the proceeds from RTB sales to local authorities to replace part of the public properties sold.

]]>
http://www.ifs.org.uk/publications/7568 Tue, 17 Feb 2015 00:00:00 +0000
<![CDATA[Child poverty in Britain: recent trends and future prospects]]> This paper analyses the key trends in child poverty in Britain, with particular focus on changes since the late 1990s when the issue was promoted towards the top of the policy agenda. The position of low-income families with children in the income distribution improved considerably in the late 1990s and early 2000s, recovering much – though not all – of the ground that they had lost on the rest of the population during the 1980s. I show that these gains were heavily dependent on large amounts of additional government spending on cash transfers. Since the mid 2000s, the absolute living standards of poor families with children have stagnated or declined: further reductions in the headline relative income poverty measure since the recession were driven only by falling median income and by the failure of this measure to account for the higher inflation rates faced by poorer households over this period. Looking ahead, it is not clear what mechanisms could bring about the large additional reductions in child poverty that are in theory legally required under the Child Poverty Act, in light of the heavy reliance of past gains on cash transfers, the current fiscal climate, and the current government’s lack of a clear and effective child poverty strategy.

A version of this paper appeared in Spanish in the December 2014 issue of Panorama Social, available here.

]]>
http://www.ifs.org.uk/publications/7584 Fri, 13 Feb 2015 00:00:00 +0000
<![CDATA[A lava attack on the recovery of sums of dense and sparse signals]]> Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of non-zero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small non-zero parameters. We consider a generalization of these two basic models, termed here a “sparse+dense” model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structure poses problems for traditional sparse estimators, such as the lasso, and for traditional dense estimation methods, such as ridge estimation. We propose a new penalization-based method, called lava, which is computationally efficient. With suitable choices of penalty parameters, the proposed method strictly dominates both lasso and ridge. We derive analytic expressions for the finite-sample risk function of the lava estimator in the Gaussian sequence model. We also provide a deviation bound for the prediction risk in the Gaussian regression model with fixed design. In both cases, we provide Stein’s unbiased estimator for lava’s prediction risk. A simulation example compares the performance of lava to lasso, ridge, and elastic net in a regression example using feasible, data-dependent penalty parameters and illustrates lava’s improved performance relative to these benchmarks.
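
The sketch below illustrates a lava-type fit by block coordinate descent on the convex objective that combines an l2 penalty on a dense component with an l1 penalty on a sparse component; the penalty levels and data are illustrative and do not follow the paper's recommended tuning.

```python
# Sketch of a lava-type estimator: y ~ X(b_dense + b_sparse), penalizing the
# dense part with an l2 penalty and the sparse part with an l1 penalty.
# Alternating ridge/lasso updates; the objective is jointly convex.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p = 200, 100
X = rng.normal(size=(n, p))
beta = 0.05 * rng.normal(size=p)          # dense component: many small coefficients
beta[:3] += np.array([3.0, -2.0, 2.5])    # sparse component: a few large ones
y = X @ beta + rng.normal(size=n)

lam_ridge, lam_lasso = 1.0, 0.05          # illustrative penalty levels
b_dense = np.zeros(p)
b_sparse = np.zeros(p)
XtX = X.T @ X
for _ in range(50):
    # ridge step for the dense part, holding the sparse part fixed
    b_dense = np.linalg.solve(XtX + n * lam_ridge * np.eye(p),
                              X.T @ (y - X @ b_sparse))
    # lasso step for the sparse part, holding the dense part fixed
    lasso = Lasso(alpha=lam_lasso, fit_intercept=False, max_iter=10000)
    b_sparse = lasso.fit(X, y - X @ b_dense).coef_

b_lava = b_dense + b_sparse
top = np.argsort(-np.abs(b_lava))[:3]
print("largest |coefficients| at indices", top, "values", np.round(b_lava[top], 2))
```

The design point is that neither piece alone fits a sparse+dense signal well: the ridge step soaks up the many small coefficients while the lasso step captures the few large ones.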

]]>
http://www.ifs.org.uk/publications/7578 Fri, 13 Feb 2015 00:00:00 +0000
<![CDATA[Estimating the production function for human capital: results from a randomized controlled trial in Colombia]]> We examine the channels through which a randomized early childhood intervention in Colombia led to significant gains in cognitive and socio-emotional skills among a sample of disadvantaged children. We estimate production functions for cognitive and socio-emotional skills as a function of maternal skills and child's past skills, as well as material and time investments that are treated as endogenous. The effects of the program can be fully explained by increases in parental investments, which have strong effects on outcomes and are complementary to both maternal skills and child's past skills.

]]>
http://www.ifs.org.uk/publications/7574 Tue, 10 Feb 2015 00:00:00 +0000
<![CDATA[The short run elasticity of National Health Service nurses’ labour supply in Great Britain]]> The paper investigates the short run responsiveness of National Health Service (NHS) nurses’ labour supply to changes in wages of NHS nurses relative to wages in outside options available to nurses, utilising the panel data aspect of the Annual Survey of Hours and Earnings. We find the short run responsiveness of NHS nurses’ labour supply to the relative wage of NHS nurses is positive and statistically significant, albeit economically small, in regions outside the London area. In contrast, in the London region, the short run elasticity is much higher. We discuss the policy implications of these findings.

]]>
http://www.ifs.org.uk/publications/7566 Fri, 06 Feb 2015 00:00:00 +0000
<![CDATA[Monge-Kantorovich depth, quantiles, ranks and signs]]> We propose new concepts of statistical depth, multivariate quantiles, ranks and signs, based on canonical transportation maps between a distribution of interest on R^d and a reference distribution on the d-dimensional unit ball. The new depth concept, called Monge-Kantorovich depth, specializes to halfspace depth in the case of elliptical distributions, but, for more general distributions, differs from the latter in the ability of its contours to account for non-convex features of the distribution of interest. We propose empirical counterparts to the population versions of those Monge-Kantorovich depth contours, quantiles, ranks and signs, and show their consistency by establishing a uniform convergence property for empirical transport maps, which is of independent interest.
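
A minimal sketch of the empirical construction in two dimensions, with illustrative data: the sample is matched to a uniform reference sample on the unit disk by an optimal assignment under squared-Euclidean cost, the matched reference points play the role of vector ranks/signs, and depth is taken to decrease in the norm of the rank.

```python
# Sketch of empirical Monge-Kantorovich ranks in R^2: match the data to a
# uniform reference sample on the unit disk by an optimal assignment with
# squared-Euclidean cost; the assigned reference point acts as a vector rank.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(5)
n = 300
data = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)

# reference sample: uniform on the unit disk
r = np.sqrt(rng.uniform(size=n))
phi = rng.uniform(0, 2 * np.pi, size=n)
ref = np.column_stack([r * np.cos(phi), r * np.sin(phi)])

cost = cdist(data, ref, metric="sqeuclidean")      # quadratic transport cost
rows, cols = linear_sum_assignment(cost)           # optimal (Monge) matching
ranks = ref[cols]                                  # empirical MK ranks/signs

# depth can be taken as a decreasing function of the rank's norm
depth = 1.0 - np.linalg.norm(ranks, axis=1)
print("deepest observation:", np.round(data[depth.argmax()], 2))
```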

]]>
http://www.ifs.org.uk/publications/7540 Wed, 28 Jan 2015 00:00:00 +0000
<![CDATA[Fluctuations in hours of work and employment across age and gender]]> This paper documents the heterogeneity in labor market volatility across ages and gender in the United States over 1976-2014. We separate fluctuations in hours worked into fluctuations in the average number of hours per worker (the intensive margin) and fluctuations in the number of individuals at work (the extensive margin) and examine the relative importance of these two margins for each demographic group. We then compute the contribution of each demographic group to the change in aggregate hours worked over the business cycle. We discuss the implications of our findings for theories of labor market fluctuations.

]]>
http://www.ifs.org.uk/publications/7538 Tue, 27 Jan 2015 00:00:00 +0000
<![CDATA[Microeconomic models with latent variables: applications of measurement error models in empirical industrial organization and labor economics]]> This paper reviews recent developments in nonparametric identification of measurement error models and their applications in applied microeconomics, in particular, in empirical industrial organization and labor economics. Measurement error models describe mappings from a latent distribution to an observed distribution. The identification and estimation of measurement error models focus on how to obtain the latent distribution and the measurement error distribution from the observed distribution. Such a framework may be suitable for many microeconomic models with latent variables, such as models with unobserved heterogeneity or unobserved state variables and panel data models with fixed effects. Recent developments in measurement error models allow very flexible specification of the latent distribution and the measurement error distribution. These developments greatly broaden economic applications of measurement error models. This paper provides an accessible introduction of these technical results to empirical researchers so as to expand applications of measurement error models.

]]>
http://www.ifs.org.uk/publications/7537 Mon, 26 Jan 2015 00:00:00 +0000
<![CDATA[Labour supply and taxation with restricted choices]]> A model of labour supply is developed in which individuals face restrictions on hours choices. Observed hours reflect both the distribution of preferences and the distribution of offers. In this framework the choice set is limited and observed hours may not satisfy the revealed preference conditions for ‘rational’ choice. We show first that when the offer distribution is known, preferences can be identified. We then show that, where preferences are known, the offer distribution can be fully recovered. We also develop conditions for identification of both preferences and the offer distribution. We illustrate this approach in a labour supply setting with nonlinear budget constraints. The occurrence of nonlinearities in the budget constraint can directly reveal restrictions on choices. This framework is then used to study the labour supply choices of a large sample of working age mothers in the UK, accounting for nonlinearities in the tax and welfare benefit system, fixed costs of work and restrictions on hours choices.

]]>
http://www.ifs.org.uk/publications/7533 Thu, 22 Jan 2015 00:00:00 +0000
<![CDATA[Constructing full adult life-cycles from short panels]]> In this paper we discuss two alternative approaches to constructing complete adult life-cycles using data from an 18-year panel. The first of these is a splicing approach - closely related to imputation - that involves stitching together individuals observed at different ages. The second is a microsimulation approach that uses panel data to estimate transition probabilities between different states at adjacent ages and then simulates a large number of individuals with different initial values. Our aim throughout is to construct life-cycle profiles of employment, earnings and family circumstances that are representative of UK individuals born between 1945 and 1954. On balance, we find the microsimulation approach is to be preferred because it allows us to correct for observable differences across cohorts, and it is more amenable to counterfactual modelling.

]]>
http://www.ifs.org.uk/publications/7529 Fri, 16 Jan 2015 00:00:00 +0000
<![CDATA[Post-selection and post-regularization inference in linear models with many controls and instruments]]> In this note, we offer an approach to estimating structural parameters in the presence of many instruments and controls based on methods for estimating sparse high-dimensional models. We use these high-dimensional methods to select both which instruments and which control variables to use. The approach we take extends Belloni et al. (2012), which covers selection of instruments for IV models with a small number of controls, and extends Belloni, Chernozhukov and Hansen (2014), which covers selection of controls in models where the variable of interest is exogenous conditional on observables, to accommodate both a large number of controls and a large number of instruments. We illustrate the approach with a simulation and an empirical example. 
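
The sketch below shows only the exogenous-treatment, many-controls case: lasso-select controls that predict the outcome, lasso-select controls that predict the variable of interest, and refit by OLS on the union of the selected sets. Cross-validated penalties are used for simplicity in place of plug-in penalty rules, and the instrument-selection step is omitted.

```python
# Sketch of post-double-selection with many controls: select controls by lasso
# for the outcome and for the variable of interest, then refit by OLS on the
# union of the two selected sets.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 300, 200
W = rng.normal(size=(n, p))                         # many potential controls
d = W[:, 0] + 0.5 * W[:, 1] + rng.normal(size=n)    # variable of interest
y = 1.0 * d + W[:, 0] - 0.5 * W[:, 2] + rng.normal(size=n)

sel_y = np.flatnonzero(LassoCV(cv=5).fit(W, y).coef_)   # controls predicting y
sel_d = np.flatnonzero(LassoCV(cv=5).fit(W, d).coef_)   # controls predicting d
union = np.union1d(sel_y, sel_d)

Z = np.column_stack([np.ones(n), d, W[:, union]])
coef = np.linalg.lstsq(Z, y, rcond=None)[0]
print(f"post-double-selection estimate of the effect of d: {coef[1]:.3f}")
```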

Technical supporting material is available in a supplementary appendix here.

]]>
http://www.ifs.org.uk/publications/7528 Wed, 14 Jan 2015 00:00:00 +0000
<![CDATA[Bounds on treatment effects on transitions]]> This paper considers identification of treatment effects on conditional transition probabilities. We show that even under random assignment only the instantaneous average treatment effect is point identified. Because treated and control units drop out at different rates, randomization only ensures the comparability of treatment and controls at the time of randomization, so that long run average treatment effects are not point identified. Instead we derive informative bounds on these average treatment effects. Our bounds do not impose (semi)parametric restrictions, such as proportional hazards. We also explore various assumptions such as monotone treatment response, common shocks and positively correlated outcomes.

]]>
http://www.ifs.org.uk/publications/7514 Fri, 09 Jan 2015 00:00:00 +0000
<![CDATA[Nonparametric identification in panels using quantiles]]> This paper considers identification and estimation of ceteris paribus effects of continuous regressors in nonseparable panel models with time homogeneity. The effects of interest are derivatives of the average and quantile structural functions of the model. We find that these derivatives are identified with two time periods for “stayers”, i.e. for individuals with the same regressor values in two time periods. We show that the identification results carry over to models that allow location and scale time effects. We propose nonparametric series methods and a weighted bootstrap scheme to estimate and make inference on the identified effects. The bootstrap proposed allows inference for function-valued parameters such as quantile effects uniformly over a region of quantile indices and/or regressor values. An empirical application to Engel curve estimation with panel data illustrates the results.

]]>
http://www.ifs.org.uk/publications/7527 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Central limit theorems and bootstrap in high dimensions]]> In this paper, we derive central limit and bootstrap theorems for probabilities that centered high-dimensional vector sums hit rectangles and sparsely convex sets. Specifically, we derive Gaussian and bootstrap approximations for the probabilities that a root-n rescaled sample average of X_1,..., X_n lies in a set A, where X_1,..., X_n are independent random vectors in R^p and A is a rectangle or, more generally, a sparsely convex set, and we show that the approximation error converges to zero even if p = p_n -> infinity and p >> n; in particular, p can be as large as O(e^(Cn^c)) for some constants c, C > 0. The result holds uniformly over all rectangles, or more generally, sparsely convex sets, and does not require any restrictions on the correlation among components of X_i. Sparsely convex sets are sets that can be represented as intersections of many convex sets whose indicator functions depend nontrivially only on a small subset of their arguments, with rectangles being a special case.
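
A minimal sketch of the central computational object, with illustrative dimensions: the Gaussian multiplier bootstrap for the maximum coordinate of a root-n rescaled sample average, a statistic whose distribution is governed by the hitting probabilities of one-sided rectangles studied above.

```python
# Sketch: multiplier-bootstrap approximation of the distribution of
# max_j sqrt(n) * mean(X_ij), with p possibly larger than n.
import numpy as np

rng = np.random.default_rng(7)
n, p, B = 100, 500, 2000
X = rng.standard_t(df=8, size=(n, p))              # centered, heavy-ish tails

Xbar = X.mean(axis=0)
stat = np.sqrt(n) * Xbar.max()                     # max-type statistic

# multiplier bootstrap: perturb centered observations with iid N(0,1) weights
Xc = X - Xbar
boot = np.empty(B)
for b in range(B):
    e = rng.normal(size=n)
    boot[b] = (e @ Xc / np.sqrt(n)).max()

crit = np.quantile(boot, 0.95)
print(f"statistic = {stat:.2f}, bootstrap 95% critical value = {crit:.2f}")
```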

]]>
http://www.ifs.org.uk/publications/7507 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Valid post-selection inference in high-dimensional approximately sparse quantile regression models]]> This work proposes new inference methods for the estimation of a regression coefficient of interest in quantile regression models. We consider high-dimensional models where the number of regressors potentially exceeds the sample size but a subset of them suffices to construct a reasonable approximation of the unknown quantile regression function in the model. The proposed methods are protected against moderate model selection mistakes, which are often inevitable in the approximately sparse model considered here. The methods construct (implicitly or explicitly) an optimal instrument as a residual from a density-weighted projection of the regressor of interest on other regressors. Under regularity conditions, the proposed estimators of the quantile regression coefficient are asymptotically root-n normal, with variance equal to the semi-parametric efficiency bound of the partially linear quantile regression model. In addition, the performance of the technique is illustrated through Monte Carlo experiments and an empirical example dealing with risk factors in childhood malnutrition. The numerical results confirm the theoretical findings that the proposed methods should outperform the naive post-model selection methods in non-parametric settings. Moreover, the empirical results demonstrate the soundness of the proposed methods.

]]>
http://www.ifs.org.uk/publications/7526 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Inference in high dimensional panel models with an application to gun control]]> We consider estimation and inference in panel data models with additive unobserved individual specific heterogeneity in a high dimensional setting. The setting allows the number of time varying regressors to be larger than the sample size. To make informative estimation and inference feasible, we require that the overall contribution of the time varying variables after eliminating the individual specific heterogeneity can be captured by a relatively small number of the available variables whose identities are unknown. This restriction allows the problem of estimation to proceed as a variable selection problem. Importantly, we treat the individual specific heterogeneity as fixed effects which allows this heterogeneity to be related to the observed time varying variables in an unspecified way and allows that this heterogeneity may be non-zero for all individuals. Within this framework, we provide procedures that give uniformly valid inference over a fixed subset of parameters in the canonical linear fixed effects model and over coefficients on a fixed vector of endogenous variables in panel data instrumental variables models with fixed effects and many instruments. An input to developing the properties of our proposed procedures is the use of a variant of the Lasso estimator that allows for a grouped data structure where data across groups are independent and dependence within groups is unrestricted. We provide formal conditions within this structure under which the proposed Lasso variant selects a sparse model with good approximation properties. We present simulation results in support of the theoretical developments and illustrate the use of the methods in an application aimed at estimating the effect of gun prevalence on crime rates.
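
As a simplified illustration (not the paper's procedure), the sketch below removes the additive fixed effects by the within transformation and then runs a cross-validated lasso on the demeaned data; the clustered penalty loadings and the post-selection inference steps are omitted.

```python
# Sketch: lasso in a panel with additive fixed effects. Demean y and X within
# each individual (within transformation), then run the lasso on the demeaned
# data. Cluster-robust penalty loadings and inference steps are omitted.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(8)
N, T, p = 100, 10, 150
ids = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, p))
alpha = rng.normal(size=N)                         # individual fixed effects
beta = np.zeros(p); beta[:3] = [1.0, -1.0, 0.5]    # sparse coefficients
y = X @ beta + alpha[ids] + rng.normal(size=N * T)

def within(v):
    """Subtract individual-specific means (works for 1-d or 2-d arrays)."""
    out = v.astype(float).copy()
    for i in range(N):
        out[ids == i] -= out[ids == i].mean(axis=0)
    return out

y_w, X_w = within(y), within(X)
fit = LassoCV(cv=5, fit_intercept=False).fit(X_w, y_w)
print("selected variables:", np.flatnonzero(fit.coef_)[:10])
```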

]]>
http://www.ifs.org.uk/publications/7508 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Uniform post selection inference for LAD regression and other Z-estimation problems]]> We develop uniformly valid confidence regions for regression coefficients in a high-dimensional sparse median regression model with homoscedastic errors. Our methods are based on a moment equation that is immunized against non-regular estimation of the nuisance part of the median regression function by using Neyman’s orthogonalization. We establish that the resulting instrumental median regression estimator of a target regression coefficient is asymptotically normally distributed uniformly with respect to the underlying sparse model and is semiparametrically efficient. We also generalize our method to a general non-smooth Z-estimation framework with the number of target parameters p1 being possibly much larger than the sample size n. We extend Huber’s results on asymptotic normality to this setting, demonstrating uniform asymptotic normality of the proposed estimators over p1-dimensional rectangles, constructing simultaneous confidence bands on all of the p1 target parameters, and establishing asymptotic validity of the bands uniformly over underlying approximately sparse models.

]]>
http://www.ifs.org.uk/publications/7511 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Testing many moment inequalities]]> This paper considers the problem of testing many moment inequalities where the number of moment inequalities, denoted by p, is possibly much larger than the sample size n. There are a variety of economic applications where the problem of testing many moment inequalities appears; a notable example is a market structure model of Ciliberto and Tamer (2009) where p = 2^(m+1) with m being the number of firms. We consider the test statistic given by the maximum of p Studentized (or t-type) statistics, and analyze various ways to compute critical values for the test statistic. Specifically, we consider critical values based upon (i) the union bound combined with a moderate deviation inequality for self-normalized sums, (ii) the multiplier and empirical bootstraps, and (iii) two-step and three-step variants of (i) and (ii) by incorporating selection of uninformative inequalities that are far from being binding and novel selection of weakly informative inequalities that are potentially binding but do not provide first order information. We prove validity of these methods, showing that under mild conditions, they lead to tests with error in size decreasing polynomially in n while allowing for p being much larger than n; indeed p can be of order exp(n^c) for some c > 0. Importantly, all these results hold without any restriction on correlation structure between p Studentized statistics, and also hold uniformly with respect to suitably large classes of underlying distributions. Moreover, when p grows with n, we show that all of our tests are (minimax) optimal in the sense that they are uniformly consistent against alternatives whose "distance" from the null is larger than the threshold (2(log p)/n)^(1/2), while any test can only have trivial power in the worst case when the distance is smaller than the threshold. Finally, we show validity of a test based on block multiplier bootstrap in the case of dependent data under some general mixing conditions.
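
A minimal sketch of the baseline test, with an illustrative data-generating process: the statistic is the maximum of p studentized sample means and the critical value is taken from a Gaussian multiplier bootstrap of the recentered, studentized data. The moderate-deviation critical values and the two- and three-step selection refinements are omitted.

```python
# Sketch of a max-of-studentized-means test of H0: E[X_j] <= 0 for all j,
# with a multiplier-bootstrap critical value. Selection refinements omitted.
import numpy as np

rng = np.random.default_rng(9)
n, p, B, alpha = 200, 1000, 2000, 0.05
X = rng.normal(loc=0.0, scale=1.0, size=(n, p))    # H0 holds in this draw

mu = X.mean(axis=0)
sd = X.std(axis=0, ddof=1)
stat = np.sqrt(n) * (mu / sd).max()                # max studentized statistic

Xc = (X - mu) / sd                                 # recentered, studentized data
boot = np.empty(B)
for b in range(B):
    e = rng.normal(size=n)
    boot[b] = (e @ Xc / np.sqrt(n)).max()

crit = np.quantile(boot, 1 - alpha)
print(f"test statistic = {stat:.2f}, critical value = {crit:.2f}, "
      f"reject H0: {stat > crit}")
```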

]]>
http://www.ifs.org.uk/publications/7512 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Vector quantile regression]]> We propose a notion of conditional vector quantile function and a vector quantile regression. A conditional vector quantile function (CVQF) of a random vector Y, taking values in R^d given covariates Z = z, taking values in R^k, is a map u --> Q_{Y|Z}(u,z), which is monotone, in the sense of being a gradient of a convex function, and such that given that vector U follows a reference non-atomic distribution F_U, for instance uniform distribution on a unit cube in R^d, the random vector Q_{Y|Z}(U,z) has the distribution of Y conditional on Z = z. Moreover, we have a strong representation, Y = Q_{Y|Z}(U,Z) almost surely, for some version of U. The vector quantile regression (VQR) is a linear model for CVQF of Y given Z. Under correct specification, the notion produces strong representation, Y = β(U)^T f(Z), for f(Z) denoting a known set of transformations of Z, where u --> β(u)^T f(Z) is a monotone map, the gradient of a convex function, and the quantile regression coefficients u --> β(u) have the interpretations analogous to that of the standard scalar quantile regression. As f(Z) becomes a richer class of transformations of Z, the model becomes nonparametric, as in series modelling. A key property of VQR is the embedding of the classical Monge-Kantorovich's optimal transportation problem at its core as a special case. In the classical case, where Y is scalar, VQR reduces to a version of the classical QR, and CVQF reduces to the scalar conditional quantile function. Several applications to diverse problems such as multiple Engel curve estimation, and measurement of financial risk, are considered.

]]>
http://www.ifs.org.uk/publications/7503 Wed, 31 Dec 2014 00:00:00 +0000
<![CDATA[Dynamic linear panel regression models with interactive fixed effects]]> We analyze linear panel regression models with interactive fixed effects and predetermined regressors, e.g. lagged dependent variables. The first order asymptotic theory of the least squares (LS) estimator of the regression coefficients is worked out in the limit where both the cross sectional dimension and the number of time periods become large. We find that there are two sources of asymptotic bias of the LS estimator: bias due to correlation or heteroscedasticity of the idiosyncratic error term, and bias due to predetermined (as opposed to strictly exogenous) regressors. A bias-corrected least squares estimator is provided. We also present bias-corrected versions of the three classical test statistics (Wald, LR and LM test) and show that their asymptotic distribution is a chi-square distribution. Monte Carlo simulations show that the bias corrections of the LS estimator and of the test statistics also work well for finite sample sizes.
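
For orientation, the sketch below runs a basic iterated least squares scheme for a single regressor: alternate between extracting the factor structure from the residuals by a rank-R singular value decomposition and updating the slope coefficient by least squares. The bias corrections that are the paper's main contribution are not implemented, and the simulated design is illustrative.

```python
# Sketch: iterated least squares for Y_it = beta * X_it + lambda_i' f_t + e_it.
# Alternate between extracting the factors from the residuals by a rank-R SVD
# and updating beta given the fitted common component. Bias corrections omitted.
import numpy as np

rng = np.random.default_rng(10)
N, T, R = 100, 50, 2                               # R = number of factors
F = rng.normal(size=(T, R))
L = rng.normal(size=(N, R))
X = rng.normal(size=(N, T)) + 0.5 * (L @ F.T)      # regressor correlated with factors
Y = 1.0 * X + L @ F.T + rng.normal(size=(N, T))

beta = 0.0
for _ in range(100):
    # factor step: rank-R approximation of the residual matrix via SVD
    U, s, Vt = np.linalg.svd(Y - beta * X, full_matrices=False)
    common = (U[:, :R] * s[:R]) @ Vt[:R, :]
    # beta step: least squares on the deviations from the common component
    beta_new = np.sum(X * (Y - common)) / np.sum(X * X)
    if abs(beta_new - beta) < 1e-8:
        beta = beta_new
        break
    beta = beta_new

print(f"estimated beta: {beta:.3f} (true value 1.0 in this simulation)")
```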

 

Supplementary material for this paper is available here.

]]>
http://www.ifs.org.uk/publications/7500 Mon, 22 Dec 2014 00:00:00 +0000
<![CDATA[Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions]]> We show that spline and wavelet series regression estimators for weakly dependent regressors attain the optimal uniform (i.e. sup-norm) convergence rate (n/log n)^(-p/(2p+d)) of Stone (1982), where d is the number of regressors and p is the smoothness of the regression function. The optimal rate is achieved even for heavy-tailed martingale difference errors with finite (2 + d/p)th absolute moment for d/p < 2. We also establish the asymptotic normality of t statistics for possibly nonlinear, irregular functionals of the conditional mean function under weak conditions. The results are proved by deriving a new exponential inequality for sums of weakly dependent random matrices, which is of independent interest.

]]>
http://www.ifs.org.uk/publications/7498 Mon, 22 Dec 2014 00:00:00 +0000
<![CDATA[Empirical methods for networks data: social effects, network formation and measurement error]]> In many contexts we may be interested in understanding whether direct connections between agents, such as declared friendships in a classroom or family links in a rural village, affect their outcomes. In this paper we review the literature studying econometric methods for the analysis of social networks. We begin by providing a common framework for models of social effects, a class that includes the `linear-in-means' local average model, the local aggregate model, and models where network statistics affect outcomes. We discuss identification of these models using both observational and experimental/quasi-experimental data. We then discuss models of network formation, drawing on a range of literatures to cover purely predictive models, reduced form models, and structural models, including those with a strategic element. Finally we discuss how one might collect data on networks, and the measurement error issues caused by sampling of networks, as well as measurement error more broadly.

]]>
http://www.ifs.org.uk/publications/7515 Fri, 19 Dec 2014 00:00:00 +0000
<![CDATA[Inference about Non-Identified SVARs]]> We propose a method for conducting inference on impulse responses in structural vector autoregressions (SVARs) when the impulse response is not point identified because the number of equality restrictions one can credibly impose is not sufficient for point identification and/or one imposes sign restrictions. We proceed in three steps. We first define the object of interest as the identified set for a given impulse response at a given horizon and discuss how inference is simple when the identified set is convex, as one can limit attention to the set’s upper and lower bounds. We then provide easily verifiable conditions on the type of equality and sign restrictions that guarantee convexity. These cover most cases of practical interest, with exceptions including sign restrictions on multiple shocks and equality restrictions that make the impulse response locally, but not globally, identified. Second, we show how to conduct inference on the identified set. We adopt a robust Bayes approach that considers the class of all possible priors for the non-identified aspects of the model and delivers a class of associated posteriors. We summarize the posterior class by reporting the "posterior mean bounds", which can be interpreted as an estimator of the identified set. We also consider a "robustified credible region" which is a measure of the posterior uncertainty about the identified set. The two intervals can be obtained using a computationally convenient numerical procedure. Third, we show that the posterior bounds converge asymptotically to the identified set if the set is convex. If the identified set is not convex, our posterior bounds can be interpreted as an estimator of the convex hull of the identified set. Finally, a useful diagnostic tool delivered by our procedure is the posterior belief about the plausibility of the imposed identifying restrictions.

]]>
http://www.ifs.org.uk/publications/7463 Wed, 26 Nov 2014 00:00:00 +0000
<![CDATA[The Quantile Performance of Statistical Treatment Rules Using Hypothesis Tests to Allocate a Population to Two Treatments]]> This paper modifies the Wald development of statistical decision theory to offer new perspective on the performance of certain statistical treatment rules.  We study the quantile performance of test rules, ones that use the outcomes of hypothesis tests to allocate a population to two treatments.  Let λ denote the quantile used to evaluate performance.  Define a test rule to be λ-quantile optimal if it maximizes λ-quantile welfare in every state of nature.  We show that a test rule is λ-quantile optimal if and only if its error probabilities are less than λ in all states where the two treatments yield different welfare.  We give conditions under which λ-quantile optimal test rules do and do not exist.  A sufficient condition for existence of optimal rules is that the state space be finite and the data enable sufficiently precise estimation of the true state.  Optimal rules do not exist when the state space is connected and other regularity conditions hold, but near-optimal rules may exist.  These nuanced findings differ sharply from measurement of mean performance, as mean optimal test rules generically do not exist.  We present further analysis that holds when the data are real-valued and generated by a sampling distribution which satisfies the monotone-likelihood ratio (MLR) property with respect to the average treatment effect.  We use the MLR property to characterize the stochastic-dominance admissibility of STRs when the data have a continuous distribution and then generate findings on the quantile admissibility of test rules.

]]>
http://www.ifs.org.uk/publications/7452 Thu, 20 Nov 2014 00:00:00 +0000
<![CDATA[Challenges to promoting social inclusion of the extreme poor: evidence from a large scale experiment in Colombia]]> We evaluate the large scale pilot of an innovative and major welfare intervention in Colombia, which combines homes visits by trained social workers to households in extreme poverty with preferential access to social programs. We use a randomized control trial and a very rich dataset collected as part of the evaluation to identify program impacts on the knowledge and take-up of social programs and the labor supply of targeted households. We find no consistent impact of the program on these outcomes, possibly because the way the pilot was implemented resulted in very light treatment in terms of home visits. Importantly, administrative data indicates that the program has been rolled out nationally in a very similar fashion, suggesting that this major national program is likely to fail in making a significant contribution to reducing extreme poverty. We suggest that the program should undergo substantial reforms, which in turn should be evaluated.

]]>
http://www.ifs.org.uk/publications/7446 Fri, 14 Nov 2014 00:00:00 +0000
<![CDATA[Credit Counseling: A Substitute for Consumer Financial Literacy?]]> Is financial literacy a substitute or complement for financial advice? In this paper we analyze the decision by consumers to seek financial advice in the form of credit counseling concerning their credit and debt. Credit counseling is an important component of the consumer credit sector for consumers facing debt problems. We combine instrumental variable approaches to account for the endogeneity of an individual’s financial situation to financial literacy, and the endogeneity of financial literacy to exposure to credit counseling. Our results show credit counseling substitutes for financial literacy. Individuals with better financial literacy are 60% less likely to use credit counseling. These results suggest credit counseling provides a safety net for poor financial literacy.

]]>
http://www.ifs.org.uk/publications/7431 Wed, 05 Nov 2014 00:00:00 +0000
<![CDATA[Unobserved heterogeneity in income dynamics: an empirical Bayes perspective]]> Empirical Bayes methods for Gaussian compound decision problems involving longitudinal data are considered. The new convex optimization formulation of the nonparametric (Kiefer-Wolfowitz) maximum likelihood estimator for mixture models is employed to construct nonparametric Bayes rules for compound decisions. The methods are first illustrated with some simulation examples and then with an application to models of income dynamics. Using PSID data we estimate a simple dynamic model of earnings that incorporates bivariate heterogeneity in intercept and variance of the innovation process. Profile likelihood is employed to estimate an AR(1) parameter controlling the persistence of the innovations. We find that persistence is relatively modest, ρ ≈ 0.48, when we permit heterogeneity in variances. Evidence of negative dependence between individual intercepts and variances is revealed by the nonparametric estimation of the mixing distribution, and has important consequences for forecasting future income trajectories.
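
A rough sketch of the mixture-deconvolution step, with made-up data: the mixing distribution of a Gaussian location mixture is approximated on a fixed grid and fitted by EM fixed-point iterations rather than by the convex (interior point) formulation used in the paper, and the fitted mixture is then used for posterior-mean denoising.

```python
# Sketch: Kiefer-Wolfowitz-type NPMLE of a mixing distribution for a Gaussian
# location mixture, approximated on a fixed grid and fitted by EM iterations
# (the paper uses a convex-optimization formulation instead).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
n = 2000
theta = rng.choice([-2.0, 0.0, 2.0], p=[0.3, 0.4, 0.3], size=n)  # latent effects
y = theta + rng.normal(size=n)                                   # noisy observations

grid = np.linspace(-4, 4, 81)                   # support points for the mixing dist.
w = np.full(grid.size, 1.0 / grid.size)         # initial mixing weights
lik = norm.pdf(y[:, None], loc=grid[None, :])   # n x grid matrix of likelihoods

for _ in range(500):                            # EM fixed-point updates
    post = lik * w                              # unnormalized posteriors
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)

# posterior-mean (empirical Bayes) denoising of each observation
post = lik * w
post /= post.sum(axis=1, keepdims=True)
theta_hat = (post * grid).sum(axis=1)
print("estimated mass near -2, 0, 2:",
      np.round([w[np.abs(grid - c) < 0.5].sum() for c in (-2, 0, 2)], 2))
print("corr(theta_hat, theta):", round(np.corrcoef(theta_hat, theta)[0, 1], 2))
```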

]]>
http://www.ifs.org.uk/publications/7427 Tue, 04 Nov 2014 00:00:00 +0000
<![CDATA[Socio-economic differences in university outcomes in the UK: drop-out, degree completion and degree class]]> There are large socio-economic gaps in higher education participation. But returns to education in the UK derive largely from the attainment of qualifications rather than years of study, and additionally vary by institution, subject and degree class for graduates. This paper provides new evidence on what happens to young people from different backgrounds once they arrive at university, exploring socio-economic differences in drop-out, degree completion and degree class. We find that the large raw differences in university outcomes between individuals from different socio-economic backgrounds can largely be explained by the fact that they arrive at university with very different levels of human capital. Comparing individuals on the same course makes relatively little difference to the remaining socio-economic gaps in university outcomes, with those from higher socio-economic backgrounds still 3.4 percentage points less likely to drop-out, 5.3 percentage points more likely to graduate and 3.7 percentage points more likely to graduate with a first or 2:1 than those from lower socio-economic backgrounds. These findings are in stark contrast to similar analysis by school characteristics (e.g. Crawford, 2014), which shows that, amongst students with the same grades on entry to university, those from worse-performing schools are less likely to drop-out, more likely to complete their degree and more likely to obtain a first or 2.1 than those from better-performing schools. This suggests that it is more challenging for universities interested in using contextual data to inform their admissions policies to predict those with high potential based on socio-economic background than based on school characteristics.

]]>
http://www.ifs.org.uk/publications/7420 Tue, 04 Nov 2014 00:00:00 +0000
<![CDATA[Heterogeneity in graduate earnings by socio-economic background]]> Education is often regarded as a route to social mobility. For this to be the case, however, the link between family background and adult outcomes must be broken (or at least reduced) once we take account of an individual’s education history. This paper focuses on individuals who have completed university and provides new evidence on differences in graduates’ earnings by socio-economic background, with a particular focus on whether they attended a private school. We use data on the population of individuals graduating from UK universities in 2006-07 and find that those who attended private schools earn around 7% more per year, on average, than state school students some 3.5 years after graduation, even when comparing otherwise similar graduates and allowing for differences in degree subject, university attended and degree classification. This work complements Macmillan et al. (2013), who found that graduates from private schools were more likely to enter “high status” occupations. However, our results show that earnings differences persist even within occupations, with graduates who attended private schools earning 6% more than their state school compatriots working in the same occupations. This is equivalent to around £1,500 extra per year in our data. Together, these results suggest that there is a pressing need to understand why private schooling confers such an advantage in the labour market, even amongst similarly achieving graduates, and why higher education does not appear to be the leveller it was hoped to be. 

]]>
http://www.ifs.org.uk/publications/7419 Thu, 30 Oct 2014 00:00:00 +0000
<![CDATA[Revealed preference and consumption behaviour at retirement]]> This paper sets out revealed preference tests for different models of consumption behaviour over retirement that we applied to a Spanish consumption panel dataset. We reject the perfect foresight model both with separable preferences and allowing for preference change. The first order conditions for the life-cycle model allowing for uncertainty do not provide very strong restrictions on possible choices. In fact they are no stronger than those implied by the most basic revealed preference requirement: GARP. We then go on to investigate the patterns of deviations from a perfectly smoothed marginal utility of wealth and ask whether they fit the predictions of the life-cycle model. We find a tendency of these to increase over time, suggesting consumption falls more than we'd expect. After considering various possible explanations, we settle on non-rational behaviour as the most plausible.

]]>
http://www.ifs.org.uk/publications/7399 Mon, 13 Oct 2014 00:00:00 +0000
<![CDATA[The impact of family composition on educational achievement]]> Parents preferring sons tend to go on to have more children until one or more boys are born, and to concentrate investment in boys for a given sibsize. Therefore, having a brother may affect child outcomes in two ways: indirectly, by decreasing sibsize, and directly, where sibsize remains constant. We develop an identification strategy that allows us to separate these two effects. We then apply this to capture the heterogeneous effects of male siblings in both direct and indirect channels, using 0.8 million Taiwanese first-borns. Our empirical evidence indicates that neither effect is important in explaining first-born boys' education levels. In contrast, both effects for first-born girls are evident but go in opposite directions, resulting in a near-zero total effect, which has previously been used as a measure of gender bias. These results offer new evidence of sibling rivalry and gender bias in family settings that has not been detected in the literature.

]]>
http://www.ifs.org.uk/publications/7392 Tue, 07 Oct 2014 00:00:00 +0000
<![CDATA[What is a minimum wage for? Empirical results and theories of justice]]> I undertake a political economy exercise of a type described in John Rawls' A Theory of Justice; namely, one in which economic institutions are judged by how well they match the key principles in theories of distributive justice. My main contention is that such an exercise is integrally related not only to economics in general but to empirical economics in particular. I argue that most standard theories of justice place a large weight on self and social respect and that such respect has a lot to do with the position a person holds in the productive process - their wage and employment outcomes. That, in turn, means that assessments of justice in the real world hinge critically on how labour markets actually function in assigning wages and employment. The answers to these questions are ultimately empirical. I explore these ideas by examining one particular institution (the minimum wage) in relation to a set of the most prominent recent theories of distributive justice. This exercise leads to a different emphasis on what minimum wage related outcomes need study, and to a claim that minimum wage setting is related to standards of fairness.

]]>
http://www.ifs.org.uk/publications/7387 Tue, 07 Oct 2014 00:00:00 +0000
<![CDATA[House prices, wealth effects and labor supply]]> We examine the impact of housing wealth on labor supply decisions using data on exogenous local variation in house prices merged into household panel data for Britain. Our estimates are conditioned on variations in local labor demand and income expectations as these may co-determine housing wealth and labor supply. We use renters as a control group and test for the potential endogeneity of tenure and location. We find significant housing wealth effects on labor supply among young married / co-habiting female owners and older male owners, consistent with leisure being a normal good. The size of these effects is economically important. Our estimates imply that housing wealth has a stronger effect than local labor market conditions on participation decisions for these workers.

]]>
http://www.ifs.org.uk/publications/7388 Tue, 07 Oct 2014 00:00:00 +0000
<![CDATA[Household consumption when marriage is stable]]> We develop a novel framework to analyze the structural implications of the marriage market for household consumption. We define a revealed preference characterization of efficient household consumption when the marriage is stable. Stability means that the marriage matching is individually rational and has no blocking pairs. We characterize stable marriage with intrahousehold (consumption) transfers but without assuming transferable utility. We show that our revealed preference characterization generates testable conditions even with a single consumption observation per household and heterogeneous individual preferences across households. The characterization also allows for identifying the intrahousehold decision structure (including the sharing rule) under the same minimalistic assumptions. An application to Dutch household data demonstrates the usefulness of our theoretical results. We find that the female gets a higher income share when her relative wage increases, which we can give a structural interpretation in terms of outside options from marriage that vary with individual wages.

]]>
http://www.ifs.org.uk/publications/7389 Tue, 07 Oct 2014 00:00:00 +0000
<![CDATA[Optimal tax progressivity: an analytical framework]]> What shapes the optimal degree of progressivity of the tax and transfer system? On the one hand, a progressive tax system can counteract inequality in initial conditions and substitute for imperfect private insurance against idiosyncratic earnings risk. At the same time, progressivity reduces incentives to work and to invest in skills, and aggravates the externality associated with valued public expenditures. We develop a tractable equilibrium model that features all of these trade-offs. The analytical expressions we derive for social welfare deliver a transparent understanding of how preferences, technology, and market structure parameters influence the optimal degree of progressivity. A calibration for the U.S. economy indicates that endogenous skill investment, flexible labor supply, and the externality linked to valued government purchases play quantitatively similar roles in limiting desired progressivity.

]]>
http://www.ifs.org.uk/publications/7390 Tue, 07 Oct 2014 00:00:00 +0000
<![CDATA[Individual Heterogeneity and Average Welfare]]> Individual heterogeneity is an important source of variation in demand. Allowing for general heterogeneity is needed for correct welfare comparisons. We consider general heterogeneous demand where preferences and linear budget sets are statistically independent. Only the marginal distribution of demand for each price and income is identified from cross-section data where only one price and income is observed for each individual. Thus, objects that depend on varying price and/or income for an individual are not generally identified, including average exact consumer surplus. We use bounds on income effects to derive relatively simple bounds on the average surplus, including for discrete/continuous choice. We also sketch an approach to bounding surplus that does not use income effect bounds. We apply the results to gasoline demand. We find tight bounds for average surplus in this application but wider bounds for average deadweight loss.

]]>
http://www.ifs.org.uk/publications/7386 Mon, 06 Oct 2014 00:00:00 +0000
<![CDATA[Economic theory and forecasting: lessons from the literature]]> Does economic theory help in forecasting key macroeconomic variables? This article aims to provide some insight into the question by drawing lessons from the literature. The definition of "economic theory" includes a broad range of examples, such as accounting identities, disaggregation and spatial restrictions when forecasting aggregate variables, cointegration and forecasting with Dynamic Stochastic General Equilibrium (DSGE) models. We group the lessons into three themes. The first discusses the importance of using the correct econometric tools when answering the question. The second presents examples of theory-based forecasting that have not proven useful, such as theory-driven variable selection and some popular DSGE models. The third set of lessons discusses types of theoretical restrictions that have shown some usefulness in forecasting, such as accounting identities, disaggregation and spatial restrictions, and cointegrating relationships. We conclude by suggesting that economic theory might help in overcoming the widespread instability that affects the forecasting performance of econometric models by guiding the search for stable relationships that could be usefully exploited for forecasting.

]]>
http://www.ifs.org.uk/publications/7381 Wed, 24 Sep 2014 00:00:00 +0000
<![CDATA[Recombinant innovation and the boundaries of the firm]]> There is considerable interest in understanding how important market frictions are in stifling the transmission of ideas from one firm to another. Although the theoretical literature emphasizes the importance of these frictions, direct empirical evidence on them is limited. We use comprehensive patent data from the European Patent Office and a multiple spells duration model to provide estimates that suggest that they are substantial. It is around 30% more costly to successfully discover and utilize new ideas created in another firm than in your own. This compares to the increased costs of accessing new ideas across national borders of around 5%, and across technologies of around 20%. These results point towards substantial imperfections in the market for technology.

]]>
http://www.ifs.org.uk/publications/7372 Tue, 16 Sep 2014 00:00:00 +0000
<![CDATA[Sieve Wald and QLR Inferences on Semi/nonparametric Conditional Moment Models]]> This paper considers inference on functionals of semi/nonparametric conditional moment restrictions with possibly nonsmooth generalized residuals, which include all of the (nonlinear) nonparametric instrumental variables (IV) models as special cases. These models are often ill-posed and hence it is difficult to verify whether a (possibly nonlinear) functional is root-n estimable or not. We provide computationally simple, unified inference procedures that are asymptotically valid regardless of whether a functional is root-n estimable or not. We establish the following new useful results: (1) the asymptotic normality of a plug-in penalized sieve minimum distance (PSMD) estimator of a (possibly nonlinear) functional; (2) the consistency of simple sieve variance estimators for the plug-in PSMD estimator, and hence the asymptotic chi-square distribution of the sieve Wald statistic; (3) the asymptotic chi-square distribution of an optimally weighted sieve quasi likelihood ratio (QLR) test under the null hypothesis; (4) the asymptotic tight distribution of a non-optimally weighted sieve QLR statistic under the null; (5) the consistency of generalized residual bootstrap sieve Wald and QLR tests; (6) local power properties of sieve Wald and QLR tests and of their bootstrap versions; (7) asymptotic properties of sieve Wald and SQLR for functionals of increasing dimension. Simulation studies and an empirical illustration of a nonparametric quantile IV regression are presented.

]]>
http://www.ifs.org.uk/publications/7369 Fri, 12 Sep 2014 00:00:00 +0000
<![CDATA[Cash and Pensions: Have the elderly in England saved optimally for retirement?]]> Using a model where households can save in either a safe asset or in an illiquid, tax-advantaged pension, we assess the extent to which those who recently reached the state pension age in the UK have saved optimally for retirement. The policy environment specified closely matches that prevailing in the UK. Using the model and administrative data linked with survey data from the English Longitudinal Study of Ageing, an optimal level of wealth is calculated for each household. This is compared to the levels of wealth observed in the data. Our results show that, for those born in the 1940s, the vast majority of households have wealth levels far greater than necessary to maintain their living standards into and through retirement.

]]>
http://www.ifs.org.uk/publications/7357 Tue, 09 Sep 2014 00:00:00 +0000
<![CDATA[Retirement sorted? The adequacy and optimality of wealth among the near-retired]]> Much of the focus of the UK pensions policy debate over the past decade has been on the adequacy (or otherwise) of private retirement saving. In this paper, we present the first assessment of the optimality of the retirement resources of English couple households born in the 1940s. Here, ‘optimal’ wealth holdings are those that allow households to enjoy the same level of living standards in both working life and retirement. We use a life-cycle model of consumption and saving to calculate this level of wealth, and compare that with how much wealth households are observed to hold. We find that the majority of households hold more wealth than our model suggests is optimal and that this would still be true even if housing wealth were excluded from observed wealth holdings. A comparison of this approach with the replacement rate approach commonly used to assess the adequacy of households’ retirement resources suggests that using a simple replacement rate benchmark could give a misleading picture of households’ preparedness for retirement as it cannot capture the vast heterogeneity in households’ circumstances.

 

This paper will be presented at the 'Are you prepared for retirement?' conference this afternoon. 

]]>
http://www.ifs.org.uk/publications/7358 Tue, 09 Sep 2014 00:00:00 +0000
<![CDATA[Nonparametric identification of positive eigenfunctions]]> Important features of certain economic models may be revealed by studying positive eigenfunctions of appropriately chosen linear operators. Examples include long-run risk-return relationships in dynamic asset pricing models and components of marginal utility in external habit formation models. This paper provides identification conditions for positive eigenfunctions in nonparametric models. Identification is achieved if the operator satisfies two mild positivity conditions and a power compactness condition. Both existence and identification are achieved under a further non-degeneracy condition. The general results are applied to obtain new identification conditions for external habit formation models and for positive eigenfunctions of pricing operators in dynamic asset pricing models.
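To fix ideas, here is a minimal numerical sketch of the object being identified: once a positive operator is discretized to a matrix on a finite state space, its positive eigenfunction is the dominant (Perron) eigenvector, which power iteration recovers. The toy kernel, the grid and the stopping rule below are our own illustrative assumptions, not part of the paper's identification argument.

```python
import numpy as np

def positive_eigenfunction(K, tol=1e-12, max_iter=10_000):
    """Power iteration for the dominant eigenvalue/eigenvector of a
    positive matrix K (a discretized positive operator). Sketch only."""
    phi = np.ones(K.shape[0])
    phi /= np.linalg.norm(phi)
    for _ in range(max_iter):
        phi_new = K @ phi
        phi_new /= np.linalg.norm(phi_new)
        if np.linalg.norm(phi_new - phi) < tol:
            phi = phi_new
            break
        phi = phi_new
    rho = phi @ (K @ phi)          # Rayleigh quotient (phi has unit length)
    return rho, phi

# toy strictly positive 'pricing' kernel on a three-point state space
K = np.array([[0.50, 0.30, 0.20],
              [0.25, 0.50, 0.25],
              [0.20, 0.30, 0.50]])
rho, phi = positive_eigenfunction(K)
print(rho, phi)   # dominant eigenvalue and strictly positive eigenvector
```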

]]>
http://www.ifs.org.uk/publications/7352 Thu, 04 Sep 2014 00:00:00 +0000
<![CDATA[Inference in Ordered Response Games with Complete Information]]> We study econometric models of complete information games with ordered action spaces, such as the number of store fronts operated in a market by a firm, or the daily number of flights on a city-pair offered by an airline. The model generalizes single agent models such as ordered probit and logit to a simultaneous equations model of ordered response, allowing for multiple equilibria and set identification. We characterize identified sets for model parameters under mild shape restrictions on agents' payoff functions. We then propose a novel inference method for a parametric version of our model based on a test statistic that embeds conditional moment inequalities implied by equilibrium behavior. Using maximal inequalities for U-processes, we show that an asymptotically valid confidence set is attained by employing an easy to compute fixed critical value, namely the appropriate quantile of a chi-square random variable. We apply our method to study capacity decisions measured as the number of stores operated by Lowe's and Home Depot in geographic markets. We demonstrate how our confidence sets for model parameters can be used to perform inference on other quantities of economic interest, such as the probability that any given outcome is an equilibrium and the propensity with which any particular outcome is selected when it is one of multiple equilibria, and we perform a counterfactual analysis of store configurations under both collusive and monopolistic regimes.

]]>
http://www.ifs.org.uk/publications/7351 Thu, 04 Sep 2014 00:00:00 +0000
<![CDATA[A contribution to the Reinhart and Rogoff debate: not 90 percent but maybe 30 percent]]> Using the Reinhart-Rogoff dataset, we find a debt threshold not around 90 percent but around 30 percent, above which the median real GDP growth falls abruptly. Our work is the first to formally test for threshold effects in the relationship between public debt and median real GDP growth. The null hypothesis of no threshold effect is rejected at the 5 percent significance level for most cases. While we find no evidence of a threshold around 90 percent, our findings suggest that the debt threshold for economic growth may exist around a relatively small debt-to-GDP ratio of 30 percent. Empirical results are more robust with the postwar sample than the long sample that extends back before World War II.

]]>
http://www.ifs.org.uk/publications/7370 Mon, 01 Sep 2014 00:00:00 +0000
<![CDATA[Linear regression for panel with unknown number of factors as interactive fixed effects]]> In this paper we study the least squares (LS) estimator in a linear panel regression model with unknown number of factors appearing as interactive fixed effects. Assuming that the number of factors used in estimation is larger than the true number of factors in the data we establish the limiting distribution of the LS estimator for the regression coefficients, as the number of time periods and the number of cross-sectional units jointly go to infinity. The main result of the paper is that under certain assumptions the limiting distribution of the LS estimator is independent of the number of factors used in the estimation, as long as this number is not underestimated. The important practical implication of this result is that for inference on the regression coefficients one does not necessarily need to estimate the number of interactive fixed effects consistently.
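A minimal sketch of the kind of estimator being studied may help: the LS estimator with interactive fixed effects can be computed by alternating between OLS for the regression coefficients and a principal-components extraction of the factors from the residuals. The number of factors R, the simulated data and the convergence rule below are our own illustrative assumptions; this is a schematic implementation, not the authors' code.

```python
import numpy as np

def ls_interactive_fe(Y, X, R, n_iter=500, tol=1e-8):
    """Alternate between OLS on defactored data and PCA factor extraction.
    Y: (N, T) outcomes; X: (K, N, T) regressors; R: number of factors used
    in estimation (may exceed the true number). Illustrative sketch only."""
    N, T = Y.shape
    K = X.shape[0]
    Xmat = X.reshape(K, -1).T                          # (N*T, K) design
    beta = np.linalg.lstsq(Xmat, Y.ravel(), rcond=None)[0]
    for _ in range(n_iter):
        resid = Y - np.tensordot(beta, X, axes=1)      # (N, T) residual matrix
        U, s, Vt = np.linalg.svd(resid, full_matrices=False)
        common = (U[:, :R] * s[:R]) @ Vt[:R, :]        # R leading factors: lambda_i' f_t
        beta_new = np.linalg.lstsq(Xmat, (Y - common).ravel(), rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# usage sketch: one regressor, two true factors, estimated with R = 4 factors
rng = np.random.default_rng(0)
N, T = 100, 50
F = rng.normal(size=(T, 2)); L = rng.normal(size=(N, 2))
X = rng.normal(size=(1, N, T)) + 0.5 * (L @ F.T)       # regressor correlated with factors
Y = 1.5 * X[0] + L @ F.T + rng.normal(size=(N, T))
print(ls_interactive_fe(Y, X, R=4))                    # should be close to 1.5
```

The usage sketch deliberately sets R larger than the true number of factors, in line with the paper's message that overestimating the number of factors need not distort inference on the regression coefficients.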

Supplementary material for this paper is available here.

 

]]>
http://www.ifs.org.uk/publications/7336 Thu, 21 Aug 2014 00:00:00 +0000
<![CDATA[A Test for Instrument Validity]]> This paper develops a specification test for instrument validity in the heterogeneous treatment effect model with a binary treatment and a discrete instrument. The strongest testable implication for instrument validity is given by the condition for non-negativity of point-identifiable compliers’ outcome densities. Our specification test infers this testable implication using a variance-weighted Kolmogorov-Smirnov test statistic. Implementation of the proposed test does not require smoothing parameters, even though the testable implications involve non-parametric densities. The test can be applied to both discrete and continuous outcome cases, and an extension of the test to settings with conditioning covariates is provided.

]]>
http://www.ifs.org.uk/publications/7332 Tue, 19 Aug 2014 00:00:00 +0000
<![CDATA[Program evaluation with high-dimensional data]]> In this paper, we consider estimation of general modern moment-condition problems in econometrics in a data-rich environment where there may be many more control variables available than there are observations. The framework we consider allows for a continuum of target parameters and for Lasso-type or Post-Lasso type methods to be used as estimators of a continuum of high-dimensional nuisance functions. As an important leading example of this environment, we first provide detailed results on estimation and inference for relevant treatment effects, such as local average and quantile treatment effects. The setting we work in is designed expressly to handle many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and possibly function-valued outcomes. To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the control variables and the outcome, treatment status, and instrument status can be captured up to a small approximation error by a small number of the control variables whose identities are unknown to the researcher. This condition permits estimation and inference to proceed after data-driven selection of control variables. We provide conditions under which post-selection inference is uniformly valid across a wide range of models and show that a key condition underlying the uniform validity of post-selection inference allowing for imperfect model selection is the use of orthogonal moment conditions. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) participation on accumulated assets.
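To make the role of orthogonal moments concrete, here is a simplified partialling-out sketch for a scalar treatment effect with many controls, in the spirit of the approach described above. It assumes an exogenous treatment, uses cross-validated Lasso from scikit-learn for both nuisance regressions, and omits refinements such as sample splitting, so it should be read as an illustration rather than the authors' estimator.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def partialling_out_effect(y, d, X):
    """Orthogonalized (partialling-out) estimate of the effect of d on y
    with high-dimensional controls X. Illustrative sketch: exogenous
    treatment, no sample splitting, Lasso penalties chosen by CV."""
    ry = y - LassoCV(cv=5).fit(X, y).predict(X)   # residualize outcome on controls
    rd = d - LassoCV(cv=5).fit(X, d).predict(X)   # residualize treatment on controls
    theta = np.sum(rd * ry) / np.sum(rd * rd)     # residual-on-residual slope
    eps = ry - theta * rd
    se = np.sqrt(np.mean(rd**2 * eps**2)) / np.mean(rd**2) / np.sqrt(len(y))
    return theta, se
```

Because the moment condition depends on the control-variable regressions only through residuals, small selection mistakes in the Lasso steps have only a second-order effect on the estimate, which is the sense in which the moment is orthogonal.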

]]>
http://www.ifs.org.uk/publications/7331 Thu, 14 Aug 2014 00:00:00 +0000
<![CDATA[The redistribution and insurance value of welfare reform]]> Relatively little is known about the roles that taxes and transfers play in redistributing resources and providing insurance across individuals and across the lifecycle. We embed these alternative roles in a lifecycle model, allowing us to demonstrate what the tax and transfer system achieves from a lifecycle perspective and why it is valuable. We undertake a five-way decomposition of net transfers into a giveaway term and terms corresponding to between- and within-individual redistribution and between- and within-individual insurance. These components are distinguished from the perspective of the start of working life, and we consider both the magnitude of net transfers involved and the associated welfare values. Our focus is on females and we also highlight how behavioural responses affect the results. Analysis is conducted for the 2015 UK tax and transfer system relative to a flat-rate baseline, showing what value is provided by the complex tax and welfare entitlement rules in a modern economy. We also consider what is achieved by two important UK benefit reforms: the working families' tax credit (WFTC) reform of 1999 and the universal credit (UC) reform that began in 2013. Our main conclusions are that insurance against wage and family composition shocks is substantial and highly valued by individuals. Within-individual redistribution (i.e. across periods of life) is generally of little value even in the presence of strict borrowing constraints. Behavioural responses tend to increase the size of reform giveaways at the expense of the other components.

]]>
http://www.ifs.org.uk/publications/7329 Tue, 12 Aug 2014 00:00:00 +0000
<![CDATA[From Me to You? How the UK State Pension System Redistributes]]> The redistributive objectives of the UK state pension system have often been somewhat ambiguous, and have changed over time as different governments have come and gone. In this paper, we use detailed data on households’ histories of employment, earnings and contributions to the National Insurance (NI) system to examine the degree of intragenerational redistribution achieved by the UK state pension system for the cohort born in the 1930s. We also estimate what redistribution could have been achieved by alternative stylised state pension systems, which approximate the steady-state version of some of the main reforms that have been implemented in the UK over the last 40 years. We find that the majority of state pension spending under all the systems we consider reflects a transfer of money across individuals’ lifetimes, rather than between different individuals in the cohort. Comparisons between the different state pension systems, in terms of the extent of redistribution they imply, depend crucially on the stance taken as to whether or not individuals in couples pool their resources. 

These findings will be presented at a briefing on 9 September, alongside several other pieces of work which shed light on how financial preparedness for retirement differs across cohorts and important differences within cohorts.

]]>
http://www.ifs.org.uk/publications/7324 Wed, 06 Aug 2014 00:00:00 +0000
<![CDATA[Labour supply effects of increasing the female state pension age in the UK from age 60 to 62]]> In a previous study we examined the impact on employment of increasing the state pension age for women from age 60 to 61 (Cribb, Emmerson and Tetlow, 2013). This short paper incorporates more recent data, now available up to March 2014, which allows us to study the impact on employment over the period when the female state pension age rose to age 62. Using the same difference-in-differences methodology as before, we find that women’s employment rates at ages 60 to 61 were increased by 5.9 percentage points as a result of the state pension age increasing from age 60 to age 62 between April 2010 and March 2014. We find no statistically significant evidence of a different impact on employment between April 2010 and March 2012 (when the state pension age rose from age 60 to 61) and between April 2012 and March 2014 (when it rose from age 61 to 62). The more recent data boost our sample size, allowing us to estimate the impact of the reform with greater precision. However, we continue to find little statistically significant evidence of differences in response among women with different characteristics. The one exception we find is that the rise in the state pension age increases the employment rate of single women by 10.1 percentage points, which is statistically significantly greater (at the 10% level) than the 4.4 percentage point increase we find for women in couples.

]]>
http://www.ifs.org.uk/publications/7323 Thu, 31 Jul 2014 00:00:00 +0000
<![CDATA[Never mind the hyperbolics: nonparametric analysis of time-inconsistent preferences]]> We investigate necessary and sufficient nonparametric conditions for the quasi-hyperbolic consumer. These turn out to be quite tractable. We then assess the performance of this model compared to the standard exponential discounting model using consumer panel data.
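For reference, the quasi-hyperbolic ("beta-delta") consumer evaluates consumption streams as in the standard textbook formulation below; this is background notation rather than a restatement of the paper's revealed preference conditions.

```latex
U_t(c_t, c_{t+1}, \dots) \;=\; u(c_t) \;+\; \beta \sum_{k=1}^{\infty} \delta^{k}\, u(c_{t+k}),
\qquad 0 < \beta \le 1,\; 0 < \delta < 1,
```

so that setting $\beta = 1$ recovers the standard exponential discounting benchmark against which the model is compared.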

]]>
http://www.ifs.org.uk/publications/7315 Tue, 29 Jul 2014 00:00:00 +0000
<![CDATA[The impact of financial education on adolescents' intertemporal choices]]> We study the impact of financial education on intertemporal choice in adolescence. The program was randomly assigned among high-school students and intertemporal choices were measured using an incentivized experiment. Students who participated in the program display a decrease in time inconsistency; an increase in the allocation of payment to a single payment date, compared to spreading payment across two dates; and increased consistency of choice with the law of demand. These findings suggest that the effect of such educational programs is to increase comprehension and decrease bracketing in intertemporal choice.

This working paper was updated in May 2015.

]]>
http://www.ifs.org.uk/publications/7319 Tue, 29 Jul 2014 00:00:00 +0000
<![CDATA[Using a temporary indirect tax cut as a fiscal stimulus: evidence from the UK]]> This paper evaluates a novel form of fiscal stimulus: a temporary cut in the rate of Value Added Tax (VAT). In December 2008, the UK cut the standard rate of VAT by 2.5 percentage points for 13 months in an effort to stimulate spending. We estimate the effect of the cut on prices and spending using alternative strategies for identifying the counterfactual. Although firms initially passed through the VAT cut by lowering their prices, at least part of the pass through of the VAT cut was reversed after only a few months. Despite this early reversal, the cut raised the volume of retail sales by around 1%, which on its own generates a 0.4% increase in total expenditure. The cut raised retail sales by encouraging consumers to bring forward their purchases and we find a significant fall in sales after the VAT cut ended. Thus an indirect tax cut stimulates significant intertemporal substitution in purchases.

]]>
http://www.ifs.org.uk/publications/7310 Mon, 28 Jul 2014 00:00:00 +0000
<![CDATA[The importance of product reformulation versus consumer choice in improving diet quality]]> Improving diet quality has been a major target of public health policy. Governments have encouraged consumers to make healthier food choices and firms to reformulate food products. Evaluation of such policies has focused on the impact on consumer behaviour; firm behaviour has been less well studied. We study the recent decline in dietary salt intake in the UK, and show that it was entirely attributable to product reformulation by firms; a contemporaneous information campaign had little impact, and consumer switching between products in fact worked in the opposite direction, leading to a slight increase in the salt intensity of groceries purchased. These findings point to the important role that firms can play in achieving public policy goals.

]]>
http://www.ifs.org.uk/publications/7308 Thu, 24 Jul 2014 00:00:00 +0000
<![CDATA[Holy cows or cash cows?]]> In a recent paper, Anagol, Etang and Karlan (2013) consider the income generated by those owning a cow or a buffalo in two districts of Uttar Pradesh, India. The net profit generated, ignoring labour costs, gives rise to a small positive rate of return. Once any reasonable estimate of labour costs is added to costs, the rate of return is a large negative number. The authors conclude that households holding this type of asset do not behave according to the tenets of capitalism. A variety of explanations, typically appealing to religious or cultural factors, have been invoked for such a puzzling fact.

In this note, we point to a simple explanation that is fully consistent with rational behaviour on the part of Indian farmers. In computing the return on cows and buffaloes, the authors used data from a single year. Cows are assets whose return varies through time. In drought years, when fodder is scarce and expensive, milk production is lower and profits are low. In non-drought years, when fodder is abundant and cheaper, milk production is higher and profits can be considerably higher.  The return on cows and buffaloes, like that of many stocks traded on Wall Street, is positive in some years and negative in others. We report evidence from three years of data on the return on cows and buffaloes in the district of Anantapur and show that in one of the three years returns are very high, while in drought years they are similar to the figures obtained by Anagol, Etang and Karlan (2013).

This paper is also published as part of the NBER working paper series, no. 20304.

]]>
http://www.ifs.org.uk/publications/7292 Tue, 22 Jul 2014 00:00:00 +0000
<![CDATA[Individual and time effects in nonlinear panel models with large <i>N</i>, <i>T</i>]]> Fixed effects estimators of nonlinear panel data models can be severely biased because of the well-known incidental parameter problem. We develop analytical and jackknife bias corrections for nonlinear models with both individual and time effects. Under asymptotic sequences where the time-dimension (T) grows with the cross-sectional dimension (N), the time effects introduce additional incidental parameter bias. As the existing bias corrections apply to models with only individual effects, we derive the appropriate corrections for the case when both effects are present. The basis for the corrections is general asymptotic expansions of fixed effects estimators with incidental parameters in multiple dimensions. We apply the expansions to conditional maximum likelihood estimators with concave objective functions in parameters for panel models with additive individual and time effects. These estimators cover fixed effects estimators of the most popular limited dependent variable models such as logit, probit, ordered probit, Tobit and Poisson models. Our analysis therefore extends the use of large-T bias adjustments to an important class of models.

We also analyze the properties of fixed effects estimators of functions of the data, parameters and individual and time effects, including average partial effects. Here, we uncover that the incidental parameter bias is asymptotically of second order, because the rate of convergence of the fixed effects estimators is slower for average partial effects than for model parameters. The bias corrections are still useful to improve finite-sample properties.
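For orientation, one familiar large-T correction of the jackknife type, covering only the time dimension, combines the full-sample fixed effects estimate with estimates from the two half-panels to remove the leading O(1/T) bias. This is a textbook sketch rather than the authors' correction, which also handles the additional bias introduced by the time effects:

```latex
\hat{\theta}_{\mathrm{SPJ}} \;=\; 2\,\hat{\theta}_{N,T} \;-\; \tfrac{1}{2}\left(\hat{\theta}^{(1)}_{N,T/2} + \hat{\theta}^{(2)}_{N,T/2}\right),
```

where $\hat{\theta}^{(1)}_{N,T/2}$ and $\hat{\theta}^{(2)}_{N,T/2}$ are the fixed effects estimates computed on the first and second halves of the sample period.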

]]>
http://www.ifs.org.uk/publications/7283 Wed, 16 Jul 2014 00:00:00 +0000
<![CDATA[For love or reward? Characterising preference for giving to parents in an experimental setting]]> This paper examines the motivation for intergenerational transfers between adult children and their parents, and the nature of preferences for such giving behaviour, in an experimental setting. Participants in our experiment play a series of dictator games with parents and strangers, in which we vary endowments and prices for giving to each recipient. We find that preferences for giving are typically rational. When parents are recipients as opposed to strangers, participants display greater sensitivity to the price of giving, and a higher relative proclivity for giving. Our findings also provide evidence of reciprocal motivations for giving, as players give more to parents who have full information regarding the context in which giving occurs.

]]>
http://www.ifs.org.uk/publications/7285 Wed, 16 Jul 2014 00:00:00 +0000
<![CDATA[Bayesian exploratory factor analysis]]> This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study confirms the validity of the approach. The method is used to produce interpretable low dimensional aggregates from a high dimensional set of psychological measurements.

]]>
http://www.ifs.org.uk/publications/7271 Mon, 14 Jul 2014 00:00:00 +0000
<![CDATA[Direct and indirect treatment effects: causal chains and mediation analysis with instrumental variables]]> This paper discusses the nonparametric identification of causal direct and indirect effects of a binary treatment based on instrumental variables. We identify the indirect effect, which operates through a mediator (i.e. intermediate variable) that is situated on the causal path between the treatment and the outcome, as well as the unmediated direct effect of the treatment using distinct instruments for the endogenous treatment and the endogenous mediator. We examine different settings to obtain nonparametric identification of (natural) direct and indirect as well as controlled direct effects for continuous and discrete mediators and continuous and discrete instruments. We illustrate our approach in two applications: to disentangle the effects (i) of education on health, which may be mediated by income, and (ii) of the Job Corps training program, which may affect earnings indirectly via working longer hours and directly via higher wages per hour.

]]>
http://www.ifs.org.uk/publications/7272 Mon, 14 Jul 2014 00:00:00 +0000
<![CDATA[Modelling work, health, care and income in the older population. The IFS retirement simulator (RetSim)]]> The pensioner population a decade from now is likely to look different to today’s population. There will not only be more pensioners but those retiring over the next few years will have experienced different economic conditions in their working lives, been subject to a different policy environment at different points in their lives, benefited from different technological and medical advances, and made different decisions about their savings than have today’s pensioners.


This paper sets out the methodology, assumptions, and modelling specifications used to produce the report The changing face of retirement by Emmerson, Heald and Hood (2014), which aims to shed some light on how the demographic and financial circumstances of this group will change.

]]>
http://www.ifs.org.uk/publications/7253 Thu, 26 Jun 2014 00:00:00 +0000
<![CDATA[The socio-economic gradient of child development: cross-sectional evidence from children 6-42 months in Bogota]]> We study the socio-economic gradient of child development on a representative sample of low- and middle-income children aged 6-42 months in Bogota, using the Bayley Scales of Infant Development, a high quality test based on direct observation of the child’s abilities. We find a statistically significant difference between children in the 90th and 10th percentile of the wealth distribution in our sample of 0.33 standard deviations (SD) in cognition, 0.29 SD in receptive language and 0.38 SD in expressive language at 14 months. The socio-economic gap increases substantially with age to 1 SD (cognition), 0.80 SD (receptive language) and 0.69 SD (expressive language) by 42 months. While the gap persists after controlling for mediating factors such as parental and biomedical characteristics, the level of stimulation in the home, and the quality of the institutional care setting, its size is significantly reduced by variables related to the home environment, i.e. parental investments in care quantity and quality. These findings have important implications for the design of well-targeted, effective and timely interventions that promote early childhood development.

This paper is also published as part of the Inter-American Development Bank Working Paper series No. IDB-WP-527.

]]>
http://www.ifs.org.uk/publications/7241 Thu, 19 Jun 2014 00:00:00 +0000
<![CDATA[Multivariate variance ratio statistics]]> We propose several multivariate variance ratio statistics. We derive the asymptotic distribution of the statistics and scalar functions thereof under the null hypothesis that returns are unpredictable after a constant mean adjustment (i.e., under the Efficient Market Hypothesis). We do not impose the no leverage assumption of Lo and MacKinlay (1988) but our asymptotic standard errors are relatively simple and in particular do not require the selection of a bandwidth parameter. We extend the framework to allow for a smoothly varying risk premium in calendar time, and show that the limiting distribution is the same as in the constant mean adjustment case. We show the limiting behaviour of the statistic under a multivariate fads model and under a moderately explosive bubble process: these alternative hypotheses give opposite predictions with regards to the long run value of the statistics. We apply the methodology to three weekly size-sorted CRSP portfolio returns from 1962 to 2013 in three subperiods. We find evidence of a reduction of linear predictability in the most recent period, for small and medium cap stocks. We find similar results for the main UK stock indexes. The main findings are not substantially affected by allowing for a slowly varying risk premium.
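As a point of reference, the univariate building block that the multivariate statistics generalize is the familiar variance ratio. The sketch below computes it with a constant mean adjustment and reports the textbook asymptotic z-statistic that holds under i.i.d. returns; it is our own simplified illustration, not the paper's multivariate statistic or its leverage-robust standard errors.

```python
import numpy as np

def variance_ratio(returns, q):
    """Univariate variance ratio VR(q) = Var(q-period return) / (q * Var(1-period return)),
    with a simple asymptotic z-statistic valid under i.i.d. returns (q >= 2).
    Textbook-style sketch, not the paper's multivariate statistic."""
    r = np.asarray(returns, dtype=float) - np.mean(returns)   # constant mean adjustment
    n = len(r)
    var_1 = np.mean(r**2)
    rq = np.convolve(r, np.ones(q), mode="valid")             # overlapping q-period sums
    var_q = np.mean(rq**2) / q
    vr = var_q / var_1
    z = (vr - 1.0) * np.sqrt(n) / np.sqrt(2.0 * (2 * q - 1) * (q - 1) / (3 * q))
    return vr, z
```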

]]>
http://www.ifs.org.uk/publications/7247 Fri, 06 Jun 2014 00:00:00 +0000
<![CDATA[Asymptotic efficiency of semiparametric two-step GMM]]> Many structural economics models are semiparametric ones in which the unknown nuisance functions are identified via nonparametric conditional moment restrictions with possibly non-nested or overlapping conditioning sets, and the finite dimensional parameters of interest are over-identified via unconditional moment restrictions involving the nuisance functions. In this paper we characterize the semiparametric efficiency bound for this class of models. We show that semiparametric two-step optimally weighted GMM estimators achieve the efficiency bound, where the nuisance functions could be estimated via any consistent nonparametric methods in the first step. Regardless of whether the efficiency bound has a closed form expression or not, we provide easy-to-compute sieve based optimal weight matrices that lead to asymptotically efficient two-step GMM estimators.
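In generic notation (ours, not necessarily the paper's), the two-step estimator takes the usual GMM form, with the nuisance functions replaced by first-step nonparametric estimates and the weight matrix chosen optimally:

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\; \bar{g}_n(\theta, \hat{h})'\, \hat{W}\, \bar{g}_n(\theta, \hat{h}),
\qquad
\bar{g}_n(\theta, h) \;=\; \frac{1}{n}\sum_{i=1}^{n} g(Z_i; \theta, h),
```

where $\hat{h}$ is any consistent first-step nonparametric estimator of the nuisance functions and $\hat{W}$ consistently estimates the optimal weight matrix.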

]]>
http://www.ifs.org.uk/publications/7232 Wed, 04 Jun 2014 00:00:00 +0000
<![CDATA[Maximum score estimation with nonparametrically generated regressors]]> The estimation problem in this paper is motivated by maximum score estimation of preference parameters in the binary choice model under uncertainty in which the decision rule is affected by conditional expectations. The preference parameters are estimated in two stages: we estimate conditional expectations nonparametrically in the first stage and then the preference parameters in the second stage based on Manski (1975, 1985)’s maximum score estimator using the choice data and first stage estimates. This setting can be extended to maximum score estimation with nonparametrically generated regressors. The paper establishes consistency and derives the rate of convergence of the two-stage maximum score estimator. Moreover, the paper also provides sufficient conditions under which the two-stage estimator is asymptotically equivalent in distribution to the corresponding single-stage estimator that assumes the first stage input is known. The paper also presents some Monte Carlo simulation results for finite-sample behavior of the two-stage estimator.

]]>
http://www.ifs.org.uk/publications/7220 Wed, 28 May 2014 00:00:00 +0000
<![CDATA[The lasso for high-dimensional regression with a possible change-point]]> We consider a high-dimensional regression model with a possible change-point due to a covariate threshold and develop the Lasso estimator of regression coefficients as well as the threshold parameter. Our Lasso estimator not only selects covariates but also selects a model between linear and threshold regression models. Under a sparsity assumption, we derive non-asymptotic oracle inequalities for both the prediction risk and the l1 estimation loss for regression coefficients. Since the Lasso estimator selects variables simultaneously, we show that oracle inequalities can be established without pretesting the existence of the threshold effect. Furthermore, we establish conditions under which the estimation error of the unknown threshold parameter can be bounded by a nearly n-1 factor even when the number of regressors can be much larger than the sample size (n). We illustrate the usefulness of our proposed estimation method via Monte Carlo simulations and an application to real data.
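A schematic reading of the estimation strategy: for each candidate threshold, augment the design with the covariates interacted with the threshold indicator, fit a Lasso on the augmented design, and keep the threshold (and coefficients) minimizing the penalized objective. The threshold grid, the fixed penalty level and the use of scikit-learn below are our own illustrative choices, not the paper's tuning.

```python
import numpy as np
from sklearn.linear_model import Lasso

def threshold_lasso(y, X, q, tau_grid, alpha=0.1):
    """Joint Lasso estimation of regression coefficients and a change-point:
    grid search over the threshold tau in the covariate q, with a Lasso fit
    on (X, X * 1{q > tau}) at each candidate. Illustrative sketch only."""
    best_obj, best_tau, best_coef = np.inf, None, None
    for tau in tau_grid:
        Z = np.hstack([X, X * (q > tau)[:, None]])     # augmented design
        fit = Lasso(alpha=alpha, max_iter=10_000).fit(Z, y)
        # same penalized objective scikit-learn minimizes, used to rank thresholds
        obj = 0.5 * np.mean((y - fit.predict(Z))**2) + alpha * np.sum(np.abs(fit.coef_))
        if obj < best_obj:
            best_obj, best_tau, best_coef = obj, tau, fit.coef_
    return best_tau, best_coef
```

If the second block of coefficients is estimated as (close to) zero, the procedure effectively selects the linear model; otherwise it selects the threshold model, which mirrors the model-selection property described above.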

]]>
http://www.ifs.org.uk/publications/7219 Wed, 28 May 2014 00:00:00 +0000
<![CDATA[Implementing intersection bounds in Stata]]> We present the clrbound, clr2bound, clr3bound, and clrtest commands for estimation and inference on intersection bounds as developed by Chernozhukov et al. (2013). The intersection bounds framework encompasses situations where a population parameter of interest is partially identified by a collection of consistently estimable upper and lower bounds. The identified set for the parameter is the intersection of regions defined by this collection of bounds. More generally, the methodology can be applied to settings where an estimable function of a vector-valued parameter is bounded from above and below, as is the case when the identified set is characterized by conditional moment inequalities.

The commands clrbound, clr2bound, and clr3bound provide bound estimates that can be used directly for estimation or to construct asymptotically valid confidence sets. clrtest performs an intersection bound test of the hypothesis that a collection of lower intersection bounds is no greater than zero. The command clrbound provides bound estimates for one-sided lower or upper intersection bounds on a parameter, while clr2bound and clr3bound provide two-sided bound estimates based on both lower and upper intersection bounds. clr2bound uses Bonferroni’s inequality to construct two-sided bounds that can be used to perform asymptotically valid inference on the identified set or the parameter of interest, whereas clr3bound provides a generally tighter confidence interval for the parameter by inverting the hypothesis test performed by clrtest. More broadly, inversion of this test can also be used to construct confidence sets based on conditional moment inequalities as described in Chernozhukov et al. (2013). The commands include parametric, series, and local linear estimation procedures, and can be installed from within Stata by typing “ssc install clrbound”.

]]>
http://www.ifs.org.uk/publications/7218 Wed, 28 May 2014 00:00:00 +0000
<![CDATA[Tackling social exclusion: evidence from Chile]]> We study an innovative welfare program in Chile which combines a period of frequent home visits to households in extreme poverty with guaranteed access to social services. Program impacts are identified using a regression discontinuity design, exploiting the fact that program eligibility is a discontinuous function of an index of family income and assets. We find strong and lasting impacts of the program on the take up of subsidies and employment services. These impacts are important only for families who had little access to the welfare system prior to the intervention.

]]>
http://www.ifs.org.uk/publications/7217 Tue, 27 May 2014 00:00:00 +0000
<![CDATA[Dealing with randomisation bias in a social experiment: the case of ERA]]> One of the most powerful critiques of the use of randomised experiments in the social sciences is the possibility that individuals might react to the randomisation itself, thereby rendering the causal inference from the experiment irrelevant for policy purposes. In this paper we set out a theoretical framework for the systematic consideration of “randomisation bias”, and provide what is to our knowledge the first empirical evidence on this form of bias in an actual social experiment, the UK Employment Retention and Advancement (ERA) study. Specifically, we empirically test the extent to which random assignment has affected the process of participation in the ERA study. We further propose a non-experimental way of assessing the extent to which the treatment effects stemming from the experimental sample are representative of the impacts that would have been experienced by the population who would have been exposed to the program in routine mode. We consider both the case of administrative outcome measures available for the entire relevant sample and that of survey-based outcome measures. For the case of survey outcomes we extend our estimators to also account for selective non-response based on observed characteristics. For both administrative and survey data we further extend our proposed estimators to deal with the nonlinear case of binary outcomes.

]]>
http://www.ifs.org.uk/publications/7214 Fri, 23 May 2014 00:00:00 +0000
<![CDATA[Nonparametric identification of endogenous and heterogeneous aggregate demand models: complements, bundles and the market level]]> This paper studies nonparametric identification in market level demand models for differentiated products. We generalize common models by allowing for the distribution of heterogeneity parameters (random coefficients) to have a nonparametric distribution across the population and give conditions under which the density of the random coefficients is identified. We show that key identifying restrictions are provided by (i) a set of moment conditions generated by instrumental variables together with an inversion of aggregate demand in unobserved product characteristics; and (ii) an integral transform (Radon transform) that maps the random coefficient density to the aggregate demand. This feature is shown to be common across a wide class of models, and we illustrate this by studying leading demand models. Our examples include demand models based on the multinomial choice (Berry, Levinsohn, Pakes, 1995), the choice of bundles of goods that can be substitutes or complements, and the choice of goods consumed in multiple units.

]]>
http://www.ifs.org.uk/publications/7209 Mon, 19 May 2014 00:00:00 +0000
<![CDATA[Tax without design: recent developments in UK tax policy]]> This paper considers the development of tax policy in the UK over the last decade or so and assesses policy change against a low bar: consistency and coherence. While this government has followed some consistent policies, notably in some aspects of corporation tax and in increasing the income tax personal allowance, there are few signs of a wider coherent strategy. The same has been true of other recent governments. Many aspects of the system have become more complex. There have been numerous policy reversals. And few of those aspects of the system in most need of reform have been tackled. The need for reform, and a clear strategy for reform, remain as pressing as ever.

]]>
http://www.ifs.org.uk/publications/7203 Tue, 13 May 2014 00:00:00 +0000
<![CDATA[Consume now or later? Time inconsistency, collective choice and revealed preference]]> In this paper, we develop a revealed preference methodology that allows us to explore whether time inconsistencies in household choice are the product of individual preference nonstationarities or the result of individual heterogeneity and renegotiation within the collective unit. An empirical application to household-level microdata highlights that an explicit recognition of the collective nature of household choice enables the vast majority of observed behaviour to be rationalised by a theory that assumes preference stationarity at the individual level. The methodology created in this paper also facilitates the recovery of theory-consistent discount rates for each individual within a particular household under study. We find that couples characterised by lower divergence in spousal discount rates are older, which we take as an indication of higher match quality.

]]>
http://www.ifs.org.uk/publications/7201 Mon, 12 May 2014 00:00:00 +0000
<![CDATA[Inference for functions of partially identified parameters in moment inequality models]]> This paper introduces a bootstrap-based inference method for functions of the parameter vector in a moment (in)equality model. As a special case, our method yields marginal confidence sets for individual coordinates of this parameter vector. Our inference method controls asymptotic size uniformly over a large class of data distributions. The current literature describes only two other procedures that deliver uniform size control for this type of problem: projection-based and subsampling inference. Relative to projection-based procedures, our method presents three advantages: (i) it weakly dominates in terms of finite sample power, (ii) it strictly dominates in terms of asymptotic power, and (iii) it is typically less computationally demanding. Relative to subsampling, our method presents two advantages: (i) it strictly dominates in terms of asymptotic power (for reasonable choices of subsample size), and (ii) it appears to be less sensitive to the choice of its tuning parameter than subsampling is to the choice of subsample size.

]]>
http://www.ifs.org.uk/publications/7196 Wed, 07 May 2014 00:00:00 +0000
<![CDATA[Individual heterogeneity, nonlinear budget sets, and taxable income]]> Given the key role of the taxable income elasticity in designing an optimal tax system there are many studies attempting to estimate this elasticity. To account for nonlinear taxes, these studies either use instrumental variables approaches that are not fully consistent, or impose strong functional form assumptions. None allow for general heterogeneity in preferences. In this paper we derive the mean and distribution of taxable income, conditional on a nonlinear budget set, allowing general heterogeneity and optimization errors for the mean. We find an important dimension reduction and use that to develop nonparametric estimation methods. We show how to nonparametrically estimate the conditional mean of taxable income imposing all the restrictions of utility maximization and allowing for measurement errors. We apply this method to Swedish data and estimate for prime age males a significant net-of-tax elasticity of 0.6 and a significant income elasticity of -0.08.

]]>
http://www.ifs.org.uk/publications/7191 Thu, 01 May 2014 00:00:00 +0000
<![CDATA[Estimation of random coefficients logit demand models with interactive fixed effects]]> http://www.ifs.org.uk/publications/7183 Fri, 25 Apr 2014 00:00:00 +0000 <![CDATA[The measurement of household consumption expenditures]]> Household-level data on consumer expenditures underpins a wide range of empirical research in modern economics, spanning micro- and macroeconomics. This research includes work on consumption and saving, on poverty and inequality, and on risk sharing and insurance. We review different ways in which such data can be collected or captured: traditional detailed budget surveys, less onerous survey procedures that might be included in more general surveys, and administrative or process data. We discuss the advantages and difficulties of each approach and suggest directions for future investigation.

]]>
http://www.ifs.org.uk/publications/7170 Thu, 10 Apr 2014 00:00:00 +0000
<![CDATA[Specification tests for partially identified models defined by moment inequalities]]> This paper studies the problem of specification testing in partially identified models defined by a finite number of moment equalities and inequalities (i.e. (in)equalities). Under the null hypothesis, there is at least one parameter value that simultaneously satisfies all of the moment (in)equalities whereas under the alternative hypothesis there is no such parameter value. This problem has not been directly addressed in the literature (except in particular cases), although several papers have suggested a test based on checking whether confidence sets for the parameters of interest are empty or not, referred to as Test BP.

We propose two new specification tests, denoted Tests RS and RC, that achieve uniform asymptotic size control and dominate Test BP in terms of power in any finite sample and in the asymptotic limit. Test RC is particularly convenient to implement because it requires little additional work beyond the confidence set construction. Test RS requires a separate procedure to compute, but has the best power. The separate procedure is computationally easier than confidence set construction in typical cases.

]]>
http://www.ifs.org.uk/publications/7169 Thu, 10 Apr 2014 00:00:00 +0000
<![CDATA[Nonparametric spectral-based estimation of latent structures]]> We present a constructive identification proof of p-linear decompositions of q-way arrays. The analysis is based on the joint spectral decomposition of a set of matrices. It has applications in the analysis of a variety of latent-structure models, such as q-variate mixtures of p distributions. As such, our results provide a constructive alternative to Allman, Matias and Rhodes [2009]. The identification argument suggests a joint approximate-diagonalization estimator that is easy to implement and whose asymptotic properties we derive. We illustrate the usefulness of our approach by applying it to nonparametrically estimate multivariate finite-mixture models and hidden Markov models.

]]>
http://www.ifs.org.uk/publications/7163 Tue, 01 Apr 2014 00:00:00 +0000
<![CDATA[Can survey participation alter household saving behavior?]]> http://www.ifs.org.uk/publications/7162 Fri, 28 Mar 2014 00:00:00 +0000 <![CDATA[The identification power of smoothness assumptions in models with counterfactual outcomes]]> http://www.ifs.org.uk/publications/7161 Wed, 26 Mar 2014 00:00:00 +0000 <![CDATA[International trends in technological progress: stylized facts from patent citations, 1980-2011]]> http://www.ifs.org.uk/publications/7160 Wed, 26 Mar 2014 00:00:00 +0000 <![CDATA[Optimal bandwidth selection for robust generalized method of moments estimation]]> A two-step generalized method of moments estimation procedure can be made robust to heteroskedasticity and autocorrelation in the data by using a nonparametric estimator of the optimal weighting matrix. This paper addresses the issue of choosing the corresponding smoothing parameter (or bandwidth) so that the resulting point estimate is optimal in a certain sense. We derive an asymptotically optimal bandwidth that minimizes a higher-order approximation to the asymptotic mean-squared error of the estimator of interest. We show that the optimal bandwidth is of the same order as the one minimizing the mean-squared error of the nonparametric plug-in estimator, but the constants of proportionality are significantly different. Finally, we develop a data-driven bandwidth selection rule and show, in a simulation experiment, that it may substantially reduce the estimator's mean-squared error relative to existing bandwidth choices, especially when the number of moment conditions is large.

]]>
http://www.ifs.org.uk/publications/7159 Mon, 24 Mar 2014 00:00:00 +0000
<![CDATA[Household Sharing and Commitment: Evidence from Panel Data on Individual Expenditures and Time Use]]> http://www.ifs.org.uk/publications/7150 Tue, 18 Mar 2014 00:00:00 +0000 <![CDATA[Tenure, experience, human capital and wages: a tractable equilibrium search model of wage dynamics]]> We develop and estimate an equilibrium job search model of worker careers, allowing for human capital accumulation, employer heterogeneity and individual-level shocks. Career wage growth is decomposed into the contributions of human capital and job search, within and between jobs. Human capital accumulation is largest for highly educated workers, and both human capital accumulation and job search contribute to the observed concavity of wage-experience profiles. The contribution from job search to wage growth, both within- and between-job, declines over the first ten years of a career, the `job-shopping' phase of a working life, after which workers settle into high-quality jobs and use outside offers to generate gradual wage increases, thus reaping the benefits from competition between employers.

]]>
http://www.ifs.org.uk/publications/7139 Fri, 14 Mar 2014 00:00:00 +0000
<![CDATA[A simple parametric model selection test]]> We propose a simple model selection test for choosing between two parametric likelihoods which can be applied in the most general setting without any assumptions on the relation between the candidate models and the true distribution. That is, both, one or neither candidate model is allowed to be correctly specified or misspecified; they may be nested, non-nested, strictly non-nested or overlapping. Unlike in previous testing approaches, no pre-testing is needed, since in each case, the same test statistic together with a standard normal critical value can be used. The new procedure controls asymptotic size uniformly over a large class of data generating processes. We demonstrate its finite sample properties in a Monte Carlo experiment and its practical relevance in an empirical application comparing Keynesian versus new classical macroeconomic models.

]]>
http://www.ifs.org.uk/publications/7137 Fri, 14 Mar 2014 00:00:00 +0000
<![CDATA[Grade retention and unobserved heterogeneity]]> http://www.ifs.org.uk/publications/7141 Fri, 14 Mar 2014 00:00:00 +0000 <![CDATA[Labour supply and taxation with restricted choices]]> This working paper has been updated to a new version, W15/02.

]]>
http://www.ifs.org.uk/publications/7142 Fri, 14 Mar 2014 00:00:00 +0000
<![CDATA[Labor market reforms and unemployment dynamics]]> http://www.ifs.org.uk/publications/7140 Fri, 14 Mar 2014 00:00:00 +0000 <![CDATA[Nonparametric estimation of finite mixtures]]> The aim of this paper is to provide simple nonparametric methods to estimate finite mixture models from data with repeated measurements. Three measurements suffice for the mixture to be fully identified and so our approach can be used even with very short panel data. We provide distribution theory for estimators of the mixing proportions and the mixture distributions, and various functionals thereof. We also discuss inference on the number of components. These estimators are found to perform well in a series of Monte Carlo exercises. We apply our techniques to document heterogeneity in log annual earnings using PSID data spanning the period 1969–1998.

]]>
http://www.ifs.org.uk/publications/7138 Fri, 14 Mar 2014 00:00:00 +0000
<![CDATA[Estimating the effect of teacher pay on pupil attainment using boundary discontinuities]]> http://www.ifs.org.uk/publications/7125 Thu, 06 Mar 2014 00:00:00 +0000 <![CDATA[Interdependent durations in joint retirement]]> This paper introduces a bivariate version of the generalized accelerated failure time model. It allows for simultaneity in the econometric sense that the two realized outcomes depend structurally on each other. The proposed model also has the feature that it will generate equal durations with positive probability. The motivating example is retirement decisions by married couples. In that example it seems reasonable to allow for the possibility that each partner's optimal retirement time depends on the retirement time of the spouse. Moreover, the data suggest that the wife and the husband retire at the same time for a non-negligible fraction of couples. Our approach takes as starting point a stylized economic model that leads to a univariate generalized accelerated failure time model. The covariates of that generalized accelerated failure time model act as utility-flow shifters in the economic model. We introduce simultaneity by allowing the utility flow in retirement to depend on the retirement status of the spouse. The econometric model is then completed by assuming that the observed outcome is the Nash bargaining solution in that simple economic model. The advantage of this approach is that it includes independent realizations from the generalized accelerated failure time model as a special case, and deviations from this special case can be given an economic interpretation. We illustrate the model by studying the joint retirement decisions in married couples using the Health and Retirement Study. We provide a discussion of relevant identifying variation and estimate our model using indirect inference. The main empirical finding is that the simultaneity seems economically important. In our preferred specification the indirect utility associated with being retired increases by approximately 5% if one's spouse is already retired and unobservables exhibit positive correlation. The estimated model also predicts that the indirect effect of a change in husbands' pension plan on wives' retirement dates is about 4% of the direct effect on the husbands.

]]>
http://www.ifs.org.uk/publications/7128 Thu, 06 Mar 2014 00:00:00 +0000
<![CDATA[Testing for a general class of functional inequalities]]> The test statistic is based on Lp functionals of kernel-type estimators (1 < p < ∞) and the test is easy to implement in general, mainly due to its recourse to the bootstrap method. The bootstrap procedure is based on nonparametric bootstrap applied to kernel-based test statistics, with estimated "contact sets". We provide regularity conditions under which the bootstrap test is asymptotically valid uniformly over a large class of distributions, including cases in which the limiting distribution of the test statistic is degenerate. Our bootstrap test is shown to exhibit good power properties in Monte Carlo experiments, and we provide a general form of the local power function. As an illustration, we consider testing implications from auction theory, provide primitive conditions for our test, and demonstrate its usefulness by applying our test to real data. We supplement this example with a second empirical illustration in the context of wage inequality.]]> http://www.ifs.org.uk/publications/7129 Thu, 06 Mar 2014 00:00:00 +0000 <![CDATA[Single stock circuit breakers on the London Stock Exchange: do they improve subsequent market quality?]]> http://www.ifs.org.uk/publications/7122 Thu, 20 Feb 2014 00:00:00 +0000 <![CDATA[The cross-quantilogram: measuring quantile dependence and testing directional predictability between time series]]> http://www.ifs.org.uk/publications/7121 Thu, 20 Feb 2014 00:00:00 +0000 <![CDATA[Inference for functions of partially identified parameters in moment inequality models]]> This paper introduces a new hypothesis test for the null hypothesis H0 : f(θ) = y0, where f(.) is a known function, y0 is a known constant, and θ is a parameter that is partially identified by a moment (in)equality model. The main application of our test is sub-vector inference in moment inequality models, that is, for a multidimensional θ, the function f(θ) = θk selects the kth coordinate of θ. Our test controls asymptotic size uniformly over a large class of distributions of the data and has better asymptotic power properties than currently available methods. In particular, we show that the new test has asymptotic power that dominates the one corresponding to two existing competitors in the literature: subsampling and projection-based tests.

]]>
http://www.ifs.org.uk/publications/7065 Mon, 27 Jan 2014 00:00:00 +0000
<![CDATA[Nutrition, information, and household behaviour: experimental evidence from Malawi]]> Incorrect knowledge of the health production function may lead to inefficient household choices, and thereby to the production of suboptimal levels of health. This paper studies the effects of a randomised intervention in rural Malawi which, over a six-month period, provided mothers of young infants with information on child nutrition without supplying any monetary or in-kind resources. A simple model first investigates theoretically how nutrition and other household choices including labour supply may change in response to the improved nutrition knowledge observed in the intervention areas. We then show empirically that, in line with this model, the intervention improved child nutrition, household consumption and consequently health. These increases are funded by an increase in male labor supply. We consider and rule out alternative explanations behind these findings. This paper is the first to establish that non-health choices, particularly parental labor supply, are affected by parents’ knowledge of the child health production function.

]]>
http://www.ifs.org.uk/publications/7064 Fri, 24 Jan 2014 00:00:00 +0000
<![CDATA[Generalized instrumental variable models]]> The ability to allow for flexible forms of unobserved heterogeneity is an essential ingredient in modern microeconometrics. In this paper we extend the application of instrumental variable (IV) methods to a wide class of problems in which multiple values of unobservable variables can be associated with particular combinations of observed endogenous and exogenous variables. In our Generalized Instrumental Variable (GIV) models, in contrast to traditional IV models, the mapping from unobserved heterogeneity to endogenous variables need not admit a unique inverse. The class of GIV models allows unobservables to be multivariate and to enter non-separably into the determination of endogenous variables, thereby removing strong practical limitations on the role of unobserved heterogeneity. Important examples include models with discrete or mixed continuous/discrete outcomes and continuous unobservables, and models with excess heterogeneity where many combinations of different values of multiple unobserved variables, such as random coefficients, can deliver the same realizations of outcomes. We use tools from random set theory to study identification in such models and provide a sharp characterization of the identified set of structures admitted. We demonstrate the application of our analysis to a continuous outcome model with an interval-censored endogenous explanatory variable.

]]>
http://www.ifs.org.uk/publications/7060 Thu, 23 Jan 2014 00:00:00 +0000
<![CDATA[Labor income dynamics and the insurance from taxes, transfers and the family]]> What do labor income dynamics look like over the life-cycle? What is the relative importance of persistent shocks, transitory shocks and heterogeneous profiles? To what extent do taxes, transfers and the family attenuate these various factors in the evolution of life-cycle inequality? In this paper, we use rich Norwegian data to answer these important questions. We let individuals with different education levels have a separate income process; and within each skill group, we allow for non-stationarity in age and time, heterogeneous experience profiles, and shocks of varying persistence. We find that the income processes differ systematically by age, skill level and their interaction. To accurately describe labor income dynamics over the life-cycle, it is necessary to allow for heterogeneity by education levels and account for non-stationarity in age and time. Our findings suggest that the progressive nature of the Norwegian tax-transfer system plays a key role in attenuating the magnitude and persistence of income shocks, especially among the low skilled. By comparison, spouse's income matters less for the dynamics of inequality over the life-cycle.

]]>
http://www.ifs.org.uk/publications/7056 Fri, 17 Jan 2014 00:00:00 +0000
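As a concrete illustration of the ingredients discussed above (heterogeneous profiles plus persistent and transitory shocks), the following Python sketch simulates log earnings over the life cycle. All parameter values and the stationary AR(1) specification are illustrative assumptions rather than anything estimated in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 1000, 40
rho, sigma_eta, sigma_eps = 0.95, 0.15, 0.20   # persistence and shock s.d. (assumed values)

alpha = rng.normal(0.0, 0.30, N)               # heterogeneous intercepts
beta = rng.normal(0.02, 0.01, N)               # heterogeneous experience profiles
age = np.arange(T)

z = np.zeros((N, T))                           # persistent AR(1) component
for t in range(1, T):
    z[:, t] = rho * z[:, t - 1] + rng.normal(0, sigma_eta, N)

log_y = alpha[:, None] + beta[:, None] * age + z + rng.normal(0, sigma_eps, (N, T))

# cross-sectional variance of log earnings fans out with age through both channels
print(np.round(np.var(log_y, axis=0)[[0, 19, 39]], 3))
```

Separating these channels, and letting them differ by education group, age and time, is the decomposition exercise the abstract describes; taxes, transfers and spousal income can then be layered on top to measure how much of each component they absorb.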
<![CDATA[Asymptotically efficient estimation of weighted average derivatives with an interval censored variable]]> http://www.ifs.org.uk/publications/7055 Wed, 15 Jan 2014 00:00:00 +0000 <![CDATA[Instrumental variables estimation of a generalized correlated random coefficients model]]> n–consistency and asymptotic normality. Monte Carlo simulations show excellent finite-sample performance that is comparable in precision to the standard two-stage least squares estimator. We apply our results to analyze the effect of air pollution on house prices, and find substantial heterogeneity in first stage instrument effects as well as heterogeneity in treatment effects that is consistent with household sorting.]]> http://www.ifs.org.uk/publications/7041 Wed, 08 Jan 2014 00:00:00 +0000 <![CDATA[Random coefficients on endogenous variables in simultaneous equations models]]> This paper considers a classical linear simultaneous equations model with random coefficients on the endogenous variables. Simultaneous equations models are used to study social interactions, strategic interactions between firms, and market equilibrium. Random coefficient models allow for heterogeneous marginal effects. For two-equation systems, I give two sets of sufficient conditions for point identification of the coefficients’ marginal distributions conditional on exogenous covariates. The first requires full support instruments, but allows for nearly arbitrary distributions of unobservables. The second allows for continuous instruments without full support, but places tail restrictions on the distributions of unobservables. I show that a nonparametric sieve maximum likelihood estimator for these distributions is consistent. I apply my results to the Add Health data to analyze the social determinants of obesity.

]]>
http://www.ifs.org.uk/publications/7040 Wed, 08 Jan 2014 00:00:00 +0000
<![CDATA[Program evaluation with high-dimensional data]]> In the first part of the paper, we consider estimation and inference on policy relevant treatment effects, such as local average and local quantile treatment effects, in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. To make informative inference possible, we assume that some reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the control variables and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of the control variables whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after selection of an appropriate set of controls formed by selecting control variables separately for each reduced form relationship and then appropriately combining these reduced form relationships. We provide conditions under which post-selection inference is uniformly valid across a wide-range of models and show that a key condition underlying the uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) participation on accumulated assets.

In the second part of the paper, we present a generalization of the treatment effect framework to a much richer setting, where possibly a continuum of target parameters is of interest and the Lasso-type or post-Lasso type methods are used to estimate a continuum of high-dimensional nuisance functions. This framework encompasses the analysis of local treatment effects as a leading special case and also covers a wide variety of classical and modern moment-condition problems in econometrics. We establish a functional central limit theorem for the continuum of the target parameters, and also show that it holds uniformly in a wide range of data-generating processes P, with continua of approximately sparse nuisance functions. We also establish validity of the multiplier bootstrap for resampling the first order approximations to the standardized continuum of the estimators, and also establish uniform validity in P. We propose a notion of the functional delta method for finding limit distribution and multiplier bootstrap of the smooth functionals of the target parameters that is valid uniformly in P. Finally, we establish rate and consistency results for continua of Lasso or post-Lasso type methods for estimating continua of the (nuisance) regression functions, also providing practical, theoretically justified penalty choices. Each of these results is new and could be of independent interest.

]]>
http://www.ifs.org.uk/publications/7039 Tue, 31 Dec 2013 00:00:00 +0000
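The "select controls for each reduced form and then combine" logic described above can be illustrated with a stripped-down post-double-selection sketch for an exogenous, scalar treatment; the paper's endogenous-treatment and function-valued extensions are omitted. The design, the cross-validated penalty choice and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(2)
n, p = 500, 200
X = rng.normal(size=(n, p))
d = X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)              # treatment depends on a few controls
y = 1.0 * d + 2.0 * X[:, 0] + X[:, 2] + rng.normal(size=n)    # outcome; true effect = 1

sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)         # controls that predict the outcome
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)         # controls that predict the treatment
union = np.union1d(sel_y, sel_d)

Z = np.column_stack([d, X[:, union]])                         # refit with the union of selected controls
print("post-double-selection estimate:", LinearRegression().fit(Z, y).coef_[0])
```

Selecting controls from both reduced forms is what delivers the approximately unbiased estimating equation that the abstract identifies as the key to uniformly valid post-selection inference.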
<![CDATA[Honest confidence regions for a regression parameter in logistic regression with a large number of controls]]> n rate when the total number p of other regressors, called controls, exceeds the sample size n, under sparsity assumptions. The sparsity assumption means that only s unknown controls are needed to accurately approximate the nuisance part of the regression function, where s is smaller than n. Importantly, the estimators and the resulting confidence regions are 'honest' in the formal sense that their properties hold uniformly over s-sparse models. Moreover, these procedures do not rely on traditional 'consistent model selection' arguments for their validity; in fact, they are robust with respect to 'moderate' model selection mistakes in variable selection steps. Finally, the estimators are semi-parametrically efficient in the sense of attaining the semi-parametric efficiency bounds for the class of models in this paper.]]> http://www.ifs.org.uk/publications/7029 Mon, 30 Dec 2013 00:00:00 +0000 <![CDATA[Anti-concentration and honest, adaptive confidence bands]]> Modern construction of uniform confidence bands for non-parametric densities (and other functions) often relies on the classical Smirnov-Bickel-Rosenblatt (SBR) condition; see, for example, Giné and Nickl (2010). This condition requires the existence of a limit distribution of an extreme value type for the supremum of a studentized empirical process (equivalently, for the supremum of a Gaussian process with the same covariance function as that of the studentized empirical process). The principal contribution of this paper is to remove the need for this classical condition. We show that a considerably weaker sufficient condition is derived from an anti-concentration property of the supremum of the approximating Gaussian process, and we derive an inequality leading to such a property for separable Gaussian processes. We refer to the new condition as a generalized SBR condition. Our new result shows that the supremum does not concentrate too fast around any value.

We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Furthermore, our approach is asymptotically honest at a polynomial rate - namely, the error in coverage level converges to zero at a fast, polynomial speed (with respect to the sample size). In sharp contrast, the approach based on extreme value theory is asymptotically honest only at a logarithmic rate - the error converges to zero at a slow, logarithmic speed. Finally, of independent interest is our introduction of a new, practical version of Lepski's method, which computes the optimal, non-conservative resolution levels via a Gaussian multiplier bootstrap method.]]> http://www.ifs.org.uk/publications/7031 Mon, 30 Dec 2013 00:00:00 +0000 <![CDATA[Posterior inference in curved exponential families under increasing dimensions]]> This work studies the large sample properties of posterior-based inference in the curved exponential family under increasing dimension. The curved structure arises from the imposition of various restrictions on the model, such as moment restrictions, and plays a fundamental role in econometrics and other branches of data analysis. We establish conditions under which the posterior distribution is approximately normal, which in turn implies various good properties of estimation and inference procedures based on the posterior. In the process we also revisit and improve upon previous results for the exponential family under increasing dimension by making use of concentration of measure. We also discuss a variety of applications to high-dimensional versions of the classical econometric models including the multinomial model with moment restrictions, seemingly unrelated regression equations, and single structural equation models. In our analysis, both the parameter dimension and the number of moments are increasing with the sample size.

]]>
http://www.ifs.org.uk/publications/7030 Mon, 30 Dec 2013 00:00:00 +0000
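A compact sketch of the Gaussian multiplier bootstrap band construction discussed in the honest-confidence-bands abstract above, applied to a kernel density estimate. It deliberately ignores smoothing bias, data-driven bandwidth choice and the Lepski step, so it should be read as an illustration of the supremum calculation only; the sample, bandwidth and grid are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, h, B = 1000, 0.25, 500
X = rng.normal(size=n)
grid = np.linspace(-3, 3, 121)

gauss = lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
Kmat = gauss((grid[:, None] - X[None, :]) / h) / h     # kernel terms, shape (grid, n)
f_hat = Kmat.mean(axis=1)                              # kernel density estimate
sigma = Kmat.std(axis=1)                               # pointwise s.d. of the kernel terms

# multiplier bootstrap for the supremum of the studentized process
sup_draws = np.empty(B)
for b in range(B):
    e = rng.normal(size=n)
    G = (Kmat - f_hat[:, None]) @ e / np.sqrt(n)
    sup_draws[b] = np.max(np.abs(G / sigma))
c = np.quantile(sup_draws, 0.95)

lower = f_hat - c * sigma / np.sqrt(n)                 # uniform 95% band (bias ignored)
upper = f_hat + c * sigma / np.sqrt(n)
print("band half-width at x = 0:", c * sigma[60] / np.sqrt(n))
```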
<![CDATA[Uniform post selection inference for LAD regression and other z-estimation problems]]> We develop uniformly valid confidence regions for regression coefficients in a high-dimensional sparse least absolute deviation/median regression model. The setting is one where the number of regressors p could be large in comparison to the sample size n, but only s ≪ n of them are needed to accurately describe the regression function. Our new methods are based on the instrumental median regression estimator that assembles the optimal estimating equation from the output of the post ℓ1-penalized median regression and post ℓ1-penalized least squares in an auxiliary equation. The estimating equation is immunized against non-regular estimation of the nuisance part of the median regression function, in the sense of Neyman. We establish that in a homoscedastic regression model, the instrumental median regression estimator of a single regression coefficient is asymptotically root-n normal uniformly with respect to the underlying sparse model. The resulting confidence regions are valid uniformly with respect to the underlying model. We illustrate the value of uniformity with Monte Carlo experiments which demonstrate that standard/naive post-selection inference breaks down over large parts of the parameter space, and the proposed method does not. We then generalize our method to the case where p1 ≫ n regression coefficients are of interest in a non-smooth Z-estimation framework with approximately sparse nuisance functions, containing median regression with a single target regression coefficient as a very special case. We construct simultaneous confidence bands on all p1 coefficients, and establish their uniform validity over the underlying approximately sparse model.

]]>
http://www.ifs.org.uk/publications/7036 Mon, 30 Dec 2013 00:00:00 +0000
<![CDATA[On the asymptotic theory for least squares series: pointwise and uniform results]]> In this work we consider series estimators for the conditional mean in light of three new ingredients: (i) sharp LLNs for matrices derived from the non-commutative Khinchin inequalities, (ii) bounds on the Lebesgue factor that controls the ratio between the L∞ and L2-norms, and (iii) maximal inequalities for processes whose entropy integrals diverge at some rate.

These technical tools allow us to contribute to the series literature, specifically the seminal work of Newey (1997), as follows. First, we weaken considerably the condition on the number k of approximating functions used in series estimation from the typical k^2/n → 0 to k/n → 0, up to log factors, which was available only for spline and local polynomial partition series before. Second, under the same weak conditions we derive L2 rates and pointwise central limit theorem results when the approximation error vanishes. Under an incorrectly specified model, i.e. when the approximation error does not vanish, analogous results are also shown. Third, under stronger conditions we derive uniform rates and functional central limit theorems that hold whether or not the approximation error vanishes. That is, we derive the strong approximation for the entire estimate of the nonparametric function. Finally, we derive uniform rates and inference results for linear functionals of interest of the conditional expectation function, such as its partial derivative or conditional average partial derivative.

]]>
http://www.ifs.org.uk/publications/7035 Mon, 30 Dec 2013 00:00:00 +0000
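For readers who want to see the objects these pointwise results refer to, here is a bare-bones least squares series sketch with heteroscedasticity-robust pointwise standard errors. The polynomial basis, the choice k = 6 and the data-generating process are arbitrary illustrative assumptions; the paper's results cover spline and other bases under much weaker conditions on k.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 500, 6                                  # k approximating functions, small relative to n
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=n)

P = np.vander(x, k, increasing=True)           # series terms p(x) = (1, x, ..., x^(k-1))
beta, *_ = np.linalg.lstsq(P, y, rcond=None)
resid = y - P @ beta

Q_inv = np.linalg.inv(P.T @ P / n)
Omega = (P * resid[:, None] ** 2).T @ P / n    # heteroscedasticity-robust middle matrix
V = Q_inv @ Omega @ Q_inv / n                  # sandwich variance of the series coefficients

grid = np.linspace(0, 1, 101)
Pg = np.vander(grid, k, increasing=True)
g_hat = Pg @ beta                              # estimated regression function on a grid
se = np.sqrt(np.einsum("ij,jk,ik->i", Pg, V, Pg))   # pointwise standard errors
print(np.round(np.column_stack([grid[::25], g_hat[::25], se[::25]]), 3))
```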
<![CDATA[Nonparametric identification in panels using quantiles]]> This paper considers identification and estimation of ceteris paribus effects of continuous regressors in nonseparable panel models with time homogeneity. The effects of interest are derivatives of the average and quantile structural functions of the model. We find that these derivatives are identified with two time periods for 'stayers', i.e. for individuals with the same regressor values in two time periods. We show that the identification results carry over to models that allow location and scale time effects. We propose nonparametric series methods and a weighted bootstrap scheme to estimate and make inference on the identified effects. The bootstrap proposed allows uniform inference for function-valued parameters such as quantile effects over a region of quantiles or regressor values. An empirical application to Engel curve estimation with panel data illustrates the results.

]]>
http://www.ifs.org.uk/publications/7028 Mon, 30 Dec 2013 00:00:00 +0000
<![CDATA[Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors]]> p) is large compared to the sample size (n); in fact, p can be much larger than n, without restricting correlations of the coordinates of these vectors. We also show that the distribution of the maximum of a sum of the random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of the conditional Gaussian random vectors obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, p can be large or even much larger than n. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain non-asymptotic bounds on approximation errors.]]> http://www.ifs.org.uk/publications/7038 Mon, 30 Dec 2013 00:00:00 +0000 <![CDATA[Dynamic linear panel regression models with interactive fixed effects]]> We analyze linear panel regression models with interactive fixed effects and predetermined regressors, e.g. lagged dependent variables. The first order asymptotic theory of the least squares (LS) estimator of the regression coefficients is worked out in the limit where both the cross sectional dimension and the number of time periods become large. We find that there are two sources of asymptotic bias of the LS estimator: bias due to correlation or heteroscedasticity of the idiosyncratic error term, and bias due to predetermined (as opposed to strictly exogenous) regressors. We provide an estimator for the bias and a bias corrected LS estimator for the case where idiosyncratic errors are independent across both panel dimensions. Furthermore, we provide bias corrected versions of the three classical test statistics (Wald, LR and LM test) and show that their asymptotic distribution is a chi-square distribution. Monte Carlo simulations show that the bias correction of the LS estimator and of the test statistics also work well for finite sample sizes.

A supplement to this paper can be downloaded here.

]]>
http://www.ifs.org.uk/publications/7025 Mon, 30 Dec 2013 00:00:00 +0000
<![CDATA[Gaussian approximation of suprema of empirical processes]]> We develop a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the supremum norm. We prove an abstract approximation theorem that is applicable to a wide variety of problems, primarily in statistics. In particular, the bound in the main approximation theorem is non-asymptotic and the theorem does not require uniform boundedness of the class of functions. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein’s method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local empirical processes and series estimation in nonparametric regression where the classes of functions change with the sample size and are not Donsker-type. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.

]]>
http://www.ifs.org.uk/publications/7037 Mon, 30 Dec 2013 00:00:00 +0000
<![CDATA[Robust inference in high-dimensional approximately sparse quantile regression models]]> This work proposes new inference methods for the estimation of a regression coefficient of interest in quantile regression models. We consider high-dimensional models where the number of regressors potentially exceeds the sample size but a subset of them suffices to construct a reasonable approximation of the unknown quantile regression function in the model. The proposed methods are protected against moderate model selection mistakes, which are often inevitable in the approximately sparse model considered here. The methods construct (implicitly or explicitly) an optimal instrument as a residual from a density-weighted projection of the regressor of interest on other regressors. Under regularity conditions, the proposed estimators of the quantile regression coefficient are asymptotically root-n normal, with variance equal to the semi-parametric efficiency bound of the partially linear quantile regression model. In addition, the performance of the technique is illustrated through Monte Carlo experiments and an empirical example, dealing with risk factors in childhood malnutrition. The numerical results confirm the theoretical findings that the proposed methods should outperform the naive post-model selection methods in non-parametric settings. Moreover, the empirical results demonstrate soundness of the proposed methods.

]]>
http://www.ifs.org.uk/publications/7032 Mon, 30 Dec 2013 00:00:00 +0000
<![CDATA[Testing Many Moment Inequalities]]> This paper considers the problem of testing many moment inequalities where the number of moment inequalities, denoted by p, is possibly much larger than the sample size n. There are a variety of economic applications where the problem of testing many moment inequalities appears; a notable example is the entry model of Ciliberto and Tamer (2009) where p = 2^(m+1) with m being the number of firms. We consider the test statistic given by the maximum of p Studentized (or t-type) statistics, and analyze various ways to compute critical values for the test. Specifically, we consider critical values based upon (i) the union (Bonferroni) bound combined with a moderate deviation inequality for self-normalized sums, and (ii) the multiplier bootstrap. We also consider two-step variants of (i) and (ii) by incorporating moment selection. We prove validity of these methods, showing that under mild conditions, they lead to tests with error in size decreasing polynomially in n while allowing for p being much larger than n; indeed p can be of order exp(n^c) for some c > 0. Importantly, all these results hold without any restriction on correlation structure between p Studentized statistics, and also hold uniformly with respect to suitably wide classes of underlying distributions. We also show that all the tests developed in this paper are asymptotically minimax optimal when p grows with n.

]]>
http://www.ifs.org.uk/publications/7027 Mon, 30 Dec 2013 00:00:00 +0000
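A minimal numerical version of the test described above: the statistic is the maximum of p studentized sample means and the critical value comes from a Gaussian multiplier bootstrap. The design below (independent coordinates with common mean -0.05) is an illustrative assumption, and the moderate-deviation critical values and two-step moment selection discussed in the abstract are omitted.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, B, alpha = 300, 2000, 1000, 0.05
X = rng.normal(loc=-0.05, scale=1.0, size=(n, p))   # H0: E[X_j] <= 0 holds for every j here

mu = X.mean(axis=0)
s = X.std(axis=0, ddof=1)
T = np.sqrt(n) * np.max(mu / s)                     # max of p studentized statistics

Xc = (X - mu) / s                                   # centred and studentized data
draws = np.empty(B)
for b in range(B):
    e = rng.normal(size=n)                          # Gaussian multipliers
    draws[b] = np.max(e @ Xc) / np.sqrt(n)
crit = np.quantile(draws, 1 - alpha)

print("reject H0" if T > crit else "do not reject H0", f"(T = {T:.3f}, critical value = {crit:.3f})")
```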
<![CDATA[Comparison and anti-concentration bounds for maxima of Gaussian random vectors]]> http://www.ifs.org.uk/publications/7033 Mon, 30 Dec 2013 00:00:00 +0000 <![CDATA[Mostly harmless simulations? On the internal validity of empirical Monte Carlo studies]]> In this paper we evaluate the premise from the recent literature on Monte Carlo studies that an empirically motivated simulation exercise is informative about the actual ranking of various estimators when applied to a particular problem. We consider two alternative designs and provide an empirical test for both of them. We conclude that a necessary condition for the simulations to be informative about the true ranking is that the treatment effect in simulations must be equal to the (unknown) true effect. This severely limits the usefulness of such procedures, since were the effect known, the procedure would not be necessary.

]]>
http://www.ifs.org.uk/publications/7026 Sun, 29 Dec 2013 00:00:00 +0000
<![CDATA[Food for Thought? Breastfeeding and Child Development]]> http://www.ifs.org.uk/publications/7021 Wed, 18 Dec 2013 00:00:00 +0000 <![CDATA[Savings and wealth of the lifetime rich: evidence from the UK and US]]> http://www.ifs.org.uk/publications/7020 Tue, 17 Dec 2013 00:00:00 +0000 <![CDATA[Pivotal estimation via square-root lasso in nonparametric regression]]> http://www.ifs.org.uk/publications/7006 Tue, 10 Dec 2013 00:00:00 +0000 <![CDATA[Individual and time effects in nonlinear panel models with large N, T]]> Fixed effects estimators of nonlinear panel data models can be severely biased because of the well-known incidental parameter problem. We develop analytical and jackknife bias corrections for nonlinear models with both individual and time effects. Under asymptotic sequences where the time-dimension (T) grows with the cross-sectional dimension (N), the time effects introduce additional incidental parameter bias. As the existing bias corrections apply to models with only individual effects, we derive the appropriate corrections for the case when both effects are present. The basis for the corrections is a set of general asymptotic expansions of fixed effects estimators with incidental parameters in multiple dimensions. We apply the expansions to M-estimators with concave objective functions in parameters for panel models with additive individual and time effects. These estimators cover fixed effects estimators of the most popular limited dependent variable models such as logit, probit, ordered probit, Tobit and Poisson models. Our analysis therefore extends the use of large-T bias adjustments to an important class of models. We also develop bias corrections for functions of the data, parameters and individual and time effects, including average partial effects. In this case, the incidental parameter bias can be asymptotically of second order, but the corrections still improve finite-sample properties.

]]>
http://www.ifs.org.uk/publications/6987 Mon, 02 Dec 2013 00:00:00 +0000
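The paper's corrections are for nonlinear models with both individual and time effects; as a toy illustration of the split-panel jackknife logic behind such corrections, the sketch below applies the half-panel correction to the familiar incidental-parameter (Nickell) bias of the within estimator in a linear dynamic panel, where the bias is easy to generate and no numerical optimisation is needed. Everything about the design is an assumption made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(8)
N, T, rho = 500, 10, 0.5
alpha = rng.normal(size=N)                      # individual fixed effects
y = np.zeros((N, T + 1))
for t in range(1, T + 1):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=N)
y = y[:, 1:]                                    # drop the initial condition

def within_ar1(panel):
    # within (fixed-effects) estimator of the AR(1) coefficient
    lag, cur = panel[:, :-1], panel[:, 1:]
    lag_d = lag - lag.mean(axis=1, keepdims=True)
    cur_d = cur - cur.mean(axis=1, keepdims=True)
    return np.sum(lag_d * cur_d) / np.sum(lag_d ** 2)

rho_full = within_ar1(y)
rho_a, rho_b = within_ar1(y[:, : T // 2]), within_ar1(y[:, T // 2 :])
rho_spj = 2 * rho_full - 0.5 * (rho_a + rho_b)  # split-panel jackknife bias correction
print(f"within {rho_full:.3f}, jackknife-corrected {rho_spj:.3f}, true {rho}")
```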
<![CDATA[Covariate selection and model averaging in semiparametric estimation of treatment effects]]> In the practice of program evaluation, choosing the covariates and the functional form of the propensity score is an important decision for estimating treatment effects. This paper proposes data-driven model selection and model averaging procedures that address this issue for the propensity score weighting estimation of the average treatment effect for the treated (ATT). Building on the focussed information criterion (FIC), the proposed selection and averaging procedures aim to minimize the estimated mean squared error (MSE) of the ATT estimator in a local asymptotic framework. We formulate model averaging as a statistical decision problem in a limit experiment, and derive an averaging scheme that is Bayes optimal with respect to a given prior for the localisation parameters in the local asymptotic framework. In our Monte Carlo studies, the averaging estimator outperforms the post-covariate-selection estimator in terms of MSE, and shows a substantial reduction in MSE compared to conventional ATT estimators. We apply the procedures to evaluate the effect of the labour market program described in LaLonde (1986).

]]>
http://www.ifs.org.uk/publications/6988 Mon, 02 Dec 2013 00:00:00 +0000
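The object being selected and averaged over above is the propensity score weighting estimator of the ATT. Here is its simplest plug-in form (logit propensity score, odds weights on the untreated), with a simulated design and without the paper's covariate selection or averaging step; all names and values are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 2000
x = rng.normal(size=(n, 3))
p_true = 1.0 / (1.0 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))
d = rng.binomial(1, p_true)
y = 2.0 * d + x[:, 0] + rng.normal(size=n)            # constant effect, so ATT = 2

p_hat = LogisticRegression(C=1e6).fit(x, d).predict_proba(x)[:, 1]   # near-unpenalised logit
w = p_hat / (1.0 - p_hat)                             # odds weights for the untreated
att = y[d == 1].mean() - np.sum(w[d == 0] * y[d == 0]) / np.sum(w[d == 0])
print("propensity-score-weighted ATT estimate:", round(att, 3))
```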
<![CDATA[High dimensional methods and inference on structural and treatment effects]]> The goal of many empirical papers in economics is to provide an estimate of the causal or structural effect of a change in a treatment or policy variable, such as a government intervention or a price, on another economically interesting variable, such as unemployment or amount of a product purchased. Applied economists attempting to estimate such structural effects face the problems that economically interesting quantities like government policies are rarely randomly assigned and that the available data are often high-dimensional. Failure to address either of these issues generally leads to incorrect inference about structural effects, so methodology that is appropriate for estimating and performing inference about these effects when treatment is not randomly assigned and there are many potential control variables provides a useful addition to the tools available to applied economists.

]]>
http://www.ifs.org.uk/publications/6967 Thu, 21 Nov 2013 00:00:00 +0000
<![CDATA[The UK's public finances in the long run: the IFS model]]> http://www.ifs.org.uk/publications/6951 Mon, 18 Nov 2013 00:00:00 +0000 <![CDATA[A weak instrument F-test in linear IV models with multiple endogenous variables]]> We consider testing for weak instruments in a model with multiple endogenous variables. Unlike Stock and Yogo (2005), who considered a weak instruments problem where the rank of the matrix of reduced form parameters is near zero, here we consider a weak instruments problem of a near rank reduction of one in the matrix of reduced form parameters. For example, in a two-variable model, we consider weak instrument asymptotics of the form π1 = δπ2 + c/√n, where π1 and π2 are the parameters in the two reduced-form equations, c is a vector of constants and n is the sample size. We investigate the use of a conditional first-stage F-statistic along the lines of the proposal by Angrist and Pischke (2009) and show that, unless δ = 0, the variance in the denominator of their F-statistic needs to be adjusted in order to get a correct asymptotic distribution when testing the hypothesis H0 : π1 = δπ2. We show that a corrected conditional F-statistic is equivalent to the Cragg and Donald (1993) minimum eigenvalue rank test statistic, and is informative about the maximum total relative bias of the 2SLS estimator and the Wald test's size distortions. When δ = 0 in the two-variable model, or when there are more than two endogenous variables, further information over and above the Cragg-Donald statistic can be obtained about the nature of the weak instrument problem by computing the conditional first-stage F-statistics.

(A typo on page 27 that erroneously resulted in the OLS estimator instead of the 2SLS estimator was corrected in July 2015).

]]>
http://www.ifs.org.uk/publications/6940 Wed, 13 Nov 2013 00:00:00 +0000
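To fix ideas, the sketch below computes the Cragg-Donald minimum-eigenvalue statistic referenced in the abstract for a simulated two-endogenous-variable design whose reduced-form coefficient matrix is close to rank one (the kind of weak-instrument problem the paper studies). It does not reproduce the paper's corrected conditional F-statistics, and the design and all numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)
n, k = 1000, 4                                   # k instruments, two endogenous regressors
Z = rng.normal(size=(n, k))
V = rng.normal(size=(n, 2))                      # first-stage errors
Pi1 = np.array([1.0, 0.5, 0.0, 0.0])
# near rank-one reduced form: second column almost proportional to the first
X = np.column_stack([Z @ Pi1, Z @ (0.8 * Pi1) + 0.05 * Z[:, 2]]) + V

Zd = Z - Z.mean(axis=0)                          # partial out the constant
Xd = X - X.mean(axis=0)
X_fit = Zd @ np.linalg.solve(Zd.T @ Zd, Zd.T @ Xd)        # first-stage fitted values
Sigma = (Xd - X_fit).T @ (Xd - X_fit) / (n - k - 1)       # first-stage error covariance
L = np.linalg.cholesky(np.linalg.inv(Sigma))
cd = np.min(np.linalg.eigvalsh(L.T @ (Xd.T @ X_fit) @ L)) / k
print("Cragg-Donald minimum-eigenvalue statistic:", round(cd, 2))
```

A small value of the statistic signals that at least one linear combination of the endogenous regressors is only weakly related to the instruments, which is exactly the near rank reduction the abstract describes.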
<![CDATA[Program evaluation with high-dimensional data]]> We consider estimation of policy relevant treatment effects in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows heterogeneous treatment effects, endogenous receipt of treatment, and function-valued outcomes. To make informative inference possible, we assume that reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the covariates and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of controls whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after selection of an appropriate set of control variables formed by selecting controls separately for each reduced form relationship and then appropriately combining this set of reduced form predictive models and associated selected controls. We provide conditions under which post-selection inference is uniformly valid across a wide range of models and show that a key condition underlying uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed treatment effect estimation methods with an application to estimating the effect of 401(k) participation on accumulated assets.

]]>
http://www.ifs.org.uk/publications/6939 Tue, 12 Nov 2013 00:00:00 +0000
<![CDATA[Optimal uniform convergence rates for sieve nonparametric instrumental variables regression]]> We study the problem of nonparametric regression when the regressor is endogenous, which is an important nonparametric instrumental variables (NPIV) regression problem in econometrics and a difficult ill-posed inverse problem with unknown operator in statistics. We first establish a general upper bound on the sup-norm (uniform) convergence rate of a sieve estimator, allowing for endogenous regressors and weakly dependent data. This result leads to the optimal sup-norm convergence rates for spline and wavelet least squares regression estimators under weakly dependent data and heavy-tailed error terms. This upper bound also yields the sup-norm convergence rates for sieve NPIV estimators under i.i.d. data: the rates coincide with the known optimal L2-norm rates for severely ill-posed problems, and are a power of log(n) slower than the optimal L2-norm rates for mildly ill-posed problems. We then establish the minimax risk lower bound in sup-norm loss, which coincides with our upper bounds on sup-norm rates for the spline and wavelet sieve NPIV estimators. This sup-norm rate optimality provides another justification for the wide application of sieve NPIV estimators. Useful results on weakly dependent random matrices are also provided.

]]>
http://www.ifs.org.uk/publications/6923 Mon, 04 Nov 2013 00:00:00 +0000
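A minimal sieve NPIV sketch of the estimator whose sup-norm rates are studied above: the structural function is approximated by one polynomial sieve and the conditional moment is projected onto a larger polynomial sieve in the instrument. The data-generating process and the sieve dimensions are arbitrary illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 2000
w = rng.uniform(-1, 1, n)                        # instrument
e = rng.normal(size=n)                           # structural error
x = 0.8 * w + 0.4 * e + 0.2 * rng.normal(size=n) # endogenous regressor
y = np.sin(np.pi * x) + e                        # h0(x) = sin(pi x)

def powers(v, J):
    return np.vander(v, J, increasing=True)      # (1, v, ..., v^(J-1))

Psi, Bmat = powers(x, 5), powers(w, 7)           # sieve for h0 and larger sieve for the instrument
PB = Bmat @ np.linalg.solve(Bmat.T @ Bmat, Bmat.T)        # projection onto the instrument sieve
beta = np.linalg.solve(Psi.T @ PB @ Psi, Psi.T @ PB @ y)  # sieve NPIV coefficients

grid = np.linspace(-1, 1, 5)
print(np.round(np.column_stack([np.sin(np.pi * grid), powers(grid, 5) @ beta]), 2))
```

Because e enters both x and y, ordinary least squares on the sieve would be biased; projecting onto the instrument sieve is what replaces that projection and is also the source of the ill-posedness the abstract discusses.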
<![CDATA[Set inferences and sensitivity analysis in semiparametric conditionally identified models]]> This paper provides tools for partial identification inference and sensitivity analysis in a general class of semiparametric models. The main working assumption is that the finite-dimensional parameter of interest and the possibly infinite-dimensional nuisance parameter are identified conditionally on other nuisance parameters being known. This structure arises in numerous applications and leads to relatively simple inference procedures. The paper develops uniform convergence results for a set of semiparametric two-step GMM estimators, and it uses the uniformity to establish set inferences, including confidence regions for the identified set and the true parameter. Sensitivity analysis considers a domain of variation for the unidentified parameter that can be well outside its identified set, which demands inference to be established under misspecification. The paper also introduces new measures of sensitivity. Inferences are implemented with new bootstrap methods. Several example applications illustrate the wide applicability of our results.

]]>
http://www.ifs.org.uk/publications/6911 Mon, 28 Oct 2013 00:00:00 +0000
<![CDATA[Efficient responses to targeted cash transfers]]> In this paper, we estimate a collective model of household consumption and test the restrictions of collective rationality using z-conditional demands in the context of a large Conditional Cash Transfer programme in rural Mexico. We show that the model is able to explain the impacts the programme has on the structure of food consumption. We use two plausible and novel distribution factors, that is variables that describe the mechanism by which decisions are reached within the household: the random allocation of a cash transfer to women, and the relative size and wealth of the husband and wife's family networks. We find that the structure we propose does better at predicting the effect of exogenous increases in household income than an alternative, unitary, structure. We cannot reject efficiency of household decisions.

]]>
http://www.ifs.org.uk/publications/6910 Fri, 25 Oct 2013 00:00:00 +0000
<![CDATA[Nonparametric estimation of a heterogeneous demand function under the Slutsky inequality restriction]]> http://www.ifs.org.uk/publications/6898 Thu, 17 Oct 2013 00:00:00 +0000 <![CDATA[Anchoring the yield curve using survey expectations]]> http://www.ifs.org.uk/publications/6889 Tue, 15 Oct 2013 00:00:00 +0000 <![CDATA[A bootstrap test for instrument validity in heterogeneous treatment effect models]]> This paper develops a specification test for the instrument validity conditions in the heterogeneous treatment effect model with a binary treatment and a discrete instrument. A necessary testable implication for the joint restriction of instrument exogeneity and instrument monotonicity is given by nonnegativity of point-identifiable complier's outcome densities. Our specification test infers this testable implication using a Kolmogorov-Smirnov type test statistic. We provide a bootstrap algorithm to implement the proposed test and show its asymptotic validity. The proposed test procedure can apply to both discrete and continuous outcome cases.

]]>
http://www.ifs.org.uk/publications/6890 Tue, 15 Oct 2013 00:00:00 +0000
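To see what the testable implication above looks like in practice, the sketch below bins the outcome, estimates the point-identified quantities whose nonnegativity is implied by instrument exogeneity and monotonicity, measures the largest violation, and compares it with a recentred nonparametric bootstrap quantile. This is a heavily simplified, binned stand-in for the paper's Kolmogorov-Smirnov type procedure; the simulated design satisfies the assumptions by construction.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 4000
z = rng.binomial(1, 0.5, n)
d = (rng.uniform(size=n) < 0.3 + 0.4 * z).astype(int)   # monotone first stage
y = rng.normal(loc=d, scale=1.0)                        # instrument valid by construction
bins = np.linspace(-4, 5, 19)

def cell(y, d, z, dval, zval):
    # estimate of P(Y in cell, D = dval | Z = zval) over the grid of cells
    sel_z = z == zval
    return np.histogram(y[sel_z & (d == dval)], bins=bins)[0] / sel_z.sum()

def q(y, d, z):
    # quantities that are nonnegative under instrument exogeneity and monotonicity
    q1 = cell(y, d, z, 1, 1) - cell(y, d, z, 1, 0)
    q0 = cell(y, d, z, 0, 0) - cell(y, d, z, 0, 1)
    return np.concatenate([q1, q0])

q_hat = q(y, d, z)
T = max(0.0, np.max(-q_hat))                            # largest estimated violation

B = 300
draws = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)
    draws[b] = np.max(-(q(y[idx], d[idx], z[idx]) - q_hat))   # recentred bootstrap draw
print("reject validity" if T > np.quantile(draws, 0.95) else "no evidence against validity")
```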
<![CDATA[Identification and estimation of preference distributions when voters are ideological]]> This paper studies the nonparametric identification and estimation of voters' preferences when voters are ideological. We establish that voter preference distributions and other parameters of interest can be identified from aggregate electoral data. We also show that these objects can be consistently estimated and illustrate our analysis by performing an actual estimation using data from the 1999 European Parliament elections.

]]>
http://www.ifs.org.uk/publications/6887 Tue, 08 Oct 2013 00:00:00 +0000
<![CDATA[Generalized method of moments with latent variables]]> http://www.ifs.org.uk/publications/6886 Mon, 07 Oct 2013 00:00:00 +0000 <![CDATA[Linear regression for panel with unknown number of factors as interactive fixed effects]]> In this paper we study the least squares (LS) estimator in a linear panel regression model with interactive fixed effects for asymptotics where both the number of time periods and the number of cross-sectional units go to infinity. Under appropriate assumptions we show that the limiting distribution of the LS estimator for the regression coefficients is independent of the number of interactive fixed effects used in the estimation, as long as this number does not fall below the true number of interactive fixed effects present in the data. The important practical implication of this result is that for inference on the regression coefficients one does not necessarily need to estimate the number of interactive effects consistently, but can rely on an upper bound of this number to calculate the LS estimator.

Supplementary material for this paper is available here.

]]>
http://www.ifs.org.uk/publications/6883 Fri, 04 Oct 2013 00:00:00 +0000
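An alternating-least-squares sketch of the least squares estimator with interactive fixed effects discussed above, deliberately run with more factors than the truth to mirror the abstract's point that an upper bound on the number of factors can be enough for estimating the slope. The design and the convergence tolerance are assumptions; this illustrates the estimator, not the paper's asymptotic results.

```python
import numpy as np

rng = np.random.default_rng(13)
N, T, R_true, R_used = 100, 50, 2, 4            # estimation uses an upper bound on the factor number
beta_true = 1.0
F = rng.normal(size=(T, R_true))
Lam = rng.normal(size=(N, R_true))
X = rng.normal(size=(N, T)) + 0.5 * (Lam @ F.T) # regressor correlated with the factor structure
Y = beta_true * X + Lam @ F.T + rng.normal(size=(N, T))

b = 0.0
for _ in range(500):
    # given b, the best rank-R_used factor structure is the truncated SVD of the residuals
    U, s, Vt = np.linalg.svd(Y - b * X, full_matrices=False)
    common = (U[:, :R_used] * s[:R_used]) @ Vt[:R_used]
    # given the factor structure, update b by least squares
    b_new = np.sum(X * (Y - common)) / np.sum(X * X)
    if abs(b_new - b) < 1e-10:
        break
    b = b_new
print("interactive-fixed-effects LS estimate of beta:", round(b, 3))
```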
<![CDATA[On the identification of structural linear functionals]]> http://www.ifs.org.uk/publications/6882 Fri, 04 Oct 2013 00:00:00 +0000 <![CDATA[Extremum sieve estimation in <i>k</i>-out-of-<i>n</i> systems]]> This paper considers nonparametric estimation of absolutely continuous distribution functions of lifetimes of non-identical components in k-out-of-n systems from the observed 'autopsy' data. In economics, ascending 'button' or 'clock' auctions with n heterogeneous bidders present 2-out-of-n systems. Classical competing risk models are examples of k-out-of-n systems. Under weak conditions on the underlying distributions the estimation problem is shown to be well-posed and the suggested extremum sieve estimator is proven to be consistent. The paper illustrates the suggested estimation method by using sieve spaces of Bernstein polynomials which allow an easy implementation of constraints on the monotonicity of estimated distribution functions.

]]>
http://www.ifs.org.uk/publications/6872 Wed, 02 Oct 2013 00:00:00 +0000
<![CDATA[Policy discontinuity and duration outcomes]]> http://www.ifs.org.uk/publications/6871 Wed, 02 Oct 2013 00:00:00 +0000 <![CDATA[Convolution without independence]]> http://www.ifs.org.uk/publications/6866 Mon, 23 Sep 2013 00:00:00 +0000 <![CDATA[Higher-order properties of approximate estimators]]> Many modern estimation methods in econometrics approximate an objective function, for instance, through simulation or discretisation. These approximations typically affect both bias and variance of the resulting estimator. We provide a higher-order expansion of such 'approximate' estimators that takes into account the errors due to the use of approximations. This expansion allows us to establish general conditions under which the approximate estimator is first-order equivalent to the exact estimator. Moreover, we use the expansion to propose adjustments of the approximate estimator that remove its first-order bias and adjust its standard errors. These adjustments apply to a broad class of approximate estimators that includes all known simulation-based procedures. We also propose another approach to reduce the impact of approximations, based on a Newton-Raphson adjustment. A Monte Carlo simulation on the mixed logit model shows that our proposed adjustments can yield spectacular improvements at a low computational cost.

]]>
http://www.ifs.org.uk/publications/6862 Thu, 19 Sep 2013 00:00:00 +0000
<![CDATA[Properties of the maximum likelihood estimator in spatial autoregressive models]]> The (quasi-) maximum likelihood estimator (MLE) for the autoregressive parameter in a spatial autoregressive model cannot in general be written explicitly in terms of the data. The only known properties of the estimator have hitherto been its first-order asymptotic properties (Lee, 2004, Econometrica), derived under specific assumptions on the evolution of the spatial weights matrix involved. In this paper we show that the exact cumulative distribution function of the estimator can, under mild assumptions, be written down explicitly. A number of immediate consequences of the main result are discussed, and several examples of theoretical and practical interest are analysed in detail. The examples are of interest in their own right, but also serve to illustrate some unexpected features of the distribution of the MLE. In particular, we show that the distribution of the MLE may not be supported on the entire parameter space, and may be nonanalytic at some points in its support.

Supplementary material relating to this working paper can be viewed here

]]>
http://www.ifs.org.uk/publications/6861 Thu, 19 Sep 2013 00:00:00 +0000
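Because the (quasi-)MLE of the autoregressive parameter has no explicit form, in practice it is computed by maximising the concentrated log-likelihood numerically. The sketch below does this for a pure spatial autoregression on an assumed circular, row-normalised weights matrix; it illustrates why the estimator is only defined implicitly, not the paper's exact distribution theory.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(14)
n, rho_true = 200, 0.4
W = np.zeros((n, n))                 # row-normalised nearest-neighbour weights on a circle
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
y = np.linalg.solve(np.eye(n) - rho_true * W, rng.normal(size=n))

def neg_concentrated_loglik(rho):
    A = np.eye(n) - rho * W
    e = A @ y                        # residuals with sigma^2 concentrated out
    _, logdet = np.linalg.slogdet(A)
    return -(logdet - 0.5 * n * np.log(e @ e / n))

res = minimize_scalar(neg_concentrated_loglik, bounds=(-0.99, 0.99), method="bounded")
print("QMLE of the spatial autoregressive parameter:", round(res.x, 3))
```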
<![CDATA[Do the UK Government’s welfare reforms make work pay]]> http://www.ifs.org.uk/publications/6853 Wed, 11 Sep 2013 00:00:00 +0000 <![CDATA[Prospect theory and tax evasion: a reconsideration of the Yitzhaki Puzzle]]> puzzle. We disentangle four distinct elements of prospect theory and find loss aversion and probability weighting to be redundant in endogenous specification of the reference level. These classes include, as special cases, the most common specifications in the literature. New specifications of the reference level are needed, we conclude.]]> http://www.ifs.org.uk/publications/6843 Fri, 30 Aug 2013 00:00:00 +0000 <![CDATA[Livestock asset transfers with and without training: evidence from Rwanda]]> This paper presents evidence from Rwanda's Girinka ('One Cow per Poor Family') programme that has distributed more than 130,000 livestock asset transfers in the form of cows to the rural poor since 2006. Supply-side constraints on the programme result in some beneficiaries receiving complementary training with the cow transfer, and other households not receiving such training with their cow. We exploit these constraints to estimate the additional impact of receiving complementary training with the cow transfer on households' economic outcomes up to six years after having received the livestock asset transfer. Our results show that even in a setting such as rural Rwanda where linkages between farmers and produce markets remain weak, the provision of training with asset transfers has permanent and economically significant impacts on milk production, milk yields from livestock, household earnings, and asset accumulation. The results have important implications for the current generation of 'ultra-poor' livestock asset transfer programmes being trialled globally as a means to allow the rural poor to better their economic lives.

]]>
http://www.ifs.org.uk/publications/6841 Fri, 30 Aug 2013 00:00:00 +0000
<![CDATA[Career progression, economic downturns, and skills]]> http://www.ifs.org.uk/publications/6842 Fri, 30 Aug 2013 00:00:00 +0000 <![CDATA[Generalized instrumental variable models]]> The ability to allow for flexible forms of unobserved heterogeneity is an essential ingredient in modern microeconometrics. In this paper we extend the application of instrumental variable (IV) models to a wide class of problems in which multiple values of unobservable variables can be associated with particular combinations of observed endogenous and exogenous variables. In our Generalised Instrumental Variable (GIV) models, in contrast to traditional IV models, the mapping from unobserved heterogeneity to endogenous variables need not admit a unique inverse. The class of GIV models allows unobservables to be multivariate and to enter nonseparably into the determination of endogenous variables, thereby removing strong practical limitations on the role of unobserved heterogeneity. Important examples include models with discrete or mixed continuous/discrete outcomes and continuous unobservables, and models with excess heterogeneity where many combinations of different values of multiple unobserved variables, such as random coefficients, can deliver the same realisations of outcomes. We use tools from random set theory to study identification in such models and provide a sharp characterisation of the identified set of structures admitted. We demonstrate the application of our analysis to a continuous outcome model with an interval-censored endogenous explanatory variable.

]]>
http://www.ifs.org.uk/publications/6840 Thu, 29 Aug 2013 00:00:00 +0000
<![CDATA[The macro-dynamics of sorting between workers and firms]]> http://www.ifs.org.uk/publications/6838 Tue, 27 Aug 2013 00:00:00 +0000 <![CDATA[The effect of fragmentation in trading on market quality in the UK equity market]]> We investigate the effects of fragmentation in equity trading on the quality of the trading outcomes, specifically volatility, liquidity and volume. We use panel regression methods on a weekly dataset following the FTSE350 stocks over the period 2008-2011, which provides substantial cross-sectional and time series variation in fragmentation. This period coincided with a great deal of turbulence in the UK equity markets, which had multiple causes that need to be controlled for. To achieve this, we use a version of the common correlated effects estimator (Pesaran, 2006). One finding is that volatility is lower in a fragmented market when compared to a monopoly. Trading volume at the London Stock Exchange is lower too, but global trading volume is higher if order flow is fragmented across multiple venues. When separating overall fragmentation into visible fragmentation and dark trading, we find that the decline in LSE volume can be attributed to visible fragmentation, while the increase in global volume is due to dark trading.

]]>
http://www.ifs.org.uk/publications/6837 Tue, 27 Aug 2013 00:00:00 +0000
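The common correlated effects estimator mentioned above can be summarised in a few lines: augment each unit's time-series regression with cross-sectional averages of the dependent variable and the regressors, so that unobserved common factors are absorbed, then average the unit-specific slopes. The sketch below is the simplest mean-group version on a simulated panel; the design and dimensions are assumptions and none of the paper's controls or market-quality measures appear.

```python
import numpy as np

rng = np.random.default_rng(15)
N, T = 50, 150
f = rng.normal(size=T)                                   # unobserved common factor
gamma = rng.normal(size=N)                               # heterogeneous factor loadings
x = rng.normal(size=(N, T)) + 0.5 * gamma[:, None] * f   # regressor loads on the factor
y = 1.0 * x + gamma[:, None] * f + rng.normal(size=(N, T))   # true slope = 1

ybar, xbar = y.mean(axis=0), x.mean(axis=0)              # cross-sectional averages
slopes = []
for i in range(N):
    Z = np.column_stack([np.ones(T), x[i], ybar, xbar])  # unit regression augmented with averages
    slopes.append(np.linalg.lstsq(Z, y[i], rcond=None)[0][1])
print("CCE mean-group estimate of the slope:", round(float(np.mean(slopes)), 3))
```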
<![CDATA[Nonparametric estimation of multivariate elliptic densities via finite mixture sieves]]> This paper considers the class of p-dimensional elliptic distributions (p ≥ 1) satisfying the consistency property (Kano, 1994) and within this general framework presents a two-stage semiparametric estimator for the Lebesgue density based on Gaussian mixture sieves. Under the online Exponentiated Gradient (EG) algorithm of Helmbold et al. (1997) and without restricting the mixing measure to have compact support, the estimator produces estimates converging uniformly in probability to the true elliptic density at a rate that is independent of the dimension of the problem, hence circumventing the familiar curse of dimensionality inherent to many semiparametric estimators. The rate performance of our estimator depends on the tail behaviour of the underlying mixing density (and hence that of the data) rather than smoothness properties. In fact, our method achieves a rate of at least Op(n^(-1/4)), provided only some positive moment exists. When further moments exist, the rate improves, reaching Op(n^(-3/8)) as the tails of the true density converge to those of a normal. Unlike the elliptic density estimator of Liebscher (2005), our sieve estimator always yields an estimate that is a valid density, and is also attractive from a practical perspective as it accepts data as a stream, thus significantly reducing computational and storage requirements. Monte Carlo experimentation indicates encouraging finite sample performance over a range of elliptic densities. The estimator is also implemented in a binary classification task using the well-known Wisconsin breast cancer dataset.

This paper is forthcoming in The Journal of Multivariate Analysis.

]]>
http://www.ifs.org.uk/publications/6836 Thu, 22 Aug 2013 00:00:00 +0000
<![CDATA[Nonlinear difference-in-differences in repeated cross sections with continuous treatments]]> http://www.ifs.org.uk/publications/6834 Tue, 20 Aug 2013 00:00:00 +0000 <![CDATA[Outcome conditioned treatment effects]]> This paper introduces average treatment effects conditional on the outcome variable in an endogenous setup where outcome Y, treatment X and instrument Z are continuous. These objects allow one to refine well-studied treatment effects such as the ATE and ATT in the case of continuous treatment (see Florens et al. (2009)) by breaking them up according to the rank of the outcome distribution. For instance, in the returns to schooling case, the outcome conditioned average treatment effect on the treated (ATTO) gives the average effect of a small increase in schooling on the subpopulation characterised by a certain treatment intensity, say 16 years of schooling, and a certain rank in the wage distribution. We show that IV type approaches are better suited to identify overall averages across the population like the average partial effect, or outcome conditioned versions thereof, while selection type methods are better suited to identify ATT or ATTO. Importantly, none of the identification relies on rectangular support of the errors in the identification equation. Finally, we apply all concepts to analyse the nonlinear heterogeneous effects of smoking during pregnancy on infant birth weight.

]]>
http://www.ifs.org.uk/publications/6833 Tue, 20 Aug 2013 00:00:00 +0000
<![CDATA[Implementing intersection bounds in Stata]]> We present the clrbound, clr2bound, clr3bound and clrtest commands for estimation and inference developed by Chernozhukov et al. (2013). The commands clrbound, clr2bound and clr3bound provide bound estimates that can be used directly for estimation or to construct asymptotically valid confidence sets. The command clrbound provides bound estimates for one-sided lower or upper intersection bounds on a parameter, while clr2bound and clr3bound provide two-sided bound estimates based on both lower and upper intersection bounds. clr2bound uses Bonferroni's inequality to construct two-sided bounds, whereas clr3bound inverts a hypothesis test. The former can be used to perform asymptotically valid inference on the identified set or the parameter, while the latter can be used to provide asymptotically valid and generally tighter confidence intervals for the parameter. clrtest performs an intersection bound test of the hypothesis that a collection of lower intersection bounds is no greater than zero. Inversion of this test can be used to construct confidence sets based on conditional moment inequalities as described in Chernozhukov et al. (2013). The commands include parametric, series and local linear estimation procedures and can be installed from within Stata by typing 'ssc install clrbound'.

]]>
http://www.ifs.org.uk/publications/6832 Fri, 16 Aug 2013 00:00:00 +0000
<![CDATA[Consumer Demand System Estimation and Value Added Tax Reforms in the Czech Republic]]> http://www.ifs.org.uk/publications/6830 Tue, 13 Aug 2013 00:00:00 +0000 <![CDATA[Wealth effects and the consumption of Italian households in the Great Recession]]> http://www.ifs.org.uk/publications/6831 Tue, 13 Aug 2013 00:00:00 +0000 <![CDATA[Mismatch, sorting and wage dynamics]]> We develop an empirical search-matching model which is suitable for analysing the wage, employment and welfare impact of regulation in a labour market with heterogeneous workers and jobs. To achieve this, we develop an equilibrium model of wage determination and employment which extends the current literature on equilibrium wage determination with matching and provides a bridge between some of the most prominent macro models and microeconometric research. The model incorporates productivity shocks, long-term contracts, on-the-job search and counter-offers. Importantly, the model allows for the possibility of assortative matching between workers and jobs due to complementarities between worker and job characteristics. We use the model to estimate the potential gain from optimal regulation and we consider the potential gains and redistributive impacts from optimal unemployment insurance policy. The model is estimated on the NLSY using the method of moments.

]]>
http://www.ifs.org.uk/publications/6825 Fri, 09 Aug 2013 00:00:00 +0000
<![CDATA[Testing for intertemporal nonseparability]]> http://www.ifs.org.uk/publications/6829 Fri, 09 Aug 2013 00:00:00 +0000 <![CDATA[Ill-posed inverse problems in economics]]> http://www.ifs.org.uk/publications/6828 Fri, 09 Aug 2013 00:00:00 +0000 <![CDATA[Education policy and intergenerational transfers in equilibrium]]> http://www.ifs.org.uk/publications/6826 Fri, 09 Aug 2013 00:00:00 +0000 <![CDATA[Spatial sorting]]> We investigate the role of complementarities in production and skill mobility across cities. We propose a general equilibrium model of location choice by heterogeneously skilled workers, and consider different degrees of complementarities between the skills of workers. The nature of the complementarities determines the equilibrium skill distribution across cities. We prove that with extreme-skill complementarity, the skill distribution has fatter tails in large cities; with top-skill complementarity, there is first-order stochastic dominance. Using the model to back out skills from wage and housing price data, we find robust evidence of fat tails in large cities. Big cities have big inequality. This pattern of spatial sorting is consistent with extreme-skill complementarity: the productivity of high skilled workers and of the providers of low skilled services is mutually enhanced.

]]>
http://www.ifs.org.uk/publications/6827 Fri, 09 Aug 2013 00:00:00 +0000
<![CDATA[Nonparametric analysis of random utility models: testing]]> http://www.ifs.org.uk/publications/6824 Wed, 07 Aug 2013 00:00:00 +0000 <![CDATA[Do institutions affect social preferences? Evidence from divided Korea]]> The Cold War division of Korea, regarded as a natural experiment in institutional change, provides a unique opportunity to examine whether institutions affect social preferences. We recruited North Korean refugees and South Korean students to conduct laboratory experiments eliciting social preferences, together with standard surveys measuring subjective attitudes toward political and economic institutions. Our experiments employ widely used dictator and trust games, with four possible group matches between North and South Koreans by informing them of the group identity of their anonymous partners. Experimental behaviour and support for institutions differ substantially between and within groups. North Korean refugees prefer more egalitarian distribution in the dictator games than South Korean students, even after controlling for individual characteristics that could be correlated with social preferences; however, the two groups show little difference in the trust game, once we control for more egalitarian behaviour of North Koreans. North Korean refugees show less support for market economy and democracy than South Korean subjects. Attitudes toward institutions are more strongly associated with the experimental behaviours among South Korean subjects than among North Korean subjects.

An online appendix to accompany this publication is available here

]]>
http://www.ifs.org.uk/publications/6823 Thu, 01 Aug 2013 00:00:00 +0000
<![CDATA[Individual heterogeneity and average welfare]]> Individual heterogeneity is an important source of variation in demand. Allowing for general heterogeneity is needed for correct welfare comparisons. We consider general heterogeneous demand where preferences and linear budget sets are statistically independent. We find that the dimension of heterogeneity and the individual demand functions are not identified. We also find that the exact consumer surplus of a price change, averaged across individuals, is not identified, motivating bounds analysis. We use bounds on income effects to derive relatively simple bounds on the average surplus, including for discrete/continuous choice. We also sketch an approach to bounding surplus that does not use income effect bounds. We apply the results with income effect bounds to gasoline demand. We find little sensitivity to the income effect bounds in this application.

]]>
http://www.ifs.org.uk/publications/6820 Tue, 30 Jul 2013 00:00:00 +0000
<![CDATA[Inference in ordered response games with complete information]]> We study econometric models of complete information games with ordered action spaces, such as the number of store fronts operated in a market by a firm, or the daily number of flights on a city-pair offered by an airline. The model generalises single agent models such as ordered probit and logit to a simultaneous model of ordered response. We characterise identified sets for model parameters under mild shape restrictions on agents' payoff functions. We then propose a novel inference method for a parametric version of our model based on a test statistic that embeds conditional moment inequalities implied by equilibrium behaviour. Using maximal inequalities for U-processes, we show that an asymptotically valid confidence set is attained by employing an easy to compute fixed critical value, namely the appropriate quantile of a chi-square random variable. We apply our method to study capacity decisions measured as the number of stores operated by Lowe's and Home Depot in geographic markets. We demonstrate how our confidence sets for model parameters can be used to perform inference on other quantities of economic interest, such as the probability that any given outcome is an equilibrium and the propensity with which any particular outcome is selected when it is one of multiple equilibria.

]]>
http://www.ifs.org.uk/publications/6819 Mon, 29 Jul 2013 00:00:00 +0000
<![CDATA[Inference on treatment effects after selection amongst high-dimensional controls]]> We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances where the number of controls may be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the 'post-double-selection' method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties.

The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We also present a simple generalisation of our method to a fully heterogeneous model with a binary treatment variable. We illustrate the use of the developed methods with numerical simulations and an application that considers the effect of abortion on crime rates.
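
As a point of reference for the procedure described above, the following Python sketch walks through the three post-double-selection steps on simulated data: a Lasso of the outcome on the controls, a Lasso of the treatment on the controls, and an OLS regression of the outcome on the treatment plus the union of selected controls. The cross-validated penalty choice, the variable names and the data-generating process are illustrative choices of ours, not the authors' implementation.

```python
# Minimal post-double-selection sketch (illustrative, not the authors' code).
# Step 1: Lasso of outcome y on controls X; Step 2: Lasso of treatment d on X;
# Step 3: OLS of y on d plus the union of controls selected in either step.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 500, 200
X = rng.standard_normal((n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)      # treatment
y = 1.0 * d + X[:, 0] - X[:, 2] + rng.standard_normal(n)  # outcome, true effect = 1

sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)   # controls that predict y
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)   # controls that predict d
union = np.union1d(sel_y, sel_d)

Z = sm.add_constant(np.column_stack([d, X[:, union]]))
fit = sm.OLS(y, Z).fit(cov_type="HC1")
print("treatment effect:", fit.params[1], "s.e.:", fit.bse[1])
```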

]]>
http://www.ifs.org.uk/publications/6731 Mon, 22 Jul 2013 00:00:00 +0000
<![CDATA[Dealing with randomisation bias in a social experiment exploiting the randomisation itself: the case of ERA]]> Randomisation bias arises when the process of randomisation itself alters the population that takes part in an experiment, so that the experimental sample is no longer representative of the eligible population. We illustrate how this has happened in the case of the UK Employment Retention and Advancement (ERA) experiment, in which over one quarter of the eligible population was not represented. Our objective is to quantify the impact that the ERA eligible population would have experienced under ERA, and to assess how this impact relates to the experimental impact estimated on the potentially selected subgroup of study participants. We show that the typical matching assumption required to identify the average treatment effect of interest is made up of two parts. One part remains testable under the experiment even in the presence of randomisation bias, and offers a way to correct the non-experimental estimates should they fail to pass the test. The other part rests on what we argue is a very weak assumption, at least in the case of ERA. We apply these ideas to the ERA program and show the power of this strategy. Further exploiting the experiment, we assess the validity in our application of the claim often made in the literature that knowledge of long and detailed labour market histories can control for most selection bias in the evaluation of labour market interventions. Finally, for the case of survey-based outcomes, we develop a reweighting estimator which takes account of both non-participation and non-response.]]> http://www.ifs.org.uk/publications/6816 Mon, 22 Jul 2013 00:00:00 +0000 <![CDATA[Entropic Latent Variable Integration via Simulation]]> This paper introduces a general method to convert a model defined by moment conditions involving both observed and unobserved variables into equivalent moment conditions involving only observable variables. This task can be accomplished without introducing infinite-dimensional nuisance parameters by using a least-favourable entropy-maximising distribution. We demonstrate, through examples and simulations, that this approach covers a wide class of latent variable models, including some game-theoretic models and models with limited dependent variables, interval-valued data, errors-in-variables, or combinations thereof. Both point- and set-identified models are transparently covered. In the latter case, the method also complements the recent literature on generic set-inference methods by providing the moment conditions needed to construct a GMM-type objective function for a wide class of models. Extensions of the method that cover conditional moments, independence restrictions and some state-space models are also given.

]]>
http://www.ifs.org.uk/publications/6814 Wed, 17 Jul 2013 00:00:00 +0000
<![CDATA[A simple bootstrap method for constructing nonparametric confidence bands for functions]]> Standard approaches to constructing nonparametric confidence bands for functions are frustrated by the impact of bias, which generally is not estimated consistently when using the bootstrap and conventionally smoothed function estimators. To overcome this problem, it is common practice to either undersmooth, so as to reduce the impact of bias, or oversmooth, and thereby introduce an explicit or implicit bias estimator. However, these approaches, and others based on nonstandard smoothing methods, complicate the process of inference, for example by requiring the choice of new, unconventional smoothing parameters and, in the case of undersmoothing, producing relatively wide bands. In this paper we suggest a new approach, which exploits to our advantage one of the difficulties that, in the past, has prevented an attractive solution to the problem: the fact that the standard bootstrap bias estimator suffers from relatively high-frequency stochastic error. The high frequency, together with a technique based on quantiles, can be exploited to dampen down the stochastic error term, leading to relatively narrow, simple-to-construct confidence bands.

A supplement to this article, which outlines theoretical properties underpinning the methodology and provides a proof of the theorem, can be viewed here

]]>
http://www.ifs.org.uk/publications/6782 Mon, 01 Jul 2013 00:00:00 +0000
<![CDATA[A nonparametric test of a strong leverage hypothesis]]> http://www.ifs.org.uk/publications/6781 Mon, 01 Jul 2013 00:00:00 +0000 <![CDATA[Adaptive nonparametric instrumental variables estimation: empirical choice of the regularization parameter]]> In nonparametric instrumental variables estimation, the mapping that identifies the function of interest, g say, is discontinuous and must be regularised (that is, modified) to make consistent estimation possible. The amount of modification is controlled by a regularisation parameter. The optimal value of this parameter depends on unknown population characteristics and cannot be calculated in applications. Theoretically justified methods for choosing the regularisation parameter empirically in applications are not yet available. This paper presents such a method for use in series estimation, where the regularisation parameter is the number of terms in a series approximation to g. The method does not require knowledge of the smoothness of g or of other unknown functions. It adapts to their unknown smoothness. The estimator of g based on the empirically selected regularisation parameter converges in probability at a rate that is at least as fast as the asymptotically optimal rate multiplied by (log n)^(1/2), where n is the sample size. The asymptotic integrated mean-square error (AIMSE) of the estimator is within a specified factor of the optimal AIMSE.
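
For readers who want to see the series setting in code, the sketch below estimates g in Y = g(X) + U with E(U|W) = 0 by instrumenting a polynomial basis in X with a polynomial basis in W, holding the number of series terms K fixed by hand; the data-generating process is made up for illustration, and the paper's contribution, the empirical choice of K, is deliberately not reproduced.

```python
# Illustrative series IV estimator of g in Y = g(X) + U, E(U|W) = 0,
# with a hand-picked number of series terms K (the paper's subject is
# precisely how to choose K from the data, which is not attempted here).
import numpy as np

rng = np.random.default_rng(0)
n, K = 2000, 4
W = rng.standard_normal(n)
V = rng.standard_normal(n)
X = 0.8 * W + 0.6 * V                    # endogenous regressor
U = V + 0.3 * rng.standard_normal(n)     # error correlated with X but not with W
Y = np.sin(X) + U                        # true g(x) = sin(x)

P = np.column_stack([X**k for k in range(K + 1)])   # series basis in X
Q = np.column_stack([W**k for k in range(K + 1)])   # instrument basis in W

# Just-identified series IV: b = (Q'P)^{-1} Q'Y
b = np.linalg.solve(Q.T @ P, Q.T @ Y)
x_grid = np.linspace(-2, 2, 5)
g_hat = np.column_stack([x_grid**k for k in range(K + 1)]) @ b
print(np.round(g_hat, 2), np.round(np.sin(x_grid), 2))
```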

]]>
http://www.ifs.org.uk/publications/6783 Mon, 01 Jul 2013 00:00:00 +0000
<![CDATA[Identification and shape restrictions in nonparametric instrumental variables estimation]]> This paper is concerned with inference about a linear functional L(g), where the function g satisfies the relation Y=g(X)+U; E(U|W)=0. In this relation, Y is the dependent variable, X is a possibly endogenous explanatory variable, W is an instrument for X and U is an unobserved random variable. The data are an independent random sample of (Y, X, W). In much applied research, X and W are discrete, and W has fewer points of support than X. Consequently, neither g nor L(g) is nonparametrically identified. Indeed, L(g) can have any value in (-∞, ∞). In applied research, this problem is typically overcome and point identification is achieved by assuming that g is a linear function of X. However, the assumption of linearity is arbitrary. It is untestable if W is binary, as is the case in many applications. This paper explores the use of shape restrictions, such as monotonicity or convexity, for achieving interval identification of L(g). Economic theory often provides such shape restrictions. This paper shows that they restrict L(g) to an interval whose upper and lower bounds can be obtained by solving linear programming problems. Inference about the identified interval and the functional L(g) can be carried out by using the bootstrap. An empirical application illustrates the usefulness of shape restrictions for carrying out nonparametric inferences about L(g). An extension to nonseparable and quantile IV models is described.]]> http://www.ifs.org.uk/publications/6784 Mon, 01 Jul 2013 00:00:00 +0000 <![CDATA[People or places? Factors associated with the presence of domestic energy efficiency measures in England.]]> http://www.ifs.org.uk/publications/6768 Tue, 18 Jun 2013 00:00:00 +0000 <![CDATA[Optimal bandwidth selection for differences of nonparametric estimators with an application to the sharp regression discontinuity design]]> We consider the problem of choosing two bandwidths simultaneously for estimating the difference of two functions at given points. When the asymptotic approximation of the mean squared error (AMSE) criterion is used, we show that the minimisation problem is not well-defined when the sign of the product of the second derivatives of the underlying functions at the estimated points is positive. To address this problem, we theoretically define and construct estimators of the asymptotically first-order optimal (AFO) bandwidths which are well-defined regardless of the sign. They are based on objective functions which incorporate a second-order bias term. Our approach is general enough to cover estimation problems related to densities and regression functions at interior and boundary points. We provide a detailed treatment of the sharp regression discontinuity design.

This article is accompanied by a web appendix in which we present omitted discussions, an algorithm to implement the proposed method for the sharp regression discontinuity design, and proofs of the main results.
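
As background for the application, a sharp regression discontinuity estimate with a single, hand-picked bandwidth can be computed by local linear regression on each side of the cutoff, as in the sketch below; the triangular kernel, bandwidth value and simulated data are our own illustrative choices, and the paper's simultaneous AFO bandwidth selection is not implemented.

```python
# Sharp RD by local linear regression with a triangular kernel and a fixed
# bandwidth h on each side of the cutoff (illustrative only; the paper's
# AFO bandwidth selection is not implemented here).
import numpy as np

def local_linear_at_cutoff(x, y, h):
    """Weighted least squares intercept at 0 using a triangular kernel."""
    w = np.clip(1 - np.abs(x) / h, 0, None)
    keep = w > 0
    Xmat = np.column_stack([np.ones(keep.sum()), x[keep]])
    W = np.diag(w[keep])
    beta = np.linalg.solve(Xmat.T @ W @ Xmat, Xmat.T @ W @ y[keep])
    return beta[0]

rng = np.random.default_rng(0)
n, h, cutoff = 2000, 0.5, 0.0
r = rng.uniform(-1, 1, n)                                     # running variable
y = 0.4 * r + 2.0 * (r >= cutoff) + rng.standard_normal(n)    # jump of 2 at cutoff

above = r >= cutoff
tau = (local_linear_at_cutoff(r[above] - cutoff, y[above], h)
       - local_linear_at_cutoff(r[~above] - cutoff, y[~above], h))
print("RD estimate:", round(tau, 2))
```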

]]>
http://www.ifs.org.uk/publications/6767 Mon, 17 Jun 2013 00:00:00 +0000
<![CDATA[Anti-smoking policies and smoker well-being: evidence from Britain.]]> Anti-smoking policies can in theory make smokers better off, by helping smokers with time-inconsistent preferences commit to giving up or reducing the amount they smoke. We use almost 20 years of British individual-level panel data to explore the impact on self-reported psychological well-being of two policy interventions: large real-terms increases in tobacco excise taxes and bans on smoking in public places. We use a difference-in-differences approach to compare the effects on well-being for smokers and non-smokers. Smoking behaviour is likely to be influenced by policy interventions, leading to a selection problem if outcomes are compared across current smokers and non-smokers. We consider different ways of grouping individuals into 'treatment' and 'control' groups based on demographic characteristics and observed smoking histories. We find fairly robust evidence that increases in tobacco taxes raise the relative well-being of likely smokers. Exploiting variation in the timing of the smoking ban across British regions, we also find some evidence that it raised smoker well-being, though the effect is not robust to the measure of well-being. The economic significance of the effects also appears to be quite modest. Our findings therefore give cautious support to the view that such interventions are at least partly justifiable because of the benefits they have for smokers themselves.
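
As a stylised illustration of the difference-in-differences comparison described above, the snippet below regresses a simulated well-being measure on a group indicator, a period indicator and their interaction; the grouping, variable names and data are hypothetical stand-ins rather than the paper's definitions or estimates.

```python
# Stylised difference-in-differences regression: the coefficient on
# treated:post is the DiD estimate of the policy effect on well-being.
# All data below are simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),   # e.g. likely smokers
    "post": rng.integers(0, 2, n),      # e.g. after the tax rise or smoking ban
})
df["wellbeing"] = (0.2 * df["treated"] - 0.1 * df["post"]
                   + 0.3 * df["treated"] * df["post"]
                   + rng.standard_normal(n))

fit = smf.ols("wellbeing ~ treated + post + treated:post", data=df).fit(cov_type="HC1")
print(fit.params["treated:post"], fit.bse["treated:post"])
```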

This research was funded by the Nuffield Foundation

]]>
http://www.ifs.org.uk/publications/6766 Fri, 14 Jun 2013 00:00:00 +0000
<![CDATA[What can wages and employment tell us about the UK's productivity puzzle?]]> This paper uses individual data on employment and wages to shed light on the UK’s productivity puzzle. It finds that workforce composition cannot explain the reduction in wages and hence productivity that we observe; instead, real wages have fallen significantly within jobs. Why? One possibility we investigate is higher labour supply in this recession than in the past. Another is lower trade union membership. Alternatively, it might be driven by a fall in productivity as a result of a lower capital-labour ratio. We cannot tell whether productivity is driving wages or vice versa, but understanding why wages have fallen within jobs is at the heart of the UK's productivity puzzle.

]]>
http://www.ifs.org.uk/publications/6749 Wed, 12 Jun 2013 00:00:00 +0000
<![CDATA[Parental socialisation effort and the intergenerational transmission of risk preferences]]> We study the transmission of risk attitudes in a unique survey of mothers and children in which both participated in an incentivised risk preference elicitation task. We document that risk preferences are correlated between mothers and children when the children are just 7 to 8 years old. This correlation is only present for daughters. We show that a measure of parental involvement is a strong moderator of the association between mothers' and daughters' risk tolerance. These findings support a role for socialisation in the intergenerational transmission of preferences that predict economic behaviour.

]]>
http://www.ifs.org.uk/publications/6752 Wed, 12 Jun 2013 00:00:00 +0000
<![CDATA[Uniform post selection inference for LAD regression models]]> We develop uniformly valid confidence regions for a regression coefficient in a high-dimensional sparse LAD (least absolute deviation or median) regression model. The setting is one where the number of regressors p could be large in comparison to the sample size n, but only s ≪ n of them are needed to accurately describe the regression function. Our new methods are based on the instrumental LAD regression estimator that assembles the optimal estimating equation from either post-ℓ1-penalised LAD regression or ℓ1-penalised LAD regression. The estimating equation is immunised against non-regular estimation of the nuisance part of the regression function, in the sense of Neyman. We establish that in a homoscedastic regression model, under certain conditions, the instrumental LAD regression estimator of the regression coefficient is asymptotically root-n normal uniformly with respect to the underlying sparse model. The resulting confidence regions are valid uniformly with respect to the underlying model. The new inference methods outperform the naive, 'oracle based' inference methods, which are known not to be uniformly valid, with the coverage property failing to hold uniformly with respect to the underlying model, even in the setting with p = 2. We also provide Monte Carlo experiments which demonstrate that standard post-selection inference breaks down over large parts of the parameter space, whereas the proposed method does not.

]]>
http://www.ifs.org.uk/publications/6729 Mon, 03 Jun 2013 00:00:00 +0000
<![CDATA[Quantile models with endogeneity]]> In this article, we review quantile models with endogeneity. We focus on models that achieve identification through the use of instrumental variables and discuss conditions under which partial and point identification are obtained. We discuss key conditions, which include monotonicity and full-rank-type conditions, in detail. In providing this review, we update the identification results of Chernozhukov and Hansen (2005). We illustrate the modelling assumptions through economically motivated examples. We also briefly review the literature on estimation and inference.
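
One concrete estimation recipe associated with this framework is the inverse quantile regression of Chernozhukov and Hansen: profile out the coefficient on the endogenous variable over a grid and retain the value at which the instrument receives (near-)zero weight in a quantile regression of the transformed outcome. The Python sketch below, using statsmodels and simulated data, is only a schematic illustration of that idea.

```python
# Schematic inverse quantile regression for Y = a*D + error with D endogenous
# and Z an instrument: search over candidate values of a and keep the one at
# which Z has (near-)zero weight in a median regression of Y - a*D on (1, Z).
# Illustrative only; no inference is performed.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
Z = rng.standard_normal(n)
V = rng.standard_normal(n)
D = Z + V                                  # endogenous: depends on V
Y = 1.0 * D + V + rng.standard_normal(n)   # true coefficient on D is 1

X = sm.add_constant(Z)                     # exogenous part: constant and instrument
grid = np.linspace(0.0, 2.0, 81)
z_coef = []
for a in grid:
    res = sm.QuantReg(Y - a * D, X).fit(q=0.5)
    z_coef.append(abs(res.params[1]))      # coefficient on Z
a_hat = grid[int(np.argmin(z_coef))]
print("estimated coefficient on D:", a_hat)
```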

]]>
http://www.ifs.org.uk/publications/6730 Mon, 03 Jun 2013 00:00:00 +0000
<![CDATA[Calculating confidence intervals for continuous and discontinuous functions of parameters]]> http://www.ifs.org.uk/publications/6724 Fri, 31 May 2013 00:00:00 +0000 <![CDATA[Regressions with Berkson errors in covariates - a nonparametric approach]]> This paper establishes that so-called instrumental variables enable the identification and the estimation of a fully nonparametric regression model with Berkson-type measurement error in the regressors. An estimator is proposed and proven to be consistent. Its practical performance and feasibility are investigated via Monte Carlo simulations as well as through an epidemiological application investigating the effect of particulate air pollution on respiratory health. These examples illustrate that Berkson errors can clearly not be neglected in nonlinear regression models and that the proposed method represents an effective remedy.

]]>
http://www.ifs.org.uk/publications/6720 Tue, 28 May 2013 00:00:00 +0000
<![CDATA[The relationship between DSGE and VAR models]]> This chapter reviews the literature on the econometric relationship between DSGE and VAR models from the point of view of estimation and model validation. The mapping between DSGE and VAR models is broken down into three stages: 1) from DSGE to state-space model; 2) from state-space model to VAR(∞); 3) from VAR(∞) to finite order VAR. The focus is on discussing what can go wrong at each step of this mapping and on critically highlighting the hidden assumptions. I also point out some open research questions and interesting new research directions in the literature on the econometrics of DSGE models. These include, in no particular order: understanding the effects of log-linearization on estimation and identification; dealing with multiplicity of equilibria; estimating nonlinear DSGE models; incorporating into DSGE models information from atheoretical models and from survey data; adopting flexible modelling approaches that combine the theoretical rigor of DSGE models and the econometric model's ability to fit the data.

]]>
http://www.ifs.org.uk/publications/6719 Fri, 24 May 2013 00:00:00 +0000
<![CDATA[Bond returns and market expectations]]> A well-documented empirical result is that market expectations extracted from futures contracts on the federal funds rate are among the best predictors for the future course of monetary policy. We show how this information can be exploited to produce accurate forecasts of bond excess returns and to construct profitable investment strategies in bond markets. We use a tilting method for incorporating market expectations into forecasts from a standard term-structure model and then derive the implied forecasts for bond excess returns. We find that the method delivers substantial improvements in out-of-sample accuracy relative to a number of benchmarks. The accuracy improvements are both statistically and economically significant and robust across a number of maturities and forecast horizons. The method would have allowed an investor to obtain positive cumulative excess returns from simple "riding the yield curve" investment strategies over the past ten years, and in this respect it would have outperformed its competitors even after accounting for a risk-return tradeoff.

]]>
http://www.ifs.org.uk/publications/6718 Fri, 24 May 2013 00:00:00 +0000
<![CDATA[Likelihood inference in some finite mixture models]]> Parametric mixture models are commonly used in applied work, especially empirical economics, where these models are often employed to learn for example about the proportions of various types in a given population. This paper examines the inference question on the proportions (mixing probability) in a simple mixture model in the presence of nuisance parameters when the sample size is large. It is well known that likelihood inference in mixture models is complicated due to 1) lack of point identification, and 2) parameters (for example, mixing probabilities) whose true value may lie on the boundary of the parameter space. These issues cause the profiled likelihood ratio (PLR) statistic to admit asymptotic limits that differ discontinuously depending on how the true density of the data approaches the regions of singularities where there is lack of point identification. This lack of uniformity in the asymptotic distribution suggests that confidence intervals based on pointwise asymptotic approximations might lead to faulty inferences. This paper examines this problem in detail in a finite mixture model and provides possible fixes based on the parametric bootstrap. We examine the performance of this parametric bootstrap in Monte Carlo experiments and apply it to data from Beauty Contest experiments. We also examine small sample inferences and projection methods.
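
A bare-bones version of the parametric bootstrap fix discussed above, for the simplest case of testing a single normal component against a two-component normal mixture, might look as follows; the number of components, the number of bootstrap draws and the data are all illustrative choices rather than the paper's design.

```python
# Parametric bootstrap for the likelihood ratio test of one normal component
# versus a two-component normal mixture (illustrative sketch).
import numpy as np
from sklearn.mixture import GaussianMixture

def lr_stat(x):
    x = x.reshape(-1, 1)
    ll1 = GaussianMixture(1).fit(x).score(x) * len(x)
    ll2 = GaussianMixture(2, n_init=5).fit(x).score(x) * len(x)
    return 2 * (ll2 - ll1)

rng = np.random.default_rng(0)
x = rng.standard_normal(300)          # data generated under the null here
observed = lr_stat(x)

# Bootstrap: simulate from the fitted null (a single normal), recompute the LR.
mu, sigma = x.mean(), x.std()
boot = np.array([lr_stat(rng.normal(mu, sigma, len(x))) for _ in range(199)])
p_value = (1 + np.sum(boot >= observed)) / (1 + len(boot))
print("LR statistic:", round(observed, 2), "bootstrap p-value:", round(p_value, 3))
```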

]]>
http://www.ifs.org.uk/publications/6713 Wed, 22 May 2013 00:00:00 +0000
<![CDATA[Asymptotic theory for the QMLE in GARCH-X models with stationary and non-stationary covariates]]> This paper investigates the asymptotic properties of the Gaussian quasi-maximum-likelihood estimators (QMLEs) of the GARCH model augmented by including an additional explanatory variable: the so-called GARCH-X model. The additional covariate is allowed to exhibit any degree of persistence as captured by its long-memory parameter dx; in particular, we allow for both stationary and non-stationary covariates. We show that the QMLEs of the parameters entering the volatility equation are consistent and mixed-normally distributed in large samples. The convergence rates and limiting distributions of the QMLEs depend on whether the regressor is stationary or not. However, standard inferential tools for the parameters are robust to the level of persistence of the regressor, with t-statistics following standard normal distributions in large samples irrespective of whether the regressor is stationary or not.
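
For concreteness, the Gaussian quasi-likelihood behind such an estimator can be written down in a few lines. The sketch below fits a GARCH(1,1)-X specification in which the squared lagged covariate enters the volatility equation, using scipy on simulated data; the particular parameterisation, box constraints and starting values are our own simplifications, and no standard errors are computed.

```python
# Gaussian QMLE for a GARCH(1,1)-X model:
#   sigma2_t = omega + alpha * y_{t-1}^2 + beta * sigma2_{t-1} + pi * x_{t-1}^2
# Illustrative sketch: simulated data, crude box constraints, no standard errors.
import numpy as np
from scipy.optimize import minimize

def neg_quasi_loglik(params, y, x):
    omega, alpha, beta, pi = params
    sigma2 = np.empty_like(y)
    sigma2[0] = np.var(y)
    for t in range(1, len(y)):
        sigma2[t] = omega + alpha * y[t-1]**2 + beta * sigma2[t-1] + pi * x[t-1]**2
    return 0.5 * np.sum(np.log(sigma2) + y**2 / sigma2)

rng = np.random.default_rng(0)
T = 3000
x = rng.standard_normal(T)                     # stationary covariate
y = np.empty(T)
sigma2 = 0.2
for t in range(T):
    if t > 0:
        sigma2 = 0.1 + 0.1 * y[t-1]**2 + 0.8 * sigma2 + 0.05 * x[t-1]**2
    y[t] = np.sqrt(sigma2) * rng.standard_normal()

res = minimize(neg_quasi_loglik, x0=[0.1, 0.05, 0.8, 0.05], args=(y, x),
               bounds=[(1e-6, None), (0, 1), (0, 1), (0, None)], method="L-BFGS-B")
print("omega, alpha, beta, pi:", np.round(res.x, 3))
```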

]]>
http://www.ifs.org.uk/publications/6709 Fri, 17 May 2013 00:00:00 +0000
<![CDATA[Female labour supply, human capital and welfare reform]]> We consider the impact of tax credits and income support programs on female education choice, employment, hours and human capital accumulation over the life-cycle. We analyse both the short run incentive effects and the longer run implications of such programs. By allowing for risk aversion and savings, we quantify the insurance value of alternative programs. We find important incentive effects on education choice and labour supply, with single mothers having the most elastic labour supply. Returns to labour market experience are found to be substantial but only for full-time employment, and especially for women with more than basic formal education. For those with lower education the welfare programs are shown to have substantial insurance value. Based on the model, marginal increases to tax credits are preferred to equally costly increases in income support and to tax cuts, except by those in the highest education group.

]]>
http://www.ifs.org.uk/publications/6703 Thu, 16 May 2013 00:00:00 +0000
<![CDATA[Inference on counterfactual distributions]]> Counterfactual distributions are important ingredients for policy analysis and decomposition analysis in empirical economics. In this article we develop modelling and inference tools for counterfactual distributions based on regression methods. The counterfactual scenarios that we consider consist of ceteris paribus changes in either the distribution of covariates related to the outcome of interest or the conditional distribution of the outcome given covariates. For either of these scenarios we derive joint functional central limit theorems and bootstrap validity results for regression-based estimators of the status quo and counterfactual outcome distributions. These results allow us to construct simultaneous confidence sets for function-valued effects of the counterfactual changes, including the effects on the entire distribution and quantile functions of the outcome as well as on related functionals. These confidence sets can be used to test functional hypotheses such as no-effect, positive effect or stochastic dominance. Our theory applies to general counterfactual changes and covers the main regression methods including classical, quantile, duration and distribution regressions. We illustrate the results with an empirical application to wage decompositions using data for the United States.

As part of developing the main results, we introduce distribution regression as a comprehensive and flexible tool for modelling and estimating the entire conditional distribution. We show that distribution regression encompasses the Cox duration regression and represents a useful alternative to quantile regression. We establish functional central limit theorems and bootstrap validity results for the empirical distribution regression process and various related functionals.
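
To make the distribution regression idea concrete, the sketch below estimates F(y|x) by a sequence of logit regressions of 1{Y <= y} on covariates over a grid of thresholds, and then averages the fitted probabilities over an alternative covariate sample to obtain a counterfactual outcome distribution; the logit link, threshold grid and simulated data are illustrative choices, and none of the paper's inference theory is implemented.

```python
# Distribution regression sketch: estimate F(y|x) with logits of 1{Y <= y} on x
# over a threshold grid, then integrate over a counterfactual covariate sample.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 3000
x = rng.standard_normal(n)
y = 1.0 + 0.5 * x + rng.standard_normal(n)          # status quo outcomes
x_cf = rng.standard_normal(n) + 1.0                  # counterfactual covariates

thresholds = np.quantile(y, np.linspace(0.05, 0.95, 19))
X = sm.add_constant(x)
X_cf = sm.add_constant(x_cf)

F_status_quo, F_counterfactual = [], []
for t in thresholds:
    fit = sm.Logit((y <= t).astype(float), X).fit(disp=0)
    F_status_quo.append(fit.predict(X).mean())         # observed covariates
    F_counterfactual.append(fit.predict(X_cf).mean())  # counterfactual covariates

print(np.round(F_status_quo, 2))
print(np.round(F_counterfactual, 2))
```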

This is a revision of CWP05/12 and CWP09/09

]]>
http://www.ifs.org.uk/publications/6695 Mon, 13 May 2013 00:00:00 +0000
<![CDATA[The drivers of month of birth differences in children's cognitive and non-cognitive skills: a regression discontinuity analysis]]> This paper uses data from a rich UK birth cohort to estimate the differences in cognitive and non-cognitive skills between children born at the start and end of the academic year. It builds on the previous literature on this topic in England by using a more robust regression discontinuity design and is also able to provide new insight into the drivers of the differences in outcomes between children born in different months that we observe. Specifically, we compare differences in tests that are affected by all three of the potential drivers (age at test, age of starting school and relative age) with differences in tests sat at the same age (which are therefore not affected by the age at test effect) as a way of separately identifying the age at test effect. We find that age at test is the most important factor driving the difference between the oldest and youngest children in an academic cohort; highlighting that children born at the end of the academic year are at a disadvantage primarily because they are almost a year younger than those born at the start of the academic year when they take national achievement tests. An appropriate policy response in this case is to appropriately age-adjust these tests. However, we also find evidence that a child’s view of their own scholastic competence differs significantly between those born at the start and end of the academic year, even when eliminating the age at test effect. This means that other policy responses may be required to correct for differences in outcomes amongst children born in different months, but not necessarily so: it may be that children’s view of their scholastic competence would change in response to the introduction of appropriately age-adjusted tests, for example as a result of positive reinforcement.

]]>
http://www.ifs.org.uk/publications/6689 Fri, 10 May 2013 00:00:00 +0000
<![CDATA[Identifying the drivers of month of birth differences in educational attainment]]> Children born at the end of the academic year have lower educational attainment, on average, than those born at the start of the academic year. Previous research shows that the difference is most pronounced early in pupils’ school lives, but remains evident and statistically significant in high-stakes exams taken at the end of compulsory schooling. To determine the most appropriate policy response, it is vital to understand which of the four possible factors (age at test, age of starting school, length of schooling and relative age within the cohort) lead to these differences in attainment between those born at different points in the academic year. However, research to date has been unable to adequately address this problem, as the four potential drivers are all highly correlated with one another, and three of the four form an exact linear relationship (age at test = age of starting school + length of schooling). This paper is the first to apply the principle of maximum entropy to this problem. Using two complementary sources of data we find that a child’s age at the time they take the test is the most important driver of the differences observed, which suggests that age-adjusting national achievement test scores is likely to be the most appropriate policy response to ensure that children born towards the end of the year are not at a disadvantage simply because they are younger when they take their exams.

This working paper is supplemented by an online appendix which can be viewed here

]]>
http://www.ifs.org.uk/publications/6690 Fri, 10 May 2013 00:00:00 +0000
<![CDATA[The impact of age within academic year on adult outcomes]]> Children born at the end of the academic year have lower educational attainment, on average, than those born at the start of the academic year. Previous research has shown that the difference is most pronounced early in pupils’ school lives, but remains evident and statistically significant in high-stakes exams taken at the end of compulsory schooling. Those born later in the academic year are also significantly less likely to participate in post-compulsory education than those born at the start of the year. We provide the first evidence on whether these differences in childhood outcomes translate into differences in the probability of employment, occupation and earnings for adults in the UK. We also examine whether there are differences in broader measures of well-being such as self-perceived health and mental health. We find that the large and significant differences observed in educational attainment do not lead to pervasive differences in adulthood; those born towards the end of the academic year are more likely to experience unemployment (which is particularly true for females and those that don’t achieve a degree level qualification) but in general there are few substantial or statistically significant differences in terms of occupation, earnings and self-perceived health and mental health. It is not clear why this should be the case, but if employers reward productivity equally as they learn more about their workers, irrespective of their educational attainment, then this lack of significant differences may not be surprising.

This paper is supplemented by an online appendix which can be viewed here

 

]]>
http://www.ifs.org.uk/publications/6688 Fri, 10 May 2013 00:00:00 +0000
<![CDATA[Non-parametric transformation regression with non-stationary data]]> We examine a kernel regression smoother for time series that takes account of the error correlation structure as proposed by Xiao et al. (2008). We show that this method continues to improve estimation in the case where the regressor is a unit root or near unit root process.
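
For context, a plain Nadaraya-Watson smoother, which ignores the error correlation structure that the method above exploits, can be coded in a few lines; the Gaussian kernel, bandwidth and simulated near-unit-root regressor below are purely illustrative.

```python
# Plain Nadaraya-Watson kernel regression smoother with a Gaussian kernel.
# This baseline ignores the error correlation structure exploited by the
# estimator discussed above; it is included only to fix notation.
import numpy as np

def nw_smoother(x0, x, y, h):
    """Kernel-weighted average of y at evaluation point x0."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(0)
n, h = 500, 0.2
x = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)   # scaled random walk regressor
e = rng.standard_normal(n)
u = np.empty(n)
u[0] = e[0]
for t in range(1, n):                                 # AR(1) errors
    u[t] = 0.6 * u[t-1] + e[t]
y = np.sin(2 * x) + 0.3 * u

grid = np.linspace(x.min(), x.max(), 5)
print(np.round([nw_smoother(g, x, y, h) for g in grid], 2))
print(np.round(np.sin(2 * grid), 2))
```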

]]>
http://www.ifs.org.uk/publications/6672 Tue, 23 Apr 2013 00:00:00 +0000
<![CDATA[Nonparametric estimation of multivariate elliptic densities via finite mixture sieves]]> This paper considers the class of p-dimensional elliptic distributions (p ≥ 1) satisfying the consistency property (Kano, 1994) and within this general framework presents a two-stage semiparametric estimator for the Lebesgue density based on Gaussian mixture sieves. Using the online Exponentiated Gradient (EG) algorithm of Helmbold et al. (1997) and without restricting the mixing measure to have compact support, the estimator produces estimates converging uniformly in probability to the true elliptic density at a rate that is independent of the dimension of the problem, hence circumventing the familiar curse of dimensionality inherent to many semiparametric estimators. The rate performance of our estimator depends on the tail behaviour of the underlying mixing density (and hence that of the data) rather than smoothness properties. In fact, our method achieves a rate of at least Op(n^(-1/4)), provided only some positive moment exists. When further moments exist, the rate improves, reaching Op(n^(-3/8)) as the tails of the true density converge to those of a normal. Unlike the elliptic density estimator of Liebscher (2005), our sieve estimator always yields an estimate that is a valid density, and is also attractive from a practical perspective as it accepts data as a stream, thus significantly reducing computational and storage requirements. Monte Carlo experimentation indicates encouraging finite sample performance over a range of elliptic densities. The estimator is also implemented in a binary classification task using the well-known Wisconsin breast cancer dataset.

]]>
http://www.ifs.org.uk/publications/6671 Tue, 23 Apr 2013 00:00:00 +0000
<![CDATA[Maximum score estimation of preference parameters for a binary choice model under uncertainty]]> This paper develops maximum score estimation of preference parameters in the binary choice model under uncertainty in which the decision rule is affected by conditional expectations. The preference parameters are estimated in two stages: we estimate conditional expectations nonparametrically in the first stage and the preference parameters in the second stage based on the maximum score estimator of Manski (1975, 1985), using the choice data and the first-stage estimates. The paper establishes consistency and derives the rate of convergence of the corresponding two-stage estimator, which is of independent interest for maximum score estimation with generated regressors. The paper also provides results of some Monte Carlo experiments.
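
The maximum score objective itself is easy to write down: maximise the fraction of correctly signed predictions over coefficient vectors normalised to unit length. The crude random-search sketch below on simulated data is purely illustrative and omits the nonparametric first stage described above.

```python
# Crude maximum score estimation (Manski): maximise the match between the sign
# of x'b and the observed binary choice over unit-norm coefficient vectors.
# Random search on the unit sphere; the paper's nonparametric first stage for
# conditional expectations is not implemented here.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 3))
beta_true = np.array([1.0, -0.5, 0.25]) / np.linalg.norm([1.0, -0.5, 0.25])
y = (X @ beta_true + rng.standard_normal(n) >= 0).astype(int)   # median-zero error

def score(b):
    pred = (X @ b >= 0).astype(int)
    return np.mean(pred == y)

best_b, best_s = None, -np.inf
for _ in range(5000):
    b = rng.standard_normal(3)
    b /= np.linalg.norm(b)
    s = score(b)
    if s > best_s:
        best_b, best_s = b, s

print("true:", np.round(beta_true, 2), "estimate:", np.round(best_b, 2))
```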

]]>
http://www.ifs.org.uk/publications/6670 Tue, 23 Apr 2013 00:00:00 +0000
<![CDATA[Long memory via networking]]> Many time-series data are known to exhibit 'long memory', that is, they have an autocorrelation function that decays very slowly with lag. This behaviour has traditionally been attributed to either aggregation of heterogeneous processes, nonlinearity, learning dynamics, regime switching, structural breaks, unit roots or fractional Brownian motion. This paper identifies an entirely different mechanism for long memory generation by showing that it can naturally arise when a large number of simple linear homogeneous economic subsystems with a short memory are interconnected to form a network such that the outputs of each subsystem are fed into the inputs of others. This networking picture yields a type of aggregation that is not merely additive, resulting in a collective behaviour that is richer than that of individual subsystems. Interestingly, the long memory behaviour is found to be almost entirely determined by the geometry of the network while being relatively insensitive to the specific behaviour of individual agents.

]]>
http://www.ifs.org.uk/publications/6661 Tue, 02 Apr 2013 00:00:00 +0000
<![CDATA[Reform of ill-health retirement of police in England and Wales: impact on pension liabilities and the role of local finance]]> We examine ill-health retirement of police officers in England and Wales between 2002-3 and 2009-10. Differences in ill-health retirement rates across forces are statistically related to area-specific stresses of policing and force-specific differences in human resources policies. Reforms to police pension plans (in particular, a shift in the incidence of financing ill-health retirement from central government to local police authorities) affected the level of ill-health retirement, especially among forces with above-average rates of retirement. We find that residual differences in post-2006 ill-health retirement rates across forces are related to their differential capacities to raise revenue from local property taxes. We quantify the impact of these reforms on overall pension plan liabilities.

]]>
http://www.ifs.org.uk/publications/6660 Thu, 28 Mar 2013 00:00:00 +0000
<![CDATA[Ambiguity revealed]]> http://www.ifs.org.uk/publications/6659 Thu, 28 Mar 2013 00:00:00 +0000 <![CDATA[Random coefficients in static games of complete information]]> http://www.ifs.org.uk/publications/6657 Wed, 27 Mar 2013 00:00:00 +0000 <![CDATA[Identifying sibling influence on teenage substance use]]> http://www.ifs.org.uk/publications/6656 Tue, 26 Mar 2013 00:00:00 +0000 <![CDATA[Let's get LADE: robust estimation of semiparametric multiplicative volatility models]]> We investigate a model in which we connect slowly time-varying unconditional long-run volatility with short-run conditional volatility whose representation is given as a semi-strong GARCH (1,1) process with heavy-tailed errors. We focus on robust estimation of both long-run and short-run volatilities. Our estimation is semiparametric since the long-run volatility is totally unspecified whereas the short-run conditional volatility is a parametric semi-strong GARCH (1,1) process. We propose different robust estimation methods for nonstationary and strictly stationary GARCH parameters with a nonparametric long-run volatility function. Our estimation is based on a two-step LAD procedure. We establish the relevant asymptotic theory of the proposed estimators. Numerical results lend support to our theoretical results.

]]>
http://www.ifs.org.uk/publications/6648 Tue, 19 Mar 2013 00:00:00 +0000
<![CDATA[What do instrumental variable models deliver with discrete dependent variables?]]> We study models with discrete endogenous variables and compare the use of two stage least squares (2SLS) in a linear probability model with bounds analysis using a nonparametric instrumental variable model.

2SLS has the advantage of providing an easy to compute point estimator of a slope coefficient which can be interpreted as a local average treatment effect (LATE). However, the 2SLS estimator does not measure the value of other useful treatment effect parameters without invoking untenable restrictions.

The nonparametric instrumental variable (IV) model has the advantage of being weakly restrictive, so more generally applicable, but it usually delivers set identification. Nonetheless it can be used to consistently estimate bounds on many parameters of interest including, for example, average treatment effects. We illustrate using data from Angrist & Evans (1998) and study the effect of family size on female employment.
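
To fix notation, the sketch below computes the 2SLS slope in a linear probability model by hand with numpy: the binary outcome is regressed on the fitted values from a first-stage regression of the binary treatment on the instrument. The instrument, treatment and outcome are simulated stand-ins rather than the Angrist and Evans variables.

```python
# Manual 2SLS in a linear probability model: regress the binary outcome on the
# fitted values from a first-stage regression of the binary treatment on the
# instrument. Simulated stand-in data, not the Angrist-Evans sample.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.integers(0, 2, n)                       # binary instrument
v = rng.standard_normal(n)
d = ((0.8 * z + v) > 0.5).astype(float)         # binary treatment, endogenous via v
y = ((-0.5 * d + v + rng.standard_normal(n)) > 0).astype(float)  # binary outcome

def ols(X, target):
    return np.linalg.solve(X.T @ X, X.T @ target)

Z = np.column_stack([np.ones(n), z])
d_hat = Z @ ols(Z, d)                           # first-stage fitted values
X2 = np.column_stack([np.ones(n), d_hat])
beta = ols(X2, y)                               # second stage
print("2SLS slope:", round(beta[1], 3))
```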

 

This October 2015 version corrects an error in the paper, as explained in footnote 1. The original version of the working paper is available here.

]]>
http://www.ifs.org.uk/publications/6643 Mon, 18 Mar 2013 00:00:00 +0000
<![CDATA[Testing for homogeneity in mixture models]]> Statistical models of unobserved heterogeneity are typically formalised as mixtures of simple parametric models and interest naturally focuses on testing for homogeneity versus general mixture alternatives. Many tests of this type can be interpreted as C(α) tests, as in Neyman (1959), and shown to be locally asymptotically optimal. A unified approach to analysing the asymptotic behaviour of such tests will be described, employing a variant of the Le Cam LAN framework. These C(α) tests will be contrasted with a new approach to likelihood ratio testing for mixture models. The latter tests are based on estimation of general (nonparametric) mixture models using the Kiefer and Wolfowitz (1956) maximum likelihood method. Recent developments in convex optimisation are shown to dramatically improve upon earlier EM methods for computation of these estimators, and new results on the large sample behaviour of likelihood ratios involving such estimators yield a tractable form of asymptotic inference. We compare the performance of the two approaches, identifying circumstances in which each is preferred.

]]>
http://www.ifs.org.uk/publications/6638 Tue, 12 Mar 2013 00:00:00 +0000
<![CDATA[Wages and informality in developing countries]]> It is often argued that informal labour markets in developing countries are the engine of growth because their existence allows firms to operate in an environment where wage and regulatory costs are lower. On the other hand informality means that the amount of social protection offered to workers is lower. In this paper we extend the wage-posting framework of Burdett and Mortensen (1998) to allow for two sectors of employment. Firms are heterogeneous and decide endogenously in which sector to locate. Workers engage in both off the job and on the job search and decide which offers to accept. Direct transitions across sectors are permitted, which matches the evidence in the data about job mobility. Our empirical analysis uses Brazilian labour force surveys. We use the model to discuss the relative merits of alternative policies towards informality. In particular, we evaluate the impact of a tighter regulatory framework on employment in the formal and the informal sector on the distribution of wages.

]]>
http://www.ifs.org.uk/publications/6635 Mon, 11 Mar 2013 00:00:00 +0000
<![CDATA[Career progression, economic downturns and skills]]> This paper analyses the career progression of skilled and unskilled workers, with a focus on how careers are affected by economic downturns and whether formal skills, acquired early on, can shield workers from the effect of recessions. Using detailed administrative data for Germany for numerous birth cohorts across different regions, we follow workers from labour market entry onwards and estimate a dynamic life-cycle model of vocational training choice, labour supply and wage progression. Most particularly, our model allows for labour market frictions that vary by skill group and over the business cycle. We find that sources of wage growth differ: learning-by-doing is an important component for unskilled workers early on in their careers, while job mobility is important for workers who acquire skills in an apprenticeship scheme before labour market entry. Likewise, economic downturns affect skill groups through very different channels: unskilled workers lose out from a decline in productivity and human capital, whereas skilled individuals suffer mainly from lack of mobility.

]]>
http://www.ifs.org.uk/publications/6633 Mon, 11 Mar 2013 00:00:00 +0000
<![CDATA[Assortative matching and search with labor supply and home production]]> We extend the search-matching model of the marriage market of Shimer and Smith (2000) to allow for labor supply and home production. We characterise the steady-state equilibrium when exogenous divorce is the only source of risk. We study nonparametric identification using cross-section data on wages and hours worked, and we develop a nonparametric estimator. The estimated matching probabilities that can be derived from the steady-state flow conditions are strongly increasing in male and female wages. We estimate the expected share of marriage surplus appropriated by each spouse as a function of wages. The model allows us to infer the specialisation of female spouses in home production from observations on wages and hours worked.

]]>
http://www.ifs.org.uk/publications/6634 Mon, 11 Mar 2013 00:00:00 +0000
<![CDATA[Interdependent durations in joint retirement]]> In this paper we specify and use a new duration model to study joint retirement in married couples using the Health and Retirement Study. Whereas conventionally used models cannot account for joint retirement, our model admits joint retirement with positive probability, allows for simultaneity and nests the traditional proportional hazards model. In contrast to other statistical models for simultaneous durations, it is based on Nash bargaining and it is therefore interpretable in terms of economic behaviour. We provide a discussion of relevant identifying variation and estimate our model using indirect inference. The main empirical finding is that the simultaneity seems economically important. In our preferred specification the indirect utility associated with being retired increases by approximately 10% if one's spouse is already retired. By comparison, a defined benefit pension plan increases indirect utility by 20-30%. The estimated model also predicts that the indirect effect of a change in husbands' pension plan on wives' retirement dates is about 10% of the direct effect on the husbands.

]]>
http://www.ifs.org.uk/publications/6632 Mon, 11 Mar 2013 00:00:00 +0000
<![CDATA[Incentives, shocks or signals: labour supply effects of increasing the female state pension age in the UK]]> In 1995, the UK government legislated to increase the earliest age at which women could claim a state pension from 60 to 65 between April 2010 and March 2020. This paper uses data from the first two years of this change coming into effect to estimate the impact of increasing the state pension age from 60 to 61 on the employment of women and their partners using a difference-in-differences methodology. Our methodology controls in a flexible way for underlying differences between cohorts born at different times. We find that women's employment rates at age 60 increased by 7.3 percentage points when the state pension age was increased to 61 and their probability of unemployment increased by 1.3 percentage points. The employment rates of the male partners also increased by 4.2 percentage points. The magnitude of these effects, and the results from subgroup analysis, suggest that they are more likely to be explained by the increase in the state pension age acting as a shock or as a signal than by credit constraints or by individuals responding to changes in their financial incentives to work. Taken together, our results suggest that the fiscal strengthening arising from a one-year increase in the female state pension age is 10% higher than a costing based on no behavioural change, due to additional direct and indirect tax revenues arising from increased earnings.

The current version of this working paper was published in January 2014 and replaces an earlier version originally published in March 2013.

]]>
http://www.ifs.org.uk/publications/6622 Fri, 08 Mar 2013 00:00:00 +0000
<![CDATA[Identification of discrete choice models for bundles and binary games]]> http://www.ifs.org.uk/publications/6618 Wed, 27 Feb 2013 00:00:00 +0000 <![CDATA[Estimating demand for differentiated products with error in market shares]]> http://www.ifs.org.uk/publications/6613 Wed, 20 Feb 2013 00:00:00 +0000 <![CDATA[Discount Rate Heterogeneity Among Older Households: A Puzzle?]]> We put forward a method for estimating individual discount rates using field data on wealth and income. We build consumption from these data using the budget constraint. At each date, consumption transitions yield discount rates by groups of households. We apply this technique to a representative sample of older English households. We show, unsurprisingly, that there is substantial heterogeneity in discounting in that population. We find a distribution of discount rates similar in shape to those previously estimated using field data, though with much lower mean values than the ones found using experimental data. But surprisingly we find that, among this older population, patience is positively correlated with education and numerical ability. This is puzzling as it goes against the negative correlation found in experiments and some field investigations of time preference for younger populations. We discuss potential explanations for this result.
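
As a stylised example of how consumption transitions can reveal discounting, note that under CRRA utility with curvature gamma and a known interest rate r the Euler equation u'(c_t) = beta(1+r)u'(c_{t+1}) implies beta = (c_{t+1}/c_t)^gamma / (1+r). The snippet below applies this formula to hypothetical group-level consumption figures; it is far simpler than the estimation method described above and the numbers are invented.

```python
# Stylised link between consumption growth and discounting under CRRA utility:
# the Euler equation u'(c_t) = beta*(1+r)*u'(c_{t+1}) gives
# beta = (c_{t+1}/c_t)**gamma / (1+r). Hypothetical numbers only.
gamma, r = 2.0, 0.03
groups = {"group A": (20000, 20000), "group B": (30000, 30400)}  # (c_t, c_{t+1})
for name, (c_now, c_next) in groups.items():
    beta = (c_next / c_now) ** gamma / (1 + r)
    print(f"{name}: implied annual discount factor = {beta:.3f}")
```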

]]>
http://www.ifs.org.uk/publications/6558 Fri, 25 Jan 2013 00:00:00 +0000
<![CDATA[How taxes and welfare distort work incentives: static lifecycle and dynamic perspectives]]> http://www.ifs.org.uk/publications/6552 Wed, 23 Jan 2013 00:00:00 +0000 <![CDATA[A semiparametric model for heterogeneous panel data with fixed effects]]> This paper develops methodology for semiparametric panel data models in a setting where both the time series and the cross section are large. Such settings are common in finance and other areas of economics. Our model allows for heterogeneous nonparametric covariate effects as well as unobserved time and individual specific effects that may depend on the covariates in an arbitrary way. To model the covariate effects parsimoniously, we impose a dimensionality reducing common component structure on them. In the theoretical part of the paper, we derive the asymptotic theory of the proposed procedure. In particular, we provide the convergence rates and the asymptotic distribution of our estimators. In the empirical part, we apply our methodology to a specific application that has been the subject of recent policy interest, that is, the effect of trading venue fragmentation on market quality. We use a unique dataset that reports the location and volume of trading on the FTSE350 companies from 2008 to 2011 at the weekly frequency. We find that the effect of fragmentation on market quality is nonlinear and non-monotonic. The implied quality of the market under perfect competition is superior to that under monopoly provision, but the transition between the two is complicated.

]]>
http://www.ifs.org.uk/publications/6549 Mon, 21 Jan 2013 00:00:00 +0000
<![CDATA[Specification tests for partially identified models defined by moment inequalities]]> This paper studies the problem of specification testing in partially identified models defined by a finite number of moment equalities and inequalities (i.e., (in)equalities). Under the null hypothesis, there is at least one parameter value that simultaneously satisfies all of the moment (in)equalities, whereas under the alternative hypothesis there is no such parameter value. While this problem has not been directly addressed in the literature (except in particular cases), several papers have suggested addressing this inferential problem by checking whether confidence intervals for the parameters of interest are empty or not.

We propose two hypothesis tests that use the infimum of the sample criterion function over the parameter space as the test statistic, together with two different critical values. We obtain two main results. First, we show that the two tests we propose are asymptotically size-correct in a uniform sense. Second, we show that our tests are more powerful than the test that checks whether the confidence set for the parameters of interest is empty or not.
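
In its simplest form the proposed statistic is just the infimum of a sample criterion over the parameter space. The sketch below computes such a statistic for two toy moment inequalities of the form E[m_j(W, theta)] <= 0 on a parameter grid; the criterion, the grid and the data are illustrative, and the construction of critical values, which is the substance of the paper, is not attempted.

```python
# Toy infimum-of-criterion statistic for a model defined by moment inequalities
# E[m_j(W, theta)] <= 0, j = 1, 2: Q_n(theta) sums squared positive parts of the
# scaled sample moments, and the statistic is its infimum over a theta grid.
# Critical values are not computed here.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
w = rng.normal(0.2, 1.0, n)

def sample_moments(theta):
    # Inequalities: E[theta - W] <= 0 and E[W - theta - 0.5] <= 0, so the
    # identified set is [E[W] - 0.5, E[W]] and the null of non-emptiness holds.
    m1 = theta - w
    m2 = w - theta - 0.5
    return np.array([m1.mean(), m2.mean()]), np.array([m1.std(), m2.std()])

def criterion(theta):
    mbar, s = sample_moments(theta)
    return np.sum(np.maximum(np.sqrt(n) * mbar / s, 0.0) ** 2)

grid = np.linspace(-1.0, 1.0, 401)
stat = min(criterion(t) for t in grid)
print("infimum-of-criterion test statistic:", round(stat, 3))
```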

]]>
http://www.ifs.org.uk/publications/6548 Fri, 18 Jan 2013 00:00:00 +0000
<![CDATA[Inference on sets in finance]]> In this paper we consider the problem of inference on a class of sets describing a collection of admissible models as solutions to a single smooth inequality. Classical and recent examples include, among others, the Hansen-Jagannathan (HJ) variances for asset portfolio returns, and the set of structural elasticities in Chetty's (2012) analysis of demand with optimisation frictions. We show that the econometric structure of the problem allows us to construct convenient and powerful confidence regions based upon the weighted likelihood ratio and weighted Wald (directed weighted Hausdorff) statistics. The statistics we formulate differ (in part) from existing statistics in that they enforce either exact or first-order equivariance to transformations of parameters, making them especially appealing in the target applications. Moreover, the resulting inference procedures are also more powerful than the structured projection methods, which rely upon building confidence sets for the frontier-determining sufficient parameters (e.g. frontier-spanning portfolios), and then projecting them to obtain confidence sets for HJ sets or MF sets. Lastly, the framework we put forward is also useful for analysing intersection bounds, namely sets defined as solutions to multiple smooth inequalities, since multiple inequalities can be conservatively approximated by a single smooth inequality. We present two empirical examples that show how the new econometric methods are able to generate sharp economic conclusions.

]]>
http://www.ifs.org.uk/publications/6528 Thu, 27 Dec 2012 00:00:00 +0000
<![CDATA[Central limit theorems and multiplier bootstrap when p is much larger than n]]> We derive a central limit theorem for the maximum of a sum of high dimensional random vectors. More precisely, we establish conditions under which the distribution of the maximum is approximated by the maximum of a sum of the Gaussian random vectors with the same covariance matrices as the original vectors. The key innovation of our result is that it applies even if the dimension of random vectors (p) is much larger than the sample size (n). In fact, the growth of p could be exponential in some fractional power of n. We also show that the distribution of the maximum of a sum of the Gaussian random vectors with unknown covariance matrices can be estimated by the distribution of the maximum of the (conditional) Gaussian process obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. We call this procedure the “multiplier bootstrap”. Here too, the growth of p could be exponential in some fractional power of n. We prove that our distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation for the distribution of the original maximum, often with at most a polynomial approximation error. These results are of interest in numerous econometric and statistical applications. In particular, we demonstrate how our central limit theorem and the multiplier bootstrap can be used for high dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All of our results contain non-asymptotic bounds on approximation errors.
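
In its simplest form, the multiplier bootstrap described above reweights the demeaned observations with i.i.d. standard normal multipliers and recomputes the maximum statistic. The sketch below does this for the maximum of p scaled column means with p much larger than n; the dimensions, number of draws and data are chosen only for illustration.

```python
# Multiplier bootstrap for the maximum of p scaled sample means when p >> n:
# recompute max_j (1/sqrt(n)) * sum_i e_i * (x_ij - xbar_j) with i.i.d. standard
# normal multipliers e_i, and use the quantiles of these draws as critical values.
# Dimensions and data are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, p, B = 100, 5000, 500
X = rng.standard_normal((n, p))                  # here the true means are all zero

stat = np.max(X.sum(axis=0) / np.sqrt(n))        # observed max statistic
Xc = X - X.mean(axis=0)                          # demeaned data

draws = np.empty(B)
for b in range(B):
    e = rng.standard_normal(n)
    draws[b] = np.max(e @ Xc / np.sqrt(n))
crit = np.quantile(draws, 0.95)
print("max statistic:", round(stat, 2), "95% critical value:", round(crit, 2))
```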

]]>
http://www.ifs.org.uk/publications/6526 Tue, 18 Dec 2012 00:00:00 +0000
<![CDATA[Gaussian approximation of suprema of empirical processes]]> We develop a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating empirical processes themselves in the sup-norm. We prove an abstract approximation theorem that is applicable to a wide variety of problems, primarily in statistics. In particular, the bound in the main approximation theorem is non-asymptotic and the theorem does not require uniform boundedness of the class of functions. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein's method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local empirical processes and series estimation in nonparametric regression where the classes of functions change with the sample size and are not Donsker-type. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under notably weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.

]]>
http://www.ifs.org.uk/publications/6525 Mon, 17 Dec 2012 00:00:00 +0000
<![CDATA[Inference for best linear approximations to set identified functions]]> http://www.ifs.org.uk/publications/6509 Mon, 17 Dec 2012 00:00:00 +0000 <![CDATA[Identification and estimation in a correlated random coefficients binary response model]]> We study identification and estimation in a binary response model with random coefficients B allowed to be correlated with regressors X. Our objective is to identify the mean of the distribution of B and estimate a trimmed mean of this distribution. Like Imbens and Newey (2009), we use instruments Z and a control vector V to make X independent of B given V. A consequent conditional median restriction identifies the mean of B given V. Averaging over V identifies the mean of B. This leads to an analogous localise-then-average approach to estimation. We estimate conditional means with localised smooth maximum score estimators and average to obtain a √n-consistent and asymptotically normal estimator of a trimmed mean of the distribution of B. The method can be adapted to models with nonrandom coefficients to produce √n-consistent and asymptotically normal estimators under the conditional median restrictions. We explore small sample performance through simulations, and present an application.

]]>
http://www.ifs.org.uk/publications/6508 Mon, 17 Dec 2012 00:00:00 +0000
<![CDATA[Measurement error in nonlinear models - a review]]> This overview of the recent econometrics literature on measurement error in nonlinear models centres on the question of the identification and estimation of general nonlinear models with measurement error. Simple approaches that rely on distributional knowledge regarding the measurement error (such as deconvolution or validation data techniques) are briefly presented. Then follows a description of methods that secure identification via more readily available auxiliary variables (such as repeated measurements, measurement systems with a 'factor model' structure, instrumental variables and panel data). Methods exploiting higher-order moments or bounding techniques to avoid the need for auxiliary information are presented next. Special attention is devoted to a recently introduced general method to handle a broad class of latent variable models, called Entropic Latent Variable Integration via Simulation (ELVIS). Finally, the complex but active topic of nonclassical measurement error is covered and applications of measurement error techniques to other fields are outlined.

]]>
http://www.ifs.org.uk/publications/6480 Mon, 03 Dec 2012 00:00:00 +0000
<![CDATA[Nonparametric identification and semiparametric estimation of classical measurement error models without side information]]> Virtually all methods aimed at correcting for covariate measurement error in regressions rely on some form of additional information (e.g. validation data, known error distributions, repeated measurements or instruments). In contrast, we establish that the fully nonparametric classical errors-in-variables model is identifiable from data on the regressor and the dependent variable alone, unless the model takes a very specific parametric form. The parametric family includes (but is not limited to) the linear specification with normally distributed variables as a well-known special case. This result relies on standard primitive regularity conditions taking the form of smoothness constraints and nonvanishing characteristic function assumptions. Our approach can handle both monotone and nonmonotone specifications, provided the latter oscillate a finite number of times. Given that the very specific unidentified parametric functional form is arguably the exception rather than the rule, this identification result should have wide applicability. It leads to a new perspective on handling measurement error in nonlinear and nonparametric models, opening the way to a novel and practical approach to correct for measurement error in data sets where it was previously considered impossible (due to the lack of additional information regarding the measurement error). We suggest an estimator based on non/semi-parametric maximum likelihood, derive its asymptotic properties and illustrate the effectiveness of the method with a simulation study and an application to the relationship between firm investment behaviour and market value, the latter being notoriously mismeasured.

]]>
http://www.ifs.org.uk/publications/6479 Mon, 03 Dec 2012 00:00:00 +0000
<![CDATA[A triangular treatment effect model with random coefficients in the selection equation]]> In this paper we study nonparametric estimation in a binary treatment model where the outcome equation is of unrestricted form, and the selection equation contains multiple unobservables that enter through a nonparametric random coefficients specification. This specification is flexible because it allows for complex unobserved heterogeneity of economic agents and non-monotone selection into treatment. We obtain conditions under which both the conditional distributions of Y0 and Y1, the outcomes for the untreated and treated respectively, given first stage unobserved random coefficients, are identified. We can thus identify an average treatment effect, conditional on first stage unobservables, called UCATE, which yields most treatment effect parameters that depend on averages, like ATE and TT. We provide sharp bounds on the variance, the joint distribution of (Y0, Y1) and the distribution of treatment effects. In the particular case where the outcomes are continuously distributed, we provide novel and weak conditions that allow us to point identify the joint conditional distribution of Y0 and Y1, given the unobservables. This allows us to derive every treatment effect parameter, e.g. the distribution of treatment effects and the proportion of individuals who benefit from treatment. We present estimators for the marginals, average and distribution of treatment effects, both conditional on unobservables and unconditional, as well as total population effects. The estimators use all data and discard tail values of the instruments when they are too unlikely. We provide their rates of convergence, and analyse their finite sample behaviour in a simulation study. Finally, we also discuss the situation where some of the instruments are discrete.

]]>
http://www.ifs.org.uk/publications/6476 Mon, 03 Dec 2012 00:00:00 +0000
<![CDATA[Identification in auctions with selective entry]]> This paper considers nonparametric identification of a two-stage entry and bidding model for auctions which we call the Affiliated-Signal (AS) model. This model assumes that potential bidders have private values, observe imperfect signals of their true values prior to entry, and choose whether to undertake a costly entry process. The AS model is a theoretically appealing candidate for the structural analysis of auctions with entry: it accommodates a wide range of entry processes, in particular nesting the Levin and Smith (1994) and Samuelson (1985) models as special cases. To date, however, the model's identification properties have not been well understood. We establish identification results for the general AS model, using variation in factors affecting entry behaviour (such as potential competition or entry costs) to construct identified bounds on model fundamentals. If available entry variation is continuous, the AS model may be point identified; otherwise, it will be partially identified. We derive constructive identification results in both cases, which can readily be refined to produce the sharp identified set. We also consider policy analysis in environments where only partial identification is possible, and derive identified bounds on expected seller revenue corresponding to a wide range of counterfactual policies while accounting for endogenous and arbitrarily selective entry. Finally we establish that our core results extend to environments with asymmetric bidders and nonseparable auction-level unobserved heterogeneity.

]]>
http://www.ifs.org.uk/publications/6473 Fri, 30 Nov 2012 00:00:00 +0000
<![CDATA[A winning formula? Elementary indices in the Retail Prices Index]]> can be applied at this level and moreover that it favours the use of the Jevons index.]]> http://www.ifs.org.uk/publications/6456 Tue, 27 Nov 2012 00:00:00 +0000 <![CDATA[Local identification of nonparametric and semiparametric models]]> In parametric models a sufficient condition for local identification is that the vector of moment conditions is differentiable at the true parameter with a full rank derivative matrix. This paper shows that additional conditions are often needed in nonlinear, nonparametric models to avoid nonlinearities overwhelming linear effects. It gives restrictions on a neighbourhood of the true value that are sufficient for local identification. These results are applied to obtain new, primitive identification conditions in several important models, including nonseparable quantile instrumental variable (IV) models, single-index IV models, and semiparametric consumption-based asset pricing models.

]]>
http://www.ifs.org.uk/publications/6441 Wed, 14 Nov 2012 00:00:00 +0000
<![CDATA[Testing regression monotonicity in econometric models]]> http://www.ifs.org.uk/publications/6436 Tue, 13 Nov 2012 00:00:00 +0000 <![CDATA[Adaptive test of conditional moment inequalities]]> In this paper, the author constructs a new test of conditional moment inequalities based on studentised kernel estimates of moment functions. The test automatically adapts to the unknown smoothness of the moment functions, has uniformly correct asymptotic size, and is rate optimal against certain classes of alternatives. Some existing tests have nontrivial power against n^{-1/2}-local alternatives of a certain type, whereas the method developed in this paper only allows for (n/log n)^{-1/2}-local alternatives of this type. There exist, however, large classes of sequences of well-behaved alternatives against which the test developed in this paper is consistent and those tests are not.

]]>
http://www.ifs.org.uk/publications/6437 Tue, 13 Nov 2012 00:00:00 +0000
<![CDATA[Comparing household inflation experiences measured by the CPI and RPI ]]> http://www.ifs.org.uk/publications/6424 Fri, 02 Nov 2012 00:00:00 +0000 <![CDATA[Lifetime inequality and redistribution]]> http://www.ifs.org.uk/publications/6551 Wed, 31 Oct 2012 00:00:00 +0000 <![CDATA[Developing expenditure questions: Findings from R2 cognitive testing]]> http://www.ifs.org.uk/publications/6417 Tue, 30 Oct 2012 00:00:00 +0000 <![CDATA[Developing expenditure questions: Findings from focus groups]]> Currently there is no established way to measure expenditure in the context of a general purpose survey. Therefore, NatCen's Questionnaire Development and Testing (QDT) Hub, working in collaboration with the Institute for Fiscal Studies and collaborators from Oxford and Cambridge Universities, is looking at how best to measure expenditure in a social survey context.

This report provides findings from a series of focus groups investigating how people think about household expenditure and what issues people may have in reporting household expenditure in a social survey context. The information collected in the focus groups will be used as a starting point for designing new questions on household spending for use in future social surveys. Subsequent stages of work will include cognitively testing any new questions produced and consulting a panel of experts over the proposed questions.

This project was funded by the Nuffield Foundation.

]]>
http://www.ifs.org.uk/publications/6415 Tue, 30 Oct 2012 00:00:00 +0000
<![CDATA[Developing expenditure questions: Findings from R1 cognitive testing]]> http://www.ifs.org.uk/publications/6416 Tue, 30 Oct 2012 00:00:00 +0000 <![CDATA[An instrumental variable random coefficients model for binary outcomes]]> In this paper we study a random coefficient model for a binary outcome. We allow for the possibility that some or even all of the regressors are arbitrarily correlated with the random coefficients, thus permitting endogeneity. We assume the existence of observed instrumental variables Z that are jointly independent with the random coefficients, although we place no structure on the joint determination of the endogenous variable X and instruments Z, as would be required for a control function approach. The model fits within the spectrum of generalised instrumental variable models studied in Chesher and Rosen (2012a), and we thus apply identification results from that and related studies to the present context, demonstrating their use. Specifically, we characterize the identified set for the distribution of random coefficients in the binary response model with endogeneity via a collection of conditional moment inequalities, and we investigate the structure of these sets by way of numerical illustration.

]]>
http://www.ifs.org.uk/publications/6409 Tue, 23 Oct 2012 00:00:00 +0000
<![CDATA[Intersection bounds: estimation and inference]]> We develop a practical and novel method for inference on intersection bounds, namely bounds defined by either the infimum or supremum of a parametric or nonparametric function, or equivalently, the value of a linear programming problem with a potentially infinite constraint set. We show that many bounds characterizations in econometrics, for instance bounds on parameters under conditional moment inequalities, can be formulated as intersection bounds. Our approach is especially convenient for models comprised of a continuum of inequalities that are separable in parameters, and also applies to models with inequalities that are non-separable in parameters. Since analog estimators for intersection bounds can be severely biased in finite samples, routinely underestimating the size of the identified set, we also offer a median-bias-corrected estimator of such bounds as a by-product of our inferential procedures. We develop theory for large sample inference based on the strong approximation of a sequence of series or kernel-based empirical processes by a sequence of "penultimate" Gaussian processes. These penultimate processes are generally not weakly convergent, and thus non-Donsker. Our theoretical results establish that we can nonetheless perform asymptotically valid inference based on these processes. Our construction also provides new adaptive inequality/moment selection methods. We provide conditions for the use of nonparametric kernel and series estimators, including a novel result that establishes strong approximation for any general series estimator admitting linearization, which may be of independent interest.

]]>
http://www.ifs.org.uk/publications/6382 Wed, 17 Oct 2012 00:00:00 +0000
<![CDATA[Asymptotic efficiency of semiparametric two-step GMM]]> In this note, we characterise the semiparametric efficiency bound for a class of semiparametric models in which the unknown nuisance functions are identified via nonparametric conditional moment restrictions with possibly non-nested or overlapping conditioning sets, and the finite dimensional parameters are potentially over-identified via unconditional moment restrictions involving the nuisance functions. We discover the surprising result that semiparametric two-step optimally weighted GMM estimators achieve the efficiency bound, where the nuisance functions could be estimated via any consistent non-parametric procedures in the first step. Regardless of whether the efficiency bound has a closed form expression or not, we provide easy-to-compute sieve based optimal weight matrices that lead to asymptotically efficient two-step GMM estimators.
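To fix ideas, the sketch below illustrates the generic second step of a two-step optimally weighted GMM estimator in a toy over-identified linear IV model; the nonparametric first step that would estimate nuisance functions (and the sieve-based weight matrices of the paper) is abstracted away, and all variable names and the data-generating process are invented for the illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Toy over-identified linear IV model with an endogenous regressor:
#   x = z'pi + v,  y = theta0 * x + u,  with u correlated with v.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal((n, 2))
v = rng.standard_normal(n)
x = z @ np.array([1.0, 0.5]) + v
u = 0.5 * v + rng.standard_normal(n)
y = 2.0 * x + u                       # theta0 = 2

def moments(theta):
    """Moment functions g_i(theta) = z_i * (y_i - theta * x_i), returned as an n x 2 array."""
    return z * (y - theta * x)[:, None]

def gmm_objective(theta, W):
    gbar = moments(theta[0]).mean(axis=0)
    return gbar @ W @ gbar

# Step 1: preliminary estimate using the identity weight matrix.
theta1 = minimize(gmm_objective, x0=[0.0], args=(np.eye(2),)).x
# Step 2: optimal weight matrix estimated from the step-1 moments, then re-minimize.
g1 = moments(theta1[0])
W_opt = np.linalg.inv(g1.T @ g1 / n)
theta2 = minimize(gmm_objective, x0=theta1, args=(W_opt,)).x
print("two-step GMM estimate of theta0:", round(theta2[0], 3))
```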

]]>
http://www.ifs.org.uk/publications/6372 Mon, 15 Oct 2012 00:00:00 +0000
<![CDATA[Exogeneity in semiparametric moment condition models]]> http://www.ifs.org.uk/publications/6371 Mon, 15 Oct 2012 00:00:00 +0000 <![CDATA[An estimation of economic models with recursive preferences]]> This paper presents estimates of key preference parameters of the Epstein and Zin (1989, 1991) and Weil (1989) (EZW) recursive utility model, evaluates the models ability to fit asset return data relative to other asset pricing models, and investigates the implications of such estimates for the unobservable aggregate wealth return. Our empirical results indicate that the estimated relative risk aversion parameter ranges from 17-60, with higher values for aggregate consumption than for stockholder consumption, while the estimated elasticity of intertemporal substitution is above one. In addition, the estimated model-implied aggregate wealth return is found to be weakly correlated with the CRSP value-weighted stock market return, suggesting that the return to human wealth is negatively correlated with the aggregate stock market return.

]]>
http://www.ifs.org.uk/publications/6373 Mon, 15 Oct 2012 00:00:00 +0000
<![CDATA[Financial implications of relationship breakdown: does marriage matter?]]> http://www.ifs.org.uk/publications/6363 Tue, 09 Oct 2012 00:00:00 +0000 <![CDATA[Econometric analysis of games with multiple equilibria]]> This article reviews the recent literature on the econometric analysis of games where multiple solutions are possible. Multiplicity does not necessarily preclude the estimation of a particular model (and in certain cases even improves its identification), but ignoring it can lead to misspecifications. The survey starts with a general characterisation of structural models that highlights how multiplicity affects the classical paradigm. Because the information structure is an important guide to identification and estimation strategies, I discuss games of complete and incomplete information separately. Whereas many of the techniques discussed in the article can be transported across different information environments, some of them are specific to particular models. I also survey models of social interactions in a different section. I close with a brief discussion of post-estimation issues and research prospects.

]]>
http://www.ifs.org.uk/publications/6344 Mon, 01 Oct 2012 00:00:00 +0000
<![CDATA[Efficient estimation of conditional risk measures in a semiparametric GARCH model]]> This paper proposes efficient estimators of risk measures in a semiparametric GARCH model defined through moment constraints. Moment constraints are often used to identify and estimate the mean and variance parameters; however, they are discarded when estimating error quantiles. In order to prevent this efficiency loss in quantile estimation, we propose a quantile estimator based on inverting an empirical likelihood weighted distribution estimator. It is found that the new quantile estimator is uniformly more efficient than the simple empirical quantile and a quantile estimator based on normalized residuals. At the same time, the efficiency gain in error quantile estimation hinges on the efficiency of estimators of the variance parameters. We show that the same conclusion applies to the estimation of conditional Expected Shortfall. Our comparison also leads to interesting implications of residual bootstrap for dynamic models. We find that the proposed estimators for conditional Value-at-Risk and Expected Shortfall are asymptotically mixed normal. This asymptotic theory can be used to construct confidence bands for these estimators by taking account of parameter uncertainty. Simulation evidence as well as empirical results are provided.
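As a point of reference, the sketch below computes conditional Value-at-Risk and Expected Shortfall from standardized residuals of a GARCH(1,1) model, i.e. the simple residual-quantile benchmark against which the paper's empirical-likelihood-weighted estimator is compared; the GARCH parameters are assumed known here rather than estimated, and the efficiency-improving weighting is not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
omega, a, b = 0.05, 0.10, 0.85           # assumed GARCH(1,1) parameters (illustrative)

# Simulate returns y_t = sigma_t * eps_t with a GARCH(1,1) variance recursion.
eps = rng.standard_t(df=7, size=T) / np.sqrt(7 / 5)   # heavy-tailed, unit-variance errors
sigma2 = np.empty(T)
y = np.empty(T)
sigma2[0] = omega / (1 - a - b)
y[0] = np.sqrt(sigma2[0]) * eps[0]
for t in range(1, T):
    sigma2[t] = omega + a * y[t - 1] ** 2 + b * sigma2[t - 1]
    y[t] = np.sqrt(sigma2[t]) * eps[t]

# Residual-based risk measures: standardize, take an empirical error quantile,
# then scale back by the next period's conditional volatility.
resid = y / np.sqrt(sigma2)              # standardized residuals
alpha = 0.05
q_alpha = np.quantile(resid, alpha)      # empirical error quantile
sigma_next = np.sqrt(omega + a * y[-1] ** 2 + b * sigma2[-1])
VaR_next = sigma_next * q_alpha                          # conditional Value-at-Risk
ES_next = sigma_next * resid[resid <= q_alpha].mean()    # conditional Expected Shortfall
print(f"one-step-ahead 5% VaR = {VaR_next:.3f}, ES = {ES_next:.3f}")
```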

]]>
http://www.ifs.org.uk/publications/6327 Fri, 21 Sep 2012 00:00:00 +0000
<![CDATA[Testing for the stochastic dominance efficiency of a given portfolio]]> We propose a new statistical test of the stochastic dominance efficiency of a given portfolio over a class of portfolios. We establish its null and alternative asymptotic properties, and define a method for consistently estimating critical values. We present some numerical evidence that our tests work well in moderate sized samples.

]]>
http://www.ifs.org.uk/publications/6329 Fri, 21 Sep 2012 00:00:00 +0000
<![CDATA[A flexible semiparametric model for time series]]> We consider approximating a multivariate regression function by an affine combination of one-dimensional conditional component regression functions. The weight parameters involved in the approximation are estimated by least squares on the first-stage nonparametric kernel estimates. We establish asymptotic normality for the estimated weights and the regression function in two cases: the number of the covariates is finite, and the number of the covariates is diverging. As the observations are assumed to be stationary and near epoch dependent, the approach in this paper is applicable to estimation and forecasting issues in time series analysis. Furthermore, the methods and results are augmented by a simulation study and illustrated by application in the analysis of the Australian annual mean temperature anomaly series. We also apply our methods to high frequency volatility forecasting, where we obtain superior results to parametric methods.
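A minimal cross-sectional sketch of the two-stage idea follows, assuming Nadaraya-Watson estimates for the one-dimensional component regressions and a rule-of-thumb bandwidth; these choices, the data-generating process and all names below are illustrative rather than those analysed in the paper.

```python
import numpy as np

def nw_fit(x, y, grid, h):
    """Nadaraya-Watson estimate of E[Y | X = grid] with a Gaussian kernel and bandwidth h."""
    k = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.3 * rng.standard_normal(n)

# Stage 1: one-dimensional conditional component regressions m_j(x_j) = E[Y | X_j = x_j].
h = 1.06 * n ** (-1 / 5)
M = np.column_stack([nw_fit(X[:, j], y, X[:, j], h * X[:, j].std()) for j in range(d)])

# Stage 2: least-squares weights for the affine combination w0 + sum_j w_j * m_j(X_j).
Z = np.column_stack([np.ones(n), M])
w = np.linalg.lstsq(Z, y, rcond=None)[0]
fitted = Z @ w
print("estimated weights:", np.round(w, 3))
print("in-sample R^2:", round(1 - np.var(y - fitted) / np.var(y), 3))
```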

]]>
http://www.ifs.org.uk/publications/6330 Fri, 21 Sep 2012 00:00:00 +0000
<![CDATA[Averaging of moment condition estimators]]> We establish the consistency and asymptotic normality for a class of estimators that are linear combinations of a set of √n-consistent estimators whose cardinality increases with sample size. A special case of our framework corresponds to the conditional moment restriction and the implied estimator in that case is shown to achieve the semiparametric efficiency bound. The proofs do not rely on smoothness of underlying criterion functions.

]]>
http://www.ifs.org.uk/publications/6328 Fri, 21 Sep 2012 00:00:00 +0000
<![CDATA[Wages and informality in developing countries]]> http://www.ifs.org.uk/publications/6316 Thu, 20 Sep 2012 00:00:00 +0000 <![CDATA[Microfinance, Poverty and Education]]> http://www.ifs.org.uk/publications/6313 Wed, 19 Sep 2012 00:00:00 +0000 <![CDATA[Sharp for SARP: Nonparametric bounds on the behavioural and welfare effects of price changes]]> Sharp nonparametric bounds are derived for Hicksian compensating and equivalent variations. These 'i-bounds' generalize earlier results of Blundell, Browning and Crawford (2008). We show that their e-bounds are sharp under the Weak Axiom of Revealed Preference (WARP). They do not require transitivity. The new i-bounds are sharp under the Strong Axiom of Revealed Preference (SARP). By requiring transitivity they can be used to bound welfare measures. The new bounds on welfare measures are shown to be operationalized through algorithms that are easy to implement.

]]>
http://www.ifs.org.uk/publications/6312 Wed, 19 Sep 2012 00:00:00 +0000
<![CDATA[A nonparametric test of the leverage hypothesis]]> http://www.ifs.org.uk/publications/6308 Thu, 13 Sep 2012 00:00:00 +0000 <![CDATA[Nonparametric estimation of a periodic sequence in the presence of a smooth trend]]> In this paper, we study a nonparametric regression model including a periodic component, a smooth trend function, and a stochastic error term. We propose a procedure to estimate the unknown period and the function values of the periodic component as well as the nonparametric trend function. The theoretical part of the paper establishes the asymptotic properties of our estimators. In particular, we show that our estimator of the period is consistent. In addition, we derive the convergence rates as well as the limiting distributions of our estimators of the periodic component and the trend function. The asymptotic results are complemented with a simulation study that investigates the small sample behaviour of our procedure. Finally, we illustrate our method by applying it to a series of global temperature anomalies.
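The following sketch conveys the flavour of the procedure under simplifying assumptions: the trend is removed with a crude kernel smoother and the period is then chosen by a penalized least-squares criterion over candidate values. The BIC-style penalty, the bandwidth and the simulated series are assumptions made here for illustration and are not the tuning choices studied in the paper.

```python
import numpy as np

def kernel_trend(y, h):
    """Nadaraya-Watson estimate of a smooth trend in rescaled time t/T (Gaussian kernel)."""
    T = len(y)
    u = np.arange(T) / T
    k = np.exp(-0.5 * ((u[:, None] - u[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

def estimate_period(y, candidates, h=0.1):
    """Choose the period by penalized least squares on the detrended series; the penalty
    guards against selecting multiples of the true period."""
    T = len(y)
    detrended = y - kernel_trend(y, h)
    crit = {}
    for p in candidates:
        seasonal = np.array([detrended[i::p].mean() for i in range(p)])   # periodic component
        fit = seasonal[np.arange(T) % p]
        rss = np.sum((detrended - fit) ** 2)
        crit[p] = T * np.log(rss / T) + p * np.log(T)                     # BIC-style criterion
    return min(crit, key=crit.get)

rng = np.random.default_rng(0)
T, true_p = 600, 12
t = np.arange(T)
y = 0.5 * np.sin(2 * np.pi * (t % true_p) / true_p) + 2 * (t / T) ** 2 + 0.3 * rng.standard_normal(T)
print("estimated period:", estimate_period(y, candidates=range(2, 25)))
```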

]]>
http://www.ifs.org.uk/publications/6307 Wed, 12 Sep 2012 00:00:00 +0000
<![CDATA[Nonparametric regression for locally stationary time series]]> In this paper, we study nonparametric models allowing for locally stationary regressors and a regression function that changes smoothly over time. These models are a natural extension of time series models with time-varying coefficients. We introduce a kernel-based method to estimate the time-varying regression function and provide asymptotic theory for our estimates. Moreover, we show that the main conditions of the theory are satisfied for a large class of nonlinear autoregressive processes with a time-varying regression function. Finally, we examine structured models where the regression function splits up into time-varying additive components. As will be seen, estimation in these models does not suffer from the curse of dimensionality. We complement the technical analysis of the paper by an application to financial data.
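A minimal sketch of the kernel idea, assuming a Nadaraya-Watson estimator with a product Gaussian kernel in rescaled time u = t/T and in the lagged covariate, applied to a toy time-varying AR(1); the bandwidths, the data-generating process and all names are illustrative assumptions.

```python
import numpy as np

def tv_nw(y, x, u, u0, x0, h_time, h_x):
    """Nadaraya-Watson estimate of m(u0, x0) = E[Y_t | X_t = x0] at rescaled time u0,
    using a product Gaussian kernel in rescaled time and in the covariate."""
    w = np.exp(-0.5 * ((u - u0) / h_time) ** 2) * np.exp(-0.5 * ((x - x0) / h_x) ** 2)
    return np.sum(w * y) / np.sum(w)

# Toy tvAR(1): Y_t = m(t/T, Y_{t-1}) + e_t with a regression function that drifts over time.
rng = np.random.default_rng(0)
T = 1000
y = np.zeros(T)
for t in range(1, T):
    y[t] = (0.2 + 0.5 * t / T) * np.tanh(y[t - 1]) + 0.3 * rng.standard_normal()

u = np.arange(1, T) / T          # rescaled time of each (Y_t, Y_{t-1}) pair
m_hat = tv_nw(y[1:], y[:-1], u, u0=0.9, x0=0.5, h_time=0.1, h_x=0.3)
print("estimated m(u=0.9, x=0.5):", round(m_hat, 3))
print("true value:", round((0.2 + 0.5 * 0.9) * np.tanh(0.5), 3))
```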

]]>
http://www.ifs.org.uk/publications/6302 Fri, 07 Sep 2012 00:00:00 +0000
<![CDATA[Durable purchases over the later life cycle]]> We investigate life cycle patterns of demand for services from household durables using UK panel data. We take careful account of prices, demographics, labour supply and health. Demand for consumer electronics rises with age, while the demand for household appliances is flat. These findings contrast with the well documented decline in nondurable consumption at older ages, and suggest that studies that estimate the overall discount rate from nondurable consumption may underestimate consumer patience and the savings required to fund retirement. We also find important nonseparabilities between the demand for durables, labour supply and health status.

]]>
http://www.ifs.org.uk/publications/6300 Fri, 31 Aug 2012 00:00:00 +0000
<![CDATA[Simultaneous equations for discrete outcomes: coherence, completeness, and identification]]> This paper studies simultaneous equations models for two or more discrete outcomes. These models may be incoherent, delivering no values of the outcomes at certain values of the latent variables and covariates, and they may be incomplete, delivering more than one value of the outcomes at certain values of the covariates and latent variates. We revisit previous approaches to the problems of incompleteness and incoherence in such models, and we propose a new approach for dealing with these. For each approach, we use random set theory to characterize sharp identification regions for the marginal distribution of latent variables and the structural function relating outcomes to covariates, illustrating the relative identifying power and tradeoffs of the different approaches. We show that these identified sets are characterized by systems of conditional moment equalities and inequalities, and we provide a generically applicable algorithm for constructing these. We demonstrate these results for the simultaneous equations model for binary outcomes studied in for example Heckman (1978) and Tamer (2003) and the triangular model with a discrete endogenous variable studied in Chesher (2005) and Jun, Pinkse, and Xu (2011) as illustrative examples.

]]>
http://www.ifs.org.uk/publications/6297 Wed, 22 Aug 2012 00:00:00 +0000
<![CDATA[Nonparametric additive models]]> http://www.ifs.org.uk/publications/6293 Mon, 20 Aug 2012 00:00:00 +0000 <![CDATA[Asymptotic theory for differentiated products demand models with many markets]]> T increases, taking into account that the market shares are approximated by Monte Carlo integration. It is shown that the estimated parameters are √T-consistent and asymptotically normal as long as the number of simulations R grows fast enough relative to T. Monte Carlo integration induces both additional variance as well as additional bias terms in the asymptotic expansion of the estimator. If R does not increase as fast as T, the leading bias term dominates the leading variance term and the asymptotic distribution might not be centered at 0. This paper suggests methods to eliminate the leading bias term from the asymptotic expansion. Furthermore, an adjustment to the asymptotic variance is proposed that takes the leading variance term into account. Monte Carlo results show that these adjustments, which are easy to compute, should be used in applications to avoid severe undercoverage caused by the simulation error.]]> http://www.ifs.org.uk/publications/6290 Thu, 02 Aug 2012 00:00:00 +0000 <![CDATA[Measuring living standards with income and consumption: evidence from the UK]]> http://www.ifs.org.uk/publications/6276 Wed, 25 Jul 2012 00:00:00 +0000 <![CDATA[On the testability of identification in some nonparametric models with endogeneity]]> This paper examines three distinct hypothesis testing problems that arise in the context of identification of some nonparametric models with endogeneity. The first hypothesis testing problem we study concerns testing necessary conditions for identification in some nonparametric models with endogeneity involving mean independence restrictions. These conditions are typically referred to as completeness conditions. The second and third hypothesis testing problems we examine concern testing for identification directly in some nonparametric models with endogeneity involving quantile independence restrictions. For each of these hypothesis testing problems, we provide conditions under which any test will have power no greater than size against any alternative. In this sense, we conclude that no nontrivial tests for these hypothesis testing problems exist.

]]>
http://www.ifs.org.uk/publications/6275 Tue, 24 Jul 2012 00:00:00 +0000
<![CDATA[Penalized estimation of high-dimensional models under a generalized sparsity condition ]]> http://www.ifs.org.uk/publications/6274 Mon, 23 Jul 2012 00:00:00 +0000 <![CDATA[Testing multiple inequality hypotheses: a smoothed indicator approach]]> This paper proposes a class of origin-smooth approximators of indicators underlying the sum-of-negative-part statistic for testing multiple inequalities. The need for simulation or bootstrap to obtain test critical values is thereby obviated. A simple procedure is enabled using fixed critical values. The test is shown to have correct asymptotic size in the uniform sense that the supremum of finite-sample rejection probabilities over null-restricted data distributions tends asymptotically to the nominal significance level. This applies under weak assumptions allowing for estimator covariance singularity. The test is unbiased for a wide class of local alternatives. A new theorem establishes directions in which the test is locally most powerful. The proposed procedure is compared with predominant existing tests in structure, theory and simulation.

This paper is a revised version of CWP13/09.

]]>
http://www.ifs.org.uk/publications/6243 Fri, 06 Jul 2012 00:00:00 +0000
<![CDATA[Identification and shape restrictions in nonparametric instrumental variables estimation]]> http://www.ifs.org.uk/publications/6212 Wed, 27 Jun 2012 00:00:00 +0000 <![CDATA[A simple bootstrap method for constructing nonparametric confidence bands for functions]]> http://www.ifs.org.uk/publications/6211 Mon, 25 Jun 2012 00:00:00 +0000 <![CDATA[Model comparisons in unstable environments]]> The goal of this paper is to develop formal tests to evaluate the relative in-sample performance of two competing, misspecified non-nested models in the presence of possible data instability. Compared to previous approaches to model selection, which are based on measures of global performance, we focus on the local relative performance of the models. We propose three tests that are based on different measures of local performance and that correspond to different null and alternative hypotheses. The empirical application provides insights into the time variation in the performance of a representative DSGE model of the European economy relative to that of VARs.

]]>
http://www.ifs.org.uk/publications/6201 Thu, 07 Jun 2012 00:00:00 +0000
<![CDATA[A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators]]> We analyze fast procedures for conducting Monte Carlo experiments involving bootstrap estimators, providing formal results establishing the properties of these methods under general conditions.

]]>
http://www.ifs.org.uk/publications/6202 Thu, 31 May 2012 00:00:00 +0000
<![CDATA[Do public health interventions crowd out private health investments? Malaria control policies in Eritrea]]> It is often argued that engaging in indoor residual spraying (IRS) in areas with high coverage of mosquito bed nets may discourage net ownership and use. If so, this would be a case of a public program inducing perverse incentives. We analyze new data from a randomized control trial conducted in Eritrea which, surprisingly, shows the opposite: IRS encouraged net acquisition and use. Our evidence points to the role of imperfect information. The introduction of IRS may have made the problem of malaria more salient, leading to a change in beliefs about its importance and to an increase in private health investments.

]]>
http://www.ifs.org.uk/publications/6193 Fri, 18 May 2012 00:00:00 +0000
<![CDATA[Inference on treatment effects after selection amongst high-dimensional controls]]> We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances where the number of controls may be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the 'post-double-selection' method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties.

The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates.
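A minimal sketch of the post-double-selection recipe on simulated data is given below, using cross-validated Lasso from scikit-learn as an off-the-shelf selector; the paper's theory is developed for specific penalty choices, so this should be read as an illustration of the selection-then-OLS logic rather than as the recommended implementation, and the simulated design is invented here.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 300                                                # more controls than observations
X = rng.standard_normal((n, p))
d = X[:, :5] @ np.ones(5) + rng.standard_normal(n)             # treatment, confounded by 5 controls
y = 1.0 * d + X[:, :5] @ np.full(5, 0.5) + rng.standard_normal(n)   # true treatment effect = 1

# Step 1: Lasso of the outcome on the controls; keep controls with nonzero coefficients.
sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
# Step 2: Lasso of the treatment on the controls; keep controls with nonzero coefficients.
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
# Step 3: OLS of the outcome on the treatment and the union of the selected controls.
union = np.union1d(sel_y, sel_d)
Z = sm.add_constant(np.column_stack([d, X[:, union]]))
fit = sm.OLS(y, Z).fit(cov_type="HC1")
print("post-double-selection estimate:", round(fit.params[1], 3),
      " robust s.e.:", round(fit.bse[1], 3))
```

Selecting controls relevant for either the outcome or the treatment equation is what protects the final OLS step against moderate selection mistakes in either Lasso.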

This paper is a revision of CWP42/11.

]]>
http://www.ifs.org.uk/publications/6191 Mon, 07 May 2012 00:00:00 +0000
<![CDATA[Saving on a rainy day, borrowing for a rainy day]]> http://www.ifs.org.uk/publications/6171 Fri, 04 May 2012 00:00:00 +0000 <![CDATA[Late starters or excluded generations? A cohort analysis of catch up in home ownership in England]]> http://www.ifs.org.uk/publications/6170 Thu, 03 May 2012 00:00:00 +0000 <![CDATA[The effect of the financial crisis on older households in England]]> http://www.ifs.org.uk/publications/6159 Mon, 30 Apr 2012 00:00:00 +0000 <![CDATA[The returns to private education: evidence from Mexico]]> Despite the rapid expansion and increasing importance of private education in developing countries, very little is known about the impact of studying in private schools on educational attainment and wages. This paper contributes to fi…lling this gap by estimating the returns to private high schools in Mexico. We construct a unique dataset that combines labor market outcomes and historical school census data, and we exploit changes in the availability and size of public and private high schools across states and over time for identi…cation. We …nd substantial evidence of a positive effect of studying in a private high school on wages after college graduation, and we discuss alternative mechanisms that can explain this …finding.

]]>
http://www.ifs.org.uk/publications/6153 Thu, 19 Apr 2012 00:00:00 +0000
<![CDATA[Household responses to information on child nutrition: experimental evidence from Malawi]]> This working paper has been updated to a new version, W14/02, which can be downloaded here.

]]>
http://www.ifs.org.uk/publications/6140 Wed, 11 Apr 2012 00:00:00 +0000
<![CDATA[Semiparametric estimation of random coefficients in structural economic models]]> In structural economic models, individuals are usually characterized as solving a decision problem that is governed by a finite set of parameters. This paper discusses the nonparametric estimation of the probability density function of these parameters if they are allowed to vary continuously across the population. We establish that the problem of recovering the probability density function of random parameters falls into the class of non-linear inverse problems. This framework helps us to answer the question of whether there exist densities that satisfy this relationship. It also allows us to characterize the identified set of such densities. We obtain novel conditions for point identification, and establish that point identification is generically weak. Given this insight, we provide a consistent nonparametric estimator that accounts for this fact, and derive its asymptotic distribution. Our general framework allows us to deal with unobservable nuisance variables, e.g., measurement error, but also covers the case when there are no such nuisance variables. Finally, Monte Carlo experiments for several structural models are provided which illustrate the performance of our estimation procedure.

]]>
http://www.ifs.org.uk/publications/6138 Wed, 04 Apr 2012 00:00:00 +0000
<![CDATA[Estimation of random coefficients logit demand models with interactive fixed effects]]> We extend the Berry, Levinsohn and Pakes (BLP, 1995) random coefficients discrete-choice demand model, which underlies much recent empirical work in IO. We add interactive fixed effects in the form of a factor structure on the unobserved product characteristics. The interactive fixed effects can be arbitrarily correlated with the observed product characteristics (including price), which accommodates endogeneity and, at the same time, captures strong persistence in market shares across products and markets. We propose a two step least squares-minimum distance (LS-MD) procedure to calculate the estimator. Our estimator is easy to compute, and Monte Carlo simulations show that it performs well. We consider an empirical application to US automobile demand.

]]>
http://www.ifs.org.uk/publications/6137 Fri, 30 Mar 2012 00:00:00 +0000
<![CDATA[The distributional impact of public spending in the UK]]> http://www.ifs.org.uk/publications/6076 Wed, 21 Mar 2012 00:00:00 +0000 <![CDATA[Do up-front tax incentives affect private pension saving in the United Kingdom? ]]> The paper examines how individuals respond to complex decision-making environments – in particular, whether up-front financial incentives are an effective policy lever to change behaviour. The paper argues that incentives differ in their transparency and in their complexity; individuals are more likely to respond to incentives that are both transparent and imply a large pay-off in terms of net income.

The paper focuses on household ‘tax planning’ in the context of tax reliefs for retirement saving in the United Kingdom. It examines whether take-up of retirement saving instruments increases at the higher rate threshold for income tax, since tax relief is given at the marginal tax rate and should be more attractive to those just above this threshold than to those just below it. It then examines a more complex case where the tax system provides an incentive for pension saving to be done by one member of a couple. Econometric results are obtained from the Family Resources Survey on these two tests of household responses to complex incentives.

]]>
http://www.ifs.org.uk/publications/6073 Tue, 20 Mar 2012 00:00:00 +0000
<![CDATA[Identification of income-leisure preferences and evaluation of income tax policy]]> The merits of alternative income tax policies depend on the population distribution of preferences for income and leisure. Standard theory, which supposes that persons want more income and more leisure, does not predict how they resolve the tension between these desires. Empirical studies of labor supply have imposed strong preference assumptions that lack foundation. This paper examines anew the problem of inference on income-leisure preferences and considers the implications for evaluation of tax policy. I first perform a basic revealed-preference analysis assuming only that persons prefer more income and leisure. This shows that observation of a person's time allocation under a status quo tax policy may bound his allocation under a proposed policy or may have no implications, depending on the tax schedules and the person's status quo time allocation. I next explore the identifying power of two classes of assumptions that restrict the distribution of income-leisure preferences. One assumes that groups of persons who face different choice sets have the same preference distribution. The second restricts the shape of this distribution. The generic finding is partial identification of preferences. This implies partial prediction of tax revenue under proposed policies and partial knowledge of the welfare function for utilitarian policy evaluation.

]]>
http://www.ifs.org.uk/publications/6069 Thu, 15 Mar 2012 00:00:00 +0000
<![CDATA[How children's schooling and work are affected when their father leaves permanently: evidence from Colombia]]> This paper investigates how the permanent departure of the father from the household affects children's school enrolment and work participation in rural Colombia. Our results show that departure of the father decreases children's school enrolment by around 4 percentage points, and increases child labour by 3 percentage points. After using household fixed effects to deal with time-invariant unobserved heterogeneity, and providing evidence suggesting strongly that estimates are not biased by time varying unobserved heterogeneity, we also exploit an interesting feature of our setting, a conditional cash transfer programme in place, and show that it counteracts the adverse effects. This, and other pieces of evidence we give, strongly suggests that the channel through which departure affects children is through reducing income. It also highlights the important safety net role played by such welfare programmes, in particular for very disadvantaged households, who are unlikely to find formal or informal ways of insuring themselves against such vagaries.

]]>
http://www.ifs.org.uk/publications/6044 Thu, 08 Mar 2012 00:00:00 +0000
<![CDATA[Revealed preference in a discrete consumption space]]> We show that an agent maximizing some utility function on a discrete (as opposed to continuous) consumption space will obey the generalized axiom of revealed preference (GARP) so long as the agent obeys cost efficiency. Cost efficiency will hold if there is some good, outside the set of goods being studied by the modeler, that can be consumed by the agent in continuous quantities. An application of Afriat's Theorem then guarantees that there is a strictly increasing utility function on the discrete consumption space that rationalizes price and demand observations in that space.
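For concreteness, the sketch below implements a standard GARP check on price and quantity data via the direct revealed-preference relation and its transitive closure (Warshall's algorithm); it illustrates GARP itself rather than the paper's cost-efficiency argument for discrete consumption spaces, and the two-good example datasets are invented.

```python
import numpy as np

def satisfies_garp(prices, quantities):
    """Check the Generalized Axiom of Revealed Preference for T observations of
    prices and chosen bundles (both given as T x K arrays)."""
    prices = np.asarray(prices, dtype=float)
    quantities = np.asarray(quantities, dtype=float)
    cost = prices @ quantities.T          # cost[t, s] = value of bundle s at the prices of observation t
    own = np.diag(cost)                   # own[t]    = expenditure actually incurred at observation t
    R = cost <= own[:, None]              # R[t, s]: bundle t is directly revealed preferred to bundle s
    strict = cost < own[:, None]          # strict[t, s]: strictly directly revealed preferred
    # Transitive closure of R via Warshall's algorithm.
    Rstar = R.copy()
    T = len(own)
    for k in range(T):
        Rstar |= Rstar[:, [k]] & Rstar[[k], :]
    # GARP fails if t is (indirectly) revealed preferred to s while s is strictly
    # directly revealed preferred to t.
    return not np.any(Rstar & strict.T)

# Two invented two-good examples: the first is consistent with GARP, the second is not.
p = np.array([[1.0, 1.0], [1.0, 2.0]])
print(satisfies_garp(p, np.array([[3.0, 1.0], [2.0, 0.0]])))   # True
print(satisfies_garp(p, np.array([[3.0, 1.0], [1.0, 3.0]])))   # False
```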

]]>
http://www.ifs.org.uk/publications/6043 Wed, 07 Mar 2012 00:00:00 +0000
<![CDATA[Goods versus characteristics: dimension reduction and revealed preference]]> This paper compares the goods and characteristics models of the consumer within a non-parametric revealed preference framework. Of primary interest is to make a comparison on the basis of predictive success that takes into account dimension reduction. This allows us to nonparametrically identify the model which best fits the data. We implement these procedures on household panel data from the UK milk market. The primary result is that the better fit of the characteristics model is entirely attributable to dimension reduction.

]]>
http://www.ifs.org.uk/publications/6038 Thu, 01 Mar 2012 00:00:00 +0000
<![CDATA[Measuring living standards with income and consumption: Evidence from the UK]]> http://www.ifs.org.uk/publications/6256 Thu, 01 Mar 2012 00:00:00 +0000 <![CDATA[How might in-home scanner technology be used in budget surveys?]]> This paper considers what role in-home barcode scanner data could play in collecting household expenditure information as part of national budget surveys. One role is as a source of validation. We make detailed micro-level comparisons of food and drink expenditures in two British datasets: the Living Costs and Food Survey (the main budget survey) and Kantar Worldpanel scanner data. We find that levels of spending are significantly lower in scanner data. A large part (but not all) of the gap is explained by weeks in which no spending at all is recorded in scanner data. Demographic differences between the surveys accentuate rather than close the gap. We also demonstrate that patterns of expenditure across the surveys are much more similar, as are Engel curves relating food commodity budget shares to total food expenditures. A key finding is that the period over which we observe households in the scanner data significantly alters the distribution, but not the average, of weekly food expenditures and budget shares, which has important implications for whether two-week spending diaries common to budget surveys are giving a truly accurate reflection of a household's typical spending patterns. A second, more involved use of scanner data would be to impute detailed commodity-level expenditure patterns given only information on total expenditures, as a way of reducing respondent burden in budget surveys. We find that observable demographics in the scanner data explain little of the variation in store-specific expenditure patterns, and so caution against relying too heavily on imputation.

]]>
http://www.ifs.org.uk/publications/6035 Mon, 27 Feb 2012 00:00:00 +0000
<![CDATA[Sieve inference on semi-nonparametric time series models]]> The method of sieves has been widely used in estimating semiparametric and nonparametric models. In this paper, we first provide a general theory on the asymptotic normality of plug-in sieve M estimators of possibly irregular functionals of semi/nonparametric time series models. Next, we establish a surprising result that the asymptotic variances of plug-in sieve M estimators of irregular (i.e., slower than root-T estimable) functionals do not depend on temporal dependence. Nevertheless, ignoring the temporal dependence in small samples may not lead to accurate inference. We then propose an easy-to-compute and more accurate inference procedure based on a "pre-asymptotic" sieve variance estimator that captures temporal dependence. We construct a "pre-asymptotic" Wald statistic using an orthonormal series long run variance (OS-LRV) estimator. For sieve M estimators of both regular (i.e., root-T estimable) and irregular functionals, a scaled "pre-asymptotic" Wald statistic is asymptotically F distributed when the series number of terms in the OS-LRV estimator is held fixed. Simulations indicate that our scaled "pre-asymptotic" Wald test with F critical values has more accurate size in finite samples than the usual Wald test with chi-square critical values.
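The orthonormal series long run variance (OS-LRV) estimator underlying the "pre-asymptotic" Wald statistic can be sketched as follows for a scalar series. The cosine basis, the value of K and the AR(1) illustration are assumptions made here for concreteness, not the specific choices analysed in the paper; with K held fixed, a Wald statistic built on such an estimator is compared against F critical values, as described above.

```python
import numpy as np

def os_lrv(u, K=12):
    """Orthonormal-series long-run variance estimator for a scalar series, using the
    cosine basis phi_j(r) = sqrt(2) * cos(pi * j * r) evaluated at r = (t - 1/2) / T."""
    T = len(u)
    u = u - u.mean()
    r = (np.arange(1, T + 1) - 0.5) / T
    lam = np.array([np.sqrt(2.0 / T) * np.sum(np.cos(np.pi * j * r) * u) for j in range(1, K + 1)])
    return np.mean(lam ** 2)             # average of K squared basis projections

# Illustration: AR(1) errors, whose true long-run variance is sigma^2 / (1 - rho)^2.
rng = np.random.default_rng(0)
T, rho, sigma = 2000, 0.5, 1.0
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + sigma * rng.standard_normal()
print("OS-LRV estimate:", round(os_lrv(u), 2), " true long-run variance:", sigma**2 / (1 - rho)**2)
```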

]]>
http://www.ifs.org.uk/publications/6033 Fri, 24 Feb 2012 00:00:00 +0000
<![CDATA[Inference on counterfactual distributions]]> We develop inference procedures for policy analysis based on regression methods. We consider policy interventions that correspond to either changes in the distribution of covariates, or changes in the conditional distribution of the outcome given covariates, or both. Under either of these policy scenarios, we derive functional central limit theorems for regression-based estimators of the status quo and counterfactual marginal distributions. This result allows us to construct simultaneous confidence sets for function-valued policy effects, including the effects on the marginal distribution function, quantile function, and other related functionals. We use these confidence sets to test functional hypotheses such as no-effect, positive effect, or stochastic dominance. Our theory applies to general policy interventions and covers the main regression methods including classical, quantile, duration, and distribution regressions. We illustrate the results with an empirical application on wage decompositions using data for the United States. Of independent interest is the use of distribution regression as a tool for modeling the entire conditional distribution, encompassing duration/transformation regression, and representing an alternative to quantile regression.
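Distribution regression, used above as one workhorse for modeling the conditional distribution, amounts to fitting a binary-response model for the event {Y ≤ y} at each threshold y on a grid. A minimal logit-based sketch with invented data follows; the running-maximum monotonization is a crude illustrative fix, and counterfactual distributions would then be obtained by integrating the fitted conditional distribution over an alternative covariate distribution.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
y = 1.0 + 0.5 * x + (0.5 + 0.3 * np.abs(x)) * rng.standard_normal(n)   # heteroscedastic outcome

X = sm.add_constant(x)
y_grid = np.quantile(y, np.linspace(0.05, 0.95, 19))                   # thresholds

def conditional_cdf(x0):
    """Distribution regression: a logit of 1{Y <= y} on X at every threshold y,
    evaluated at the covariate value x0."""
    probs = []
    for y_cut in y_grid:
        fit = sm.Logit((y <= y_cut).astype(float), X).fit(disp=0)
        z = fit.params[0] + fit.params[1] * x0
        probs.append(1.0 / (1.0 + np.exp(-z)))
    return np.maximum.accumulate(np.array(probs))   # enforce monotonicity in y

print("estimated F(y | x = 1) on the threshold grid:")
print(np.round(conditional_cdf(1.0), 2))
```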

This is a revision of CWP09/09.

]]>
http://www.ifs.org.uk/publications/6020 Mon, 13 Feb 2012 00:00:00 +0000
<![CDATA[Inference on sets in finance]]> In this paper we introduce various set inference problems as they appear in finance and propose practical and powerful inferential tools. Our tools will be applicable to any problem where the set of interest solves a system of smooth estimable inequalities, though we will particularly focus on the following two problems: the admissible mean-variance sets of stochastic discount factors and the admissible mean-variance sets of asset portfolios. We propose to make inference on such sets using weighted likelihood-ratio and Wald type statistics, building upon and substantially enriching the available methods for inference on sets.

]]>
http://www.ifs.org.uk/publications/6019 Sun, 05 Feb 2012 00:00:00 +0000
<![CDATA[The Making of Modern America: Migratory Flows in the Age of Mass Migration]]> We provide new estimates of migrant flows into and out of America during the Age of Mass Migration at the turn of the twentieth century. Our analysis is based on a novel data set of administrative records covering the universe of 24 million migrants who entered Ellis Island, New York between 1892 and 1924. We use these records to measure inflows into New York, and then scale-up these figures to estimate migrant inflows into America as a whole. Combining these flow estimates with census data on the stock of foreign-born in America in 1900, 1910 and 1920, we conduct a demographic accounting exercise to estimate out-migration rates in aggregate and for each nationality-age-gender cohort. The accounting exercise overturns common wisdom on two fronts. First, we estimate flows into the US to be 20% and 170% higher than stated in official statistics for the 1900-10 and 1910-20 decades, respectively. Second, we estimate the rate of out-migration from the US to be 76% during 1900-10 and close to 100% during 1910-20. These figures are between two and three times larger than official estimates for each decade. That migration was effectively a two-way flow between the US and the sending countries has major implications for understanding the potential selection of immigrants that chose to permanently reside in the US, their impact on Americans in labor markets, and institutional change in America and sending countries.

]]>
http://www.ifs.org.uk/publications/6124 Wed, 01 Feb 2012 00:00:00 +0000
<![CDATA[Blissful Ignorance? A Natural Experiment on The Effect of Feedback on Students' Performance]]> We present a theoretical framework and empirical strategy to measure whether and how providing university students with feedback on their own past exam performance affects their future exam performance. Our identification strategy exploits a natural experiment in a leading UK university where different departments have historically different rules on the provision of feedback to their students. Our theoretical framework makes precise that if feedback provides students with a signal of their marginal return to effort in generating test scores, then the effect of feedback depends on the balance of standard substitution and income effects, and on whether students over- or under-estimate the return to their effort. Empirically, we find the provision of feedback has a positive effect on students’ subsequent test scores: the mean impact corresponds to 13% of a standard deviation in test scores. The impact of feedback is stronger for more able students and for students who have less information to start with, while no students appear to be discouraged by feedback. Our findings add to a growing literature on feedback in organizations more generally, and specifically in this setting the results suggest that the provision of feedback might be a cost-effective means to increase students’ exam performance.

]]>
http://www.ifs.org.uk/publications/6123 Wed, 01 Feb 2012 00:00:00 +0000
<![CDATA[Aggregation without the aggravation? Nonparametric analysis of the representative consumer]]> In the tradition of Afriat (1967), Diewert (1973) and Varian (1982), we provide a revealed preference characterisation of the representative consumer. Our results are simple and complement those of Gorman (1953, 1961), Samuelson (1956) and others. They can also be applied to data very readily and without the need for auxiliary parametric or statistical assumptions. We investigate the application of our characterisation by means of a balanced microdata panel survey. Our findings provide robust evidence against the existence of a representative consumer for our data.

]]>
http://www.ifs.org.uk/publications/5997 Wed, 25 Jan 2012 00:00:00 +0000
<![CDATA[The age-period cohort problem: set identification and point identification]]> "Only entropy comes easily" - Anton Chekhov

Various methods have been used to overcome the point identification problem inherent in the linear age-period-cohort model. This paper presents a set-identification result for the model and then considers the use of the maximum-entropy principle as a vehicle for achieving point identification. We present two substantive applications (US female mortality data and UK female labor force participation) and compare the results from our approach to some of the solutions in the literature.
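The source of the point-identification problem is easy to see numerically: because cohort = period − age, the three linear effects are exactly collinear, so the design matrix of the linear age-period-cohort model is rank deficient and the data identify the linear coefficients only up to shifts along a null direction. The sketch below illustrates this; it does not implement the maximum-entropy selection rule proposed in the paper, and the age-period grid is invented.

```python
import numpy as np

# Age-period grid; the cohort of each cell is determined by cohort = period - age.
ages = np.arange(20, 60)
periods = np.arange(1980, 2020)
A, P = np.meshgrid(ages, periods, indexing="ij")
C = P - A

# Design matrix with an intercept and linear age, period and cohort effects.
X = np.column_stack([np.ones(A.size), A.ravel(), P.ravel(), C.ravel()])
print("columns:", X.shape[1], " rank:", np.linalg.matrix_rank(X))    # rank 3: exact collinearity

# Null direction: loadings (0, 1, -1, 1) on (intercept, age, period, cohort),
# since age - period + cohort = 0 in every cell; shifts along it leave the fit unchanged.
null_direction = np.array([0.0, 1.0, -1.0, 1.0])
print("max |X @ null_direction| =", np.abs(X @ null_direction).max())
```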

]]>
http://www.ifs.org.uk/publications/5996 Mon, 23 Jan 2012 00:00:00 +0000
<![CDATA[Long term impacts of compensatory preschool on health and behavior: evidence from Head Start]]> This paper provides new estimates of the medium and long-term impacts of Head Start on the health and behavioral problems of its participants. We identify these impacts using discontinuities in the probability of participation induced by program eligibility rules. Our strategy allows us to identify the effect of Head Start for the set of individuals in the neighborhoods of multiple discontinuities, which vary with family size, state and year (as opposed to a smaller set of individuals neighboring a single discontinuity). Participation in the program reduces the incidence of behavioral problems, serious health problems and obesity of male children at ages 12 and 13. It also lowers depression and obesity among adolescents, and reduces engagement in criminal activities for young adults.

]]>
http://www.ifs.org.uk/publications/5995 Sun, 22 Jan 2012 00:00:00 +0000
<![CDATA[Policing Cannabis and Drug Related Hospital Admissions: Evidence from Administrative Records]]> We evaluate the impact on hospital admissions related to illicit drug use, caused by a policing experiment that depenalized the possession of small quantities of cannabis in the London borough of Lambeth. We exploit administrative records on individual hospital admissions with ICD-10 diagnosis classifications. We use these records to construct a panel data set by London borough and quarter from 1997 to 2009 to estimate the short and long run impacts of the depenalization policy unilaterally introduced into Lambeth between 2001 and 2002. We find the depenalization of cannabis had significant longer term impacts on hospital admissions related to the use of hard drugs. Among Lambeth residents, the impacts are concentrated among men, and are proportionately larger in younger cohorts, and among those with prior histories of hospitalization related to drug or alcohol use. The magnitudes of the impacts are large, corresponding to between 33% and 64% of baseline admission rates across age cohorts. The dynamic impacts across cohorts vary in profile with some cohorts experiencing hospitalization rates remaining above pre-intervention levels six years after the depenalization of cannabis was first introduced. We find evidence of positive spillover effects in hospitalization rates related to hard drugs among those resident in boroughs neighboring Lambeth, and these are concentrated among cohorts without prior histories of hospitalizations related to the use of illicit drugs or alcohol. Finally, the severity of hospital admissions, as measured by the length of hospital stays, significantly increases for admissions related to the use of hard drugs and cannabis. Overall, our results suggest policing strategies related to the cannabis market have significant, nuanced and lasting impacts on public health.

]]>
http://www.ifs.org.uk/publications/6121 Sun, 01 Jan 2012 00:00:00 +0000
<![CDATA[Estimation of treatment effects with high-dimensional controls]]> We propose methods for inference on the average effect of a treatment on a scalar outcome in the presence of very many controls. Our setting is a partially linear regression model containing the treatment/policy variable and a large number p of controls or series terms, with p possibly much larger than the sample size n, but where only s << n unknown controls or series terms are needed to approximate the regression function accurately. The latter sparsity condition makes it possible to estimate the entire regression function, as well as the average treatment effect, by selecting approximately the right set of controls using Lasso and related methods. We develop estimation and inference methods for the average treatment effect in this setting, proposing a novel "post double selection" method that provides attractive inferential and estimation properties. In our analysis, in order to cover realistic applications, we expressly allow for imperfect selection of the controls and account for the impact of selection errors on estimation and inference. In order to cover typical applications in economics, we employ selection methods designed to deal with non-Gaussian and heteroscedastic disturbances. We illustrate the use of the new methods with numerical simulations and an application to the effect of abortion on crime rates.
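A minimal sketch of the "post double selection" idea on simulated data, assuming scikit-learn and statsmodels are available; the variable names, data-generating process and use of cross-validated Lasso are illustrative choices, not the paper's specification. Controls that predict the outcome and controls that predict the treatment are selected separately, and the treatment effect is then estimated by OLS on the treatment plus the union of the two selected sets.

import numpy as np
from sklearn.linear_model import LassoCV
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 500, 200
X = rng.standard_normal((n, p))                              # many potential controls
d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)         # treatment
y = 1.0 * d + X[:, 0] - X[:, 2] + rng.standard_normal(n)     # outcome, true effect = 1

sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)        # controls predicting y
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)        # controls predicting d
union = np.union1d(sel_y, sel_d)                             # take the union of both sets

Z = sm.add_constant(np.column_stack([d, X[:, union]]))       # treatment plus selected controls
fit = sm.OLS(y, Z).fit(cov_type="HC1")                       # heteroscedasticity-robust OLS
print("estimated treatment effect:", fit.params[1])          # column 1 is the treatment d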

]]>
http://www.ifs.org.uk/publications/5985 Sat, 31 Dec 2011 00:00:00 +0000
<![CDATA[Inference for high-dimensional sparse econometric models]]> This article is about estimation and inference methods for high dimensional sparse (HDS) regression models in econometrics. High dimensional sparse models arise in situations where many regressors (or series terms) are available and the regression function is well-approximated by a parsimonious, yet unknown set of regressors. The latter condition makes it possible to estimate the entire regression function effectively by searching for approximately the right set of regressors. We discuss methods for identifying this set of regressors and estimating their coefficients based on l1-penalization and describe key theoretical results. In order to capture realistic practical situations, we expressly allow for imperfect selection of regressors and study the impact of this imperfect selection on estimation and inference results. We focus the main part of the article on the use of HDS models and methods in the instrumental variables model and the partially linear model. We present a set of novel inference results for these models and illustrate their use with applications to returns to schooling and growth regression.

]]>
http://www.ifs.org.uk/publications/5983 Fri, 30 Dec 2011 00:00:00 +0000
<![CDATA[Inference for extremal conditional quantile models, with an application to market and birthweight risks]]> Quantile regression is an increasingly important empirical tool in economics and other sciences for analyzing the impact of a set of regressors on the conditional distribution of an outcome. Extremal quantile regression, or quantile regression applied to the tails, is of interest in many economic and financial applications, such as conditional value-at-risk, production efficiency, and adjustment bands in (S,s) models. In this paper we provide feasible inference tools for extremal conditional quantile models that rely upon extreme value approximations to the distribution of self-normalized quantile regression statistics. The methods are simple to implement and can be of independent interest even in the non-regression case. We illustrate the results with two empirical examples analyzing extreme fluctuations of a stock return and extremely low percentiles of live infants' birthweights in the range between 250 and 1500 grams.
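A small illustration (ours, on simulated heavy-tailed data) of fitting quantile regressions far in the tails with statsmodels; the paper's contribution, extreme-value-based inference for such tail fits, is not reproduced here.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({"x": rng.uniform(0, 1, n)})
df["y"] = 1 + 2 * df["x"] + rng.standard_t(df=3, size=n)     # heavy-tailed errors

for tau in (0.01, 0.05, 0.5):                                # extremal vs central quantiles
    res = smf.quantreg("y ~ x", df).fit(q=tau)
    print(f"tau={tau}: intercept={res.params['Intercept']:.2f}, slope={res.params['x']:.2f}")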

]]>
http://www.ifs.org.uk/publications/5982 Tue, 27 Dec 2011 00:00:00 +0000
<![CDATA[An instrumental variable model of multiple discrete choice]]> This paper studies identification of latent utility functions in multiple discrete choice models in which there may be endogenous explanatory variables, that is, explanatory variables that are not restricted to be distributed independently of the unobserved determinants of latent utilities. The model does not employ large support, special regressor or control function restrictions; indeed, it is silent about the process delivering values of endogenous explanatory variables and in this respect it is incomplete. Instead the model employs instrumental variable restrictions requiring the existence of instrumental variables which are excluded from latent utilities and distributed independently of the unobserved components of utilities.

We show that the model delivers set identification of the latent utility functions and we characterize sharp bounds on those functions. We develop easy-to-compute outer regions which in parametric models require little more calculation than what is involved in a conventional maximum likelihood analysis. The results are illustrated using a model which is essentially the parametric conditional logit model of McFadden (1974) but with potentially endogenous explanatory variables and instrumental variable restrictions. The method employed has wide applicability and for the first time brings instrumental variable methods to bear on structural models in which there are multiple unobservables in a structural equation.

]]>
http://www.ifs.org.uk/publications/5977 Thu, 22 Dec 2011 00:00:00 +0000
<![CDATA[Identification, data combination and the risk of disclosure]]> Businesses routinely rely on econometric models to analyze and predict consumer behavior. Estimation of such models may require combining a firm's internal data with external datasets to take into account sample selection, missing observations, omitted variables and errors in measurement within the existing data source. In this paper we point out that these data problems can be addressed when estimating econometric models from combined data using data mining techniques, under mild assumptions regarding the data distribution. However, data combination leads to serious threats to the security of consumer data: we demonstrate that point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. Consequently, if a consumer model is point identified, the firm would (implicitly or explicitly) reveal the identity of at least some of the consumers in its internal data. More importantly, we provide an argument that unless the firm places a restriction on the individual disclosure risk when combining data, even if the raw combined dataset is not shared with a third party, an adversary or a competitor can gather confidential information regarding some individuals from the estimated model.

]]>
http://www.ifs.org.uk/publications/5975 Tue, 20 Dec 2011 00:00:00 +0000
<![CDATA[Analysis of interactive fixed effects dynamic linear panel regression with measurement error]]> This paper studies a simple dynamic panel linear regression model with interactive fixed effects in which the variable of interest is measured with error. To estimate the dynamic coefficient, we consider the least-squares minimum distance (LS-MD) estimation method.

]]>
http://www.ifs.org.uk/publications/5974 Thu, 15 Dec 2011 00:00:00 +0000
<![CDATA[Livestock for the poor: under what conditions?]]> This study evaluates an intervention in the dairy subsector by an Indian livelihood promotion institution and conducts a detailed analysis of the main cost and benefit factors of the activity. Two rounds of data are available, which allows for the comparison of impacts and of costs and benefits under different circumstances - a relatively good year as well as one officially declared a drought period. Results suggest that the programme is beneficial but that impacts cannot be sustained under the macro shock. Looking at the main cost factors reveals that fodder availability was a major problem. The results help to suggest an improved programme design.

]]>
http://www.ifs.org.uk/publications/5953 Mon, 12 Dec 2011 00:00:00 +0000
<![CDATA[Group lending or individual lending? Evidence from a randomised field experiment in Mongolia]]> Although microfinance institutions across the world are moving from group lending towards individual lending, this strategic shift is not substantiated by sufficient empirical evidence on the impact of both types of lending on borrowers. We present such evidence from a randomised field experiment in rural Mongolia. We find a positive impact of access to group loans on food consumption and entrepreneurship. Among households that were offered group loans the likelihood of owning an enterprise increases by ten per cent more than in control villages. Enterprise profits increase over time as well, particularly for the less-educated. For individual lending on the other hand, we detect no significant increase in consumption or enterprise ownership. These results are in line with theories that stress the disciplining effect of group lending: joint liability may deter borrowers from using loans for non-investment purposes. Our results on informal transfers are consistent with this hypothesis. Borrowers in group-lending villages are less likely to make informal transfers to families and friends while borrowers in individual-lending villages are more likely to do so. We find no significant difference in repayment rates between the two lending programs, neither of which entailed weekly repayment meetings.

]]>
http://www.ifs.org.uk/publications/5952 Sun, 11 Dec 2011 00:00:00 +0000
<![CDATA[Average and marginal returns to upper secondary schooling in Indonesia]]> This paper estimates average and marginal returns to schooling in Indonesia using a non-parametric selection model. Identification of the model is given by exogenous geographic variation in access to upper secondary schools. We find that the return to upper secondary schooling varies widely across individuals: it can be as high as 50 percent per year of schooling for those very likely to enroll in upper secondary schooling, or as low as -10 percent for those very unlikely to do so. Average returns for the student at the margin are well below those for the average student attending upper secondary schooling.

]]>
http://www.ifs.org.uk/publications/5956 Sun, 20 Nov 2011 00:00:00 +0000
<![CDATA[Estimating production functions with robustness against errors in the proxy variables ]]> This paper proposes a new semi-nonparametric maximum likelihood estimation method for estimating production functions. The method extends the literature on structural estimation of production functions, started by the seminal work of Olley and Pakes (1996), by relaxing the scalar-unobservable assumption about the proxy variables. The key additional assumption needed in the identification argument is the existence of two conditionally independent proxy variables. The assumption seems reasonable in many important cases. The new method is straightforward to apply, and a consistent estimate of the asymptotic covariance matrix of the structural parameters can be easily computed.

]]>
http://www.ifs.org.uk/publications/5955 Tue, 15 Nov 2011 00:00:00 +0000
<![CDATA[Intersection bounds: estimation and inference]]> We develop a practical and novel method for inference on intersection bounds, namely bounds defined by either the infimum or supremum of a parametric or nonparametric function, or equivalently, the value of a linear programming problem with a potentially infinite constraint set. Our approach is especially convenient for models comprised of a continuum of inequalities that are separable in parameters, and also applies to models with inequalities that are non-separable in parameters. Since analog estimators for intersection bounds can be severely biased in finite samples, routinely underestimating the size of the identified set, we also offer a median-bias-corrected estimator of such bounds as a natural by-product of our inferential procedures. We develop theory for large sample inference based on the strong approximation of a sequence of series or kernel-based empirical processes by a sequence of "penultimate" Gaussian processes. These penultimate processes are generally not weakly convergent, and thus non-Donsker. Our theoretical results establish that we can nonetheless perform asymptotically valid inference based on these processes. Our construction also provides new adaptive inequality/moment selection methods. We provide conditions for the use of nonparametric kernel and series estimators, including a novel result that establishes strong approximation for any general series estimator admitting linearization, which may be of independent interest.

]]>
http://www.ifs.org.uk/publications/5743 Fri, 04 Nov 2011 00:00:00 +0000
<![CDATA[Global Bahadur representation for nonparametric censored regression quantiles and its applications]]> This paper is concerned with the nonparametric estimation of regression quantiles where the response variable is randomly censored. Using results on the strong uniform convergence of U-processes, we derive a global Bahadur representation for the weighted local polynomial estimators, which is sufficiently accurate for many further theoretical analyses including inference. We consider two applications in detail: estimation of the average derivative, and estimation of the component functions in additive quantile regression models.

]]>
http://www.ifs.org.uk/publications/5742 Thu, 03 Nov 2011 00:00:00 +0000
<![CDATA[Empirical analysis of countervailing power in business-to-business bargaining]]> This paper provides a comprehensive econometric framework for the empirical analysis of countervailing power. It encompasses the two main features of pricing schemes in business-to-business relationships: nonlinear price schedules and bargaining over rents. Disentangling them is critical to the empirical identification of countervailing power. Testable predictions from the theoretical analysis for a pragmatic reduced form empirical pricing model are delineated. This model is readily implementable on the basis of transaction data, routinely collected by antitrust authorities and illustrated using data from the UK brick industry. The paper emphasizes the importance of controlling for endogeneity of volumes and established supply chains and for heterogeneity across buyers and sellers due to intrinsically unobservable outside options.

]]>
http://www.ifs.org.uk/publications/5741 Tue, 01 Nov 2011 00:00:00 +0000
<![CDATA[Individual notions of distributive justice and relative economic status]]> Issues of inequality, distribution and redistribution are commanding progressively more attention in the minds of not only world leaders, politicians, and academics but also of ordinary people. So, what constitutes distributive justice in the minds of ordinary people? The philosophical literature offers several alternative principles of distributive justice. But which of these, if any, do ordinary people adopt as the principle against which to judge their own and other people's and entities' outcomes and actions?

This paper presents the findings from two experiments designed to test the hypothesis that individuals' notions of distributive justice are associated with their economic status relative to others within their own society. In the experiments, each participant played a specially designed distribution game. This game allowed us to establish whether and to what extent the participants perceived inequalities owing to differences in productivity rather than luck as just and, hence, not in need of redress. A participant who distinguished between inequalities owing to productivity and luck, redressing the latter but not the former, or redressing the former to a lesser extent, is said to be subject to an earned endowment effect. Drawing on previous work in both economics and psychology, we hypothesised that the richer members of any society would be more likely to be subject to an earned endowment effect, while the poorer members would be more inclined towards redistribution irrespective of whether the inequality was owing to productivity or luck.

We conducted our first experiment in the UK. We selected unemployed residents of one city to represent low economic status individuals and student and employed residents of the same city to represent relatively high economic status individuals. We found a statistically significant earned endowment effect among the students and employed and no effect among the unemployed. The difference between the unemployed and the others was also statistically significant.

Our second experiment was designed to test the generalizability of the findings from our first. It was conducted in Cape Town, South Africa. Exploiting the fact that Cape Town is home to one of the continent's best universities, we built a participant sample that was highly comparable to the UK sample in many regards. However, the states of employment and unemployment are less distinct in South Africa than in the UK, and a number of interventions are in place to ensure that the student body of the University of Cape Town includes young people from not only rich and middle-income but also poorer households. So, in South Africa we chose to rely on responses to a survey question to distinguish between high and low economic status individuals. The findings from this second experiment also supported the hypothesis: among individuals who classified their households as rich, high income or middle income there was a statistically significant earned endowment effect; among individuals who classified their households as poor or low income there was not; and the difference between the two participant types was significant.

We conclude that individuals' notions of distributive justice are associated with their relative economic status within society and that this is a generalizable result.

]]>
http://www.ifs.org.uk/publications/5735 Mon, 31 Oct 2011 00:00:00 +0000
<![CDATA[Household consumption through recent recessions]]> This paper examines trends in household consumption and saving behaviour in each of the last three recessions in the UK. We identify several dimensions along which the most recent recession (the so-called 'Great Recession') has been different from those that occurred in the 1980s and 1990s. These include its depth and length as well as the composition of the cutbacks in expenditure - with a greater reliance on cuts to nondurable expenditure than was seen in previous recessions. We show that, both inside and outside recessions, the extent to which the growth in durable purchases is more volatile than growth in nondurable purchases has declined over the past 15 years. Finally, we present evidence that suggests that two aspects of fiscal policy in the UK in 2008 and 2009 - the temporary reduction in the rate of VAT and a car scrappage scheme - had some success in encouraging households to bring forward some durable purchases.

]]>
http://www.ifs.org.uk/publications/5715 Wed, 19 Oct 2011 00:00:00 +0000
<![CDATA[Semiparametric structural models of binary response: shape restrictions and partial identification]]>

The paper studies the partial identifying power of structural single equation threshold crossing models for binary responses when explanatory variables may be endogenous. The paper derives the sharp identified set of threshold functions for the case in which explanatory variables are discrete and provides a constructive proof of sharpness. There is special attention to a widely employed semiparametric shape restriction which requires the threshold crossing function to be a monotone function of a linear index involving the observable explanatory variables. It is shown that the restriction brings great computational benefits, allowing direct calculation of the identified set of index coefficients without calculating the nonparametrically specified threshold function. With the restriction in place the methods of the paper can be applied to produce identified sets in a class of binary response models with mis-measured explanatory variables.

This is a further revised version (Oct 7th 2011) of CWP23/09 "Single equation endogenous binary response models"

]]>
http://www.ifs.org.uk/publications/5706 Sat, 01 Oct 2011 00:00:00 +0000
<![CDATA[Semiparametric selection models with binary outcomes]]> This paper addresses the estimation of a semiparametric sample selection index model where both the selection rule and the outcome variable are binary. Since the marginal effects are often of primary interest and are difficult to recover in a semiparametric setting, we develop estimators for both the marginal effects and the underlying model parameters. The marginal effect estimator only uses observations which are members of a high probability set in which the selection problem is not present. A key innovation is that this high probability set is data dependent. The model parameter estimator is a quasi-likelihood estimator based on regular kernels with bias corrections. We establish their large sample properties and provide simulation evidence confirming that these estimators perform well in finite samples.

]]>
http://www.ifs.org.uk/publications/5705 Fri, 23 Sep 2011 00:00:00 +0000
<![CDATA[Enforcement of labor regulation and informality]]> Enforcement of labor regulations in the formal sector may drive workers to informality because it increases the costs of formal labor. But better compliance with mandated benefits makes it attractive to be a formal employee. We show that, in locations with frequent inspections, workers pay for mandated benefits by receiving lower wages. Wage rigidity prevents downward adjustment at the bottom of the wage distribution. As a result, lower-paid formal sector jobs become attractive to some informal workers, inducing them to want to move to the formal sector.

]]>
http://www.ifs.org.uk/publications/5704 Tue, 20 Sep 2011 00:00:00 +0000
<![CDATA[The impact of tuition fees and support on university participation in the UK]]> Understanding how policy can affect university participation is important for understanding how governments can promote human capital accumulation. In this paper, we estimate the separate impacts of tuition fees and maintenance grants on the decision to enter university in the UK. We use Labour Force Survey data covering 1992-2007, a period of important variation in higher education finance, which saw the introduction of up-front tuition fees and the abolition of maintenance grants in 1998, followed some eight years later by a shift to higher deferred fees and the reinstatement of maintenance grants. We create a pseudo-panel of university participation of cohorts defined by sex, region of residence and family background, and estimate a number of different specifications on these aggregated data. Our findings show that tuition fees have had a significant negative effect on participation, with a £1,000 increase in fees resulting in a decrease in participation of 3.9 percentage points, which equates to an elasticity of -0.14. Non-repayable support in the form of maintenance grants has had a positive effect on participation, with a £1,000 increase in grants resulting in a 2.6 percentage point increase in participation, which equates to an elasticity of 0.18. These findings are comparable to, but of a slightly lower magnitude than, those in the related US literature.

]]>
http://www.ifs.org.uk/publications/5648 Mon, 05 Sep 2011 00:00:00 +0000
<![CDATA[On-the-Job Search and Precautionary Savings: Theory and Empirics of Earnings and Wealth Inequality]]> I develop and estimate a model of the labour market in which precautionary saving interacts with labour market frictions to produce substantial inequality in wealth among ex ante identical workers. I show that a model of on-the-job search, in which workers are risk averse and markets are incomplete, provides a direct and intuitive link between the empirical earnings and wealth distributions. The mechanism that generates the high degree of wealth inequality in the model is the dynamic of the "wage ladder" resulting from the search process. There is an important asymmetry between the incremental wage increases generated by on-the-job search (climbing the ladder) and the drop in income associated with job loss (falling off the ladder). The behavior of workers in low-paying jobs is primarily governed by the expectation of wage growth, while the behavior of workers near the top of the distribution is driven by the possibility of job loss.

]]>
http://www.ifs.org.uk/publications/5647 Sun, 04 Sep 2011 00:00:00 +0000
<![CDATA[Innovation in China: the rise of Chinese inventors in the production of knowledge]]> In 2010 China was the world's fourth largest filer of patent applications. This followed a decade of unprecedented increases in investment in skills and Research and Development. If current trends continue China could rank first in the very near future. We provide evidence that the growth in Chinese patenting activity has been accompanied by a growth in Chinese inventors creating technologies that are near to the science base.

Part of the success of China has been to attract the investment of foreign multinationals. This is also true for a number of other Emerging Economies. Europe's largest multinational firms increasingly file patent applications that are based on inventor activities located in emerging economies, often working alongside inventors from the firm's home country.

]]>
http://www.ifs.org.uk/publications/5646 Sun, 04 Sep 2011 00:00:00 +0000
<![CDATA[Insurance and Investment Within Family Networks]]> We study how family networks affect informal insurance and investment in poor villages. We use panel data from the randomized evaluation of PROGRESA in rural Mexico and exploit the information on surnames to identify extended families. Using exogenous income variations, we show that members of an extended family (connected) share risk with each other but not with households without relatives in the village (isolated). In addition, connected households invest more in their children’s human capital when hit by a positive income shock, the PROGRESA transfer, and disinvest less when hit by a negative health shock. Such a higher level of investment is long-lasting, and increases long-term consumption. At the same time connected households achieve almost perfect insurance against idiosyncratic risk. These findings suggest that anti-poverty policies should take into account the familial structure of village economies.

]]>
http://www.ifs.org.uk/publications/6122 Thu, 01 Sep 2011 00:00:00 +0000
<![CDATA[Estimating structural mean models with multiple instrumental variables using the generalised method of moments]]> Instrumental variables analysis using genetic markers as instruments is now a widely used technique in epidemiology and biostatistics. As single markers tend to explain only a small proportion of phenotypical variation, there is increasing interest in using multiple genetic markers to obtain more precise estimates of causal parameters. Structural mean models (SMMs) are semi-parametric models that use instrumental variables to identify causal parameters, but there has been little work on using these models with multiple instruments, particularly for multiplicative and logistic SMMs. In this paper, we show how additive, multiplicative and logistic SMMs with multiple discrete instrumental variables can be estimated efficiently using the generalised method of moments (GMM) estimator, how the Hansen J-test can be used to test for model mis-specification, and how standard GMM software routines can be used to fit SMMs. We further show that multiplicative SMMs, like the additive SMM, identify a weighted average of local causal effects if selection is monotonic. We use these methods to reanalyse a study of the relationship between adiposity and hypertension using SMMs with two genetic markers as instruments for adiposity. We find strong effects of adiposity on hypertension, but no evidence of unobserved confounding.
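A minimal numpy sketch (simulated data, two discrete instruments) of two-step GMM estimation and the Hansen J-test for the additive case: with a single exposure, the additive SMM implies that the instruments are uncorrelated with Y − ψX, i.e. linear IV moment conditions. The names and data-generating process are ours; the paper's multiplicative and logistic SMMs, and its use of standard GMM software routines, are not shown.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 2000
z = rng.binomial(2, 0.3, size=(n, 2)).astype(float)    # two discrete, marker-style instruments
u = rng.standard_normal(n)                              # unobserved confounder
x = z @ np.array([0.5, 0.3]) + u + rng.standard_normal(n)
y = 0.8 * x + u + rng.standard_normal(n)                # true causal effect 0.8

Z = np.column_stack([np.ones(n), z])                    # instruments including a constant
X = np.column_stack([np.ones(n), x])                    # regressors including a constant

# step 1: 2SLS-type estimate with weighting matrix (Z'Z/n)^{-1}
W1 = np.linalg.inv(Z.T @ Z / n)
b1 = np.linalg.solve(X.T @ Z @ W1 @ Z.T @ X, X.T @ Z @ W1 @ Z.T @ y)
# step 2: efficient weighting from first-step residuals
g = Z * (y - X @ b1)[:, None]
W2 = np.linalg.inv(g.T @ g / n)
b2 = np.linalg.solve(X.T @ Z @ W2 @ Z.T @ X, X.T @ Z @ W2 @ Z.T @ y)

gbar = Z.T @ (y - X @ b2) / n                           # sample moments at the GMM estimate
J = n * gbar @ W2 @ gbar                                # Hansen J statistic
df_J = Z.shape[1] - X.shape[1]                          # number of over-identifying restrictions
print("psi_hat:", b2[1], " J:", J, " p-value:", 1 - stats.chi2.cdf(J, df_J))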

]]>
http://www.ifs.org.uk/publications/5663 Tue, 30 Aug 2011 00:00:00 +0000
<![CDATA[The impact of a time-limited, targeted in-work benefit in the medium-term: an evaluation of In Work Credit]]> Conventional in-work benefits or tax credits are now well established as a policy instrument for increasing labour supply and tackling poverty. A different sort of in-work credit is one where the payments are time-limited, conditional on previous receipt of welfare, and, perhaps, not means-tested. Such a design is cheaper, and perhaps better targeted, but potentially less effective. Using administrative data, this paper evaluates one such policy for lone parents in the UK which was piloted in around one third of the country. It finds that the policy did increase flows off welfare and into work, and that these positive effects did not diminish after recipients reached the 12 month time-limit for receiving the supplement. Most of the impact arose by speeding up welfare off-flows: the job retention of programme recipients was good, but this cannot be attributed to the programme itself.

]]>
http://www.ifs.org.uk/publications/5644 Mon, 01 Aug 2011 00:00:00 +0000
<![CDATA[Disability, health and retirement in the United Kingdom]]> This paper examines changes in health and disability related transfers in the UK over the last thirty years, and describes how they are related to changes in labour force participation. The objective is to present a comprehensive description of the reforms to the institutional setting, along with available time series coming from administrative data on benefit receipt, cross-section or panel data on self-reported health and their interactions with labour force status. By providing systematic evidence on institutions and data, we hope to help future research providing a fuller picture of the trends over this period. We also present evidence on the impact of two large reforms to disability benefits in the UK.

]]>
http://www.ifs.org.uk/publications/5643 Wed, 20 Jul 2011 00:00:00 +0000
<![CDATA[Child mental health and educational attainment: multiple observers and the measurement error problem]]> We examine the effect of survey measurement error on the empirical relationship between child mental health and personal and family characteristics, and between child mental health and educational progress. Our contribution is to use unique UK survey data that contains (potentially biased) assessments of each child's mental state from three observers (parent, teacher and child), together with expert (quasi-) diagnoses, using an assumption of optimal diagnostic behaviour to adjust for reporting bias. We use three alternative restrictions to identify the effect of mental disorders on educational progress. Maternal education and mental health, family income, and major adverse life events, are all significant in explaining child mental health, and child mental health is found to have a large influence on educational progress. Our preferred estimate is that a 1-standard deviation reduction in 'true' latent child mental health leads to a 2-5 months loss in educational progress. We also find a strong tendency for observers to understate the problems of older children and adolescents compared to expert diagnosis.

]]>
http://www.ifs.org.uk/publications/5634 Wed, 20 Jul 2011 00:00:00 +0000
<![CDATA[Tests for neglected heterogeneity in moment condition models]]> The central concern of the paper is with the formulation of tests of neglected parameter heterogeneity appropriate for model environments specified by a number of unconditional or conditional moment conditions. We initially consider the unconditional moment restrictions framework. Optimal m-tests against moment condition parameter heterogeneity are derived with the relevant Jacobian matrix obtained as the second order derivative of the moment indicator in a leading case. GMM and GEL tests of specification based on generalized information matrix equalities appropriate for moment-based models are described and their relation to the optimal m-tests against moment condition parameter heterogeneity examined. A fundamental and important difference is noted between GMM and GEL constructions. The paper is concluded by a generalization of these tests to the conditional moment context.

]]>
http://www.ifs.org.uk/publications/5629 Wed, 13 Jul 2011 00:00:00 +0000
<![CDATA[The effect of education policy on crime: an intergenerational perspective]]> The Swedish comprehensive school reform extended compulsory schooling from 7 or 8 years to 9 for the entire nation and was implemented as a social experiment by municipality between 1949 and 1962. A previous study (Meghir and Palme, 2005) has shown that this reform significantly increased the number of years of schooling as well as the labor earnings of the children who went through the post-reform school system, in particular for individuals originating from homes with low-educated fathers. This study estimates the impact of the reform on criminal behavior, both within the generation directly affected by the reform and among their children. We use census data on everyone born in Sweden between 1945 and 1955 and all their children, merged with individual register data on all convictions between 1981 and 2008. We find a significant inverse effect of the reform on the criminal behavior of the men who went through the new school system and of the sons of fathers who did.

]]>
http://www.ifs.org.uk/publications/5642 Fri, 01 Jul 2011 00:00:00 +0000
<![CDATA[Nonparametric identification using instrumental variables: sufficient conditions for completeness]]> This paper provides sufficient conditions for the nonparametric identification of the regression function m(.) in a regression model with an endogenous regressor x and an instrumental variable z. It has been shown that identification of the regression function from the conditional expectation of the dependent variable given the instrument relies on the completeness of the distribution of the endogenous regressor conditional on the instrument, i.e., f(x|z). We provide sufficient conditions for the completeness of f(x|z) without imposing a specific functional form, such as the exponential family. We show that if the conditional density f(x|z) coincides with an existing complete density at a limit point in the support of z, then f(x|z) itself is complete, and therefore the regression function m(.) is nonparametrically identified. We use this general result to provide specific sufficient conditions for completeness in three different specifications of the relationship between the endogenous regressor x and the instrumental variable z.

]]>
http://www.ifs.org.uk/publications/5628 Sat, 25 Jun 2011 00:00:00 +0000
<![CDATA[Measuring the price responsiveness of gasoline demand: economic shape restrictions and nonparametric demand estimation]]> This paper develops a new method for estimating a demand function and the welfare consequences of price changes. The method is applied to gasoline demand in the U.S. and is applicable to other goods. The method uses shape restrictions derived from economic theory to improve the precision of a nonparametric estimate of the demand function. Using data from the U.S. National Household Travel Survey, we show that the restrictions are consistent with the data on gasoline demand and remove the anomalous behavior of a standard nonparametric estimator. Our approach provides new insights about the price responsiveness of gasoline demand and the way responses vary across the income distribution. We find that price responses vary nonmonotonically with income. In particular, we find that low- and high-income consumers are less responsive to changes in gasoline prices than are middle-income consumers. We find similar results using comparable data from Canada.

]]>
http://www.ifs.org.uk/publications/5627 Thu, 16 Jun 2011 00:00:00 +0000
<![CDATA[Penalized sieve estimation and inference of semi-nonparametric dynamic models: a selective review]]> In this selective review, we first provide some empirical examples that motivate the usefulness of semi-nonparametric techniques in modelling economic and financial time series. We describe popular classes of semi-nonparametric dynamic models and some temporal dependence properties. We then present penalized sieve extremum (PSE) estimation as a general method for semi-nonparametric models with cross-sectional, panel, time series, or spatial data. The method is especially powerful in estimating difficult ill-posed inverse problems such as semi-nonparametric mixtures or conditional moment restrictions. We review recent advances on inference and large sample properties of the PSE estimators, which include (1) consistency and convergence rates of the PSE estimator of the nonparametric part; (2) limiting distributions of plug-in PSE estimators of functionals that are either smooth (i.e., root-n estimable) or non-smooth (i.e., slower than root-n estimable); (3) simple criterion-based inference for plug-in PSE estimation of smooth or non-smooth functionals; and (4) root-n asymptotic normality of semiparametric two-step estimators and their consistent variance estimators. Examples from dynamic asset pricing, nonlinear spatial VAR, semiparametric GARCH, and copula-based multivariate financial models are used to illustrate the general results.

]]>
http://www.ifs.org.uk/publications/5626 Fri, 10 Jun 2011 00:00:00 +0000
<![CDATA[Cash by any other name? Evidence on labelling from the UK Winter Fuel Payment]]> Standard economic theory implies that the labelling of cash transfers or cash-equivalents (e.g. child benefits, food stamps) should have no effect on spending patterns. The empirical literature to date does not contradict this proposition. We study the UK Winter Fuel Payment (WFP), a cash transfer to older households. Exploiting sharp eligibility criteria in a regression discontinuity design, we find robust evidence of a behavioural effect of the labelling. On average households spend 41% of the WFP on fuel. If the payment was treated as cash, we would expect households to spend approximately 3% of the payment on fuel.
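A minimal sketch of the regression discontinuity logic, on simulated data with a hypothetical age-60 cutoff and bandwidth (neither taken from the paper): fuel spending is regressed on an eligibility indicator and separate linear trends in the running variable on each side of the threshold, within a window around it.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, cutoff, bandwidth = 5000, 60.0, 5.0
age = rng.uniform(50, 70, n)                                  # running variable
eligible = (age >= cutoff).astype(float)
fuel = 5 + 0.1 * (age - cutoff) + 2.0 * eligible + rng.standard_normal(n)   # true jump = 2

inside = np.abs(age - cutoff) <= bandwidth                    # local window around the cutoff
a = age[inside] - cutoff
X = sm.add_constant(np.column_stack([eligible[inside], a, a * eligible[inside]]))
res = sm.OLS(fuel[inside], X).fit(cov_type="HC1")
print("estimated jump at the cutoff:", res.params[1])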

]]>
http://www.ifs.org.uk/publications/5603 Wed, 08 Jun 2011 00:00:00 +0000
<![CDATA[Is there a "heat or eat" trade-off in the UK?]]> In this research, funded by the Nuffield Foundation, we merge detailed household level expenditure data from older households with historical local weather information. We then test for a heat or eat trade off: do households cut back on food spending to finance the additional cost of keeping warm during cold shocks? We find evidence that the poorest of older households are unable to smooth spending over the worst temperature shocks. Statistically significant reductions in food spending are observed in response to temperatures two or more standard deviations colder than expected (which occur about one winter month in forty) and reductions in food expenditure are considerably larger in poorer households.
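An illustrative sketch (simulated data, made-up variable names) of the shock definition and regression described above: a period counts as a cold shock when the temperature is two or more standard deviations below the local historical mean, and food spending is then regressed on that indicator.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 3000
df = pd.DataFrame({
    "temp": rng.normal(5, 3, n),                   # observed winter temperature
    "hist_mean": 5.0, "hist_sd": 3.0,              # local historical climate
    "income": rng.lognormal(5, 0.5, n),
})
df["cold_shock"] = ((df["temp"] - df["hist_mean"]) / df["hist_sd"] <= -2).astype(int)
df["food_spend"] = 100 + 0.1 * df["income"] - 8 * df["cold_shock"] + rng.normal(0, 10, n)

print(smf.ols("food_spend ~ cold_shock + income", df).fit(cov_type="HC1").params)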

]]>
http://www.ifs.org.uk/publications/5602 Tue, 07 Jun 2011 00:00:00 +0000
<![CDATA[A practical asymptotic variance estimator for two-step semiparametric estimators]]> The goal of this paper is to develop techniques to simplify semiparametric inference. We do this by deriving a number of numerical equivalence results. These illustrate that in many cases, one can obtain estimates of semiparametric variances using standard formulas derived in the well-known parametric literature. This means that for computational purposes, an empirical researcher can ignore the semiparametric nature of the problem and do all calculations "as if" it were a parametric situation. We hope that this simplicity will promote the use of semiparametric procedures.

]]>
http://www.ifs.org.uk/publications/5625 Sun, 05 Jun 2011 00:00:00 +0000
<![CDATA[Bounding quantile demand functions using revealed preference inequalities]]> This paper develops a new technique for the estimation of consumer demand models with unobserved heterogeneity subject to revealed preference inequality restrictions. Particular attention is given to nonseparable heterogeneity. The inequality restrictions are used to identify bounds on quantile demand functions. A nonparametric estimator for these bounds is developed and asymptotic properties are derived. An empirical application using data from the U.K. Family Expenditure Survey illustrates the usefulness of the methods by deriving bounds and confidence sets for estimated quantile demand functions.

]]>
http://www.ifs.org.uk/publications/5599 Wed, 01 Jun 2011 00:00:00 +0000
<![CDATA[Reserve price effects in auctions: estimates from multiple RD designs]]> We present evidence from 260,000 online auctions of second-hand cars to identify the impact of public reserve prices on auction outcomes. To establish causality, we exploit multiple discontinuities in the relationship between reserve prices and vehicle characteristics to present RD estimates of reserve price impacts. Guided by auction theory, in our first set of results we find that an increase in reserve price decreases the number of bidders, increases the likelihood the object remains unsold, and increases expected revenue conditional on sale. We then combine these estimates to calibrate the reserve price effect on the auctioneer’s ex ante expected revenue. This reveals the auctioneer’s reserve price policy to be locally optimal. Our final set of results provides novel evidence on reserve price effects to shed light on the auction environment. We find that an increase in reserve price: (i) decreases the number of potential bidders, as identified through individual web browsing histories; (ii) means that only more experienced and historically successful bidders still enter the auction; and (iii) leaves the characteristics of actual winners less sensitive to the reserve price than those of the average bidder, suggesting auction winners are not the marginal entrant. These novel margins suggest these auctions are characterized by endogenous bidder entry and bidder asymmetry.

]]>
http://www.ifs.org.uk/publications/6125 Wed, 01 Jun 2011 00:00:00 +0000
<![CDATA[Quantile regression with censoring and endogeneity]]> In this paper, we develop a new censored quantile instrumental variable (CQIV) estimator and describe its properties and computation. The CQIV estimator combines Powell's (1986) censored quantile regression (CQR), to deal semiparametrically with censoring, with a control variable approach to incorporate endogenous regressors. The CQIV estimator is obtained in two stages that are nonadditive in the unobservables. The first stage estimates a nonadditive model with infinite-dimensional parameters for the control variable, such as a quantile or distribution regression model. The second stage estimates a nonadditive censored quantile regression model for the response variable of interest, including the estimated control variable to deal with endogeneity. For computation, we extend the algorithm for CQR developed by Chernozhukov and Hong (2002) to incorporate the estimation of the control variable. We give generic regularity conditions for asymptotic normality of the CQIV estimator and for the validity of resampling methods to approximate its asymptotic distribution. We verify these conditions for quantile and distribution regression estimation of the control variable. We illustrate the computation and applicability of the CQIV estimator with numerical examples and an empirical application on estimation of Engel curves for alcohol.
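A simplified sketch of the two-stage control-variable logic on simulated, uncensored data: a first-stage grid of quantile regressions estimates each observation's conditional rank of the endogenous regressor given the instrument, and that rank enters a second-stage quantile regression as a control variable. The censoring correction that defines CQIV proper, and the paper's regularity and resampling results, are deliberately omitted; all names and the data-generating process are ours.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 2000
z = rng.standard_normal(n)
e = rng.standard_normal(n)                          # unobservable driving endogeneity
x = 0.8 * z + e + 0.3 * rng.standard_normal(n)      # endogenous regressor
y = 1.0 * x - e + rng.standard_normal(n)            # outcome; x is correlated with -e
df = pd.DataFrame({"y": y, "x": x, "z": z})

# stage 1: estimate the conditional rank of x given z from a grid of quantile fits
taus = np.linspace(0.05, 0.95, 19)
fitted = np.column_stack([smf.quantreg("x ~ z", df).fit(q=t).predict(df) for t in taus])
df["v"] = (fitted <= df["x"].values[:, None]).mean(axis=1)   # estimated control variable

# stage 2: median regression with and without the control variable
naive = smf.quantreg("y ~ x", df).fit(q=0.5).params["x"]
cf = smf.quantreg("y ~ x + v", df).fit(q=0.5).params["x"]
print(f"median regression of y on x: {naive:.2f}; with control variable: {cf:.2f}")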

]]>
http://www.ifs.org.uk/publications/5598 Tue, 31 May 2011 00:00:00 +0000
<![CDATA[Conditional quantile processes based on series or many regressors]]>

Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the empirical QR coefficient process, namely we obtain uniform strong approximations to the empirical QR coefficient process by conditionally pivotal and Gaussian processes, as well as by gradient and weighted bootstrap processes.

We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence, large sample distributions, and inference methods based on strong pivotal and Gaussian approximations and on gradient and weighted bootstraps. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and in the covariate value, and covering the pointwise case as a by-product. If the function of interest is monotone, we show how to use monotonization procedures to improve estimation and inference. We demonstrate the practical utility of these results with an empirical example, where we estimate the price elasticity function of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.
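A small sketch (ours, on simulated data) of the series approximation described above: the conditional quantile function is approximated by a linear combination of B-spline terms with quantile-specific coefficients, fitted here with statsmodels QuantReg via a patsy formula. The uniform strong approximations and bootstrap inference developed in the paper are not reproduced.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 2000
df = pd.DataFrame({"x": rng.uniform(0, 1, n)})
df["y"] = np.sin(2 * np.pi * df["x"]) + (0.5 + df["x"]) * rng.standard_normal(n)

formula = "y ~ bs(x, df=6, degree=3)"                # series terms: cubic B-splines
for tau in (0.1, 0.5, 0.9):
    fit = smf.quantreg(formula, df).fit(q=tau)
    pred = fit.predict(pd.DataFrame({"x": [0.5]}))[0]
    print(f"tau={tau}: fitted conditional quantile at x=0.5 is {pred:.2f}")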

]]>
http://www.ifs.org.uk/publications/5597 Fri, 27 May 2011 00:00:00 +0000
<![CDATA[Is distance dying at last? Falling home bias in fixed effects models of patent citations]]>

We examine the "home bias" of knowledge spillovers (the idea that knowledge spreads more slowly over international boundaries than within them) as measured by the speed of patent citations. We present econometric evidence that the geographical localization of knowledge spillovers has fallen over time, as we would expect from the dramatic fall in communication and travel costs. Our proposed estimator controls for correlated fixed effects and censoring in duration models and we apply it to data on over two million patent citations between 1975 and 1999. Home bias is exaggerated in models that do not control for fixed effects. The fall in home bias over time is weaker for the pharmaceuticals and information/communication technology sectors where agglomeration externalities may remain strong.

]]>
http://www.ifs.org.uk/publications/5593 Tue, 24 May 2011 00:00:00 +0000
<![CDATA[Local identification of nonparametric and semiparametric models]]> In parametric models a sufficient condition for local identification is that the vector of moment conditions is differentiable at the true parameter with full rank derivative matrix. We show that there are corresponding sufficient conditions for nonparametric models. A nonparametric rank condition and differentiability of the moment conditions with respect to a certain norm imply local identification. It turns out these conditions are slightly stronger than needed and are hard to check, so we provide weaker and more primitive conditions. We extend the results to semiparametric models. We illustrate the sufficient conditions with endogenous quantile and single index examples. We also consider a semiparametric habit-based, consumption capital asset pricing model. There we find the rank condition is implied by an integral equation of the second kind having a one-dimensional null space.

]]>
http://www.ifs.org.uk/publications/5592 Mon, 23 May 2011 00:00:00 +0000
<![CDATA[FORTAX: UK tax and benefit system documentation]]> This document describes the UK tax and benefit system between April 1990 and April 2010, as implemented in FORTAX, a microsimulation library written in Fortran. It begins with an overview of FORTAX and the information it calculates. Subsequent sections describe the taxes and benefits implemented in FORTAX, noting where simplifications have been made. An appendix lists values of the major tax and benefit parameters over time.
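To give a flavour of the kind of calculation such a library performs, here is a toy income-tax function with hypothetical allowance, bands and rates (they are not the documented FORTAX parameters): tax is computed by applying marginal rates to successive slices of taxable income above a personal allowance.

def income_tax(gross, allowance=6475.0, bands=((37400.0, 0.20), (float("inf"), 0.40))):
    """Tax due on annual gross income under hypothetical allowance and bands."""
    taxable = max(gross - allowance, 0.0)
    tax, lower = 0.0, 0.0
    for upper, rate in bands:
        # tax the slice of taxable income that falls inside this band
        tax += rate * max(min(taxable, upper) - lower, 0.0)
        lower = upper
    return tax

print(income_tax(30000.0))   # tax on a hypothetical 30,000 salary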

]]>
http://www.ifs.org.uk/publications/5588 Mon, 16 May 2011 00:00:00 +0000
<![CDATA[Inference and decision for set identified parameters using posterior lower and upper probabilities]]> This paper develops inference and statistical decision theory for set-identified parameters from the robust Bayes perspective. When a model is set-identified, prior knowledge about the model parameters can be decomposed into two parts: the part that can be updated by data (revisable prior knowledge) and the part that can never be updated (unrevisable prior knowledge). We introduce a class of prior distributions that shares a single prior distribution for the revisable part, but allows for arbitrary prior distributions for the unrevisable part. A posterior inference procedure proposed in this paper operates on the resulting class of posteriors by focusing on the posterior lower and upper probabilities. We analyze point estimation of the set-identified parameters by applying the gamma-minimax criterion. We propose a robustified posterior credible region for the set-identified parameters by focusing on a contour set of the posterior lower probability. Our framework offers a procedure to eliminate set-identified nuisance parameters, and yields inference for the marginalized identified set. For the case of an interval-identified parameter, we establish asymptotic equivalence of the lower probability inference to frequentist inference for the identified set.

]]>
http://www.ifs.org.uk/publications/5582 Tue, 10 May 2011 00:00:00 +0000
<![CDATA[Do consumers gamble to convexify?]]>

When consumption goods are indivisible, individuals have to hold enough resources to cross a purchasing threshold. If individuals are liquidity constrained, they are unable to borrow to cross that threshold. Instead, we show that such individuals, even if risk averse, may choose to gamble by playing lotteries in order to have a chance of crossing the threshold. One implication of this model is that income effects for individuals who choose to play lotteries are likely to be larger than for the general population. This in turn implies that estimating income effects through the random allocation of lottery winnings is likely to yield a biased estimate of the income effects of the broader population who chose not to gamble. Using UK data on lottery wins, other windfalls and durable good purchases, we show that lottery players display higher income effects than non-players, but only amongst those likely to be credit constrained. This is consistent with credit-constrained, risk-averse agents gambling in order to cross a purchase threshold and to convexify their budget set.

]]>
http://www.ifs.org.uk/publications/5579 Fri, 06 May 2011 00:00:00 +0000
<![CDATA[On the role of time in nonseparable panel data models]]> This paper contributes to the understanding of the source of identification in panel data models. Recent research has established that few time periods suffice to identify interesting structural effects in nonseparable panel data models, even in the presence of complex correlated unobservables, provided these unobservables are time invariant. A commonality of all of these approaches is that they point identify effects only for subpopulations. In this paper we focus on average partial derivatives and continuous explanatory variables. We elaborate on the parallel between time in panels and instrumental variables in cross sections, and establish that point identification is generically only possible in specific subpopulations for finite T. Moreover, for general subpopulations, we provide sharp bounds. We then show that these bounds converge to point identification only as T tends to infinity. We systematize this behavior by comparing it to increasing the number of support points of an instrument. Finally, we apply all of these concepts to the semiparametric panel binary choice model and establish that these issues determine the rates of convergence of estimators for the slope coefficient.

]]>
http://www.ifs.org.uk/publications/5581 Thu, 05 May 2011 00:00:00 +0000
<![CDATA[Testing multivariate economic restrictions using quantiles: the example of Slutsky negative semidefiniteness]]> This paper is concerned with testing rationality restrictions using quantile regression methods. Specifically, we consider negative semidefiniteness of the Slutsky matrix, arguably the core restriction implied by utility maximization. We consider a heterogeneous population characterized by a system of nonseparable structural equations with infinite-dimensional unobservables. To analyze the economic restriction, we employ quantile regression methods because they allow us to utilize the entire distribution of the data. Difficulties arise because the restriction involves several equations, while the quantile is a univariate concept. We establish that we may test the economic restriction by considering quantiles of linear combinations of the dependent variable. For this hypothesis we develop a new empirical process based test that applies kernel quantile estimators, and derive its large sample behavior. We investigate the performance of the test in a simulation study. Finally, we apply all concepts to Canadian individual data, and show that rationality is an acceptable description of actual individual behavior.
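As a pointwise illustration of the restriction being tested (made-up numbers, ours): the Slutsky matrix S with entries ∂q_i/∂p_j + (∂q_i/∂w)q_j is negative semidefinite exactly when all eigenvalues of its symmetrized version are non-positive. The paper's quantile-based test across a heterogeneous population is far more involved and is not shown.

import numpy as np

dq_dp = np.array([[-0.50, 0.10],       # hypothetical price derivatives of demand
                  [ 0.05, -0.30]])
dq_dw = np.array([0.02, 0.01])         # hypothetical income derivatives
q = np.array([10.0, 5.0])              # hypothetical quantities demanded

S = dq_dp + np.outer(dq_dw, q)         # Slutsky substitution matrix
eigvals = np.linalg.eigvalsh((S + S.T) / 2)
print("eigenvalues:", eigvals, "negative semidefinite:", bool(np.all(eigvals <= 1e-12)))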

]]>
http://www.ifs.org.uk/publications/5580 Sun, 01 May 2011 00:00:00 +0000
<![CDATA[The impact of minimum wages on quit, layoff and hiring rates]]> We investigate differences in quit, layoff and hiring rates in high versus low minimum wage regimes using Canadian data spanning 1979 to 2008. The data include consistent questions on job tenure and reason for job separation for the whole period. Over the same time frame, there were over 140 minimum wage changes in Canada. We find that higher minimum wages are associated with lower hiring rates but also with lower job separation rates. Importantly, the reduced separation rates are due mainly to reductions in layoffs, occur in the first 6 months of a job, and are present for unskilled workers of all ages. Our estimates imply that a 10% increase in the minimum wage generates a 3.9% reduction in the layoff rate. We present a search and matching model that fits with these patterns and test its implications. Overall, our results imply that jobs in higher minimum wage regimes are more stable but harder to get.

]]>
http://www.ifs.org.uk/publications/5551 Thu, 14 Apr 2011 00:00:00 +0000
<![CDATA[Set identified linear models]]> We analyze the identification and estimation of parameters β satisfying the incomplete linear moment restrictions E(z′(xβ − y)) = E(z′u(z)), where z is a set of instruments and u(z) an unknown bounded scalar function. We first provide empirically relevant examples of such a set-up. Second, we show that these conditions set identify β, where the identified set B is bounded and convex. We provide a sharp characterization of the identified set not only when the number of moment conditions is equal to the number of parameters of interest but also in the case in which the number of conditions is strictly larger than the number of parameters. We derive a necessary and sufficient condition for the validity of supernumerary restrictions which generalizes the familiar Sargan condition. Third, we provide new results on the asymptotics of analog estimates constructed from the identification results. When B is a strictly convex set, we also construct a test of the null hypothesis β0 ∈ B whose size is asymptotically correct and which relies on the minimization of the support function of the set B − {β0}. Results of some Monte Carlo experiments are presented.

]]>
http://www.ifs.org.uk/publications/5550 Sun, 10 Apr 2011 00:00:00 +0000
<![CDATA[The effect of abolishing university tuition costs: evidence from Ireland]]> University tuition fees for undergraduates were abolished in Ireland in 1996. This paper examines the effect of this reform on the socio-economic gradient to determine whether the reform was successful in achieving its objective of promoting educational equality, that is, improving the chances of low socio-economic status (SES) students progressing to university. It finds that the reform clearly did not have that effect. It is also shown that the university/SES gradient can be explained by differential performance at second level. Students from white-collar backgrounds do significantly better in their final second-level exams than the children of blue-collar workers. The results are very similar to recent findings for the UK. The results show that the effect of SES on school performance is generally stronger for those at the lower end of the conditional distribution of academic attainment.

]]>
http://www.ifs.org.uk/publications/5530 Wed, 30 Mar 2011 00:00:00 +0000
<![CDATA[The socio-economic gradient in early child outcomes: evidence from the Millennium Cohort Study]]> This paper shows that there are large differences in cognitive and socio-emotional development between children from rich and poor backgrounds at the age of 3, and that this gap widens by the age of 5. Children from poor backgrounds also face much less advantageous "early childhood caring environments" than children from better-off families. For example, we identify differences in poor children's and their mothers' health and well-being (e.g. birth-weight, breast-feeding, and maternal depression); family interactions (e.g. mother-child closeness); the home learning environment (e.g. reading regularly to the child); parenting styles and rules (e.g. regular bed-times and meal-times); and experiences of childcare by ages 3 and 5. Differences in the home learning environment, particularly at the age of 3, have an important role to play in explaining why children from poorer backgrounds experience lower levels of cognitive development than children from better-off families. However, a much larger proportion of the gap remains unexplained, or appears directly related to other aspects of family background (such as mothers' age and family size) that are not mediated through the early childhood caring environment. When it comes to socio-emotional development, a greater proportion of the socio-economic gap does appear to be related to differences in the early childhood caring environment.

]]>
http://www.ifs.org.uk/publications/5519 Tue, 22 Mar 2011 00:00:00 +0000
<![CDATA[Testing functional inequalities]]> This paper develops tests for inequality constraints of nonparametric regression functions. The test statistics involve a one-sided version of Lp-type functionals of kernel estimators. Drawing on the approach of Poissonization, this paper establishes that the tests are asymptotically distribution free, admitting asymptotic normal approximation. Furthermore, the tests have nontrivial local power against a certain class of local alternatives converging to the null at the rate of n^{-1/2}. Some results from Monte Carlo simulations are presented.
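One representative form of such a statistic, written here in LaTeX with notation assumed for illustration (the null being that the regression function m is non-positive, \hat m a kernel estimate and w a weight function):

\[
T_n \;=\; \int \bigl[\max\{\hat m(x),\,0\}\bigr]^{p}\, w(x)\, dx ,
\]

so that only violations of the inequality contribute to the one-sided Lp-type functional.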

]]>
http://www.ifs.org.uk/publications/5502 Tue, 22 Feb 2011 00:00:00 +0000
<![CDATA[Asymptotic theory for nonparametric regression with spatial data]]> Nonparametric regression with spatial, or spatio-temporal, data is considered. The conditional mean of a dependent variable, given explanatory ones, is a nonparametric function, while the conditional covariance reflects spatial correlation. Conditional heteroscedasticity is also allowed, as well as non-identically distributed observations. Instead of mixing conditions, a (possibly non-stationary) linear process is assumed for disturbances, allowing for long range, as well as short-range, dependence, while decay in dependence in explanatory variables is described using a measure based on the departure of the joint density from the product of marginal densities. A basic triangular array setting is employed, with the aim of covering various patterns of spatial observation. Sufficient conditions are established for consistency and asymptotic normality of kernel regression estimates. When the cross-sectional dependence is sufficiently mild, the asymptotic variance in the central limit theorem is the same as when observations are independent; otherwise, the rate of convergence is slower. We discuss application of our conditions to spatial autoregressive models, and models defined on a regular lattice.
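The point estimator itself is the familiar kernel regression; what the paper changes is the dependence structure under which its limit theory is derived. A minimal Nadaraya-Watson sketch (the Gaussian kernel, toy data and bandwidth are assumptions for illustration):

import numpy as np

def nadaraya_watson(x_obs, y_obs, x_grid, h):
    """Local-constant kernel estimate of E[y | x] on a grid of evaluation points."""
    diffs = (x_grid[:, None] - x_obs[None, :]) / h
    weights = np.exp(-0.5 * diffs ** 2)              # Gaussian kernel weights
    return (weights @ y_obs) / weights.sum(axis=1)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 400)
y = np.sin(x) + 0.3 * rng.standard_normal(400)       # i.i.d. errors only for the toy example
grid = np.linspace(-2, 2, 9)
print(np.round(nadaraya_watson(x, y, grid, h=0.3), 2))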

]]>
http://www.ifs.org.uk/publications/5501 Sun, 20 Feb 2011 00:00:00 +0000
<![CDATA[Nonparametric trending regression with cross-sectional dependence]]> Panel data, whose series length T is large but whose cross-section size N need not be, are assumed to have a common time trend. The time trend is of unknown form, the model includes additive, unknown, individual-specific components, and we allow for spatial or other cross-sectional dependence and/or heteroscedasticity. A simple smoothed nonparametric trend estimate is shown to be dominated by an estimate which exploits the availability of cross-sectional data. Asymptotically optimal choices of bandwidth are justified for both estimates. Feasible optimal bandwidths, and feasible optimal trend estimates, are asymptotically justified, the finite sample performance of the latter being examined in a Monte Carlo study. A number of potential extensions are discussed.

]]>
http://www.ifs.org.uk/publications/5500 Fri, 18 Feb 2011 00:00:00 +0000
<![CDATA[Inference on power law spatial trends]]> Power law or generalized polynomial regressions with unknown real-valued exponents and coefficients, and weakly dependent errors, are considered for observations over time, space or space-time. Consistency and asymptotic normality of nonlinear least squares estimates of the parameters are established. The joint limit distribution is singular, but can be used as a basis for inference on either exponents or coefficients. We discuss issues of implementation, efficiency, potential for improved estimation, and possibilities of extension to more general or alternative trending models, and to allow for irregularly-spaced data or heteroscedastic errors; though it focusses on a particular model to fix ideas, the paper can be viewed as offering machinery useful in developing inference for a variety of models in which power law trends are a component. Indeed, the paper also makes a contribution that is potentially relevant to many other statistical models: our problem is one of many in which consistency of a vector of parameter estimates (which converge at different rates) cannot be established by the usual techniques for coping with implicitly-defined extremum estimates, but requires a more delicate treatment; we present a generic consistency result.
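As a minimal, hedged illustration of the class of models covered, the sketch below fits a single power-law trend y_t = β t^γ + error by nonlinear least squares; the use of scipy's curve_fit, the starting values and the i.i.d. errors are assumptions for the example only, and none of the paper's inference machinery is reproduced.

import numpy as np
from scipy.optimize import curve_fit

def power_trend(t, beta, gamma):
    # generalized polynomial with a single term and an unknown real exponent
    return beta * t ** gamma

t = np.arange(1, 201, dtype=float)
rng = np.random.default_rng(2)
y = 2.0 * t ** 0.7 + rng.standard_normal(200)        # true beta = 2.0, gamma = 0.7

params, cov = curve_fit(power_trend, t, y, p0=[1.0, 0.5])
print("beta_hat, gamma_hat:", np.round(params, 3))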

]]>
http://www.ifs.org.uk/publications/5495 Wed, 16 Feb 2011 00:00:00 +0000
<![CDATA[Statistical inference on regression with spatial dependence]]> Central limit theorems are developed for instrumental variables estimates of linear and semiparametric partly linear regression models for spatial data. General forms of spatial dependence and heterogeneity in explanatory variables and unobservable disturbances are permitted. We discuss estimation of the variance matrix, including estimates that are robust to disturbance heteroscedasticity and/or dependence. A Monte Carlo study of finite-sample performance is included. In an empirical example, the estimates and robust and non-robust standard errors are computed from Indian regional data, following tests for spatial correlation in disturbances, and nonparametric regression fitting. Some final comments discuss modifications and extensions.

]]>
http://www.ifs.org.uk/publications/5494 Mon, 14 Feb 2011 00:00:00 +0000
<![CDATA[The long-term effects of in-work benefits in a life-cycle model for policy evaluation]]> This paper presents a life-cycle model of women's labour supply, human capital formation and savings for the evaluation of welfare-to-work and tax policies. Women's decisions are formalised in a dynamic and uncertain environment. The model includes a detailed characterisation of the tax system and of the dynamics of family formation, while explicitly considering the determinants of employment and education decisions: (i) contemporaneous incentives to work, (ii) future consequences for employment through human capital accumulation and (iii) anticipatory effects on the value of employment and education. The choice of parameters follows a careful calibration procedure, based on a large set of data moments from the British population during the 1990s using BHPS data. Many important features established in the empirical literature are reproduced in the simulation exercises, including the employment effects of the WFTC reform in the UK. The model is used to gain further insight into the responses to two recent policy changes, the October 1999 WFTC and the April 2003 WTC/CTC reforms. We find small but non-negligible anticipation effects on employment and education.

]]>
http://www.ifs.org.uk/publications/5493 Sat, 12 Feb 2011 00:00:00 +0000
<![CDATA[An instrumental variable model of multiple discrete choice]]> This paper studies identification of latent utility functions in multiple discrete choice models in which there may be endogenous explanatory variables, that is, explanatory variables that are not restricted to be distributed independently of the unobserved determinants of latent utilities. The model does not employ large support, special regressor or control function restrictions; indeed, it is silent about the process delivering values of endogenous explanatory variables, and in this respect it is incomplete. Instead, the model employs instrumental variable restrictions requiring the existence of instrumental variables which are excluded from latent utilities and distributed independently of the unobserved components of utilities.

We show that the model delivers set, not point, identification of the latent utility functions and we characterize sharp bounds on those functions. We develop easy-to-compute outer regions which in parametric models require little more calculation than what is involved in a conventional maximum likelihood analysis. The results are illustrated using a model which is essentially the parametric conditional logit model of McFadden (1974) but with potentially endogenous explanatory variables and instrumental variable restrictions.

The method employed has wide applicability and for the first time brings instrumental variable methods to bear on structural models in which there are multiple unobservables in a structural equation.

This paper has now been revised and the new version is available as CWP39/11.

]]>
http://www.ifs.org.uk/publications/5479 Fri, 11 Feb 2011 00:00:00 +0000
<![CDATA[Does it matter who responded to the survey? Trends in the U.S. gender earnings gap revisited]]> Blau and Kahn (JOLE, 1997; ILRR, 2006) decomposed trends in the U.S. gender earnings gap into observable and unobservable components using the PSID. They found that the unobservable part contributed significantly not only to the rapidly shrinking earnings gap in the 1980s, but also to the slowing-down of the convergence in the 1990s. In this paper, we extend their framework to consider measurement error due to the use of proxy/representative respondents. First, we document a strong trend in the gender composition of household-representative respondents towards more female respondents. Second, we estimate the impact of this changing gender composition on Blau and Kahn's decomposition. We find that a non-ignorable portion of changes in the gender gap could be attributed to changes in the self/proxy respondent composition. Specifically, the actual reduction in the gender gap may be smaller than estimates that ignore this measurement error would suggest. We conclude that a careful validation study would be necessary to ascertain the magnitude of these spurious measurement error effects.

]]>
http://www.ifs.org.uk/publications/5478 Fri, 11 Feb 2011 00:00:00 +0000
<![CDATA[Policy analysis with incredible certitude]]> Analyses of public policy regularly express certitude about the consequences of alternative policy choices. Yet policy predictions often are fragile, with conclusions resting on critical unsupported assumptions or leaps of logic. Then the certitude of policy analysis is not credible. I develop a typology of incredible analytical practices and give illustrative cases. I call these practices conventional certitude, dueling certitudes, conflating science and advocacy, wishful extrapolation, illogical certitude, and media overreach.

]]>
http://www.ifs.org.uk/publications/5477 Thu, 10 Feb 2011 00:00:00 +0000
<![CDATA[High performance quadrature rules: how numerical integration affects a popular model of product differentiation]]> Efficient, accurate, multi-dimensional, numerical integration has become an important tool for approximating the integrals which arise in modern economic models built on unobserved heterogeneity, incomplete information, and uncertainty. This paper demonstrates that polynomial-based rules outperform number-theoretic quadrature (Monte Carlo) rules both in terms of efficiency and accuracy. To show the impact a quadrature method can have on results, we examine the performance of these rules in the context of Berry, Levinsohn, and Pakes (1995)'s model of product differentiation, where Monte Carlo methods introduce considerable numerical error and instability into the computations. These problems include inaccurate point estimates, excessively tight standard errors, instability of the inner loop 'contraction' mapping for inverting market shares, and poor convergence of several state-of-the-art solvers when computing point estimates. Both monomial rules and sparse grid methods lack these problems and provide a more accurate, cheaper method for quadrature. Finally, we demonstrate how researchers can easily utilize high quality, high dimensional quadrature rules in their own work.
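In the spirit of the comparison described above, the toy sketch below approximates E[f(v)] for v ~ N(0,1) with a 15-node Gauss-Hermite rule and with 1,000 Monte Carlo draws; the logit-style integrand is an assumed stand-in for the share integrals in the BLP model, not the paper's actual exercise.

import numpy as np

f = lambda v: 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * v)))         # smooth, logit-style integrand

# Gauss-Hermite with change of variable: E[f(v)] = (1/sqrt(pi)) * sum_i w_i * f(sqrt(2) * x_i)
nodes, weights = np.polynomial.hermite.hermgauss(15)
gh = (weights @ f(np.sqrt(2.0) * nodes)) / np.sqrt(np.pi)

rng = np.random.default_rng(3)
mc = f(rng.standard_normal(1000)).mean()                      # plain Monte Carlo average

print(f"Gauss-Hermite (15 nodes): {gh:.6f}   Monte Carlo (1,000 draws): {mc:.6f}")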

]]>
http://www.ifs.org.uk/publications/5476 Tue, 08 Feb 2011 00:00:00 +0000
<![CDATA[How much do lifetime earnings explain retirement resources?]]> We use a unique dataset, containing individual survey data from the English Longitudinal Study of Ageing linked to administrative data on earnings histories, to construct measures of lifetime earnings and examine how these relate to financial resources in retirement. Retirement income and wealth at retirement are, as expected, positively correlated with lifetime earnings, but there is also substantial dispersion in retirement income and retirement wealth among people with similar lifetime earnings. For example, we find that those with greater numerical ability and higher education tend to have greater retirement resources even after controlling for differences in lifetime earnings. The retirement resources of single women are far less well explained by their own lifetime earnings than those of couples or single men. We hypothesise that, as the vast majority of single women in the age group considered had previously been married and are now widowed or divorced, this reflects the fact that we do not observe the lifetime earnings of their former spouses.

]]>
http://www.ifs.org.uk/publications/5470 Mon, 07 Feb 2011 00:00:00 +0000
<![CDATA[Extensive and intensive margins of labour supply: working hours in the US, UK and France]]> This paper documents the key stylised facts underlying the evolution of labour supply at the extensive and intensive margins over the last forty years in three countries: the United States, the United Kingdom and France. We develop a statistical decomposition that provides bounds on changes at the extensive and intensive margins. This decomposition is also shown to be coherent with the analysis of labour supply elasticities at these margins. We use detailed representative micro-datasets to examine the relative importance of the extensive and intensive margins in explaining the overall changes in total hours worked. We also present some initial estimates of the broad distribution of implied elasticities and their implication for the overall aggregate hours elasticity.
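One standard way to write such a decomposition (the notation is assumed here; the paper's bounds refine this accounting identity): with e_t the employment rate (extensive margin) and h_t hours per worker (intensive margin), total hours per person are H_t = e_t h_t, and

\[
\Delta H \;=\; \bar h\,\Delta e \;+\; \bar e\,\Delta h ,
\]

where bars denote averages over the two periods, so the identity holds exactly and attributes the change in total hours to the two margins.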

]]>
http://www.ifs.org.uk/publications/5531 Tue, 01 Feb 2011 00:00:00 +0000
<![CDATA[Factor rotation with non-negativity constraints]]> Factor rotation is widely used to interpret the estimated factor loadings from latent variable models. Rotation methods embody a priori concepts of 'complexity' of factor structures, which they seek to minimise. Surprisingly, it is rare for researchers to exploit one of the most common and powerful sources of a priori information: non-negativity of factor loadings. This paper develops a method of incorporating sign restrictions in factor rotation, exploiting a recently-developed test for multiple inequality constraints. An application to the measurement of disability demonstrates the feasibility of the method and the power of non-negativity restrictions.

]]>
http://www.ifs.org.uk/publications/5443 Wed, 26 Jan 2011 00:00:00 +0000
<![CDATA[Welfare analysis using nonseparable models]]> This paper proposes a framework to empirically model the welfare effects associated with a price change in a population of heterogeneous consumers. Individual demands are characterized by a nonseparable model which is nonparametric in the regressors, as well as monotonic in unobserved heterogeneity. In this setup, we first provide and discuss conditions under which the heterogeneous welfare effects are identified, and establish constructive identification. We then propose a sample counterpart estimator, and analyze its large sample properties. For both identification and estimation, we distinguish between the cases when regressors are exogenous and when they are endogenous. Finally, we apply all concepts to measuring the heterogeneous effect of a change in the gasoline price using US consumer data, and find very substantial differences in individual effects.

]]>
http://www.ifs.org.uk/publications/5442 Tue, 25 Jan 2011 00:00:00 +0000
<![CDATA[Partial identification using random set theory]]> This paper illustrates how the use of random set theory can benefit partial identification analysis. We revisit the origins of Manski's work in partial identification (e.g., Manski (1989, 1990)), focusing our discussion on identification of probability distributions and conditional expectations in the presence of selectively observed data, statistical independence and mean independence assumptions, and shape restrictions. We show that the use of the Choquet capacity functional and of the Aumann expectation of a properly defined random set can simplify and extend previous results in the literature. We pay special attention to explaining how the relevant random set needs to be constructed, depending on the econometric framework at hand. We also discuss limitations in the applicability of specific tools of random set theory to partial identification analysis.

]]>
http://www.ifs.org.uk/publications/5379 Wed, 22 Dec 2010 00:00:00 +0000
<![CDATA[Maternal education, home environments and the development of children and adolescents]]> We study the intergenerational effects of maternal education on children's cognitive achievement, behavioral problems, grade repetition and obesity. We address the endogeneity of maternal schooling by instrumenting it with variation in schooling costs during the mother's adolescence. Using matched data from the female participants of the National Longitudinal Survey of Youth 1979 (NLSY79) and their children, we can control for mother's ability and family background factors. Our results show substantial intergenerational returns to education. For children aged 7-8, for example, our IV results indicate that an additional year of mother's schooling increases the child's performance on a standardized math test by almost 0.1 of a standard deviation, and reduces the incidence of behavioral problems. Our data set allows us to study a large array of channels which may transmit the effect of maternal education to the child, including family environment and parental investments at different ages of the child. We find that income effects, delayed childbearing, and assortative mating are likely to be important, and we show that maternal education leads to substantial differences in maternal labor supply. We investigate heterogeneity in returns, and we present results focusing both on very early stages in the child's life as well as adolescent outcomes. We discuss potential problems of weak instruments, and our results are found to be robust to changes in our specification. We discuss policy implications and relate our findings to the literature on intergenerational mobility.

]]>
http://www.ifs.org.uk/publications/5378 Fri, 17 Dec 2010 00:00:00 +0000
<![CDATA[A flying start? Long term consequences of maternal time investments in children during their first year of life]]> We study the impact of increasing the time that the mother spends with her child in the first year of the child's life. In particular, we examine a reform that increased paid and unpaid maternity leave entitlements in Norway. In response to this reform, maternal leave increased on average by 4 months and family income was unaffected. We find that this increase in maternal time with the child led to a 2.7 percentage point decline in high school dropout rates, rising to 5.2 percentage points for those whose mothers have less than 10 years of education. This effect is especially large for children of mothers who, in the absence of the reform, would take very low levels of unpaid leave. Finally, there is a weak impact on college attendance. The results also suggest that much of the impact of early time with the child is at low levels of maternal education.

]]>
http://www.ifs.org.uk/publications/5377 Wed, 15 Dec 2010 00:00:00 +0000
<![CDATA[Sin taxes in differentiated product oligopoly: an application to the butter and margarine market]]> There is policy interest in using tax to change food purchasing behaviour. The literature has not accounted for the oligopolistic structure of the industry. In oligopoly, the impact of a tax depends on preferences and on how firms pass the tax through to prices. We consider a tax on saturated fat. Using transaction-level data, we find that the form of the tax and firms' strategic behaviour are important determinants of its impact. Our results suggest that an excise tax is more efficient than an ad valorem tax at reducing saturated fat purchases, while an ad valorem tax is more efficient at raising revenue.
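As a reminder of the two tax forms being compared (notation assumed here): an excise tax adds a fixed amount t per unit of saturated fat to the producer price, whereas an ad valorem tax scales the producer price proportionally,

\[
p^{\text{excise}} = \tilde p + t, \qquad p^{\text{ad valorem}} = (1+\tau)\,\tilde p ,
\]

so the two instruments distort relative prices differently, which is why preferences and firms' pass-through behaviour matter for their relative efficiency.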

]]>
http://www.ifs.org.uk/publications/5376 Mon, 13 Dec 2010 00:00:00 +0000
<![CDATA[Redistribution, work incentives and thirty years of UK tax and benefit reform]]> Governments wishing to reduce inequality by redistributing money from the rich to the poor face the dilemma that in doing so (by increasing tax rates and means-tested benefits, for example) they reduce the incentive for individuals to increase their incomes. Policy-makers have tried to balance these objectives in different ways and, partly as a result of this, the tax and benefit system today is very different from the one that existed thirty years ago. In this paper we look at how the tax and benefit system redistributed income and affected incentives to work in 2009-10, and at the effect of tax and benefit reforms between 1978-79 and 2009-10 on the level of inequality and work incentives.

]]>
http://www.ifs.org.uk/publications/5367 Thu, 09 Dec 2010 00:00:00 +0000
<![CDATA[Testing for threshold effects in regression models]]> In this article, we develop a general method for testing threshold effects in regression models, using sup-likelihood-ratio (LR)-type statistics. Although the sup-LR-type test statistic has been considered in the literature, our method for establishing the asymptotic null distribution is new and nonstandard. The standard approach in the literature for obtaining the asymptotic null distribution requires that there exist a certain quadratic approximation to the objective function. The article provides an alternative, novel method that can be used to establish the asymptotic null distribution, even when the usual quadratic approximation is intractable. We illustrate the usefulness of our approach in the examples of the maximum score estimation, maximum likelihood estimation, quantile regression, and maximum rank correlation estimation. We establish consistency and local power properties of the test. We provide some simulation results and also an empirical application to tipping in racial segregation. This article has supplementary materials online.
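A minimal sketch of the kind of sup-LR scan being studied, for a linear regression with a possible threshold in a variable q (the Gaussian-errors LR form, the grid trimming and all names are assumptions for illustration; the paper's contribution, the nonstandard asymptotic null distribution, is not reproduced here):

import numpy as np

def sup_lr_threshold(y, x, q, grid):
    """Sup over candidate thresholds g of the LR statistic for a split in q."""
    n, k = x.shape
    ssr0 = np.sum((y - x @ np.linalg.lstsq(x, y, rcond=None)[0]) ** 2)    # no-threshold fit
    stats = []
    for g in grid:
        low = q <= g
        if low.sum() < k + 1 or (~low).sum() < k + 1:
            continue                                                       # skip tiny cells
        ssr1 = sum(np.sum((y[m] - x[m] @ np.linalg.lstsq(x[m], y[m], rcond=None)[0]) ** 2)
                   for m in (low, ~low))
        stats.append(n * np.log(ssr0 / ssr1))                              # Gaussian LR form
    return max(stats)

rng = np.random.default_rng(4)
n = 300
q = rng.uniform(0, 1, n)
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = x @ np.array([1.0, 0.5]) + 0.8 * (q > 0.6) + rng.standard_normal(n)
grid = np.quantile(q, np.linspace(0.15, 0.85, 30))
print("sup-LR statistic:", round(sup_lr_threshold(y, x, q, grid), 2))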

]]>
http://www.ifs.org.uk/publications/5364 Fri, 03 Dec 2010 00:00:00 +0000
<![CDATA[The matching method for treatment evaluation with selective participation and ineligibles]]> The matching method for treatment evaluation does not balance selective unobserved differences between treated and non-treated. We derive a simple correction term if there is an instrument that shifts the treatment probability to zero in specific cases. Within the same framework we also suggest a new test of the conditional independence assumption justifying matching. Policies with eligibility restrictions, where treatment is impossible if some variable exceeds a certain value, provide a natural application. In an empirical analysis, we exploit the age eligibility restriction in the Swedish Youth Practice subsidized work program for young unemployed, where compliance is imperfect among the young. Adjusting the matching estimator for selectivity shifts the results towards subsidized work being detrimental to moving individuals into employment.

This paper is a revised version of cemmap working paper CWP33/07.

]]>
http://www.ifs.org.uk/publications/5363 Mon, 29 Nov 2010 00:00:00 +0000
<![CDATA[A comparison of alternative approaches to sup-norm goodness of fit tests with estimated parameters]]> Goodness of fit tests based on sup-norm statistics of empirical processes have nonstandard limiting distributions when the null hypothesis is composite, that is, when parameters of the null model are estimated. Several solutions to this problem have been suggested, including the calculation of adjusted critical values for these nonstandard distributions and the transformation of the empirical process such that statistics based on the transformed process are asymptotically distribution-free. The approximation methods proposed by Durbin (1985) can be applied to compute appropriate critical values for tests based on sup-norm statistics. The resulting tests have quite accurate size, a fact which has gone unrecognized in the econometrics literature. Some justification for this accuracy lies in the similar features that Durbin's approximation methods share with the theory of extrema for Gaussian random fields and for Gauss-Markov processes. These adjustment techniques are also related to the transformation methodology proposed by Khmaladze (1981) through the score function of the parametric model. Monte Carlo experiments suggest that these two testing strategies are roughly comparable to one another and more powerful than a simple bootstrap procedure.
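For readers who want a concrete baseline, the sketch below implements the simple parametric-bootstrap benchmark mentioned at the end of the abstract for a Kolmogorov-Smirnov-type statistic with estimated parameters; the normal location-scale null and the 499 bootstrap draws are assumptions for the example, and the Durbin and Khmaladze approaches studied in the paper are not reproduced.

import numpy as np
from scipy import stats

def ks_with_estimated_params(x):
    """Sup-norm (KS) statistic against a normal null with parameters estimated from x."""
    mu, sigma = x.mean(), x.std(ddof=1)
    return stats.kstest(x, "norm", args=(mu, sigma)).statistic

rng = np.random.default_rng(5)
x = rng.standard_normal(200)                     # data generated under the null
t_obs = ks_with_estimated_params(x)

# bootstrap the null distribution by simulating from the *fitted* null model
mu_hat, sig_hat = x.mean(), x.std(ddof=1)
boot = np.array([ks_with_estimated_params(rng.normal(mu_hat, sig_hat, len(x)))
                 for _ in range(499)])
print("statistic:", round(t_obs, 4),
      " bootstrap 95% critical value:", round(np.quantile(boot, 0.95), 4))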

]]>
http://www.ifs.org.uk/publications/5345 Sun, 07 Nov 2010 00:00:00 +0000
<![CDATA[Additive models for quantile regression: model selection and confidence bandaids]]> Additive models for conditional quantile functions provide an attractive framework for nonparametric regression applications focused on features of the response beyond its central tendency. Total variation roughness penalties can be used to control the smoothness of the additive components much as squared Sobolev penalties are used for classical L2 smoothing splines. We describe a general approach to estimation and inference for additive models of this type. We focus attention primarily on selection of smoothing parameters and on the construction of confidence bands for the nonparametric components. Both pointwise and uniform confidence bands are introduced; the uniform bands are based on the Hotelling (1939) tube approach. Some simulation evidence is presented to evaluate finite sample performance and the methods are also illustrated with an application to modeling childhood malnutrition in India.

]]>
http://www.ifs.org.uk/publications/5344 Sat, 06 Nov 2010 00:00:00 +0000
<![CDATA[A structural model of segregation in social networks]]> In this paper, I develop and estimate a dynamic model of strategic network formation with heterogeneous agents. While existing models have multiple equilibria, I prove the existence of a unique stationary equilibrium, which characterizes the likelihood of observing a specific network in the data. As a consequence, the structural parameters can be estimated using only one observation of the network at a single point in time. The estimation is challenging because the exact evaluation of the likelihood is computationally infeasible. To circumvent this problem, I propose a Bayesian Markov Chain Monte Carlo algorithm that avoids direct evaluation of the likelihood. This method drastically reduces the computational burden of estimating the posterior distribution and allows inference in high dimensional models.

I present an application to the study of segregation in school friendship networks, using data from Add Health containing the actual social networks of students in a representative sample of US schools. My results suggest that for white students, the value of a same-race friend decreases with the fraction of whites in the school. The opposite is true for African American students.

The model is used to study how different desegregation policies may affect the structure of the network in equilibrium. I find an inverted U-shaped relationship between the fraction of students belonging to a racial group and the expected equilibrium segregation levels. These results suggest that desegregation programs may decrease the degree of interracial interaction within schools.

]]>
http://www.ifs.org.uk/publications/5343 Fri, 05 Nov 2010 00:00:00 +0000
<![CDATA[Sparse models and methods for optimal instruments with an application to eminent domain]]> We develop results for the use of LASSO and Post-LASSO methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p, that apply even when p is much larger than the sample size, n. We rigorously develop asymptotic distribution and inference theory for the resulting IV estimators and provide conditions under which these estimators are asymptotically oracle-efficient. In simulation experiments, the LASSO-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the LASSO-based IV estimator substantially reduces estimated standard errors allowing one to draw much more precise conclusions about the economic effects of these decisions.
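A stylized sketch of the two-step idea described above: LASSO on many instruments to form a first-stage prediction of the endogenous regressor, then a standard IV formula using that prediction as the single instrument. The cross-validated penalty, the simulated data and all names are assumptions for illustration; the paper's data-driven penalty and inference theory are not reproduced.

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 200, 100
Z = rng.standard_normal((n, p))                  # many candidate instruments
pi = np.zeros(p); pi[:3] = [1.0, 0.7, 0.5]       # only a few are relevant (sparsity)
v = rng.standard_normal(n)
d = Z @ pi + v                                   # endogenous regressor
y = 1.5 * d + 0.8 * v + rng.standard_normal(n)   # structural error correlated with v

d_hat = LassoCV(cv=5).fit(Z, d).predict(Z)       # LASSO first stage
beta_iv = (d_hat @ y) / (d_hat @ d)              # IV using the fitted value as instrument
print("IV estimate of the coefficient on d (true value 1.5):", round(beta_iv, 3))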

Optimal instruments are conditional expectations; and in developing the IV results, we also establish a series of new results for LASSO and Post-LASSO estimators of non-parametric conditional expectation functions which are of independent theoretical and practical interest. Specifically, we develop the asymptotic theory for these estimators that allows for non-Gaussian, heteroscedastic disturbances, which is important for econometric applications. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for these estimators that are as sharp as in the homoscedastic Gaussian case under the weak condition that log p = o(n^{1/3}). Moreover, as a practical innovation, we provide a fully data-driven method for choosing the user-specified penalty that must be provided in obtaining LASSO and Post-LASSO estimates and establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.

]]>
http://www.ifs.org.uk/publications/5316 Fri, 22 Oct 2010 00:00:00 +0000
<![CDATA[Reserve price effects in auctions: estimates from multiple RD designs]]> We present evidence from 260,000 online auctions of second-hand cars to identify the impact of public reserve prices on auction outcomes. To establish causality, we exploit multiple discontinuities in the relationship between reserve prices and vehicle characteristics to present RD estimates of reserve price effects on auction outcomes. Our first set of results shows that, in line with the robust predictions of auction theory, an increase in the reserve price decreases the number of bidders, increases the likelihood the object remains unsold, and increases expected revenue conditional on sale. Reserve price effects are found to be larger when there are more entrants, and when the reserve price is lower to begin with. Our second set of results combines these estimates to calibrate the reserve price effect on the auctioneer's expected revenue. This reveals the auctioneer's reserve price policy to be locally optimal. Our final set of results provides novel evidence on reserve price effects on the composition of bidders. We find that an increase in the reserve price: (i) decreases the number of potential bidders as identified through individual web browsing histories; (ii) means that only more experienced and historically successful bidders still enter the auction; and (iii) leaves the characteristics of actual winners less sensitive to the reserve price than those of the average bidder, suggesting auction winners are not the marginal entrant.

]]>
http://www.ifs.org.uk/publications/5306 Fri, 15 Oct 2010 00:00:00 +0000
<![CDATA[Child poverty in the UK since 1998-99: lessons from the past decade]]> As a result of the Child Poverty Act (2010), current and future governments are committed to reducing the rate of relative income child poverty in the UK to 10% by 2020-21. This paper looks in detail at the progress made towards this goal under the previous Labour administrations. Direct tax and benefit reforms are very important in explaining at least three things: the large overall reduction in child poverty since 1998-99; the striking slowdown in progress towards the child poverty targets between 2004-05 and 2007-08; and some of the variation in child poverty trends between different groups of children. However, some of the child poverty-reducing impact of those reforms acted simply to stop child poverty rising as real earnings grew over the period, which increases median income and thus the relative poverty line. The performance of parents in the labour market is important too: between regions, parental employment and child poverty trends are closely related; the overall reduction in child poverty since 1998-99 has been helped by higher lone parent employment rates; and the overall rise in child poverty since 2004-05 has been most concentrated on children of one-earner couples, whose real earnings have fallen.

]]>
http://www.ifs.org.uk/publications/5303 Thu, 14 Oct 2010 00:00:00 +0000
<![CDATA[Estimating marginal returns to education]]> This paper estimates the marginal returns to college for individuals induced to enroll in college by different marginal policy changes. The recent instrumental variables literature seeks to estimate this parameter, but in general it does so only under strong assumptions that are tested and found wanting. We show how to utilize economic theory and local instrumental variables estimators to estimate the effect of marginal policy changes. Our empirical analysis shows that returns are higher for individuals more likely to attend college. We contrast the returns to well-defined marginal policy changes with IV estimates of the return to schooling. Some marginal policy changes inducing students into college produce very low returns.

]]>
http://www.ifs.org.uk/publications/5300 Mon, 11 Oct 2010 00:00:00 +0000
<![CDATA[What determines private school choice? a comparison between the UK and Australia]]> This paper compares patterns of private school attendance in the UK and Australia. About 6.5% of school children in the UK attend a private school, while 33% do so in Australia. We use comparable household panel data from the two countries to model attendance at a private school at age 15 or 16 as a function of household income and other child and parental characteristics. As one might expect, we observe a strong effect of household income on private school attendance. The addition of other household characteristics reduces this income elasticity, and reveals a strong degree of intergenerational transmission in both countries, with children being 8 percentage points more likely to attend a private school if one of their parents attended one in the UK, and anywhere up to 20 percentage points more likely in Australia. The analysis also reveals significant effects of parental education level, political preferences, religious background and the number of siblings on private school attendance.

]]>
http://www.ifs.org.uk/publications/5295 Thu, 30 Sep 2010 00:00:00 +0000
<![CDATA[The demand for private schooling in England: the impact of price and quality]]>

In this paper we use English school-level data from 1993 to 2008, aggregated up to small neighbourhood areas, to look at the determinants of the demand for private education in England from the ages of 7 until 15 (the last year of compulsory schooling). We focus on the relative importance of price and quality of schooling. However, there are likely to be unobservable factors that are correlated with private school prices and/or the quality of state schools that also impact on the demand for private schooling, which could bias our estimates. Our long regional and local authority panel data allow us to employ a number of strategies to deal with this potential endogeneity. Because of the likely presence of incidental trends in our unobservables, we employ a double difference system GMM approach to remove both fixed effects and incidental trends. We find that the demand for private schooling is inversely related to private school fees as well as to the quality of state schooling in the local area at the time families were making key schooling choice decisions at the ages of 7, 11 and 13. We estimate that a one standard deviation increase in the private school day fee when parents/students are making these key decisions reduces the proportion attending private schools by around 0.33 percentage points, which equates to an elasticity of around -0.26. This estimate is only significant for choices at age 7 (but the point estimates are very similar at the ages of 11 and 13). At age 11 and age 13, an increase in the quality of local state secondary schools reduces the probability of attending private schools. At age 11, a one standard deviation increase in state school quality reduces participation in private schools by 0.31 percentage points, which equates to an elasticity of -0.21. The effect at age 13 is slightly smaller, but still significant. Demand for private schooling at the ages of 8, 9, 10, 12, 14 and 15 is almost entirely determined by private school demand in the previous year for the same cohort, and price and quality do not impact significantly on this decision other than through their initial influence on the key participation decisions at the ages of 7, 11 and 13.

]]>
http://www.ifs.org.uk/publications/5294 Wed, 29 Sep 2010 00:00:00 +0000
<![CDATA[Conditions for the existence of control functions in nonseparable simultaneous equations models]]> The control function approach (Heckman and Robb (1985)) in a system of linear simultaneous equations provides a convenient procedure to estimate one of the functions in the system using reduced form residuals from the other functions as additional regressors. The conditions on the structural system under which this procedure can be used in nonlinear and nonparametric simultaneous equations has thus far been unknown. In this note, we define a new property of functions called control function separability and show it provides a complete characterization of the structural systems of simultaneous equations in which the control function procedure is valid.

]]>
http://www.ifs.org.uk/publications/5290 Tue, 28 Sep 2010 00:00:00 +0000
<![CDATA[Empirically probing the quantity-quality model]]> This paper tests whether family size has a causal effect on girls' education in Mexico. It exploits son preference as the main source of random variation in the propensity to have more children, and estimates causal effects using instrumental variables. Overall, it finds no evidence of family size having an adverse effect on education, once the endogeneity of family size is accounted for. Results are robust to another commonly used instrument in this literature, the occurrence of twin births. A divisive concern throughout this literature is that the instruments are invalid, so that inferences including policy recommendations may be misleading. An important contribution of this paper is to allow for the possibility that the instruments are invalid and to provide an answer to the question of just how much the assumption of instrument exogeneity drives findings. It concludes that the assumption of exogeneity does not affect the results that much, and the effects of family size on girls' schooling remain extremely modest at most.

]]>
http://www.ifs.org.uk/publications/5289 Tue, 28 Sep 2010 00:00:00 +0000
<![CDATA[Could education promote the Israeli-Palestinian peace process?]]> The goal of this paper is to measure Palestinians' attitudes towards a peace process and their determinants. One novelty is to define these attitudes as multidimensional and to measure them carefully using a flexible item response model. Results show that education, on which previous evidence appears contradictory, has a positive effect on attitudes towards concessions but a negative effect on attitudes towards reconciliation. This could occur if more educated people, who currently have very low returns to education, have more to gain from peace but are less willing to reconcile because of resentment acquired due to their experience.

]]>
http://www.ifs.org.uk/publications/5276 Tue, 21 Sep 2010 00:00:00 +0000
<![CDATA[Starting school and leaving welfare: the impact of public education on lone parents' welfare receipt]]> Please note: This paper was updated on 26 October 2010.

Childcare costs are often viewed as one of the biggest barriers to work, particularly among lone parents on low incomes. Children in England are eligible to attend free part-time nursery classes (equivalent to pre-kindergarten) from the academic term after they turn 3, and are typically eligible to start free fulltime public education on 1 September after they turn four. These rules mean that children born one day apart may start nursery classes up to four months apart, and may start school up to one year apart. We exploit these discontinuities to investigate the impact of a youngest child being eligible for part-time nursery education and full-time primary education on welfare receipt and employment patterns amongst lone parents receiving welfare. In contrast to previous studies, we are able to estimate the precise timing (relative to the date on which part-time or full-time education begins) of any impact on labour supply, by using rich administrative data. Amongst those receiving welfare when their youngest child is aged approximately three and a half, we find a small but significant effect of free full-time public education on both employment and welfare receipt (of around 2 percentage points, or 10-15 per cent), which peaks eight to nine months after the child becomes eligible (aged approximately 4 years and 9 months). We find weaker evidence of an even smaller effect of eligibility for part-time nursery education. This suggests that the expansion of public education programmes to younger disadvantaged children may only encourage a small number of low income lone parents to return to work (although, of course, this is not the primary aim of such programmes).

]]>
http://www.ifs.org.uk/publications/5275 Mon, 20 Sep 2010 00:00:00 +0000
<![CDATA[Non cooperative household demand]]> We study non cooperative household models with two agents and several voluntarily contributed public goods, deriving the counterpart to the Slutsky matrix and demonstrating the nature of the deviation of its properties from those of a true Slutsky matrix in the unitary model. We provide results characterising both cases in which there are and are not jointly contributed public goods. Demand properties are contrasted with those for collective models and conclusions drawn regarding the possibility of empirically testing the collective model against non cooperative alternatives and the non cooperative model against a general alternative.

]]>
http://www.ifs.org.uk/publications/5274 Sun, 19 Sep 2010 00:00:00 +0000
<![CDATA[Conditional cash transfers, women and the demand for food]]>

We examine the effect of large cash transfers on the consumption of food by poor households in rural Mexico. The transfers represent 20% of household income on average, and yet, the budget share of food is unchanged following receipt of this money. This is an important puzzle to solve, particularly so in the context of a social welfare programme designed in part to improve nutrition of individuals in the poorest households. We estimate an Engel curve for food. We rule out price increases, changes in the quality of food consumed and homotheticity of preferences as explanations for this puzzle. We also show that food is a necessity, with a strong negative effect of income on the food budget share. The decrease in food budget share caused by the large increase in income is cancelled by some other relevant aspect of the programme so that the net effect is nil. We argue that the program has not changed preferences and that there is no labelling of money. We propose that the key to the puzzle resides in the fact that the transfer is put in the hands of women and that the change in control over household resources is what leads to the observed changes in behaviour.

]]>
http://www.ifs.org.uk/publications/5273 Sat, 18 Sep 2010 00:00:00 +0000
<![CDATA[Quantile uncorrelation and instrumental regressions]]> We introduce a notion of median uncorrelation that is a natural extension of mean (linear) uncorrelation. A scalar random variable Y is median uncorrelated with a k-dimensional random vector X if and only if the slope from an LAD regression of Y on X is zero. Using this simple definition, we characterize properties of median uncorrelated random variables, and introduce a notion of multivariate median uncorrelation. We provide measures of median uncorrelation that are similar to the linear correlation coefficient and the coefficient of determination. We also extend this median uncorrelation to other loss functions. As two-stage least squares exploits mean uncorrelation between an instrument vector and the error to derive consistent estimators for parameters in linear regressions with endogenous regressors, the main result of this paper shows how a median uncorrelation assumption between an instrument vector and the error can similarly be used to derive consistent estimators in these linear models with endogenous regressors. We also show how median uncorrelation can be used in linear panel models with quantile restrictions and in linear models with measurement errors.
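In rough LaTeX, and with notation assumed here for illustration, the definition and its instrumental-variable use can be written as follows: Y is median uncorrelated with X when the LAD slope is zero,

\[
(a^{*}, b^{*}) \in \arg\min_{a,\,b}\; E\,\bigl|\,Y - a - X^{\top}b\,\bigr| \quad \text{with } b^{*} = 0 ,
\]

and, by analogy with the mean-uncorrelation condition E[z(y − x'β)] = 0 behind two-stage least squares, the estimator of β in y = x'β + ε is the value for which the LAD regression of y − x'β on the instrument vector z has zero slope.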

]]>
http://www.ifs.org.uk/publications/5272 Fri, 17 Sep 2010 00:00:00 +0000
<![CDATA[Explaining the socio-economic gradient in child outcomes: the intergenerational transmission of cognitive skills]]> Papers in this volume and elsewhere consistently find a strong relationship between children's cognitive abilities and their parents' socio-economic position (SEP). Most studies seeking to explain the paths through which SEP affects cognitive skills suffer from a potentially serious omitted variables problem, as they are unable to account for an important determinant of children's cognitive abilities, namely parental cognitive ability. A range of econometric strategies have been employed to overcome this issue, but in this paper, we adopt the very simple (but rarely available) route of using data that includes a range of typically unobserved characteristics, such as parental cognitive ability and social skills. In line with previous work on the intergenerational transmission of cognitive skills, we find that parental cognitive ability is a significant predictor of children's cognitive ability; moreover, it explains one sixth of the socio-economic gap in those skills, even after controlling for a rich set of demographic, attitudinal and behavioural factors. Despite the importance of parental cognitive ability in explaining children's cognitive ability, however, the addition of such typically unobserved characteristics does not alter our impression of the relative importance of other factors in explaining the socio-economic gap in cognitive skills. This is reassuring for studies that are unable to control for parental cognitive ability.

]]>
http://www.ifs.org.uk/publications/5270 Fri, 17 Sep 2010 00:00:00 +0000
<![CDATA[The role of attitudes and behaviours in explaining socio-economic differences in attainment at age 16]]> It is well known that children growing up in poor families leave school with considerably lower qualifications than children from better off backgrounds. Using a simple decomposition analysis, we show that around two thirds of the socio-economic gap in attainment at age 16 can be accounted for by long-run family background characteristics and prior ability, suggesting that circumstances and investments made considerably earlier in the child's life explain the majority of the gap in test scores between young people from rich and poor families. However, we also find that differences in the attitudes and behaviours of young people and their parents during the teenage years play a key role in explaining the rich-poor gap in GCSE attainment: together, they explain a further quarter of the gap at age 16, and the majority of the small increase in this gap between ages 11 and 16. On this basis, our results suggest that while the most effective policies in terms of raising the attainment of young people from poor families are likely to be those enacted before children reach secondary school, policies that aim to reduce differences in attitudes and behaviours between the poorest children and those from better-off backgrounds during the teenage years may also make a significant contribution towards lowering the gap in achievement between young people from the richest and poorest families at age 16.

]]>
http://www.ifs.org.uk/publications/5268 Thu, 16 Sep 2010 00:00:00 +0000
<![CDATA[Sharp identification regions in models with convex moment predictions]]>

We provide a tractable characterization of the sharp identification region of the parameters θ in a broad class of incomplete econometric models. Models in this class have set-valued predictions that yield a convex set of conditional or unconditional moments for the observable model variables. For short, we call these models with convex moment predictions. Examples include static, simultaneous-move finite games of complete and incomplete information in the presence of multiple equilibria; best linear predictors with interval outcome and covariate data; and random utility models of multinomial choice in the presence of interval regressor data. Given a candidate value for θ, we establish that the convex set of moments yielded by the model predictions can be represented as the Aumann expectation of a properly defined random set. The sharp identification region of θ, denoted Θ1, can then be obtained as the set of minimizers of the distance from a properly specified vector of moments of random variables to this Aumann expectation. Algorithms in convex programming can be exploited to efficiently verify whether a candidate θ is in Θ1. We use examples analyzed in the literature to illustrate the gains in identification and computational tractability afforded by our method.

This paper is a revised version of CWP27/09.

]]>
http://www.ifs.org.uk/publications/5271 Wed, 15 Sep 2010 00:00:00 +0000
<![CDATA[Education choices in Mexico: using a structural model and a randomized experiment to evaluate Progresa]]> In this paper we use an economic model to analyse data from a major social experiment, namely PROGRESA in Mexico, and to evaluate its impact on school participation. In the process we also show the usefulness of experimental data for estimating a structural economic model. The evaluation sample includes data from villages where the program was implemented and where it was not. The allocation was randomised for evaluation purposes. We estimate a structural model of education choices and argue that without such a framework it is impossible to evaluate the effect of the program and, especially, of possible changes to its structure. We also argue that the randomized component of the data allows us to identify a more flexible model that is better suited to evaluate the program. We find that the program has a positive effect on the enrollment of children, especially after primary school; this result is well replicated by the parsimonious structural model. We also find that a revenue neutral change in the program that would increase the grant for secondary school children while eliminating it for primary school children would have a substantially larger effect on secondary school enrollment, while having only minor effects on primary school enrollment.

]]>
http://www.ifs.org.uk/publications/5266 Tue, 14 Sep 2010 00:00:00 +0000
<![CDATA[Career progression and formal versus on-the-job training]]>

We evaluate the German apprenticeship system, which combines on-the-job training with classroom teaching, by modelling individual careers from the choice to join such a scheme through subsequent employment, job-to-job transitions and wages over the lifecycle. Our data are drawn from administrative records that accurately report job transitions and pay. We find that apprenticeships increase wages and change wage profiles, with more growth upfront, while wages in the non-apprenticeship sector grow at a lower rate but for longer. Non-apprentices face a much higher variance in the shocks to their match-specific effects and a substantially larger variance in the initial level of offered wages. We find no evidence that qualified apprentices are harder to reallocate following job loss. The average life-cycle return to an apprenticeship career is about 14% and the return is mainly driven by the differences in the wage profile.

]]>
http://www.ifs.org.uk/publications/5265 Mon, 13 Sep 2010 00:00:00 +0000
<![CDATA[Estimating households' willingness to pay]]> The recent literature has brought together the characteristics model of utility and classic revealed preference arguments to learn about consumers' willingness to pay. We incorporate market pricing equilibrium conditions into this setting. This allows us to use observed purchase prices and quantities on a large basket of products to learn about individual household's willingness to pay for characteristics, while maintaining a high degree of flexibility and also avoiding the biases that arise from inappropriate aggregation.

We illustrate the approach using scanner data on food purchases to estimate bounds on willingness to pay for the organic characteristic. We combine these estimates with information on households' stated preferences and beliefs to show that on average quality is the most important factor affecting bounds on household willingness to pay for organic, with health concerns coming second, and environmental concerns lagging far behind.

]]>
http://www.ifs.org.uk/publications/5236 Wed, 18 Aug 2010 00:00:00 +0000
<![CDATA[The asymptotic variance of semi-parametric estimators with generated regressors]]> We study the asymptotic distribution of three-step estimators of a finite dimensional parameter vector where the second step consists of one or more nonparametric regressions on a regressor that is estimated in the first step. The first step estimator is either parametric or non-parametric. Using Newey's (1994) path-derivative method we derive the contribution of the first step estimator to the influence function. In this derivation it is important to account for the dual role that the first step estimator plays in the second step non-parametric regression, i.e., that of conditioning variable and that of argument. We consider three examples in more detail: the partial linear regression model estimator with a generated regressor, the Heckman, Ichimura and Todd (1998) estimator of the Average Treatment Effect and a semi-parametric control variable estimator.

]]>
http://www.ifs.org.uk/publications/5232 Fri, 06 Aug 2010 00:00:00 +0000
<![CDATA[Analyzing social experiments as implemented: evidence from the HighScope Perry Preschool Program]]>

Social experiments are powerful sources of information about the effectiveness of interventions. In practice, initial randomization plans are almost always compromised. Multiple hypotheses are frequently tested. "Significant" effects are often reported with p-values that do not account for preliminary screening from a large candidate pool of possible effects. This paper develops tools for analyzing data from experiments as they are actually implemented.

We apply these tools to analyze the influential HighScope Perry Preschool Program. The Perry program was a social experiment that provided preschool education and home visits to disadvantaged children during their preschool years. It was evaluated by the method of random assignment. Both treatments and controls have been followed from age 3 through age 40.

Previous analyses of the Perry data assume that the planned randomization protocol was implemented. In fact, as in many social experiments, the intended randomization protocol was compromised. Accounting for compromised randomization, multiple-hypothesis testing, and small sample sizes, we find statistically significant and economically important program effects for both males and females. We also examine the representativeness of the Perry study.

]]>
http://www.ifs.org.uk/publications/5231 Mon, 02 Aug 2010 00:00:00 +0000
<![CDATA[Alternative models for moment inequalities]]> Behavioral choice models generate inequalities which, when combined with additional assumptions, can be used as a basis for estimation. This paper considers two sets of such assumptions and uses them in two empirical examples. The second example examines the structure of payments resulting from the upstream interactions in a vertical market. We then mimic the empirical setting for this example in a numerical analysis which computes actual equilibria, examines how their characteristics vary with the market setting, and compares them to the empirical results. The final section uses the numerical results in a Monte Carlo analysis of the robustness of the two approaches to estimation to their underlying assumptions.

]]>
http://www.ifs.org.uk/publications/5198 Tue, 13 Jul 2010 00:00:00 +0000
<![CDATA[Is it different for zeros? Discriminating between models for non-negative data with many zeros]]> In many economic applications, the variate of interest is non-negative and its distribution is characterized by a mass-point at zero and a long right-tail. Many regression strategies have been proposed to deal with data of this type. Although there has been a long debate in the literature on the appropriateness of different models, formal statistical tests to choose between the competing specifications, or to assess the validity of the preferred model, are not often used in practice. In this paper we propose a novel and simple regression-based specification test that can be used to test these models against each other.

]]>
http://www.ifs.org.uk/publications/5197 Mon, 12 Jul 2010 00:00:00 +0000
<![CDATA[Uniform confidence bands for functions estimated nonparametrically with instrumental variables]]> This paper is concerned with developing uniform confidence bands for functions estimated nonparametrically with instrumental variables. We show that a sieve nonparametric instrumental variables estimator is pointwise asymptotically normally distributed. The asymptotic normality result holds in both mildly and severely ill-posed cases. We present an interpolation method to obtain a uniform confidence band and show that the bootstrap can be used to obtain the required critical values. Monte Carlo experiments illustrate the finite-sample performance of the uniform confidence band.
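
For intuition only, here is a minimal sieve nonparametric IV sketch (hypothetical design and ad hoc sieve dimensions; the paper's interpolation-based bands and bootstrap critical values are not reproduced): the structural function is approximated on a polynomial sieve and the coefficients are estimated by two-stage least squares using a polynomial sieve in the instrument.

  import numpy as np

  rng = np.random.default_rng(1)
  n = 2000
  w = rng.uniform(-1, 1, n)                           # instrument
  u = rng.normal(size=n)
  x = 0.8 * w + 0.5 * u + 0.3 * rng.normal(size=n)    # endogenous regressor
  y = x ** 2 + x + u                                  # g(x) = x^2 + x, endogenous through u

  def poly_basis(z, k):
      return np.column_stack([z ** j for j in range(k + 1)])

  P, Q = poly_basis(x, 4), poly_basis(w, 6)           # sieve for g and for the instrument space
  P_hat = Q @ np.linalg.lstsq(Q, P, rcond=None)[0]    # first stage: project the basis of x on Q
  b = np.linalg.lstsq(P_hat, y, rcond=None)[0]        # second stage: 2SLS on the sieve
  print(b.round(2))   # should be close to (0, 1, 1, 0, 0), the coefficients of g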

This paper is a revised version of CWP18/09.

]]>
http://www.ifs.org.uk/publications/5196 Sun, 11 Jul 2010 00:00:00 +0000
<![CDATA[Optimal significance tests in simultaneous equation models]]> Consider testing the null hypothesis that a single structural equation has specified coefficients. The alternative hypothesis is that the relevant part of the reduced form matrix has proper rank, that is, that the equation is identified. The usual linear model with normal disturbances is invariant with respect to linear transformations of the endogenous and of the exogenous variables. When the disturbance covariance matrix is known, it can be set to the identity, and the invariance of the endogenous variables is with respect to orthogonal transformations. The likelihood ratio test is invariant with respect to these transformations and is the best invariant test. Furthermore it is admissible in the class of all tests. Any other test has lower power and/or higher significance level.

]]>
http://www.ifs.org.uk/publications/5182 Mon, 05 Jul 2010 00:00:00 +0000
<![CDATA[How demanding is the revealed preference approach to demand?]]> A well-known problem with revealed preference methods is that when data are found to satisfy their restrictions it is hard to know whether this should be viewed as a triumph for economic theory, or a warning that these conditions are so undemanding that almost anything goes. This paper allows researchers to make this distinction. Our approach builds on theoretical support in the form of an axiomatic cardinal characterisation of a measure of predictive success due to Selten (1991). We illustrate the idea using a large, nationally representative panel survey of Spanish consumers with broad commodity coverage. The results show that this approach to revealed preference methods can lead us radically to reassess our view of the empirical performance of economic theory.
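
Selten's (1991) measure of predictive success, as commonly stated (sketched here with made-up numbers), is the hit rate, the share of observations consistent with the theory's restrictions, minus the relative area, the share of all possible outcomes the theory permits; a theory can pass often and still score near zero if its restrictions rule out almost nothing.

  def predictive_success(hit_rate, relative_area):
      # Selten's measure m = r - a, as commonly stated; both arguments lie in [0, 1].
      return round(hit_rate - relative_area, 2)

  print(predictive_success(hit_rate=0.96, relative_area=0.90))   # 0.06: high pass rate, undemanding test
  print(predictive_success(hit_rate=0.80, relative_area=0.20))   # 0.6: lower pass rate, far more demanding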

]]>
http://www.ifs.org.uk/publications/5181 Sun, 27 Jun 2010 00:00:00 +0000
<![CDATA[An empirical model for strategic network formation]]> We develop and analyze a tractable empirical model for strategic network formation that can be estimated with data from a single network at a single point in time. We model the network formation as a sequential process where in each period a single randomly selected pair of agents has the opportunity to form a link. Conditional on such an opportunity, a link will be formed if both agents view the link as beneficial to them. They base their decision on their own characteristics, the characteristics of the potential partner, and on features of the current state of the network, such as whether the two potential partners already have friends in common. A key assumption is that agents do not take into account possible future changes to the network. This assumption avoids complications with the presence of multiple equilibria, and also greatly simplifies the computational burden of analyzing these models. We use Bayesian Markov chain Monte Carlo methods to obtain draws from the posterior distribution of interest. We apply our methods to a social network of 669 high school students, with, on average, 4.6 friends. We then use the model to evaluate the effect of an alternative assignment to classes on the topology of the network.

]]>
http://www.ifs.org.uk/publications/5180 Fri, 18 Jun 2010 00:00:00 +0000
<![CDATA[Nonparametric learning rules from bandit experiments: the eyes have it!]]> We estimate nonparametric learning rules using data from dynamic two-armed bandit (probabilistic reversal learning) experiments, supplemented with auxiliary eye-movement measures of subjects' beliefs. We apply recent econometric developments in the estimation of dynamic models. The direct estimation of learning rules differs from the usual modus operandi of the experimental literature. The estimated choice probabilities and learning rules from our nonparametric models have some distinctive features; notably that subjects tend to update in a non-smooth manner following positive 'exploitative' choices (those made in accordance with current beliefs). Simulation results show how the estimated nonparametric learning rules fit aspects of subjects' observed choice sequences better than alternative parameterized learning rules from Bayesian and reinforcement learning models.

]]>
http://www.ifs.org.uk/publications/5179 Thu, 10 Jun 2010 00:00:00 +0000
<![CDATA[Nonparametric identification of accelerated failure time competing risks models]]> We provide new conditions for identification of accelerated failure time competing risks models. These include Roy models and some auction models. In our set up, unknown regression functions and the joint survivor function of latent disturbance terms are all nonparametric. We show that this model is identified given covariates that are independent of latent errors, provided that a certain rank condition is satisfied. We present a simple example in which our rank condition for identification is verified. Our identification strategy does not depend on identification at infinity or near zero, and it does not require exclusion assumptions. Given our identification, we show estimation can be accomplished using sieves.

]]>
http://www.ifs.org.uk/publications/5178 Sat, 05 Jun 2010 00:00:00 +0000
<![CDATA[Post-l1-penalized estimators in high-dimensional linear regression models]]>

In this paper we study post-penalized estimators which apply ordinary, unpenalized linear regression to the model selected by first-step penalized estimators, typically LASSO. It is well known that LASSO can estimate the regression function at nearly the oracle rate, and is thus hard to improve upon. We show that post-LASSO performs at least as well as LASSO in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the LASSO-based model selection 'fails' in the sense of missing some components of the 'true' regression model. By the 'true' model we mean here the best s-dimensional approximation to the regression function chosen by the oracle. Furthermore, post-LASSO can perform strictly better than LASSO, in the sense of a strictly faster rate of convergence, if the LASSO-based model selection correctly includes all components of the 'true' model as a subset and also achieves a sufficient sparsity. In the extreme case, when LASSO perfectly selects the 'true' model, the post-LASSO estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by LASSO which guarantees that this dimension is at most of the same order as the dimension of the 'true' model. Our rate results are non-asymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the LASSO estimator in the first step, but also applies to other estimators, for example, the trimmed LASSO, Dantzig selector, or any other estimator with good rates and good sparsity. Our analysis covers both traditional trimming and a new practical, completely data-driven trimming scheme that induces maximal sparsity subject to maintaining a certain goodness-of-fit. The latter scheme has theoretical guarantees similar to those of LASSO or post-LASSO, but it dominates these procedures as well as traditional trimming in a wide variety of experiments.
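
A minimal sketch of the post-LASSO idea (using scikit-learn with an arbitrary fixed penalty, not the data-driven penalty choices analyzed in the paper): select the support with LASSO, then refit the selected columns by ordinary least squares to undo the shrinkage.

  import numpy as np
  from sklearn.linear_model import Lasso, LinearRegression

  rng = np.random.default_rng(0)
  n, p, s = 200, 100, 5
  X = rng.normal(size=(n, p))
  beta = np.zeros(p)
  beta[:s] = 1.0                                        # sparse "true" model
  y = X @ beta + rng.normal(size=n)

  lasso = Lasso(alpha=0.1).fit(X, y)                    # step 1: penalized selection
  selected = np.flatnonzero(lasso.coef_)                # support chosen by LASSO
  post = LinearRegression().fit(X[:, selected], y)      # step 2: unpenalized refit (post-LASSO)

  print(selected, np.round(post.coef_, 2))              # refit coefficients are not shrunk towards zero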

]]>
http://www.ifs.org.uk/publications/5176 Thu, 03 Jun 2010 00:00:00 +0000
<![CDATA[Perception and retrospection: the dynamic consistency of responses to survey questions on wellbeing]]> Implementation of broad approaches to welfare analysis usually entails the use of 'subjective' welfare indicators. We analyse BHPS data on financial wellbeing to determine whether reported current and retrospective perceptions are consistent with each other and with the existence of a common underlying wellbeing concept. We allow for adjustment of perceptions in a vector ARMA model for panel data, with dependent variables observed ordinally and find that current perceptions exhibit slow adjustment to changing circumstances and retrospective assessments of past wellbeing are heavily contaminated by current circumstances, causing significant bias in measures of the level and change in welfare.

]]>
http://www.ifs.org.uk/publications/5175 Wed, 02 Jun 2010 00:00:00 +0000
<![CDATA[Sharp identified sets for discrete variable IV models]]> Instrumental variable models for discrete outcomes are set, not point, identifying. The paper characterises identified sets of structural functions when endogenous variables are discrete. Identified sets are unions of large numbers of convex sets and may be neither convex nor connected. Each of the component sets is a projection of a convex set that resides in a much higher dimensional space onto the space in which a structural function resides. The paper develops a symbolic expression for this projection and gives a constructive demonstration that it is indeed the identified set. We provide a Mathematica™ notebook which computes the set symbolically. We derive properties of the set, suggest how the set can be used in practical econometric analysis when outcomes and endogenous variables are discrete and propose a method for estimating identified sets under parametric or shape restrictions. We develop an expression for a set of structural functions for the case in which endogenous variables are continuous or mixed discrete-continuous and show that this set contains all structural functions in the identified set in the non-discrete case.

]]>
http://www.ifs.org.uk/publications/4874 Tue, 18 May 2010 00:00:00 +0000
<![CDATA[Money, mentoring and making friends: the impact of a multidimensional access program on student performance]]> There is a well established socioeconomic gradient in educational attainment in all countries: young people with low socioeconomic status (SES) will, on average, receive less education and do less well at school. While this is true virtually everywhere, this SES gradient is noticeably higher in Ireland compared to other OECD countries despite much effort in recent decades to address this inequality. This study evaluates a university access program in Ireland that provides financial, academic and social support to low SES students both prior to and after entry to university. It uses a natural experiment involving the gradual roll-out of the program to identify its effect. The program has parallels with US Affirmative Action programs, although preferential treatment in this case is based on SES rather than ethnicity. Evaluating the effectiveness of programs targeting disadvantaged students in Ireland is particularly salient given the high rate of return to education and the lack of intergenerational mobility in educational attainment. Overall, we find positive treatment effects on first year exam performance, progression to second year and final year graduation rates, with the impact often stronger for higher ability students. We find similar patterns of results for students who entered through the regular system and for the 'affirmative action' group, i.e. students entering with lower high school grades. The program affects both male and female students, albeit in different ways. The study is unable to identify which specific component of the treatment is responsible for the effects, but we find no evidence that changes in the financial support have an effect on student outcomes. This study suggests that access programs can be an effective means of improving academic outcomes for socio-economically disadvantaged students.

]]>
http://www.ifs.org.uk/publications/4873 Mon, 17 May 2010 00:00:00 +0000
<![CDATA[Disability risk, disability insurance and life cycle behavior]]> The Disability Insurance (DI) program in the US is a large social insurance program that offers income replacement benefits to people with work-limiting disabilities. The proportion of DI claimants in the US is now almost 5% of the working-age population and the cost is three times that of unemployment insurance. The key questions in thinking about the size and growth of the DI program are whether program claimants are genuinely unable to work, and how valuable the insurance provided is.

This paper has three aims:

  1. We provide a framework for weighing up the insurance value of disability benefits against the incentive cost of inducing healthy individuals to stop work at different points of their life-cycle.
  2. We estimate the risks to health that may lead to work-limiting disabilities and the risk to wages that may lead to individuals choosing not to work. We also estimate the extent of false awards made through the DI program alongside the proportion of awards to those in genuine need.
  3. We use our model and estimates to characterize the economic effects of the disability insurance and to consider how policy reforms would affect behaviour and standard measures of household welfare.

We differentiate disability status by its severity, and show that a severe disability shock leads to a decline in wages of 40%, as well as a substantial rise in the fixed cost of going to work. In terms of the effectiveness of the DI program, we estimate high levels of rejections of genuine applicants. In our counterfactual simulations, this means that household welfare increases as the program becomes less strict, despite the worsening incentives for false applications that this implies. On the other hand, incentives for false applications are reduced by reducing generosity and increasing reassessments, and these policies increase household welfare, despite the worse insurance implied.

]]>
http://www.ifs.org.uk/publications/4872 Sat, 15 May 2010 00:00:00 +0000
<![CDATA[Econometric methods for research in education]]> This paper reviews some of the econometric methods that have been used in the economics of education. The focus is on understanding how the assumptions made to justify and implement such methods relate to the underlying economic model and the interpretation of the results. We start by considering the estimation of the returns to education both within the context of a dynamic discrete choice model inspired by Willis and Rosen (1979) and in the context of the Mincer model. We discuss the relationship between the econometric assumptions and economic behaviour. We then discuss methods that have been used in the context of assessing the impact of education quality, the teacher contribution to pupils' achievement and the effect of school quality on housing prices. In the process we also provide a summary of some of the main results in this literature.

]]>
http://www.ifs.org.uk/publications/4871 Fri, 14 May 2010 00:00:00 +0000
<![CDATA[Minimum wage setting and standards of fairness]]> We examine the setting of minimum wages, arguing that they can best be understood as a reflection of voters' notions of fairness. We arrive at this conclusion through an empirical investigation of the implications of three models, considered in the context of policy setting by sub-units in a federation: a competing interests group model; a constrained altruism model; and a fairness based model. In the latter model, voters are interested in banning what they view to be unfair transactions, with the notion of fairness based on comparisons to the "going" unskilled wage. We use data on minimum wages set in the ten Canadian provinces from 1969 to 2005 to carry out the investigation. A key implication of the models that is borne out in the data is that minimum wages should be set as a positive function of the location of the unskilled wage distribution. Together, the results indicate that minimum wages are set according to a "fairness" standard and that this may exacerbate movements in inequality.

]]>
http://www.ifs.org.uk/publications/4869 Thu, 13 May 2010 00:00:00 +0000
<![CDATA[Did the extension of the franchise increase the Liberal vote in Victorian Britain? Evidence from the Second Reform Act]]> We use evidence from the Second Reform Act, introduced in the United Kingdom in 1867, to analyze the impact on electoral outcomes of extending the vote to the unskilled urban population. By exploiting the sharp change in the electorate caused by franchise extension, we separate the effect of reform from that of underlying constituency level traits correlated with the voting population. Although we find that the franchise affected electoral competition and candidate selection, there is no evidence that relates Liberal electoral support to changes in the franchise rules. Our results are robust to various sources of endogeneity.

]]>
http://www.ifs.org.uk/publications/4868 Wed, 12 May 2010 00:00:00 +0000
<![CDATA[The price elasticity of charitable giving: does the form of tax relief matter?]]> This paper uses a survey-based approach to test alternative methods of channeling tax relief to donors - as a tax rebate for the donor or as a matched payment to the receiving charity. On accounting grounds these two are equivalent but, in line with earlier experimental studies, we find that gross donations are significantly more responsive to a match change than to a rebate change. We show that the difference can largely be explained by the fact that a majority of donors do not adjust their nominal donations in response to a change in subsidy. This evidence adds to the growing empirical literature suggesting that consumers may not react to tax changes. In the case of tax subsidies for donations, this has implications for policy design - we show for the UK that a match-based system is likely to be more effective at increasing money going to charities.

]]>
http://www.ifs.org.uk/publications/4867 Tue, 11 May 2010 00:00:00 +0000
<![CDATA[When you are born matters: the impact of date of birth on educational outcomes in England]]> This paper examines the impact of month of birth on national achievement test scores in England whilst children are in school, and on subsequent further and higher education participation. Using geographical variation in school admissions policies, we are able to split this difference into an age of starting school or length of schooling effect, and an age of sitting the test effect. We find that the month in which you are born matters for test scores at ages 7, 11, 14 and 16, with younger children performing significantly worse, on average, than their older peers. Furthermore, almost all of this difference is due to the fact that younger children sit exams up to one year earlier than older cohort members. The difference in test scores at age 16 potentially affects the number of pupils who stay on beyond compulsory schooling, with predictable labour market consequences. Indeed, we find that the impact of month of birth persists into higher education (college) decisions, with age 19/20 participation declining monotonically with month of birth. The fact that being young in your school year affects outcomes after the completion of compulsory schooling points to the need for urgent policy reform, to ensure that future cohorts of children are not adversely affected by the month of birth lottery inherent in the English education system.

]]>
http://www.ifs.org.uk/publications/4866 Mon, 10 May 2010 00:00:00 +0000
<![CDATA[Widening participation in higher education: analysis using linked administrative data]]> This paper makes use of newly linked administrative data to better understand the determinants of higher education participation amongst individuals from socio-economically disadvantaged backgrounds. It is unique in being able to follow two cohorts of students in England - those who took GCSEs in 2001-02 and 2002-03 - from age 11 to age 20. The findings suggest that while there remain large raw gaps in HE participation (and participation at high-status universities) by socio-economic status, these differences are substantially reduced once controls for prior attainment are included. Moreover, these findings hold for both state and private school students. This suggests that poor attainment in secondary schools is more important in explaining lower HE participation rates amongst students from disadvantaged backgrounds than barriers arising at the point of entry into HE. These findings highlight the need for earlier policy intervention to raise HE participation rates amongst disadvantaged youth.

]]>
http://www.ifs.org.uk/publications/4951 Sat, 01 May 2010 00:00:00 +0000
<![CDATA[Testing the correlated random coefficient model]]> The recent literature on instrumental variables (IV) features models in which agents sort into treatment status on the basis of gains from treatment as well as on baseline-pretreatment levels. Components of the gains known to the agents and acted on by them may not be known by the observing economist. Such models are called correlated random coefficient models. Sorting on unobserved components of gains complicates the interpretation of what IV estimates. This paper examines testable implications of the hypothesis that agents do not sort into treatment based on gains. In it, we develop new tests to gauge the empirical relevance of the correlated random coefficient model to examine whether the additional complications associated with it are required. We examine the power of the proposed tests. We derive a new representation of the variance of the instrumental variable estimator for the correlated random coefficient model. We apply the methods in this paper to the prototypical empirical problem of estimating the return to schooling and find evidence of sorting into schooling based on unobserved components of gains.

]]>
http://www.ifs.org.uk/publications/4846 Mon, 26 Apr 2010 00:00:00 +0000
<![CDATA[Estimating the technology of cognitive and noncognitive skill formation]]> This paper formulates and estimates multistage production functions for children's cognitive and noncognitive skills. Skills are determined by parental environments and investments at different stages of childhood. We estimate the elasticity of substitution between investments in one period and stocks of skills in that period to assess the benefits of early investment in children compared to later remediation. We establish nonparametric identification of a general class of production technologies based on nonlinear factor models with endogenous inputs. A by-product of our approach is a framework for evaluating childhood and schooling interventions that does not rely on arbitrarily scaled test scores as outputs and recognizes the differential effects of the same bundle of skills in different tasks. Using the estimated technology, we determine optimal targeting of interventions to children with different parental and personal birth endowments. Substitutability decreases in later stages of the life cycle in the production of cognitive skills. It is roughly constant across stages of the life cycle in the production of noncognitive skills. This finding has important implications for the design of policies that target the disadvantaged. For most configurations of disadvantage, it is optimal to invest relatively more in the early stages of childhood than in later stages.

]]>
http://www.ifs.org.uk/publications/4845 Sun, 25 Apr 2010 00:00:00 +0000
<![CDATA[Comparing IV with structural models: what simple IV can and cannot identify]]> This paper compares the economic questions addressed by instrumental variables estimators with those addressed by structural approaches. We discuss Marschak's Maxim: estimators should be selected on the basis of their ability to answer well-posed economic problems with minimal assumptions. A key identifying assumption that allows structural methods to be more informative than IV can be tested with data and does not have to be imposed.

]]>
http://www.ifs.org.uk/publications/4842 Sat, 24 Apr 2010 00:00:00 +0000
<![CDATA[A comparison of bias approximations for the 2SLS estimator]]> We consider the bias of the 2SLS estimator in the linear instrumental variables regression with one endogenous regressor only. By using asymptotic expansion techniques we approximate 2SLS coefficient estimation bias under various scenarios regarding the number and strength of instruments. The resulting approximation encompasses existing bias approximations, which are valid in particular cases only. Simulations show that the developed approximation gives an accurate description of the 2SLS bias in case of either weak or many instruments or both.
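
A small Monte Carlo illustration of the setting (hypothetical design; it reproduces the phenomenon, not the paper's analytic approximations): with one endogenous regressor and many weak instruments, the 2SLS estimator is noticeably biased in the direction of the OLS bias.

  import numpy as np

  rng = np.random.default_rng(0)
  n, k, reps = 200, 20, 500                 # many weak instruments
  pi = np.full(k, 0.05)                     # weak first-stage coefficients
  beta, errors = 1.0, []

  for _ in range(reps):
      Z = rng.normal(size=(n, k))
      u = rng.normal(size=n)
      v = 0.8 * u + 0.6 * rng.normal(size=n)             # endogeneity through corr(u, v)
      x = Z @ pi + v
      y = beta * x + u
      xhat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]    # first-stage fitted values
      errors.append((xhat @ y) / (xhat @ x) - beta)      # 2SLS error, single regressor, no intercept

  print(np.mean(errors))   # clearly positive: 2SLS inherits part of the upward OLS bias here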

]]>
http://www.ifs.org.uk/publications/4835 Wed, 21 Apr 2010 00:00:00 +0000
<![CDATA[Earnings, consumption and lifecycle choices]]> We discuss recent developments in the literature that studies how the dynamics of earnings and wages affect consumption choices over the life cycle. We start by analyzing the theoretical impact of income changes on consumption - highlighting the role of persistence, information, size and insurability of changes in economic resources. We next examine the empirical contributions, distinguishing between papers that use only income data and those that use both income and consumption data. The latter do this for two purposes. First, one can make explicit assumptions about the structure of credit and insurance markets and identify the income process or the information set of the individuals. Second, one can assume that the income process or the amount of information that consumers have is known and test the implications of the theory. In general there is an identification issue that is only recently being addressed, with better data or better "experiments". We conclude with a discussion of the literature that endogenizes people's earnings and therefore changes the nature of risk faced by households.

]]>
http://www.ifs.org.uk/publications/4821 Fri, 16 Apr 2010 00:00:00 +0000
<![CDATA[Occupational pension value in the public and private sectors]]> It is well known that in the UK defined benefit pensions are more prevalent in the public sector than in the private sector. Furthermore, we find that the average value of accrual to members of both defined benefit pensions and defined contribution pensions is lower in the private sector than in the public sector. As a result of both these factors, we find that the average value of pension accrual is much higher in the public sector than in the private sector. Because the long-running shift in the private sector away from defined benefit pensions towards less generous workplace defined contribution pensions continued between 2001 and 2005, the difference in average pension accrual between the sectors increased over this period. While on average over this period earnings in the public sector grew 3.5% faster than in the private sector, including pension accrual increases this difference by one-third, to 4.7%. We simulate a plausible reform to public sector defined benefit pensions - an increase in the normal pension age from 60 to 65 for future pension accrual of all current members. We find that, had this reform been implemented between 2001 and 2005, average growth in total remuneration over this period in the public sector would actually have been almost the same as that in the private sector.

]]>
http://www.ifs.org.uk/publications/4804 Thu, 01 Apr 2010 00:00:00 +0000
<![CDATA[Spatial circular matrices, with applications]]> The cumulants of the quadratic forms associated to the so-called spatial design matrices are often needed for inference in the context of isotropic processes on uniform grids. Unfortunately, because the eigenvalues of the matrices involved are generally unknown, the computation of the cumulants may be very demanding if the grids are large. This paper constructs circular counterparts, with known eigenvalues, to the spatial design matrices. It then studies some of their properties, and analyzes their performance in a number of applications.

]]>
http://www.ifs.org.uk/publications/4786 Mon, 08 Mar 2010 00:00:00 +0000
<![CDATA[Releasing jobs for the young? Early retirement and youth unemployment in the United Kingdom]]> This paper tries to assess whether or not we have any empirical evidence of links between early retirement and youth unemployment. Most economists would today dismiss the idea immediately as another version of the naïve 'lump-of-labor fallacy'. In its most basic form, this proposition holds that there is a fixed supply of jobs and that any reduction in labor supply will reduce unemployment by offering jobs to those who are looking for one. Taken to the extreme, this view would support the idea that a high level of employment of one group of individuals can only come at the expense of another group: if, for instance, the population of a country were to increase, younger individuals would be unemployed as older individuals would not 'release' enough jobs for the new entrants. The absurdity of this view in the long term is easily seen by considering the fact that the size of a country bears no relation to the share of its population that is unemployed.

]]>
http://www.ifs.org.uk/publications/4785 Fri, 05 Mar 2010 00:00:00 +0000
<![CDATA[Optimal bandwidth choice for the regression discontinuity estimator]]> We investigate the problem of optimal choice of the smoothing parameter (bandwidth) for the regression discontinuity estimator. We focus on estimation by local linear regression, which was shown to be rate optimal (Porter, 2003). We derive the optimal bandwidth. This optimal bandwidth depends on unknown functionals of the distribution of the data and we propose specific, consistent, estimators for these functionals to obtain a fully data-driven bandwidth choice that has the "asymptotic no-regret" property. We illustrate our proposed bandwidth, and the sensitivity to the choices made in this bandwidth proposal, using a data set previously analyzed by Lee (2008), as well as a small simulation study based on the Lee data set. The simulations suggest that the proposed rule performs well.
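
To fix ideas, a sketch of the object the bandwidth is chosen for (hypothetical simulated data with a hand-picked h; the paper's data-driven bandwidth and its plug-in estimators are not reproduced): the jump at the cutoff estimated by local linear regression on each side.

  import numpy as np

  def rd_local_linear(y, x, cutoff, h):
      # Sharp RD effect: difference of local linear intercepts at the cutoff,
      # estimated separately on each side with a triangular kernel of bandwidth h.
      limits = {}
      for side, mask in (("right", x >= cutoff), ("left", x < cutoff)):
          d = x[mask] - cutoff
          w = np.clip(1 - np.abs(d) / h, 0, None)
          X = np.column_stack([np.ones(d.size), d])
          coef = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y[mask]))
          limits[side] = coef[0]
      return limits["right"] - limits["left"]

  rng = np.random.default_rng(0)
  x = rng.uniform(-1, 1, 2000)
  y = 0.5 * x + 2.0 * (x >= 0) + rng.normal(scale=0.5, size=2000)
  print(rd_local_linear(y, x, cutoff=0.0, h=0.3))   # roughly 2, the size of the simulated jump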

]]>
http://www.ifs.org.uk/publications/4784 Thu, 04 Mar 2010 00:00:00 +0000
<![CDATA[On the dynamics of unemployment and wage distributions]]> Postel-Vinay and Robin's (2002) sequential auction model is extended to allow for aggregate productivity shocks. Workers exhibit permanent differences in ability while firms are identical. Negative aggregate productivity shocks induce job destruction by driving the surplus of matches with low ability workers to negative values. Endogenous job destruction coupled with worker heterogeneity thus provides a mechanism for amplifying productivity shocks that offers an original solution to the unemployment volatility puzzle (Shimer, 2005). Moreover, positive or negative shocks may lead employers and employees to renegotiate low wages up and high wages down when agents' individual surpluses become negative. The model delivers rich business cycle dynamics of wage distributions and explains why both low wages and high wages are more procyclical than wages in the middle of the distribution, and why wage inequality may be countercyclical, as the data seem to suggest.

]]>
http://www.ifs.org.uk/publications/4783 Wed, 03 Mar 2010 00:00:00 +0000
<![CDATA[Genetic markers as instrumental variables: an application to child fat mass and academic achievement]]> The use of genetic markers as instrumental variables (IV) is receiving increasing attention from economists. This paper examines the conditions that need to be met for genetic variants to be used as instruments. We combine the IV literature with that from genetic epidemiology, with an application to child adiposity (fat mass, determined by a dual-energy X-ray absorptiometry (DXA) scan) and academic performance. OLS results indicate that leaner children perform slightly better in school tests compared to their more adipose counterparts, but the IV findings show no evidence that fat mass affects academic outcomes.

]]>
http://www.ifs.org.uk/publications/4782 Tue, 02 Mar 2010 00:00:00 +0000
<![CDATA[Identification of causal effects on binary outcomes using structural mean models]]> Structural mean models (SMMs) were originally formulated to estimate causal effects among those selecting treatment in randomised controlled trials affected by non-ignorable non-compliance. It has already been established that SMM estimators identify these causal effects in randomised placebo-controlled trials where no-one assigned to the control group can receive the treatment. However, SMMs are starting to be used for randomised controlled trials without placebo-controls, and for instrumental variable analysis of observational studies; for example, Mendelian randomisation studies, and studies where physicians select patients' treatments. In such scenarios, identification depends on the assumption of no effect modification, namely, the causal effect is equal for the subgroups defined by the instrument. We consider the nature of this assumption by showing how it depends crucially on the underlying causal model generating the data, which in applications is almost always unknown. If its no effect modification assumption does not hold then an SMM estimator does not estimate its associated causal effect. However, if treatment selection is monotonic we highlight that additive and multiplicative SMMs do identify local (or complier) causal effects, but that the double-logistic SMM estimator does not without further assumptions. We clarify the proper interpretation of inferences from SMM estimators using a data example and simulation study.

]]>
http://www.ifs.org.uk/publications/4781 Mon, 01 Mar 2010 00:00:00 +0000
<![CDATA[Identification of treatment response with social interactions]]>

This paper develops a formal language for study of treatment response with social interactions, and uses it to obtain new findings on identification of potential outcome distributions. Defining a person's treatment response to be a function of the entire vector of treatments received by the population, I study identification when shape restrictions and distributional assumptions are placed on response functions. An early key result is that the traditional assumption of individualistic treatment response (ITR) is a polar case within the broad class of constant treatment response (CTR) assumptions, the other pole being unrestricted interactions. Important non-polar cases are interactions within reference groups and distributional interactions. I show that established findings on identification under assumption ITR extend to assumption CTR. These include identification with assumption CTR alone and when this shape restriction is strengthened to semi-monotone response. I next study distributional assumptions using instrumental variables. Findings obtained previously under assumption ITR extend when assumptions of statistical independence (SI) are posed in settings with social interactions. However, I find that random assignment of realized treatments generically has no identifying power when some persons are leaders who may affect outcomes throughout the population. Finally, I consider use of models of endogenous social interactions to derive restrictions on response functions. I emphasize that identification of potential outcome distributions differs from the longstanding econometric concern with identification of structural functions.

This paper is a revised version of CWP01/10

]]>
http://www.ifs.org.uk/publications/4747 Tue, 09 Feb 2010 00:00:00 +0000
<![CDATA[Employment protection legislation, multinational firms and innovation]]> The theoretical effect of labour regulations such as employment protection legislation (EPL) on innovation is ambiguous, and empirical evidence has thus far been inconclusive. EPL increases job security, and the greater enforceability of job contracts may increase worker investment in innovative activity. On the other hand, EPL increases the adjustment costs faced by firms, and this may lead to under-investment in activities that are likely to require adjustment, including technologically advanced innovation. In this paper we find empirical evidence that both effects are at work: multinational enterprises locate more innovative activity in countries with high EPL, but they locate more technologically advanced innovation in countries with low EPL.

This research is forthcoming in the Review of Economics and Statistics.]]>
http://www.ifs.org.uk/publications/4715 Mon, 04 Jan 2010 00:00:00 +0000
<![CDATA[IV models of ordered choice]]> This paper studies single equation instrumental variable models of ordered choice in which explanatory variables may be endogenous. The models are weakly restrictive, leaving unspecified the mechanism that generates endogenous variables. These incomplete models are set, not point, identifying for parametrically (e.g. ordered probit) or nonparametrically specified structural functions. The paper gives results on the properties of the identified set for the case in which potentially endogenous explanatory variables are discrete. The results are used as the basis for calculations showing the rate of shrinkage of identified sets as the number of classes in which the outcome is categorised increases.

]]>
http://www.ifs.org.uk/publications/4685 Wed, 09 Dec 2009 00:00:00 +0000
<![CDATA[Nonparametric tests of conditional treatment effects]]>

We develop a general class of nonparametric tests for treatment effects conditional on covariates. We consider a wide spectrum of null and alternative hypotheses regarding conditional treatment effects, including (i) the null hypothesis of conditional stochastic dominance between treatment and control groups; (ii) the null hypothesis that the conditional average treatment effect is positive for each value of covariates; and (iii) the null hypothesis of no distributional (or average) treatment effect conditional on covariates against a one-sided (or two-sided) alternative hypothesis. The test statistics are based on L1-type functionals of uniformly consistent nonparametric kernel estimators of conditional expectations that characterize the null hypotheses. Using the Poissonization technique of Giné et al. (2003), we show that suitably studentized versions of our test statistics are asymptotically standard normal under the null hypotheses and also show that the proposed nonparametric tests are consistent against general fixed alternatives. Furthermore, it turns out that our tests have non-negligible power against some local alternatives that differ from the null hypotheses at the rate n^{-1/2}, where n is the sample size. We provide a more powerful test for the case when the null hypothesis may be binding only on a strict subset of the support and also consider an extension to testing for quantile treatment effects. We illustrate the usefulness of our tests by applying them to data from a randomized job training program (LaLonde, 1986) and by carrying out Monte Carlo experiments based on this dataset.
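
A rough sketch of the kind of statistic involved (hypothetical randomized design; the studentization, Poissonization argument and critical values from the paper are not reproduced): estimate the conditional average treatment effect by kernel regression in each arm and aggregate an L1-type functional of its negative part, which should be close to zero under the null that the conditional effect is nonnegative.

  import numpy as np

  def nw(x0, x, y, h):
      # Nadaraya-Watson estimate of E[y | x] at the points x0.
      k = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
      return (k * y[None, :]).sum(axis=1) / k.sum(axis=1)

  rng = np.random.default_rng(0)
  n = 1000
  x = rng.uniform(0, 1, n)
  d = rng.integers(0, 2, n)                      # randomized treatment indicator
  y = d * x + rng.normal(size=n)                 # true conditional effect tau(x) = x >= 0

  grid = np.linspace(0.05, 0.95, 50)
  tau_hat = nw(grid, x[d == 1], y[d == 1], 0.1) - nw(grid, x[d == 0], y[d == 0], 0.1)
  stat = np.mean(np.clip(-tau_hat, 0, None))     # L1-type functional of the negative part
  print(stat)                                    # near zero when the one-sided null holds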

]]>
http://www.ifs.org.uk/publications/4684 Sun, 06 Dec 2009 00:00:00 +0000
<![CDATA[Well-posedness of measurement error models for self-reported data]]> It is widely admitted that the inverse problem of estimating the distribution of a latent variable X* from an observed sample of X, a contaminated measurement of X*, is ill-posed. This paper shows that measurement error models for self-reported data are well-posed, assuming the probability of reporting truthfully is nonzero, which is an observed property in validation studies. This optimistic result suggests that one should not ignore the point mass at zero in the error distribution when modeling measurement errors in self-reported data. We also illustrate that classical measurement error models may in fact be conditionally well-posed given prior information on the distribution of the latent variable X*. Using both a Monte Carlo study and an empirical application, we show that failing to account for this property can lead to significant bias in the estimation of the distribution of X*.

]]>
http://www.ifs.org.uk/publications/4683 Thu, 03 Dec 2009 00:00:00 +0000
<![CDATA[Endogenous semiparametric binary choice models with heteroscedasticity]]>

In this paper we consider endogenous regressors in the binary choice model under a weak median exclusion restriction, but without further specification of the distribution of the unobserved random components. Our reduced form specification with heteroscedastic residuals covers various heterogeneous structural binary choice models. As a particularly relevant example of a structural model for which no semiparametric estimator has yet been analyzed, we consider the binary random utility model with endogenous regressors and heterogeneous parameters. We employ a control function IV assumption to establish identification of the slope parameter by the mean ratio of derivatives of two functions of the instruments. We propose an estimator based on direct sample counterparts, and discuss the large sample behavior of this estimator. In particular, we show √n-consistency and derive the asymptotic distribution. In the same framework, we propose tests for heteroscedasticity, overidentification and endogeneity. We analyze the small sample performance through a simulation study. An application of the model to discrete choice demand data concludes this paper.

]]>
http://www.ifs.org.uk/publications/4682 Wed, 02 Dec 2009 00:00:00 +0000
<![CDATA[Nonparametric identification in nonseparable panel data models with generalized fixed effects]]> This paper is concerned with extending the familiar notion of fixed effects to nonlinear setups with infinite dimensional unobservables like preferences. The main result is that a generalized version of differencing identifies local average structural derivatives (LASDs) in very general nonseparable models, while allowing for arbitrary dependence between the persistent unobservables and the regressors of interest even if there are only two time periods. These quantities specialize to well known objects like the slope coefficient in the semiparametric panel data binary choice model with fixed effects. We extend the basic framework to include dynamics in the regressors and time trends, and show how distributional effects as well as average effects are identified. In addition, we show how to handle endogeneity in the transitory component. Finally, we adapt our results to the semiparametric binary choice model with correlated coefficients, and establish that average structural marginal probabilities are identified. We conclude this paper by applying the last result to a real world data example. Using the PSID, we analyze the way in which the lending restrictions for mortgages eased between 2000 and 2004.

]]>
http://www.ifs.org.uk/publications/4681 Tue, 01 Dec 2009 00:00:00 +0000
<![CDATA[How many consumers are rational?]]> Rationality places strong restrictions on individual consumer behavior. This paper is concerned with assessing the validity of the integrability constraints imposed by standard utility maximization, arising in classical consumer demand analysis. More specifically, we characterize the testable implications of negative semidefiniteness and symmetry of the Slutsky matrix across a heterogeneous population without assuming anything on the functional form of individual preferences. In the same spirit, homogeneity of degree zero is also considered. Our approach employs nonseparable models and is centered around a conditional independence assumption, which is sufficiently general to allow for endogenous regressors. It is the only substantial assumption a researcher has to specify in this model, and has to be evaluated with particular care. Finally, we apply all concepts to British household data: we show that rationality is an acceptable description for large parts of the population, regardless of whether we test on single or composite households.

]]>
http://www.ifs.org.uk/publications/4680 Mon, 30 Nov 2009 00:00:00 +0000
<![CDATA[Taxation of human capital and wage inequality: a cross-country analysis]]> Wage inequality has been significantly higher in the United States than in continental European countries (CEU) since the 1970s. Moreover, this inequality gap has further widened during this period as the US has experienced a large increase in wage inequality, whereas the CEU has seen only modest changes. This paper studies the role of labor income tax policies for understanding these facts. We begin by documenting two new empirical facts that link these inequality differences to tax policies. First, we show that countries with more progressive labor income tax schedules have significantly lower before-tax wage inequality at different points in time. Second, progressivity is also negatively correlated with the rise in wage inequality during this period. We then construct a life cycle model in which individuals decide each period whether to go to school, work, or be unemployed. Individuals can accumulate skills either in school or while working. Wage inequality arises from differences across individuals in their ability to learn new skills as well as from idiosyncratic shocks. Progressive taxation compresses the (after-tax) wage structure, thereby distorting the incentives to accumulate human capital, in turn reducing the cross-sectional dispersion of (before-tax) wages. We find that these policies can account for half of the difference between the US and the CEU in overall wage inequality and 76% of the difference in inequality at the upper end (log 90-50 differential). When this economy experiences skill-biased technological change, progressivity also dampens the rise in wage dispersion over time. The model explains 41% of the difference in the total rise in inequality and 58% of the difference at the upper end.

]]>
http://www.ifs.org.uk/publications/4664 Mon, 23 Nov 2009 00:00:00 +0000
<![CDATA[Understanding the wage patterns of Canadian less skilled workers: the role of implicit contracts]]> We examine the wage patterns of Canadian less skilled male workers over the last quarter century by organizing workers into job entry cohorts. We find entry wages for successive cohorts declined until 1997, and then began to recover. Wage profiles steepened for cohorts entering after 1997, but not for cohorts entering in the 1980s - a period when start wages were relatively high. We argue that these patterns are consistent with a model of implicit contracts with recontracting in which a worker's current wage is determined by the best labour market conditions experienced during the current job spell.

]]>
http://www.ifs.org.uk/publications/4663 Sun, 22 Nov 2009 00:00:00 +0000
<![CDATA[Ability, parental valuation of education and the high school dropout decision]]> We use a large, rich Canadian micro-level dataset to examine the channels through which family socio-economic status and unobservable characteristics affect children's decisions to drop out of high school. First, we document the strength of observable socio-economic factors: our data suggest that teenage boys with two parents who are themselves high school dropouts have a 16 per cent chance of dropping out, compared to a dropout rate of less than 1 per cent for boys whose parents both have a university degree. We examine the channels through which this socio-economic gradient arises using an extended version of the factor model set out in Carneiro, Hansen, and Heckman (2003). Specifically, we consider the impact of cognitive and non-cognitive ability and the value that parents place on education. Our results support three main conclusions. First, cognitive ability at age 15 has a substantial impact on dropping out. The highest ability individuals are predicted never to drop out regardless of parental education or parental valuation of education. In contrast, the lowest ability teenagers have a probability of dropping out of approximately .36 if their parents have a low valuation of education. Second, parental valuation of education has a substantial impact on medium and low ability teenagers. A low ability teenager has a probability of dropping out of approximately .03 if his parents place a high value on education but .36 if their educational valuation is low. These effects are estimated while conditioning on ability at age 15. Thus, under some assumptions, they reflect parental influences during the upper teenage years and are in addition to any impact they might have in the early childhood years leading up to age 15. Third, parental education has no direct effect on dropping out once we control for ability and parental valuation of education. Overall, our results point to the importance of whatever determines ability at age 15 (including, potentially, early childhood interventions) and of parental valuation of education during the teenage years. Our work also provides a small methodological contribution by extending the standard factor based estimator to allow a more non-linear relationship between the factors and a co-variate of interest. We show that allowing for non-linearities has a substantial impact on estimated effects.

]]>
http://www.ifs.org.uk/publications/4662 Sat, 21 Nov 2009 00:00:00 +0000
<![CDATA[Nonparametric identification in asymmetric second-price auctions: a new approach]]> This paper proposes an approach to proving nonparametric identification for distributions of bidders' values in asymmetric second-price auctions. I consider the case when bidders have independent private values and the only available data pertain to the winner's identity and the transaction price. My proof of identification is constructive and is based on establishing the existence and uniqueness of a solution to the system of non-linear differential equations that describes relationships between unknown distribution functions and observable functions. The proof is conducted in two logical steps. First, I prove the existence and uniqueness of a local solution. Then I describe a method that extends this local solution to the whole support. This paper delivers other interesting results. I show how this approach can be applied to obtain identification in more general auction settings, for instance, in auctions with stochastic number of bidders or weaker support conditions. Furthermore, I demonstrate that my results can be extended to generalized competing risks models. Moreover, contrary to results in classical competing risks (Roy model), I show that in this generalized class of models it is possible to obtain implications that can be used to check whether the risks in a model are dependent. Finally, I provide a sieve minimum distance estimator and show that it consistently estimates the underlying valuation distribution of interest.

]]>
http://www.ifs.org.uk/publications/4654 Thu, 05 Nov 2009 00:00:00 +0000
<![CDATA[Identification region of the potential outcome distributions under instrument independence]]> This paper examines the identification power of the instrument exogeneity assumption in the treatment effect model. We derive the identification region: the set of potential outcome distributions that are compatible with data and the model restriction. The model restrictions whose identifying power is investigated are (i) instrument independence of each of the potential outcomes (marginal independence), (ii) joint independence of the instrument and the potential outcomes and the selection heterogeneity, and (iii) instrument monotonicity in addition to (ii) (the LATE restriction of Imbens and Angrist (1994)), where these restrictions become stronger in the order of listing. By comparing the size of the identification region under each restriction, we show that the joint independence restriction can provide further identifying information for the potential outcome distributions than marginal independence, but the LATE restriction never does since it solely constrains the distribution of data. We also derive the tightest possible bounds for the average treatment effects under each restriction. Our analysis covers both the discrete and continuous outcome cases, and extends the treatment effect bounds of Balke and Pearl (1997), which are available only for the binary outcome case, to a wider range of settings including the continuous outcome case.

]]>
http://www.ifs.org.uk/publications/4639 Mon, 12 Oct 2009 00:00:00 +0000
<![CDATA[Quantile and average effects in nonseparable panel models]]>

This paper gives identification and estimation results for quantile and average effects in nonseparable panel models, when the distribution of period specific disturbances does not vary over time. Bounds are given for interesting effects with discrete regressors that are strictly exogenous or predetermined. We allow for location and scale time effects and show how monotonicity can be used to shrink the bounds. We derive rates at which the bounds tighten as the number T of time series observations grows and give an empirical illustration.

]]>
http://www.ifs.org.uk/publications/4640 Fri, 09 Oct 2009 00:00:00 +0000
<![CDATA[Semiparametric efficiency bound for models of sequential moment restrictions containing unknown functions]]> This paper computes the semiparametric efficiency bound for finite dimensional parameters identified by models of sequential moment restrictions containing unknown functions. Our results extend those of Chamberlain (1992b) and Ai and Chen (2003) for semiparametric conditional moment restriction models with identical information sets to the case of nested information sets, and those of Chamberlain (1992a) and Brown and Newey (1998) for models of sequential moment restrictions without unknown functions to cases with unknown functions of possibly endogenous variables. Our bound results are applicable to semiparametric panel data models and semiparametric two stage plug-in problems. As an example, we compute the efficiency bound for a weighted average derivative of a nonparametric instrumental variables (IV) regression, and find that the simple plug-in estimator is not efficient. Finally, we present an optimally weighted, orthogonalized, sieve minimum distance estimator that achieves the semiparametric efficiency bound.

]]>
http://www.ifs.org.uk/publications/4638 Mon, 05 Oct 2009 00:00:00 +0000
<![CDATA[Sharp identification regions in models with convex predictions: games, individual choice, and incomplete data]]> We provide a tractable characterization of the sharp identification region of the parameters θ in a broad class of incomplete econometric models. Models in this class have set-valued predictions that yield a convex set of conditional or unconditional moments for the model variables. In short, we refer to these as models with convex predictions. Examples include static, simultaneous-move finite games of complete information in the presence of multiple mixed strategy Nash equilibria; random utility models of multinomial choice in the presence of interval regressor data; and best linear predictors with interval outcome and covariate data. Given a candidate value for θ, we establish that the convex set of moments yielded by the model predictions can be represented as the Aumann expectation of a properly defined random set. The sharp identification region of θ, denoted ΘI, can then be obtained as the set of minimizers of the distance from a properly specified vector of moments of random variables to this Aumann expectation. We show that algorithms in convex programming can be exploited to efficiently verify whether a candidate θ is in ΘI. We use examples analyzed in the literature to illustrate the gains in identification and computational tractability afforded by our method.
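
The best-linear-predictor example with an interval outcome admits a particularly simple illustration: the slope coefficient is a linear functional of the unobserved outcome, so its sharp bounds over the box of admissible outcomes are attained at corner points. The sketch below computes these bounds directly on simulated data; it is a special case worked by hand rather than an implementation of the paper's Aumann-expectation characterization, and all data-generating choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: scalar regressor x, outcome observed only as an interval [y_lo, y_hi].
n = 500
x = rng.normal(size=n)
y_true = 1.0 + 0.5 * x + rng.normal(scale=0.5, size=n)
y_lo = np.floor(y_true)          # e.g. bracketed / rounded-down reports
y_hi = y_lo + 1.0                # interval of width one

# The best-linear-predictor slope is a linear functional of the unobserved y:
#   slope(y) = sum_i w_i y_i,  with  w_i = (x_i - xbar) / sum_j (x_j - xbar)^2,
# so over the box [y_lo, y_hi] its minimum and maximum are attained at the corners.
w = (x - x.mean()) / ((x - x.mean()) ** 2).sum()
slope_min = np.where(w > 0, y_lo, y_hi) @ w
slope_max = np.where(w > 0, y_hi, y_lo) @ w
print(f"sharp bounds on the BLP slope: [{slope_min:.3f}, {slope_max:.3f}]")
```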

This paper is a revised version of cemmap working paper CWP15/08

]]>
http://www.ifs.org.uk/publications/4636 Fri, 02 Oct 2009 00:00:00 +0000
<![CDATA[Peace and goodwill? Using an experimental game to analyse the Desarrollo y Paz initiative in Colombia]]> Several decades of conflict, rebellion and unrest severely weakened civil society in parts of Colombia. Desarrollo y Paz is the umbrella term used to describe the set of locally-led initiatives that aim to address this problem by promoting sustainable economic development and community cohesion and action.

In this paper we analyse the findings from a series of 'public good' games that were conducted between November 2005 and February 2007 in 104 municipalities in rural and urban Colombia with mainly poor participants. The data covers municipalities both with ('treatment') and without ('control') a PRDP in place, and within the 'treatment' municipalities, both beneficiaries and non-beneficiaries of the PRDP initiative. The data for 'control' municipalities was collected as part of the evaluation of Familias en Acción (FeA), Colombia's conditional cash transfer programme.

The game is structured as a typical free-rider problem with the act of contributing to the 'public good' (a collective money pot) being always dominated by non-contribution. We interpret contribution as an act consistent with a high degree of social capital.

Potentially endogenous selection into the programme makes identifying programme effects difficult, but we find strong and suggestive evidence that exposure to PRDPs improves social capital and that this extends beyond direct beneficiaries of the programme. In particular, the duration of programme operation and the proportion of programme beneficiaries in a game session increase contributions to the public good, suggesting that in order to have a major impact the programme must be sufficiently 'intensive'.

]]>
http://www.ifs.org.uk/publications/4631 Thu, 01 Oct 2009 00:00:00 +0000
<![CDATA[Migration, violence and welfare programmes in rural Colombia]]> This paper studies migration decisions of very poor households in an environment with a high level of violence. By matching detailed retrospective data on violence levels in Colombian rural municipalities with a household survey collected for the evaluation of the "Familias en Acción" welfare programme, the empirical analysis takes into account possible selection problems of the sample and the key issue of endogeneity of violence. The main results show that high levels of violence encourage households to leave their municipality of residence but that welfare programmes may mitigate these flows, provided that the incidence of violence is not unduly high. This is consistent with the fact that the households under study are liquidity constrained: when violence is high, cash transfers may enable them to leave their municipality of residence, whereas, in more normal circumstances, receiving cash transfers increases the benefits of staying where they are registered. Further evidence using household shocks and wealth confirms that liquidity constraints play a large role in explaining such heterogeneous impacts of the programme across violence levels. Other important determinants of migration are the type of property rights and the health insurance rural households can benefit from.

]]>
http://www.ifs.org.uk/publications/4629 Tue, 29 Sep 2009 00:00:00 +0000
<![CDATA[Estimating the peace dividend: the impact of violence on house prices in Northern Ireland]]> This paper exploits data on the pattern of violence across regions and over time to estimate the impact of the peace process in Northern Ireland on house prices. We begin with a linear model that estimates the average treatment effect of a conflict-related killing on house prices, showing a negative correlation between house prices and killings. We then develop an approach based on an economic model in which the parameters of the statistical process are estimated from a Markov switching model where conflict and peace are treated as a latent state. From this, we are able to construct a measure of the discounted number of killings, which is updated in the light of actual killings. This model naturally suggests a heterogeneous effect of killings across space and time, which we use to generate estimates of the peace dividend. The economic model suggests a somewhat different pattern of estimates to the linear model. We also show that there is some evidence of spillover effects of violence in adjacent regions.
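
A minimal sketch of the discounted-killings object, under an assumed two-state (peace/conflict) Markov chain with hypothetical transition probabilities and state-specific killing rates: the expected discounted number of future killings solves a simple linear system. The filtering step that updates beliefs about the latent state from observed killings is omitted.

```python
import numpy as np

# Hypothetical two-state Markov chain: state 0 = peace, state 1 = conflict.
P = np.array([[0.95, 0.05],    # transition probabilities from peace
              [0.10, 0.90]])   # transition probabilities from conflict
k = np.array([0.5, 12.0])      # expected killings per period in each state
beta = 0.95                    # discount factor

# Expected discounted future killings m(s) solves m = k + beta * P @ m,
# i.e. m = (I - beta * P)^{-1} k.
m = np.linalg.solve(np.eye(2) - beta * P, k)
print(f"discounted killings measure: peace = {m[0]:.1f}, conflict = {m[1]:.1f}")

# A belief pi = P(conflict | history of killings) could be updated period by period
# (e.g. with Poisson likelihoods); the price-relevant measure would then be the
# belief-weighted average pi * m[1] + (1 - pi) * m[0]. That filtering step is not
# implemented in this sketch.
```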

]]>
http://www.ifs.org.uk/publications/4628 Fri, 25 Sep 2009 00:00:00 +0000
<![CDATA[The scourge of Asian Flu: in utero exposure to pandemic influenza and the development of a cohort of British children]]>

This paper examines the impact of in utero exposure to the Asian influenza pandemic of 1957 upon physical and cognitive development in childhood. Outcome data is provided by the National Child Development Study (NCDS), a panel study of a cohort of British children who were all potentially exposed in the womb. Epidemic effects are identified using geographic variation in a surrogate measure of the epidemic. Results indicate significant detrimental effects of the epidemic upon birth weight and height at ages 7 and 11, but only for the offspring of mothers with certain health characteristics. By contrast, the impact of the epidemic on childhood cognitive test scores is more general: test scores are reduced at the mean, and effects remain constant across maternal health and socioeconomic indicators. Taken together, our results point to multiple channels linking foetal health shocks to childhood outcomes.

]]>
http://www.ifs.org.uk/publications/4626 Wed, 23 Sep 2009 00:00:00 +0000
<![CDATA[Externality-correcting taxes and regulation]]> Much of the literature on externalities has considered taxes and direct regulation as alternative policy instruments. Both instruments may in practice be imperfect, reflecting informational deficiencies and other limitations. We analyse the use of taxes and regulation in combination, to control externalities arising from individual consumption behaviour. We consider cases where taxes are either imperfectly differentiated to reflect individual differences in externalities, or where some consumption escapes taxation. In both cases we characterise the optimal instrument mix, and show how changing the level of direct regulation alters the optimal externality tax.

]]>
http://www.ifs.org.uk/publications/4622 Tue, 22 Sep 2009 00:00:00 +0000
<![CDATA[Set identification via quantile restrictions in short panels]]> This paper studies the identifying power of conditional quantile restrictions in short panels with fixed effects. In contrast to classical fixed effects models with conditional mean restrictions, conditional quantile restrictions are not preserved by taking differences in the regression equation over time. This paper shows however that a conditional quantile restriction, in conjunction with a weak conditional independence restriction, provides bounds on quantiles of differences in time-varying unobservables across periods. These bounds carry observable implications for model parameters which generally result in set identification. The analysis of these bounds includes conditions for point identification of the parameter vector, as well as weaker conditions that result in identification of individual parameter components.

]]>
http://www.ifs.org.uk/publications/4615 Wed, 16 Sep 2009 00:00:00 +0000
<![CDATA[Treatment effect estimation with covariate measurement error]]> This paper investigates the effect that covariate measurement error has on a conventional treatment effect analysis built on an unconfoundedness restriction that embodies conditional independence restrictions in which there is conditioning on error-free covariates. The approach uses small-parameter asymptotic methods to obtain the approximate generic effects of measurement error. The approximations can be estimated using data on observed outcomes, the treatment indicator and error-contaminated covariates, providing an indication of the nature and size of measurement error effects. The approximations can be used in a sensitivity analysis to probe the potential effects of measurement error on the evaluation of treatment effects.

]]>
http://www.ifs.org.uk/publications/4611 Tue, 08 Sep 2009 00:00:00 +0000
<![CDATA[Underidentification?]]> We develop methods for testing the hypothesis that an econometric model is underidentified and inferring the nature of the failed identification. By adopting a generalized-method-of-moments perspective, we work directly with the structural relations and allow for nonlinearity in the econometric specification. We establish the link between a test for overidentification and our proposed test for underidentification. If, after attempting to replicate the structural relation, we find substantial evidence against the overidentifying restrictions of an augmented model, this is evidence against underidentification of the original model.

]]>
http://www.ifs.org.uk/publications/4610 Mon, 07 Sep 2009 00:00:00 +0000
<![CDATA[Single equation endogenous binary response models]]> This paper studies single equation models for binary outcomes incorporating instrumental variable restrictions. The models are incomplete in the sense that they place no restriction on the way in which values of endogenous variables are generated. The models are set, not point, identifying. The paper explores the nature of set identification in single equation IV models in which the binary outcome is determined by a threshold crossing condition. Special attention is given to models which require the threshold crossing function to be a monotone function of a linear index involving observable endogenous and exogenous explanatory variables. Identified sets can be large unless instrumental variables have substantial predictive power. A generic feature of the identified sets is that they are not connected when instruments are weak. The results suggest that the strong point identifying power of triangular "control function" models - restricted versions of the IV models considered here - is fragile, with the wide expanse of the IV model's identified set awaiting in the event of a failure of the triangular model's restrictions.

An updated version is available as CWP31/11.

]]>
http://www.ifs.org.uk/publications/4594 Tue, 18 Aug 2009 00:00:00 +0000
<![CDATA[Identifying distributional characteristics in random coefficients panel data models]]>

We study the identification of panel models with linear individual-specific coefficients, when T is fixed. We show identification of the variance of the effects under conditional uncorrelatedness. Identification requires restricted dependence of errors, reflecting a trade-off between heterogeneity and error dynamics. We show identification of the density of individual effects when errors follow an ARMA process under conditional independence. We discuss GMM estimation of moments of effects and errors, and introduce a simple density estimator of a slope effect in a special case. As an application we estimate the effect of a mother's smoking during pregnancy on her child's birth weight.

]]>
http://www.ifs.org.uk/publications/4584 Mon, 03 Aug 2009 00:00:00 +0000
<![CDATA[Evaluating marginal policy changes and the average effect of treatment for individuals at the margin]]> This paper develops methods for evaluating marginal policy changes. We characterize how the effects of marginal policy changes depend on the direction of the policy change, and show that marginal policy effects are fundamentally easier to identify and to estimate than conventional treatment parameters. We develop the connection between marginal policy effects and the average effect of treatment for persons on the margin of indifference between participation in treatment and nonparticipation, and use this connection to analyze both parameters. We apply our analysis to estimate the effect of marginal changes in tuition on the return to going to college.

]]>
http://www.ifs.org.uk/publications/4580 Thu, 30 Jul 2009 00:00:00 +0000
<![CDATA[Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals]]>

This paper considers semiparametric efficient estimation of conditional moment models with possibly nonsmooth residuals in unknown parametric components (θ) and unknown functions (h) of endogenous variables. We show that: (1) the penalized sieve minimum distance (PSMD) estimator (θ̂, ĥ) can simultaneously achieve root-n asymptotic normality of θ̂ and nonparametric optimal convergence rate of ĥ, allowing for noncompact function parameter spaces; (2) a simple weighted bootstrap procedure consistently estimates the limiting distribution of the PSMD θ̂; (3) the semiparametric efficiency bound formula of Ai and Chen (2003) remains valid for conditional models with nonsmooth residuals, and the optimally weighted PSMD estimator achieves the bound; (4) the centered, profiled optimally weighted PSMD criterion is asymptotically chi-square distributed. We illustrate our theories using a partially linear quantile instrumental variables (IV) regression, a Monte Carlo study, and an empirical estimation of the shape-invariant quantile IV Engel curves.

This is an updated version of CWP09/08.

]]>
http://www.ifs.org.uk/publications/4579 Wed, 29 Jul 2009 00:00:00 +0000
<![CDATA[Intersection Bounds: estimation and inference]]> We develop a practical and novel method for inference on intersection bounds, namely bounds defined by either the infimum or supremum of a parametric or nonparametric function, or equivalently, the value of a linear programming problem with a potentially infinite constraint set. Our approach is especially convenient in models comprised of a continuum of inequalities that are separable in parameters, and also applies to models with inequalities that are non-separable in parameters. Since analog estimators for intersection bounds can be severely biased in finite samples, routinely underestimating the length of the identified set, we also offer a (downward/upward) median unbiased estimator of these (upper/lower) bounds as a natural by-product of our inferential procedure. Furthermore, our method appears to be the first and currently only method for inference in nonparametric models with a continuum of inequalities. We develop asymptotic theory for our method based on the strong approximation of a sequence of studentized empirical processes by a sequence of Gaussian or other pivotal processes. We provide conditions for the use of nonparametric kernel and series estimators, including a novel result that establishes strong approximation for general series estimators, which may be of independent interest. We illustrate the usefulness of our method with Monte Carlo experiments and an empirical example.
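
A tiny simulation of the finite-sample problem motivating the median-unbiased estimator mentioned above: the plug-in minimum of several noisy estimates is biased downwards relative to the true intersection bound. The numbers are illustrative, and this is not the paper's estimator or inference procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# An intersection upper bound: theta_u = min over a grid of candidate upper bounds.
theta = np.array([1.00, 1.02, 1.05, 1.10, 1.20])   # true bound is 1.00
sigma = 0.05                                        # sampling noise in each estimate
reps = 20_000

# Plug-in (analog) estimator: take the minimum of the noisy estimates in each replication.
draws = theta + sigma * rng.standard_normal((reps, theta.size))
plug_in = draws.min(axis=1)

print(f"true intersection bound       : {theta.min():.3f}")
print(f"mean of the plug-in estimator : {plug_in.mean():.3f}")   # systematically below 1.00
print(f"median of the plug-in estimator: {np.median(plug_in):.3f}")
```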

]]>
http://www.ifs.org.uk/publications/4576 Tue, 28 Jul 2009 00:00:00 +0000
<![CDATA[Uniform confidence bands for functions estimated nonparametrically with instrumental variables]]> This paper is concerned with developing uniform confidence bands for functions estimated nonparametrically with instrumental variables. We show that a sieve nonparametric instrumental variables estimator is pointwise asymptotically normally distributed. The asymptotic normality result holds in both mildly and severely ill-posed cases. We present an interpolation method to obtain a uniform confidence band and show that the bootstrap can be used to obtain the required critical values. Monte Carlo experiments illustrate the finite-sample performance of the uniform confidence band.

]]>
http://www.ifs.org.uk/publications/4575 Mon, 27 Jul 2009 00:00:00 +0000
<![CDATA[Food and cash transfers: evidence from Colombia]]> We study food Engel curves among the poor population targeted by a conditional cash transfer programme in Colombia. After controlling for the endogeneity of total expenditure and for the (unobserved) variability of prices across villages, the best fit is provided by a log-linear specification. Our estimates imply that an increase in total expenditure by 10% would lead to a decrease of 1% in the share of food. However, quasi-experimental estimates of the impact of the programme on total and food consumption show that the share of food increases, suggesting that the programme has more complex impacts than increasing household income. In particular, our results are not inconsistent with the hypothesis that the programme, targeted to women, could increase their bargaining power and induce a more than proportional increase in food consumption.

]]>
http://www.ifs.org.uk/publications/4572 Wed, 22 Jul 2009 00:00:00 +0000
<![CDATA[How does entry regulation influence entry into self-employment and occupational mobility?]]> We analyze how an entry regulation that imposes a mandatory educational standard affects entry into self-employment and occupational mobility. We exploit the German reunification as a natural experiment and identify regulatory effects by comparing differences between regulated occupations and unregulated occupations in East Germany with the corresponding differences in West Germany after reunification. Consistent with our expectations, we find that entry regulation reduces entry into self-employment and occupational mobility after reunification more in regulated occupations in East Germany than in West Germany. Our findings are relevant for transition or emerging economies as well as for mature market economies requiring large structural changes after unforeseen economic shocks.

]]>
http://www.ifs.org.uk/publications/4566 Wed, 15 Jul 2009 00:00:00 +0000
<![CDATA[Market regulation and firm performance: the case of smoking bans in the UK]]>

This paper analyzes the effects of a ban on smoking in public places upon firms and consumers. It presents a theoretical model and tests its predictions using unique data from before and after the introduction of smoking bans in the UK. Cigarette smoke is a public bad, and smokers and non-smokers differ in their valuation of smoke-free amenities. Consumer heterogeneity implies that the market equilibrium may result in too much uniformity, whereas social optimality requires a mix of smoking and non-smoking pubs (which can be operationalized via licensing). If the market equilibrium has almost all pubs permitting smoking (as is the case in the data) then a blanket ban reduces pub sales, profits, and consumer welfare. We collect survey data from public houses and find that the Scottish smoking ban (introduced in March 2006) reduced pub sales and harmed medium run profitability. An event study analysis of the stock market performance of pub-holding companies corroborates the negative effects of the smoking ban on firm performance.

]]>
http://www.ifs.org.uk/publications/4565 Tue, 14 Jul 2009 00:00:00 +0000
<![CDATA[Empirical analysis of buyer power]]> This paper provides a comprehensive econometric framework for the empirical analysis of buyer power. It encompasses the two main features of pricing schemes in business-to-business relationships: nonlinear price schedules and bargaining over rents. Disentangling them is critical to the empirical identification of buyer power. Testable predictions from the theoretical analysis are delineated, and a pragmatic empirical methodology is presented. It is readily implementable on the basis of transaction data, routinely collected by antitrust authorities. The empirical framework is illustrated using data from the UK brick industry. The paper emphasizes the importance of controlling for endogeneity of volumes and for heterogeneity across buyers and sellers.

]]>
http://www.ifs.org.uk/publications/4564 Sat, 11 Jul 2009 00:00:00 +0000
<![CDATA[Nonparametric identification of a binary random factor in cross section data]]> Suppose V and U are two independent mean zero random variables, where V has an asymmetric distribution with two mass points and U has a symmetric distribution. We show that the distributions of V and U are nonparametrically identified just from observing the sum V + U, and provide an estimator that converges at the root-n rate. We apply these results to the world income distribution to measure the extent of convergence over time, where the values V can take on correspond to country types, i.e., wealthy versus poor countries. We also extend our results to include covariates X, showing that we can nonparametrically identify and estimate cross-section regression models of the form Y = g(X; D*) + U, where D* is an unobserved binary regressor.
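
A small simulation, with illustrative parameter values, of the role asymmetry plays in this setting: because U is symmetric and mean zero and V is mean zero, the third moment of the observed sum equals the third moment of V, so any skewness in V + U is inherited entirely from the binary factor. This is only a moment-level illustration of the model, not the paper's identification argument or estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Hypothetical binary factor V: two mass points, mean zero, asymmetric.
p, v_hi = 0.2, 2.0                 # P(V = v_hi) = 0.2
v_lo = -p * v_hi / (1 - p)         # chosen so that E[V] = 0
V = np.where(rng.random(n) < p, v_hi, v_lo)

# Symmetric, mean-zero error U, independent of V.
U = rng.standard_t(df=8, size=n)

Y = V + U                          # only the sum is observed

# With E[U] = E[U^3] = 0 and E[V] = 0, we have E[Y^3] = E[V^3]: all of the skewness
# in the observed sum comes from the asymmetric binary factor.
print(f"E[V^3] (population) = {0.2 * v_hi**3 + 0.8 * v_lo**3:+.3f}")
print(f"E[Y^3] (simulated)  = {np.mean(Y**3):+.3f}")
```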

]]>
http://www.ifs.org.uk/publications/4563 Fri, 10 Jul 2009 00:00:00 +0000
<![CDATA[Nonparametric identification of auction models with non-separable unobserved heterogeneity]]> We propose a novel methodology for nonparametric identification of first-price auction models with independent private values, which accommodates auction-specific unobserved heterogeneity and bidder asymmetries, based on recent results from the econometric literature on nonclassical measurement error in Hu and Schennach (2008). Unlike Krasnokutskaya (2009), we do not require that equilibrium bids scale with the unobserved heterogeneity. Our approach accommodates a wide variety of applications, including settings in which there is an unobserved reserve price, an unobserved cost of bidding, or an unobserved number of bidders, as well as those in which the econometrician fails to observe some factor with a non-multiplicative effect on bidder values.

]]>
http://www.ifs.org.uk/publications/4562 Thu, 09 Jul 2009 00:00:00 +0000
<![CDATA[Hypothesis testing of multiple inequalities: the method of constraint chaining]]> Econometric inequality hypotheses arise in diverse ways. Examples include concavity restrictions on technological and behavioural functions, monotonicity and dominance relations, one-sided constraints on conditional moments in GMM estimation, bounds on parameters which are only partially identified, and orderings of predictive performance measures for competing models. In this paper we set forth four key properties which tests of multiple inequality constraints should ideally satisfy. These are (1) (asymptotic) exactness, (2) (asymptotic) similarity on the boundary, (3) absence of nuisance parameters from the asymptotic null distribution of the test statistic, (4) low computational complexity and bootstrapping cost. We observe that the predominant tests currently used in econometrics do not appear to enjoy all these properties simultaneously. We therefore ask the question: does there exist any nontrivial test which, as a mathematical fact, satisfies the first three properties and, by any reasonable measure, satisfies the fourth? Remarkably, the answer is affirmative. The paper demonstrates this constructively. We introduce a method of test construction called chaining, which begins by writing multiple inequalities as a single equality using zero-one indicator functions. We then smooth the indicator functions. The approximate equality thus obtained is the basis of a well-behaved test. This test may be considered as the baseline of a wider class of tests. A full asymptotic theory is provided for the baseline. Simulation results show that the finite-sample performance of the test matches the theory quite well.
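
A schematic sketch of the chaining idea as described above: a set of inequalities m_j ≤ 0 holds exactly when a single indicator-weighted sum equals zero, and the indicator can then be replaced by a smooth (here logistic) approximation. The smoothing function, bandwidth and test values are illustrative choices, not the paper's test statistic.

```python
import numpy as np

def indicator_equality(m):
    """Multiple inequalities m_j <= 0 hold iff the single equality
       sum_j m_j * 1[m_j > 0] = 0 holds (each summand is nonnegative)."""
    return np.sum(m * (m > 0))

def smoothed_equality(m, h=0.05):
    """Replace the zero-one indicator 1[m_j > 0] with a smooth logistic
       approximation 1 / (1 + exp(-m_j / h)); h -> 0 recovers the indicator."""
    return np.sum(m / (1.0 + np.exp(-m / h)))

m_satisfied = np.array([-0.3, -0.1, -0.02])   # all constraints hold
m_violated  = np.array([-0.3,  0.08, -0.02])  # second constraint violated

for label, m in [("satisfied", m_satisfied), ("violated", m_violated)]:
    print(f"{label:9s}  exact = {indicator_equality(m):.4f}"
          f"  smoothed = {smoothed_equality(m):.4f}")
```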

]]>
http://www.ifs.org.uk/publications/4537 Mon, 15 Jun 2009 00:00:00 +0000
<![CDATA[Nonparametric estimation of a polarization measure]]>

This paper develops methodology for nonparametric estimation of a polarization measure due to Anderson (2004) and Anderson, Ge, and Leo (2006) based on kernel estimation techniques. We give the asymptotic distribution theory of our estimator, which in some cases is nonstandard due to a boundary value problem. We also propose a method for conducting inference based on estimation of unknown quantities in the limiting distribution and show that our method yields consistent inference in all cases we consider. We investigate the finite-sample properties of our methods in a simulation study. We give an application to the study of polarization within China in recent years.

]]>
http://www.ifs.org.uk/publications/4538 Mon, 15 Jun 2009 00:00:00 +0000
<![CDATA[Set identification with Tobin regressors]]> We give semiparametric identification and estimation results for econometric models with a regressor that is endogenous, bound censored and selected, called a Tobin regressor. First, we show that the true parameter value is set identified and characterize the identification sets. Second, we propose novel estimation and inference methods for this true value. These estimation and inference methods are of independent interest and apply to any problem where the true parameter value is point identified conditional on some nuisance parameter values that are set-identified. By fixing the nuisance parameter value in some suitable region, we can proceed with regular point and interval estimation. Then, we take the union over nuisance parameter values of the point and interval estimates to form the final set estimates and confidence set estimates. The initial point or interval estimates can be frequentist or Bayesian. The final set estimates are set-consistent for the true parameter value, and confidence set estimates have frequentist validity in the sense of covering this value with at least a prespecified probability in large samples. We apply our identification, estimation, and inference procedures to study the effects of changes in housing wealth on household consumption. Our set estimates fall in plausible ranges, significantly above low OLS estimates and below high IV estimates that do not account for the Tobin regressor structure.

]]>
http://www.ifs.org.uk/publications/4530 Mon, 18 May 2009 00:00:00 +0000
<![CDATA[Inference on counterfactual distributions]]> In this paper we develop procedures for performing inference in regression models about how potential policy interventions affect the entire marginal distribution of an outcome of interest. These policy interventions consist of either changes in the distribution of covariates related to the outcome holding the conditional distribution of the outcome given covariates fixed, or changes in the conditional distribution of the outcome given covariates holding the marginal distribution of the covariates fixed. Under either of these assumptions, we obtain uniformly consistent estimates and functional central limit theorems for the counterfactual and status quo marginal distributions of the outcome as well as other function-valued effects of the policy, including, for example, the effects of the policy on the marginal distribution function, quantile function, and other related functionals. We construct simultaneous confidence sets for these functions; these sets take into account the sampling variation in the estimation of the relationship between the outcome and covariates. Our procedures rely on, and our theory covers, all main regression approaches for modeling and estimating conditional distributions, focusing especially on classical, quantile, duration, and distribution regressions. Our procedures are general and accommodate both simple unitary changes in the values of a given covariate as well as changes in the distribution of the covariates or the conditional distribution of the outcome given covariates of general form. We apply the procedures to examine the effects of labor market institutions on the U.S. wage distribution.
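
A minimal sketch of the distribution-regression route to a counterfactual marginal distribution: estimate the conditional distribution by a logit of 1{Y ≤ y} on covariates at a grid of thresholds, then average the fitted probabilities over an alternative covariate distribution. It uses simulated data and scikit-learn's logistic regression, does not enforce monotonicity across thresholds, and performs none of the paper's inference.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Hypothetical data: outcome Y depends on a covariate X; two "periods" with different
# covariate distributions but (by construction) the same conditional law of Y given X.
n = 4000
x0 = rng.normal(0.0, 1.0, n)              # status quo covariates
x1 = rng.normal(0.5, 1.0, n)              # counterfactual covariate distribution
y0 = 1.0 + 0.8 * x0 + rng.normal(size=n)  # outcomes observed under the status quo only

ygrid = np.quantile(y0, np.linspace(0.05, 0.95, 19))

def counterfactual_cdf(y_obs, x_obs, x_target, ygrid):
    """Distribution regression: at each threshold y, fit a logit of 1{Y <= y} on X using
    the observed data, then average the fitted probabilities over the target covariate
    distribution to obtain the implied marginal CDF."""
    X_obs = x_obs.reshape(-1, 1)
    X_tgt = x_target.reshape(-1, 1)
    cdf = []
    for y in ygrid:
        fit = LogisticRegression().fit(X_obs, (y_obs <= y).astype(int))
        cdf.append(fit.predict_proba(X_tgt)[:, 1].mean())
    return np.array(cdf)

F_status_quo = counterfactual_cdf(y0, x0, x0, ygrid)
F_counterfac = counterfactual_cdf(y0, x0, x1, ygrid)
for y, f0, f1 in zip(ygrid[::6], F_status_quo[::6], F_counterfac[::6]):
    print(f"y = {y:5.2f}   F_status_quo = {f0:.3f}   F_counterfactual = {f1:.3f}")
```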

]]>
http://www.ifs.org.uk/publications/4519 Thu, 07 May 2009 00:00:00 +0000
<![CDATA[Principal components and the long run]]>

We investigate a method for extracting nonlinear principal components. These principal components maximize variation subject to smoothness and orthogonality constraints; but we allow for a general class of constraints and densities, including densities without compact support and even densities with algebraic tails. We provide primitive sufficient conditions for the existence of these principal components. We also characterize the limiting behavior of the associated eigenvalues, the objects used to quantify the incremental importance of the principal components. By exploiting the theory of continuous-time, reversible Markov processes, we give a different interpretation of the principal components and the smoothness constraints. When the diffusion matrix is used to enforce smoothness, the principal components maximize long-run variation relative to the overall variation subject to orthogonality constraints. Moreover, the principal components behave as scalar autoregressions with heteroskedastic innovations. Finally, we explore implications for a more general class of stationary, multivariate diffusion processes.

]]>
http://www.ifs.org.uk/publications/4517 Thu, 07 May 2009 00:00:00 +0000
<![CDATA[Identification of structural dynamic discrete choice models]]> This paper presents new identification results for the class of structural dynamic discrete choice models that are built upon the framework of the structural discrete Markov decision processes proposed by Rust (1994). We demonstrate how to semiparametrically identify the deep structural parameters of interest in the case where the utility function of one choice in the model is parametric but the distribution of unobserved heterogeneity is nonparametric. The proposed identification method does not rely on the availability of terminal period data and hence can be applied to infinite horizon structural dynamic models. For identification we assume the availability of a continuous observed state variable that satisfies certain exclusion restrictions. If such an excluded variable is available, we show that the structural dynamic discrete choice model is semiparametrically identified using the control function approach.

This is a substantial revision of "Semiparametric identification of structural dynamic optimal stopping time models", CWP06/07.

]]>
http://www.ifs.org.uk/publications/4518 Thu, 07 May 2009 00:00:00 +0000
<![CDATA[L1-Penalized quantile regression in high-dimensional sparse models]]> We consider median regression and, more generally, quantile regression in high-dimensional sparse models. In these models the overall number of regressors p is very large, possibly larger than the sample size n, but only s of these regressors have non-zero impact on the conditional quantile of the response variable, where s grows slower than n. Since in this case the ordinary quantile regression is not consistent, we consider quantile regression penalized by the L1-norm of coefficients (L1-QR). First, we show that L1-QR is consistent at the rate √((s/n) log p), which is close to the oracle rate √(s/n), achievable when the minimal true model is known. The overall number of regressors p affects the rate only through the log p factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that s/n converges to zero at a super-logarithmic speed and that the regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that L1-QR correctly selects the true minimal model as a valid submodel, when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in L1-QR is of the same stochastic order as s, the number of non-zero coefficients in the minimal true model. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of L1-QR in a Monte-Carlo experiment, and provide an application to the analysis of international economic growth.
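
A minimal implementation sketch of L1-QR itself (not the paper's rate theory or its pivotal choice of the penalty level): the penalized quantile regression is a linear program, solved here with scipy's linprog on a small simulated sparse design with an illustrative penalty value.

```python
import numpy as np
from scipy.optimize import linprog

def l1_quantile_regression(X, y, tau=0.5, lam=0.1):
    """L1-penalized quantile regression as a linear program:
       min (1/n) sum_i [tau*u_i^+ + (1-tau)*u_i^-] + lam * ||b||_1
       s.t. y_i = a + x_i'b + u_i^+ - u_i^-, all auxiliary variables >= 0.
    The intercept a is left unpenalized. Returns (a, b)."""
    n, p = X.shape
    # variable order: [a+, a-, b+ (p), b- (p), u+ (n), u- (n)]
    c = np.concatenate([[0.0, 0.0], lam * np.ones(2 * p),
                        (tau / n) * np.ones(n), ((1 - tau) / n) * np.ones(n)])
    A_eq = np.hstack([np.ones((n, 1)), -np.ones((n, 1)), X, -X,
                      np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    z = res.x
    a = z[0] - z[1]
    b = z[2:2 + p] - z[2 + p:2 + 2 * p]
    return a, b

# Hypothetical sparse design: p regressors, only the first two matter.
rng = np.random.default_rng(4)
n, p = 200, 50
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

a_hat, b_hat = l1_quantile_regression(X, y, tau=0.5, lam=0.05)
print("intercept:", round(a_hat, 2))
print("nonzero coefficients:", {j: round(b, 2) for j, b in enumerate(b_hat) if abs(b) > 0.01})
```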

]]>
http://www.ifs.org.uk/publications/4520 Thu, 07 May 2009 00:00:00 +0000
<![CDATA[Negative marginal tax rates and heterogeneity]]> Heterogeneity is likely to be an important determinant of the shape of optimal tax schemes. This article addresses the issue in a model à la Mirrlees with a continuum of agents. The agents differ in their productivities and opportunity costs of work, but their labor supplies depend only on a unidimensional combination of their two characteristics. Conditions are given under which the standard result that marginal tax rates are everywhere non-negative holds. This is in particular the case when work opportunity costs are distributed independently of productivities. But one can also get negative marginal tax rates: economies where negative tax rates are optimal at the bottom of the income distribution are studied, and a numerical illustration is given, based on UK data.

]]>
http://www.ifs.org.uk/publications/4516 Tue, 05 May 2009 00:00:00 +0000
<![CDATA[New evidence on taxes and portfolio choice]]> Identifying the effect of differential taxation on portfolio allocation requires exogenous variation in marginal tax rates. Marginal tax rates vary with income, but income surely affects portfolio choice directly. In systems of individual taxation - like Canada's - couples with the same household income can face different effective tax rates on capital income when labor income is distributed differently within households. Using this source of variation we find statistically significant but economically modest responses to taxation. In a 'placebo' test, using data from the U.S. (which has joint taxation), we find no effect of the intra-household distribution of labor income on portfolios.

]]>
http://www.ifs.org.uk/publications/4487 Tue, 21 Apr 2009 00:00:00 +0000
<![CDATA[ICT, corporate restructuring and productivity]]> Stronger productivity growth in the US than the EU over the late 1990s is widely attributed to faster, more widespread adoption of information and communication technology (ICT). The literature has emphasised complementarities in production between ICT and internal restructuring as an important mechanism. We investigate the idea that increased use of ICT has facilitated outsourcing of business services, and that these are complementary activities in production because they allow firms to focus on their core competencies. This is consistent with evidence from the business literature and aggregate trends, and we show evidence from microdata that is consistent with this idea.

]]>
http://www.ifs.org.uk/publications/4484 Fri, 17 Apr 2009 00:00:00 +0000
<![CDATA[Measuring the price responsiveness of gasoline demand]]> This paper develops a new method for estimating the demand function for gasoline and the deadweight loss due to an increase in the gasoline tax. The method is also applicable to other goods. The method uses shape restrictions derived from economic theory to improve the precision of a nonparametric estimate of the demand function. Using data from the U.S. National Household Travel Survey, we show that the restrictions are consistent with the data on gasoline demand and remove the anomalous behavior of a standard nonparametric estimator. Our approach provides new insights about the price responsiveness of gasoline demand and the way responses vary across the income distribution. We reject constant elasticity models and find that price responses vary non-monotonically with income. In particular, we find that low- and high-income consumers are less responsive to changes in gasoline prices than are middle-income consumers.

]]>
http://www.ifs.org.uk/publications/4521 Tue, 07 Apr 2009 00:00:00 +0000
<![CDATA[An analysis of consumer panel data]]> In collecting comprehensive panel expenditure data, there are trade-offs to be made between the demands imposed on respondents and the level of detail and spending coverage collected. Existing comprehensive spending data tends to be cross-sectional, whilst panel studies include only limited expenditure questions that record spending as broad aggregates. More recently, economists have begun to use spending information collected by market research companies that records very detailed spending down to the barcode level from a panel of households, usually recorded by in-home barcode scanners, which may provide considerable advantages over existing data more commonly used in social sciences. However, there has not been a comprehensive assessment of the strengths and weaknesses of this kind of data collection method and of the potential implications of survey mode for the recorded data.

This paper seeks to address this, by an in-depth examination of scanner data from one company, Taylor Nelson Sofres (TNS), on grocery purchases over a five-year period. We assess how far the ongoing demands of participation inherent in this kind of survey lead to 'fatigue' in respondents' recording of their spending and compare the demographic representativeness of the data to the well-established Expenditure and Food Survey (EFS), constructing weights for the TNS that account for observed demographic differences. We also look at demographic transitions, comparing the panel aspect of the TNS to the British Household Panel Study (BHPS). We examine in detail the expenditure data in the TNS and EFS surveys and discuss the implications of this method of data collection for survey attrition. Broadly, we suggest that problems of fatigue and attrition may not be so severe as may be expected, though there are some differences in expenditure levels (and to some extent patterns of spending) that cannot be attributed to demographic or time differences in the two surveys alone and may be suggestive of survey mode effects. Demographic transitions appear to occur less frequently than we might expect which may limit the usefulness of the panel aspect of the data for some applications.

]]>
http://www.ifs.org.uk/publications/4468 Thu, 02 Apr 2009 00:00:00 +0000
<![CDATA[Why has home ownership fallen among the young?]]> We document that home ownership of households with 'heads' aged 25-44 years fell substantially between 1980 and 2000 and recovered only partially during the 2001-2005 housing boom. The 1980-2000 decline in young home ownership occurred as improvements in mortgage opportunities made it easier to purchase a home. This paper uses an equilibrium life-cycle model calibrated to micro and macro evidence to understand why young home ownership fell over a period when it became easier to own a home. Our findings indicate that a trend toward marrying later and the increase in household earnings risk that occurred after 1980 account for 3/5 to 4/5 of the decline in young home ownership.

]]>
http://www.ifs.org.uk/publications/4466 Tue, 24 Mar 2009 00:00:00 +0000
<![CDATA[The value of teachers' pensions]]>

As private sector employers have moved away from providing final salary defined benefit (DB) pensions to their employees, attention has increasingly focused on the public sector's continued provision of such pensions and the value of these pension promises to public sector employees. The estimated underlying liabilities of such plans have increased sharply in recent years, at least in part due to unanticipated increases in longevity. This has led to reforms of all the major public sector pension schemes, the net result of which has been to reduce the level of benefits offered by the schemes (predominantly to new, rather than existing members).

This paper examines, in the context of the Teachers' Pension Scheme (TPS), how much the pension promises are worth and what effect the change in scheme rules has had on them. This paper also addresses a number of other issues that are important when valuing DB pension rights and their relation to overall remuneration. First, how increases in current pay feed through into pension values. Second, how the age profile of earnings affects the profile of pension accrual. Finally, how the value of pension rights in DB schemes compares to that in a stylised defined contribution (DC) scheme.

The figures presented in this paper relate specifically to the composition of members and the specific scheme rules of the TPS. However, the issues raised apply equally to other DB schemes, both public and private sector.

]]>
http://www.ifs.org.uk/publications/4452 Wed, 04 Mar 2009 00:00:00 +0000
<![CDATA[Efficient estimation of copula-based semiparametric Markov models]]> This paper considers efficient estimation of copula-based semiparametric strictly stationary Markov models. These models are characterized by nonparametric invariant distributions and parametric copula functions, where the copulas capture all scale-free temporal dependence and tail dependence of the processes. The Markov models generated via tail-dependent copulas may look highly persistent and are useful for financial and economic applications. We first show that Markov processes generated via Clayton, Gumbel and Student's t copulas (with tail dependence) are all geometrically ergodic. We then propose a sieve maximum likelihood estimation (MLE) for the copula parameter, the invariant distribution and the conditional quantiles. We show that the sieve MLEs of any smooth functionals are root-n consistent, asymptotically normal and efficient, and that the sieve likelihood ratio statistic is chi-square distributed. We present Monte Carlo studies to compare the finite sample performance of the sieve MLE, the two-step estimator of Chen and Fan (2006), the correctly specified parametric MLE and the incorrectly specified parametric MLE. The simulation results indicate that our sieve MLEs perform very well, having much smaller biases and smaller variances than the two-step estimator for Markov models generated by Clayton, Gumbel and other copulas having strong tail dependence.
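
A minimal simulation of the kind of process studied here: a stationary Markov chain whose one-step dependence is a Clayton copula, generated by the conditional-inverse sampler, showing the lower-tail clustering that makes such processes look persistent. The copula parameter and the tail cutoff are illustrative, and the sieve MLE itself is not implemented.

```python
import numpy as np

def clayton_markov_chain(theta, T, rng):
    """Simulate a strictly stationary Markov chain with Uniform(0,1) marginals whose
    (U_{t-1}, U_t) dependence is a Clayton copula with parameter theta > 0.
    Conditional-inverse sampler:
      U_t = [ U_{t-1}^{-theta} * (W^{-theta/(1+theta)} - 1) + 1 ]^{-1/theta},  W ~ U(0,1)."""
    u = np.empty(T)
    u[0] = rng.random()
    w = rng.random(T)
    for t in range(1, T):
        u[t] = (u[t - 1] ** (-theta) * (w[t] ** (-theta / (1 + theta)) - 1) + 1) ** (-1 / theta)
    return u

rng = np.random.default_rng(5)
u = clayton_markov_chain(theta=4.0, T=100_000, rng=rng)

# The invariant (marginal) distribution is left uniform here; any continuous marginal
# could be obtained by mapping u through a quantile function. Persistence is strongest
# in the lower tail, reflecting the Clayton copula's lower-tail dependence.
lag0, lag1 = u[:-1], u[1:]
print("corr(U_t, U_t-1)            :", round(np.corrcoef(lag0, lag1)[0, 1], 3))
print("P(U_t < 0.05 | U_t-1 < 0.05):", round(np.mean(lag1[lag0 < 0.05] < 0.05), 3))
print("P(U_t < 0.05)               :", round(np.mean(u < 0.05), 3))
```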

]]>
http://www.ifs.org.uk/publications/4450 Tue, 03 Mar 2009 00:00:00 +0000
<![CDATA[Identification and estimation of marginal effects in nonlinear panel models]]> This paper gives identification and estimation results for marginal effects in nonlinear panel models. We find that linear fixed effects estimators are not consistent, due in part to marginal effects not being identified. We derive bounds for marginal effects and show that they can tighten rapidly as the number of time series observations grows. We also show in numerical calculations that the bounds may be very tight for small numbers of observations, suggesting they may be useful in practice. We propose two novel inference methods for parameters defined as solutions to linear and nonlinear programs such as marginal effects in multinomial choice models. We show that these methods produce uniformly valid confidence regions in large samples. We give an empirical illustration.

]]>
http://www.ifs.org.uk/publications/4448 Mon, 02 Mar 2009 00:00:00 +0000
<![CDATA[Career progression and formal versus on-the-job training]]> We model individuals' choice of whether or not to follow apprenticeship training, and their subsequent careers. We use German administrative data, which record education, labour market transitions and wages, to estimate a dynamic discrete choice model of training choice, employment and wage growth. The model allows for returns to experience and tenure, match specific effects, job mobility and search frictions. We show how apprenticeship training affects labour market careers and we quantify its benefits, relative to the overall costs. We then use our model to show how two welfare reforms change life-cycle decisions and human capital accumulation: one is the introduction of an Earned Income Tax Credit in Germany, and the other is a reform to Unemployment Insurance. In both reforms we find very significant impacts of the policy on training choices and on the value of realized matches, demonstrating the importance of considering such longer-term implications.

]]>
http://www.ifs.org.uk/publications/4446 Tue, 27 Jan 2009 00:00:00 +0000
<![CDATA[Dynamic housing expenditures and household welfare]]> In this paper we develop a measure of current "expenditures" on housing services for owner-occupiers. Having such a measure is important for measuring the relative welfare of households, especially when comparing renters and owners and for measuring inflation. From a theoretical perspective expenditures equal the "shadow price" of housing services (the marginal rate of substitution between housing services and non-durable consumption) multiplied by the quantity of housing services consumed. In an idealised world, two simple measures of the shadow price are available; the user cost of housing capital and the rental price of an equivalent rental house. However, imperfect capital markets, risk aversion, the tax system, moving costs and systematic differences between houses available in the rental and owner occupied sectors drive a wedge between the shadow price of housing and these other two measures. This paper contributes to previous research by calibrating a lifecycle model of housing investment and consumption to data from the UK Family Expenditure Survey and by developing measures of the shadow price of housing that take into account uncertainty in house prices, interest rates and incomes, dynamic life cycle choices, and liquidity constraints that depend on both income and house value.

]]>
http://www.ifs.org.uk/publications/4412 Sun, 25 Jan 2009 00:00:00 +0000
<![CDATA[A retail price index including the shadow price of owner occupied housing]]> How do house price changes affect the cost of living? The retail price index in the UK does not directly incorporate house price changes. Instead it uses mortgage interest to capture the cost of owning a home. This is a useful method from many perspectives. However, from a consumer welfare perspective, while mortgage interest does capture the cost of a particular service, it does not capture the cost of housing services. The shadow price of housing captures the welfare cost to a household of changes in housing prices. In this paper we create a new shadow price index using RPI data and the shadow price of housing and investigate how replacing the mortgage interest with the shadow price of housing affects measures of the cost of living.

]]>
http://www.ifs.org.uk/publications/4411 Sat, 24 Jan 2009 00:00:00 +0000
<![CDATA[Trends in quality-adjusted skill premia in the United States, 1960-2000]]> This paper presents new evidence that increases in college enrollment lead to a decline in the average quality of college graduates between 1960 and 2000, resulting in a decrease of 8 percentage points in the college premium. The standard demand and supply framework (Katz and Murphy, 1992, Card and Lemieux, 2001) can qualitatively account for the trend in the college and age premia over this period, but the quantitative adjustments that need to be made to account for changes in quality are substantial. Furthermore, the standard interpretation of the supply effect can be misleading if the quality of college workers is not controlled for. To illustrate the importance of these adjustments, we reanalyze the problem studied in Card and Lemieux (2001), who observe that the rise in the college premium in the 1980s occurred mainly for young workers, and attribute this to the differential behavior of the supply of skill between the young and the old. Our results show that changes in quality are as important as changes in prices to explain the phenomenon they document.

]]>
http://www.ifs.org.uk/publications/4410 Fri, 23 Jan 2009 00:00:00 +0000
<![CDATA[Preschool and maternal labour market outcomes: evidence from a regression discontinuity design]]> Expanding preschool education has the dual goals of improving child outcomes and work incentives for mothers. This paper provides evidence on the second, identifying the impact of preschool attendance on maternal labor market outcomes in Argentina. A major challenge in identifying the causal effect of preschool attendance on parental outcomes is non-random selection into early education. We address this by relying on plausibly exogenous variation in preschool attendance that is induced when children are born on either side of Argentina's enrollment cutoff date of July 1. Because of enrollment cutoff dates, 4-year-olds born just before July 1 are 30 percentage points more likely to attend preschool. Our regression-discontinuity estimates compare maternal employment outcomes for 4-year-olds on either side of this cutoff, identifying effects among the subset of complying households (who are perhaps more likely to face constraints on their level of preschool attendance).
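
A minimal sharp-RD sketch of the comparison described above, on simulated data: local linear fits on each side of a birth-date cutoff, with the difference in the two intercepts as the estimate. The data-generating numbers (including a 13 percentage point jump in maternal employment) are purely illustrative, and the fuzzy-RD and covariate-adjustment features of the actual design are omitted.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical sharp-RD data: the running variable is the child's birth date measured
# in days relative to the July 1 enrollment cutoff (negative = born before, eligible).
n = 5000
days = rng.integers(-180, 180, size=n).astype(float)
eligible = (days < 0).astype(float)                       # born before July 1
work_prob = 0.40 + 0.13 * eligible + 0.0003 * days        # assumed 13 pp jump at the cutoff
mother_works = rng.random(n) < work_prob

def rd_jump(x, y, bandwidth=60.0):
    """Local linear regression fitted separately on each side of the cutoff (x = 0);
    the RD estimate is the difference between the two intercepts at x = 0."""
    left = (x < 0) & (x >= -bandwidth)
    right = (x >= 0) & (x <= bandwidth)
    b_left = np.polyfit(x[left], y[left].astype(float), deg=1)
    b_right = np.polyfit(x[right], y[right].astype(float), deg=1)
    return np.polyval(b_left, 0.0) - np.polyval(b_right, 0.0)

print(f"estimated jump in maternal employment at the cutoff: {rd_jump(days, mother_works):.3f}")
```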

Our findings suggest that, on average, 13 mothers start to work for every 100 youngest children in the household who start preschool (though, in our preferred specification, this estimate is not statistically significant at conventional levels). Furthermore, mothers are 19.1 percentage points more likely to work for more than 20 hours a week (i.e., more time than their children spend in school) and they work, on average, 7.8 more hours per week as a consequence of their youngest offspring attending preschool. We find no effect on maternal labor outcomes when a child that is not the youngest in the household attends preschool. Finally, we find that at the point of transition from kindergarten to primary school some employment effects persist.

Our preferred estimates condition on mother's schooling and other exogenous covariates, given evidence that mothers' schooling is unbalanced in the vicinity of the July 1 cutoff in the sample of 4 year-olds. Using a large set of natality records, we found no evidence that this is due to precise birth date manipulation by parents. Other explanations, like sample selection, are also not fully consistent with the data, and we must remain agnostic on this point. Despite this shortcoming, the credibility of the estimates is partly enhanced by the consistency of point estimates with Argentine research using a different EPH sample and sources of variation in preschool attendance (Berlinski and Galiani 2007).

A growing body of research suggests that pre-primary school can improve educational outcomes for children in the short and long run (Blau and Currie 2006; Schady 2006). This paper provides further evidence that, ceteris paribus, an expansion in preschool education may enhance the employment prospects of mothers of children in preschool age.

]]>
http://www.ifs.org.uk/publications/4414 Fri, 23 Jan 2009 00:00:00 +0000
<![CDATA[Ethnic parity in labour market outcomes for benefit claimants]]> We use UK administrative data to estimate the differential in labour market outcomes between Ethnic Minority benefit claimants and otherwise identical Whites. In many cases, Minorities and Whites are simply too different for satisfactory estimates to be calculated and results are sensitive to the methodology used. This calls into question previous results based on simple regression techniques, which may hide the fact that observationally different ethnic groups are being compared by parametric extrapolation. For some groups, however, we could calculate satisfactory results. In these cases, large and significant raw penalties almost always disappear once we appropriately control for pre-inflow characteristics.

]]>
http://www.ifs.org.uk/publications/4413 Thu, 22 Jan 2009 00:00:00 +0000
<![CDATA[Estimating distributions of potential outcomes using local instrumental variables with an application to changes in college enrollment and wage inequality]]>

This paper extends the method of local instrumental variables developed by Heckman and Vytlacil (1999, 2001, 2005) to the estimation of not only means, but also distributions of potential outcomes. The newly developed method is illustrated by applying it to changes in college enrollment and wage inequality using data from the National Longitudinal Survey of Youth of 1979. Increases in college enrollment cause changes in the distribution of ability among college and high school graduates. This paper estimates a semiparametric selection model of schooling and wages to show that, for fixed skill prices, a 14% increase in college participation (analogous to the increase observed in the 1980s) reduces the college premium by 12% and increases the 90-10 percentile ratio among college graduates by 2%.

]]>
http://www.ifs.org.uk/publications/4409 Thu, 22 Jan 2009 00:00:00 +0000
<![CDATA[Geographic proximity and firm-university innovation linkages: evidence from Great Britain]]> We investigate evidence for spatially mediated knowledge transfer from university research. We examine whether firms locate their R&D labs near universities, and whether those that do are more likely to co-operate with, or source knowledge from universities. We find that pharmaceutical firms locate R&D near to frontier chemistry research departments, consistent with accessing localised knowledge spillovers, but also linked to the presence of science parks. In industries such as chemicals and vehicles there is less evidence of immediate co-location, but those innovative firms that do locate near to relevant research departments are more likely to engage with universities.

]]>
http://www.ifs.org.uk/publications/4408 Wed, 21 Jan 2009 00:00:00 +0000
<![CDATA[The economics of a temporary VAT cut]]> 1. The rate of VAT has been cut temporarily to 15%, with a return to 17.5% scheduled for the end of 2009. The government has predicted that this will increase consumer spending by about 0.5%. Much of the analysis of this tax cut has been critical of the policy and has concluded that the government's estimates of the impact on spending are over-optimistic. The source of this criticism is a misunderstanding of the mechanism through which the tax cut will have an impact. In fact, we believe the government's estimates are overly pessimistic.

2. There are two mechanisms through which the temporary VAT cut might affect spending:

first, it will increase spending power, making households feel as if they have more income. This mechanism is likely to be small partly because the tax cut increases income only for one year, and so the increase in total lifetime resources is very small, and partly because the lost revenue will have to be paid back.

3. However, the second (often ignored) mechanism is likely to be much more important. This second mechanism is the effect that the tax cut will have through changing the price of goods bought in 2009 compared to 2010: the cost of goods bought in 2009 has fallen compared to goods bought in 2010 and this change in prices gives an incentive to bring forward consumer spending to this year, rather than waiting until next.

4. Economic evidence on households' willingness to move spending from one year into an earlier (or later) year suggests that a 1% fall in the price today will translate into a 1% increase in spending. Since roughly only half of goods purchased are subject to VAT, the 2.5 percentage point cut in the rate is like a cut in prices today of about 1.25%, and we would expect this to boost spending by about 1.25% over what it would otherwise be (the short sketch after this list works through this arithmetic).

5. Of course, this issue of what the spending would otherwise be is crucial: we will not now know what spending in 2009 would have been without the cut in VAT and even with the VAT cut, spending is likely to decline. Our point is simply that economic analysis shows that the cut in VAT will make the situation significantly less bad than it might otherwise have been.

6. A natural comparison to the fiscal stimulus of a cut in VAT is a monetary stimulus through a cut in the interest rate: both make the price of spending today low compared to next year - an interest rate cut makes saving less attractive than current spending, as does the cut in VAT. The 1.25% fall in prices due to the cut in VAT reduces the price of spending today by more than a 1 percentage point cut in the interest rate would. It is surprising that some commentators have labelled the former as "small", while the latter would typically be considered a large cut.

7. There is however a difference between cutting interest rates and cutting VAT: a cut in interest rates penalises savers, whose spending power falls, and rewards borrowers. By contrast, the cut in VAT increases the spending power of savers (as well as borrowers) and this seems a fairer way to stimulate the economy.
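
For readers who want to check the arithmetic in point 4, a minimal sketch follows; the 2.5 percentage point cut, the one-half VAT coverage share and the unit intertemporal response are simply the figures quoted above.

```python
# Back-of-the-envelope arithmetic behind point 4 (figures as quoted above).
vat_cut = 17.5 - 15.0                 # cut in the VAT rate, in percentage points
share_vatable = 0.5                   # roughly half of spending is subject to VAT
price_fall = vat_cut * share_vatable  # ~1.25% fall in the overall price level today

# Assumption stated in point 4: a 1% fall in today's prices raises spending by ~1%.
intertemporal_elasticity = 1.0
spending_boost = price_fall * intertemporal_elasticity

print(f"Price fall today: {price_fall:.2f}%")                 # 1.25%
print(f"Implied boost to spending: {spending_boost:.2f}%")    # about 1.25%
```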

]]>
http://www.ifs.org.uk/publications/4418 Tue, 20 Jan 2009 00:00:00 +0000
<![CDATA[Are two cheap, noisy measures better than one expensive, accurate one?]]>

1. Survey responses are always subject to measurement error. In general surveys (and especially longitudinal surveys), there are severe constraints on the time that can be spent eliciting a less noisy response for any target variable. In this paper we consider when it may be better to collect multiple noisy measures of the target variable rather than to improve the reliability of a single measure.

2. The Kotlarski result states that if the measurement errors in two measures of the same target variable are mutually independent and independent of the true value, then we can recover the entire distribution of the quantity of interest, up to location (the result is stated formally below, after point 4).

3. We consider designing surveys to deliver measurement error with desirable properties. This shifts the emphasis from reliability (the signal to noise ratio for any given measure) to the joint properties of the multiple measures.

4. To illustrate our ideas, we consider a concrete example: the measurement of consumption inequality. A small simulation study suggests that the approach we propose has promise. The next step in this research agenda is experiments in survey data collection.
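
For reference, the identification result invoked in point 2 is the standard Kotlarski lemma; a schematic statement (not specific to the survey-design proposal above) is:

```latex
% Kotlarski's lemma (standard statement).
% Two noisy measures of the same target variable X:
%   Y_1 = X + e_1,  Y_2 = X + e_2,  with X, e_1, e_2 mutually independent.
% Writing \psi(t_1, t_2) = E[\exp\{i(t_1 Y_1 + t_2 Y_2)\}], the characteristic
% function of X satisfies
\[
  \phi_X(t) \;=\; \exp\!\left( \int_0^t
      \frac{\partial \psi(0,s)/\partial t_1}{\psi(0,s)} \, ds \right),
\]
% which pins down the distribution of X up to location (exactly, once the
% normalisation E[e_1] = 0 is imposed); the error distributions then follow
% from the marginal characteristic functions of Y_1 and Y_2.
```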

]]>
http://www.ifs.org.uk/publications/4451 Mon, 19 Jan 2009 00:00:00 +0000
<![CDATA[Non cooperative household demand]]>

We study noncooperative household models with two agents and several voluntarily contributed public goods, deriving the counterpart to the Slutsky matrix and demonstrating how its properties deviate from those of a true Slutsky matrix in the unitary model. We provide results characterising both the case in which there are jointly contributed public goods and the case in which there are not. Demand properties are contrasted with those for collective models, and conclusions are drawn regarding the possibility of empirically testing the collective model against noncooperative alternatives and the noncooperative model against a general alternative.

]]>
http://www.ifs.org.uk/publications/4400 Wed, 17 Dec 2008 00:00:00 +0000
<![CDATA[Instrumental variable models for discrete outcomes]]>

Single equation instrumental variable models for discrete outcomes are shown to be set, not point, identifying for the structural functions that deliver the values of the discrete outcome. Identified sets are derived for a general nonparametric model and sharp set identification is demonstrated. Point identification is typically not achieved by imposing parametric restrictions. The extent of an identified set varies with the strength and support of instruments and typically shrinks as the support of a discrete outcome grows. The paper extends the analysis of structural quantile functions with endogenous arguments to cases in which there are discrete outcomes.

This paper is a revised version of the original issued in December 2008.

]]>
http://www.ifs.org.uk/publications/4373 Thu, 27 Nov 2008 00:00:00 +0000
<![CDATA[Decomposing changes in income risk using consumption data]]> This paper concerns the decomposition of income risk into permanent and transitory components using repeated cross-section data on income and consumption. Our focus is on the detection of changes in the magnitudes of variances of permanent and transitory risks. A new approximation to the optimal consumption growth rule is developed. Evidence from a dynamic stochastic simulation is used to show that this approximation can provide a robust method for decomposing income risk in a nonstationary environment. We examine robustness to unobserved heterogeneity in consumption growth and to unobserved heterogeneity in income growth. We use this approach to investigate the growth in income inequality in the UK in the 1980s.

]]>
http://www.ifs.org.uk/publications/4357 Tue, 18 Nov 2008 00:00:00 +0000
<![CDATA['Klin'-ing up: effects of Polish tax reforms on those in and on those out]]> In 2007 and 2008 Polish governments introduced a series of reforms which led to a substantial reduction in the tax "wedge" (in Polish: "klin") on labour. The mean average tax rate (ATR) on total labour cost was reduced from 41.6% to 34.0%. We show that, taken together, the package of reforms brought much greater reductions in the tax burden than a widely discussed 15% "flat tax". In the analysis we show the effects of the reforms both for the employed and for the non-employed populations. The latter analysis is done in such a way as to account for the entire (simulated) distribution of wages of the non-employed, and it shows interesting differences between the effects of the reforms on employed and non-employed individuals. We argue that to fully appreciate the effect of reductions in labour taxation it is important to bear in mind that one of the reasons for introducing them is to make employment more likely for those who currently do not work. Given the extent of the reductions in the "klin", it is somewhat surprising that so little attention has so far been given to the recent Polish reforms.

]]>
http://www.ifs.org.uk/publications/4350 Tue, 11 Nov 2008 00:00:00 +0000
<![CDATA[The location of innovative activity in Europe]]> In this paper we use new data to describe how firms from 15 European countries organise their innovative activities. The data match firm-level accounting data with information on the patents that those firms and their subsidiaries have applied for at the European Patent Office. We describe the data in detail.

]]>
http://www.ifs.org.uk/publications/4346 Tue, 04 Nov 2008 00:00:00 +0000
<![CDATA[Does welfare reform affect fertility? Evidence from the UK]]> In 1999 the UK government made major reforms to the system of child-contingent benefits, including the introduction of Working Families' Tax Credit and an increase in means-tested Income Support for families with children. Between 1999 and 2003, government spending per child on these benefits rose by 50 per cent in real terms, a change that was unprecedented over a thirty-year period. This paper examines whether there was a response in childbearing. To identify the effect of the reforms, we exploit the fact that the spending increases were targeted at low-income households, and we use the (exogenously determined) education of the woman and her partner to define treatment and control groups. We argue that the reforms are most likely to have a positive fertility effect for women in couples and show that this is the case. We find that there was an increase in births (by around 15 per cent) among the group affected by the reforms.

]]>
http://www.ifs.org.uk/publications/4345 Tue, 04 Nov 2008 00:00:00 +0000
<![CDATA[Optimal taxation in the extensive model]]> We study optimal taxation in the general extensive model: the only decision of the participants in the economy is to choose between working (full time) or staying inactive. People differ in their productivities and in other features which determine their work opportunity costs. The qualitative properties of optimal tax schemes are presented, with an emphasis on the role of heterogeneity in the equity-efficiency tradeoff. When the government has a redistributive stance, there are a number of cases where the low skilled workers face larger financial incentives to work than in the laissez-faire (negative average tax rates). In particular, this occurs whenever the social weights vary continuously with income and the social weight assigned to the less skilled workers is larger than average.

]]>
http://www.ifs.org.uk/publications/4344 Tue, 04 Nov 2008 00:00:00 +0000
<![CDATA[Separability and public finance]]> In a second best environment, the optimal policy choice sometimes follows the first best rules. This note lays down the information structure and separability assumptions under which this property holds in a variety of setups.

]]>
http://www.ifs.org.uk/publications/4343 Tue, 04 Nov 2008 00:00:00 +0000
<![CDATA[Are boys and girls affected differently when the household head leaves for good? Evidence from school and work choices in Colombia]]> This paper investigates how the permanent departure of the head from the household, mainly due to death or divorce, affects children's school enrolment and work participation in rural Colombia. In our empirical specification we use household-level fixed effects to deal with the fact that households that experience the departure of the head are likely to differ in unobserved ways from those that do not, and we also address the issue of non-random attrition from the panel. We find remarkably different effects for boys and girls. For boys, the adverse event reduces school participation and increases participation in paid work, whereas for girls we find evidence of the adverse event having a beneficial impact on schooling. To explain these differences, we provide evidence for boys consistent with the head's departure having an important effect through the income reduction associated with it, whereas for girls, changes in the household decision-maker appear to play an important role.

]]>
http://www.ifs.org.uk/publications/4347 Tue, 04 Nov 2008 00:00:00 +0000
<![CDATA[Large-sample inference on spatial dependence]]> We consider cross-sectional data that exhibit no spatial correlation, but are feared to be spatially dependent. We demonstrate that a spatial version of the stochastic volatility model of financial econometrics, entailing a form of spatial autoregression, can explain such behaviour. The parameters are estimated by pseudo Gaussian maximum likelihood based on log-transformed squares, and consistency and asymptotic normality are established. Asymptotically valid tests for spatial independence are developed.

]]>
http://www.ifs.org.uk/publications/4341 Fri, 24 Oct 2008 00:00:00 +0000
<![CDATA[Copula-based nonlinear quantile autoregression]]> Parametric copulas are shown to be attractive devices for specifying quantile autoregressive models for nonlinear time-series. Estimation of local, quantile-specific copula-based time series models offers some salient advantages over classical global parametric approaches. Consistency and asymptotic normality of the proposed quantile estimators are established under mild conditions, allowing for global misspecification of parametric copulas and marginals, and without assuming any mixing rate condition. These results lead to a general framework for inference and model specification testing of extreme conditional value-at-risk for financial time series data.
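
One common way in which a parametric copula generates a nonlinear quantile autoregression is sketched below; this is a generic construction, not necessarily the exact specification adopted in the paper.

```latex
% Generic construction of a quantile autoregression from a parametric copula
% (illustrative sketch). Let F be the marginal cdf of Y_t and C(u, v; \theta)
% a copula for (Y_{t-1}, Y_t). The conditional cdf of Y_t given Y_{t-1} = y is
% h(F(y') | F(y); \theta), where h(v | u; \theta) = \partial C(u, v; \theta)/\partial u,
% so the \tau-th conditional quantile is the nonlinear autoregression
\[
  Q_\tau\!\left(Y_t \mid Y_{t-1} = y\right)
    \;=\; F^{-1}\!\left( h^{-1}\!\left( \tau \mid F(y); \theta \right) \right).
\]
```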

]]>
http://www.ifs.org.uk/publications/4339 Thu, 23 Oct 2008 00:00:00 +0000
<![CDATA[The median is the message: Wilson and Hilferty's reanalysis of C.S. Peirce's experiments on the law of errors]]> Data is reanalyzed from an important series of 19th century experiments conducted by C. S. Peirce and designed to study the plausibility of the Gaussian law of errors for astronomical observations. Contrary to the findings of Peirce, but in accordance with subsequent analysis by Frechet and Wilson and Hilferty, we find normality implausible and medians an attractive alternative to means for the analysis.

]]>
http://www.ifs.org.uk/publications/4340 Thu, 23 Oct 2008 00:00:00 +0000
<![CDATA[Alternative approaches to evaluation in empirical microeconomics]]> This paper reviews a range of the most popular policy evaluation methods in empirical microeconomics: social experiments, natural experiments, matching methods, instrumental variables, discontinuity design and control functions. It discusses the identification of both the traditionally used average parameters and more complex distributional parameters. In each case, the necessary assumptions and the data requirements are considered. The adequacy of each approach is discussed drawing on the empirical evidence from the education and labor market policy evaluation literature. We also develop an education evaluation model which we use to carry through the discussion of each alternative approach. A full set of STATA datasets, containing Monte Carlo replications of the various specifications of the education evaluation model, is provided free online. There is also a full set of STATA .do files for each of the estimation approaches described in the paper; the .do files can be used together with the datasets to reproduce all the results in the paper.

]]>
http://www.ifs.org.uk/publications/4332 Tue, 14 Oct 2008 00:00:00 +0000
<![CDATA[Wage risk and employment risk over the life cycle]]> We specify a structural life-cycle model of consumption, labour supply and job mobility in an economy with search frictions that allows us to distinguish between different sources of risk and to estimate their effects. The sources of risk are shocks to productivity, job destruction, the process of job arrival when employed and unemployed, and match-level heterogeneity. Our model allows for four main social insurance programmes. In contrast to simpler models that attribute all income fluctuations to shocks, our framework allows us to disentangle the effects of the shocks from the responses to these shocks. Estimates of productivity risk, once we control for employment risk and for individual labour supply choices, are substantially lower than estimates that attribute all wage variation to productivity risk. Increases in productivity risk impose a considerable welfare loss on individuals and induce substantial precautionary saving. Increases in employment risk have large effects on output and, primarily through this channel, affect welfare. The welfare value of government programmes such as food stamps, which partially insure productivity risk, is greater than that of unemployment insurance, which provides (partial) insurance against employment risk and no insurance against persistent shocks.

]]>
http://www.ifs.org.uk/publications/4317 Fri, 26 Sep 2008 00:00:00 +0000
<![CDATA[Identification and estimation of marginal effects in nonlinear panel models]]> This paper gives identification and estimation results for marginal effects in nonlinear panel models. We find that linear fixed effects estimators are not consistent, due in part to marginal effects not being identified. We derive bounds for marginal effects and show that they can tighten rapidly as the number of time series observations grows. We also show in numerical calculations that the bounds may be very tight for small numbers of observations, suggesting they may be useful in practice. We give an empirical illustration.

]]>
http://www.ifs.org.uk/publications/4315 Tue, 23 Sep 2008 00:00:00 +0000
<![CDATA[Peace and goodwill? Using an experimental game to analyse Paz y Desarrollo]]>

Several decades of conflict, rebellion and unrest severely weakened civil society in parts of Colombia. Paz y Desarrollo is the umbrella term used to describe the set of locally led initiatives that aim to address this problem by promoting sustainable economic development and community cohesion and action.

This project analyses the findings from a series of "public goods" games that were conducted in the spring and winter of 2006 in 103 municipalities in rural and urban Colombia with predominantly poor participants. These municipalities included both those with and those without Paz y Desarrollo in place; within the municipalities where it was in place ("treatment" municipalities), we include both individuals who participate in the programme and those who do not. The municipalities where PYD is not in place ("control" municipalities) were surveyed as part of the evaluation of another programme - Familias en Accion (FEA) - and this project also analyses the impact of this programme on game-play. The game is structured as a typical free-rider problem, with the act of contributing to the "public good" (a collective money pot) always dominated by non-contribution. We interpret contribution as an act consistent with a high degree of social capital.
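
The free-rider structure described above corresponds to a standard linear public goods payoff of the following schematic form; the endowment and the marginal per-capita return are illustrative assumptions, since the exact parameters of the field sessions are not reported in this summary.

```latex
% Schematic linear public goods payoff (the endowment e and the marginal
% per-capita return m are illustrative assumptions, not figures from the
% field sessions). Player i contributes c_i in [0, e] to the common pot:
\[
  \pi_i \;=\; e - c_i \;+\; m \sum_{j=1}^{n} c_j ,
  \qquad 0 < m < 1 < n\,m ,
\]
% so that keeping the money strictly dominates contributing (m < 1), while
% full contribution maximises the group's total payoff (nm > 1): the
% free-rider problem described above.
```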

We find weak evidence that the programme acts at the group level: game sessions involving programme participants have higher levels of contribution than those not involving participants. In addition, there is some evidence that the intensity of the programme matters: the more participants, the larger the impact. However, there is no evidence that the programme has an impact at the individual level: participants are no more likely to contribute than non-participants in treatment areas.

]]>
http://www.ifs.org.uk/publications/4310 Mon, 01 Sep 2008 00:00:00 +0000
<![CDATA[A Bayesian mixed logit-probit model for multinomial choice]]>

In this paper we introduce a new flexible mixed model for multinomial discrete choice where the key individual- and alternative-specific parameters of interest are allowed to follow an assumption-free nonparametric density specification while other alternative-specific coefficients are assumed to be drawn from a multivariate normal distribution which eliminates the independence of irrelevant alternatives assumption at the individual level. A hierarchical specification of our model allows us to break down a complex data structure into a set of submodels with the desired features that are naturally assembled in the original system. We estimate the model using a Bayesian Markov Chain Monte Carlo technique with a multivariate Dirichlet Process (DP) prior on the coefficients with nonparametrically estimated density. We employ a "latent class" sampling algorithm which is applicable to a general class of models including non-conjugate DP base priors. The model is applied to supermarket choices of a panel of Houston households whose shopping behavior was observed over a 24-month period in years 2004-2005. We estimate the nonparametric density of two key variables of interest: the price of a basket of goods based on scanner data, and driving distance to the supermarket based on their respective locations. Our semi-parametric approach allows us to identify a complex multi-modal preference distribution which distinguishes between inframarginal consumers and consumers who strongly value either lower prices or shopping convenience.

]]>
http://www.ifs.org.uk/publications/4305 Fri, 15 Aug 2008 00:00:00 +0000
<![CDATA[Recent developments in the econometrics of program evaluation]]> Many empirical questions in economics and other social sciences depend on causal effects of programs or policies. In the last two decades much research has been done on the econometric and statistical analysis of the effects of such programs or treatments. This recent theoretical literature has built on, and combined features of, earlier work in both the statistics and econometrics literatures. It has by now reached a level of maturity that makes it an important tool in many areas of empirical research in economics, including labor economics, public finance, development economics, industrial organization and other areas of empirical microeconomics. In this review we discuss some of the recent developments. We focus primarily on practical issues for empirical researchers, and we also provide a historical overview of the area and give references to more technical research.

]]>
http://www.ifs.org.uk/publications/4306 Fri, 15 Aug 2008 00:00:00 +0000
<![CDATA[Does a pint a day affect your child's pay? The effect of prenatal alcohol exposure on adult outcomes]]> This paper utilizes a Swedish alcohol policy experiment conducted in the late 1960s to identify the impact of prenatal alcohol exposure on educational attainments and labor market outcomes. The experiment started in November 1967 and was prematurely discontinued in July 1968 due to a sharp increase in alcohol consumption in the experimental regions, particularly among youths. Using a difference-in-difference-in-differences estimation strategy we find that around the age of 30 the cohort in utero during the experiment has substantially reduced educational attainments, lower earnings and higher welfare dependency rates compared to the surrounding cohorts. The results indicate that investments in early-life health have far-reaching effects on economic outcomes in later life.

]]>
http://www.ifs.org.uk/publications/4304 Tue, 12 Aug 2008 00:00:00 +0000
<![CDATA[Testing for stochastic monotonicity]]> We propose a test of the hypothesis of stochastic monotonicity. This hypothesis is of interest in many applications in economics. Our test is based on the supremum of a rescaled U-statistic. We show that its asymptotic distribution is Gumbel. The proof is difficult because the approximating Gaussian stochastic process contains both a stationary and a nonstationary part, and so we have to extend existing results that only apply to either one or the other case. We also propose a refinement to the asymptotic approximation that we show works much better in finite samples. We apply our test to the study of intergenerational income mobility.

]]>
http://www.ifs.org.uk/publications/4301 Thu, 31 Jul 2008 00:00:00 +0000
<![CDATA[The retirement consumption puzzle: evidence from a regression discontinuity approach]]> In this paper we investigate the size of the consumption drop at retirement in Italy. We use micro data on food and total non-durable household spending covering the period 1993-2004, and evaluate the change in consumption that accompanies retirement by exploiting the exogenous variability in pension eligibility to correct for the endogenous nature of the retirement decision. We take a regression discontinuity approach, and make the identifying assumption that consumption would be the same around the threshold for pension eligibility if individuals did not retire. We check in our data that a non-negligible fraction of individuals retire as soon as they become eligible, and estimate at 9.8% the part of the non-durable consumption drop that is associated with retirement induced by eligibility. We show that this fall is not driven by liquidity problems for the less well off in the population, and can be accounted for by drops in goods that are work-related expenses or leisure substitutes. However, we also show that retirement induces a significant drop in the number of grown children living with their parents, and this can account for most of the retirement consumption drop.

]]>
http://www.ifs.org.uk/publications/4298 Fri, 25 Jul 2008 00:00:00 +0000
<![CDATA[Estimating derivatives in nonseparable models with limited dependent variables]]> We present a simple way to estimate the effects of changes in a vector of observable variables X on a limited dependent variable Y when Y is a general nonseparable function of X and unobservables. We treat models in which Y is censored from above, from below, or potentially from both. The basic idea is to first estimate the derivative of the conditional mean of Y given X at x with respect to x on the uncensored sample, without correcting for the effect that changes in x induce on the censored population. We then correct the derivative for the effects of the selection bias. We propose nonparametric and semiparametric estimators for the derivative. As extensions, we discuss the cases of discrete regressors, measurement error in dependent variables, and endogenous regressors in a cross section and panel data context.

]]>
http://www.ifs.org.uk/publications/4289 Wed, 09 Jul 2008 00:00:00 +0000
<![CDATA[GEL methods for non-smooth moment indicators]]> This paper considers the first order large sample properties of the GEL class of estimators for models specified by non-smooth indicators. The GEL class includes a number of estimators recently introduced as alternatives to the efficient GMM estimator, which may suffer from substantial biases in finite samples. These include EL, ET and the CUE. This paper also establishes the validity, in the non-smooth case, of tests for over-identifying restrictions and specification that were suggested for smooth moment indicators. In particular, a number of these tests avoid the necessity of providing an estimator of the Jacobian matrix, which may be problematic for the sample sizes typically encountered in practice.

]]>
http://www.ifs.org.uk/publications/4288 Tue, 08 Jul 2008 00:00:00 +0000
<![CDATA[Household willingness to pay for organic products]]> We use hedonic prices and purchase quantities to consider what can be learned about household willingness to pay for baskets of organic products and how this varies across households. We use rich scanner data on food purchases by a large number of households to compute household-specific lower and upper bounds on willingness to pay for various baskets of organic products. These bounds provide information about willingness to pay for organic products without imposing restrictive assumptions on preferences. We show that the reasons households are willing to pay vary, with quality being the most important, health concerns coming second, and environmental concerns lagging far behind. We also show how these methods can be used, for example by stores, to provide robust upper bounds on the revenue implications of introducing a new line of organic products.

]]>
http://www.ifs.org.uk/publications/4287 Mon, 07 Jul 2008 00:00:00 +0000
<![CDATA[Improving point and interval estimates of monotone functions by rearrangement]]> Suppose that a target function is monotonic, namely weakly increasing, and an original estimate of this target function is available which is not weakly increasing. Many common estimation methods used in statistics produce such estimates. We show that these estimates can always be improved with no harm by using rearrangement techniques: the rearrangement methods, univariate and multivariate, transform the original estimate into a monotonic estimate, and the resulting estimate is closer to the true curve in common metrics than the original estimate. The improvement property of the rearrangement also extends to the construction of confidence bands for monotone functions. Let l and u be the lower and upper endpoint functions of a simultaneous confidence interval [l,u] that covers the true function with probability (1-α). Then the rearranged confidence interval, defined by the rearranged lower and upper endpoint functions, is shorter in length in common norms than the original interval and covers the true function with probability greater than or equal to (1-α). We illustrate the results with a computational example and an empirical example dealing with age-height growth charts.

Please note: This paper is a revised version of cemmap working Paper CWP09/07.
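
In the univariate case the rearrangement operation amounts to sorting the fitted values over the evaluation grid. The toy simulation below (an illustration of the mechanics, not the paper's empirical example) shows the typical improvement in estimation error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid and a true weakly increasing target function.
x = np.linspace(0.0, 1.0, 200)
f_true = np.sqrt(x)

# A crude "original" estimate: the true curve plus noise, lightly smoothed,
# which will generally not be monotone.
noisy = f_true + rng.normal(scale=0.15, size=x.size)
f_hat = np.convolve(noisy, np.ones(9) / 9, mode="same")

# Rearrangement in the univariate case: sort the fitted values over the grid.
f_rearranged = np.sort(f_hat)


def l2_error(g):
    return np.sqrt(np.mean((g - f_true) ** 2))


print(f"L2 error, original estimate:   {l2_error(f_hat):.4f}")
print(f"L2 error, rearranged estimate: {l2_error(f_rearranged):.4f}")
# The rearranged error is never larger when the target is monotone.
```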

]]>
http://www.ifs.org.uk/publications/4286 Sun, 06 Jul 2008 00:00:00 +0000
<![CDATA[Identification with imperfect instruments]]> Dealing with endogenous regressors is a central challenge of applied research. The standard solution is to use instrumental variables that are assumed to be uncorrelated with unobservables. We instead assume (i) the correlation between the instrument and the error term has the same sign as the correlation between the endogenous regressor and the error term, and (ii) that the instrument is less correlated with the error term than is the endogenous regressor. Using these assumptions, we derive analytic bounds for the parameters. We demonstrate the method in two applications.

]]>
http://www.ifs.org.uk/publications/4277 Fri, 27 Jun 2008 00:00:00 +0000
<![CDATA[Sharp identification regions in games]]> We study identification in static, simultaneous move finite games of complete information, where the presence of multiple Nash equilibria may lead to partial identification of the model parameters. The identification regions for these parameters proposed in the related literature are known not to be sharp. Using the theory of random sets, we show that the sharp identification region can be obtained as the set of minimizers of the distance from the conditional distribution of the game's outcomes given covariates to the conditional Aumann expectation, given covariates, of a properly defined random set. This is the random set of probability distributions over action profiles given profit shifters implied by mixed strategy Nash equilibria. The sharp identification region can be approximated arbitrarily accurately through a finite number of moment inequalities based on the support function of the conditional Aumann expectation. When only pure strategy Nash equilibria are played, the sharp identification region is exactly determined by a finite number of moment inequalities. We discuss how our results can be extended to other solution concepts, such as correlated equilibrium or rationality and rationalizability. We show that calculating the sharp identification region using our characterization is computationally feasible. We also provide a simple algorithm which finds the set of inequalities that need to be checked in order to ensure sharpness. We use examples analyzed in the literature to illustrate the gains in identification afforded by our method.

]]>
http://www.ifs.org.uk/publications/4264 Wed, 18 Jun 2008 00:00:00 +0000
<![CDATA[Generating functions and short recursions, with applications to the moments of quadratic forms in noncentral normal vectors]]> Using generating functions, the top-order zonal polynomials that occur in much distribution theory under normality can be recursively related to other symmetric functions (power-sum and elementary symmetric functions; Ruben, Hillier, Kan, and Wang). Typically, in a recursion of this type the k-th object of interest, dk say, is expressed in terms of all lower-order dj's. In Hillier, Kan, and Wang we pointed out that, in the case of top-order zonal polynomials (and generalizations of them), a shorter (i.e., fixed length) recursion can be deduced. The present paper shows that this argument generalizes to a large class of objects/generating functions. The results thus obtained are then applied to various problems involving quadratic forms in noncentral normal vectors.

]]>
http://www.ifs.org.uk/publications/4263 Wed, 11 Jun 2008 00:00:00 +0000
<![CDATA[Nonparametric identification of dynamic models with unobserved state variables]]> We consider the identification of a Markov process {W_t, X_t*} for t = 1, 2, ..., T when only {W_t} for t = 1, 2, ..., T is observed. In structural dynamic models, W_t denotes the sequence of choice variables and observed state variables of an optimizing agent, while X_t* denotes the sequence of serially correlated state variables. The Markov setting allows the distribution of the unobserved state variable X_t* to depend on W_{t-1} and X_{t-1}*. We show that the joint distribution of (W_t, X_t*, W_{t-1}, X_{t-1}*) is identified from the observed distribution of (W_{t+1}, W_t, W_{t-1}, W_{t-2}, W_{t-3}) under reasonable assumptions. Identification of the joint distribution of (W_t, X_t*, W_{t-1}, X_{t-1}*) is a crucial input in methodologies for estimating dynamic models based on the "conditional-choice-probability (CCP)" approach pioneered by Hotz and Miller.

]]>
http://www.ifs.org.uk/publications/4228 Wed, 28 May 2008 00:00:00 +0000
<![CDATA[Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals]]> For semi/nonparametric conditional moment models containing unknown parametric components (θ) and unknown functions of endogenous variables (h), Newey and Powell (2003) and Ai and Chen (2003) propose sieve minimum distance (SMD) estimation of (θ, h) and derive its large sample properties. This paper greatly extends their results by establishing the following: (1) The penalized SMD (PSMD) estimator can simultaneously achieve root-n asymptotic normality of the parametric components and the nonparametric optimal convergence rate of the nonparametric components, allowing for models with possibly nonsmooth residuals and/or noncompact infinite dimensional parameter spaces. (2) A simple weighted bootstrap procedure can consistently estimate the limiting distribution of the PSMD estimator of the parametric components. (3) The semiparametric efficiency bound results of Ai and Chen (2003) remain valid for conditional models with nonsmooth residuals, and the optimally weighted PSMD estimator achieves the bounds. (4) The profiled optimally weighted PSMD criterion is asymptotically chi-square distributed, which provides an alternative way to construct consistent confidence regions for the efficient PSMD estimator of θ. All the theoretical results are stated in terms of any consistent nonparametric estimator of conditional mean functions. We illustrate our general theory using a partially linear quantile instrumental variables regression, a Monte Carlo study, and an empirical estimation of shape-invariant quantile Engel curves with endogenous total expenditure.

]]>
http://www.ifs.org.uk/publications/4159 Sun, 11 May 2008 00:00:00 +0000
<![CDATA[Estimation of nonparametric conditional moment models with possibly nonsmooth moments]]>

This paper studies nonparametric estimation of conditional moment models in which the residual functions could be nonsmooth with respect to the unknown functions of endogenous variables. It is a problem of nonparametric nonlinear instrumental variables (IV) estimation, and a difficult nonlinear ill-posed inverse problem with an unknown operator. We first propose a penalized sieve minimum distance (SMD) estimator of the unknown functions that are identified via the conditional moment models. We then establish its consistency and convergence rate (in strong metric), allowing for possibly non-compact function parameter spaces, possibly non-compact finite or infinite dimensional sieves with flexible lower semicompact or convex penalty, or finite dimensional linear sieves without penalty. Under relatively low-level sufficient conditions, and for both mildly and severely ill-posed problems, we show that the convergence rates for the nonlinear ill-posed inverse problems coincide with the known minimax optimal rates for the nonparametric mean IV regression. We illustrate the theory by two important applications: root-n asymptotic normality of the plug-in penalized SMD estimator of a weighted average derivative of a nonparametric nonlinear IV regression, and the convergence rate of a nonparametric additive quantile IV regression. We also present a simulation study and an empirical estimation of a system of nonparametric quantile IV Engel curves.

]]>
http://www.ifs.org.uk/publications/4201 Fri, 25 Apr 2008 00:00:00 +0000
<![CDATA[More on confidence intervals for partially identified parameters]]> This paper extends Imbens and Manski's (2004) analysis of confidence intervals for interval identified parameters. For their final result, Imbens and Manski implicitly assume superefficient estimation of a nuisance parameter. This appears to have gone unnoticed before, and it limits the result's applicability. I re-analyze the problem both with assumptions that merely weaken the superefficiency condition and with assumptions that remove it altogether. Imbens and Manski's confidence region is found to be valid under weaker assumptions than theirs, yet superefficiency is required. I also provide a different confidence interval that is valid under superefficiency but can be adapted to the general case, in which case it embeds a specification test for nonemptiness of the identified set. A methodological contribution is to notice that the difficulty of inference comes from a boundary problem regarding a nuisance parameter, clarifying the connection to other work on partial identification.
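
Schematically, the Imbens and Manski (2004) interval that the paper re-examines takes the following form; the statement below abstracts from the paper's own refinements.

```latex
% Imbens and Manski (2004) confidence interval for an interval-identified
% parameter theta in [theta_l, theta_u] (schematic statement).
\[
  CI_{1-\alpha} \;=\;
  \Bigl[ \hat\theta_l - \bar c_n \tfrac{\hat\sigma_l}{\sqrt{n}} ,\;
         \hat\theta_u + \bar c_n \tfrac{\hat\sigma_u}{\sqrt{n}} \Bigr],
  \qquad
  \Phi\!\Bigl( \bar c_n + \tfrac{\sqrt{n}\,(\hat\theta_u - \hat\theta_l)}
                               {\max(\hat\sigma_l, \hat\sigma_u)} \Bigr)
  - \Phi(-\bar c_n) \;=\; 1 - \alpha .
\]
% The estimated width of the identified set enters the critical value; it is
% the (implicitly superefficient) treatment of this width that the paper
% re-examines.
```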

]]>
http://www.ifs.org.uk/publications/4200 Wed, 23 Apr 2008 00:00:00 +0000
<![CDATA[Adaptive partial policy innovation: coping with ambiguity through diversification]]>

This paper develops a broad theme about policy choice under ambiguity through study of a particular decision criterion. The broad theme is that, where feasible, choice between a status quo policy and an innovation is better framed as selection of a treatment allocation than as a binary decision. Study of the static minimax-regret criterion and its adaptive extension substantiates the theme. When the optimal policy is ambiguous, the static minimax-regret allocation is always fractional, absent large fixed costs or deontological considerations. In dynamic choice problems, the adaptive minimax-regret criterion treats each cohort as well as possible, given the knowledge available at the time, and maximizes intertemporal learning about treatment response.

]]>
http://www.ifs.org.uk/publications/4199 Tue, 22 Apr 2008 00:00:00 +0000
<![CDATA[Building trust? Conditional cash transfers and social capital]]> In this paper we propose a measure of social capital based on the behaviour in a public good game. We play the public good game within 28 groups in two similar neighborhoods in Cartagena, Colombia, one of which had been targeted for over two years by a conditional cash transfer program that has an important social component. The level of cooperation we observe in the 'treatment' community is considerably higher than in the 'control' community. The two neighborhoods, however, although similar in many dimensions, turned out to be significantly different in other observable variables. The result we obtain in terms of cooperation, however, is robust to controls for these observable differences. In the last part of the paper we also compare our measure of social capital with other more traditional measures that have been used in the literature.

]]>
http://www.ifs.org.uk/publications/4175 Fri, 04 Apr 2008 00:00:00 +0000
<![CDATA[Labour supply and taxes]]> In this paper we provide an overview of the literature relating labour supply to taxes and welfare benefits, with a focus on presenting the empirical consensus. We begin with a basic continuous hours model, where individuals have completely free choice over their hours of work. We then consider fixed costs of work, the complications introduced by the benefits system and dynamic aspects of labour supply, and we place the analysis in the context of the family. The key conclusion of this work is that, in order to estimate the impact of tax reform and be able to generalise results, a structural approach that takes account of many of these issues is desirable. We then discuss the 'New Tax Responsiveness' literature, which uses the response of taxable income to the marginal tax rate as a summary statistic of the behavioural response to taxation. Underlying this approach is the unsatisfactory nature of using hours as a proxy for labour effort for those with high levels of autonomy on the job who already work long hours, such as the self-employed or senior executives. After discussing the relevant theory, we provide a summary of empirical estimates and the methodology underlying the studies. Our conclusion is that hours of work are relatively inelastic for men, but are a little more responsive for married women and lone mothers. On the other hand, participation is quite sensitive to taxation and benefits for women. Within this paper we present new estimates from a discrete participation model for both married and single men, based on the numerous reforms over the past two decades in the UK. We find that the participation of low education men is somewhat more responsive to incentives than previously thought. For men with high levels of education, participation is virtually unresponsive; here the literature on taxable income suggests that there may be significant welfare costs of taxation, although much of this seems to be a result of shifting income and consumption to non-taxable forms as opposed to actual reductions in work effort.
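
As a rough illustration of the kind of discrete participation model referred to above, one might write the following; the functional form and the construction of the in-work/out-of-work income gap are illustrative assumptions rather than the paper's exact specification.

```latex
% Schematic discrete participation model (illustrative only).
\[
  \Pr\bigl( \mathrm{work}_i = 1 \mid X_i \bigr)
    \;=\; F\!\bigl( \alpha + \beta\,[\, y_i^{\mathrm{in}} - y_i^{\mathrm{out}} \,] + X_i'\gamma \bigr),
\]
% where y_i^in - y_i^out is the net financial gain to working implied by the
% tax and benefit system, X_i are individual characteristics, and F is a link
% function such as the logistic or normal cdf. Reforms to taxes and benefits
% shift the income gap differentially across groups and over time, which is
% what identifies \beta.
```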

]]>
http://www.ifs.org.uk/publications/4166 Thu, 13 Mar 2008 00:00:00 +0000
<![CDATA[Bootstrap tests of stochastic dominance with asymptotic similarity on the boundary]]>

We propose a new method of testing stochastic dominance which improves on existing tests based on the bootstrap or subsampling. Our test requires estimation of the contact sets between the marginal distributions. Our tests have asymptotic sizes that are exactly equal to the nominal level uniformly over the boundary points of the null hypothesis, and are therefore valid over the whole null hypothesis. We also allow the prospects to be indexed by infinite as well as finite dimensional unknown parameters, so that the variables may be residuals from nonparametric and semiparametric models. Our simulation results show that our tests are indeed more powerful than the existing subsampling and recentered bootstrap tests.

]]>
http://www.ifs.org.uk/publications/4158 Mon, 10 Mar 2008 00:00:00 +0000
<![CDATA[Training disadvantaged youth in Latin America: evidence from a randomized trial]]> Youth unemployment in Latin America is exceptionally high, as much as 50% among the poor. Vocational training may be the best chance to help unemployed young people at the bottom of the income distribution. This paper evaluates the impact of a randomized training program for disadvantaged youth introduced in Colombia in 2005 on the employment and earnings of trainees.

]]>
http://www.ifs.org.uk/publications/4174 Sat, 01 Mar 2008 00:00:00 +0000
<![CDATA[Skill-based technology adoption: firm-level evidence from Brazil and India]]>

This paper provides the first firm-level econometric evidence on the skill-bias of ICT in developing countries using a unique new dataset of manufacturing firms in Brazil and India. I use detailed information on firms' adoption of ICT and the educational composition of their workforce to estimate skill-share equations in levels and long differences. The results are strongly suggestive of skill-biased ICT adoption, with ICT able to explain up to a third of the average increase in the share of skilled workers in Brazil and up to one half in India. I then use variation in the relative supply of skilled workers across states within each country to identify the skill-bias of ICT. The results are again consistent with skill-bias in both countries, and are mainly robust to various methods of controlling for unobserved heterogeneity across states. The magnitudes of the estimated effects from both approaches are surprisingly similar for the two countries. Overall, the results suggest that new developments in ICT are diffusing rapidly through the manufacturing sectors of both Brazil and India, with similar implications for the demand for skills in two very different and geographically distant countries. This evidence is consistent with ongoing pervasive skill-biased technological change associated with ICT throughout much of the developed and developing world. The implications for future developments in inequality both within and between countries are potentially far-reaching.
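
A schematic long-difference version of the skill-share equations referred to above might look as follows; the exact controls, the ICT measure and the definition of the skill share used in the paper are not reproduced here.

```latex
% Schematic skill-share equation in long differences (illustrative only).
\[
  \Delta \mathrm{SKILLSHARE}_{i} \;=\; \alpha \;+\; \beta\, \Delta \mathrm{ICT}_{i}
     \;+\; \Delta X_{i}'\gamma \;+\; \Delta\varepsilon_{i},
\]
% where i indexes firms, SKILLSHARE is the share of skilled (educated) workers,
% ICT is the firm's adoption measure, X_i are controls, and skill-biased
% adoption corresponds to \beta > 0.
```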

]]>
http://www.ifs.org.uk/publications/4148 Mon, 25 Feb 2008 00:00:00 +0000
<![CDATA[Computationally efficient recursions for top-order invariant polynomials with applications]]> The top-order zonal polynomials C_k(A), and top-order invariant polynomials C_{k_1,...,k_r}(A_1,...,A_r) in which each of the partitions of k_i, i = 1,...,r, has only one part, occur frequently in multivariate distribution theory and econometrics - see, for example, Phillips (1980, 1984, 1985, 1986), Hillier (1985, 2001), Hillier and Satchell (1986), and Smith (1989, 1993). However, even with the recursive algorithms of Ruben (1962) and Chikuse (1987), numerical evaluation of these invariant polynomials is extremely time consuming. As a result, the value of invariant polynomials has been largely confined to analytic work on distribution theory. In this paper we present new, very much more efficient, algorithms for computing both the top-order zonal and invariant polynomials. These algorithms should make theoretical results involving these functions much more valuable for direct practical study. We demonstrate the value of our results by providing fast and accurate algorithms for computing the moments of a ratio of quadratic forms in normal random variables.

]]>
http://www.ifs.org.uk/publications/4146 Wed, 20 Feb 2008 00:00:00 +0000
<![CDATA[Dynamic policy analysis]]> This chapter studies the microeconometric treatment-effect and structural approaches to dynamic policy evaluation. First, we discuss a reduced-form approach based on a sequential randomization or dynamic matching assumption that is popular in biostatistics. We then discuss two complementary approaches for treatments that are single stopping times and that allow for non-trivial dynamic selection on unobservables. The first builds on continuous-time duration and event-history models. The second extends the discrete-time dynamic discrete-choice literature.

]]>
http://www.ifs.org.uk/publications/4133 Mon, 11 Feb 2008 00:00:00 +0000
<![CDATA[Identifying the returns to lying when the truth is unobserved]]>

Consider an observed binary regressor D and an unobserved binary variable D*, both of which affect some other variable Y. This paper considers nonparametric identification and estimation of the effect of D on Y, conditioning on D* = 0. For example, suppose Y is a person's wage, the unobserved D* indicates whether the person has been to college, and the observed D indicates whether the individual claims to have been to college. This paper then identifies and estimates the difference in average wages between those who falsely claim college experience and those who tell the truth about not having been to college. We estimate this average return to lying to be about 7% to 20%. Nonparametric identification without observing D* is obtained either by observing a variable V that is roughly analogous to an instrument for ordinary measurement error, or by imposing restrictions on model error moments.

]]>
http://www.ifs.org.uk/publications/4134 Mon, 11 Feb 2008 00:00:00 +0000
<![CDATA[Assessing the equalizing force of mobility using short panels: France 1990-2000]]>

In this paper, we document whether and how much the equalizing force of earnings mobility has changed in France in the 1990s. For this purpose, we use a representative three-year panel, the French Labour Force Survey. We develop a model of earnings dynamics that combines a flexible specification of marginal earnings distributions (to fit the large cross-sectional dimension of the data) with a tight parametric representation of the dynamics (adapted to the short time-series dimension). Log earnings are modelled as the sum of a deterministic component, an individual fixed effect, and a transitory component which is assumed first-order Markov. The transition probability of the transitory component is modelled as a one-parameter Plackett copula. We estimate this model using a sequential EM algorithm.
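
In schematic form, and writing the deterministic component as an index in observables (a simplification for exposition), the earnings process described above is:

```latex
% Schematic version of the earnings process described above (exposition only).
\[
  \log y_{it} \;=\; x_{it}'\beta \;+\; \alpha_i \;+\; v_{it},
\]
% with \alpha_i an individual fixed effect and v_{it} a first-order Markov
% transitory component. The dependence between v_{i,t-1} and v_{it} is captured
% by a one-parameter Plackett copula C(u_{t-1}, u_t; \kappa) applied to the
% flexibly specified marginal cdfs of the transitory component.
```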

We exploit the estimated model to study employment/earnings inequality in France over the 1990-2002 period. We show that, in phase with business cycle fluctuations (a recession in 1993 and two peaks in 1990 and 2000), earnings mobility decreases when cross-section inequality and unemployment risk increase. We simulate individual earnings trajectories and compute present values of lifetime earnings over various horizons. Inequality presents a hump-shaped evolution over the period, with a 9% increase between 1990 and 1995 and a decrease afterwards. Accounting for unemployment yields an increase of 11%. Moreover, this increase is persistent, as it translates into a 12% increase in the variance of log present values. The ratio of inequality in present values to inequality in one-year earnings, a natural measure of immobility or of the persistence of inequality, remains remarkably constant over the business cycle.

]]>
http://www.ifs.org.uk/publications/4130 Fri, 08 Feb 2008 00:00:00 +0000
<![CDATA[Consistent noisy independent component analysis]]>

We study linear factor models under the assumptions that factors are mutually independent and independent of errors, and errors can be correlated to some extent. Under factor non-Gaussianity, second to fourth-order moments are shown to yield full identification of the matrix of factor loadings. We develop a simple algorithm to estimate the matrix of factor loadings from these moments. We run Monte Carlo simulations and apply our methodology to British data on cognitive test scores.

]]>
http://www.ifs.org.uk/publications/4132 Fri, 08 Feb 2008 00:00:00 +0000
<![CDATA[Generalized nonparametric deconvolution with an application to earnings dynamics]]>

In this paper, we construct a nonparametric estimator of the distributions of latent factors in linear independent multi-factor models under the assumption that factor loadings are known. Our approach allows us to estimate the distributions of up to L(L+1)/2 factors given L measurements. The estimator works through empirical characteristic functions. We show that it is consistent, and derive asymptotic convergence rates. Monte Carlo simulations show good finite-sample performance, less so if distributions are highly skewed or leptokurtic. We finally apply the generalized deconvolution procedure to decompose individual log earnings from the PSID into permanent and transitory components.
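
The identity underlying the characteristic-function approach in a linear independent-factor system with known loadings is sketched below; this is the general idea rather than the paper's full construction.

```latex
% Characteristic-function identity for a linear independent-factor system
% Y = A\theta with known loadings A and mutually independent factors \theta_k.
\[
  \log E\bigl[ e^{\,i\, t'Y} \bigr]
    \;=\; \sum_{k} \log \varphi_{\theta_k}\!\bigl( t'a_k \bigr),
\]
% where a_k is the k-th column of A. Varying t traces out each factor
% characteristic function \varphi_{\theta_k} along the direction a_k, and
% (under suitable conditions on A) inverting these characteristic functions
% delivers the factor densities.
```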

]]>
http://www.ifs.org.uk/publications/4131 Fri, 08 Feb 2008 00:00:00 +0000
<![CDATA[Econometric causality]]> This paper presents the econometric approach to causal modeling. It is motivated by policy problems. New causal parameters are defined and identified to address specific policy problems. Economists embrace a scientific approach to causality and model the preferences and choices of agents to infer subjective (agent) evaluations as well as objective outcomes. Anticipated and realized subjective and objective outcomes are distinguished. Models for simultaneous causality are developed. The paper contrasts the Neyman-Rubin model of causality with the econometric approach.

]]>
http://www.ifs.org.uk/publications/4126 Thu, 07 Feb 2008 00:00:00 +0000
<![CDATA[Changing public sector wage differentials in the UK]]> The paper estimates public sector wage differentials and their changes over time for men and women in the United Kingdom using panel data from the New Earnings Survey/Annual Survey of Hours and Earnings for the period 1975 to 2006. It presents estimates that are robust to unobserved workforce characteristics and that also show the impact of policy changes and cyclical factors, by allowing the average measured public sector 'premium' or 'penalty' to be time-varying. The methodology also allows us to examine the extent to which discrepancies in public and private sector pay induce changing relative qualities of the sectoral workforces.

Results are given for men and women comparing mean wages in the public and private sectors as a whole. There is, on average, a very small positive premium over the whole period for public sector women and a very small penalty for men; however the variability of the differential is much more striking than the average difference.

The method can also be applied to sub-groups in the labour market, and we illustrate the case of female public sector nurses and midwives, where the comparison group is private sector workers who have ever been, or will be, public sector nurses or midwives. Measured variations in this nurses' differential reflect the various changes in pay structure and government pay policies over the period; it is striking, however, that in the last decade the 'raw' differential accruing to public sector nurses and midwives has declined almost continuously, whereas the composition and quality-adjusted differential shows no overall trend.

]]>
http://www.ifs.org.uk/publications/4117 Wed, 06 Feb 2008 00:00:00 +0000
<![CDATA[Employment, hours of work and the optimal taxation of low income families]]> This paper examines the tax schedule for low income families with children. We take an optimal tax approach based on a structural labour supply model which incorporates unobserved heterogeneity, fixed costs of work, childcare costs and the detailed non-convexities of the tax and transfer system. The motivation is the British earned income tax credit reform (WFTC) and its interaction with the tax and transfer system for lone parents. Our analysis also examines the case for the use of hours-contingent payments. The results point to a tax schedule which depends on the age of children, with tax credits only optimal for low earners with school age children. The results also suggest a welfare improving role for hours-contingent payments although this is mitigated when hours cannot be monitored or recorded accurately by the tax authorities.

]]>
http://www.ifs.org.uk/publications/4111 Mon, 28 Jan 2008 00:00:00 +0000
<![CDATA[The matching method for treatment evaluation with selective participation and ineligibles]]> The matching method for treatment evaluation does not balance selective unobserved differences between treated and non-treated. We derive a simple correction term if there is an instrument that shifts the treatment probability to zero in specific cases. Policies with eligibility restrictions, where treatment is impossible if some variable exceeds a certain value, provide a natural application. In an empirical analysis, we first examine the performance of matching versus regression-discontinuity estimation in the sharp age-discontinuity design of the NDYP job search assistance program for young unemployed in the UK. Next, we exploit the age eligibility restriction in the Swedish Youth Practice subsidized work program for young unemployed, where compliance is imperfect among the young. Adjusting the matching estimator for selectivity changes the results towards ineffectiveness of subsidized work in moving individuals into employment.

]]>
http://www.ifs.org.uk/publications/4106 Sun, 30 Dec 2007 00:00:00 +0000
<![CDATA[Integrating Income Tax and National Insurance: an interim report]]> Income Tax and National Insurance are now sufficiently similar that merging them appears to be a plausible option, yet still sufficiently different that integration raises significant difficulties. This paper surveys the potential benefits of integration - increased transparency and reduced administrative and compliance costs - and the potential obstacles, assessing the extent to which each of the differences between Income Tax and NICs - in particular the contributory principle, the levying of an employer charge and the differences in tax base - constitute serious barriers to integration. The paper concludes that few of the difficulties look individually prohibitive, but that trying too hard to avoid significant reform of the current policy framework could produce a merged tax so complicated as to nullify much or all of the benefits of integration.

]]>
http://www.ifs.org.uk/publications/4101 Fri, 21 Dec 2007 00:00:00 +0000
<![CDATA[Unconditional quantile treatment effects under endogeneity]]> This paper develops IV estimators for unconditional quantile treatment effects (QTE) when the treatment selection is endogenous. In contrast to conditional QTE, i.e. the effects conditional on a large number of covariates X, the unconditional QTE summarize the effects of a treatment for the entire population. They are usually of most interest in policy evaluations because the results can easily be conveyed and summarized. Last but not least, unconditional QTE can be estimated at the root-n rate without any parametric assumption, which is obviously impossible for conditional QTE (unless all X are discrete). In this paper we extend the identification of unconditional QTE to endogenous treatments. Identification is based on a monotonicity assumption in the treatment choice equation and is achieved without any functional form restriction. Several types of estimators are proposed: regression, propensity score and weighting estimators. Root-n consistency, asymptotic normality and attainment of the semiparametric efficiency bound are shown for our weighting estimator, which is extremely simple to implement. We also show that including covariates in the estimation is not only necessary for consistency when the instrumental variable is itself confounded, but also for efficiency when the instrument is valid unconditionally. Monte Carlo simulations and two empirical applications illustrate the use of the proposed estimators.

]]>
http://www.ifs.org.uk/publications/4104 Thu, 20 Dec 2007 00:00:00 +0000
<![CDATA[Estimating average marginal effects in nonseparable structural systems]]>

We provide nonparametric estimators of derivative ratio-based average marginal effects of an endogenous cause, X, on a response of interest, Y, for a system of recursive structural equations. The system need not exhibit linearity, separability, or monotonicity. Our estimators are local indirect least squares estimators analogous to those of Heckman and Vytlacil (1999, 2001), who treat a latent index model involving a binary X. We treat the traditional case of an observed exogenous instrument (OXI) and the case where one observes error-laden proxies for an unobserved exogenous instrument (PXI). For PXI, we develop and apply new results for estimating densities and expectations conditional on mismeasured variables. For both OXI and PXI, we use infinite order flat-top kernels to obtain uniformly convergent and asymptotically normal nonparametric estimators of instrument-conditioned effects, as well as root-n consistent and asymptotically normal estimators of average effects.

]]>
http://www.ifs.org.uk/publications/4103 Mon, 03 Dec 2007 00:00:00 +0000
<![CDATA[A nonparametric analysis of habits models]]>

This paper presents a nonparametric analysis of the canonical habits model. The approach is based on the combinatorial/revealed preference framework of Samuelson (1948), Houthakker (1950), Afriat (1967) and Varian (1982) and the extension and application of these ideas to intertemporal models in Browning (1989). It provides a simple finitely computable test of the model which does not require a parameterisation of the underlying (hypothesised) preferences. It also yields set identification of important features of the canonical habits model including the consumer's rate of time preference and the welfare effects of habit-formation. The ideas presented are illustrated using Spanish panel data.
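For orientation, the static revealed-preference check on which this framework builds (Varian's GARP test) is itself finitely computable; a minimal sketch is below. It is only the classical static special case, not the habits-adjusted test developed in the paper.

```python
# Minimal sketch of the classical Afriat/Varian revealed-preference (GARP)
# check that this literature builds on; the paper's habits test augments this
# machinery, so treat the code below as the static special case only.
import numpy as np

def satisfies_garp(prices, quantities):
    """prices, quantities: (T, K) arrays of observed price and quantity bundles."""
    T = prices.shape[0]
    exp_own = np.einsum("tk,tk->t", prices, quantities)    # p_t . q_t
    exp_cross = prices @ quantities.T                       # p_t . q_s
    R = exp_own[:, None] >= exp_cross                       # directly revealed preferred
    for k in range(T):                                      # transitive closure (Warshall)
        R |= R[:, [k]] & R[[k], :]
    strict = exp_own[:, None] > exp_cross                   # strictly directly preferred
    # GARP: q_t revealed preferred to q_s must never coexist with q_s strictly preferred to q_t
    return not np.any(R & strict.T)

p = np.array([[1.0, 2.0], [2.0, 1.0]])
q = np.array([[3.0, 1.0], [1.0, 3.0]])
print(satisfies_garp(p, q))   # True: these two observations are consistent
```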

]]>
http://www.ifs.org.uk/publications/4102 Sat, 01 Dec 2007 00:00:00 +0000
<![CDATA[Tax reform and retirement saving incentives: evidence from the introduction of stakeholder pensions in the UK]]> Faced with ageing populations, OECD governments are seeking policies to increase individual retirement saving. In April 2001, the UK government introduced Stakeholder Pensions - a low cost retirement saving vehicle. The reform also changed the structure of tax-relieved contribution ceilings, increasing their generosity for lower earning individuals. We examine the impact of these changes on private pension coverage and on contributions to personal pension accounts using individual level micro data.

]]>
http://www.ifs.org.uk/publications/4093 Tue, 27 Nov 2007 00:00:00 +0000
<![CDATA[Optimal investment policy with fixed adjustment costs and complete irreversibility]]> We develop and solve analytically an investment model with fixed adjustment costs and complete irreversibility that reproduces observed investment dynamics at the micro-level. We impose a minimal set of restrictions on technology and uncertainty. Most of the results duplicate or generalize earlier findings that have been established either by simulations or under counterfactual assumptions.

]]>
http://www.ifs.org.uk/publications/4092 Sat, 24 Nov 2007 00:00:00 +0000
<![CDATA[Maximal uniform convergence rates in parametric estimation problems]]>

This paper considers parametric estimation problems with independent, identically, non-regularly distributed data. It focuses on rate-efficiency, in the sense of maximal possible convergence rates of stochastically bounded estimators, as an optimality criterion, largely unexplored in parametric estimation. Under mild conditions, the Hellinger metric, defined on the space of parametric probability measures, is shown to be an essentially universally applicable tool to determine maximal possible convergence rates. These rates are shown to be attainable in general classes of parametric estimation problems.
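For reference, the Hellinger metric used here is, in one common normalisation, the distance between two parametric measures with densities p_θ and p_θ' relative to a dominating measure μ:

```latex
H\!\left(P_{\theta},P_{\theta'}\right)
  = \left( \tfrac{1}{2} \int \left( \sqrt{p_{\theta}(x)} - \sqrt{p_{\theta'}(x)} \right)^{2} \, d\mu(x) \right)^{1/2} .
```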

]]>
http://www.ifs.org.uk/publications/4091 Fri, 23 Nov 2007 00:00:00 +0000
<![CDATA[Regression discontinuity design with covariates]]>

In this paper, the regression discontinuity design (RDD) is generalized to account for differences in observed covariates X in a fully nonparametric way. It is shown that the treatment effect can be estimated at the rate for one-dimensional nonparametric regression irrespective of the dimension of X. It thus extends the analysis of Hahn, Todd and van der Klaauw (2001) and Porter (2003), who examined identification and estimation without covariates, requiring assumptions that may often be too strong in applications. In many applications, individuals to the left and right of the threshold differ in observed characteristics. Houses may be constructed in different ways across school attendance district boundaries. Firms may differ around a threshold that implies certain legal changes, etc. Accounting for these differences in covariates is important to reduce bias. In addition, accounting for covariates may also reduce variance. Finally, estimation of quantile treatment effects (QTE) is also considered.
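As a point of comparison, a common simpler practice is to enter covariates additively in a local linear regression around the cutoff. The sketch below (with an arbitrary bandwidth and a simulated design) illustrates that practice; it is not the fully nonparametric estimator developed in the paper.

```python
# Minimal sketch: a local linear sharp-RD estimate with additively entered
# covariates, fitted by weighted least squares within a bandwidth h around the
# cutoff. The bandwidth and simulated design are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n, cutoff, h = 2_000, 0.0, 0.5
r = rng.uniform(-1, 1, n)                       # running variable
x = rng.normal(size=n)                          # observed covariate
d = (r >= cutoff).astype(float)                 # sharp treatment assignment
y = 0.4 * d + 0.8 * r + 0.5 * x + rng.normal(scale=0.3, size=n)

w = np.clip(1 - np.abs(r - cutoff) / h, 0, None)    # triangular kernel weights
keep = w > 0
X = np.column_stack([
    np.ones(keep.sum()),
    d[keep],
    r[keep] - cutoff,
    (r[keep] - cutoff) * d[keep],
    x[keep],
])
sw = np.sqrt(w[keep])[:, None]                      # weighted least squares via rescaling
beta, *_ = np.linalg.lstsq(sw * X, np.sqrt(w[keep]) * y[keep], rcond=None)
print(f"estimated jump at the cutoff: {beta[1]:.2f}")   # true effect is 0.4
```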

]]>
http://www.ifs.org.uk/publications/4090 Thu, 22 Nov 2007 00:00:00 +0000
<![CDATA[Semiparametric methods for the measurement of latent attitudes and the estimation of their behavioural consequences]]>

We model attitudes as latent variables that induce stochastic dominance relations in (item) responses. Observable characteristics that affect attitudes can be incorporated into the analysis to improve the measurement of the attitudes; the measurements are posterior distributions that condition on the responses and characteristics of each respondent. Methods to use these measurements to characterize the relation between attitudes and behaviour are developed and implemented.

]]>
http://www.ifs.org.uk/publications/4078 Thu, 22 Nov 2007 00:00:00 +0000
<![CDATA[Higher education funding reforms in England: the distributional effects and the shifting balance of costs]]>

This paper undertakes a quantitative analysis of substantial reforms to the system of higher education (HE) finance in England, first announced in 2004 and revised in 2007. The reforms introduced deferred fees for HE, payable by graduates through the tax system via income-contingent repayments on loans subsidised by the government. The paper uses lifetime earnings simulated by the authors to consider the likely distributional consequences of the reforms for graduates. It also considers the costs of the reforms for taxpayers, and how the reforms are likely to shift the balance of funding for HE between the public and private sectors.

]]>
http://www.ifs.org.uk/publications/4077 Fri, 26 Oct 2007 00:00:00 +0000
<![CDATA[Who wears the trousers? A semiparametric analysis of decision power in couples]]>

Decision processes among couples depend on the balance of power between the partners, determining the welfare of household members as well as household outcomes. However, little is known about the determinants of power. The collective model of household behavior gives an operational definition of decision power. We argue that important aspects of this concept of power are measurable through self-assessments of partners' say. Using such a measure, we model the balance of power as an outcome of the interplay between both partners' demographic, socioeconomic, and health characteristics. Advancing flexible, yet parsimonious empirical models is crucial for the analysis, as both absolute status and relative position in the couple might potentially affect the balance of power, and gender asymmetries may be important. Accordingly, we advance semiparametric double index models that feature one separate index for each spouse, which interact nonparametrically in the determination of power. Based on data from the Mexican Health and Aging Study (MHAS), we find education and employment status to be associated with more individual decision power, especially for women. Moreover, health and income have independent effects on the distribution of power. We also show that contextual factors are important determinants of decision power, with women in urban couples featuring more decision power than their rural counterparts.

]]>
http://www.ifs.org.uk/publications/4057 Mon, 08 Oct 2007 00:00:00 +0000
<![CDATA[What is a public sector pension worth?]]> We measure accruals in defined benefit (DB) pension plans for public and private sector workers in Britain, using typical differences in scheme rules and sector-specific lifetime age-earnings profiles by sex and educational group. We show not just that coverage by DB pension plans is greater in the public sector, but that median pension accruals as a % of salary are almost 5% higher among DB-covered public sector workers than covered private sector workers. This is largely driven by earlier normal pension (retirement) ages. For workers of different ages in the two sectors, marginal accruals also vary as a result of differences in earnings profiles across the sectors. The differences in earnings profiles across sectors should induce caution in using calculated coefficients on wages from cross sections of data in order to estimate sectoral wage effects.

]]>
http://www.ifs.org.uk/publications/4051 Tue, 02 Oct 2007 00:00:00 +0000
<![CDATA[Hedonic price equilibria, stable matching, and optimal transport: equivalence, topology, and uniqueness]]> Hedonic pricing with quasilinear preferences is shown to be equivalent to stable matching with transferable utilities and a participation constraint, and to an optimal transportation (Monge-Kantorovich) linear programming problem. Optimal assignments in the latter correspond to stable matchings, and to hedonic equilibria. These assignments are shown to exist in great generality; their marginal indirect payoffs with respect to agent type are shown to be unique whenever direct payoffs vary smoothly with type. Under a generalized Spence-Mirrlees condition the assignments are shown to be unique and to be pure, meaning the matching is one-to-one outside a negligible set. For smooth problems set on compact, connected type spaces such as the circle, there is a topological obstruction to purity, but we give a weaker condition still guaranteeing uniqueness of the stable match. An appendix resolves an old problem (# 111) of Birkhoff in probability and statistics [5], by giving a necessary and sufficient condition on the support of a joint probability to guarantee extremality among all joint measures with the same marginals.
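The discrete analogue of the optimal transportation problem with equal unit masses is a linear assignment problem, which the small sketch below solves exactly; the quadratic surplus is an illustrative choice under which the optimal assignment is assortative.

```python
# Minimal sketch: the discrete Monge-Kantorovich problem with equal unit
# masses reduces to a linear assignment problem, solvable exactly with the
# Hungarian algorithm. The surplus function below is an illustrative choice.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(2)
buyers = rng.uniform(0, 1, 6)                 # agent types on one side
sellers = rng.uniform(0, 1, 6)                # agent types on the other side

surplus = -(buyers[:, None] - sellers[None, :]) ** 2   # supermodular-style payoff
rows, cols = linear_sum_assignment(surplus, maximize=True)
for b, s in zip(rows, cols):
    print(f"buyer type {buyers[b]:.2f} matched with seller type {sellers[s]:.2f}")
```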

]]>
http://www.ifs.org.uk/publications/4048 Fri, 28 Sep 2007 00:00:00 +0000
<![CDATA[Heterogeneity in consumer demands and the income effect: evidence from panel data]]> All micro studies of demand are based on time series of cross-sectional data. Because in such data each household is only observed once, it is only under strong identifying restrictions that one can interpret the coefficients on consumer behavior. For example, if tastes are correlated with income, the usual estimates of income elasticities from cross-sectional data are biased. In contrast, panel data allow identification of the coefficients on consumer behavior in the presence of unobservable correlated heterogeneity. In this paper we make use of a unique Spanish panel data set on household expenditures to test whether unobservable heterogeneity in household demands (tastes) is correlated with total expenditures (income). We find that tastes are indeed correlated with income for half of the goods considered.

]]>
http://www.ifs.org.uk/publications/4043 Mon, 24 Sep 2007 00:00:00 +0000
<![CDATA[Integrability of demand accounting for unobservable heterogeneity: a test on panel data]]> In recent years it has become apparent that we must take unobservable heterogeneity into account when conducting empirical consumer demand analysis. This paper is concerned with integrability (that is, whether demand is consistent with utility maximization) of the conditional mean demand (that is, the estimated demand) when allowing for unobservable heterogeneity.

Integrability is important because it is necessary in order for the demand system estimates to be used for welfare analysis. Conditions for conditional mean demand to be integrable in the presence of unobservable heterogeneity are developed in the literature. There is, however, little empirical evidence suggesting whether these conditions for integrability are likely to be met in the data or not. In this paper we exploit the fact that the integrability conditions have testable implications for panel data and use a unique long panel data set to test them. Because of the sizeable longitudinal length of the panel, we are able to identify a very flexible specification of unobservable heterogeneity: we model individual demands as an Almost Ideal Demand system and allow for unobservable heterogeneity by allowing all intercept and slope parameters of the demand system to be individual-specific. We test the conditions for integrability of the conditional mean demand of this demand system. We do not reject them. This means that the conditional mean demand generated by a population of consumers with different preferences described by different Almost Ideal Demand systems is consistent with utility maximization. Given that integrability is not rejected, we conclude by comparing the estimated demand system elasticities and welfare effects from a model with no heterogeneity (which is the model that would usually be estimated from cross-sectional data) to those obtained from our heterogeneous model.

We find that the homogeneous model severely overestimates income elasticities for luxury goods and that the welfare effects from the heterogeneous model exhibit a large amount of heterogeneity, but deviate by only a few percentage points from the homogeneous model at the mean.
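For reference, the Almost Ideal Demand system mentioned above specifies, in its standard form (Deaton and Muellbauer, 1980), the budget share w_i of good i as

```latex
w_{i} = \alpha_{i} + \sum_{j} \gamma_{ij} \log p_{j} + \beta_{i} \log\!\left(\frac{x}{P}\right),
\qquad
\log P = \alpha_{0} + \sum_{k} \alpha_{k} \log p_{k}
       + \tfrac{1}{2} \sum_{k} \sum_{j} \gamma_{kj} \log p_{k} \log p_{j},
```

where p denotes prices and x total expenditure; in the heterogeneous specification described above, the α, β and γ parameters are individual-specific.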

]]>
http://www.ifs.org.uk/publications/4040 Fri, 21 Sep 2007 00:00:00 +0000
<![CDATA[Maternal education, home environments and the development of children and adolescents]]> There has been a striking increase in inequality in children's home environments over the last 50 years (McLanahan, 2004). Home environments are measured as differences in the age of mothers of young children (below 5), maternal employment, single motherhood, divorce during the first 10 years of marriage, father's involvement, and family income, for mothers with different levels of education. This trend is cause for great concern because the home environment is probably the best candidate for explaining inequality in child development.

 

Proposals to address this problem often rely on changes to the welfare system. However, given that home environments are rooted in the experiences of each family, they are probably difficult to change if we rely only on the welfare system, while more direct interventions require invading family autonomy and privacy and are notoriously difficult to enforce. Therefore, one possible alternative is to target future parents in their youth, by affecting their education, before they start forming a family. In this paper we assess the potential for such a policy, by estimating the impact of maternal education on home environments and on child outcomes.

 

We provide a unified analysis of different aspects of child development, including cognitive, noncognitive, and health outcomes, across ages. We also estimate the impact of maternal education not only on parental characteristics like employment, income, marital status, spouse's education and age at first birth, but also on several aspects of parenting practices. Our paper provides a detailed analysis of the possible mechanisms mediating the relationship between parental education and child outcomes. Finally, we compare the relative roles of maternal education and ability, and we show how the role of maternal education varies with the gender and race of the child, and with the cognitive ability of the mother.

 

We show that maternal education has positive impacts both on cognitive skills and behavioral problems of children, but the latter are more sustained than the former. This is perhaps because behavior is more malleable than cognition. Especially among whites, there is considerable heterogeneity in these impacts, which are larger for girls, and for mothers with higher cognition.

 

More educated mothers are more likely to work and work for longer hours, especially among blacks. This is true whether the child is in infancy, childhood or adolescence. Nevertheless, there is no evidence that more educated mothers breastfeed less, spend much less time reading to their children, or take them on outings less often. This is important because some studies suggest that maternal employment may be detrimental for child outcomes if it leads to reduced (quality) time with children.

 

Due to the nature of the data, this paper focuses on the effect of maternal, but not paternal, schooling. Due to assortative mating, part of the effects we find may be driven by the father's schooling through a mating effect. However, unless the effect of partner's schooling is incredibly large, assortative mating cannot fully explain our main results, as suggested in some of the literature.

]]>
http://www.ifs.org.uk/publications/4041 Wed, 19 Sep 2007 00:00:00 +0000
<![CDATA[A reduced bias GMM-like estimator with reduced estimator dispersion]]> 2SLS is by far the most-used estimator for the simultaneous equation problem. However, it is now well-recognized that 2SLS can exhibit substantial finite sample (second-order) bias when the model is over-identified and the first stage partial R2 is low. The initial recommendation to solve this problem was to use LIML, e.g. Bekker (1994) or Staiger and Stock (1997).

 

However, Hahn, Hausman, and Kuersteiner (HHK 2004) demonstrated that the "problem" of LIML led to undesirable estimates in this situation. Morimune (1983) analyzed both the bias in 2SLS and the lack of moments in LIML. While it was long known that LIML did not have finite sample moments, it was less known that this lack of moments led to the undesirable property of considerable dispersion in the estimates, e.g. the inter-quartile range was much larger than that of 2SLS. HHK developed a jackknife 2SLS (J2SLS) estimator that attenuated the 2SLS bias problem and had good dispersion properties. They found in their empirical results that the J2SLS estimator or the Fuller estimator, which modifies LIML to have moments, did well on both the bias and dispersion criteria. Since the Fuller estimator had smaller second order MSE, HHK recommended using the Fuller estimator. However, Bekker and van der Ploeg (2005) and Hausman, Newey and Woutersen (HNW 2005) recognized that both Fuller and LIML are inconsistent under heteroscedasticity as the number of instruments becomes large in the Bekker (1994) sequence. Since econometricians recognize that heteroscedasticity is often present, this finding presents a problem. Hausman, Newey, Woutersen, Chao and Swanson (HNWCS 2007) solve this problem by proposing jackknife LIML (HLIML) and jackknife Fuller (HFull) estimators that are consistent in the presence of heteroscedasticity. HLIML does not have moments, so HNWCS (2007) recommend using HFull, which does have moments. However, a problem remains: if serial correlation or clustering exists, neither HLIML nor HFull is consistent.

 

The continuous updating estimator, CUE, which is the GMM-like generalization of LIML introduced by Hansen, Heaton, and Yaron (1996), would solve this problem. The CUE estimator also allows treatment of non-linear specifications, which the above estimators do not, and accommodates general non-spherical disturbances. However, CUE suffers from the moment problem and exhibits wide dispersion. GMM does not suffer from the no-moments problem, but, like 2SLS, GMM has finite sample bias that grows with the number of moments.

 

In this paper we modify CUE to solve the no moments/large dispersion problem. We consider the dual formulation of CUE and we modify the CUE first order conditions by adding a term of order 1/T. To first order the variance of the estimator is the same as GMM or CUE, so no large sample efficiency is lost. The resulting estimator has moments up to the degree of overidentification and demonstrates considerably reduced bias relative to GMM and reduced dispersion relative to CUE. Thus, we expect the new estimator will be useful for empirical research. We next consider a similar approach but use a class of functions which permits us to specify an estimator with all integral moments existing. Lastly, we demonstrate how this approach can be extended to the entire family of Maximum Empirical Likelihood (MEL) Estimators, so these estimators will have integral moments of all orders.
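For concreteness, the unmodified CUE objective for a linear IV model is sketched below; the O(1/T) correction to the first-order conditions proposed in the paper is not reproduced here.

```python
# Minimal sketch: the continuous-updating GMM (CUE) objective for a linear IV
# model y = x*beta + u with instruments Z. The paper modifies the CUE
# first-order conditions by an O(1/T) term; that correction is not shown here.
import numpy as np
from scipy.optimize import minimize_scalar

def cue_objective(beta, y, x, Z):
    u = y - x * beta
    g = Z * u[:, None]                       # moment contributions z_i * u_i
    g_bar = g.mean(axis=0)
    W = np.cov(g, rowvar=False)              # weight matrix re-estimated at each beta
    return g_bar @ np.linalg.solve(W, g_bar)

rng = np.random.default_rng(3)
n = 1_000
Z = rng.normal(size=(n, 4))
v = rng.normal(size=n)
x = Z @ np.array([0.5, 0.4, 0.3, 0.2]) + v
u = 0.6 * v + rng.normal(size=n)             # endogeneity through v
y = 1.5 * x + u

res = minimize_scalar(cue_objective, bounds=(-5, 5), args=(y, x, Z), method="bounded")
print(f"CUE estimate of beta: {res.x:.2f}")  # true value is 1.5
```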

]]>
http://www.ifs.org.uk/publications/4055 Fri, 14 Sep 2007 00:00:00 +0000
<![CDATA[On rate optimality for ill-posed inverse problems in econometrics]]> In this paper, we clarify the relations between the existing sets of regularity conditions for convergence rates of nonparametric indirect regression (NPIR) and nonparametric instrumental variables (NPIV) regression models. We establish minimax risk lower bounds in mean integrated squared error loss for the NPIR and the NPIV models under two basic regularity conditions that allow for both mildly ill-posed and severely ill-posed cases. We show that both a simple projection estimator for the NPIR model, and a sieve minimum distance estimator for the NPIV model, can achieve the minimax risk lower bounds, and are rate-optimal uniformly over a large class of structure functions, allowing for mildly ill-posed and severely ill-posed cases.
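To fix ideas about the estimators whose rates are studied here, the sketch below shows one simple series two-stage least squares implementation of a sieve estimator for the NPIV model, with arbitrary (not rate-optimal) sieve dimensions.

```python
# Minimal sketch: a series (sieve) 2SLS estimator for the NPIV model
# E[Y - h(X) | Z] = 0, with polynomial sieves of arbitrary, illustrative
# dimension. The true h below is quadratic, so it lies in the sieve space.
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
z = rng.uniform(-2, 2, n)
v = rng.normal(scale=0.5, size=n)
x = z + v                                           # endogenous regressor
y = x ** 2 + 0.8 * v + rng.normal(scale=0.2, size=n)

def poly_basis(a, degree):
    return np.column_stack([a ** j for j in range(degree + 1)])

P = poly_basis(x, 3)                                # sieve for h(X)
Q = poly_basis(z, 5)                                # instrument sieve
P_hat = Q @ np.linalg.lstsq(Q, P, rcond=None)[0]    # project the sieve on instruments
coef = np.linalg.lstsq(P_hat, y, rcond=None)[0]     # second-stage coefficients
print(np.round(coef, 2))                            # should be roughly (0, 0, 1, 0)
```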

]]>
http://www.ifs.org.uk/publications/4045 Mon, 10 Sep 2007 00:00:00 +0000
<![CDATA[Instrumental variable estimation with heteroskedasticity and many instruments]]>

It is common practice in econometrics to correct for heteroskedasticity. This paper corrects instrumental variables estimators with many instruments for heteroskedasticity. We give heteroskedasticity-robust versions of the limited information maximum likelihood (LIML) and Fuller (1977, FULL) estimators, as well as heteroskedasticity-consistent standard errors for them. The estimators are based on removing the own-observation terms in the numerator of the LIML variance ratio. We derive asymptotic properties of the estimators under many-instruments and many-weak-instruments setups. Based on a series of Monte Carlo experiments, we find that the estimators perform as well as LIML or FULL under homoskedasticity, and have much lower bias and dispersion under heteroskedasticity, in nearly all cases considered.
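The own-observation-removal idea also appears, in a simpler form, in the jackknife instrumental variables (JIVE) construction; the sketch below illustrates that related device for a single endogenous regressor, and should not be read as the heteroskedasticity-robust LIML or Fuller estimators derived in the paper.

```python
# Minimal sketch: leave-own-observation-out first-stage fitted values, as in
# the jackknife IV (JIVE) estimator. This is a related, simpler construction
# than the paper's heteroskedasticity-robust LIML/Fuller estimators.
import numpy as np

rng = np.random.default_rng(5)
n, k = 800, 20                                  # many, fairly weak instruments
Z = rng.normal(size=(n, k))
v = rng.normal(size=n)
x = Z @ np.full(k, 0.1) + v
eps = 0.5 * v + rng.normal(size=n) * (1 + 0.5 * np.abs(Z[:, 0]))   # heteroskedastic, endogenous
y = 1.0 * x + eps

H = Z @ np.linalg.solve(Z.T @ Z, Z.T)           # hat (projection) matrix
h = np.diag(H)
x_loo = (H @ x - h * x) / (1 - h)               # leave-own-observation-out fitted values
beta_jive = (x_loo @ y) / (x_loo @ x)
beta_2sls = ((H @ x) @ y) / ((H @ x) @ x)
print(f"2SLS: {beta_2sls:.2f}   JIVE: {beta_jive:.2f}   (true beta = 1.0)")
```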

]]>
http://www.ifs.org.uk/publications/4047 Sat, 01 Sep 2007 00:00:00 +0000
<![CDATA[An empirical investigation of labor income processes]]> In this paper we reassess the evidence on labor income risk. There are two leading views on the nature of the income process in the current literature. The first view, which we call the "Restricted Income Profiles" (RIP) process, holds that individuals are subject to large and very persistent shocks, while facing similar life-cycle income profiles. The alternative view, which we call the "Heterogeneous Income Profiles" (HIP) process, holds that individuals are subject to income shocks with modest persistence, while facing individual-specific income profiles. We first show that ignoring profile heterogeneity, when in fact it is present, introduces an upward bias into the estimates of persistence. Second, we estimate a parsimonious parameterization of the HIP process that is suitable for calibrating economic models. The estimated persistence is about 0.8 in the HIP process compared to about 0.99 in the RIP process. Moreover, the heterogeneity in income profiles is estimated to be substantial, explaining between 56 and 75 percent of income inequality at age 55. We also find that profile heterogeneity is substantially larger among higher educated individuals. Third, we discuss the source of identification - in other words, the aspects of labor income data that allow one to distinguish between the HIP and RIP processes. Finally, we show that the main evidence against profile heterogeneity in the existing literature - that the autocorrelations of income changes are small and negative - is also replicated by the HIP process, suggesting that this evidence may have been misinterpreted.

]]>
http://www.ifs.org.uk/publications/4023 Mon, 27 Aug 2007 00:00:00 +0000
<![CDATA[Instrumental variables estimation with flexible distribution]]> Instrumental variables are often associated with low estimator precision. This paper explores efficiency gains which might be achievable using moment conditions which are nonlinear in the disturbances and are based on flexible parametric families for error distributions. We show that these estimators can achieve the semiparametric efficiency bound when the true error distribution is a member of the parametric family. Monte Carlo simulations demonstrate low efficiency loss in the case of normal error distributions and potentially significant efficiency improvements in the case of thick-tailed and/or skewed error distributions.

]]>
http://www.ifs.org.uk/publications/4046 Mon, 20 Aug 2007 00:00:00 +0000
<![CDATA[Rearranging Edgeworth-Cornish-Fisher expansions]]> This paper applies a regularization procedure called increasing rearrangement to monotonize Edgeworth and Cornish-Fisher expansions and any other related approximations of distribution and quantile functions of sample statistics. Besides satisfying the logical monotonicity, required of distribution and quantile functions, the procedure often delivers strikingly better approximations to the distribution and quantile functions of the sample mean than the original Edgeworth-Cornish-Fisher expansions.

]]>
http://www.ifs.org.uk/publications/4008 Tue, 14 Aug 2007 00:00:00 +0000
<![CDATA[Nonparametric identification and estimation of nonclassical errors-in-variables models without additional information ]]> This paper considers identification and estimation of a nonparametric regression model with an unobserved discrete covariate. The sample consists of a dependent variable and a set of covariates, one of which is discrete and arbitrarily correlates with the unobserved covariate. The observed discrete covariate has the same support as the unobserved covariate, and can be interpreted as a proxy or mismeasure of the unobserved one, but with a nonclassical measurement error that has an unknown distribution. We obtain nonparametric identification of the model given monotonicity of the regression function and a rank condition that is directly testable given the data. Our identification strategy does not require additional sample information, such as instrumental variables or a secondary sample. We then estimate the model via the method of sieve maximum likelihood, and provide root-n asymptotic normality and semiparametric efficiency of smooth functionals of interest. Two small simulations are presented to illustrate the identification and the estimation results.

]]>
http://www.ifs.org.uk/publications/4006 Mon, 13 Aug 2007 00:00:00 +0000
<![CDATA[Better prepared for retirement? Using panel data to improve wealth estimates of ELSA respondents]]> We compare the key assumptions underpinning estimates of the pension wealth of ELSA respondents to outcomes over the period from 2002-03 to 2004-05. We find that many of these assumptions have, on average, proved cautious or reasonable. Improving pension wealth calculations using this new evidence makes little difference to the distribution of pension wealth. Previous estimates of retirement resources also considered net financial, physical and housing wealth. Particularly cautious, ex-post, was the assumption that net housing wealth would remain constant in real terms. We find that average housing wealth has risen by almost 40% in nominal terms over just two years, which is in line with growth in the Nationwide House Price Index. This large increase in house prices boosts estimates of total wealth across the entire distribution of wealth. Previous research showed that once half of current net housing wealth was included as a retirement resource 12.6% of employees approaching retirement were estimated to have resources below the Pensions Commission's definition of adequacy. We show that taking into account the high growth in house prices between 2002-03 and 2004-05 reduces this to 10.9%, and that it would fall by a further 1.2 percentage points if house prices were to grow by 2.5% a year in real terms in the future.

]]>
http://www.ifs.org.uk/publications/4007 Mon, 13 Aug 2007 00:00:00 +0000
<![CDATA[Differences in the measurement and structure of wealth using alternative data sources: the case of the UK]]> In this paper, we identify methodological differences and similarities in the measurement of wealth using survey data constructed for different purposes in the United Kingdom and England. The focus of the paper is on two prominent surveys in the UK: the English Longitudinal Survey of Ageing (ELSA) and the British Household Panel Survey (BHPS). We find conceptual differences in the measurement of financial assets and debt. At the same time, striking similarities exist in the measurement of non-financial assets. For the most part, many differences arise in the tails of the distributions of wealth. Comparable definitions of overall wealth in the surveys lead us to find a 10% and 3% difference in the mean and conditional median of total net worth, respectively. Reassuringly, inequality results carried out with the two surveys support one another, and quantile regression shows that the distribution of total net worth across demographic groups is similar in the two surveys.

]]>
http://www.ifs.org.uk/publications/4005 Tue, 07 Aug 2007 00:00:00 +0000
<![CDATA[What would you do? An investigation of stated-response data]]> When analysing choices or policy impacts, economists generally rely on what people actually do, rather than what they say they would do. The "stated response" approach is treated with scepticism due, for example, to concerns regarding the effect of strategic or social considerations on what people say, and a belief that people may not adequately consider such a hypothetical question. This paper evaluates an example of this approach; the direct questioning of parents as to whether they would withdraw their children from school if the Familias en Accion education subsidies were withdrawn. Our results suggest that these concerns are not entirely invalid but that the stated responses do provide important information and correlate in the expected manner with child and household characteristics. We conclude by emphasising the importance of good question design, which may allow researchers to use the "stated response" method as a complement to more typical quantitative methodologies.

]]>
http://www.ifs.org.uk/publications/4003 Wed, 01 Aug 2007 00:00:00 +0000
<![CDATA[Nonparametric identification of regression models containing a misclassified dichotomous regressor without instruments]]> This note considers nonparametric identification of a general nonlinear regression model with a dichotomous regressor subject to misclassification error. The available sample information consists of a dependent variable and a set of regressors, one of which is binary and error-ridden with misclassification error that has unknown distribution. Our identification strategy does not parameterize any regression or distribution functions, and does not require additional sample information such as instrumental variables, repeated measurements, or an auxiliary sample. Our main identifying assumption is that the regression model error has zero conditional third moment. The results include a closed-form solution for the unknown distributions and the regression function.

]]>
http://www.ifs.org.uk/publications/4002 Wed, 01 Aug 2007 00:00:00 +0000
<![CDATA[Moment inequalities and their application]]> This paper provides conditions under which the inequality constraints generated by either single agent optimizing behavior, or by the Nash equilibria of multiple agent problems, can be used as a basis for estimation and inference. We also add to the econometric literature on inference in models defined by inequality constraints by providing a new specification test and methods of inference for the boundaries of the model's identified set. Two applications illustrate how the use of inequality constraints can simplify the problem of obtaining estimators from complex behavioral models of substantial applied interest.

]]>
http://www.ifs.org.uk/publications/4001 Tue, 31 Jul 2007 00:00:00 +0000
<![CDATA[Mixed hitting-time models]]> We study a mixed hitting-time (MHT) model that specifies durations as the first time a Lévy process - a continuous-time process with stationary and independent increments - crosses a heterogeneous threshold. Such models are of substantial interest because they can be derived from optimal-stopping models with heterogeneous agents that do not naturally produce a mixed proportional hazards (MPH) structure. We show how strategies for analyzing the MPH model's identifiability can be adapted to prove identifiability of an MHT model with observed regressors and unobserved heterogeneity. We discuss inference from censored data and extensions to time-varying covariates and latent processes with more general time and dependency structures. We conclude by discussing the relative merits of the MHT and MPH models as complementary frameworks for econometric duration analysis.
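To fix ideas, the sketch below simulates durations as first-passage times of a Brownian motion with drift (the simplest Lévy process) across agent-specific thresholds; the discretisation and parameter values are purely illustrative.

```python
# Minimal sketch: simulating durations as first-passage times of a Brownian
# motion with drift across heterogeneous thresholds, on a discrete time grid.
# Grid size, drift, volatility and the threshold distribution are illustrative.
import numpy as np

rng = np.random.default_rng(6)
n, T, dt = 1_000, 50.0, 0.01
steps = int(T / dt)
mu, sigma = 0.3, 1.0
thresholds = rng.lognormal(mean=0.0, sigma=0.5, size=n)    # unobserved heterogeneity

increments = mu * dt + sigma * np.sqrt(dt) * rng.normal(size=(n, steps))
paths = np.cumsum(increments, axis=1)
crossed = paths >= thresholds[:, None]
first = np.argmax(crossed, axis=1)                          # first index at/above threshold
durations = np.where(crossed.any(axis=1), (first + 1) * dt, np.inf)   # inf = censored
print(f"median duration: {np.median(durations[np.isfinite(durations)]):.2f}")
```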

]]>
http://www.ifs.org.uk/publications/4000 Thu, 26 Jul 2007 00:00:00 +0000
<![CDATA[Nonparametric identification of the classical errors-in-variables model without side information]]> This note establishes that the fully nonparametric classical errors-in-variables model is identifiable from data on the regressor and the dependent variable alone, unless the specification is a member of a very specific parametric family. This family includes the linear specification with normally distributed variables as a special case. This result relies on standard primitive regularity conditions taking the form of smoothness and monotonicity of the regression function and nonvanishing characteristic functions of the disturbances.

]]>
http://www.ifs.org.uk/publications/3999 Thu, 26 Jul 2007 00:00:00 +0000
<![CDATA[Why do home owners work longer hours?]]> This paper uses a structural model to address the question of why home-owners with large mortgage debt work longer hours than those without such debt. We consider whether this is due to lower net wealth or to capital market imperfections, including mortgage constraints that depend on current earnings and, therefore, labour supply choices. We show that the need to meet current mortgage commitments can generate the observed correlation, and this impact of current commitments arises from the institutional borrowing constraints. We also show that labour supply as a function of household debt is highly nonlinear: those with greater debt are more likely to face binding borrowing constraints and their labour supply is more variable.

]]>
http://www.ifs.org.uk/publications/3998 Wed, 25 Jul 2007 00:00:00 +0000
<![CDATA[Consumption inequality and intra-household allocations]]> The consumption literature uses adult equivalence scales to measure individual level inequality. This practice imposes the assumption that there is no within household inequality.

In this paper, we show that ignoring consumption inequality within households produces misleading estimates of inequality along two dimensions. First, the use of adult equivalence scales underestimates the level of cross sectional consumption inequality by 30 percent, as large differences in the earnings of husbands and wives translate into large differences in consumption allocations within households. Second, the rise in inequality since the 1970s is overstated by almost two-thirds: within household inequality declined over time as the share of income provided by wives increased. Our findings also indicate that increases in marital sorting on wages and hours worked can simultaneously explain virtually all of the decline in within household inequality and a substantial fraction of the rise in between household inequality for one and two adult households in the UK since the 1970s.

]]>
http://www.ifs.org.uk/publications/3997 Wed, 25 Jul 2007 00:00:00 +0000
<![CDATA[Why is consumption more log normal than income? Gibrat's law revisited]]> Significant departures from log normality are observed in income data, in violation of Gibrat's law. We identify a new empirical regularity, which is that the distribution of consumption expenditures across households is, within cohorts, closer to log normal than the distribution of income. We explain these empirical results by showing that the logic of Gibrat's law applies not to total income, but to permanent income and to marginal utility. These findings have important implications for welfare and inequality measurement, aggregation, and econometric model analysis.

]]>
http://www.ifs.org.uk/publications/3987 Tue, 03 Jul 2007 00:00:00 +0000
<![CDATA[Taxation of the Family]]> This paper has been written as a review of the current tax treatment of the family in the UK. It considers the income tax, capital gains tax, inheritance tax and stamp duty implications of different types of family unit. It seeks to show where inconsistencies, confusion and discrepancies lie and considers whether marriage or the entering into a civil partnership offers tax advantages or disadvantages to the persons involved. It considers the implications of European anti-discrimination laws and looks at the stated policies of the Government, the Conservative Party and the Liberal Democrats.

]]>
http://www.ifs.org.uk/publications/3984 Mon, 18 Jun 2007 00:00:00 +0000
<![CDATA[Efficiency bounds for estimating linear functionals of nonparametric regression models with endogenous regressors]]> The main objective of this paper is to derive the efficiency bounds for estimating certain linear functionals of an unknown structural function when the latter is not itself a conditional expectation.

]]>
http://www.ifs.org.uk/publications/3982 Wed, 30 May 2007 00:00:00 +0000
<![CDATA[On the computational complexity of MCMC-based estimators in large samples]]> In this paper we examine the implications of the statistical large sample theory for the computational complexity of Bayesian and quasi-Bayesian estimation carried out using Metropolis random walks. Our analysis is motivated by the Laplace-Bernstein-Von Mises central limit theorem, which states that in large samples the posterior or quasi-posterior approaches a normal density. Using this observation, we establish polynomial bounds on the computational complexity of general Metropolis random walk methods in large samples. Our analysis covers cases where the underlying log-likelihood or extremum criterion function is possibly nonconcave, discontinuous, and of increasing dimension. However, the central limit theorem restricts the deviations from continuity and log-concavity of the log-likelihood or extremum criterion function in a very specific manner.
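For concreteness, a minimal random-walk Metropolis sampler of the kind the analysis applies to is sketched below, targeting a toy Gaussian log-posterior; the proposal scale and target are illustrative assumptions.

```python
# Minimal sketch: a random-walk Metropolis sampler for a toy (log) quasi-
# posterior. The target and proposal scale are illustrative; the paper's
# contribution concerns how the number of steps needed scales with the
# sample size and parameter dimension.
import numpy as np

def log_post(theta):
    return -0.5 * np.sum((theta - 1.0) ** 2)          # toy Gaussian log-posterior

rng = np.random.default_rng(7)
dim, n_steps, scale = 3, 20_000, 0.8
theta = np.zeros(dim)
draws = np.empty((n_steps, dim))
current_lp = log_post(theta)

for t in range(n_steps):
    proposal = theta + scale * rng.normal(size=dim)
    prop_lp = log_post(proposal)
    if np.log(rng.uniform()) < prop_lp - current_lp:   # Metropolis accept/reject step
        theta, current_lp = proposal, prop_lp
    draws[t] = theta

print(draws[5_000:].mean(axis=0))                      # posterior mean is roughly (1, 1, 1)
```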

]]>
http://www.ifs.org.uk/publications/3980 Tue, 29 May 2007 00:00:00 +0000
<![CDATA[Welfare reform in the UK: 1997 - 2007]]>

This paper, written at the request of the Economic Council of Sweden, presents a tour of welfare reform in the UK since the last change of government, summarising the most important changes in active labour market policies, and in measures intended to strengthen financial incentives to work. It argues that developments in the UK's active labour market policies occurred in two broad phases: first, the Government sought to strengthen ALMPs for those individuals deemed to be unemployed, through the New Deal programme. Second, the Government has reformed benefits for individuals traditionally viewed as inactive and thus excused job search activity, such as lone parents, and the sick and disabled. Accompanying these have been changes to direct taxes, tax credits and welfare benefits aiming to strengthen financial work incentives. However, financial work incentives have been strengthened by less than might be expected given the early rhetoric: the expansion in family-based tax credits has weakened the financial work incentives of (potential) second earners in families with children, many more workers now face combined marginal tax and tax credit withdrawal rates in excess of 60% than a decade ago, and a desire to achieve broad reductions in relative child poverty has led the Government to substantially increase the income available to non-working families with children. We also summarise evaluations of three important UK welfare-to-work reforms (WFTC, NDYP and Pathways to Work), but without comparing their efficacy.

]]>
http://www.ifs.org.uk/publications/4072 Thu, 17 May 2007 00:00:00 +0000
<![CDATA[Rarely pure and never simple: extracting the truth from self-reported data on substance use]]> We consider the misreporting of illicit drug use and juvenile smoking in self-report surveys and its consequences for statistical inference. Panel data containing repeated self-reports of 'lifetime' prevalence give unambiguous evidence of misreporting as 'recanting' of earlier reports of drug use. The identification of true initiation and reporting processes from such data is problematic in short panels, whilst more secure identification is possible in panels with at least five waves. Nevertheless, evidence from three UK datasets clearly indicates serious underreporting of cannabis, cocaine and tobacco use by young people, with consequent large biases in statistical modelling.

]]>
http://www.ifs.org.uk/publications/3976 Thu, 10 May 2007 00:00:00 +0000
<![CDATA[Quantile and probability curves without crossing]]>

The most common approach to estimating conditional quantile curves is to fit a curve, typically linear, pointwise for each quantile. Linear functional forms, coupled with pointwise fitting, are used for a number of reasons including parsimony of the resulting approximations and good computational properties. The resulting fits, however, may not respect a logical monotonicity requirement that the quantile curve be increasing as a function of probability. This paper studies the natural monotonization of these empirical curves induced by sampling from the estimated non-monotone model, and then taking the resulting conditional quantile curves that by construction are monotone in the probability.

]]>
http://www.ifs.org.uk/publications/3974 Mon, 30 Apr 2007 00:00:00 +0000
<![CDATA[Improving estimates of monotone functions by rearrangement]]> Suppose that a target function is monotonic, namely, weakly increasing, and an original estimate of the target function is available, which is not weakly increasing. Many common estimation methods used in statistics produce such estimates. We show that these estimates can always be improved with no harm using rearrangement techniques: The rearrangement methods, univariate and multivariate, transform the original estimate to a monotonic estimate, and the resulting estimate is closer to the true curve in common metrics than the original estimate. We illustrate the results with a computational example and an empirical example dealing with age-height growth charts.
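In the univariate case, the rearrangement of an estimate evaluated on a grid amounts to sorting its values; the sketch below illustrates the (weak) improvement on a toy example with a monotone target.

```python
# Minimal sketch: the univariate increasing rearrangement of an estimated
# curve evaluated on a grid is simply the sorted vector of its values. The
# toy "original estimate" is a noisy, non-monotone perturbation of a
# monotone target.
import numpy as np

rng = np.random.default_rng(8)
grid = np.linspace(0, 1, 200)
target = np.sqrt(grid)                                       # true weakly increasing function
original = target + rng.normal(scale=0.1, size=grid.size)    # non-monotone estimate
rearranged = np.sort(original)                               # increasing rearrangement

err_orig = np.mean((original - target) ** 2)
err_rear = np.mean((rearranged - target) ** 2)
print(f"MSE original: {err_orig:.4f}, MSE rearranged: {err_rear:.4f}")
```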

]]>
http://www.ifs.org.uk/publications/3975 Mon, 30 Apr 2007 00:00:00 +0000
<![CDATA[The weak instrument problem of the system GMM estimator in dynamic panel data models]]>

The system GMM estimator for dynamic panel data models combines moment conditions for the model in first differences with moment conditions for the model in levels. It has been shown to improve on the GMM estimator in the first-differenced model in terms of bias and root mean squared error. However, we show in this paper that in the covariance stationary panel data AR(1) model the expected values of the concentration parameters in the differenced and levels equations for the cross-section at time t are the same when the variances of the individual heterogeneity and idiosyncratic errors are the same. This indicates a weak instrument problem also for the equation in levels. We show that the 2SLS biases relative to the OLS biases are then similar for the equations in differences and levels, as are the size distortions of the Wald tests. These results are shown in a Monte Carlo study to extend to the panel data system GMM estimator.

]]>
http://www.ifs.org.uk/publications/3931 Tue, 27 Mar 2007 00:00:00 +0000
<![CDATA[Robust priors in nonlinear panel data models]]> Many approaches to estimation of panel models are based on an average or integrated likelihood that assigns weights to different values of the individual effects. Fixed effects, random effects, and Bayesian approaches all fall in this category. We provide a characterization of the class of weights (or priors) that produce estimators that are first-order unbiased. We show that such bias-reducing weights must depend on the data unless an orthogonal reparameterization or an essentially equivalent condition is available. Two intuitively appealing weighting schemes are discussed. We argue that asymptotically valid confidence intervals can be read from the posterior distribution of the common parameters when N and T grow at the same rate. Finally, we show that random effects estimators are not bias reducing in general and discuss important exceptions. Three examples and some Monte Carlo experiments illustrate the results.

]]>
http://www.ifs.org.uk/publications/3911 Wed, 21 Mar 2007 00:00:00 +0000
<![CDATA[Investment abroad and adjustment at home: evidence from UK multinational firms]]> I use within-firm, plant-level data combined with geographic information on firms' overseas operations to examine how investment in low-wage economies affects firms' home-country operations. To remain close to theory I focus on changes in firms' organisational and industrial structure driven by plant closures. As predicted by models of vertical multinationals I find that investment in relatively low-wage economies is associated with plant closures in relatively low-skill, labour-intensive industries in the UK. The findings are of interest in the context of the relaxation of barriers to inward investment in low-wage economies.

]]>
http://www.ifs.org.uk/publications/3900 Wed, 14 Mar 2007 00:00:00 +0000
<![CDATA[Semiparametric identification of structural dynamic optimal stopping time models]]> This paper presents new identification results for the class of structural dynamic optimal stopping time models that are built upon the framework of the structural discrete Markov decision processes proposed by Rust (1994). We demonstrate how to semiparametrically identify the deep structural parameters of interest in the case where the utility function of an absorbing choice in the model is parametric but the distribution of unobserved heterogeneity is nonparametric. Our identification strategy depends on the availability of a continuous observed state variable that satisfies certain exclusion restrictions. If such an excluded variable is available, we show that the dynamic optimal stopping model is semiparametrically identified using control function approaches.

]]>
http://www.ifs.org.uk/publications/3892 Thu, 08 Mar 2007 00:00:00 +0000
<![CDATA[Endogeneity and discrete outcomes]]>
http://www.ifs.org.uk/publications/3888 Tue, 06 Mar 2007 00:00:00 +0000
<![CDATA[Bias corrections for two-step fixed effects panel data estimators]]> This paper introduces bias-corrected estimators for nonlinear panel data models with both time invariant and time varying heterogeneity. These include limited dependent variable models with both unobserved individual effects and endogenous explanatory variables, and sample selection models with unobserved individual effects.

]]>
http://www.ifs.org.uk/publications/3870 Tue, 27 Feb 2007 00:00:00 +0000
<![CDATA[Electoral bias and policy choice: theory and evidence]]> This paper develops an approach to studying how bias in favor of one party due to the pattern of electoral districting affects policy choice. We tie a commonly used measure of electoral bias to the theory of party competition and show how this affects party strategy in theory. The usefulness of the approach is illustrated using data on local government in England. The results suggest that reducing electoral bias leads parties to moderate their policies.

]]>
http://www.ifs.org.uk/publications/3847 Mon, 05 Feb 2007 00:00:00 +0000
<![CDATA[Identification and estimation of firms' marginal cost functions with incomplete knowledge of strategic behavior]]> In this paper I develop a new approach for identification and estimation of the parameters of an oligopoly model, without relying on a potentially unverifiable equilibrium assumption. Rather, I consider inference on model parameters when the researcher does not know precisely what decision rule firms use, but is willing to consider a set of possibilities. In contrast to traditional approaches in the literature, the proposed methodology allows firm behavior to vary flexibly across observations, in a manner consistent with many Nash Equilibria. I derive identification results for both homogeneous product and differentiated product markets. Due to the flexibility afforded to firm behavior, the parameters of firms' marginal cost functions may only be set identified rather than point identified. The restrictions of the model are, however, still informative. I find that the size of the identified set for marginal cost parameters depends on the elasticity of market demand, the set of decision rules considered, and the functional form assumptions imposed. I formulate how to compute consistent set estimates for marginal cost parameters and demonstrate the proposed methodology with price and quantity data on the Joint Executive Committee, a 19th century railway cartel. To perform statistical inference, I implement the methodology of Rosen (2005) to construct asymptotically valid confidence regions for the partially identified marginal cost parameters. The application illustrates how the precision of estimated marginal costs depends on the elasticity of market demand as well as the extent to which firm behavior is allowed to vary.

]]>
http://www.ifs.org.uk/publications/3843 Fri, 02 Feb 2007 00:00:00 +0000
<![CDATA[Testing a parametric quantile-regression model with an endogenous explanatory variable against a nonparametric alternative]]> This paper is concerned with inference about a function g that is identified by a conditional quantile restriction involving instrumental variables. The paper presents a test of the hypothesis that g belongs to a finite-dimensional parametric family against a nonparametric alternative. The test is not subject to the ill-posed inverse problem of nonparametric instrumental variables estimation. Under mild conditions, the test is consistent against any alternative model. In large samples, its power is arbitrarily close to 1 uniformly over a class of alternatives whose distance from the null hypothesis is O(n-1/2), where n is the sample size. Monte Carlo simulations illustrate the finite-sample performance of the test.

]]>
http://www.ifs.org.uk/publications/3842 Thu, 01 Feb 2007 00:00:00 +0000
<![CDATA[The SES health gradient on both sides of the Atlantic]]> In this paper we investigate the size of health differences that exist among men in England and the United States and how those differences vary by Socio-Economic Status (SES) in both countries. Three SES measures are emphasized - education, household income, and household wealth - and the health outcomes investigated span multiple dimensions as well. International comparisons have played a central part of the recent debate involving the 'SES health gradient' with some authors citing cross-country differences in levels of income equality and mortality as among the most compelling evidence that unequal societies have negative impacts on individual health outcomes. In spite of the analytical advantages of making such international comparisons, until recently good micro data measuring both SES and health in comparable ways have not been available for both countries. Fortunately, that problem has been remedied with the fielding of two surveys - the Health and Retirement Survey (HRS) and the English Longitudinal Survey of Aging (ELSA). In order to facilitate the type of research represented in this paper, both the health and SES measures in ELSA and HRS were purposely constructed to be as directly comparable as possible.

Our analysis presents data on some of the most salient issues regarding the social gradient in health and the manner in which this health gradient differs for men across the two countries in question. There are several key findings. First, looking across a wide variety of diagnosed diseases, average health status among mature men is much worse in America compared to England, confirming non-gender-specific findings we reported in earlier research. Second, there exists a steep negative health gradient for men in both countries, where men at the bottom of the economic hierarchy are in much worse health than those at the top. This social health gradient exists whether education, income, or financial wealth is used as the marker of SES. While the negative social gradient in male health characterizes men in both countries, it appears to be steeper in the United States. These central conclusions are maintained even after controlling for a standard set of behavioral risk factors such as smoking, drinking, and obesity, and are equally true using either biological measures of disease or individual self-reports.

In contrast to these disease-based measures of health, the health of American men appears to be superior to the health of English men when self-reported subjective general health status is used as the measure of health status. This apparent contradiction does not result from differences in co-morbidity, emotional health, or ability to function, all of which still point to mature American men being less healthy than their English counterparts. The contradiction most likely stems instead from different thresholds used by Americans and the English when evaluating their health status on subjective scales. For the same 'objective' health status, Americans are much more likely to say that their health is good than are the English. Finally, we present preliminary data indicating that feedback from new health events to household income is also one of the reasons underlying the strength of the income gradient with health in England. Previous research has demonstrated its importance as one of the underlying causes in the United States, and these results suggest that that conclusion should most likely be extended to England as well, although further research is required on this topic.

]]>
http://www.ifs.org.uk/publications/3834 Mon, 22 Jan 2007 00:00:00 +0000
<![CDATA[The impact of income shocks on health: evidence from cohort data]]> We study the effect of permanent income innovations on health for a prime-aged population. Using information on more than half a million individuals sampled over a twenty-five year period in three different cross-sectional surveys, we aggregate data by date-of-birth cohort to construct a 'synthetic cohort' dataset with details of income, expenditure, socio-demographic factors, health outcomes and selected risk factors. We then exploit structural and arguably exogenous changes in cohort incomes over the eighties and nineties to uncover causal effects of permanent income shocks on health. We find that such income innovations have little effect on health, but do affect health behaviour and mortality.

]]>
http://www.ifs.org.uk/publications/3835 Mon, 22 Jan 2007 00:00:00 +0000
<![CDATA[Distributional effects in household models: separate spheres and income pooling]]> We derive distributional effects for a non-cooperative alternative to the unitary model of household behaviour. We consider the Nash equilibria of a voluntary contributions to public goods game. Our main result is that, in general, the two partners either choose to contribute to different public goods or they contribute to at most one common good. The former case corresponds to the separate spheres case of Lundberg and Pollak (1993). The second outcome yields (local) income pooling. A household will be in different regimes depending on the distribution of income within the household. Any bargaining model with this non-cooperative case as a breakdown point will inherit the local income pooling. We conclude that targeting benefits such as child benefits to one household member may not always have an effect on outcomes.

]]>
http://www.ifs.org.uk/publications/3832 Thu, 18 Jan 2007 00:00:00 +0000
<![CDATA[University research and the location of business R&D]]> We investigate the relationship between the location of private sector R&D labs and university research departments in Great Britain. We combine establishment-level data on R&D activity with information on levels and changes in research quality from the Research Assessment Exercise. The strongest evidence for co-location is for pharmaceuticals R&D, which is disproportionately located near relevant university research, particularly 5 or 5* rated chemistry departments. This relationship is stronger for foreign-owned labs, consistent with multinationals sourcing technology internationally. We also find some evidence for co-location with lower-rated research departments in industries such as machinery and communications equipment.

]]>
http://www.ifs.org.uk/publications/3829 Thu, 04 Jan 2007 00:00:00 +0000
<![CDATA[Demand properties in household Nash equilibrium]]> Please Note: This paper was updated in July 2007

We study noncooperative household models with two agents and several voluntarily contributed public goods, deriving the counterpart to the Slutsky matrix and demonstrating how its properties deviate from those of a true Slutsky matrix in the unitary model. We demonstrate the importance of distinguishing between cases in which there are and are not jointly contributed public goods and provide results characterising both cases. Demand properties are contrasted with those of collective models, and conclusions are drawn regarding the possibility of empirically testing the collective model against noncooperative alternatives.
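
For reference, the benchmark against which these deviations are measured is the textbook Slutsky matrix of the unitary model, which is symmetric and negative semidefinite; this restatement is standard consumer theory rather than a result of the paper:

S_{ij}(p,y) \;=\; \frac{\partial x_i(p,y)}{\partial p_j} \;+\; x_j(p,y)\,\frac{\partial x_i(p,y)}{\partial y},
\qquad S = S^{\top}, \qquad v^{\top} S\, v \le 0 \ \text{ for all } v.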

]]>
http://www.ifs.org.uk/publications/3828 Wed, 03 Jan 2007 00:00:00 +0000
<![CDATA[Correlation testing in time series, spatial and cross-sectional data]]> We provide a general class of tests for correlation in time series, spatial, spatio-temporal and cross-sectional data. We motivate our focus by reviewing how computational and theoretical difficulties of point estimation mount as one moves from regularly-spaced time series data, through forms of irregular spacing, to spatial data of various kinds. A broad class of computationally simple tests is justified. These specialize Lagrange multiplier tests against parametric departures of various kinds. Their forms are illustrated in the case of several models for describing correlation in various kinds of data. The initial focus assumes homoscedasticity, but we also robustify the tests to nonparametric heteroscedasticity.
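
As a concrete illustration of the kind of computationally simple test in this class, the sketch below computes a Lagrange multiplier statistic for first-order serial correlation in regression residuals (the familiar Breusch-Godfrey n·R² form). It is a generic textbook example with illustrative variable names, not the tests proposed in the paper.

import numpy as np
from scipy.stats import chi2

def lm_serial_correlation(y, x):
    """LM (n * R^2) statistic for first-order residual autocorrelation;
    asymptotically chi-squared with 1 degree of freedom under the null."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])              # regressors with intercept
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]  # OLS residuals
    u_lag = np.concatenate(([0.0], u[:-1]))           # lagged residuals (zero start)
    Z = np.column_stack([X, u_lag])                   # auxiliary regressors
    fitted = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    r2 = 1.0 - np.sum((u - fitted) ** 2) / np.sum((u - u.mean()) ** 2)
    return n * r2

# Illustrative use with simulated AR(1) errors.
rng = np.random.default_rng(0)
x = rng.standard_normal(300)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 0.5 * e[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + e
lm = lm_serial_correlation(y, x)
print(lm, chi2.sf(lm, df=1))   # statistic and asymptotic p-value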

]]>
http://www.ifs.org.uk/publications/3826 Mon, 01 Jan 2007 00:00:00 +0000
<![CDATA[On the conditional likelihood ratio test for several parameters in IV regression]]> For the problem of testing the hypothesis that all m coefficients of the RHS endogenous variables in an IV regression are zero, the likelihood ratio (LR) test can, if the reduced form covariance matrix is known, be rendered similar by a conditioning argument. To exploit this fact requires knowledge of the relevant conditional cdf of the LR statistic, but the statistic is a function of the smallest characteristic root of an (m + 1)-square matrix, and is therefore analytically difficult to deal with when m > 1. We show in this paper that an iterative conditioning argument used by Hillier (2006) and Andrews, Moreira, and Stock (2007) to evaluate the cdf in the case m = 1 can be generalized to the case of arbitrary m. This means that we can completely bypass the difficulty of dealing with the smallest characteristic root. Analytic results are obtained for the case m = 2, and a simple and efficient simulation approach to evaluating the cdf is suggested for larger values of m.
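
The simulation route mentioned in the last sentence can be given a generic flavour with the toy sketch below, which approximates by Monte Carlo the distribution of a statistic defined as the smallest characteristic root of an (m + 1)-square cross-product matrix under assumed standard normal data. This is only an illustration of simulating such a distribution, not the conditioning argument or the statistic studied in the paper.

import numpy as np

def simulate_smallest_roots(n_obs, m, n_draws=10_000, seed=0):
    """Monte Carlo draws of the smallest eigenvalue of Z'Z, where Z is an
    n_obs-by-(m+1) matrix of independent standard normals (assumed null)."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_draws)
    for r in range(n_draws):
        Z = rng.standard_normal((n_obs, m + 1))
        draws[r] = np.linalg.eigvalsh(Z.T @ Z)[0]   # smallest characteristic root
    return draws

# Empirical quantiles of the simulated distribution (illustrative m = 2).
draws = simulate_smallest_roots(n_obs=100, m=2)
print(np.quantile(draws, [0.05, 0.5, 0.95]))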

]]>
http://www.ifs.org.uk/publications/3824 Tue, 12 Dec 2006 00:00:00 +0000
<![CDATA[Confidence sets for partially identified parameters that satisfy a finite number of moment inequalities]]> This paper proposes a new way to construct confidence sets for a parameter of interest in models comprised of finitely many moment inequalities. Building on results from the literature on multivariate one-sided tests, I show how to test the hypothesis that any particular parameter value is logically consistent with the maintained moment inequalities. The associated test statistic has an asymptotic chi-bar-square distribution, and can be inverted to construct an asymptotic confidence set for the parameter of interest, even if that parameter is only partially identified. The confidence sets are easily computed, and Monte Carlo simu