# Vector quantile regression: an optimal transport approach

We propose a notion of conditional vector quantile function and a vector quantile regression. A *conditional vector quantile function* (CVQF) of a random vector $Y$, taking values in $\mathbb{R}^d$, given covariates $Z=z$, taking values in $\mathbb{R}^k$, is a map $u \longmapsto Q_{Y|Z}(u,z)$ that is monotone, in the sense of being the gradient of a convex function, and such that, when the vector $U$ follows a reference non-atomic distribution $F_U$, for instance the uniform distribution on the unit cube in $\mathbb{R}^d$, the random vector $Q_{Y|Z}(U,z)$ has the distribution of $Y$ conditional on $Z=z$. Moreover, we have a strong representation, $Y = Q_{Y|Z}(U,Z)$ almost surely, for some version of $U$. The *vector quantile regression* (VQR) is a linear model for the CVQF of $Y$ given $Z$. Under correct specification, the notion produces the strong representation $Y = \beta(U)^\top f(Z)$, for $f(Z)$ denoting a known set of transformations of $Z$, where $u \longmapsto \beta(u)^\top f(Z)$ is a monotone map, the gradient of a convex function, and the quantile regression coefficients $u \longmapsto \beta(u)$ have interpretations analogous to those of standard scalar quantile regression. As $f(Z)$ becomes a richer class of transformations of $Z$, the model becomes nonparametric, as in series modelling. A key property of VQR is that it embeds the classical Monge–Kantorovich optimal transportation problem at its core as a special case. In the classical case, where $Y$ is scalar, VQR reduces to a version of the classical QR, and the CVQF reduces to the scalar conditional quantile function. An application to multiple Engel curve estimation is considered.
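To make the optimal transport connection concrete, the following is a minimal sketch of the covariate-free, discrete case, not the authors' estimator. It assumes $n$ reference points $u_i$ standing in for $F_U$ and $n$ sample points $y_j$, and it finds the one-to-one assignment that maximizes $\sum_i u_i^\top y_{\sigma(i)}$, which for equal-weight points is equivalent to minimizing the total squared distance. The function name `discrete_vector_quantile` and the data are hypothetical, and the brute-force search over permutations is only practical for a handful of points.

```python
from itertools import permutations


def discrete_vector_quantile(us, ys):
    """Assign each reference point u_i a sample point y_sigma(i).

    Hypothetical helper for illustration: it solves the discrete
    Monge-Kantorovich assignment problem by brute force, choosing the
    permutation sigma that maximizes sum_i <u_i, y_sigma(i)>.
    Points are given as tuples of equal dimension.
    """
    n = len(us)

    def total_inner_product(perm):
        # Sum of inner products between each u_i and its assigned y.
        return sum(
            sum(a * b for a, b in zip(us[i], ys[perm[i]]))
            for i in range(n)
        )

    best = max(permutations(range(n)), key=total_inner_product)
    return [ys[j] for j in best]


# Scalar case (d = 1): reference points on a grid in (0, 1).
us = [(0.1,), (0.3,), (0.5,), (0.7,), (0.9,)]
ys = [(2.0,), (-1.0,), (0.5,), (3.2,), (1.1,)]
q = discrete_vector_quantile(us, ys)
print(q)
```

In the scalar case the optimal assignment pairs the smallest $u_i$ with the smallest $y_j$ and so on, so `q` lists the sample in increasing order, which is the empirical quantile function evaluated on the grid. In higher dimensions the same assignment satisfies $(u_i - u_j)^\top (q_i - q_j) \ge 0$ for every pair, a discrete counterpart of the monotonicity required of the CVQF.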