As is well known, there are substantial difficulties in comparing income distribution
data across countries.10
Countries differ in the coverage of the survey (national versus
subnational), in the welfare measure (income versus consumption), the measure of
income (gross versus net), and the unit of observation (individuals versus households).
We are able to adjust for these differences only very imperfectly.
We have restricted our
sample to only income distribution measures based on nationally representative surveys.
For all surveys we have information on whether the welfare measure is income or
consumption, and for the majority of these we also know whether the income measure is
gross or net of taxes and transfers.
While we do have information on whether the
recipient unit is the individual or the household, for most of our observations we do not
have information on whether the Lorenz curve refers to the fraction of individuals or the
fraction of households.11
As a result, this last piece of information is of little help in
adjusting for methodological differences in measures of income distribution across
countries.
We therefore implement the following very crude adjustment for observable
differences in survey type.
We pool our sample of 418 observations separated by at
least five years, and regress both the Gini coefficient and the first quintile share on a
constant, a set of regional dummies, and dummy variables indicating whether the
welfare measure is gross income or whether it is consumption. We then subtract the
estimated mean difference between these two alternatives and the omitted category to
arrive at a set of distribution measures that notionally correspond to the distribution of
income net of taxes and transfers.12 The results of these adjustment regressions are
reported in Table 2.
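As a rough illustration of the adjustment just described, the following Python sketch regresses a distribution measure on a constant, regional dummies, and the two survey-type dummies, and then subtracts the estimated survey-type effects so that every observation notionally refers to net income. The function name, the integer region coding, and the synthetic data are assumptions made purely for illustration; they are not the data or code behind Table 2.

```python
import numpy as np

def adjust_for_survey_type(measure, region, gross_income, consumption):
    """Remove estimated mean survey-type differences from a distribution measure.

    measure:      1-D array (e.g. Gini coefficients or first quintile shares)
    region:       1-D array of region codes (hypothetical coding)
    gross_income: 1-D 0/1 array, 1 if the welfare measure is gross income
    consumption:  1-D 0/1 array, 1 if the welfare measure is consumption
    The omitted survey-type category is net income.
    """
    measure = np.asarray(measure, dtype=float)
    region = np.asarray(region)
    gross_income = np.asarray(gross_income, dtype=float)
    consumption = np.asarray(consumption, dtype=float)

    # Regional dummies; the first region is dropped to avoid collinearity
    # with the constant.
    regions = np.unique(region)
    region_dummies = (region[:, None] == regions[None, 1:]).astype(float)

    # Design matrix: constant, regional dummies, survey-type dummies.
    X = np.column_stack(
        [np.ones(len(measure)), region_dummies, gross_income, consumption]
    )
    coef, *_ = np.linalg.lstsq(X, measure, rcond=None)
    b_gross, b_cons = coef[-2], coef[-1]

    # Subtract the estimated mean difference relative to the omitted category.
    adjusted = measure - b_gross * gross_income - b_cons * consumption
    return adjusted, b_gross, b_cons

# Synthetic example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n = 400
region = rng.integers(0, 4, size=n)
survey = rng.integers(0, 3, size=n)  # 0 = net income, 1 = gross income, 2 = consumption
gross = (survey == 1).astype(float)
cons = (survey == 2).astype(float)
gini = 35.0 + 2.0 * region + 4.0 * gross - 3.0 * cons + rng.normal(0.0, 1.0, size=n)

adjusted, b_gross, b_cons = adjust_for_survey_type(gini, region, gross, cons)
```

In this sketch, observations already based on net income are left unchanged, while gross-income and consumption observations are shifted by the estimated dummy coefficients.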
3.2 Estimation
10 See Atkinson and Brandolini (1999) for a detailed discussion of these issues.
11 This information is only available for the Chen-Ravallion dataset, which exclusively refers to individuals
and for which the Lorenz curve is consistently constructed using the fraction of individuals on the horizontal
axis.
12 Our main results do not change substantially if we use three other possibilities: (1) ignoring differences in
survey type, (2) including dummy variables for survey type as strictly exogenous right-hand side variables in
our regressions, or (3) adding country fixed effects to the adjustment regression so that the mean
differences in survey type are estimated from the very limited within-country variation in survey type.
In order to examine how incomes of the poor vary with overall incomes, we
estimate variants of the following regression of the logarithm of per capita income of the
poor ($y^p$) on the logarithm of average per capita income ($y$) and a set of additional control
variables ($X$):
(1) $y^p_{ct} = \alpha_0 + \alpha_1 \cdot y_{ct} + \alpha_2' X_{ct} + \mu_c + \varepsilon_{ct}$
where c and t index countries and years, respectively, and $\mu_c + \varepsilon_{ct}$ is a composite error
term including unobserved country effects. We have already seen the pooled version of
Equation (1) with no control variables $X_{ct}$ in the top panel of Figure 1 above. Since
incomes of the poor are equal to the first quintile share times average income
divided by 0.2, it is clear that Equation (1) is identical to a regression of the log of the first
quintile share on average income and a set of control variables:
(2) $\ln\left(\frac{Q1_{ct}}{0.2}\right) = \alpha_0 + (\alpha_1 - 1) \cdot y_{ct} + \alpha_2' X_{ct} + \mu_c + \varepsilon_{ct}$
Moreover, since empirically the log of the first quintile share is almost exactly a linear
function of the Gini coefficient, Equation (1) is almost equivalent to a regression of a
negative constant times the Gini coefficient on average income and a set of control
variables.13
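The step from Equation (1) to Equation (2) can be written out explicitly. The notation here is assumed for illustration: $Y_{ct}$ denotes average per capita income in levels (so $y_{ct} = \ln Y_{ct}$), $Q1_{ct}$ the first quintile share, $\alpha_0, \alpha_1, \alpha_2$ the coefficients of Equation (1), and $\mu_c + \varepsilon_{ct}$ its composite error term. Taking logs of the definition of incomes of the poor and subtracting $y_{ct}$ from both sides of Equation (1) gives:

```latex
\begin{align*}
y^p_{ct} &= \ln\!\left(\frac{Q1_{ct}\, Y_{ct}}{0.2}\right)
          = \ln\!\left(\frac{Q1_{ct}}{0.2}\right) + y_{ct}, \\
\ln\!\left(\frac{Q1_{ct}}{0.2}\right) &= y^p_{ct} - y_{ct}
          = \alpha_0 + (\alpha_1 - 1)\, y_{ct} + \alpha_2' X_{ct} + \mu_c + \varepsilon_{ct}.
\end{align*}
```

Only the coefficient on average income changes, by exactly one; the coefficients on the control variables and the error term are the same in both equations.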
We are interested in two key parameters from Equation (1). The first is $\alpha_1$, which
measures the elasticity of income of the poor with respect to mean income. A value of
$\alpha_1 = 1$ indicates that growth in mean income is translated one-for-one into growth in
income of the poor. From Equation (2) this is equivalent to the observation that the
share of income accruing to the poorest quintile does not vary systematically with
average incomes ($\alpha_1 - 1 = 0$). Estimates of $\alpha_1$ greater or less than one indicate that growth
more than or less than proportionately benefits those in the poorest quintile. The second
parameter of interest is $\alpha_2$, which measures the impact of other determinants of income
13 In our sample of spaced observations, a regression of the log first quintile share on the Gini coefficient
delivers a slope of -23.3 with an R-squared of 0.80.
of the poor over and above their impact on mean income. Equivalently from Equation
(2), $\alpha_2$ measures the impact of these other variables on the share of income accruing to
the poorest quintile, holding constant average incomes.
Simple ordinary least squares (OLS) estimation of Equation (1) using pooled
country-year observations is likely to result in inconsistent parameter estimates for
several reasons.14 Measurement error in average incomes or the other control
variables in Equation (1) will lead to biases that are difficult to sign except under very
restrictive assumptions.15 Since we consider only a fairly parsimonious set of right-hand-side
variables in X, omitted determinants of the log quintile share that are correlated with
either X or average incomes can also bias our results. Finally, there may be reverse
causation from average incomes of the poor to average incomes, or equivalently from
the log quintile share to average incomes, as suggested by the large empirical literature
which has examined the effects of income distribution on subsequent growth. This
literature typically estimates growth regressions with a measure of initial income
inequality as an explanatory variable, such as:
(3) $y_{ct} = \beta_0 + \beta_1 \cdot y_{c,t-k} + \beta_2 \cdot \ln\left(\frac{Q1_{c,t-k}}{0.2}\right) + \beta_3' Z_{ct} + \eta_c + v_{ct}$
This literature has found mixed results using different samples and different econometric
techniques. On the one hand, Perotti (1996) and Barro (1999) find evidence of a
negative effect of income inequality on growth (i.e. $\beta_2 > 0$). On the other hand, Forbes
(2000) and Li and Zou (1998) both find positive effects of income inequality on growth
(i.e. $\beta_2 < 0$). Finally, Banerjee and Duflo (1999) modestly, and perhaps most
appropriately, conclude that there is at best weak evidence of a U-shaped correlation
between income inequality and growth and that very little can be said about causation in
14 It should also be clear that OLS standard errors will be inconsistent given the cross-observation
correlations induced by the unobserved country-specific effect.
15 While at first glance it may appear that measurement error in per capita income (which is also used to
construct our measure of incomes of the poor) will bias the coefficient on per capita income towards one in
Equation (1), this is not the case. From Equation (2) (which of course yields identical estimates of the
parameters of interest as does Equation (1)) it is clear that we only have a problem to the extent that
measurement error in the first quintile share is correlated with average incomes. Since our data on income
distribution and average income are drawn from different sources, there is no a priori reason to expect such
a correlation. When average income is taken from the same household survey, under plausible
assumptions even measurement error in both variables will not lead to inconsistent coefficient estimates
(Chen and Ravallion (1997)).
either direction. Whatever the true underlying relationship, it is clear that as long as $\beta_2$ is
not equal to zero, OLS estimation of Equations (1) or (2) will yield inconsistent estimates
of the parameters of interest. For example, high realizations of $\mu_c$, which result in higher
incomes of the poor relative to mean income in Equation (1), will also raise (lower) mean
incomes in Equation (3), depending on whether $\beta_2$ is greater than (less than) zero. This
could induce an upwards (downwards) bias into estimates of the elasticity of incomes of
the poor with respect to mean incomes in Equation (1).
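The direction of this bias can be illustrated with a small simulation. The sketch below is a deliberately simplified, reduced-form stand-in for the feedback described above, not the paper's model or data: mean income is assumed to load on the country effect through a hypothetical parameter `feedback`, whose sign plays the role of the sign of the inequality coefficient in Equation (3). All parameter values and names are assumptions chosen for illustration.

```python
import numpy as np

def pooled_ols_slope(feedback, alpha1=1.0, n_countries=500, n_periods=5, seed=0):
    """Pooled OLS slope of income of the poor on mean income (both in logs).

    feedback: assumed loading of mean income on the country effect;
              positive values raise mean income when the country effect is
              high, negative values lower it.
    """
    rng = np.random.default_rng(seed)

    # Unobserved country effects, repeated for each period of a country.
    mu = rng.normal(0.0, 0.3, size=n_countries)
    mu_ct = np.repeat(mu, n_periods)
    n_obs = mu_ct.size

    # Reduced-form stand-in for the feedback: mean income depends on mu_c.
    y = 8.0 + feedback * mu_ct + rng.normal(0.0, 1.0, size=n_obs)

    # Equation (1) without control variables, with true elasticity alpha1.
    eps = rng.normal(0.0, 0.1, size=n_obs)
    y_poor = -1.0 + alpha1 * y + mu_ct + eps

    # Pooled OLS of y_poor on a constant and y, ignoring the country effect.
    X = np.column_stack([np.ones(n_obs), y])
    coef, *_ = np.linalg.lstsq(X, y_poor, rcond=None)
    return float(coef[1])

slope_up = pooled_ols_slope(feedback=1.0)     # country effect raises mean income
slope_none = pooled_ols_slope(feedback=0.0)   # no feedback
slope_down = pooled_ols_slope(feedback=-1.0)  # country effect lowers mean income
```

With a true elasticity of one, the pooled OLS estimate in this setup lies above one under positive feedback, below one under negative feedback, and close to one when mean income is unrelated to the country effect.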