Policy Research Working Paper                      10748




   Multidimensional and Specific Inequalities
                                 James E. Foster
                                 Michael Lokshin




Europe and Central Asia Region
Office of the Chief Economist
April 2024


  Abstract
  Despite the multitude of measures of multidimensional inequality, none is regularly used in
  policymaking. This paper proposes multidimensional inequality measures that are easily
  implementable and transparent and overcome many deficiencies of existing measures. The measures
  follow a traditional two-stage format, which aggregates dimensions first and then applies a
  unidimensional measure like the Gini coefficient to the distribution of aggregates. A novel
  characterization result identifies the precise form of aggregation needed to obtain axiomatically
  sound measures. The paper derives an additive decomposition formula, breaking down
  multidimensional inequality into terms reflecting the average specific inequalities (within
  dimensions) and the joint distribution (across dimensions), for any measure created using a
  standard unidimensional measure or the Lorenz curve. The paper also provides an approach to
  calibrating the measure for use with data over time, replacing the usual ad hoc normalization of
  variables with one that accounts for a policymaker’s normative weights. The technology is
  illustrated first using synthetic data to understand how the measure varies as the components are
  changed and then using data from Azerbaijan.




 This paper is a product of the Office of the Chief Economist, Europe and Central Asia Region. It is part of a larger effort
 by the World Bank to provide open access to its research and make a contribution to development policy discussions
 around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The
 authors may be contacted at mlokshin@worldbank.org and fosterje@email.gwu.edu.




          The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
          issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
          names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
          of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
          its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.


                                                        Produced by the Research Support Team
    Multidimensional and Specific Inequalities*

                                                James E. Foster
                Oliver T. Carr Jr Professor of International Affairs and Professor of Economics
                                      The George Washington University

                                               Michael Lokshin
            Lead Economist with the Office of the Chief Economist for Europe and Central Asia,
                                            The World Bank




JEL: D30, D63
Keywords: Multidimensional Inequality, Axioms, Measures, Lorenz Curves, Decompositions



*
  The authors have benefitted from conversations with Jean Dreze, Chico Ferreira, Stephen Jenkins, Ravi Kanbur, Suman
Seth, Tony Shorrocks, Stephen Smith, and the participants of seminars at the LSE and the LIS. We thank Carolina
Sanchez-Paramo for her support of this research project. Foster is grateful to the Bank and to the International
Inequalities Institute of the LSE for hosting his sabbatical leave. This research was partially supported by the World
Bank.
I. Introduction

While global income inequality has declined since the 1990s, income inequality levels within
countries have been rising for a wide range of developed and developing countries, capturing the
attention of social activists and policymakers alike. More than half of all countries and close to
90 percent of advanced economies have seen an increase in income inequality since 2000, with
the income Gini increasing by more than two points in some instances (IMF 2023). Yet
inequality is present not only in the space of incomes; it inhabits other key dimensions such as
health, education, and social services, whose dimension-specific inequalities may reinforce or
dampen the impact of income inequality. Empirical data on income inequality offers only a
partial view of what Amartya Sen has termed “economic inequality” and can limit the scope and
accuracy of a country’s policy responses (Sen 1997, 1999).

Dashboards and weighted averages of dimension-specific inequalities can help paint a broader
picture of economic inequality within a country, and its evolution through time. However, they
completely ignore the joint distribution of dimensional variables, which conveys important
information on how people in the country are experiencing inequality. For example, it could be
the case that most people with lower levels of one variable have lower levels of the others,
resulting in a rigid hierarchy of achievement vectors and, arguably, greater economic inequality.
Alternatively, people could exhibit mixed levels of achievements, dampening positive
association and its impact on inequality. Measurement tools sensitive to the joint distribution can
distinguish between these situations, and better gauge the extent of economic inequality. 1

Following the pioneering work of Kolm (1977), Atkinson and Bourguignon (1982), and
Maasoumi (1986), there have been significant advances in the range of tools available for
measuring inequality when there are multiple dimensions, including new classes of measures,
axioms for discerning among measures, and dominance methods for ensuring comparisons are
robust. 2 In their survey on multidimensional poverty and inequality, Aaberge and Brandolini
(2015, p. 201) note a strong demand for multidimensional analyses by policymakers and other
stakeholders, and indeed, empirical and policy applications of multidimensional poverty indices

1
  For Aaberge and Brandolini (2015, p. 146) this sensitivity is “the single feature that distinguishes multidimensional
from unidimensional analysis.” Note that it also places additional requirements on the data that can be used.
2
  See the recent surveys of Aaberge and Brandolini (2015), Andreoli and Zoli (2020), Glassman (2019), and Seth and
Santos (2018).


(MPIs) are numerous. 3 However, for multidimensional inequality measures, the policy impact
has been more muted.

Why is this? One possibility might be the complexity of existing measures. 4 To be effective in
policy analysis, a measure needs to be easily understood and communicated. But this desirable
characteristic is unlikely to be present unless it has been prioritized and intentionally built into
the measure along with the traditional axiomatic requirements. 5 In addition, the process of
bringing a particular multidimensional measure to data can be daunting, requiring many
consequential choices not addressed in most theoretical presentations (Alkire and Foster 2010).
On what basis should a given cardinalization of a variable be selected? How can the variables be
made comparable to one another? Where should the relative importance of variables be
reflected? The structure of a measure might facilitate or hinder the answers to such questions,
impacting its dependability and ease of use in policy analyses.

The aim of this paper is to identify axiomatically-sound multidimensional inequality measures
having attributes well-suited for policy. 6 Our focus is the two-stage approach of Maasoumi
(1986), which yields intuitive measures frequently used in empirical analyses. 7 In this approach,
a multidimensional inequality measure is constructed using two components: an aggregation
function converting each person’s dimensional achievements into an aggregate indicator; and an
inequality measure evaluating the resulting vector of indicators. His original presentation used
general means and generalized entropy measures; ours considers a general set of aggregation
functions based on Bosmans et al. (2015) and Lorenz-consistent inequality measures. 8

Two-stage measures inherit several of the standard axioms for multidimensional inequality
measures from the properties of their components. However, Dardanoni (1995) has shown that at

3
  Google Scholar and the site www.mppn.org list hundreds of empirical studies using MPIs along with many policy
applications. The relative infrequency for multidimensional inequality indices is noted in Hong (2009), Seth and
Santos (2018) and IPSS (2018).
4
  For example, Lugo (2007) notes how the parameters of a measure can obscure its meaning, hampering
applications.
5
  See Foster (2024) who discusses an approach to “intentional measurement”.
6
  As we shall see below, the required properties include anonymity, scale invariance and replication invariance, as
well as the two generalizations of the transfer principle to multidimensional measures.
7
  According to Bosmans, Decancq, and Ooghe (2015, p. 95) the two-stage approach “dominates the empirical
literature”. Recent applications include Nilsson (2010), Justino (2012), Rohde and Guest (2013, 2018), and Bartels and
Stockhausen (2016).
8
  Aggregation functions are assumed to satisfy continuity, concavity, and linear homogeneity; Lorenz-consistent
measures follow the Lorenz criterion when it applies.


least one property is not assured: Kolm’s (1977) weak uniform majorization axiom, which requires the
level of multidimensional inequality not to rise when each dimensional distribution is
“smoothed” using the same bistochastic matrix. This is a fundamental axiom that generalizes the
Pigou-Dalton transfer principle to the multidimensional context. Consequently, additional
restrictions on the two components may be needed to ensure that the resulting measures are
axiomatically sound.

Our first results characterize the subset of measures satisfying the standard multidimensional
axioms, including weak uniform majorization. We show that, while any Lorenz-consistent
inequality measure can be used at the second stage, the only form of aggregation that can be used
in the first is linear. Given this specification, we show that the two-stage measures satisfy all the
basic axioms, including, perhaps surprisingly, the unfair rearrangement axiom, which ensures
that a multidimensional inequality measure is appropriately sensitive to positive association
among the variables. The Lorenz curve can also be applied in the second stage to obtain a
graphical depiction of multidimensional inequality and a dominance criterion that indicates when
all two-stage measures with the same aggregation would agree on a comparison.

The next series of results explores the link between multidimensional inequality and the
(dimension-)specific inequalities. Following Shorrocks (1978), we consider Lorenz-consistent
measures satisfying a basic convexity assumption. 9 We show that multidimensional inequality
can be expressed as a weighted average of specific inequalities minus a non-negative term
reflecting the relevant aspects of the joint distribution across dimensions. In the case of the Gini
coefficient (or the Lorenz curve) this final term is particularly intuitive: it is the extent to which
multidimensional inequality would rise if achievements were completely aligned.

To implement the measure, we provide a calibration approach based on data in an initial period
and normative policy weights. An average Lorenz curve is constructed by weighting and
summing up the specific Lorenz curves for the country in the initial period. Then, working back
through the Lorenz formula, coefficients for the linear aggregation function are extracted to
reflect the normative weights. In essence, dimensions are rendered comparable using the
measuring rod of Lorenz (or Gini) inequality. 10 Once the multidimensional inequality measure


9
    The property of constant-sum convexity is satisfied by virtually all traditional measures.
10
     This is analogous to the role of deprivations in multidimensional poverty (Alkire et al. 2015, p. 50).


has been calibrated using the initial period’s data, it can then be used to gauge the country’s
multidimensional inequality through time and, with additional assumptions, through space.

We illustrate the methodology using simulated data to allow specific inequalities, distributional
means, and correlation levels to vary freely. Examples show the impact of each factor on
multidimensional inequality for measures based on the Lorenz curve, the Gini coefficient, and
other Lorenz-consistent measures. A second illustration using data from Azerbaijan examines the
evolution of multidimensional inequality during a period of rapid income growth.

Previous studies have considered linear aggregation as one among several options, but to our
knowledge this is the first paper that selects this structure based on the axiomatic properties of its
associated measures. Likewise, a number of authors have linked multidimensional inequality to
specific inequalities or to positive association, but none has the elegant simplicity of our
decomposition, which highlights the fundamental role of Shorrocks mobility in representing a
positive association in multidimensional inequality. In addition, the approach is unique in its use
of normative weights and Lorenz curves to calibrate the measure in a base year, and then to
judge subsequent changes accordingly. Finally, the paper is unusual in its focus on identifying
multidimensional inequality measures that are especially useful in conducting policy.

Section II introduces the definitions and notation used in the paper. Section III presents different
approaches to multidimensional inequality measurement and then establishes our main
characterization result, identifying the subset of measures from Maasoumi’s (1986) two-stage
approach that are axiomatically sound. Section IV provides expressions linking members of this
class of multidimensional inequality measures to specific inequalities and Shorrocks mobility.
Our calibration method is described in Section V, selecting initial parameters of the measure
based on normative weights and Lorenz curves. Section VI illustrates our methods, while the
final section concludes.

II. Notation, Axioms, and Other Fundamentals

The data for measuring inequality are given in an array $x$ with $n$ rows and $d$ columns. The $i$th
row $x_i$ lists data for person $i = 1, \ldots, n$; the $j$th column $x_{\cdot j}$ lists data for dimension $j = 1, \ldots, d$;
and each entry $x_{ij} \ge 0$ is person $i$’s achievement in dimension $j$. The population size $n$ can vary,
but $d$ is fixed in any given application; hence, the set $X$ of possible arrays is a subset of
$\bigcup_{n=1}^{\infty} \mathbb{R}_+^{nd}$. Two alternatives for $X$ will be considered: $X = X_1$ containing arrays that are strictly
positive, and $X = X_2$ containing nonnegative arrays having at least one positive quantity in each
column. 11 A multidimensional inequality measure is a mapping $M: X \to \mathbb{R}$ associating a real
number to each array, interpreted as its level of multidimensional inequality.

When developing a measurement tool like $M$, it is important to be clear about its intended
purpose, its desired characteristics, and the axioms it should satisfy (Foster 2024). The purpose
of the measure we seek is to monitor multidimensional inequality in a country over time. A
useful list of desired characteristics (or desiderata) can be found in Szekely (2006); of special
relevance to the present paper is one that calls for a measure to be understandable and easy to
describe. 12 The list of axioms to be satisfied by a multidimensional inequality measure will be
given below.

We make use of unidimensional inequality measures in producing and understanding
multidimensional measures. Let $X^{\vdots}$ denote the set of column vectors of arbitrary length
associated with $X$. A unidimensional inequality measure is a mapping $I: X^{\vdots} \to \mathbb{R}$ associating a
real number $I(v)$ to each vector $v$ in $X^{\vdots}$ interpreted as its inequality level. The Lorenz curve
$L_v: [0,1] \to [0,1]$ associated with $v$ depicts its level of equality (and inequality), where $L_v(p)$ is
the share of the income received by the lowest $p$ share of the population. Distributions $v, v' \in X^{\vdots}$
have the same level of inequality by the Lorenz criterion if $L_{v'}(p) = L_v(p)$ for all $p$; distribution
$v'$ has no less inequality than $v$ if $L_{v'}(p) \le L_v(p)$ for all $p$, in which case we say that $v$ weakly
Lorenz dominates $v'$; and $v'$ has greater inequality than $v$ if $L_{v'}(p) \le L_v(p)$ for all $p$, and $L_{v'} \ne L_v$,
in which case we say that $v$ (strictly) Lorenz dominates $v'$. A measure $I$ is said to be Lorenz
consistent if it renders the same judgment as the Lorenz criterion when the criterion applies;
equivalently, $I$ satisfies the axioms of anonymity, scale invariance, replication invariance, and
the (Pigou-Dalton) transfer principle (Foster 1985). A measure $I$ is said to be constant-sum
convex if it is convex over slices of $X^{\vdots}$ having the same population size and the same total. 13
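
The Lorenz criterion described above can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the function names `lorenz_points`, `gini`, and `weakly_lorenz_dominates` are illustrative, the example vectors are hypothetical, and the dominance check assumes both vectors have the same length, so that comparing the piecewise-linear curves at the shared grid points k/n is sufficient. The Gini coefficient appears only as one familiar example of a Lorenz-consistent measure.

```python
from typing import List, Sequence


def lorenz_points(v: Sequence[float]) -> List[float]:
    """Cumulative shares L_v(k/n) for k = 0, ..., n, lowest entries first."""
    ordered = sorted(v)
    total = sum(ordered)
    points = [0.0]
    running = 0.0
    for value in ordered:
        running += value
        points.append(running / total)
    return points


def gini(v: Sequence[float]) -> float:
    """Gini coefficient from the rank-weighted sum of the ordered vector."""
    ordered = sorted(v)
    n = len(ordered)
    total = sum(ordered)
    rank_weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1) / n


def weakly_lorenz_dominates(v: Sequence[float], v_prime: Sequence[float],
                            tol: float = 1e-12) -> bool:
    """True if L_{v'}(p) <= L_v(p) at every grid point (equal lengths assumed)."""
    if len(v) != len(v_prime):
        raise ValueError("this sketch only compares vectors of equal length")
    return all(lp <= l + tol
               for l, lp in zip(lorenz_points(v), lorenz_points(v_prime)))


v = [1, 2, 3, 4]        # hypothetical distribution
v_prime = [0, 1, 3, 6]  # same total, more spread out
print(weakly_lorenz_dominates(v, v_prime))
print(gini(v), gini(v_prime))
```

In this example the first vector weakly Lorenz dominates the second, and the Gini coefficient ranks the two in the same direction, as Lorenz consistency requires.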


11
   In symbols, $X_1 = \bigcup_{n=1}^{\infty} \mathbb{R}_{++}^{nd}$ while $X_2 = \bigcup_{n=1}^{\infty} (\mathbb{R}_+^n \setminus 0)^d$, where $\mathbb{R}_+^n \setminus 0$ denotes the nonnegative orthant in $\mathbb{R}^n$
excluding its origin.
12
   Other desiderata include: conforming to a common-sense notion of what is being measured; fitting the stated
purpose; being technically solid; being operationally viable; and being easily replicable. See also Alkire et al. (2015)
and Foster (2024). Such “proto-axioms” are less precise than formal axioms but help ensure a measure is fit for
purpose.
13
   See Kolm (1976, p. 93). Like decomposability, this property constrains the cardinal values of a measure.


Applying $I$ to column $x_{\cdot j}$ in $x$ yields the inequality level $I(x_{\cdot j})$ for dimension $j = 1, \ldots, d$, which
will be termed its specific inequality level; analogously $L_{x_{\cdot j}}$ (or $L_j$ when $x$ is understood)
graphically depicts its specific inequality level.

Let $v$ be an element of $X^{\vdots}$. We denote the mean of $v$ by $\mu(v) = \frac{1}{n} \sum_{i=1}^{n} v_i$ or, more succinctly, by
$\mu_j$ if $v = x_{\cdot j}$. The ordered vector $\hat{v}$ of $v$ is the permutation of $v$ for which $\hat{v}_1 \le \hat{v}_2 \le \cdots \le \hat{v}_n$.
Given array $x \in X$, the completely aligned array $\bar{\bar{x}}$ of $x$ is the array whose columns are $\bar{\bar{x}}_{\cdot j} = \hat{x}_{\cdot j}$;
it takes the same dimensional achievements and reorders them so that person 1 has the lowest
entry for each dimension, person 2 has the next lowest, and so forth. An array is said to be
aligned if it has the same rows as its completely aligned version, but potentially in a different
order.
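
The construction of the completely aligned array can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the names `completely_aligned` and `is_aligned` and the example array are hypothetical, and arrays are represented as lists of rows (persons) with one entry per dimension.

```python
from typing import List

Array = List[List[float]]  # rows are persons, columns are dimensions


def completely_aligned(x: Array) -> Array:
    """Sort every column independently in ascending order, then rebuild the rows."""
    sorted_columns = [sorted(column) for column in zip(*x)]
    return [list(row) for row in zip(*sorted_columns)]


def is_aligned(x: Array) -> bool:
    """True if x has the same rows as its completely aligned version, in any order."""
    return sorted(x) == sorted(completely_aligned(x))


x = [[3, 10], [1, 30], [2, 20]]  # hypothetical array with mixed achievements
print(completely_aligned(x))
print(is_aligned(x), is_aligned([[2, 20], [1, 10], [3, 30]]))
```

After alignment, person 1 holds the lowest entry of each column, person 2 the next lowest, and so on; the second array in the usage line is aligned because it differs from its completely aligned version only by the order of its rows.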

Some measurement approaches use a function to tally up the dimensional achievements of a
given person. Let $X^{\cdots}$ denote the set of row vectors of length $d$ associated with $X$. An
aggregation function is a mapping $h: X^{\cdots} \to \mathbb{R}$ associating a real number to each vector in $X^{\cdots}$;
applying $h$ to $x_i$ in $x$ yields person $i$’s aggregate level $h(x_i)$. Given vectors $u, u' \in X^{\cdots}$, we define
the vector dominance relations as follows: $u \gg u'$ iff $u_j > u'_j$ for all $j$; $u \ge u'$ iff $u_j \ge u'_j$ for all
$j$; and $u > u'$ iff $u \ge u'$ and not $u' \ge u$. The rows of an aligned array can be ranked by vector
dominance; and in the completely aligned array the later rows vector dominate earlier rows, so
that $\bar{\bar{x}}_{i'} \ge \bar{\bar{x}}_i$ for $i' > i$.

We consider two categories of axioms for multidimensional inequality measures – invariance
axioms and dominance axioms. Invariance axioms specify the transformations of an array that
leave the measure unchanged. They include anonymity, scale invariance, and replication
invariance and are entirely analogous to the unidimensional versions. Dominance axioms specify
transformations that cause multidimensional inequality to move in a certain direction. They
include the two multidimensional generalizations of the Pigou-Dalton transfer principle
associated with Kolm (1977) and Atkinson and Bourguignon (1982), respectively.

Intuitively, Kolm’s (1977) weak uniform majorization axiom requires multidimensional
inequality not to increase when each dimension is “smoothed” in the same way. More precisely,
we say that $x'$ is obtained from $x$ by a uniform smoothing if $x' = Bx$ for some bistochastic
matrix $B$. 14 Under a uniform smoothing, person $i$’s vector in $x'$ is a weighted average of all
persons’ initial vectors in $x$, where the weights are the elements of the $i$th row of $B$. This creates a
new array whose columns weakly Lorenz dominate the respective columns of the original array.
The first dominance axiom for $M$ is given by the following.

Weak Uniform Majorization Axiom. If $x'$ is obtained from $x$ by a uniform smoothing, then
$M(x') \le M(x)$.

This axiom specifies that multidimensional inequality should not increase as a result of uniform
smoothing.
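
A uniform smoothing can be illustrated with a small numerical sketch. The code below is an assumption-laden illustration, not the paper's implementation: the helper names, the bistochastic matrix `b`, and the array `x` are all hypothetical, and the Gini coefficient stands in for an arbitrary Lorenz-consistent measure applied to each column.

```python
from typing import List, Sequence

Array = List[List[float]]  # rows are persons, columns are dimensions


def is_bistochastic(b: Array, tol: float = 1e-12) -> bool:
    """Square, nonnegative, and every row and column sums to 1."""
    n = len(b)
    if any(len(row) != n for row in b):
        return False
    if any(entry < 0 for row in b for entry in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in b)
    cols_ok = all(abs(sum(col) - 1.0) <= tol for col in zip(*b))
    return rows_ok and cols_ok


def uniform_smoothing(b: Array, x: Array) -> Array:
    """Return x' = B x: row i of x' averages the rows of x with weights b[i]."""
    columns = list(zip(*x))
    return [[sum(w * value for w, value in zip(weights, column)) for column in columns]
            for weights in b]


def gini(v: Sequence[float]) -> float:
    """Gini coefficient from the rank-weighted sum of the ordered vector."""
    ordered = sorted(v)
    n = len(ordered)
    total = sum(ordered)
    rank_weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1) / n


b = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]  # averages persons 1 and 2
x = [[1, 10], [5, 30], [3, 50]]                          # hypothetical array
x_prime = uniform_smoothing(b, x)
print(x_prime)
for before, after in zip(zip(*x), zip(*x_prime)):
    print(gini(before), ">=", gini(after))
```

Because the same matrix is applied to every column, each column of the smoothed array weakly Lorenz dominates the corresponding original column, so no specific Gini rises in this example.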

When might we expect multidimensional inequality to strictly fall as a result of a uniform
smoothing? When will it stay the same? We say that $x'$ is obtained from $x$ by a permutation if
$x' = \Pi x$ for some permutation matrix $\Pi$. 15 Note that a given uniform smoothing could also be a
permutation if, say, $B$ were itself a permutation matrix or if the rows averaged by $B$ happened to
be identical. Since anonymous multidimensional inequality measures are unchanged by a
permutation (and anonymity is typically assumed), the strict version of the axiom accounts for
this possibility.

Uniform Majorization Axiom. If $x'$ is obtained from $x$ by a uniform smoothing, then $M(x') \le
M(x)$; if, in addition, $x'$ is not obtained from $x$ by a permutation, then $M(x') < M(x)$.

Many multidimensional measures have two stages: the first employs an aggregation function
$h(x_i) = s_i$ and a second applies some symmetric function to the aggregate distribution $s =
(s_1, \ldots, s_n)$. 16 It could be argued that when dealing with this form of measure, the strict form of
the uniform majorization axiom needs to be modified further. To be sure, if $x'$ is a permutation
of $x$, then $s'$ will be a permutation of $s$. However, the converse need not be true. And if it were
not, then the axiom would require $M(x') < M(x)$ (since $x'$ is not a permutation of $x$) at the same
time that anonymity would be requiring $M(x') = M(x)$ (since $s'$ is a permutation of $s$). The
following modification accounts for this issue.



14
   A bistochastic matrix is a square nonnegative matrix whose rows and columns sum to 1.
15
   A permutation matrix is a square matrix containing 0’s and 1’s whose rows and columns sum to 1.
16
   Examples of papers that aggregate rows first include Tsui (1995, 1999), Bourguignon (1999), Diez et al. (2007),
Decancq and Lugo (2012), and Seth (2013).


Limited Uniform Majorization Axiom. If $x'$ is obtained from $x$ by a uniform smoothing, then
$M(x') \le M(x)$; if, in addition, the aggregate distribution $s'$ is not obtained from $s$ by a permutation, then $M(x') < M(x)$.

This revised axiom only requires strict inequality to hold when the aggregate vectors are not
permutations of one another. It should be noted that this axiom is tailor-made for two-stage
multidimensional inequality measures, and it is a joint restriction on $I$ and $h$. 17

The second type of dominance axiom, based on Atkinson and Bourguignon (1982), takes into
account the association among variables. 18 We say that $x'$ is obtained from $x$ by an unfair
rearrangement if $x' = \bar{\bar{x}}$, where $\bar{\bar{x}}$ is the completely aligned version of $x$. An unfair
rearrangement reassigns achievement levels to people so that person 1 has the lowest level in
each dimension, person 2 has the next lowest levels, and so forth, thereby maximizing positive
association among variables. The following is a second multidimensional generalization of the
transfer axiom.

Weak Unfair Rearrangement Axiom. If $x'$ is obtained from $x$ by an unfair rearrangement, then
$M(x') \ge M(x)$.

According to this axiom, reallocating achievements so as to maximize positive association
should not decrease multidimensional inequality.

Note that as before, this transformation may yield an array that is a permutation of the original
array, as would happen if $x$ had the same rows as $\bar{\bar{x}}$, but in a different order across people. The
strict version of this axiom accounts for this possibility.

Unfair Rearrangement Axiom. If $x'$ is obtained from $x$ by an unfair rearrangement, then
$M(x') \ge M(x)$; if, in addition, $x'$ is not obtained from $x$ by a permutation, then $M(x') > M(x)$.

This axiom goes beyond the weaker version by requiring multidimensional inequality to strictly
increase when the unfair rearrangement is not a permutation of the original array.
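
To make the unfair rearrangement axiom concrete, the sketch below applies a two-stage measure to an array before and after complete alignment. It is a hedged illustration under stated assumptions: aggregation is linear with hypothetical coefficients `a`, the second stage uses the Gini coefficient as one example of a Lorenz-consistent measure, and the function names and data are invented for the example rather than taken from the paper.

```python
from typing import List, Sequence

Array = List[List[float]]  # rows are persons, columns are dimensions


def gini(v: Sequence[float]) -> float:
    """Gini coefficient from the rank-weighted sum of the ordered vector."""
    ordered = sorted(v)
    n = len(ordered)
    total = sum(ordered)
    rank_weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1) / n


def completely_aligned(x: Array) -> Array:
    """Sort every column independently in ascending order, then rebuild the rows."""
    sorted_columns = [sorted(column) for column in zip(*x)]
    return [list(row) for row in zip(*sorted_columns)]


def linear_aggregate(x: Array, a: Sequence[float]) -> List[float]:
    """First stage (assumed linear): s_i = a_1 * x_i1 + ... + a_d * x_id."""
    return [sum(a_j * x_ij for a_j, x_ij in zip(a, row)) for row in x]


def two_stage_gini(x: Array, a: Sequence[float]) -> float:
    """Second stage: apply the Gini coefficient to the aggregate distribution."""
    return gini(linear_aggregate(x, a))


a = [1.0, 1.0]                   # hypothetical aggregation coefficients
x = [[1, 30], [2, 20], [3, 10]]  # achievements are negatively associated
x_aligned = completely_aligned(x)
print(two_stage_gini(x, a), two_stage_gini(x_aligned, a))
```

In this example the unfair rearrangement is not a permutation of the rows, and the two-stage value is strictly higher for the completely aligned array, in line with the strict version of the axiom.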

III. Multidimensional Inequality Measures




17
     Thus, its use is limited to the class of two-stage measures.
18
     Our presentation follows Dardanoni (1995). For other versions of the axiom see, for example, Tsui (1995).


We now present several intuitive approaches to measuring multidimensional inequality and the
properties they satisfy. A dashboard $D(x) = (I(x_{\cdot 1}), \ldots, I(x_{\cdot d}))$ is a vector of specific
inequalities, which can be interpreted as a multidimensional inequality measure (or rather a
quasiordering on $X$) when used with vector dominance. 19 Multidimensional inequality is then
judged to be higher when one specific inequality level is higher, and the rest are no lower. So
long as $I$ is Lorenz-consistent, $D$ satisfies all but one of the general axioms required of a
multidimensional inequality measure. 20 The unfair rearrangement axiom fails, since $D$ ignores
information on positive association and considers $\bar{\bar{x}}$ and $x$ to be identical. Of course, the practical
utility of $D$ is also hampered by its inability to make comparisons when one specific inequality
rises and another falls. 21

A simple way of moving from a dashboard to a multidimensional inequality measure is to take a
weighted average of specific inequalities using positive weights $w_1, \ldots, w_d$ that sum to 1,
resulting in an average specific inequality measure

         $A(x) = w_1 I(x_{\cdot 1}) + \cdots + w_d I(x_{\cdot d})$                                          (1)

Gajdos and Weymark (2005, p. 489), for example, use the Gini coefficient and weights ������������������������ =
������������������������ / ∑������������ ������������������������ to obtain a measure of this form, which they contrast to Koshevoy and Mosler (1997)
who consider fixed weights. Whether weights depend on means or are fixed, ������������ satisfies the same
list of axioms as ������������ when ������������ is Lorenz-consistent. 22 Unlike a dashboard, an average specific
inequality measure can make comparisons between any two arrays, but it also ignores the
association between dimensions. In particular, it views ������������̿ and ������������ as identical and violates the
unfair rearrangement axiom.
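
As a minimal sketch of the two constructions above, the following Python snippet computes a dashboard and a mean-weighted average specific inequality for a small array, using the Gini coefficient as the specific measure. The function names (`gini`, `dashboard`, `average_specific`) and the 2 x 2 array are illustrative assumptions, not taken from the paper. The completely aligned array is obtained here by sorting each column; since this only reorders people within dimensions, every specific inequality, and hence both constructions, is unchanged.

```python
import numpy as np

def gini(v):
    """Gini coefficient via the rank formula; assumes nonnegative entries with positive sum."""
    v = np.sort(np.asarray(v, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1.0) / n)

def dashboard(X):
    """Vector of specific (column-by-column) Gini coefficients."""
    return np.array([gini(X[:, j]) for j in range(X.shape[1])])

def average_specific(X, weights=None):
    """Weighted average of specific Ginis; defaults to mean-share weights."""
    mu = X.mean(axis=0)
    w = mu / mu.sum() if weights is None else np.asarray(weights, dtype=float)
    return float(w @ dashboard(X))

# Illustrative array: rows are persons, columns are dimensions.
X = np.array([[2.0, 1.0],
              [1.0, 2.0]])
X_aligned = np.sort(X, axis=0)  # completely aligned version: each column sorted

print(dashboard(X), dashboard(X_aligned))
print(average_specific(X), average_specific(X_aligned))
```

Both printed pairs coincide, which is the sense in which neither construction registers the change in association between dimensions.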



19
   Dashboards can also be populated by distinct inequality measures or applied to unrelated sample populations. The
above case fits best in the present context. A quasiordering is a reflexive and transitive relation that is not
necessarily complete (Sen 1997, Ch. 3).
20
   In particular, dashboard inequality levels are unchanged by permutations, by a scalar multiple, by a population
replication, and by an unfair rearrangement; they do not rise, and can fall, as a result of a uniform smoothing.
Consequently, the quasiordering generated by $D$ satisfies the three invariance axioms, the uniform majorization
axiom, and the weak unfair rearrangement axiom.
21
   Even when $D$ can compare two arrays, the comparison might go against judgments that take into account
information on dimensional means (and dependence); the conclusions rendered by dashboards are partial and not
unambiguous.
22
   In particular, $A$ satisfies the three invariance axioms, the weak unfair rearrangement axiom, and the uniform
majorization axiom, where the latter property holds since a uniform smoothing leaves weights unchanged.


Maasoumi (1986) constructs a multidimensional inequality measure that reverses the order of
aggregation by first combining a person’s dimensional achievements into a single aggregate
indicator $s_i = h(x_i)$ and then applying a unidimensional inequality measure $I$ to the aggregate
distribution $s = (s_1, \dots, s_n)' \in \mathcal{X}^{\vdots}$. The resulting two-stage measure $M(X) = I(s)$ is intuitive in
structure, with components $I$ and $h$ that can be readily understood and applied. 23 However,
questions about the axiomatic suitability of the approach have been raised. Dardanoni (1995)
shows how a two-stage measure can violate the weak uniform majorization axiom, which leads
him to critique the axiom; Weymark (2006) reinterprets this finding as a critique of Maasoumi’s
two-stage approach.

Bosmans et al. (2015) provide a novel justification of two-stage measures in the context of
normative inequality measurement, which views multidimensional inequality as the welfare loss
from falling short of an optimal allocation. 24 They divide each normative multidimensional
inequality measure into two distinct terms, one that evaluates inefficiency and another that
evaluates inequity, and show that the latter term is, in fact, a two-stage inequality measure. The
authors conclude: “If one would insist that inequality measures should be concerned with
inequity alone, and not with inefficiency, then we arrive at the striking conclusion that the
normative approach itself pushes two-stage measures to the forefront.” This is a remarkable
observation, which sheds light on the structure of normative multidimensional inequality
measures as well as the suitability of the two-stage class. Note, though, that it justifies the two-
stage measures not as independent multidimensional inequality measures but as useful “partial”
indices that focus on one aspect of multidimensional inequality. 25

The broader suitability of the two-stage measures depends on the axioms they satisfy, which in
turn depends on the range of components $h$ and $I$ being considered. For the first component, we
consider all aggregation functions $h: \mathcal{X}^{\cdots} \to \mathbb{R}$ that are continuous, concave, linearly homogeneous,
and strictly increasing (as in Bosmans et al., 2015), and denote the resulting set by $\mathcal{H}$. For the
second, we consider all Lorenz-consistent unidimensional inequality measures $I: \mathcal{X}^{\vdots} \to \mathbb{R}$, and


23
   While many, including Maasoumi (1986) and Bosmans et al. (2015), interpret the aggregation function as utility,
here we are “making no use of information on individual relative valuations” of dimensional variables, and instead
are treating the function as “a subject for social decision” (Atkinson and Bourguignon 1982 p. 190).
24
   See also Kolm (1977), Tsui (1995), and Weymark (2006) for traditional derivations of a (relative) normative
multidimensional inequality measure from a welfare function.
25
   On partial indices, see Foster and Sen (1997, p. 168-9).


denote the set by $\mathcal{I}$. The object of study is $\mathcal{M}$, the set of two-stage measures $M: \mathcal{X} \to \mathbb{R}$ with
components $h \in \mathcal{H}$ and $I \in \mathcal{I}$. Which axioms are satisfied by these measures? Can we identify a
subclass of $\mathcal{M}$ that is both intuitive and axiomatically sound?

The properties defining ℋ and ℐ ensure that every measure in ℳ satisfies the axioms of
anonymity, scale invariance, and replication invariance. 26 Our first result takes up the weak
uniform majorization axiom.

Theorem 1. Let $M$ be a two-stage measure with components $h \in \mathcal{H}$ and $I \in \mathcal{I}$. If $M$ satisfies weak
uniform majorization, then there exists $c = (c_1, \dots, c_d) \gg 0$ such that $h(x) = c_1 x_1 + \cdots + c_d x_d$
for all $x \in \mathcal{X}^{\cdots}$.

Proof. See the Appendix.

The proof draws on Dardanoni (1995) and begins by showing that any convex combination of
allocations in $\mathcal{X}^{\cdots}$ that share a common value under $h$ also takes that value under $h$. Applying this
to a certain set of allocations yields a simplex over which $h$ is linear, while the remaining
argument expands the characterization to all of $\mathcal{X}^{\cdots}$. Theorem 1 identifies the two-stage
measures that are consistent with weak uniform majorization; the remaining measures in $\mathcal{M}$
violate this basic axiom and hence are not axiomatically sound. 27
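
The kind of violation that Theorem 1 rules out can be seen in a small numerical sketch. The array, the bistochastic matrix, and the function names below are illustrative assumptions chosen here (in the spirit of Dardanoni's critique, not reproduced from any source): a geometric-mean aggregator, which is continuous, concave, linearly homogeneous and strictly increasing on positive data, is compared with a linear aggregator when persons 2 and 3 are averaged.

```python
import numpy as np

def gini(v):
    """Gini coefficient via the rank formula; assumes nonnegative entries with positive sum."""
    v = np.sort(np.asarray(v, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1.0) / n)

def geometric_mean(X):
    """Nonlinear aggregator h(x) = sqrt(x1 * x2), applied row by row."""
    return np.sqrt(X[:, 0] * X[:, 1])

def linear_sum(X):
    """Linear aggregator with coefficients c = (1, 1)."""
    return X @ np.array([1.0, 1.0])

# Illustrative array: three persons, two dimensions.
X = np.array([[10.0, 10.0],
              [10.0, 90.0],
              [90.0, 10.0]])
# Bistochastic matrix that averages persons 2 and 3 in every dimension.
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.5, 0.5]])
Y = B @ X  # uniformly smoothed array

gini_geo_before, gini_geo_after = gini(geometric_mean(X)), gini(geometric_mean(Y))
gini_lin_before, gini_lin_after = gini(linear_sum(X)), gini(linear_sum(Y))
print(gini_geo_before, gini_geo_after)  # rises after smoothing
print(gini_lin_before, gini_lin_after)  # unchanged after smoothing
```

Under the geometric mean the aggregate vector moves from (10, 30, 30) to (10, 50, 50), so measured inequality rises after a uniform smoothing; under the linear aggregator the aggregate vector is (20, 100, 100) in both cases.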

Let $\mathcal{L}$ be the subclass of $\mathcal{M}$ whose aggregation functions are linear with $c \gg 0$. The next result
describes the axioms satisfied by the measures in $\mathcal{L}$.

Theorem 2. Any two-stage measure $M \in \mathcal{L}$ satisfies the anonymity, scale invariance, replication
invariance, limited uniform majorization, and unfair rearrangement axioms.

Proof. See the Appendix.

The proof shows how each invariance property for $M$ follows immediately from the analogous
property for $I$. For the limited uniform majorization axiom, when a bistochastic matrix is applied
to array $X$, the new aggregate vector can be found by applying the same bistochastic matrix to the



26
  See Theorem 2, below.
27
  This includes two-stage measures using other CES-type aggregation functions suggested by Maasoumi (1986) and
used in empirical applications.



aggregate vector $s$ of $X$, due to the linearity of $h$. Consequently, both the weak and strict parts of
the axiom follow directly from the transfer axiom and anonymity of $I$. As for the unfair
rearrangement axiom, the linear structure of $h$ might lead one to think that the resulting measure
would not be sensitive to the joint distribution and hence that $M(\bar{\bar{X}}) = M(X)$. Yet the proof
shows that $M(\bar{\bar{X}}) > M(X)$ follows immediately from the Lorenz consistency of $I$ whenever $\bar{\bar{X}}$ is
not a permutation of $X$.

The intuition can be seen in an example with $n = d = 2$, where $c = (2,3)$ are the coefficients in
$h$. Suppose, initially, person 1 has $x_1 = (2,1)$ while person 2 has $x_2 = (1,2)$, so that the initial
aggregate distribution is $s = (7, 8)'$. As a result of the unfair rearrangement, we obtain $\bar{\bar{x}}_1 = (1,1)$
and $\bar{\bar{x}}_2 = (2,2)$ and hence $\bar{\bar{s}} = (5, 10)'$. In other words, the unfair rearrangement of $X$ translates into
a regressive transfer of two units from the poorer to the richer person in $s$, and hence, strictly more
inequality according to the Lorenz consistency of $I$.
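
The arithmetic of this example can be checked with a short Python sketch. The Gini coefficient is used here only as one convenient Lorenz-consistent measure, and the names `gini`, `X_aligned`, `s`, and `s_aligned` are illustrative assumptions.

```python
import numpy as np

def gini(v):
    """Gini coefficient via the rank formula; assumes nonnegative entries with positive sum."""
    v = np.sort(np.asarray(v, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1.0) / n)

c = np.array([2.0, 3.0])            # coefficients of the linear aggregator
X = np.array([[2.0, 1.0],           # person 1
              [1.0, 2.0]])          # person 2
X_aligned = np.sort(X, axis=0)      # unfair rearrangement: columns aligned

s = X @ c                           # aggregate vector before rearrangement
s_aligned = X_aligned @ c           # aggregate vector after rearrangement

print(s, s_aligned)
print(gini(s), gini(s_aligned))
```

The total of the aggregate vector is 15 in both cases, so the move from (7, 8) to (5, 10) is a pure regressive transfer, and the Gini coefficient of the aggregate rises accordingly.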

These results are summarized in the following corollary.

Corollary. A two-stage measure $M \in \mathcal{M}$ satisfies anonymity, scale invariance, replication
invariance, limited uniform majorization, and unfair rearrangement if and only if $M \in \mathcal{L}$.

Each measure in $\mathcal{L}$ is determined by a vector $c \gg 0$ of coefficients and a unidimensional measure
$I$. An approach to selecting $c$ is given in Section V below. The choice of $I$ can be guided by a
large literature on unidimensional inequality measures. The Lorenz curve, which plays a central
role in that literature, also applies directly to the present environment for fixed $c$. First, it
provides a useful graphical depiction of the inequality in $X$ as given by the aggregate Lorenz
curve $L_X$, or the Lorenz curve applied to the aggregate vector $s$ associated with $X$. Second, the
resulting weak and strict Lorenz criteria can be used to rank arrays. For example, $X'$ has strictly
more multidimensional inequality than $X$ whenever

         $L_{X'}(p) \le L_X(p)$ for all $p \in [0,1]$, with strict inequality for some $p$.   (2)




Indeed, when (2) holds, it follows that every $M \in \mathcal{L}$ with the same $c$ would agree that
$M(X') > M(X)$. 28

Given the purpose of the measure, and the desired characteristics initially posited for it, the
simple, linear form of the aggregation function can be viewed as an advantage. 29 It is clearly
“neutral” in the ALEP sense discussed in Kannai (1980), so that dimensional variables are
neither complements nor substitutes. At the same time, the choice of $c$ offers substantial scope
for incorporating normative and practical considerations, as we shall see below. Linear
aggregators often appear in empirical applications and theoretical discussions as part of a
parametric family.

Following Maasoumi (1986), ℎ typically has been interpreted as a utility function whose
functional form often follows traditional examples from familiar classes. The empirical or
normative bases for selecting from among the possibilities have been limited, and in any given
application, several different choices for $h$ are usually applied without identifying one, say, as a
headline indicator for policy analysis. In contrast, the present paper does not adopt a utility
interpretation; instead, analogous to Atkinson (1970) or Atkinson and Bourguignon (1982), it
simply views ℎ as a function employed in social evaluation. Its normative content will originate
in public policy discussions of the relative importance of specific inequalities rather than from
individual preference.

Finally, we should note that the results differ slightly depending on which domain is being
assumed. Recall that the domain $\mathcal{X}$ can either be $\mathcal{X}_1$ containing positive arrays or $\mathcal{X}_2$ containing
nonnegative arrays that have at least one positive entry per column. Domain $\mathcal{X}_1$ allows a broader
range of components, including those not defined for vectors with zero entries (such as many
generalized entropy measures and weighted general means), but consequently yields
multidimensional measures that can be used only with positive data. Domain $\mathcal{X}_2$ limits
consideration to a narrower range of components defined for zero values, but then yields
measures that apply more broadly.


28
   This follows directly from the Lorenz consistency of $I \in \mathcal{I}$. Arguments entirely analogous to the proof of Theorem
2 show that for any given $c \gg 0$, the inequality quasiordering associated with the Lorenz criterion satisfies the
multidimensional axioms of anonymity, scale invariance, replication invariance, limited uniform majorization and
unfair rearrangement.
29
   See the discussion in Alkire and Foster (2011, p. 486).


IV. Multidimensional from Specific Inequalities

As noted above, a key desirable characteristic of an inequality measure is for it to be easily
understood and communicated to others. For multidimensional measures, this could be facilitated
by a clear link with specific inequality levels. In this section we show that such a link exists for
many two-stage measures in ℒ and for the aggregate Lorenz curve.

The key intuition is found in Shorrocks (1978), who considers the impact of the accounting
period on income inequality and measures mobility as the extent to which inequality falls as the
accounting period is extended. Consideration is restricted to measures $I \in \mathcal{I}$ that are constant-sum
convex, which includes the Gini coefficient, the generalized entropy measures, and Atkinson’s
family, among others. 30 Where $y_{\cdot 1}, \dots, y_{\cdot d}$ are (column) vectors listing the incomes of $n$ persons
over $d$ periods, Shorrocks compares the inequality in the total income $I(y_{\cdot 1} + \cdots + y_{\cdot d})$ to the
income-share weighted average of the per period income inequalities $w_1 I(y_{\cdot 1}) + \cdots + w_d I(y_{\cdot d})$,
where $w_j = \mu(y_{\cdot j}) / \sum_k \mu(y_{\cdot k})$ for $j = 1, \dots, d$. He notes that the former never exceeds the latter, while the
two coincide when the $y_{\cdot j}$ are scalar multiples of each other. Mobility can be defined as

          $B = w_1 I(y_{\cdot 1}) + \cdots + w_d I(y_{\cdot d}) - I(y_{\cdot 1} + \cdots + y_{\cdot d}) \ge 0$                   (3)

or the extent to which total income inequality falls below the average per period income
inequality due to the “smoothing” effect of aggregation across time periods.

The same logic can be applied in the multidimensional context where the smoothing now occurs
across dimensions. Pick any array $X \in \mathcal{X}$. Let $\mathcal{L}'$ denote the set of all measures in $\mathcal{L}$ with
inequality components that are constant-sum convex and select a measure $M \in \mathcal{L}'$ with associated
components $I$ and $c$. Substituting $c_j x_{\cdot j}$ for $y_{\cdot j}$ in equation (3) converts the final term into
$I(c_1 x_{\cdot 1} + \cdots + c_d x_{\cdot d}) = I(s) = M(X)$, or the multidimensional inequality in array $X$, while the
first terms become the average specific inequality $A(X)$ as defined in (1) using weights

          $w_j = \frac{c_j \mu_j}{c_1 \mu_1 + \cdots + c_d \mu_d}$   for $j = 1, \dots, d$                                               (4)


We have the following result.


30
  His formal results apply to constant-sum strictly convex measures; implications of the weaker convexity property
are discussed informally (Shorrocks 1978, p. 382).


Theorem 3. Select any $M \in \mathcal{L}'$ and define $A(X)$ using (1) and (4). Then for any $X \in \mathcal{X}$ we have
$B(X) = A(X) - M(X) \ge 0$, with $B(X) = 0$ if the normalized vectors $x_{\cdot j}/\mu_j$ of $X$ are identical
for $j = 1, \dots, d$.

Proof. See the Appendix.

The theorem shows that the average specific inequality level $A(X)$ is generally larger, and
certainly no smaller, than the multidimensional inequality level $M(X)$, with their difference being
the Shorrocks mobility measure $B(X)$, assessed here across dimensions rather than through time.
In the special case where the column vectors of $X$ are multiples of one another (hence ordered in
the same way and with identical shapes), the aggregate vector $s$ will also be a multiple, so that
$A(X) = M(X)$ and hence $B(X) = 0$.

Given any $M$ in $\mathcal{L}'$ the mobility term can be usefully broken down into two independent terms
that respectively reflect the association among dimensional distributions and their relative
shapes. The rearrangement term $R(X) = M(\bar{\bar{X}}) - M(X)$ measures the pure effect of positive
association on multidimensional inequality as one moves from $X$ to the completely aligned
version $\bar{\bar{X}}$. By Theorem 2 and the unfair rearrangement axiom we know that $R(X) \ge 0$. The
structural term $S(X) = A(\bar{\bar{X}}) - M(\bar{\bar{X}})$ measures the mobility associated with the completely
aligned array $\bar{\bar{X}}$. It is always nonnegative by Theorem 3 given weights $w_1, \dots, w_d$ from (4). In
addition, the anonymity of $I$ ensures that $A(X) = A(\bar{\bar{X}})$. With these observations,
$A(X) - M(X) = R(X) + S(X)$, and the next result follows immediately from Theorem 3.

Theorem 4. Select any $M \in \mathcal{L}'$ and define $w_1, \dots, w_d$ using (4). Then for any $X \in \mathcal{X}$ we have

           $M(X) = w_1 I(x_{\cdot 1}) + \cdots + w_d I(x_{\cdot d}) - R(X) - S(X)$                    (5)

where $R(X), S(X) \ge 0$.

Equation (5) provides a general expression linking multidimensional to specific inequalities, with
the nonnegative terms $R(X)$ and $S(X)$ accounting for the extent to which $M(X)$ falls below $A(X)$.
If $\bar{\bar{X}}$ is not a permutation of $X$, we know that $R(X) > 0$ by the unfair rearrangement axiom, in
which case the rearrangement term is impacting measured inequality; if the columns of $\bar{\bar{X}}$ are not
scalar multiples of each other, and inequality measure $I$ is sensitive to their different shapes, then




$S(X) > 0$ by Theorem 3 and so the structural term is impacting measured inequality. 31 As an
array $X$ evolves over time, changes in multidimensional inequality can be viewed in terms of four
factors: (i) changes in the specific inequality levels, (ii) changes in the weights on the specific
inequalities through dimensional means, (iii) changes in positive association across dimensions,
and (iv) changes in the shapes of the dimensional distributions.
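
The decomposition can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the specific measure is the generalized entropy measure with parameter 2 (half the squared coefficient of variation), chosen here because it is constant-sum convex; the 3 x 2 array, the coefficients, and all function names are hypothetical.

```python
import numpy as np

def half_cv2(v):
    """Generalized entropy measure with parameter 2: half the squared coefficient of variation."""
    v = np.asarray(v, dtype=float)
    return float(0.5 * v.var() / v.mean() ** 2)

def two_stage(X, c, I):
    """Two-stage measure: aggregate each row linearly, then apply I."""
    return I(X @ c)

def average_specific(X, c, I):
    """Average specific inequality with weights c_j * mu_j / sum_k c_k * mu_k."""
    mu = X.mean(axis=0)
    w = c * mu / np.sum(c * mu)
    return float(sum(w[j] * I(X[:, j]) for j in range(X.shape[1])))

# Illustrative data: rows are persons, columns are dimensions.
X = np.array([[1.0, 4.0],
              [2.0, 1.0],
              [6.0, 3.0]])
c = np.array([1.0, 0.5])
X_aligned = np.sort(X, axis=0)  # completely aligned version

M = two_stage(X, c, half_cv2)
A = average_specific(X, c, half_cv2)
R = two_stage(X_aligned, c, half_cv2) - M                      # rearrangement term
S = average_specific(X_aligned, c, half_cv2) - two_stage(X_aligned, c, half_cv2)  # structural term

print(M, A, R, S)
print(abs(M - (A - R - S)))  # numerically zero
```

In this example both the rearrangement term and the structural term are strictly positive, since the columns are neither aligned nor scalar multiples of one another.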

The Gini coefficient $G$ is the most common Lorenz-consistent measure, due in part to its
intuitive interpretations and its clear link to the Lorenz curve. When applied to ordered vectors,
the Gini becomes a linear function, which simplifies its associated two-stage multidimensional
measure and expression (5) above. Let $M_G \in \mathcal{L}'$ denote the two-stage measure based on the Gini
coefficient $G$, and let $R_G(X)$ be its associated rearrangement term. We have the following
expression for $M_G(X)$.

Theorem 5. Consider $M_G \in \mathcal{L}'$ and define $w_1, \dots, w_d$ using (4). Then for any $X \in \mathcal{X}$, we have

           $M_G(X) = w_1 G(x_{\cdot 1}) + \cdots + w_d G(x_{\cdot d}) - R_G(X)$                                       (6)

where $R_G(X) \ge 0$.

Proof. See the Appendix.

The key step of the proof uses the linear structure of $G$ over ordered vectors to show that
$S_G(X) = A_G(\bar{\bar{X}}) - M_G(\bar{\bar{X}}) = 0$, eliminating the structural term from (5). 32 Theorem 5 offers a
remarkably straightforward expression for multidimensional inequality when $G$ is used: $M_G(X)$ is
the average specific Gini minus a term $R_G(X) = M_G(\bar{\bar{X}}) - M_G(X)$ that is positive for any $X$ that is
not a permutation of $\bar{\bar{X}}$, but falls to zero as positive association in $X$ rises towards its maximum
level in $\bar{\bar{X}}$.
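
A quick numerical check of the Gini case is sketched below. The data, coefficients, and function names are illustrative assumptions; the point is only that the structural term vanishes for the Gini coefficient, leaving the average specific Gini minus the rearrangement term.

```python
import numpy as np

def gini(v):
    """Gini coefficient via the rank formula; assumes nonnegative entries with positive sum."""
    v = np.sort(np.asarray(v, dtype=float))
    n = v.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1.0) / n)

def average_gini(X, c):
    """Average specific Gini with weights c_j * mu_j / sum_k c_k * mu_k."""
    mu = X.mean(axis=0)
    w = c * mu / np.sum(c * mu)
    return float(sum(w[j] * gini(X[:, j]) for j in range(X.shape[1])))

# Illustrative data: rows are persons, columns are dimensions.
X = np.array([[1.0, 4.0],
              [2.0, 1.0],
              [6.0, 3.0]])
c = np.array([1.0, 0.5])
X_aligned = np.sort(X, axis=0)  # completely aligned version

M_G = gini(X @ c)                                        # two-stage Gini measure
A_G = average_gini(X, c)                                 # average specific Gini
R_G = gini(X_aligned @ c) - M_G                          # rearrangement term
S_G = average_gini(X_aligned, c) - gini(X_aligned @ c)   # structural term

print(M_G, A_G, R_G, S_G)
```

Here the structural term is zero up to floating-point error, so the two-stage Gini equals the average specific Gini minus the rearrangement term.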

Suppose that a linear $h \in \mathcal{H}$ with $c \gg 0$ has been selected in the first stage. Rather than choosing
a particular $I \in \mathcal{I}$ in the second stage, the Lorenz curve can be used to depict multidimensional
inequality and make comparisons. Given $X \in \mathcal{X}$ with aggregate vector $s$, the aggregate Lorenz
curve $L_X$ for $X$ can be defined as $L_X(p) = \frac{1}{\mu(s)} \int_0^p q_s(t)\,dt$ for $p \in [0,1]$, where $\mu(s)$ is the mean


31
  Measures with the needed sensitivity include those that are constant-sum strictly convex.
32
  The proof also applies to any $I \in \mathcal{I}$ that is linear over ordered vectors, such as the generalized Gini measures
(Weymark 1981).


of $s$ and $q_s: [0,1] \to \mathbb{R}$ is its quantile function. 33 For $j = 1, \dots, d$, let $q_j(p)$ and $L_j(p)$ respectively
denote the quantile function and Lorenz curve of the $j$th variable $x_{\cdot j}$ in $X$. Multidimensional
inequality in $X$ is evaluated using its aggregate Lorenz curve $L_X$ while specific inequalities are
evaluated using the specific Lorenz curves $L_j$; in each case greater inequality is indicated by a
lower Lorenz curve.

Multidimensional and specific inequalities are linked for this case as well. Let $L_A$ denote the
weighted average of the specific Lorenz curves for $X$ using the weights from (4) so that

           $L_A(p) = w_1 L_1(p) + \cdots + w_d L_d(p)$ for $p \in [0,1]$.                                             (7)

It can be shown that $L_A(p)$ is itself a Lorenz curve. Indeed, let $\bar{\bar{X}}$ be the completely aligned
version of $X$ and denote its aggregate vector by $\bar{\bar{s}}$, the associated quantile function by $q_{\bar{\bar{s}}}$, and the
aggregate Lorenz curve by $L_{\bar{\bar{X}}}$. Then

           $L_A(p) = \frac{1}{\sum_j c_j \mu_j} \int_0^p [c_1 q_1(t) + \cdots + c_d q_d(t)]\,dt$

                   $= \frac{1}{\mu(\bar{\bar{s}})} \int_0^p q_{\bar{\bar{s}}}(t)\,dt = L_{\bar{\bar{X}}}(p)$                                   for $p \in [0,1]$           (8)

so that the average Lorenz curve $L_A$ is identical to $L_{\bar{\bar{X}}}$, the aggregate Lorenz curve of $\bar{\bar{X}}$. This is
analogous to what was found above for $M_G$ and relies on the linearity of the Lorenz curve in the
ordered incomes of $s$. Now define the rearrangement function $R_X: [0,1] \to \mathbb{R}$ by $R_X(p) = L_X(p) - L_{\bar{\bar{X}}}(p)$ for
$p \in [0,1]$, and note that it graphically depicts the inequality-reducing impact of dampened
association in moving from $\bar{\bar{X}}$ to $X$. 34 We have the following result.

Theorem 6. Consider any linear $h \in \mathcal{H}$ with $c \gg 0$ and define $w_1, \dots, w_d$ using (4). Then for any
$X \in \mathcal{X}$, we have

           $L_X(p) = w_1 L_1(p) + \cdots + w_d L_d(p) + R_X(p)$                          for $p \in [0,1]$   (9)

where $R_X(p) \ge 0$.



33
   A quantile function is a generalized inverse of the cdf of a distribution; it lists the income of each person against
the percentile of the person, ranging from lowest to highest.
34
   Integrating $R_X(p)$ measures the average distance or the area between the two Lorenz curves, and hence is half of
the rearrangement term for the Gini coefficient.


Proof. See the Appendix.

A Lorenz curve can be used in place of a numerical inequality measure in the two-stage
approach, with $L_X$ representing multidimensional inequality in $X$. Expression (9) shows how
multidimensional inequality $L_X$ can be additively decomposed into a weighted average $L_A$ of
specific Lorenz curves plus a nonnegative rearrangement function $R_X$ reflecting the extent to
which $X$ dampens positive association in $\bar{\bar{X}}$. The final term vanishes for all $p$ when $\bar{\bar{X}}$ is a
permutation of $X$, but otherwise is strictly positive for some $p$.
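
The Lorenz-curve decomposition can also be sketched numerically. In the snippet below the curves are evaluated only at the points p = 0, 1/n, ..., 1, which is sufficient for finite samples because the curves are piecewise linear between those points. The data, coefficients, and names (`lorenz`, `L_avg`, `L_aligned`, `rearrangement`) are illustrative assumptions.

```python
import numpy as np

def lorenz(v):
    """Lorenz ordinates at p = 0, 1/n, ..., 1; assumes nonnegative entries with positive sum."""
    v = np.sort(np.asarray(v, dtype=float))
    return np.concatenate(([0.0], np.cumsum(v) / v.sum()))

# Illustrative data: rows are persons, columns are dimensions.
X = np.array([[1.0, 4.0],
              [2.0, 1.0],
              [6.0, 3.0]])
c = np.array([1.0, 0.5])

mu = X.mean(axis=0)
w = c * mu / np.sum(c * mu)                  # weights c_j * mu_j / sum_k c_k * mu_k

L_X = lorenz(X @ c)                          # aggregate Lorenz curve
L_avg = sum(w[j] * lorenz(X[:, j]) for j in range(X.shape[1]))  # weighted average of specific curves
L_aligned = lorenz(np.sort(X, axis=0) @ c)   # aggregate curve of the completely aligned array
rearrangement = L_X - L_avg                  # rearrangement function at the grid points

print(L_avg, L_aligned)
print(rearrangement)
```

The weighted average of the specific Lorenz curves coincides with the aggregate Lorenz curve of the completely aligned array, and the rearrangement function is nonnegative everywhere and strictly positive at the interior grid points of this example.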

V. Calibration

The above results have established the properties of these multidimensional inequality measures
and explored their links with specific inequalities. We now turn to the task of implementing the
measures with data. As with other measures using the joint distribution, data need to be drawn
from a single source to construct the vector of achievements for each person. Dimensional
variables must be viewed as cardinal variables that are comparable across people so that
meaningful specific inequalities and Lorenz curves can be constructed. 35 In keeping with the
stated purpose of the measure, data should be available in a base period and over time. We
assume the existence of an initial array $X = X^1$ and several subsequent arrays $X^2, \dots, X^T$ for some
$T \ge 2$.

Taking positive multiples of variables leaves specific inequalities unchanged but can directly
impact multidimensional inequality. In empirical applications, attention is often paid to rescaling
variables to make them “comparable” in some sense. For example, variables might be rescaled to
a common range such as [0,1]; or normalized by dividing by the mean or another indicator of
size. The resulting variables are then posited to be comparable even though no measuring rod
related to inequality has been invoked. In addition, the very rescaling process that asserts
comparability across dimensions can, if reapplied to each round of data, reduce comparability
across time.




35
  More precisely, dimensional variables must be ratio scales; note that this fixes a natural zero value for each
variable, which in the present context also has implications for comparability across variables. Related issues are
discussed in Alkire and Foster (2010).


Our approach leaves variables in their original forms and accounts for the relative importance of
variables, and their comparability, through the calibration of $c$ from the aggregation function.
The calibration assumes that policymakers can express the relative importance of specific
inequalities using positive normative weights $v_1, \dots, v_d$ summing to 1. The weights are then
applied to data in the base period to obtain a normative average Lorenz curve, namely

          $L_N(p) = v_1 L_1(p) + \cdots + v_d L_d(p)$                                                                            (10)

where $L_j$ for $j = 1, \dots, d$ are the specific Lorenz curves from period 1. For example, if weights
are equal, $L_N$ will be the simple average of the specific Lorenz curves; while if a weight of 0.5 is
placed on income and the remaining 0.5 is equally split among the rest, the resulting $L_N$ will
more closely resemble the income Lorenz curve. The normative Lorenz curve provides a
snapshot of average specific inequality in the base year using weights that reflect policy
priorities.

Expression (9) linking multidimensional and specific inequalities contains a second average
Lorenz curve $L_A$ which depends on the choice of coefficients $c_1, \dots, c_d$ through its weights. The
calibration approach selects these coefficients to ensure that the latter curve is the same as the
former. Given the associated quantile functions $q_j(p)$ and means $\mu_j$ for $j = 1, \dots, d$, we can
rewrite expression (10) as

          $L_N(p) = \int_0^p \frac{v_1}{\mu_1} q_1(t)\,dt + \cdots + \int_0^p \frac{v_d}{\mu_d} q_d(t)\,dt$                    (11)

which through (8) suggests the use of coefficients $c_j = v_j / \mu_j$ for $j = 1, \dots, d$. Indeed, applying these
coefficients to equation (7) yields ������������������������ (������������) = ������������������������ (������������) for the base period. In this way, the baseline
for evaluating the evolution of inequality can reflect the policy priorities embodied in the
normative weights ������������1 , … , ������������������������ . An analogous relationship holds for the measure ������������������������ associated
                                                                                ������������
with the Gini coefficient. Substituting ������������������������ = ������������������������ into the weights ������������������������ = ∑������������1������������������������������������
                                                                                                      1
                                                                                                           used in ������������������������ (������������)
                                                                                       ������������                                             ������������ ������������ ������������

immediately yields ������������������������ (������������) = ������������1 ������������ (������������.1 ) + ⋯ + ������������������������ ������������ (������������.������������ ). So, for example, if the measure had two
dimensions (say health and income) with equal normative weights, each Gini point in health
would have the same value in ������������������������ (������������) as a Gini point in income. Through the calibration, which
accounts for both the weights and the base year means, the effective measuring rod becomes
units of specific inequalities.
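
The calibration step can be sketched in a few lines of Python. The sketch assumes that each coefficient is the normative weight divided by the base-period mean, and that the effective weight of a dimension is its share in the mean of the aggregate, as described above. The function names and the numerical values are hypothetical.

```python
def calibrate(normative_weights, base_means):
    """Coefficients a_j = v_j / mu_j, with mu_j the base-period mean of dimension j."""
    return [v / mu for v, mu in zip(normative_weights, base_means)]


def effective_weights(coefficients, means):
    """Weights w_j = a_j * mu_j / sum_k(a_k * mu_k) for a period with the given means."""
    shares = [a * mu for a, mu in zip(coefficients, means)]
    total = sum(shares)
    return [s / total for s in shares]


v = [0.5, 0.5]                # equal normative weights (health, income)
base_means = [4.0, 1000.0]    # hypothetical base-period means
a = calibrate(v, base_means)

w_base = effective_weights(a, base_means)       # reproduces v in the base period
w_later = effective_weights(a, [4.0, 1500.0])   # income mean grows by 50 percent
```

In the base period the effective weights coincide with the normative weights. Once the income mean grows while the coefficients stay fixed, the effective weight on income rises (to 0.6 in this toy case), which is how changes in dimensional means enter later comparisons.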


Once the coefficient vector $a$ has been fixed, the measure $M \in \mathcal{L}'$ (and $M_L$) can be applied to
distributions $X^t$ for $t = 1, \ldots, T$ for the purpose of evaluating multidimensional inequality
through time. The analysis it provides is consistent with the core axioms of multidimensional
inequality, while its intuitive aggregation structure helps interpret results. The link with specific
inequalities likewise helps in understanding the evolution of multidimensional inequality via
changes in specific inequalities, dimensional means, and joint distribution. Depending on the
unidimensional inequality measure used, further analysis is possible. For example, $M_L$ can help
evaluate the robustness of trends to the choice of unidimensional measure; while the use of a
decomposable $I$ allows multidimensional inequality to be decomposed into traditional
within-group and between-group terms.

VI. Illustrations.

We begin with a series of simulated results to illustrate the properties of the approach. The data
generation process for our simulation is based on multivariate log-normal distributions, a class of
distributions that approximates many policy-relevant variables well. We generate data for $d = 2$
outcome variables over $T = 2$ time periods by populating samples for each period with
$n = 100{,}000$ random draws of outcome pairs from the bivariate log-normal distribution with
given means and covariance matrix.³⁶ In the resulting arrays $X^t$ for $t = 1, 2$, the distribution of
the first outcome is more unequal than the distribution of the second outcome. Outcomes have
equal normative weights so that $a_j = 1/(2\mu_j)$ for $j = 1, 2$, where $\mu_j$ is the mean of
$x_{\cdot j} = x^1_{\cdot j}$. The resulting $a$ is applied to $X^t$ for $t = 1, 2$ to obtain aggregate vector $s^t$,
while $\bar{\bar{s}}^t$ corresponds to the completely aligned version $\bar{\bar{X}}^t$ of $X^t$.
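
The data generation step can be approximated with the Python standard library. This is a sketch rather than the Stata routine used for the paper: it assumes that each marginal is pinned down by its mean and Gini through the log-normal identities $G = 2\Phi(\sigma/\sqrt{2}) - 1$ and mean $= \exp(\mu + \sigma^2/2)$, that the correlation parameter applies to the logs of the outcomes, and it uses a smaller sample than the paper's 100,000 draws. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def lognormal_params(mean, gini):
    """Log-scale (mu, sigma) of a log-normal with the given mean and Gini.

    Uses Gini = 2 * Phi(sigma / sqrt(2)) - 1 and mean = exp(mu + sigma^2 / 2).
    """
    sigma = math.sqrt(2.0) * NormalDist().inv_cdf((gini + 1.0) / 2.0)
    mu = math.log(mean) - sigma ** 2 / 2.0
    return mu, sigma


def draw_bivariate_lognormal(n, params1, params2, rho, rng):
    """Draw n pairs whose logs are bivariate normal with correlation rho."""
    (mu1, s1), (mu2, s2) = params1, params2
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((math.exp(mu1 + s1 * z1), math.exp(mu2 + s2 * z2)))
    return pairs


def gini(values):
    """Gini coefficient from the sorted-rank formula."""
    ordered = sorted(values)
    n = len(ordered)
    weighted = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(ordered))
    return weighted / (n * sum(ordered))


rng = random.Random(12345)  # fixed seed so the sketch is reproducible
sample = draw_bivariate_lognormal(
    20000, lognormal_params(100.0, 0.5), lognormal_params(30.0, 0.3), 0.0, rng
)
first = [x for x, _ in sample]
second = [y for _, y in sample]
```

The sample means and Gini coefficients of `first` and `second` should be close to the targets (100 and 0.5; 30 and 0.3), up to sampling error.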

In Figure 1, the solid line depicts the “actual” aggregate Lorenz curve $L_{s^t}(p)$, while the dashed
line depicts the completely aligned aggregate (or average) Lorenz curve $L_{\bar{\bar{s}}^t}(p)$ for each time
period $t = 1, 2$. Figure 1 shows the simulated effect of a proportional increase (growth) in the
mean of the first outcome in period 2: both $L_{s^2}(p)$ and $L_{\bar{\bar{s}}^2}(p)$ shift downwards,
indicating an increase in multidimensional (and average) inequality between the two periods.



36 We use the Stata command drawnorm to generate a bivariate normal distribution of two variables with the given
parameters. We then exponentiate these variables to obtain the bivariate log-normal distribution (Stata 2023).



This increase is driven in part by the higher effective weight on the more unequal outcome in
period 2 (via equation 7). This change also makes the array $X^2$ more “aligned” compared to $X^1$, as
reflected in a lower rearrangement term in period 2, which can be seen by comparing the average
vertical differences between $L_{s^2}(p)$ and $L_{\bar{\bar{s}}^2}(p)$ (0.037) and between $L_{s^1}(p)$ and $L_{\bar{\bar{s}}^1}(p)$ (0.045).

       Figure 1: Uncorrelated outcomes, increase in the mean of the first outcome in period 2.
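
The link between the average vertical gap and the Gini rearrangement term can be made explicit. The short derivation below assumes the standard identity $G = 1 - 2\int_0^1 L(p)\,dp$ and writes $R_G(X)$ for the gap between the Gini of the fully aligned aggregate $\bar{\bar{s}}$ and the Gini of the actual aggregate $s$; the subscript is shorthand introduced here.

```latex
\begin{align*}
R_G(X) &= G(\bar{\bar{s}}) - G(s) \\
       &= \Bigl[1 - 2\int_0^1 L_{\bar{\bar{s}}}(p)\,dp\Bigr]
        - \Bigl[1 - 2\int_0^1 L_{s}(p)\,dp\Bigr] \\
       &= 2\int_0^1 \bigl[L_{s}(p) - L_{\bar{\bar{s}}}(p)\bigr]\,dp .
\end{align*}
```

Under this identity the average vertical difference is half the Gini rearrangement term. The Gini columns of Table 1 report rearrangement terms of 0.089 in period 1 and 0.074 in period 2 for this scenario, and half of each is approximately 0.045 and 0.037, the gaps quoted above.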




Figure 2 presents the simulation scenario where the distribution of the second outcome becomes
more unequal in period 2. This change increases overall multidimensional inequality, which
again is manifested by the south-east shift of $L_{s^2}(p)$ (and $L_{\bar{\bar{s}}^2}(p)$) compared to the curves in
period 1. The average vertical difference between the Lorenz curves associated with the actual
and the fully aligned array increases from 0.045 in period 1 to 0.051 in period 2.




      Figure 2: Uncorrelated outcomes, higher inequality in the second outcome in period 2.




Figure 3 depicts the simulation where the two outcomes become more correlated in period 2. The
higher correlation of the outcomes has no effect on the two fully aligned curves, since
$L_{\bar{\bar{s}}^1}(p) = L_{\bar{\bar{s}}^2}(p)$. But it increases multidimensional inequality in period 2 by lowering the
mobility term and pushing $L_{s^2}(p)$ towards $L_{\bar{\bar{s}}^2}(p)$, so that the average vertical difference
between the actual and aligned curves decreases from 0.045 in period 1 to 0.027 in period 2.




               Figure 3: Outcomes are uncorrelated in period 1 and correlated in period 2.




We now illustrate equation (5), which breaks down multidimensional inequality into average
specific inequalities, and the rearrangement and structural terms, for various measures in $\mathcal{L}'$.
Subtracting the structural term $S(X)$ from the average specific inequality $A(X)$ yields $M(\bar{\bar{X}})$, the
multidimensional inequality of the fully aligned array. Subtracting the rearrangement term $R(X)$
from $M(\bar{\bar{X}})$ yields multidimensional inequality $M(X)$. Table 1 lists these components for the
multidimensional measures generated by the Mean Log Deviation (MLD), the (first) Theil, and
the Gini inequality measures for three simulation scenarios. As noted above, $S(X) = 0$ in the
case of Gini, but it is positive for the other measures. For example, when the mean of the first
outcome doubles, $M_{MLD}$ increases from 0.158 to 0.212 and $M_{Theil}$ from 0.170 to 0.235, with
much of the change being reflected in a rising $A(X)$ term and $R(X)$ and $S(X)$ falling slightly.
$M_{Gini}$ likewise increases from 0.311 to 0.360 with $A(X) = M(\bar{\bar{X}})$ rising and $R(X)$ falling by less
and $S(X)$ obviously remaining unchanged.




Table 1: Changes in multidimensional inequality and components for three scenarios.*

                        MLD                   Theil                 Gini
                        Period 1   Period 2   Period 1   Period 2   Period 1   Period 2
Scenario 1: Mean income of the first outcome increases from 100 in Period 1 to 200 in Period 2
  $M(\bar{\bar{X}})$    0.270      0.323      0.281      0.335      0.400      0.434
  $M(X)$                0.158      0.212      0.170      0.235      0.311      0.360
  $A(X)$                0.302      0.353      0.302      0.354      0.400      0.434
  $R(X)$                0.112      0.111      0.111      0.100      0.089      0.074
  $S(X)$                0.032      0.030      0.021      0.019      0.000      0.000
Scenario 2: Inequality of the second outcome increases from Gini 0.3 in Period 1 to Gini 0.4 in Period 2
  $M(\bar{\bar{X}})$    0.270      0.356      0.281      0.360      0.400      0.450
  $M(X)$                0.158      0.204      0.170      0.211      0.311      0.349
  $A(X)$                0.302      0.365      0.302      0.366      0.400      0.450
  $R(X)$                0.112      0.152      0.111      0.149      0.089      0.101
  $S(X)$                0.032      0.009      0.021      0.006      0.000      0.000
Scenario 3: Correlation between outcomes increases from 0 in Period 1 to ≈0.3 in Period 2
  $M(\bar{\bar{X}})$    0.270      0.270      0.281      0.282      0.400      0.400
  $M(X)$                0.158      0.200      0.170      0.211      0.311      0.347
  $A(X)$                0.302      0.302      0.302      0.303      0.400      0.400
  $R(X)$                0.112      0.070      0.111      0.071      0.089      0.053
  $S(X)$                0.032      0.032      0.021      0.021      0.000      0.000

*) The simulated distributions have the following parameters: First outcome: mean 100, Gini 0.5; Second outcome:
mean 30, Gini 0.3; Correlation: period 1: 0; period 2: ≈ 0.3.
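
The breakdown can be reproduced on a small array with the following Python sketch. It assumes that the average specific inequality is the effective-weight average of the dimension-by-dimension inequality values, that the fully aligned array is obtained by sorting each column, and that the aggregate is the linear combination of dimensions described in the text. The function names and the five-person data set are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def gini(values):
    ordered = sorted(values)
    n = len(ordered)
    return sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(ordered)) / (n * sum(ordered))


def mld(values):
    """Mean log deviation: average of log(mean / x)."""
    m = mean(values)
    return mean([math.log(m / x) for x in values])


def theil(values):
    """(First) Theil index: average of (x / mean) * log(x / mean)."""
    m = mean(values)
    return mean([(x / m) * math.log(x / m) for x in values])


def decompose(columns, coefficients, index):
    """Return (M, A, R, S) so that M = A - R - S for the chosen index."""
    n = len(columns[0])
    # Aggregate each person's achievements with the linear function.
    s = [sum(a * col[i] for a, col in zip(coefficients, columns)) for i in range(n)]
    # Fully aligned array: sort every column so ranks coincide across dimensions.
    aligned = [sorted(col) for col in columns]
    s_aligned = [sum(a * col[i] for a, col in zip(coefficients, aligned)) for i in range(n)]
    # Effective weights are the shares of each dimension in the aggregate mean.
    shares = [a * mean(col) for a, col in zip(coefficients, columns)]
    weights = [sh / sum(shares) for sh in shares]
    average_specific = sum(w * index(col) for w, col in zip(weights, columns))
    m_actual = index(s)
    m_aligned = index(s_aligned)
    rearrangement = m_aligned - m_actual
    structural = average_specific - m_aligned
    return m_actual, average_specific, rearrangement, structural


# Hypothetical two-dimensional data for five people, with equal normative weights.
x1 = [10.0, 80.0, 20.0, 200.0, 40.0]
x2 = [30.0, 20.0, 45.0, 25.0, 35.0]
a = [0.5 / mean(x1), 0.5 / mean(x2)]

results = {name: decompose([x1, x2], a, f)
           for name, f in [("MLD", mld), ("Theil", theil), ("Gini", gini)]}
```

For the Gini the structural term returned by `decompose` is zero up to floating-point error, while for MLD and Theil it is nonnegative, which mirrors the pattern in the last row of each panel of Table 1.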

Table 2 presents results for the three inequality measures when the outcome variables in period 2
are based on a simple transformation of the period 1 data: namely, the second variable is
unchanged from period 1 while the first is obtained from its period 1 value by an additive or a
proportional increase. A uniform increment of 20 units lowers multidimensional inequality, as is
seen in the top panel of Table 2: $M_{MLD}$ declines from 0.158 to 0.129, $M_{Theil}$ declines from 0.170
to 0.142, and $M_{Gini}$ from 0.311 to 0.283. That increment also results in a drop in the
rearrangement component for all three indexes, and the structural component declines for MLD
and Theil, and remains at 0 for Gini. In contrast, a proportional increase in the first outcome by
20% increases multidimensional inequality, as seen in the bottom panel of Table 2: $M_{MLD}$ grows
from 0.158 to 0.171, $M_{Theil}$ grows from 0.170 to 0.185, and $M_{Gini}$ grows from 0.311 to 0.323,
while both the rearrangement and structural mobility components change only slightly.




Table 2: Changes in multidimensional inequality from additive or proportional changes in the
first outcome.*

                        MLD                   Theil                 Gini
                        Period 1   Period 2   Period 1   Period 2   Period 1   Period 2
Scenario 1: The first outcome variable additively increases from 100 in Period 1 to 120 in Period 2
  $M(\bar{\bar{X}})$    0.270      0.216      0.281      0.234      0.400      0.364
  $M(X)$                0.158      0.129      0.170      0.142      0.311      0.283
  $A(X)$                0.302      0.222      0.302      0.242      0.400      0.364
  $R(X)$                0.112      0.087      0.111      0.092      0.089      0.081
  $S(X)$                0.032      0.006      0.021      0.008      0.000      0.000
Scenario 2: The first outcome variable proportionally increases from 100 in Period 1 to 120 in Period 2
  $M(\bar{\bar{X}})$    0.270      0.284      0.281      0.296      0.400      0.409
  $M(X)$                0.158      0.171      0.170      0.185      0.311      0.323
  $A(X)$                0.302      0.316      0.302      0.316      0.400      0.409
  $R(X)$                0.112      0.113      0.111      0.111      0.089      0.086
  $S(X)$                0.032      0.032      0.021      0.020      0.000      0.000

*) The simulated distributions have the following parameters: First outcome: mean 100, Gini 0.5; Second outcome:
mean 30, Gini 0.3; Correlation: period 1: 0; period 2: ≈ 0.3.
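
The contrast between the two panels can be checked by hand for the Gini-based average specific inequality. The sketch below works with the population parameters listed in the table note rather than with simulated draws, and relies on two standard facts: adding a constant $c$ to every value rescales the Gini by $\mu/(\mu + c)$, while multiplying every value by a constant leaves the Gini unchanged. The function name is hypothetical.

```python
def average_gini(normative_weights, base_means, means, ginis):
    """Average specific Gini with weights w_j proportional to v_j * mu_j / base mu_j."""
    shares = [v * m / b for v, b, m in zip(normative_weights, base_means, means)]
    total = sum(shares)
    return sum(sh / total * g for sh, g in zip(shares, ginis))


v = [0.5, 0.5]
base_means = [100.0, 30.0]
base_ginis = [0.5, 0.3]

# Additive change: adding 20 to every value rescales the Gini by mu / (mu + 20).
gini_additive = base_ginis[0] * 100.0 / 120.0
a_additive = average_gini(v, base_means, [120.0, 30.0], [gini_additive, 0.3])

# Proportional change: scaling every value by 1.2 leaves the specific Gini at 0.5.
a_proportional = average_gini(v, base_means, [120.0, 30.0], [0.5, 0.3])

print(round(a_additive, 3), round(a_proportional, 3))  # prints 0.364 0.409
```

Both transformations raise the effective weight on the first outcome to the same extent, so the different results come entirely from the fall in the specific Gini under the additive change. The two values agree with the period 2 Gini entries for the average specific inequality in Table 2.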

We now apply the methods to analyze changes in multidimensional inequality in Azerbaijan
from 2016 to 2023. We use data from the third (2016) and the fourth (2023) rounds of the Life
in Transition Survey (LITS). The LITS is a survey run by the European Bank for Reconstruction
and Development and the World Bank covering the so-called “transition countries” of Europe
and Central Asia and several comparator countries of Western Europe, the Middle East, and
North Africa (EBRD 2023). The survey included a nationally representative sample of around
1,000 households in Azerbaijan in each of these two rounds. Azerbaijan experienced rapid
growth in per capita GDP between 2016 and 2023, which makes the example of this country
helpful in illustrating the properties of the methods.

We construct the multidimensional inequality index $M_{Gini}$ for three dimensions captured by the
monthly per capita income (in 2017 PPP terms), years of education, and respondent’s health
assessment. The measure is calibrated for 2016 using normative weights of ½ for the income
dimension and ¼ each for the education and health dimensions. Table 3 presents the specific and
multidimensional inequality levels for Azerbaijan in 2016 and 2023. The mean monthly per
capita income increased by about 62 percent, from about 852 PPP dollars in 2016 to about 1,384
PPP dollars in 2023. The average years of education and health self-assessment remained relatively
stable. Income growth was accompanied by an increase in income inequality, from a Gini of
0.253 in 2016 to 0.339 in 2023. The inequality in years of education grew while the inequality in
health assessment slightly declined.


Table 3: Specific and multidimensional inequalities in Azerbaijan, 2016-2023.

                                          2016        2023
 Specific Inequalities
 Income
    Mean                                852.46     1384.44
    Gini                                 0.253       0.339
 Education (years)
    Mean                                10.304      11.091
    Gini                                 0.094       0.115
 Health
    Mean                                 3.448       3.511
    Gini                                 0.166       0.159

 Multidimensional Inequality
    $A(X) = M(\bar{\bar{X}})$            0.181       0.260
    $M(X)$                               0.144       0.230
    Mobility term $= R(X)$               0.037       0.029




Multidimensional inequality as measured by $M_{Gini}$ increased significantly between 2016 and
2023, from 0.144 to 0.230. This increase is due to (i) changes in the specific inequalities, (ii)
changes in the effective weights as dimensional means change, and (iii) changes in the
rearrangement term. Holding (ii) and (iii) fixed and altering specific inequalities from the 2016
to the 2023 levels raises the average specific inequality from 0.181 to 0.238. Incorporating the
2023 means alters the effective weights from 0.50, 0.25, and 0.25, respectively, to 0.61, 0.20, and
0.19, thus increasing the average further to 0.260. Finally, the rearrangement term fell slightly
from 0.037 to 0.029, reflecting greater alignment of dimensions and ensuring that the increase in
multidimensional inequality (namely, from 0.144 in 2016 to 0.230 in 2023) is more pronounced
than the increase in average specific inequality.
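
Steps (i) and (ii) of this accounting can be retraced from the rounded entries of Table 3. The sketch below assumes the calibration described earlier, so that the effective weight of a dimension in 2023 is proportional to its normative weight times the ratio of its 2023 mean to its 2016 mean. The variable names are hypothetical, and because the inputs are rounded the results agree with the text only up to rounding.

```python
v = [0.50, 0.25, 0.25]                  # normative weights: income, education, health
means_2016 = [852.46, 10.304, 3.448]    # means from Table 3
means_2023 = [1384.44, 11.091, 3.511]
ginis_2023 = [0.339, 0.115, 0.159]      # specific Gini coefficients from Table 3

# Step (i): 2023 specific inequalities combined with the 2016 (normative) weights.
counterfactual = sum(w * g for w, g in zip(v, ginis_2023))

# Step (ii): effective 2023 weights, proportional to v_j * mu_j(2023) / mu_j(2016).
shares = [w * m1 / m0 for w, m0, m1 in zip(v, means_2016, means_2023)]
weights_2023 = [s / sum(shares) for s in shares]
average_2023 = sum(w * g for w, g in zip(weights_2023, ginis_2023))
```

Here `counterfactual` is about 0.238, `weights_2023` rounds to 0.61, 0.20, and 0.19, and `average_2023` is about 0.260, in line with the figures reported above.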

Figure 4 depicts the associated Lorenz curves for 2016 and 2023. The solid curves represent
$L_{s^t}(p)$, the Lorenz curve of the aggregate distribution $s^t$ from the actual data array. The dashed
curve is $L_{\bar{\bar{s}}^t}(p)$ associated with the aligned aggregate distribution $\bar{\bar{s}}^t$ or, equivalently, the
weighted average of the specific Lorenz curves. The vertical difference between the solid and
dashed curve is the Lorenz rearrangement function for the year, while the average vertical
difference is clearly linked to the Gini rearrangement term. Each 2023 curve is below its
respective 2016 curve, indicating that multidimensional inequality is unambiguously higher in
2023 than in 2016. Replacing the Gini with any Lorenz-consistent inequality index $I$ would
preserve the conclusion that multidimensional inequality rose from 2016 to 2023.


                       Figure 4: Multidimensional inequality in Azerbaijan.




VII. Conclusions.

The goal of this paper was to identify a technology for measuring multidimensional inequality
that is axiomatically sound and can be easily understood by policymakers. The latter concern
suggests the intuitive two-stage approach of Maasoumi (1986), which aggregates achievements
for each person in the first stage and applies a traditional inequality measure to the distribution of
aggregates in the second. Our first series of results showed that the aggregation function must
take on a linear form if the measure is to satisfy the basic axioms for multidimensional inequality
measures. The next set of results showed how multidimensional inequality can be expressed as a
weighted average of specific inequalities minus a “mobility” term that measures the inequality-
reducing impact of dimensional mixing. The mobility term can be further divided into terms
reflecting the effect of rearranging achievements and combining distributions with different
shapes, with the latter term disappearing when the Gini coefficient or the Lorenz curve is used in
the second stage. We noted how the technology might be calibrated to incorporate the normative
priorities of policymakers over the specific inequalities. A series of examples with simulated data
showed how the specific inequalities, dimensional means, rearrangement term and structural
term together shape multidimensional inequality, while an empirical example from Azerbaijan
illustrates how multidimensional inequality evolved during a period of rapid economic growth.

There are several concepts related to multidimensional inequality that the paper does not address.
We have not relied on a prior notion of welfare in our approach, and the functional form is not
reliant on traditional representations of welfare, utility functions, or even preferences. While
there is significant ongoing work in these areas, our paper is not intended to contribute to the
associated lines of research. The capability approach provides a natural guide for expanding
consideration to other evaluation spaces. As there are multiple capabilities, our multidimensional
approach to inequality might be seen as moving toward the same general goal. But our approach
does not account for individual variations in conversion factors between resources and outcomes,
nor the value of having many choices, both of which are central to the capability approach
(Foster and Sen 1997, Basu and Lopez-Calva, 2011). In addition, there is a burgeoning literature
on inequality with ordinal variables. 37 In contrast, the methods presented in this paper require
cardinal variables, while the link with specific inequalities also requires inequality values to have
cardinal significance. Consequently, they are not applicable to the many cases where cardinality
assumptions cannot be maintained.




37
     See Allison and Foster (2004), for example.


References

Aaberge, R. and Brandolini, A., (2015). “Multidimensional poverty and inequality.” In:
   Atkinson, A.B., Bourguignon, F. (Eds.), Handbook of Income Distribution, 2A, pp 141-216.

Alkire, S. and Foster, J., (2010). “Designing the inequality-adjusted human development index
   (IHDI).” Human Development Research Paper 2010/28, UNDP: New York.

Alkire, S. and Foster, J., (2011). “Counting and multidimensional poverty measurement.”
   Journal of Public Economics, 95 (7-8), pp. 476-487.

Alkire, S., Foster, J.E., Seth, S., Santos, M.E., Roche, J. and Ballon, P., (2015). Multidimensional
   Poverty Measurement and Analysis. Oxford University Press, Oxford.

Allison, R.A. and Foster, J.E., (2004). “Measuring health inequality using qualitative data.”
    Journal of Health Economics, 23, pp. 505–524.

Andreoli, F. and Zoli, C., (2020). From unidimensional to multidimensional inequality: A
   review. Metron, https://doi.org/10.1007/s40300-020-00168-4.

Atkinson, A. B., (1970). “On the measurement of inequality.” Journal of Economic Theory, 2,
   pp. 244–263.

Atkinson, A. B., (2003). “Multidimensional deprivation: Contrasting social welfare and counting
   approaches.” Journal of Economic Inequality, 1, pp. 51–65.

Atkinson, A.B. and Bourguignon, F., (1982). “The comparison of multi-dimensional distribution
   of economic status.” The Review of Economic Studies, 49, pp. 183–201.

Bartels, C. and Stockhausen, M., (2017). “Children's opportunities in Germany – An application
   using multidimensional measures.” German Economic Review, 18(3), pp. 327-376.

Basu, K. and Lopez-Calva, L.F., (2011). “Functionings and Capabilities.” Handbook of Social
   Choice and Welfare, 2, pp. 153-187.

Boland, P. and Proschan, F., (1988). “Multivariate arrangement increasing functions with
   application in probability and statistics.” Journal of Multivariate Analysis, 25, pp. 286–298.

Bosmans, K., Decancq, K. and Ooghe, E., (2015). “What do normative indices of
   multidimensional inequality really measure?” Journal of Public Economics, 130, pp. 94–104.

Bourguignon, F., (1999). “Comment on ‘Multidimensioned approaches to welfare analysis’.” In:
   Maasoumi, E., Silber, J. (Eds.), Handbook of Income Inequality Measurement. Kluwer
   Academic, London.




Dardanoni, V., (1995). “On multidimensional inequality measurement.” In: Dagum, C., Lemmi,
   A. (Eds.), Income Distribution, Social Welfare, Inequality and Poverty. Research on
   Economic Inequality vol. 6. CT: JAI Press, Stamford.

Decancq, K., (2009). Essays on the Measurement of Multidimensional Inequality. Ph.D.
   Dissertation, Katholieke Universiteit, Leuven.

Decancq, K., (2011). “Measuring global well-being inequality: A dimension-by-dimension or
   multidimensional approach?” Reflets et perspectives de la vie économique, 4 (Tome L), pp.
   179-196.

Decancq, K., and Lugo, M. A., (2012). “Inequality of wellbeing: A multidimensional approach.”
   Economica, 79, pp. 721–746.

Diez, H., Lasso de la Vega M. C. and Urrutia, A. M., (2007). “Unit-Consistent aggregative
   multidimensional inequality measures: A characterization.” Working Papers #66, ECINEQ,
   Society for the Study of Economic Inequality.

Foster, J. E., (1985). “Inequality measurement.” In: Proceedings of Symposia in Applied
   Mathematics Vol. 33, pp. 31-68 (H. P. Young, ed.), American Mathematical Society.

Foster, J. E., (1994). “Normative measurement: Is theory relevant?” American Economic Review,
   84(2), pp. 365-370.

Foster, J. E., (2023). “Intentional measurement.” Mimeo.

Foster, J. E. and Sen, A. K., (1997). “On economic inequality. After a quarter century.” Annex to
   the enlarged edition of On Economic Inequality by Sen, A.K. Clarendon Press, Oxford.

Gajdos, T. and Weymark, J.A., (2005). “Multidimensional generalized Gini indices.” Economic
   Theory, 26, pp. 471–496.

Glassman, B., (2019). “Multidimensional inequality: Measurement and analysis using the
   American Community Survey.” SEHSD Working Paper Number 2019-17, Eastern Economic
   Association Annual Conference.

International Monetary Fund (2023). “Income inequality.” IMF accessed at
    https://www.imf.org/en/Topics/Inequality/introduction-to-inequality.

International Panel on Social Progress, (2018). “Rethinking society for the 21st century.”
    Cambridge University Press, Cambridge.

Justino, P. (2012). “Multidimensional welfare distributions: empirical application to household
    panel data from Vietnam.” Applied Economics, 44, pp. 3391–3405.




Kannai, Y., (1980). “The ALEP definition of complementarity and least concave utility
   functions.” Journal of Economic Theory, 22, pp. 115–117.

Kolm, S-C., (1976). “Unequal Inequalities II.” Journal of Economic Theory, 13, pp. 82-111.

Kolm, S-C., (1977). “Multidimensional egalitarianisms.” Quarterly Journal of Economics, 91
   (1), pp. 1–13.

Koshevoy, G. A. and Mosler, K., (1997). “Multivariate Gini indices.” Journal of Multivariate
   Analysis, 60, pp. 252-276.

Lugo, M.A., (2007). “Comparing multidimensional indices of inequality: methods and
   application.” In: Bishop, J., Amiel, Y. (Eds.), Inequality and Poverty. Research on Economic
   Inequality, 14. Emerald Group Publishing Limited, pp. 213-236.

Maasoumi, E., (1986). “The measurement and decomposition of multi-dimensional inequality.”
  Econometrica, 54, pp. 991–998.

Maasoumi, E., (1999). “Multidimensioned approaches to welfare analysis.” In: Silber, J. (Ed.),
  Handbook of Income Inequality Measurement. Kluwer Academic, London.

Nilsson, T. (2010). “Health, wealth and wisdom: Exploring multidimensional inequality in a
    developing country.” Social Indicators Research, 95, pp. 299–323.

Rohde, N. and Guest, R. (2013). “Multidimensional racial inequality in the United States.”
   Social Indicators Research, 114, pp. 591–605.

Rohde, N. and Guest, R. (2018). “Multidimensional inequality across three developed countries.”
   Review of Income and Wealth, 64(3), pp. 576–591.

Sen, A.K. (1997). On Economic Inequality. Clarendon Press, Oxford.

Sen, A. (1999). Development as Freedom. Alfred Knopf, New York.

Seth, S., (2013). “A class of distribution and association sensitive multidimensional welfare
   indices.” The Journal of Economic Inequality, 11, pp. 133–162.

Seth, S. and Santos, M. E., (2018). “Multidimensional inequality and human development.”
   OPHI Working Paper No. 114, 18-11.

Slesnick, D. T., (1989). “Specific egalitarianism and total welfare inequality: A decompositional
    analysis.” The Review of Economics and Statistics, 71(1), pp. 116-127.

Shorrocks, A. F., (1978). “Income inequality and income mobility.” Journal of Economic
   Theory, 19, pp. 376-393.



Stiglitz, J., (2023). “Inequality and democracy.” Project Syndicate, accessed at
    https://www.projectsyndicate.org/commentary/inequality-source-of-lost-confidence-in-liberal-democracy-by-joseph-e-stiglitz-2023-08

Stiglitz, J., Sen, A., and Fitoussi, J., (2009). Report by the Commission on the Measurement of
    Economic Performance and Social Progress. Commission on the Measurement of Economic
    Performance and Social Progress, Paris.

Szekely, M. (2006). Números Que Mueven al Mundo: La Medición de la Pobreza en México,
   Miguel Ángel Porrúa, Mexico City.

Tobin, J., (1970). “On limiting the domain of inequality.” Journal of Law and Economics, 13 (2),
   pp. 263–277.

Tsui, K. Y., (1995). “Multidimensional generalizations of the relative and absolute inequality
   indices: the Atkinson–Kolm–Sen approach.” Journal of Economic Theory, 67(1), pp. 251–
   265.

Tsui, K. Y., (1999). “Multidimensional inequality and multidimensional generalized entropy
   measures: an axiomatic derivation.” Social Choice and Welfare, 16, pp. 145–157.

Weymark, J.A. (1981). “Generalized Gini Inequality Indices.” Mathematical Social Sciences, 1
  (4), pp. 409-430.

Weymark, J.A., (2006). “The normative approach to the measurement of multidimensional
  inequality.” In: Farina, F., Savaglio, E. (Eds.), Inequality and Economic Integration,
  Routledge, London.




Appendix

Proof of Theorem 1. Let $M \in \mathcal{M}$ have components $h$ and $I$. Suppose that $M$ satisfies weak
uniform majorization. We begin with the following lemma.

Lemma: Suppose that $h(x_1) = \cdots = h(x_m) = a > 0$ for a given set $x_1, \ldots, x_m$ of $m \ge 2$
vectors in the domain of $h$. Then $h(\beta_1 x_1 + \cdots + \beta_m x_m) = a$ for any $\beta_1, \ldots, \beta_m > 0$ with $\beta_1 + \cdots + \beta_m = 1$.

∎ Let $x_1, \ldots, x_m$ in the domain of $h$ satisfy $h(x_1) = \cdots = h(x_m) = a > 0$, and suppose that $\beta_1, \ldots, \beta_m > 0$ with
$\beta_1 + \cdots + \beta_m = 1$. By linear homogeneity of $h$ we can find $y$ in the domain of $h$ for which $h(y) = b$ is strictly
below $a$. Consider the $m \times d$ arrays
$X = \begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix}$ and $X' = \begin{pmatrix} y \\ \vdots \\ y \end{pmatrix}$
and define the associated $2m \times d$ array $Z = \begin{pmatrix} X \\ X' \end{pmatrix}$. Now define a $2m \times 2m$ bistochastic matrix
$A = \begin{pmatrix} B & 0 \\ 0 & \mathrm{I} \end{pmatrix}$, where $B$ is an
$m \times m$ bistochastic matrix having $(b_{11}, \ldots, b_{1m}) = (\beta_1, \ldots, \beta_m)$ as its first row, $\mathrm{I}$ is an $m \times m$ identity
matrix, and $0$ is an $m \times m$ matrix of zeros. Since $M$ satisfies the weak uniform majorization axiom,
it follows that $M(Z) \ge M(Z')$ for $Z' = AZ$. Let $s$ be the vector defined by $s_i = h(z_i)$ for $i = 1, \ldots, 2m$
and note that $s_i = h(x_i) = a$ for $i = 1, \ldots, m$ and $s_i = h(y) = b$ for $i = m + 1, \ldots, 2m$.
By replication invariance of $I$, it follows that $M(Z) = I(s) = I\begin{pmatrix} a \\ b \end{pmatrix}$. As for the vector $s'$
associated with $Z'$, we note that $Z' = \begin{pmatrix} BX \\ X' \end{pmatrix}$, so it is immediate that $s'_i = b$ for $i = m + 1, \ldots, 2m$,
while each entry $s'_i$ for $i = 1, \ldots, m$ is found by applying $h$ to a weighted average of the rows in $X$,
namely $s'_i = h(b_{i1} x_1 + \cdots + b_{im} x_m)$. By the concavity of $h$ we have $s'_i \ge b_{i1} h(x_1) + \cdots + b_{im} h(x_m) = a$
for all $i$. Now consider the vector $s''$ having the same last $m$ entries as $s'$, but with
the first $m$ entries in $s'$ replaced by their mean $c = \frac{1}{m} \sum_{i=1}^{m} s'_i$. By the transfer axiom and
replication invariance for $I$ we have $I(s') \ge I(s'') = I\begin{pmatrix} c \\ b \end{pmatrix}$. Summing up, we conclude that
$I\begin{pmatrix} a \\ b \end{pmatrix} \ge I\begin{pmatrix} c \\ b \end{pmatrix}$ for the Lorenz consistent measure $I$. By construction, we know that the
inequalities $c \ge a > b$ must hold; but $c > a$ is surely not possible, since then $\begin{pmatrix} a \\ b \end{pmatrix}$ would strictly
Lorenz dominate $\begin{pmatrix} c \\ b \end{pmatrix}$, leading to $I\begin{pmatrix} a \\ b \end{pmatrix} < I\begin{pmatrix} c \\ b \end{pmatrix}$. Consequently, $c = a$, and since $c = \frac{1}{m} \sum_{i=1}^{m} s'_i$
with $s'_i \ge a$ for all $i$, it follows that $s'_i = a$ for all $i$. In particular, setting $i = 1$ yields
$h(\beta_1 x_1 + \cdots + \beta_m x_m) = h(b_{11} x_1 + \cdots + b_{1m} x_m) = s'_1 = a$, as desired. ∎


Now consider the collection of vectors $v_1, \ldots, v_d$ in the domain of $h$ defined by $v_j = u_j / h(u_j)$, where
$u_1, \ldots, u_d$ are constructed from the usual basis vectors $e_1, \ldots, e_d$ and the midpoint $\bar{e} = (1/d, \ldots, 1/d)$
as follows: $u_j = \frac{1}{2} e_j + \frac{1}{2} \bar{e}$ for $j = 1, \ldots, d$. It can be shown that $v_1, \ldots, v_d$ are linearly
independent, and hence that the $d \times d$ matrix $V = \begin{pmatrix} v_1 \\ \vdots \\ v_d \end{pmatrix}$ is invertible. Define $w = (w_1, \ldots, w_d) \in \mathbb{R}^d$
by $w = V^{-1} (1, \ldots, 1)^{\top}$ and note that $w v_j = h(v_j) = 1$ for all $j$. The vectors $v_1, \ldots, v_d$ generate a simplex

            $S = \{ x : x = \beta_1 v_1 + \cdots + \beta_d v_d$ for $\beta_1, \ldots, \beta_d > 0$ with $\beta_1 + \cdots + \beta_d = 1 \}$

and a cone

            $C = \{ x : x = \beta_1 v_1 + \cdots + \beta_d v_d$ for $\beta_1, \ldots, \beta_d > 0 \}$.

We will show that $h(x) = wx$ for all $x \in C$. First, pick any $x \in S$. By the definition of $x$ as a point
in $S$, it follows that $wx = \beta_1 w v_1 + \cdots + \beta_d w v_d = 1$; applying the lemma ensures that $h(x) = 1$.
Consequently $h(x) = wx$ for all $x \in S$. Now pick any $x \in C$. By the definition of $x$ as a point in $C$,
it follows that $wx = \beta_1 w v_1 + \cdots + \beta_d w v_d = \beta_1 + \cdots + \beta_d$, so that $x' = x / (\beta_1 + \cdots + \beta_d) \in S$.
Consequently, by the linear homogeneity of $h$ we have $h(x) = (\beta_1 + \cdots + \beta_d) h(x') =
(\beta_1 + \cdots + \beta_d) = wx$, which shows that $h(x) = wx$ for all $x \in C$.

We now show that $h(x) = wx$ for any $x$ in the domain of $h$ that is not in $C$. Pick any such $x$ and define $x' =
x / h(x)$ so that by linear homogeneity $h(x') = 1$. Define $v_0 = (v_1 + \cdots + v_d)/d \in S$ and note
that $h(v_0) = w v_0 = 1$. Since $v_0$ is interior to $C$, we can find a small enough weight $0 < \gamma < 1$
so that $x'' = \gamma x' + (1 - \gamma) v_0 \in C$. By the lemma, $h(x'') = 1$ and hence $w x'' = 1$ since $x'' \in C$.
Clearly, $w x'' = \gamma w x' + (1 - \gamma) w v_0$, or equivalently $1 = \gamma w x' + (1 - \gamma)$, which yields $\gamma = \gamma w x'$
and so $1 = w x' = wx / h(x)$. It follows, then, that $h(x) = wx$ for every $x$ in the domain of $h$. Finally, since $h$ is
increasing, we know that $w \gg 0$.
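The construction of $w$ in this proof can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it takes a hypothetical aggregator `h` that is linear with weights (1/5, 1/2, 3/10), builds the normalized vectors from the basis vectors and the midpoint, and solves $Vw = (1, \ldots, 1)^{\top}$ in exact rational arithmetic. The names `hidden`, `h`, and `solve` are not from the paper.

```python
from fractions import Fraction

# Hypothetical linear aggregator; the construction below never reads `hidden` directly.
hidden = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]

def h(x):
    return sum(a * xi for a, xi in zip(hidden, x))

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(A)
    M = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [a / p for a in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]

d = len(hidden)
half = Fraction(1, 2)
midpoint = [Fraction(1, d)] * d

V = []
for j in range(d):
    e_j = [Fraction(int(k == j)) for k in range(d)]             # usual basis vector
    u_j = [half * e + half * m for e, m in zip(e_j, midpoint)]  # strictly positive
    V.append([u / h(u_j) for u in u_j])                         # normalized so h(v_j) = 1

w = solve(V, [1] * d)
print(w)
```

In this example the solved `w` coincides with the weights used to define `h`, which is what the proof predicts for any linear aggregator.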

Proof of Theorem 2. Let $M \in \mathcal{L}$ be any two-stage measure with components $h \in \mathcal{H}$ and $I \in \mathcal{I}$. To
show that $M$ satisfies anonymity, let $X'$ be obtained from $X$ by a permutation, so that $X' = \Pi X$ for
some permutation matrix $\Pi$. Clearly $s' = \Pi s$, and hence $I(s') = I(s)$ by anonymity of $I$, which
yields $M(X') = M(X)$, as required. To show that $M$ satisfies scale invariance, let $X'$ be obtained
from $X$ by a scalar multiple, so that $x'_{ij} = \alpha x_{ij}$ for some $\alpha > 0$, for all $i = 1, \ldots, n$ and $j =
1, \ldots, d$. Clearly $s' = \alpha s$ by linear homogeneity of $h$, and hence $I(s') = I(s)$ by the scale
invariance of $I$, which yields $M(X') = M(X)$, as required. To show replication invariance, let $X'$
be obtained from $X$ by a replication. Clearly $s'$ is obtained from $s$ by a replication, and hence
$I(s') = I(s)$ by the replication invariance of $I$, which yields $M(X') = M(X)$, as required.

Now to show that $M$ satisfies limited uniform majorization, let $X'$ be obtained from $X$ by a
uniform majorization, so that $X' = BX$ for some bistochastic matrix $B$. Clearly $s'_i =
h(b_{i1} x_1 + \cdots + b_{in} x_n) = b_{i1} h(x_1) + \cdots + b_{in} h(x_n) = b_{i1} s_1 + \cdots + b_{in} s_n$ and so $s' = Bs$. Hence
$I(s') \le I(s)$ by the transfer axiom for $I$, which yields $M(X') \le M(X)$, as required by the first
part of the axiom. Now suppose that $X'$ is obtained from $X$ by a uniform smoothing, and that in
addition, $s'$ is not obtained from $s$ by a permutation. By the above we know that $s' = Bs$, and
given that $s'$ is not obtained from $s$ by a permutation, it follows from the transfer axiom that
$I(s') < I(s)$, and hence $M(X') < M(X)$, as required by the second part of the axiom.
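The key step in this paragraph is that a linear aggregator commutes with smoothing: aggregating the smoothed array gives the same vector as smoothing the aggregates. A minimal Python sketch follows, assuming equal weights, the Gini coefficient as the unidimensional measure, and a made-up 3 × 2 array and bistochastic matrix; all names and numbers are illustrative.

```python
def gini(v):
    """Gini coefficient via rank weights (2i - n - 1) / n^2 on the ordered vector."""
    x = sorted(v)
    n = len(x)
    mu = sum(x) / n
    return sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1)) / (n * n * mu)

def aggregate(X, w):
    """Linear aggregation of each row (person) across dimensions."""
    return [sum(wj * xij for wj, xij in zip(w, row)) for row in X]

def smooth(B, X):
    """Matrix product B X: each new row is a weighted average of the old rows."""
    return [[sum(B[i][k] * X[k][j] for k in range(len(X))) for j in range(len(X[0]))]
            for i in range(len(B))]

X = [[1.0, 8.0], [4.0, 2.0], [9.0, 5.0]]   # illustrative data
w = [0.5, 0.5]                              # assumed weights
B = [[0.50, 0.50, 0.00],                    # rows and columns each sum to 1
     [0.25, 0.50, 0.25],
     [0.25, 0.00, 0.75]]

s = aggregate(X, w)
s_smooth = aggregate(smooth(B, X), w)
Bs = [sum(bik * sk for bik, sk in zip(row, s)) for row in B]

print(s_smooth, Bs)                 # the two vectors coincide
print(gini(s_smooth), gini(s))      # smoothing lowers the Gini of the aggregates here
```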

Finally, to show that $M$ satisfies the unfair rearrangement axiom, let $X'$ be obtained from $X$
by an unfair rearrangement, so that $X' = \bar{\bar{X}}$. We first show that $M(\bar{\bar{X}}) \ge M(X)$, or equivalently
$I(\bar{\bar{s}}) \ge I(s)$. Consider the ordered vector $\hat{s}$ of $s$ and similarly permute the rows of $X$ to obtain an
array $X''$ whose aggregate vector is $\hat{s}$. Note that $\bar{\bar{X}}$ is also an unfair rearrangement of $X''$, which
implies for each $j = 1, \ldots, d$ that the column vector $\bar{\bar{x}}_{\cdot j}$ of $\bar{\bar{X}}$ contains the same entries as the
column vector $x''_{\cdot j}$ of $X''$. And since $\bar{\bar{x}}_{\cdot j}$ is ordered from lowest to highest, it follows that

             $\sum_{i=1}^{k} \bar{\bar{x}}_{ij} \le \sum_{i=1}^{k} x''_{ij}$ for $k = 1, \ldots, n$, with equality holding for $k = n$.                (A1)

Multiplying through by $w_j$ and summing across all $j$ yields

             $\sum_{i=1}^{k} \sum_{j=1}^{d} w_j \bar{\bar{x}}_{ij} \le \sum_{i=1}^{k} \sum_{j=1}^{d} w_j x''_{ij}$ for $k = 1, \ldots, n$, with equality holding for $k = n$.

Note that $\bar{\bar{s}}_i = \sum_{j=1}^{d} w_j \bar{\bar{x}}_{ij}$ while $\hat{s}_i = \sum_{j=1}^{d} w_j x''_{ij}$ and hence

             $\sum_{i=1}^{k} \bar{\bar{s}}_i \le \sum_{i=1}^{k} \hat{s}_i$ for $k = 1, \ldots, n$, with equality holding for $k = n$.                (A2)

Since each column vector $\bar{\bar{x}}_{\cdot j}$ of $\bar{\bar{X}}$ is ordered, the aggregate vector $\bar{\bar{s}}$ is ordered. Consequently,
(A2) is equivalent to the statement that $s$ weakly Lorenz dominates $\bar{\bar{s}}$. It follows that $I(\bar{\bar{s}}) \ge I(s)$
for any Lorenz consistent $I$, and so $M(\bar{\bar{X}}) \ge M(X)$.


Now suppose that, in addition, $\bar{\bar{X}}$ is not a permutation of $X$. We need to verify that $M(\bar{\bar{X}}) > M(X)$.
To do this we will show that at least one of the inequalities in (A2) is strict, in which case $s$
would strictly Lorenz dominate $\bar{\bar{s}}$, implying $I(\bar{\bar{s}}) > I(s)$ for Lorenz consistent $I$ and yielding the
desired conclusion. So suppose, by way of contradiction, that all the inequalities in (A2) hold
with equality. Since $\bar{\bar{s}}$ and $\hat{s}$ are ordered vectors, this would imply that $\bar{\bar{s}} = \hat{s}$. Inequality
(A1) applied to $k = 1$ for $j = 1, \ldots, d$ implies that $\bar{\bar{x}}_1 \le x''_1$. Since $\bar{\bar{s}}_1 = s''_1 = \hat{s}_1$, the vector dominance
cannot be strict, and so $\bar{\bar{x}}_1 = x''_1$; both $\bar{\bar{X}}$ and $X''$ have the same first row. Now suppose that the
first $k' - 1$ rows in $\bar{\bar{X}}$ are the same as the first $k' - 1$ rows in $X''$, where $k' = 2, \ldots, n$. We want to
show that row $k'$ is the same for both. By (A1) applied to $k'$ for $j = 1, \ldots, d$ we know that $\bar{\bar{x}}_{k'} \le
x''_{k'}$, since the first $k' - 1$ terms in each summation are identical. And since $\bar{\bar{s}}_{k'} = s''_{k'} = \hat{s}_{k'}$, the vector
dominance cannot be strict, and so $\bar{\bar{x}}_{k'} = x''_{k'}$; both $\bar{\bar{X}}$ and $X''$ have the same first $k'$ rows. This
leads to the conclusion that $\bar{\bar{X}} = X''$ must be a permutation of $X$, contrary to assumption. Thus,
there must be a strict inequality in (A2) for some $k = 1, \ldots, n - 1$, from which it follows that
$M(\bar{\bar{X}}) > M(X)$.
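The partial-sum comparison behind (A2) can be illustrated with a short Python sketch. It assumes equal weights, the Gini coefficient as the Lorenz consistent measure, and a made-up 3 × 2 array; the helper names are hypothetical. The unfair rearrangement is formed by sorting every column from lowest to highest.

```python
from itertools import accumulate

def gini(v):
    """Gini coefficient via rank weights (2i - n - 1) / n^2 on the ordered vector."""
    x = sorted(v)
    n = len(x)
    mu = sum(x) / n
    return sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1)) / (n * n * mu)

def aggregate(X, w):
    return [sum(wj * xij for wj, xij in zip(w, row)) for row in X]

def unfair_rearrangement(X):
    """Sort each column independently so the same person is lowest in every dimension."""
    sorted_cols = [sorted(col) for col in zip(*X)]
    return [list(row) for row in zip(*sorted_cols)]

X = [[1.0, 8.0], [4.0, 2.0], [9.0, 5.0]]   # illustrative data
w = [0.5, 0.5]                              # assumed weights

X_bar = unfair_rearrangement(X)
s_hat = sorted(aggregate(X, w))     # ordered aggregates of the original array
s_bar = aggregate(X_bar, w)         # aggregates after rearrangement (already ordered)

partial_hat = list(accumulate(s_hat))
partial_bar = list(accumulate(s_bar))
print(partial_bar, partial_hat)     # partial sums compared as in (A2)
print(gini(s_bar), gini(s_hat))
```

Because this array is not already ordered in every column, at least one partial-sum inequality is strict and the Gini coefficient of the rearranged aggregates is strictly larger.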

Proof of Theorem 3. Let $M \in \mathcal{L}'$ and pick any distribution $X$ in the domain of $M$. Let $s$ be the aggregate vector associated
with $X$, and notice that $\mu(s) = w_1 \mu_1 + \cdots + w_d \mu_d$. Define $s' = s / \mu(s)$ and $x'_{\cdot j} = x_{\cdot j} / \mu_j$ for $j =
1, \ldots, d$, which share the same population size and same total (or mean). By constant-sum
convexity, $I(s') \le \alpha_1 I(x'_{\cdot 1}) + \cdots + \alpha_d I(x'_{\cdot d})$, where $\alpha_1, \ldots, \alpha_d$ defined by $\alpha_j = w_j \mu_j / \mu(s)$ are
nonnegative and sum to 1; here $s' = \alpha_1 x'_{\cdot 1} + \cdots + \alpha_d x'_{\cdot d}$. By scale invariance of $I$ we then have $I(s) \le \alpha_1 I(x_{\cdot 1}) + \cdots + \alpha_d I(x_{\cdot d})$
and hence $J(X) = S(X) - M(X) \ge 0$. If $x'_{\cdot j}$ are all identical, then so is their convex combination
$s'$ and hence $J(X) = S(X) - M(X) = 0$ in this case.
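A small numerical check of this inequality, sketched in Python for the Gini coefficient (which is convex on vectors sharing the same size and total). The weights, the data, and the function name `decompose` are assumptions made for illustration; the three returned numbers are the share-weighted average of the specific Gini coefficients, the Gini coefficient of the aggregate vector, and their difference.

```python
def gini(v):
    """Gini coefficient via rank weights (2i - n - 1) / n^2 on the ordered vector."""
    x = sorted(v)
    n = len(x)
    mu = sum(x) / n
    return sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1)) / (n * n * mu)

def decompose(X, w):
    """Return (average specific inequality, aggregate inequality, difference)."""
    n = len(X)
    columns = [list(col) for col in zip(*X)]
    mu = [sum(col) / n for col in columns]                  # dimension means
    s = [sum(wj * xij for wj, xij in zip(w, row)) for row in X]
    mu_s = sum(s) / n                                       # equals sum_j w_j * mu_j
    alpha = [wj * mj / mu_s for wj, mj in zip(w, mu)]       # shares summing to 1
    specific = sum(a * gini(col) for a, col in zip(alpha, columns))
    aggregate = gini(s)
    return specific, aggregate, specific - aggregate

w = [0.7, 0.3]                                              # assumed weights
X = [[1.0, 8.0], [4.0, 2.0], [9.0, 5.0]]                    # illustrative data
X_prop = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]               # proportional columns

print(decompose(X, w))       # difference is nonnegative
print(decompose(X_prop, w))  # difference vanishes when normalized columns coincide
```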

Proof of Theorem 5. Pick any distribution $X$. It will be shown that $S_G(\bar{\bar{X}}) - M_G(\bar{\bar{X}}) = 0$, which
along with Theorem 4 will establish the result. Note that the Gini coefficient $G$ can be
defined for any vector $s$ in its domain, with $n$ entries and ordered version $\hat{s}$, by

            $G(s) = \frac{1}{\mu(s)} \sum_{i=1}^{n} a_i \hat{s}_i$, where $a_i = \frac{2i - n - 1}{n^2}$.

Consequently,

            $S_G(\bar{\bar{X}}) = \frac{w_1 \mu_1}{\sum_j w_j \mu_j} G(\bar{\bar{x}}_{\cdot 1}) + \cdots + \frac{w_d \mu_d}{\sum_j w_j \mu_j} G(\bar{\bar{x}}_{\cdot d}) = \frac{w_1}{\sum_j w_j \mu_j} \sum_{i=1}^{n} a_i \bar{\bar{x}}_{i1} + \cdots + \frac{w_d}{\sum_j w_j \mu_j} \sum_{i=1}^{n} a_i \bar{\bar{x}}_{id}$.

Let $\bar{\bar{s}}$ be the aggregate vector associated with $\bar{\bar{X}}$. Then $\bar{\bar{s}}_i = w_1 \bar{\bar{x}}_{i1} + \cdots + w_d \bar{\bar{x}}_{id}$, and $\mu(\bar{\bar{s}}) =
\sum_j w_j \mu_j$, so that $S_G(\bar{\bar{X}}) = \frac{1}{\mu(\bar{\bar{s}})} \sum_{i=1}^{n} a_i \bar{\bar{s}}_i$. Since $\bar{\bar{s}}$ is the aggregate vector for $\bar{\bar{X}}$, it follows that
$M_G(\bar{\bar{X}}) = G(\bar{\bar{s}})$; and since the rows of $\bar{\bar{X}}$ are ordered by vector dominance, $\bar{\bar{s}}$ itself is an ordered
vector, and hence $G(\bar{\bar{s}}) = \frac{1}{\mu(\bar{\bar{s}})} \sum_{i=1}^{n} a_i \bar{\bar{s}}_i$. It follows that $S_G(\bar{\bar{X}}) = M_G(\bar{\bar{X}})$, as desired.
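The algebra in this proof can be mirrored step by step in a few lines of Python. This is a sketch under assumptions: equal weights, a made-up 3 × 2 array, and hypothetical variable names. The left-hand side uses the rank weights $a_i$ column by column, while the right-hand side computes the Gini coefficient of the aggregate vector with the independent pairwise-difference formula, so the equality is not built in.

```python
def gini_pairwise(v):
    """Gini coefficient as the mean absolute difference divided by twice the mean."""
    n = len(v)
    mu = sum(v) / n
    return sum(abs(p - q) for p in v for q in v) / (2 * n * n * mu)

X = [[1.0, 8.0], [4.0, 2.0], [9.0, 5.0]]   # illustrative data
w = [0.5, 0.5]                              # assumed weights
n, d = len(X), len(w)

# Unfair rearrangement: every column sorted from lowest to highest.
cols_sorted = [sorted(col) for col in zip(*X)]
X_bar = [list(row) for row in zip(*cols_sorted)]

a = [(2 * i - n - 1) / n**2 for i in range(1, n + 1)]    # rank weights a_i
mu = [sum(col) / n for col in cols_sorted]               # dimension means
total = sum(wj * mj for wj, mj in zip(w, mu))            # mean of the aggregate vector

# Share-weighted average of the specific Gini coefficients, after the means cancel.
S_G = sum(w[j] * sum(a[i] * X_bar[i][j] for i in range(n)) for j in range(d)) / total

# Gini coefficient of the aggregate vector of the rearranged array.
s_bar = [sum(wj * xij for wj, xij in zip(w, row)) for row in X_bar]
M_G = gini_pairwise(s_bar)

print(S_G, M_G)   # the two values agree up to rounding
```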


Proof of Theorem 6. Let $h \in \mathcal{H}$ be linear with $w \gg 0$, and let $L_M$ be the aggregate Lorenz curve
for a distribution $X$. Equation (8) showed that $L_S(p) = L_{\bar{\bar{s}}}(p)$ and so $L_S(p) + L_J(p) = L_M(p)$ for $p \in [0,1]$
as required by (9). By expression (A1) in the proof of Theorem 2, we know that $L_J(p) =
L_M(p) - L_{\bar{\bar{s}}}(p) \ge 0$ for $p \in [0,1]$. If in addition $\bar{\bar{X}}$ is a permutation of $X$, then $\bar{\bar{s}}$ is a permutation of
$s$ and hence their quantile functions and Lorenz curves are identical, which implies $L_J(p) = 0$
for all $p \in [0,1]$. If $\bar{\bar{X}}$ is not a permutation of $X$, then the proof of Theorem 2 shows that $s$ strictly Lorenz
dominates $\bar{\bar{s}}$ and hence $L_J(p) \ne 0$ for some $p \in [0,1]$.



