Pareto tails and lognormal body of US cities size distribution

Keywords: Lognormal; Lower tail; Upper tail; US cities


Abstract

We consider a distribution consisting of a lower-tail Pareto, a lognormal body, and an upper-tail Pareto to estimate the size distribution of all US cities. This distribution fits the data more accurately than a distribution comprising only a lognormal body and an upper-tail Pareto.


Physica A


Jeff Luckstead a,*, Stephen Devadoss b

a University of Arkansas, Agricultural Economics & Agribusiness, Agriculture Building, Fayetteville, AR 72701, United States
b Texas Tech University, United States


• Our distribution consists of lower tail Pareto, lognormal body, and upper tail Pareto.

• We apply this distribution to all US cities.

• Our distribution fits the data more accurately than lognormal and upper tail Pareto.

Article history: Received 20 April 2016; received in revised form 17 July 2016; available online 31 August 2016.

© 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.

1. Introduction

City size distribution has been studied extensively for several decades [1]. Earlier studies focused on the distribution of cities in the upper tail and found this distribution to be Pareto [2]. Gabaix [3] developed an economic model that predicts Gibrat's law and power-law behavior for upper-tail cities. Along this line, Reed and Jorgensen [4] employed geometric Brownian motion to derive a distribution with a lower-tail reverse Pareto and an upper-tail Pareto, which they termed the double Pareto lognormal. In contrast, Eeckhout [5] formulated an economic model and concluded that the lognormal distribution accurately depicts the US city size distribution.

Based on the above economic models, studies have attempted to combine the lognormal body and the upper-tail Pareto into a unified distribution to analyze the distribution of all cities [6]. Recent studies by Devadoss et al. [7,8] have shown strong evidence that the lower tail of the city size distribution also follows power-law behavior. Consequently, the purpose of this study is to propose a distribution, consistent with the economic theory of city sizes, that models the lower and upper tails with Pareto and the middle range with lognormal, and that endogenously identifies the transition points both from the lower tail to the lognormal and from the lognormal to the upper tail (footnote 1). We denote this distribution as Pareto-tails lognormal (PTLN). Thus, our distribution extends that of Ioannides and Skouras [6] by modeling the reverse Pareto for the lower tail and also delineating the switching point at the lower tail. We econometrically estimate the size distribution of all US cities using these two distributions. Our results show that the two-Pareto tail-lognormal distribution performs better than the lognormal-upper tail Pareto distribution. The next section presents the two-Pareto tail-lognormal distribution and shows how it is related to the lognormal-upper tail Pareto distribution. We also present the log-likelihood functions for these two distributions. Section 3 discusses the data, estimation, and results. The final section concludes the paper.

* Corresponding author: J. Luckstead.

1 Reed [9] models both the lower tail and upper tail with Pareto and the middle range with lognormal, but this model does not allow for parametric estimation of the switching points. Also see Gomez-Deniz and Calderin-Ojeda [10] for an application of the Pareto ArcTan distribution.

0378-4371/© 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.

2. Two-Pareto tail-lognormal distribution

In recent years, new size distributions have mushroomed in the literature. Economists have combined the lognormal and Pareto distributions, which are grounded in economic theory, to formulate composite lognormal-Pareto distributions. These distributions are also grounded in statistics and can be constructed by several different methods. Eliazar and Cohen [11-13] apply Lorenz asymptotic analyses to derive rank distributions to model income and wealth in human societies and other natural phenomena. Reed [14] and Reed and Jorgensen [4] provide evidence of upper-tail Pareto and lower-tail inverse Pareto in city size data and use geometric Brownian motion to derive the double Pareto lognormal distribution. Eliazar and Cohen [15-17] use geometric Langevin dynamics to derive general methods for constructing composite distributions; one commonly employed distribution is the log-Gaussian for the body with a power law in both tails. In a recent study, Cohen and Eliazar [18] employ an entropy-based approach to establish power-law distributions.

Along this line of work, we formulate a distribution consisting of three components - lower tail Pareto, middle range lognormal, and upper tail Pareto - as specified by

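\[
h(x; \mu, \sigma, \alpha_l, \tau_l, \alpha_u, \tau_u) =
\begin{cases}
c(\mu, \sigma, \tau_l, \alpha_l)\, b(\mu, \sigma, \alpha_l, \tau_l, \tau_u, \alpha_u)\, g_l(x; \alpha_l, \tau_l), & x_{\min} \le x < \tau_l, \\
b(\mu, \sigma, \alpha_l, \tau_l, \tau_u, \alpha_u)\, f(x; \mu, \sigma), & \tau_l \le x \le \tau_u, \\
a(\mu, \sigma, \tau_u, \alpha_u)\, b(\mu, \sigma, \alpha_l, \tau_l, \tau_u, \alpha_u)\, g_u(x; \alpha_u, \tau_u), & \tau_u < x < \infty.
\end{cases}
\tag{1}
\]

Below, we define each of the components and the parameters. The lower tail power law, with shape parameter $\alpha_l$, over the support $[x_{\min}, \tau_l)$ is

\[
g_l(x; \alpha_l, \tau_l) = x^{\alpha_l - 1}, \quad x_{\min} \le x < \tau_l, \quad \alpha_l > 0,
\]

where $\tau_l$ delineates the transition point from the lower tail to the lognormal. The lognormal body, with location $\mu$ and scale $\sigma$, over the support $[\tau_l, \tau_u]$ is

\[
f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\log x - \mu)^2}{2\sigma^2}\right), \quad \tau_l \le x \le \tau_u,
\]

where $\tau_u$ pinpoints the switching point from the lognormal to the upper tail. The upper tail power law, with shape parameter $\alpha_u$, over the support $(\tau_u, \infty)$ is

\[
g_u(x; \alpha_u, \tau_u) = x^{-(1 + \alpha_u)}, \quad \tau_u < x, \quad \alpha_u > 0.
\]

The normalization constant $c(\mu, \sigma, \tau_l, \alpha_l) = \dfrac{f(\tau_l; \mu, \sigma)}{g_l(\tau_l; \alpha_l, \tau_l)}$ maintains continuity at the transition point $\tau_l$, and similarly, the normalization constant $a(\mu, \sigma, \tau_u, \alpha_u) = \dfrac{f(\tau_u; \mu, \sigma)}{g_u(\tau_u; \alpha_u, \tau_u)}$ preserves continuity at $\tau_u$. The term

\[
b(\mu, \sigma, \alpha_l, \tau_l, \alpha_u, \tau_u) = \left[ f(\tau_l; \mu, \sigma)\, \frac{\tau_l - \tau_l^{1 - \alpha_l} x_{\min}^{\alpha_l}}{\alpha_l} + \Phi(\tau_u; \mu, \sigma) - \Phi(\tau_l; \mu, \sigma) + f(\tau_u; \mu, \sigma)\, \frac{\tau_u}{\alpha_u} \right]^{-1}
\]

is a normalization constant which ensures that the integration of the PDF $h$ over its entire support yields one, where $\Phi$ is the CDF of the lognormal.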
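As an illustration of how the three pieces fit together, the following minimal Python sketch evaluates the density from the component definitions above. The function names (`lognormal_pdf`, `ptln_constants`, `ptln_pdf`) and the dictionary `theta` are hypothetical and not from the paper; the parameter values are the two-Pareto tail-lognormal estimates reported in Table 1, with the minimum city size set to 1. It is a sketch under these assumptions, not the authors' implementation.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Lognormal density f(x; mu, sigma)."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, mu, sigma):
    """Lognormal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def ptln_constants(mu, sigma, alpha_l, tau_l, alpha_u, tau_u, x_min):
    """Return (c, a, b): the two continuity constants and the normalizer."""
    f_l = lognormal_pdf(tau_l, mu, sigma)
    f_u = lognormal_pdf(tau_u, mu, sigma)
    c = f_l / tau_l ** (alpha_l - 1.0)        # c = f(tau_l) / g_l(tau_l)
    a = f_u / tau_u ** (-(1.0 + alpha_u))     # a = f(tau_u) / g_u(tau_u)
    mass = (f_l * (tau_l - tau_l ** (1.0 - alpha_l) * x_min ** alpha_l) / alpha_l
            + lognormal_cdf(tau_u, mu, sigma) - lognormal_cdf(tau_l, mu, sigma)
            + f_u * tau_u / alpha_u)
    return c, a, 1.0 / mass

def ptln_pdf(x, mu, sigma, alpha_l, tau_l, alpha_u, tau_u, x_min):
    """Three-piece density: lower Pareto, lognormal body, upper Pareto."""
    c, a, b = ptln_constants(mu, sigma, alpha_l, tau_l, alpha_u, tau_u, x_min)
    if x < x_min:
        return 0.0
    if x < tau_l:
        return c * b * x ** (alpha_l - 1.0)
    if x <= tau_u:
        return b * lognormal_pdf(x, mu, sigma)
    return a * b * x ** (-(1.0 + alpha_u))

# Estimates taken from Table 1; x_min = 1 is the smallest city size in the data.
theta = dict(mu=6.71, sigma=2.13, alpha_l=1.27, tau_l=158.0,
             alpha_u=1.34, tau_u=52_500.0, x_min=1.0)

# The density should be continuous at both switching points.
gap_lower = ptln_pdf(158.0 - 1e-7, **theta) - ptln_pdf(158.0, **theta)
gap_upper = ptln_pdf(52_500.0 + 1e-5, **theta) - ptln_pdf(52_500.0, **theta)
print(gap_lower, gap_upper)
```

The constants are recomputed on every call to keep the sketch short; in an estimation loop they would normally be computed once per parameter vector.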

Calderin-Ojeda [19] builds on Ioannides and Skouras's four-parameter continuous spliced model by proposing a three-parameter composite lognormal-Pareto distribution to estimate the distribution for all French communes (footnote 2). In contrast, our distribution explicitly models the lower tail.

By imposing the differentiability at the transition points, we derive the following two restrictions:

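\[
\alpha_l = -\frac{\log \tau_l - \mu}{\sigma^2}, \tag{2}
\]

\[
\alpha_u = \frac{\log \tau_u - \mu}{\sigma^2}, \tag{3}
\]

and reduce the number of parameters to be estimated from six ($\mu$, $\sigma$, $\alpha_l$, $\tau_l$, $\alpha_u$, and $\tau_u$) to four ($\mu$, $\sigma$, $\tau_l$, and $\tau_u$) (footnote 3).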
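As a quick numerical check of the two differentiability restrictions, the sketch below computes the implied Pareto slopes from the location, scale, and switching points. It assumes the lower-tail slope is the negative of (log of the lower switching point minus location) over the squared scale, and the upper-tail slope is (log of the upper switching point minus location) over the squared scale, which is what matching the derivatives of the power-law pieces to the lognormal at the switching points gives. The function name `implied_pareto_slopes` is hypothetical; the inputs are the restricted-model estimates reported in Section 3.

```python
import math

def implied_pareto_slopes(mu, sigma, tau_l, tau_u):
    """Pareto slopes implied by differentiability at the two switching points."""
    alpha_l = -(math.log(tau_l) - mu) / sigma ** 2   # restriction (2)
    alpha_u = (math.log(tau_u) - mu) / sigma ** 2    # restriction (3)
    return alpha_l, alpha_u

# Restricted PTLN estimates reported in Section 3 (rounded as published).
alpha_l, alpha_u = implied_pareto_slopes(mu=7.08, sigma=1.80,
                                         tau_l=7.03, tau_u=52_500.0)
print(round(alpha_l, 2), round(alpha_u, 2))
```

Because the published inputs are rounded to two decimals, the computed slopes only approximately reproduce the reported values of 1.59 and 1.17.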

2 Puente-Ajovin and Ramos [20] study the French, German, Italian, and Spanish communes by estimating the seven-parameter threshold double Pareto Singh-Maddala distribution. This distribution also captures Pareto behavior in the lower and upper tails but has a Singh-Maddala body.

3 It is worth noting that Eliazar and Cohen [21,22] derive a similar Pareto tails-lognormal body distribution using geometric Langevin dynamics and apply it to study socioeconophysical variables such as income and wealth. When we applied this distribution to our city-size data, its performance was comparable to that of our distribution.

For n i.i.d. samples of x, the joint log-likelihood is

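\[
\ell(\mu, \sigma, \tau_l, \tau_u \mid x_1, \ldots, x_n) =
\begin{cases}
\sum_{i=1}^{n_{\tau_l}-1} \log[c] + \sum_{i=1}^{n_{\tau_l}-1} \log[b] + \sum_{i=1}^{n_{\tau_l}-1} \log[g_l(x_i)], & x_{\min} \le x < \tau_l, \\
\sum_{i=n_{\tau_l}}^{n_{\tau_u}} \log[b] + \sum_{i=n_{\tau_l}}^{n_{\tau_u}} \log[f(x_i)], & \tau_l \le x \le \tau_u, \\
\sum_{i=n_{\tau_u}+1}^{n} \log[a] + \sum_{i=n_{\tau_u}+1}^{n} \log[b] + \sum_{i=n_{\tau_u}+1}^{n} \log[g_u(x_i)], & \tau_u < x < \infty,
\end{cases}
\tag{4}
\]

where each row collects the contributions of the observations that fall in the corresponding range.

Since an analytical solution does not exist for this log-likelihood function, we numerically estimate the parameters $\mu$, $\sigma$, $\tau_l$, and $\tau_u$.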
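The log-likelihood can be sketched in Python as a sum of per-observation log-densities, one branch per range. This is a minimal illustration under stated assumptions, not the authors' Matlab code: the function name `ptln_loglik`, the helper names, and the toy sample are hypothetical, and all six parameters are passed explicitly (the unrestricted case). The resulting function could be handed to any numerical optimizer.

```python
import math

def _log_f(x, mu, sigma):
    """Log of the lognormal density."""
    z = (math.log(x) - mu) / sigma
    return -0.5 * z * z - math.log(x * sigma * math.sqrt(2.0 * math.pi))

def _Phi(x, mu, sigma):
    """Lognormal CDF."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def ptln_loglik(data, mu, sigma, alpha_l, tau_l, alpha_u, tau_u, x_min):
    """Joint log-likelihood of i.i.d. city sizes under the three-piece density."""
    log_c = _log_f(tau_l, mu, sigma) - (alpha_l - 1.0) * math.log(tau_l)
    log_a = _log_f(tau_u, mu, sigma) + (1.0 + alpha_u) * math.log(tau_u)
    f_l = math.exp(_log_f(tau_l, mu, sigma))
    f_u = math.exp(_log_f(tau_u, mu, sigma))
    mass = (f_l * (tau_l - tau_l ** (1.0 - alpha_l) * x_min ** alpha_l) / alpha_l
            + _Phi(tau_u, mu, sigma) - _Phi(tau_l, mu, sigma)
            + f_u * tau_u / alpha_u)
    log_b = -math.log(mass)

    total = 0.0
    for x in data:
        if x < tau_l:        # lower-tail Pareto piece
            total += log_c + log_b + (alpha_l - 1.0) * math.log(x)
        elif x <= tau_u:     # lognormal body
            total += log_b + _log_f(x, mu, sigma)
        else:                # upper-tail Pareto piece
            total += log_a + log_b - (1.0 + alpha_u) * math.log(x)
    return total

# Toy sample (made up for illustration) evaluated at the Table 1 estimates.
theta = dict(mu=6.71, sigma=2.13, alpha_l=1.27, tau_l=158.0,
             alpha_u=1.34, tau_u=52_500.0, x_min=1.0)
sample = [50.0, 1_000.0, 1_000_000.0]
print(ptln_loglik(sample, **theta))
```

Looping over observations avoids the need to sort the data first; the three branches correspond to the three rows of the log-likelihood.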

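The CDF of $h$ is

\[
H(x; \mu, \sigma, \tau_l, \tau_u) =
\begin{cases}
\dfrac{c\,b}{\alpha_l}\left(x^{\alpha_l} - x_{\min}^{\alpha_l}\right), & x_{\min} \le x < \tau_l, \\[1ex]
\kappa_1 + \dfrac{b}{2}\,\mathrm{erf}\!\left(\dfrac{\log x - \mu}{2^{1/2}\sigma}\right) - \dfrac{b}{2}\,\mathrm{erf}\!\left(\dfrac{\log \tau_l - \mu}{2^{1/2}\sigma}\right), & \tau_l \le x \le \tau_u, \\[1ex]
\kappa_2 - \dfrac{a\,b}{\alpha_u}\left(x^{-\alpha_u} - \tau_u^{-\alpha_u}\right), & \tau_u < x < \infty,
\end{cases}
\]

where $\kappa_1 = b c \int_{x_{\min}}^{\tau_l} g_l(x; \alpha_l, \tau_l)\,dx$ and $\kappa_2 = \kappa_1 + b \int_{\tau_l}^{\tau_u} f(x; \mu, \sigma)\,dx$. Using the estimated parameters and the quantile function, we can predict city sizes as

\[
x =
\begin{cases}
\left[\dfrac{\alpha_l\, H(x)}{b\,c} + x_{\min}^{\alpha_l}\right]^{1/\alpha_l}, & H(x) \in\, ]0, \kappa_1[, \\[1ex]
\exp\left\{\mu + 2^{1/2}\sigma\, \mathrm{erf}^{-1}\!\left[\dfrac{2}{b}\left(H(x) - \kappa_1\right) + \mathrm{erf}\!\left(\dfrac{\log \tau_l - \mu}{2^{1/2}\sigma}\right)\right]\right\}, & H(x) \in [\kappa_1, \kappa_2], \\[1ex]
\left[\tau_u^{-\alpha_u} - \dfrac{\alpha_u \left(H(x) - \kappa_2\right)}{b\, f(\tau_u)\, \tau_u^{1+\alpha_u}}\right]^{-1/\alpha_u}, & H(x) \in\, ]\kappa_2, 1].
\end{cases}
\]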
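A minimal Python sketch of the CDF and its inverse follows, useful for generating predicted city sizes from probabilities. The class name `PTLN` and its attribute names are hypothetical. For the lognormal body the sketch inverts the normal CDF of log size via the standard library's `statistics.NormalDist`, which is algebraically equivalent to the inverse-error-function form; the parameter values are the Table 1 estimates with a minimum size of 1.

```python
import math
from statistics import NormalDist

class PTLN:
    """Sketch of the three-piece CDF and quantile function (illustrative names)."""

    def __init__(self, mu, sigma, alpha_l, tau_l, alpha_u, tau_u, x_min=1.0):
        self.alpha_l, self.tau_l = alpha_l, tau_l
        self.alpha_u, self.tau_u = alpha_u, tau_u
        self.x_min = x_min
        self.normal = NormalDist(mu, sigma)      # distribution of log x in the body
        f_l = self.normal.pdf(math.log(tau_l)) / tau_l   # lognormal pdf at tau_l
        f_u = self.normal.pdf(math.log(tau_u)) / tau_u   # lognormal pdf at tau_u
        self.c = f_l / tau_l ** (alpha_l - 1.0)
        self.a = f_u * tau_u ** (1.0 + alpha_u)
        lower = self.c * (tau_l ** alpha_l - x_min ** alpha_l) / alpha_l
        self.Phi_l = self.normal.cdf(math.log(tau_l))
        body = self.normal.cdf(math.log(tau_u)) - self.Phi_l
        upper = f_u * tau_u / alpha_u
        self.b = 1.0 / (lower + body + upper)
        self.k1 = self.b * lower                 # probability mass of the lower tail
        self.k2 = self.k1 + self.b * body        # mass of lower tail plus body

    def cdf(self, x):
        if x <= self.x_min:
            return 0.0
        if x < self.tau_l:
            return (self.c * self.b / self.alpha_l
                    * (x ** self.alpha_l - self.x_min ** self.alpha_l))
        if x <= self.tau_u:
            return self.k1 + self.b * (self.normal.cdf(math.log(x)) - self.Phi_l)
        return (self.k2 - self.a * self.b / self.alpha_u
                * (x ** (-self.alpha_u) - self.tau_u ** (-self.alpha_u)))

    def quantile(self, p):
        """Inverse CDF for a probability p strictly between 0 and 1."""
        if p < self.k1:
            base = self.alpha_l * p / (self.b * self.c) + self.x_min ** self.alpha_l
            return base ** (1.0 / self.alpha_l)
        if p <= self.k2:
            q = self.Phi_l + (p - self.k1) / self.b
            return math.exp(self.normal.inv_cdf(q))
        base = (self.tau_u ** (-self.alpha_u)
                - self.alpha_u * (p - self.k2) / (self.a * self.b))
        return base ** (-1.0 / self.alpha_u)

model = PTLN(mu=6.71, sigma=2.13, alpha_l=1.27, tau_l=158.0,
             alpha_u=1.34, tau_u=52_500.0, x_min=1.0)
for p in (0.05, 0.5, 0.99):
    print(p, model.quantile(p))
```

Feeding evenly spaced probabilities through `quantile` is one way to produce the model-implied sizes that a rank-size plot compares against the data.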

Our distribution is more comprehensive and can be simplified into the lognormal-upper tail Pareto of Ioannides and Skouras [6] by setting $\tau_l = x_{\min} = 0$, thereby eliminating the lower-tail Pareto, and by not imposing the differentiability condition at the upper-tail transition point.

3. Estimation and analysis

We collected data for all US cities for 2010 from the Census Bureau [23]. This data set includes all incorporated cities and census designated places (CDPs). CDPs are the statistical counterpart to incorporated cities for all cities, towns, and villages that do not have a municipal government but otherwise qualify as incorporated places (footnote 4). The data set includes 29,241 places, with the smallest city size $x_{\min} = 1$ for several places and a maximum value of 8,175,133 for New York City.

We implement maximum likelihood estimation of Eq. (4) for the two-Pareto tail-lognormal distribution and the lognormal-upper tail distribution, with and without the differentiability conditions. Table 1 presents the parameter estimates obtained without the differentiability conditions. Then, after discussing these results, we report the results of the PTLN distribution with the differentiability restrictions. We utilized the line-search algorithm in the Matlab software for the estimation. We computed the standard errors based on 500 bootstrapped samples. All parameter estimates are highly significant, as indicated by the low standard errors. For the two-Pareto tail-lognormal distribution, the estimate of the lower tail switching point ($\tau_l$) is 158 with a Pareto slope parameter ($\alpha_l$) estimate of 1.27; the location ($\mu$) and scale ($\sigma$) estimates for the lognormal body are 6.71 and 2.13, respectively; and the upper tail cutoff ($\tau_u$) is 52,500 with a Pareto slope parameter ($\alpha_u$) estimate of 1.34.

When $\tau_l = 1$, the two-Pareto tail-lognormal distribution turns into the lognormal-upper tail distribution: the lower tail becomes part of the lognormal, and $\tau_l$ and $\alpha_l$ are no longer part of the distribution. The estimated location for the lognormal body is larger at 7.08, while the scale is smaller at 1.80. However, the upper tail transition point $\tau_u$ is the same for both distributions, while the Pareto slope parameter is lower at 1.17. It is interesting that, even though the upper tail switching point is the same, the other common parameter estimates for the two distributions differ in value. This shows that, even though only 3582 cities with 311,769 people constitute the lower tail, explicitly modeling this tail allows more flexibility for the lognormal body for mid-range cities, which, in turn, significantly impacts the upper tail Pareto parameter in the two-Pareto tail-lognormal distribution.

4 Following the 2000 Census, the criteria were changed such that CDPs are areas that are "locally recognized and identified by name" [24]. CDPs are defined according to administrative definitions and do not follow economic criteria.

Table 1
Parameter estimates for the two distributions.

Parameter     Est. value    Std. error (a)    Model selection criteria

Two-Pareto tail-lognormal distribution
$\tau_l$      158.00        9.03              L(·|·) = -266,299.98
$\alpha_l$    1.27          2.89 × 10^-2      AIC = 532,611.96
$\mu$         6.71          4.06 × 10^-2      BIC = 532,661.66
$\sigma$      2.13          3.05 × 10^-2      BF < 0.0001
$\tau_u$      52,500.00     1.91 × 10^-2      V = 9.70
$\alpha_u$    1.34          3.86 × 10^-2

Lognormal-upper tail Pareto distribution
$\mu$         7.08          4.75 × 10^-6      L(·|·) = -266,509.02
$\sigma$      1.80          5.50 × 10^-6      AIC = 533,026.04
$\tau_u$      52,500.00     14.78             BIC = 533,059.17
$\alpha_u$    1.17          5.08 × 10^-5

(a) Standard errors are calculated from 500 bootstrapped samples.

Fig. 1. Rank-size plot for descending city size.

For the 2000 Census, Ioannides and Skouras [6] report estimates of the lognormal location of 7.26 and scale ($\sigma$) of 1.73, an upper tail cutoff of $\tau_u$ = 60,290, and a Pareto slope parameter of $\alpha_u$ = 1.25 for the lognormal-upper tail distribution. Comparison of our upper tail cutoff point estimate for 2010 Census data to that of Ioannides and Skouras [6] for 2000 Census data reveals that a larger portion of cities and population was in the upper tail for the 2010 Census (723 places, 2.5% of all places, or 51% of population) relative to the 2000 Census (501 places, 2% of all cities, or 46% of population).

Next, we provide graphical and statistical evidence that the two-Pareto tail-lognormal distribution estimates the lower tail, body, and upper tail more accurately than the lognormal-upper tail distribution. Figs. 1 and 2 graph the rank-size plots for descending and ascending city size in log-log scale, respectively. The descending (ascending) rank-size plot accentuates the upper (lower) tail and helps to visualize the goodness of fit more clearly. In both figures, the solid black line represents the city size data, while the blue dashed line and green dot-dashed line depict the data generated (using Eq. (5) and the parameter estimates given in Table 1) for the two-Pareto tail-lognormal and lognormal-upper tail distributions, respectively. These figures show that the two-Pareto tail-lognormal distribution predicts the actual data for both the upper and lower tail more accurately than the lognormal-upper tail distribution. Fig. 3 plots the lognormal body by stripping out the lower tail for city sizes below 158.00 and the upper tail for city sizes above 52,500.00. This plot shows that the two-Pareto tail-lognormal also fits the lognormal body better than the lognormal-upper tail distribution. This indicates that the smaller location and larger scale of the two-Pareto tail-lognormal more accurately represent the lognormal body for the city size data.

More formally, the last column in Table 1 provides model selection criteria (AIC and BIC) and statistical tests (Bayes factor and Vuong's test) to identify which model fits the data better. Using the log-likelihood $L$ and the number of estimated parameters $k$, we consider the Akaike information criterion (AIC = $2k - 2L$) and the Bayesian information criterion (BIC = $k \log(n) - 2L$) for model selection. Both the AIC and BIC trade off accuracy of the proposed distribution against the number of parameters in the model; by design, the lowest value indicates the favored distribution. We are therefore able to judge whether the additional parameters and added precision of the two-Pareto tail-lognormal distribution are preferred to the simpler lognormal-upper tail distribution. As shown in the table, both the AIC and BIC are smaller for the two-Pareto tail-lognormal distribution, indicating it is the preferred distribution.

Fig. 2. Rank-size plot for ascending city size. (Legend: Two Tails-Lognormal Distribution, Lognormal-Upper Tail Distribution, Data; horizontal axis: Log City Size.)

Fig. 3. Rank-size plot for lognormal body for descending city size. (Legend: Two Tails-Lognormal Distribution, Lognormal-Upper Tail Distribution, Data; horizontal axis: Log City Size.)
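The reported criteria can be reproduced from the log-likelihoods in Table 1 with a few lines of Python. The parameter counts used here (six for the two-Pareto tail-lognormal and four for the lognormal-upper tail Pareto) are read off the number of estimated parameters listed in Table 1, and n = 29,241 is the number of places in the data; the function names are hypothetical.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k log(n) - 2L."""
    return k * math.log(n) - 2 * loglik

n_places = 29_241
# (log-likelihood, number of estimated parameters) as listed in Table 1.
models = {
    "two-Pareto tail-lognormal": (-266_299.98, 6),
    "lognormal-upper tail Pareto": (-266_509.02, 4),
}

for name, (loglik, k) in models.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n_places), 2))
```

The model with the smaller value of each criterion is the preferred one.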

Since the two distributions are not nested, we apply the Bayes factor (BF) with Jeffreys' scale and Vuong's closeness test to compare model performance. The Bayes factor, a Bayesian counterpart to the likelihood ratio test, can be approximated using the BIC as $\mathrm{BF} \approx \exp\left(\tfrac{1}{2}\left(\mathrm{BIC}_u - \mathrm{BIC}_r\right)\right)$ and interpreted using Jeffreys' scale (footnote 5) [25]. Given the small value of BF < 0.0001, there is strong evidence in favor of the two-Pareto tail-lognormal distribution. Vuong's test statistic V = 9.70 also indicates that our distribution fits the data better than the lognormal-upper tail distribution.
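To illustrate the BIC-based approximation, the short sketch below evaluates the exponential of one half of the BIC difference using the BIC values from Table 1. Treating the two-Pareto tail-lognormal as the unrestricted model and the lognormal-upper tail Pareto as the restricted one is an assumption about the subscripts, and the function name is hypothetical.

```python
import math

def bayes_factor_from_bic(bic_unrestricted, bic_restricted):
    """Approximate Bayes factor: exp(0.5 * (BIC_u - BIC_r))."""
    return math.exp(0.5 * (bic_unrestricted - bic_restricted))

# BIC values from Table 1 (assumed mapping: u = two-Pareto tail-lognormal,
# r = lognormal-upper tail Pareto).
bf = bayes_factor_from_bic(532_661.66, 533_059.17)
print(bf < 0.0001)
```

A BIC gap of several hundred points drives the approximate Bayes factor far below the 0.0001 threshold reported in Table 1.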

The upper tail Pareto estimates in both distributions are significantly different from one, given that they are estimated with a high degree of precision (very small standard errors), and thus do not strictly adhere to Zipf's law.

We also estimated the PTLN model by imposing the differentiability conditions given in (2) and (3). The estimated value of the lower tail switching point ($\tau_l$) is 7.03; the location ($\mu$) and scale ($\sigma$) estimates for the lognormal body are 7.08 and 1.80, respectively; and the upper tail cutoff ($\tau_u$) is 52,500. Using the differentiability restrictions, we computed a reverse-Pareto slope parameter ($\alpha_l$) estimate of 1.59 and an upper-tail Pareto slope parameter ($\alpha_u$) estimate of 1.17. These estimates are very similar to those of the lognormal-upper tail Pareto distribution, and the Vuong's closeness test statistic of 1.19 indicates the models are statistically equivalent. Furthermore, a comparison of Vuong's test statistic V = 9.66 for PTLN versus PTLN with differentiability restrictions and V = 9.70 for PTLN versus the lognormal-upper tail distribution indicates that the PTLN distribution is more flexible and not overparameterized.

5 According to Jeffreys' scale, there is strong support for the two-Pareto tail-lognormal distribution if BF < 1/10, moderate evidence if 1/10 < BF < 1/3, and weak evidence if 1/3 < BF < 1.

4. Conclusion

We propose a distribution that models the lower tail Pareto, lognormal body, and upper tail Pareto of city size data. The proposed distribution nests the lognormal body and upper tail Pareto distribution proposed by Ioannides and Skouras [6]. Based on the graphical evidence of the rank-size plots and the statistical support provided by the AIC, BIC, likelihood-ratio test, Bayes factor, and Vuong's test, our results provide strong evidence that the more flexible two-Pareto tail-lognormal distribution provides a better fit of the US city size distribution and is the statistically preferred model. Because the two-Pareto tail-lognormal distribution explicitly models the truncation point for the lower tail Pareto, there is more flexibility in estimating the location and scale parameters for the lognormal body, which ultimately allows for more accurate estimation of the upper tail Pareto parameter.


References

[1] J. Luckstead, S. Devadoss, A comparison of city size distributions for China and India from 1950 to 2010, Econom. Lett. 124 (2) (2014) 290-295.

[2] P. Krugman, The Self-Organizing Economy, Blackwell Publishers Cambridge, Massachusetts, 1996.

[3] X. Gabaix, Zipf's law for cities: An explanation, Quart. J. Econ. 114 (3) (1999) 739-767.

[4] W.J. Reed, M. Jorgensen, The double Pareto-lognormal distribution - a new parametric model for size distributions, Comm. Statist. Theory Methods 33 (8) (2004) 1733-1753.

[5] J. Eeckhout, Gibrat's law for (All) cities, Amer. Econ. Rev. 94 (5) (2004) 1429-1451.

[6] Y. Ioannides, S. Skouras, US city size distribution: Robustly pareto, but only in the tail, J. Urban Econ. 73 (1) (2013) 18-29.

[7] S. Devadoss, J. Luckstead, D. Danforth, S. Akhundjanov, The power law distribution for lower tail cities in India, Physica A 442 (2016) 193-196.

[8] S. Devadoss, J. Luckstead, Size distribution of US lower-tail cities, Econom. Lett. 135 (1) (2015) 12-14.

[9] W.J. Reed, The Pareto, Zipf and other power laws, Econom. Lett. 74 (1) (2001) 15-19.

[10] E. Gomez-Deniz, E. Calderin-Ojeda, On the use of the Pareto ArcTan distribution for describing city size in Australia and New Zealand, Physica A 436 (2015) 821-832.

[11] I.I. Eliazar, M.H. Cohen, Power-law connections: From zipf to heaps and beyond, Ann. Physics 332 (2012) 56-74.

[12] I. Eliazar, M.H. Cohen, Hierarchical socioeconomic fractality: The rich, the poor, and the middle-class, Physica A 402 (2014) 30-40.

[13] I.I. Eliazar, M.H. Cohen, Rank distributions: A panoramic macroscopic outlook, Phys. Rev. E 89 (1) (2014) 012111.

[14] W.J. Reed, The Pareto law of incomes - an explanation and an extension, Physica A 319 (2003) 469-486.

[15] I. Eliazar, M.H. Cohen, A Langevin approach to the Log-Gauss-Pareto composite statistical structure, Physica A 391 (22) (2012) 5598-5610.

[16] I.I. Eliazar, M.H. Cohen, Topography of chance, Phys. Rev. E 88 (5) (2013) 052104.

[17] M.H. Cohen, I.I. Eliazar, Econophysical visualization of Adam Smith's invisible hand, Physica A 392 (4) (2013) 813-823.

[18] I. Eliazar, Power-law and exponential rank distributions: A panoramic Gibbsian perspective, Ann. Physics 355 (2015) 322-337.

[19] Enrique Calderin-Ojeda, The distribution of all French communes: A composite parametric approach, Physica A 450 (2016) 385-394.

[20] M. Puente-Ajovin, A. Ramos, On the parametric description of the French, German, Italian and Spanish city size distributions, Ann. Reg. Sci. 54 (2) (2015) 489-509.

[21] I.I. Eliazar, M.H. Cohen, On the physical interpretation of statistical data from black-box systems, Physica A 392 (13) (2013) 2924-2939.

[22] I.I. Eliazar, M.H. Cohen, Econophysical anchoring of unimodal power-law distributions, J. Phys. A 46 (36) (2013) 365001.

[23] US Census Bureau. Population estimates: Historical data, 2014.

[24] US Census Bureau. Census designated place (CDP) program for the 2010 census - final criteria. Bureau of the Census, Commerce, Volume 73, Number 30, 2008.

[25] R.E. Kass, A.E. Raftery, Bayes factors, J. Amer. Statist. Assoc. 90 (430) (1995) 773-795.