
Physica A


The power law distribution for lower tail cities in India

Stephen Devadoss a, Jeff Luckstead b,*, Diana Danforth b, Sherzod Akhundjanov c

a University of Idaho, United States b University of Arkansas, United States c Washington State University, United States

highlights

• India is a predominantly rural country with numerous villages.

• Power-law behavior of small Indian cities.

• Lower-tail Indian cities follow the reverse-Pareto distribution.

article info abstract

The city size distribution for lower tail cities has received scant attention because only a small portion of the population lives in rural villages, particularly in developed countries, and data are not readily available for small cities. However, in developing countries much of the population inhabits rural areas. The purpose of this study is to test whether a power law holds for small cities in India by using the most recent and comprehensive Indian census data for the year 2011. Our results show that lower tail cities in India do exhibit a power law.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Article history: Received 26 February 2015 Received in revised form 12 May 2015 Available online 16 September 2015

Keywords: India; Lower tail cities; Size distribution; Power law

1. Introduction

The city size distribution literature has extensively studied upper tail cities because much of the population lives in urban areas and data for large cities are readily available. Many studies have shown that Zipf's law for upper tail cities is a regularly observed phenomenon. For example, Krugman [1], Gabaix [2], and Ioannides and Overman [3] conclude that the distribution of large cities in the United States is stable and Zipf's law fits upper tail cities with statistical regularity.1

In contrast, the city size distribution for lower-tail cities has received scant attention because only a small portion of the population lives in rural villages, particularly in developed countries, and data are not readily available for small cities.2 However, in developing countries much of the population resides in rural areas. For example, in India about 69 percent of the population lives in rural areas [8]. Recent studies by Gangopadhyay and Basu [9] and Luckstead and Devadoss [6] found power-law behavior of upper-tail Indian cities. However, because India is predominantly a rural country with numerous small and medium villages, it is worth examining the size distribution of the lower tail cities for developing countries such as India. Thus, the purpose of this study is to investigate whether a power law holds for small cities in India by using the most recent and comprehensive Indian census data for the year 2011.

* Corresponding author.

E-mail addresses: devadoss@uidaho.edu (S. Devadoss), jluckste@uark.edu (J. Luckstead), ddanfort@uark.edu (D. Danforth), sherzod.akhundjanov@email.wsu.edu (S. Akhundjanov).

1 However, several studies have shown that Zipf's law fails to hold for upper tail cities, particularly for several developing countries [4-6].

2 One exception is Reed [7], who studied the lower tail distribution for various states in the United States and Spain and showed that the degree of linearity in the lower tail is stronger than that of the upper tail.

http://dx.doi.org/10.1016/j.physa.2015.09.016

2. Methodology

Consider $m$ cities in the lower tail and rank them in ascending order such that $x_1$ is the smallest city and $x_m$ is the largest city. Denote $F(x)$ as the CDF of $x$; then $i/m \approx F(x_i)$.3 Based on Reed [11] (see the Appendix for more detail), the PDF and CDF of a reverse Pareto random variable $x$ are

$$f(x) = \frac{\beta x^{\beta-1}}{x_m^{\beta}} \tag{1}$$

$$F(x) = \left(\frac{x}{x_m}\right)^{\beta} \tag{2}$$

where the location parameter is $x_m = \max(x)$ such that $x_m \geq x_i > 0$ and the shape parameter is $\beta > 0$. For a reverse Pareto law of small cities, $\log(\Pr(X < x))$ is linearly related to $\log(x)$ with a positive slope $\beta$ [7]. Substitution of the CDF in (2) into $i/m \approx F(x_i)$ results in $i/m \approx (x_i/x_m)^{\beta}$. If $\beta = 1$, then $i/m \approx x_i/x_m$, and the rank of a city is proportional to its size.
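The relation between rank and the CDF can be checked numerically. The following is a minimal sketch, not the authors' code: it implements Eqs. (1) and (2) and compares the rank fraction $i/m$ with $F(x_i)$ on simulated data. The parameter values, the seed, and the function names are assumptions chosen only for illustration.

```python
import random

def reverse_pareto_pdf(x, beta, x_m):
    """Eq. (1): f(x) = beta * x**(beta - 1) / x_m**beta on 0 < x <= x_m, else 0."""
    if not 0 < x <= x_m:
        return 0.0
    return beta * x ** (beta - 1) / x_m ** beta

def reverse_pareto_cdf(x, beta, x_m):
    """Eq. (2): F(x) = (x / x_m)**beta, clamped to [0, 1] outside the support."""
    if x <= 0:
        return 0.0
    return min(x / x_m, 1.0) ** beta

# Hypothetical parameters, chosen only for illustration.
beta, x_m, m = 1.0, 90.0, 10_000
rng = random.Random(0)

# Inverse-CDF sampling: if U is uniform on (0, 1], x_m * U**(1/beta) has CDF (2).
sizes = sorted(x_m * (1.0 - rng.random()) ** (1.0 / beta) for _ in range(m))

# Compare the rank fraction i/m with F(x_i) for ascending rank i = 1..m.
max_gap = max(abs((i + 1) / m - reverse_pareto_cdf(x, beta, x_m))
              for i, x in enumerate(sizes))
print(f"largest |i/m - F(x_i)| = {max_gap:.4f}")
```

With $\beta = 1$ the simulated sizes are uniform on $(0, x_m]$, which is the case in which rank is proportional to size.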

For an independently and identically distributed sample of m lower tail cities, the log likelihood function of the reverse Pareto distribution is

$$L(\beta, x_m \mid x_1, \ldots, x_{m-1}) = (m-1)\log(\beta) - \beta(m-1)\log(x_m) + (\beta-1)\sum_{i=1}^{m-1}\log(x_i). \tag{3}$$

Maximization of this function with respect to $\beta$ yields the first-order condition

$$\frac{\partial L}{\partial \beta} = \frac{(m-1)}{\beta} - (m-1)\log(x_m) + \sum_{i=1}^{m-1}\log(x_i) = 0. \tag{4}$$

The above equation can be solved to obtain

$$\beta = \frac{(m-1)}{(m-1)\log(x_m) - \sum_{i=1}^{m-1}\log(x_i)}.$$
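The closed-form solution is straightforward to compute. The sketch below is an illustration rather than the authors' implementation: it takes $x_m$ as the sample maximum and applies the formula to the remaining $m - 1$ observations. The standard error is an added assumption: it uses the asymptotic value $\beta/\sqrt{m-1}$ implied by the second derivative of (3), which is not necessarily how the standard errors reported in the paper were obtained. The simulated sample and the function name `fit_reverse_pareto` are hypothetical.

```python
import math
import random

def fit_reverse_pareto(sizes):
    """Closed-form MLE: beta = (m-1) / ((m-1) log x_m - sum_{i<m} log x_i)."""
    xs = sorted(sizes)
    if len(xs) < 2 or xs[0] <= 0:
        raise ValueError("need at least two strictly positive city sizes")
    x_m = xs[-1]          # location parameter: largest city in the lower tail
    rest = xs[:-1]        # the remaining m - 1 observations
    n = len(rest)
    denom = n * math.log(x_m) - sum(math.log(x) for x in rest)
    if denom <= 0:
        raise ValueError("all sizes equal x_m; beta is not identified")
    beta_hat = n / denom
    # Assumption: asymptotic standard error from -d2L/dbeta2 = (m-1)/beta**2.
    std_err = beta_hat / math.sqrt(n)
    return beta_hat, x_m, std_err

# Hypothetical check on simulated data with a known shape parameter.
rng = random.Random(1)
true_beta, true_x_m = 0.8, 90.0
sample = [true_x_m * (1.0 - rng.random()) ** (1.0 / true_beta)
          for _ in range(20_000)]
beta_hat, x_m_hat, std_err = fit_reverse_pareto(sample)
print(f"beta_hat = {beta_hat:.4f} (std. error {std_err:.4f}), x_m = {x_m_hat:.2f}")
```

On a simulated sample of this size the estimate should land close to the true shape parameter, which gives a quick sanity check before applying the function to census data.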

In the next section, we estimate the parameter $\beta$ for lower tail Indian cities.

3. Data and results

The data for Indian cities for the 2011 census were collected from the Census of India [8]. According to this census, cities are classified as villages based on the following criteria.4 All places with (1) no municipality, corporation, or cantonment board, (2) a population of less than 5000, (3) at least 25% of the male workers employed in agricultural occupations, and (4) a population density of less than 400 per square km are classified as villages.

To determine the cutoff of the lower tail cities, we generate the histogram of the log of city sizes of the full sample (Fig. 1). Based on the inflection point of the lower tail of this figure, we can ascertain that the log of $x_m$, i.e., the largest city in the lower tail, is about 4.5, which translates into a population of about 90. For this truncation point, the sample size is 37,153 with a mean city size of 44.99 and a standard deviation of 26.33. In addition, to provide robust estimates, we consider different truncation points for $\log(x_m)$ ranging from 4.09 to 4.79, which translate into populations of 60-120. The corresponding sample size for this range of $\log(x_m)$ is 24,849-49,826.
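The conversion between log cutoffs and populations, and the truncation itself, can be sketched as follows. This is an illustration under assumptions: the cutoff is treated as inclusive, which the text does not state, and the `toy` population list and the `lower_tail` helper are hypothetical, not census data or the authors' code.

```python
import math
import statistics

def lower_tail(populations, log_cutoff):
    """Keep cities whose log population does not exceed log_cutoff.

    Assumption: the cutoff is inclusive, so the largest retained city plays
    the role of x_m.
    """
    return [p for p in populations if p > 0 and math.log(p) <= log_cutoff]

# Log cutoffs quoted in the text, converted back to population counts.
for log_xm in (4.09, 4.5, 4.79):
    print(f"log(x_m) = {log_xm:.2f} -> population of about {round(math.exp(log_xm))}")

# Hypothetical toy sample, only to show the truncation and summary statistics.
toy = [12, 35, 48, 77, 90, 150, 4200]
tail = lower_tail(toy, math.log(90))
print(len(tail), statistics.mean(tail), round(statistics.stdev(tail), 2))
```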

Table 1 presents the estimated values of $\beta$ and their standard errors for various sample sizes of lower tail cities. The estimates range from 0.99384 to 1.00385 and, beyond the truncation point of 80, rise slightly as the sample size increases. For the truncation point $x_m = 110$, the estimated value of $\beta$ equals 1.00000, and the estimate for $x_m = 100$ (0.99955) is nearly identical. These results show that the estimates of $\beta$ are very close to 1, indicating lower-tail power law behavior. Our results are robust in that the estimated values of $\beta$ are close to 1 for various sample sizes corresponding to different truncation points of $x_m$. Since the standard errors are very small, the $\beta$ estimates for the various truncation points are highly significant.

3 Also see Stanley et al. [10].

4 While our paper considers cities based on the administrative definition, Rozenfeld et al. [12] find that cities based on a geographical notion can generate Zipf's law results for a larger sample with a cutoff city size of 12,000.

Fig. 1. Histogram of log of city sizes (horizontal axis: log of city size).

Table 1

Lower tail Pareto estimation.

| Truncation point, log(x_m) | Sample size | β (std. error) |
|---|---|---|
| log(60) = 4.0943 | 24,849 | 0.99525 (0.00634) |
| log(70) = 4.2485 | 28,911 | 0.99437 (0.00589) |
| log(80) = 4.3820 | 32,985 | 0.99384 (0.00551) |
| log(90) = 4.4998 | 37,153 | 0.99607 (0.00520) |
| log(100) = 4.6052 | 41,369 | 0.99955 (0.00494) |
| log(110) = 4.7005 | 45,545 | 1.00000 (0.00471) |
| log(120) = 4.7875 | 49,826 | 1.00385 (0.00452) |
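How close the estimates in Table 1 are to 1 can be quantified with a simple z-ratio. The sketch below is an added illustration and not part of the original analysis: it takes the estimates and standard errors exactly as reported in Table 1 and, assuming approximate normality of the estimator, computes $(\beta - 1)/\text{s.e.}$ and a 95% interval for each truncation point.

```python
import math

# (x_m, sample size, beta estimate, std. error), copied from Table 1.
table1 = [
    (60, 24849, 0.99525, 0.00634),
    (70, 28911, 0.99437, 0.00589),
    (80, 32985, 0.99384, 0.00551),
    (90, 37153, 0.99607, 0.00520),
    (100, 41369, 0.99955, 0.00494),
    (110, 45545, 1.00000, 0.00471),
    (120, 49826, 1.00385, 0.00452),
]

def z_for_unit_slope(beta_hat, std_err):
    """z-ratio for the hypothesis beta = 1 (normal approximation assumed)."""
    return (beta_hat - 1.0) / std_err

z_values = []
for x_m, n, beta_hat, std_err in table1:
    z = z_for_unit_slope(beta_hat, std_err)
    z_values.append(z)
    lower, upper = beta_hat - 1.96 * std_err, beta_hat + 1.96 * std_err
    print(f"x_m = {x_m:3d}, log(x_m) = {math.log(x_m):.4f}, n = {n}: "
          f"z = {z:+.2f}, 95% interval = [{lower:.4f}, {upper:.4f}]")
```

Under this approximation every ratio is smaller than 1.96 in absolute value, so $\beta = 1$ lies inside each 95% interval, which is consistent with the statement that the estimates are very close to 1.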

Recent studies have found strong power law behavior in upper-tail Indian cities. Gangopadhyay and Basu [9], using 2001 Indian Census data, demonstrated the upper-tail power law behavior with shape parameter estimates of 2.03 and 1.88 for the truncation points of 203,380 and 10,000, respectively. Also, Luckstead and Devadoss [6] showed that upper-tail cities exhibit power law behavior for 2010 Indian data with a shape parameter estimate of 1.16 for the truncation point of 746,000. In contrast, our study finds a reverse rank-size property for the lower tail cities. Our findings corroborate the results of Reed [7], who observed the lower tail rank-size property for the cities in the states of West Virginia and California in the United States and Cantabria and Barcelona in Spain.

Fig. 2 plots the actual data and predicted city sizes5 versus rank of the cities in log-log scale for the truncation point of 120 with a sample size of 49,826. This figure illustrates that predicted city sizes replicate the actual data very closely, which clearly highlights the reverse rank-size property in the lower-tail Indian cities.

Since India is a predominantly rural country with a large portion of the population living in small villages, the economic activities in this part of the country are important to the national GDP. Moreover, the results of our paper can be used in conjunction with economic indicators such as government expenditure for rural development and employment to explain how economic factors impact population growth in rural areas. Thus, the information about the trend of the size distribution of rural cities will assist government officials to implement policies designed to benefit the rural population. Therefore, our study of the distribution of lower-tail cities is valuable.

5 We randomly sample with replacement from a grid ranging from the minimum to maximum city size using the parameterized PDF as weights to generate predicted values.
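The sampling procedure described in footnote 5 can be sketched as weighted sampling with replacement from a grid. This is one possible reading of the footnote, not the authors' code: the grid resolution, the seed, the minimum city size of 1, and the function name `predicted_sizes` are assumptions, while the shape parameter, truncation point, and sample size are the Table 1 values for $x_m = 120$.

```python
import random

def predicted_sizes(beta, x_min, x_max, n_draws, n_grid=1000, seed=0):
    """Draw predicted city sizes from a grid, weighted by the reverse Pareto PDF."""
    # Evenly spaced grid from the minimum to the maximum city size.
    grid = [x_min + (x_max - x_min) * k / (n_grid - 1) for k in range(n_grid)]
    # Parameterized PDF (Eq. (1)) evaluated on the grid, used as sampling weights.
    weights = [beta * x ** (beta - 1) / x_max ** beta for x in grid]
    rng = random.Random(seed)
    # random.choices samples with replacement, in proportion to the weights.
    return sorted(rng.choices(grid, weights=weights, k=n_draws))

# Table 1 values for the truncation point of 120; x_min = 1 is an assumption.
predicted = predicted_sizes(beta=1.00385, x_min=1.0, x_max=120.0, n_draws=49_826)
print(predicted[0], predicted[len(predicted) // 2], predicted[-1])
```

Sorting the draws in ascending order gives the predicted size at each rank, which is what a rank-size plot such as Fig. 2 compares with the actual data.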

Fig. 2. Rank-size plot in log-log scale (series: actual data and predicted data; horizontal axis: log city size).

Appendix

The reverse Pareto PDF is based on the double Pareto distribution derived in Reed [11]. The PDF for a double Pareto random variable $x$ is

where $\alpha$ and $\beta$ are the Pareto exponents for the upper and lower tails, respectively, and $x_m$ is the cutoff point. The truncated PDF for the lower tail of the double Pareto distributed system is obtained as
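As a hedged sketch of this truncation step, the derivation below assumes a commonly used parameterization of the double Pareto density with cutoff $x_m$, upper-tail exponent $\alpha$, and lower-tail exponent $\beta$. The parameterization is an assumption and may differ in form from the one used in the paper; under it, dividing the lower branch by the probability mass below $x_m$ recovers the reverse Pareto PDF in Eq. (1).

```latex
% Assumed double Pareto density with cutoff x_m (parameterization is an assumption):
f_{dP}(x) =
\begin{cases}
  \dfrac{\alpha\beta}{\alpha+\beta}\,\dfrac{x^{\beta-1}}{x_m^{\beta}}, & 0 < x \le x_m,\\[2ex]
  \dfrac{\alpha\beta}{\alpha+\beta}\,\dfrac{x_m^{\alpha}}{x^{\alpha+1}}, & x > x_m.
\end{cases}

% Probability mass in the lower tail:
\Pr(X \le x_m)
  = \frac{\alpha\beta}{\alpha+\beta}\int_0^{x_m}\frac{x^{\beta-1}}{x_m^{\beta}}\,dx
  = \frac{\alpha\beta}{\alpha+\beta}\cdot\frac{1}{\beta}
  = \frac{\alpha}{\alpha+\beta}.

% Truncating to the lower tail gives the reverse Pareto PDF of Eq. (1):
f(x) = \frac{f_{dP}(x)}{\Pr(X \le x_m)}
     = \frac{\beta\,x^{\beta-1}}{x_m^{\beta}}, \qquad 0 < x \le x_m.
```

The upper-tail exponent $\alpha$ cancels in the last step, so the lower-tail density depends only on $\beta$ and $x_m$.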

References

[1] P. Krugman, The Self-organizing Economy, Blackwell Publishers, Cambridge, Massachusetts, 1996.

[2] X. Gabaix, Zipf's law for cities: An explanation, Q. J. Econ. 114 (3) (1999) 739-767.

[3] Y.M. Ioannides, H.G. Overman, Zipf's law for cities: An empirical examination, Reg. Sci. Urban Econ. 33 (2) (2003) 127-137.

[4] G. Anderson, Y. Ge, The size distribution of Chinese cities, Reg. Sci. Urban Econ. 35 (6) (2005) 756-776.

[5] K.T. Soo, Zipf's law and urban growth in Malaysia, Urban Stud. 44 (1) (2007) 1-14.

[6] J. Luckstead, S. Devadoss, A comparison of city size distributions for China and India from 1950 to 2010, Econom. Lett. 124 (2) (2014) 290-295.

[7] W.J. Reed, On the rank-size distribution for human settlements, J. Reg. Sci. 42 (1) (2002) 1-17.

[8] Census of India (2014). Population enumeration data (final population). http://www.censusindia.gov.in/2011census/.

[9] K. Gangopadhyay, B. Basu, City size distributions for India and China, Physica A 388 (13) (2009) 2682-2688.

[10] M.H. Stanley, S.V. Buldyrev, S. Havlin, R.N. Mantegna, M.A. Salinger, H.E. Stanley, Zipf plots and the size distribution of firms, Econom. Lett. 49 (4) (1995) 453-457.

[11] W.J. Reed, The Pareto, Zipf and other power laws, Econom. Lett. 74 (1) (2001) 15-19.

[12] H. Rozenfeld, D. Rybski, X. Gabaix, H. Makse, The area and population of cities: New insights from a different perspective on cities, Am. Econ. Rev. 101 (5) (2011) 2205-2225.