Available online at www.sciencedirect.com

ScienceDirect

Transportation Research Procedia 18 (2016) 35-43

www.elsevier.com/locate/procedia

XII Conference on Transport Engineering, CIT 2016, 7-9 June 2016, Valencia, Spain

Exploring the Factors that Impact on Transit Use through an Ordered Probit Model: the Case of Metro of Madrid

Laura Eboli (a), Carmen Forciniti (a,*), Gabriella Mazzulla (a), Francisco Calvo (b)

(a) University of Calabria, Department of Civil Engineering, Ponte Pietro Bucci cubo 46B, 87036 Rende (Cs), Italy

(b) University of Granada, E.T.S.I. Caminos Canales y Puertos, c/Severo Ochoa s/n, 18071 Granada, Spain

Abstract

The configuration of urban areas is the result of a cyclic relationship between land use and the transportation system: changes in transportation system arrangements influence the location of residences and economic activities, just as changes in land use affect transportation system characteristics. In this context, by acting on land use, travel demand can be shifted from individual transportation modes to transit systems. In the literature, many conceptual models have been proposed to describe the complex relationship between land use and travel behaviour. In addition to spatial variation, the variables used to study travel demand show categorical variation.

This work aims to analyse the influence of the categorical variation of variables impacting on transit use. An ordered probit model is proposed for evaluating how transit use depends on variables related to socio-economic characteristics of population, territorial features, accessibility, and transportation system. The study case is the metro network of Madrid (Spain).

The results show a strong influence of population characteristics and land use variables on daily trips made by metro, and highlight the aspects that mainly affect the choice of travelling by metro, providing useful suggestions for shifting people from individual transportation modes to transit systems.

© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of the organizing committee of CIT 2016

Keywords: Transit use; metro; transportation system; land use; ordered probit model.

* Corresponding author. Tel.: +39 0984 494020; fax: +39 0984 496787. E-mail address: carmen.forciniti@unical.it

2352-1465 © 2016 The Authors. Published by Elsevier B.V.

doi:10.1016/j.trpro.2016.12.005

1. Introduction

The configuration of urban areas results from the interactions between land use and the transportation system, which can be represented as a cyclic relationship. Changes in the transportation system can influence users' behaviour and the future location of residences and economic activities. Similarly, changes in land use can modify the arrangement of the transportation system and users' travel behaviour. In particular, an intervention that improves the transport system of an area also improves its accessibility; the area thus becomes more attractive as a trip destination, or as an industrial or residential area, depending on the type of intervention (Eboli et al., 2012).

In the last few years, the decline in railway use has favoured the expansion of road mobility and its infrastructure (Nocera et al., 2012). Users' travel behaviour has become more complex, and users prefer to travel by private car. From the user's perspective, the private car is the most attractive mode of transportation because it is perceived as more comfortable, flexible, and fast than the other modes. In addition, urban space characteristics such as urban sprawl and low-density areas increase the attractiveness of the car as the most convenient transportation mode from every point of view.

The fast increase in the number of trips made by car worsens air pollution and energy consumption. At the same time, it contributes to social problems such as traffic congestion and a decline in quality of life. Consequently, national and international policies on transportation and territorial planning aim to shift travel demand from the private car to public transit systems. In order to attract more users, transport companies can implement interventions that increase service quality and reduce travel time. Research on land use and transport interaction, on the other hand, can focus on identifying land use interventions that could positively influence users' travel behaviour. Several problems can be analysed by modelling the interaction between land use and mobility, such as defining which planning aspects have to be considered when designing neighbourhoods that help to decrease car use. In this context, by acting on land use, travel demand can be shifted from individual transport to transit systems.

In the literature, many researchers propose detailed analyses and conceptual models to explain the complex relationship between land use and travel behaviour. Among the several land use characteristics, the structural characteristics of neighbourhoods have a relevant influence on transportation variables such as trip generation, trip length and modal split (Friedman et al., 1994; Handy et al., 2005; Kitamura et al., 1997). In particular, these studies show that residents of neighbourhoods with higher levels of urban density, land-use mix, transit accessibility, and pedestrian friendliness (among other characteristics) drive less than residents of neighbourhoods with lower levels of these characteristics. However, the authors affirmed that, in order to provide a reliable analysis of the phenomena, the socioeconomic characteristics and travel attitudes of residents should be taken into account. These models have a strong spatial component because the data on land use, population, economic activities, and the transportation system are spatially referenced (Mazzulla and Forciniti, 2012). In addition to spatial variation, the variables used to study travel demand show categorical variation. Examples are variables describing the socio-economic characteristics of the population, such as occupational status or education level, but also variables related to the transportation system, such as available modal choices and number of trips, and accessibility levels. Ordered regression models can be applied to take the categorical variation of variables into account.

The main aim of this paper is to analyse the influence of the categorical variation of variables impacting on transit use. The study case is the metro network of Madrid (Spain), where, in the last few years, the construction of new lines and stations has caused important changes in the location of population and economic activities, and in users' travel behaviour. An ordered probit (OP) model is proposed for evaluating how the use of the metro system depends on variables related to socio-economic characteristics of the population, territorial features, accessibility, and the transportation system.

The rest of the paper is organised as follows. In the next section, the characteristics of the study case are illustrated. Section 3 describes the methodological approach and provides some theoretical remarks about OP models; the same section discusses the proposed model and the choice of the variables, and presents the main results. The paper ends with brief conclusions about the work.

2. Study Case

Madrid has experienced important modifications during the past 50 years, changing from a mononuclear city to a polynuclear metropolis (Monzón and de la Hoz, 2009). The city underwent an intense development process accompanied by the dispersion of urban settlements. Urban sprawl has led to the appearance of the Madrid metropolitan area, made up of the capital city together with a set of smaller neighbouring towns. The Madrid metropolitan area is usually divided into four regions: CBD (Central Business District), Madrid City, Metropolitan Ring (including the bordering urban centres) and Regional (including the urban centres not directly bordering Madrid). The CBD and Madrid City compose the capital city.

The study case covers only the Madrid municipality, and the data were attributed to zones on the basis of the official zoning of the city of Madrid. The population recorded in the city of Madrid in 2014 amounted to 3,166,130 inhabitants. Starting from the official maps and documents related to land use, different land use classes were identified. The more central zones are mainly devoted to residential buildings, which confirms the original nuclear structure of the city. A ring of areas devoted to buildings for economic activities and to infrastructural equipment (such as parks, streets, squares, and pedestrian areas) surrounds the central zone, whereas rural areas are in the northern zone of the city, where a natural reserve is located. Population density follows the same trend; therefore, central zones show higher population density values than the external zones.

In the last 15-20 years, Madrid's population has decreased, whereas the number of inhabitants has been increasing in the surrounding towns. This could be due to many factors, mainly that Madrid city centre is a consolidated historical area surrounded by areas with high growth potential and low land prices. In addition, the expansion of the transit system, as in the case of the metro system, caused differential growth.

Madrid's metro system opened in 1919 with a total length of 3.48 kilometres and 8 stations. It has been in constant development ever since. The most recent extensions made the metro network 293 kilometres long by 2010, and the subway became an interurban system reaching several towns in the metropolitan area. In addition, Madrid's metro system is interconnected with the light rail system, opened in 2007, and with the suburban railways serving short-distance travel to and across the city. In the meantime, demand increased by 58% from 1995 to 2010, reaching over 600 million passengers per year. This important growth in geographical coverage and demand shows the potential impact of Madrid's metro on changes in land use and population settlement. In particular, the extensions of the subway lines beyond Madrid's municipal limits allowed the bordering towns to be directly and quickly connected to Madrid city centre.

Data collected by the Household Travel Survey of Madrid (EDM), conducted in 2004 by the transportation company of the Madrid Region (CRTM, 2006), provide a clear outline of mobility in the metropolitan area. A total of 14,511,397 trips was recorded, including trips made for work and study purposes (systematic mobility) and trips made for other purposes (unsystematic mobility). About 60.0% of trips are made for study or work purposes. An average of 2.60 daily trips per inhabitant was recorded. The number of trips per inhabitant increased by 20.3% compared with the results of the same type of survey conducted in 1996.

With regard to the modal split, 54.7% of trips were made by transit system and 45.3% by private vehicles. The most chosen transit system is the metro (37.3% of users). An increase of about 10% was recorded compared with the 1996 survey. This could be due to the expansion of the metro network. The greatest number of trips concerns the central zone of the city; however, the survey showed an increased number of trips made from the more suburban zones of the city compared with the 1996 survey. The majority of trips made by transit systems are generated from central zones. The entire area showed a decrease in the number of trips made by private vehicles.

3. Ordered Probit Model

3.1. Theoretical framework

Models for evaluating travel demand, which reproduce the interaction between land use and transportation system, have a strong spatial component. In addition to spatial variation, the variables that characterise the interaction between land use and transportation system can be considered as categorical variables and measured with a scale consisting of a set of categories.

Ordered regression models are very convenient for treating the categorical variation of variables; for this reason, they are well suited to analysing land use-transport interaction and the accessibility to the transit system. These models can be divided into two groups, ordered probit and ordered logit, according to the distribution of the statistical errors. For many years, researchers have indicated that the results from the ordered probit and the ordered logit are similar. However, there is no consensus on which model is the best. The authors of this paper have already adopted OP models for analysing other transportation issues, such as service quality of airport transit services (Eboli and Mazzulla, 2009) and road safety (Cardamone et al., 2015, 2016; de Oña et al., 2014).

The OP model was originally developed by McKelvey and Zavoina (1975). In the OP model there is an observed ordinal variable $Y$, which is, in turn, a function of another variable $Y^*$ that is not measured (Borooah, 2001). Specifically, in the ordered model there is a continuous unmeasured latent variable $Y^*$, whose values determine what the observed ordinal variable $Y$ equals. The continuous latent variable $Y^*$ has various threshold points. The value $Y_i$ of the observed variable depends on whether or not $Y_i^*$ has crossed a particular threshold, as shown by formulas (1) to (4).

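$$Y_i = 1 \quad \text{if } Y_i^* \le \kappa_1 \tag{1}$$

$$Y_i = 2 \quad \text{if } \kappa_1 < Y_i^* \le \kappa_2 \tag{2}$$

$$Y_i = j \quad \text{if } \kappa_{j-1} < Y_i^* \le \kappa_j \tag{3}$$

$$Y_i = M \quad \text{if } Y_i^* > \kappa_{M-1} \tag{4}$$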

In the population, the continuous latent variable Y* is equal to (formula 5):

$$Y_i^* = \sum_{k=1}^{K} \beta_k X_{ki} + \varepsilon_i = Z_i + \varepsilon_i \tag{5}$$

where $\varepsilon_i$ is a random disturbance term that is normally distributed. The error term reflects the fact that the variables may not be perfectly measured, and that some relevant variables may not be included in the equation. By means of the OP model we can estimate the expected average value of $Y_i^*$ (formula 6):

$$E(Y_i^*) = Z_i \tag{6}$$

Once we have estimated the $\beta$ coefficients and the $(M-1)$ cutoff terms $\kappa$, we can estimate the probability that $Y$ will have a particular value. The formulas are the following:

$$P(Y_i = 1) = \frac{1}{1 + \exp(Z_i - \kappa_1)} \tag{7}$$

$$P(Y_i = 2) = \frac{1}{1 + \exp(Z_i - \kappa_2)} - \frac{1}{1 + \exp(Z_i - \kappa_1)} \tag{8}$$

$$P(Y_i = M) = 1 - \frac{1}{1 + \exp(Z_i - \kappa_{M-1})} \tag{9}$$

Finally, the OP model can be used to estimate the probability that the unobserved variable Y* falls within the various threshold limits.
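As an illustration of the threshold mechanism in formulas (1) to (5), the following minimal Python sketch computes the probability of each ordered category from a linear predictor and a set of cutoffs. It is only a sketch under stated assumptions: the function names `norm_cdf` and `ordered_probit_probs` are hypothetical, and the sketch uses the standard normal distribution function, which is the link implied by a normally distributed disturbance term, whereas the exp-based expressions in formulas (7) to (9) correspond to a logistic link.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(z, cutoffs):
    """Return [P(Y=1), ..., P(Y=M)] for linear predictor z.

    cutoffs holds the M-1 increasing thresholds kappa_1..kappa_{M-1}.
    Assumption: standard normal disturbance, so
    P(Y <= j) = Phi(kappa_j - z).
    """
    if any(b <= a for a, b in zip(cutoffs, cutoffs[1:])):
        raise ValueError("cutoffs must be strictly increasing")
    # Cumulative probabilities at each threshold, padded with 0 and 1.
    bounds = [0.0] + [norm_cdf(k - z) for k in cutoffs] + [1.0]
    # Category probabilities are differences of consecutive bounds.
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical values: three categories, predictor at zero.
    print(ordered_probit_probs(0.0, [-1.0, 1.0]))
```

With a single cutoff the function reduces to the two-category case, in which the second probability is the chance that the latent variable exceeds the threshold.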

3.2. Proposed model

We propose an OP model for estimating how the use of metro system depends on categorical variables related to socio-demographic characteristics of population, territorial features, accessibility, and transportation system.

The dependent variable Y relates to the use of metro system, according to a two-point numerical scale. The dependent variable is equal to 0 if the number of generated daily trips made by metro system is smaller than 50% of the total generated trips, and equal to 1 if the trips made by metro are more than 50% of the total trips. The dependent variable is defined in the following way:

$$Y_i = 0 \quad \text{if } Y_i^* \le \kappa_1 \tag{10}$$

$$Y_i = 1 \quad \text{if } Y_i^* > \kappa_1 \tag{11}$$

where $i$ indicates the unit of analysis (zone of the city of Madrid) and $\kappa_1$ is the threshold parameter estimated by the model together with the $\beta$ parameters. In order to explain the dependent variable, a set of independent variables was defined on the basis of theoretical reasoning linking some factors to the use of the metro system. For each factor, a priori hypotheses were made about the theoretical link with the dependent variable, and an expected relationship was stated. The selected factors are presented in table 1.
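The coding of the dependent variable described above can be sketched in a few lines of Python. The function name `metro_use_class` is hypothetical, and the treatment of a share of exactly 50%, which the text does not specify, is an assumption (it is assigned to class 0 here).

```python
def metro_use_class(metro_trips, total_trips):
    """Code a zone as 1 if metro trips exceed 50% of generated trips, else 0.

    Assumption: a share of exactly 50% is coded 0, since the text only
    defines the cases "smaller than 50%" and "more than 50%".
    """
    if total_trips <= 0:
        raise ValueError("total_trips must be positive")
    if metro_trips < 0 or metro_trips > total_trips:
        raise ValueError("metro_trips must lie between 0 and total_trips")
    share = metro_trips / total_trips
    return 1 if share > 0.5 else 0


if __name__ == "__main__":
    # Hypothetical zone: 620 of 1,000 generated daily trips use the metro.
    print(metro_use_class(620, 1000))
```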

The first three factors ("Female gender", "Occupational status" and "No car ownership") relate to socio-economic characteristics of the population. "Land use prevalently residential" and "Predominance of services among economic activities" are factors regarding land use characteristics. Lastly, "Accessibility" is a factor that measures the ease of access to all the other zones, and "Availability of metro stations" refers to metro system characteristics. The factors were defined on the basis of the official zoning of the Madrid municipality.

"Female gender" relates to the number of females in total population. The factor "Worker" refers to the occupational status in the considered zone, and considers the amount of workers compared to the total population. Car ownership is taken into account in the factor "No car ownership", which considers the households without a car.

The factor "Land use prevalently residential" regards the amount of area devoted to residential settlements compared to the amount devoted to the other land uses. "Predominance of services among economic activities" refers to the amount of services (public and private) compared to other economic activities such as retail and restaurants.

The factor "Accessibility" measures the accessibility of a certain zone, defined as the ease of access to all the other zones (Geurs and van Wee, 2004). In this work, active accessibility $AA_o$ was measured on the metro network using a gravitational model:

$$AA_o = \sum_{d} \frac{Add_d}{D_{od}} \tag{12}$$

where $Add_d$ is the number of total jobs in the zone representing the trip destination, and $D_{od}$ is the distance between the origin and the destination zones measured on the metro network. This formulation is more suitable than accessibility measured on the street network because the objective here is to analyse trips made by metro. The distance $D_{od}$ was calculated on the metro network using specific tools included in ArcGIS. The trip origin and destination correspond to the metro stations nearest to the user's origin and destination. The accessibility values were classified in three increasing levels.
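The gravitational measure in formula (12) can be sketched as follows. This is a hedged illustration rather than the procedure actually used: the function names are hypothetical, destinations at zero or unknown distance (for example the origin zone itself) are skipped by assumption, and the split into three levels uses terciles because the class breaks are not stated in the text.

```python
def active_accessibility(jobs_by_zone, distance_from_origin):
    """Sum jobs at each destination divided by its network distance.

    Assumption: destinations with missing or non-positive distance
    (such as the origin zone itself) are left out of the sum.
    """
    total = 0.0
    for zone, jobs in jobs_by_zone.items():
        distance = distance_from_origin.get(zone)
        if distance is None or distance <= 0:
            continue
        total += jobs / distance
    return total


def classify_three_levels(values):
    """Assign levels 1 (lowest) to 3 (highest) using tercile cut points.

    Assumption: the paper only says that three increasing levels were
    used; terciles are one possible choice of class breaks.
    """
    if not values:
        return []
    ranked = sorted(values)
    n = len(ranked)
    low_cut = ranked[(n - 1) // 3]
    high_cut = ranked[(2 * (n - 1)) // 3]
    return [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in values]


if __name__ == "__main__":
    # Hypothetical jobs and metro-network distances (km) from one origin.
    jobs = {"A": 100, "B": 300, "O": 50}
    dist = {"A": 2.0, "B": 6.0, "O": 0.0}
    print(active_accessibility(jobs, dist))
```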

The factor "Availability of metro stations" refers to the number of metro stations that can be reached walking less than 600 meters from home. The distances were measured on the pedestrian network (Calvo et al., 2013).

Each factor was defined by two explanatory variables, except the factor "Accessibility", which is defined by three variables. In the case of the factor "Female gender", the variable is equal to "1" when the number of females is higher than the number of males in the considered zone, and "0" otherwise. The variable of the factor "Worker" has the value "1" if the number of workers is higher than the number of people who do not work. For the factor "No car ownership", the variable takes the value "1" when the number of households without a car is higher than the number of households that have at least one car. The variable of the factor "Land use prevalently residential" is equal to "1" if the residential settlements are prevalent in comparison with the other land uses in the same zone. "Predominance of services among economic activities" is equal to "1" if services (public and private) are prevalent compared to other economic activities. The variable of the factor "Availability of metro stations" is "1" when at least one metro station can be reached by walking less than 600 meters from home. In the case of the factor "Accessibility", the first variable corresponds to the lowest level of accessibility, the second to the intermediate level, and the third to the highest level.
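The coding rules above can be summarised in a short Python sketch. The `ZoneData` fields and the `code_zone` function are hypothetical names introduced for illustration; the sketch returns one indicator per binary factor (the value "1" or "0" described in the text) and passes the three-level accessibility class through unchanged.

```python
from dataclasses import dataclass


@dataclass
class ZoneData:
    """Hypothetical per-zone inputs; field names are illustrative only."""
    females: int
    males: int
    workers: int
    non_workers: int
    households_without_car: int
    households_with_car: int
    residential_area: float
    other_land_use_area: float
    service_activities: int
    other_activities: int
    stations_within_600m: int
    accessibility_level: int  # 1 = lowest, 2 = intermediate, 3 = highest


def code_zone(zone):
    """Return the categorical value of each factor for one zone."""
    if zone.accessibility_level not in (1, 2, 3):
        raise ValueError("accessibility_level must be 1, 2 or 3")
    return {
        "female_gender": int(zone.females > zone.males),
        "worker": int(zone.workers > zone.non_workers),
        "no_car_ownership": int(
            zone.households_without_car > zone.households_with_car
        ),
        "residential_land_use": int(
            zone.residential_area > zone.other_land_use_area
        ),
        "services_predominant": int(
            zone.service_activities > zone.other_activities
        ),
        "accessibility": zone.accessibility_level,
        "metro_station_available": int(zone.stations_within_600m >= 1),
    }


if __name__ == "__main__":
    # Hypothetical zone used only to show the coding.
    zone = ZoneData(5200, 4800, 4100, 5900, 1500, 2300,
                    1.8, 0.9, 120, 340, 2, 3)
    print(code_zone(zone))
```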

Table 1. A priori hypotheses regarding the use of the metro system

| Factor | Reasoning | Expected relationship |
|---|---|---|
| Female gender | Gender can affect the modal split. Usually, females, compared to males, prefer to travel by transit systems rather than by private car. | Trips made by metro system increase. |
| Worker | Residents who work make more trips than unemployed inhabitants. Generally, trips made by all transport modes increase. | Trips made by metro system increase. |
| No car ownership | Car ownership promotes the choice of the car for travelling. People who do not have a car are obliged to travel by transit systems. | Trips made by metro system increase. |
| Land use prevalently residential | The predominance of residential buildings in a zone could produce the need to travel for daily activities, such as reaching one's job or shopping. | Trips made by metro system increase. |
| Predominance of services among economic activities | People who live in a zone where many services are located do not need to travel towards another zone. | Trips made by metro system decrease. |
| Accessibility | A higher level of accessibility measured on the metro network makes the metro system more attractive to users. | Trips made by metro system increase. |
| Availability of metro stations | Availability of at least one metro station promotes the use of the metro system. | Trips made by metro system increase. |

3.3. Results

Two OP models were estimated in order to identify which model better explains the data. The first model contains all the variables previously specified. In order to calibrate the coefficients for each variable, the model was based on a particular reference case, which considers female gender, worker, no car ownership, land use prevalently residential, predominance of services among economic activities, high level of accessibility, and availability of at least one metro station. In total, we have 7 factors and 15 independent variables. The statistics on the goodness of fit are adequate. Based on the p-values of the Wald tests, four variables are found to be significant with p<0.1, and two variables with p<0.2; the other two variables cannot be considered significant (table 2).

In order to remove the variables that are not significant, after some attempts we decided to reduce the number of variables in the model, keeping only the significant ones. In this way, the final model was estimated considering the variables related to the factors "Female gender", "Land use prevalently residential", "Predominance of services among economic activities", and "Accessibility". The model was based on a particular reference case, corresponding to female gender, land use prevalently residential, predominance of services among economic activities, and high level of accessibility. In this case, we have a total of 4 factors and 9 independent variables. As we can observe in table 2, all the variables are significant.
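The link between a Wald statistic and the p-value used to screen the variables can be checked with a short Python sketch. It assumes, as is usual for a test on a single coefficient, that the Wald statistic follows a chi-square distribution with one degree of freedom; the function name `wald_p_value` is hypothetical, and the statistics in the example are values reported in table 2.

```python
import math


def wald_p_value(wald_statistic):
    """P-value of a Wald statistic under a chi-square with 1 degree of freedom.

    Assumption: one restriction per test, so the statistic is the square
    of a standard normal variable and p = erfc(sqrt(W / 2)).
    """
    if wald_statistic < 0:
        raise ValueError("a Wald statistic cannot be negative")
    return math.erfc(math.sqrt(wald_statistic / 2.0))


if __name__ == "__main__":
    # Wald statistics taken from table 2.
    for w in (0.984, 0.909, 3.674, 10.696):
        print(w, round(wald_p_value(w), 3))
```

Under this assumption, a statistic of 0.984 corresponds to a p-value of about 0.32, and a statistic of 3.674 to a p-value of about 0.055.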

Observing the sign of the estimated coefficients $\beta$, we can say that the a priori hypotheses about the theoretical link between dependent and independent variables are verified. The sign of the variable V1=0 is negative; this means that if the population is mainly composed of males, the trips made by metro decrease. The sign of V7=0 is negative; therefore, the number of trips generated from a zone with a low amount of area devoted to residential buildings is lower than from a zone where land use is prevalently residential. A positive sign is associated with the coefficient of the variable V9=0, referring to the factor "Predominance of services among economic activities". Considering this variable, the number of trips made by metro is higher from a zone with a lower presence of services than from a zone with a predominance of services among economic activities. As expected, the variables V11=1 and V12=2 have a negative sign because lower levels of accessibility, which is measured on the metro network, negatively affect the number of trips made by metro.

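Table 2. Ordered probit model results

| Factor | Variable | Model 1 estimated coefficient (β) | Model 1 Wald | Model 1 p-value | Model 2 estimated coefficient (β) |
|---|---|---|---|---|---|
| Female gender | [V1=0] | -1.120 | | | -1.067 |
| | [V2=1] | 0 | | | 0 |
| Worker | [V3=0] | -0.292 | 0.984 | 0.321 | |
| | [V4=1] | 0 | | | |
| No car ownership | [V5=0] | -0.290 | 0.909 | 0.340 | |
| | [V6=1] | 0 | | | |
| Land use prevalently residential | [V7=0] | -0.514 | | 0.162 | -0.669 |
| | [V8=1] | 0 | | | 0 |
| Predominance of services among economic activities | [V9=0] | 0.628 | | 0.058 | 0.701 |
| | [V10=1] | 0 | | | 0 |
| Accessibility | [V11=1] | -1.181 | 3.674 | 0.055 | -1.317 |
| | [V12=2] | -1.331 | 10.696 | 0.001 | -1.462 |
| | [V13=3] | 0 | | | 0 |
| Availability of metro stations | [V14=0] | -0.408 | | 0.184 | |
| | [V15=1] | 0 | | | |

Model 1 statistics: number of observations 128; κ1 (threshold) -2.319; ρ² (Cox and Snell) 0.240; ρ² (Nagelkerke) 0.352; ρ² (McFadden) 0.240; log likelihood -32.841.

The remaining values of the table are listed in the order in which they appear in the source, because their row assignment cannot be recovered from the layout:

Model 2, Wald and p-value pairs: 3.827 (0.050); 4.986 (0.026); 5.051 (0.025); 13.934 (0.000).

Model 2, other values: -1.853; -12.790.

Estimated probabilities of outcomes 0 and 1 (the first pair is the reference case, the second is [V1=0]): 0.1641 and 0.8359; 0.5486 and 0.4514; 0.2389 and 0.7611; 0.3379 and 0.6621; 0.0936 and 0.9064; 0.2508 and 0.7492; 0.2630 and 0.7370; 0.3872 and 0.6128; 0.3709 and 0.6291; 0.0607 and 0.9393.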

By observing the estimated probabilities, we can state that in the reference case the probability is about 84% for daily trips made by metro and about 16% for daily trips made by other motorised means of transport. For the variables with negative coefficients, the probability of travelling by metro decreases compared to the reference case. Instead, for the variable V9=0, whose coefficient is positive, the probability of making trips by metro increases compared to the reference case. The variable with the greatest impact on the choice of the metro is V1=0, related to the factor "Female gender". When the travellers are males, the probability of trips being made by metro decreases to about 45% and, at the same time, the probability of using other motorised means increases to about 55%. The factor "Accessibility" also has a relevant influence on the drop in the probability of making trips by metro.

4. Conclusions

In this paper, an OP model was proposed for exploring how factors related to socio-economic characteristics of the population, territorial features, accessibility, and the transportation system affect transit use. The study case was the metro network of Madrid (Spain), where the most recent expansions have caused important changes in the location of population and economic activities, and in users' travel behaviour.

The explanatory variables were selected on the basis of theoretical reasoning linking some factors to the use of the metro system. For each factor, a priori hypotheses were made about the theoretical link with the dependent variable, and an expected relationship was stated. At first, seven factors were selected. The first three factors ("Female gender", "Occupational status" and "No car ownership") relate to socio-economic characteristics of the population. "Land use prevalently residential" and "Predominance of services among economic activities" are factors regarding land use characteristics. Lastly, "Accessibility" is a factor that measures the ease of access to all other zones, and "Availability of metro stations" refers to metro system characteristics. After some attempts, we proposed a final model considering four factors and based on a particular reference case, corresponding to female gender, land use prevalently residential, predominance of services among economic activities, and high level of accessibility.

The results showed that the variable with the greatest impact on the use of the metro system is related to the factor "Female gender". When the travellers are males, the probability of using the metro decreases and, at the same time, the probability of using other motorised means increases. This result shows a clear inclination of males to travel by other motorised transportation modes rather than by metro. The factor "Accessibility" also has a relevant influence on the drop in the probability of making trips by metro. Factors regarding land use characteristics, such as "Land use prevalently residential" and "Predominance of services among economic activities", also showed a certain influence on the probability of using the metro system. The other socio-economic characteristics of the population ("Occupational status" and "No car ownership") seem to have no influence, nor does the factor "Availability of metro stations". Overall, the results confirmed the hypothesis we made about the influence of the categorical variation of variables affecting the use of the metro system. In this context, by acting on accessibility and the transit services supplied, travel demand can be shifted from individual means of transportation towards transit systems.

References

Borooah, V.K., 2001. Logit and Probit: Ordered and Multinomial Models. Sage University Papers Series on Quantitative Applications in Social Sciences, Series no. 07-138. Thousand Oaks, CA: Sage.

Calvo, F.J., de Oña, J., Arán, F., 2013. Impact of the Madrid subway on population settlement and land use. Land Use Policy 31, 627-639.

Cardamone, A.S., Eboli, L., Forciniti, C., Mazzulla, G., 2016. Willingness to use mobile application for smartphone for improving road safety. International Journal of Injury Control and Safety Promotion 23 (2), 155-169.

Cardamone, A.S., Eboli, L., Forciniti, C., Mazzulla, G., 2015. How usual behaviour can affect perceived drivers' psychological state while driving. Transport, DOI: 10.3846/16484142.2015.1059885.

CRTM, 2006. Encuesta domiciliaria de movilidad en día laborable de 2004 en la Comunidad de Madrid. Resumen. Consorcio Regional de Transportes de Madrid.

de Oña, J., de Oña, R., Eboli, L., Forciniti, C., Mazzulla, G., 2014. Key factors affecting drivers' perception of accident risk. Accident Analysis and Prevention 73 (1), 225-235.

Eboli, L., Mazzulla, G., 2009. An ordinal logistic regression model for analysing airport passenger satisfaction. Euromed Journal of Business 4 (1), 40-57.

Eboli, L., Forciniti, C., Mazzulla, G., 2012. Exploring Land Use and Transport Interaction through Structural Equation Modelling. Procedia - Social and Behavioral Sciences 54, 107-116.

Friedman, B., Gordon, S.P., Peers, J.B., 1994. The effect of neo-traditional neighbourhood design on travel characteristics. Transportation Research Record 1400, 63-70.

Geurs, K.T., Van Wee, B., 2004. Accessibility evaluation of land-use and transport strategies: review and research directions. Journal of Transport Geography 12, 127-140.

Handy, S., Cao, X., Mokhtarian, P., 2005. Correlation or causality between the built environment and travel behavior? Evidence from Northern California. Transportation Research Part D 10, 427-444.

Kitamura, R., Mokhtarian, P., Laidet, L., 1997. A micro-analysis of land use and travel in five neighborhoods in the San Francisco Bay Area. Transportation 24, 125-158.

Mazzulla, G., Forciniti, C., 2012. Spatial association techniques for analysing trip distribution in an urban area. European Transport Research Review 4, 217-233.

McKelvey, R.D., Zavoina, W., 1975. A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology 4, 103-120.

Monzón, A., de la Hoz, D., 2009. Efectos sobre la movilidad de la dinámica territorial de Madrid. Urban 14, 58-71.

Nocera, S., Maino, F., Cavallaro, F., 2012. A heuristic method for determining CO2 efficiency in transportation planning. European Transport Research Review. An Open Access Journal 4 (2), 91-106.