
journal homepage: www.elsevier.com/locate/jtte

Original Research Paper

Statistical modeling of total crash frequency at highway intersections


Arash M. Roshandeh a,*, Bismark R. D. K. Agbelie b, Yongdoo Lee c

a Department of Engineering and Public Works, City of Alpharetta, GA 30009, USA
b School of Civil Engineering, Purdue University, West Lafayette, IN 47907, USA
c Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA 16802, USA

ARTICLE INFO

Article history:
Available online 15 March 2016

Keywords:
Total crash frequency; Random-parameter count model; Intersection; Defected pavement

ABSTRACT

Intersection-related crashes account for a high proportion of accidents involving drivers, occupants, pedestrians, and cyclists. In general, the purpose of intersection safety analysis is to determine the impact of safety-related variables on pedestrians, cyclists, and vehicles, so as to facilitate the design of effective and efficient countermeasure strategies to improve safety at intersections. This study investigates the effects of traffic, environmental, intersection geometric, and pavement-related characteristics on total crash frequencies at intersections. A random-parameter Poisson model was estimated with crash data from 357 signalized intersections in Chicago from 2004 to 2010. The results indicate that, of the identified factors, evening peak period traffic volume, pavement condition, and unlighted intersections have the greatest effects on crash frequencies. Overall, the results suggest that effective highway-related safety countermeasures at intersections require that pavements be adequately maintained and intersections be well lighted. It should be noted that projects may have been implemented at and around the study intersections during the 7-year study period, which could have affected crash frequency over time. This is an important variable that future studies could include to investigate the impacts of safety-related works at intersections and their marginal effects on crash frequency at signalized intersections.

© 2016 Periodical Offices of Chang'an University. Production and hosting by Elsevier B.V. on behalf of Owner. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

* Corresponding author. Tel.: +1 773 691 2900.
E-mail address: arash.moradkhani@gmail.com (A. M. Roshandeh).
Peer review under responsibility of Periodical Offices of Chang'an University.
http://dx.doi.org/10.1016/j.jtte.2016.03.003
2095-7564/© 2016 Periodical Offices of Chang'an University.

1. Introduction

Intersection-related crashes account for a high proportion of accidents involving drivers, occupants, pedestrians, and cyclists. A significant portion of all fatal crashes occurs at intersections. Enhancing the safety of intersections therefore requires significant attention to the safe movement of road users. In general, the purpose of intersection safety analysis is to determine the impact of safety-related variables on pedestrians, cyclists, and vehicles, so as to facilitate the design of effective and efficient countermeasure strategies to improve safety at intersections.

In the U.S., Illinois is one of the states with a considerable number of crashes. For example, in 2010, 927 persons were killed and 88,937 people were injured in crashes. Of the total fatal crashes, 25.6% occurred at intersections. In the previous three years (2007, 2008, and 2009), over 26% of fatalities occurred at intersections on average (IDOT, 2013).

Urban intersections have received significant attention in terms of signal timing optimization to minimize vehicle delay (Dong et al., 2014c; Nesheli et al., 2009) or to simultaneously minimize the delay of both vehicles and pedestrians (Roshandeh et al., 2014). However, ignoring the expected safety effects may not yield the overall performance-based benefits. Compared with vehicle drivers, pedestrians and cyclists are much more vulnerable in intersection crashes due to their interactions with vehicles (Dong et al., 2014a; Zhou et al., 2014). Pedestrians account for more than 22% of the 1.24 million people killed in traffic accidents worldwide (World Health Organization, 2014).

To investigate the factors contributing to total crash frequency at intersections, this study analyzes the effects of traffic, environmental, intersection geometric, and pavement-related characteristics using 7 years of crash data from 357 randomly selected intersections in Chicago.

Over the past years, several studies have been conducted on intersection safety analysis. Agbelie and Roshandeh (2015) applied a random-parameters negative binomial model to investigate the impacts of signal-related factors on crash frequency. The results showed that increasing the number of signal phases and the traffic volume at an intersection would increase crash frequency, and that increasing the number of approach lanes and the maximum green time would also increase crash frequency at many intersections. Wu et al. (2013) used crash data from intersections whose approaches all had signal-warning flashers to estimate a random-parameter negative binomial model of crash frequency. The estimation results revealed that a 5 mi/h speed-limit reduction decreases the frequency of crashes in some cases but increases it in others. Oh et al. (2010) used a traffic conflict technique to evaluate traffic safety at signalized intersections and analyzed traffic conflicts that occurred at the time of a signal violation. Using image processing technology, they analyzed traffic images from two intersections in South Korea and found that serious and dangerous conflicts took place at the time of signal violation. Gomes et al. (2012) applied a Poisson-gamma modeling framework to develop predictive models for estimating the safety performance of signalized and unsignalized intersections in Lisbon, Portugal. They reported that highway geometric characteristics affect the severity of crashes occurring at urban three- and four-leg intersections. Zhou et al. (2013) proposed a root cause degree procedure to measure intersection safety in Shanghai and found that clearance time, safety education, enforcement, trajectory inside the intersection, crossing a refuge island, speeding, and the right turn control pattern affect crash frequency.

Wang and Abdel-Aty (2006) used generalized estimating equations with the negative binomial link function to model rear-end crash frequencies at signalized intersections and to investigate the temporal or spatial correlation among the crash data. Results showed that intersections with heavy traffic, more right- and left-turn lanes, a large number of phases per cycle, high speed limits, and locations in high-population areas are more likely to have higher rear-end crash frequencies. Das and Abdel-Aty (2011) applied a genetic programming technique to analyze rear-end crash counts and found that crashes decreased with an increase in skid resistance during morning peak hours, whereas they increased in the afternoon peak period. Dong et al. (2014a) investigated the factors contributing to crash frequency at urban signalized intersections and found that, compared with the univariate Poisson-lognormal (UVPLN) and multivariate Poisson (MVP) models, the multivariate Poisson-lognormal (MVPLN) model better identifies significant factors and predicts crash frequencies. Their analysis suggests that traffic volume, truck percentage, lighting condition, and intersection angle significantly affect intersection safety. Park and Lord (2007) employed a new multivariate approach for modeling crash counts by severity based on MVPLN models. The method was applied to multivariate crash counts from 451 intersections in California collected over 10 years. It showed that the new MVPLN regression approach could address both overdispersion and a fully general correlation structure in the data.

El-Basyouny and Sayed (2013) used a dataset of 51 signalized intersections in British Columbia to investigate the relationship between conflicts and collisions. A lognormal model was employed to predict conflicts, and a conflicts-based negative binomial (NB) safety performance function was then used to predict collisions. Data on collision frequency, average hourly conflicts, average hourly volumes, area type (urban/suburban), the number of through lanes, and the presence of right and left turn lanes were used as explanatory variables. The results showed that the effects of conflicts on collisions are nonlinear with decreasing rates. Dong et al. (2014b) employed a multivariate random-parameters zero-inflated negative binomial model for crash frequency modeling and showed that it performed better than Poisson, negative binomial, and Poisson-lognormal models. In another study, Dong et al. (2014c) used a Bayesian multivariate zero-inflated Poisson model and showed that it addresses correlations among crash severity levels and properly handles observations with zero crashes.

The existing literature has extensively accounted for the impacts of different variables on crash frequency at intersections. However, further research is still required on the contributing effects of pavement condition and intersection work zones on crash frequency. This study therefore aims to fill this gap by analyzing the impacts of pavement condition, intersection work zones, and traffic and environmental characteristics on total crash frequency at intersections.

The dataset available for the present study includes crash data for 357 intersections in Chicago, collected from 2004 to 2010. To investigate safety impacts at these intersections, each intersection was treated as an observation, preserving the distinct geometric, pavement, and traffic-flow characteristics of each approach. The crash frequency of each intersection in each year was likewise treated as an observation, generating 2499 records (357 intersections × 7 years of crash data). The primary variables of interest were traffic, environmental, intersection geometric, and pavement-related characteristics. The variables retained in the final model were those found to be statistically significant; they are listed in Table 1.

Table 1 - Descriptive statistics of selected variables.

| Variable description | Mean | Std. dev. | Minimum | Maximum |
|---|---|---|---|---|
| Work zone (1 if yes, 0 otherwise) | 0.163 | 0.369 | 0 | 1 |
| Average annual traffic on major road (in thousands) | 65.824 | 28.289 | 23.821 | 298.861 |
| Snowy weather condition (1 if yes, 0 otherwise) | 0.502 | 0.500 | 0 | 1 |
| Average annual traffic on minor road (in thousands) | 37.185 | 13.964 | 4.901 | 99.750 |
| Intersection without street lights in the evening (1 if yes, 0 otherwise) | 0.574 | 0.495 | 0 | 1 |
| Weekday evening peak periods (5 p.m.-7 p.m.) (1 if yes, 0 otherwise) | 0.799 | 0.401 | 0 | 1 |
| Defected pavement surface (1 if yes, 0 otherwise) | 0.805 | 0.396 | 0 | 1 |
| Number of crashes | 17.40 | 4.11 | 2 | 106 |

2. Methodology

Crash frequency analysis at intersections is a count-data problem, so a count-data modeling approach is generally adopted. Because crash frequencies are nonnegative integers, a count-data modeling technique is the most appropriate (Lord and Mannering, 2010). A number of statistical approaches, including Poisson regression, negative binomial, zero-inflated Poisson regression, and zero-inflated negative binomial, have been used to model count data (Washington et al., 2011). In the Poisson framework, the probability $P(g_k)$ of intersection $k$ having $g_k$ crashes per year is

$$P(g_k) = \frac{\exp(-\phi_k)\,\phi_k^{g_k}}{g_k!} \tag{1}$$

where $\phi_k$ is the Poisson parameter for intersection $k$, which is intersection $k$'s expected crash frequency, $E(g_k)$.

Generally, the Poisson regression specifies the crash frequency parameter $\phi_k$ as a function of independent variables using a log-linear function

$$\phi_k = \exp(aX_k) \tag{2}$$

where $a$ is a vector of estimable parameters and $X_k$ is a vector of independent variables (Washington et al., 2011).
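The Poisson probability and the log-linear mean described above can be sketched in a few lines of Python. This is a minimal illustration only: the coefficient values and the two-element variable vector are hypothetical, not estimates from this study, and the names `poisson_mean` and `poisson_pmf` are introduced here for convenience.

```python
import math

def poisson_mean(a, x):
    """Expected crash frequency phi_k = exp(a . x_k) under the log-linear link."""
    return math.exp(sum(ai * xi for ai, xi in zip(a, x)))

def poisson_pmf(g, phi):
    """P(g_k) = exp(-phi_k) * phi_k**g_k / g_k!, evaluated in log space for stability."""
    return math.exp(-phi + g * math.log(phi) - math.lgamma(g + 1))

# Hypothetical coefficients (assumption): a constant and one indicator variable.
a = [2.0, 0.1]
x = [1.0, 1.0]  # constant term, indicator switched on
phi = poisson_mean(a, x)

# Probabilities of observing 0, 1, ..., 199 crashes in a year at this intersection.
probs = [poisson_pmf(g, phi) for g in range(200)]
print(f"expected crashes = {phi:.3f}, total probability = {sum(probs):.6f}")
```

Working in log space with `math.lgamma` avoids overflow in the factorial for large counts.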

The Poisson distribution constrains the variance and mean to be equal, such that $\mathrm{VAR}(g_k) = E(g_k)$. If this equality is violated, the data are either overdispersed ($\mathrm{VAR}(g_k) > E(g_k)$) or underdispersed ($\mathrm{VAR}(g_k) < E(g_k)$). In either case, the estimated standard errors of the parameter vector will be incorrect, resulting in incorrect inferences. To account for the possibility of overdispersion, the negative binomial model is derived by modifying Eq. (2) as follows

$$\phi_k = \exp(aX_k + t_k) \tag{3}$$

where $\exp(t_k)$ is a gamma-distributed error term with mean 1 and variance $b$. Adding this term allows the variance to differ from the mean as follows

$$\mathrm{VAR}(g_k) = E(g_k)\,[1 + bE(g_k)] = E(g_k) + b\,[E(g_k)]^2$$

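The negative binomial probability density function has the form

$$P(g_k) = \left(\frac{1/b}{(1/b) + \phi_k}\right)^{1/b} \frac{\Gamma[(1/b) + g_k]}{\Gamma(1/b)\,g_k!} \left(\frac{\phi_k}{(1/b) + \phi_k}\right)^{g_k} \tag{4}$$

where $\Gamma(\cdot)$ is a gamma function and $b$ is an overdispersion factor.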
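As a rough numerical illustration of the negative binomial probability function and its overdispersion factor b, the sketch below evaluates it in log space and compares it with the Poisson probability as b shrinks. The mean of 17.4 is the sample mean crash count from Table 1 and is used for illustration only; the values of b are arbitrary assumptions, and the function names are introduced here.

```python
import math

def neg_binomial_pmf(g, phi, b):
    """Negative binomial P(g_k) with mean phi and overdispersion factor b > 0."""
    inv_b = 1.0 / b
    log_p = (
        inv_b * math.log(inv_b / (inv_b + phi))
        + math.lgamma(inv_b + g) - math.lgamma(inv_b) - math.lgamma(g + 1)
        + g * math.log(phi / (inv_b + phi))
    )
    return math.exp(log_p)

def poisson_pmf(g, phi):
    """Poisson P(g_k) with mean phi."""
    return math.exp(-phi + g * math.log(phi) - math.lgamma(g + 1))

phi = 17.4  # illustrative mean (sample mean crash count in Table 1)
for b in (0.5, 0.05, 1e-6):  # arbitrary overdispersion values (assumption)
    gap = max(abs(neg_binomial_pmf(g, phi, b) - poisson_pmf(g, phi)) for g in range(100))
    print(f"b = {b:g}: largest gap to the Poisson probabilities = {gap:.2e}")
```

The gap shrinks as b decreases, which is the sense in which the Poisson model is a limiting case of the negative binomial.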

As $b$ approaches zero, the Poisson regression becomes a limiting model of the negative binomial regression. Thus, if $b$ is significantly different from zero, the negative binomial is appropriate; if it is not, the Poisson model is appropriate (Washington et al., 2011). To account for possible heterogeneity that varies from one intersection to another, a random-parameter model can be introduced. Greene (2007) developed estimation procedures (using simulated maximum likelihood estimation) for incorporating random parameters in count-data models. To develop a random-parameter model that accounts for possible unobserved heterogeneity across intersections, the individual estimable parameters are written as

$$a_i = a + u_i \tag{5}$$

where $u_i$ is a randomly distributed term for each signalized intersection $i$; it can take on a wide variety of distributions, such as Weibull, Erlang, logistic, log-normal, and normal.

By using Eq. (5), the crash frequency parameter (Poisson parameter) $\phi_k$ becomes $\phi_k \mid u_i = \exp(aX_k + t_k)$ in the negative binomial, with the corresponding probabilities for the Poisson or negative binomial now $P(g_k \mid u_i)$. The log-likelihood function (LL) for the random-parameter negative binomial in this case can be written as

$$LL = \sum_{\forall k} \ln \int_{u_i} g(u_i)\,P(g_k \mid u_i)\,du_i \tag{6}$$

where $g(\cdot)$ is the probability density function of the $u_i$.

Because maximum likelihood estimation of random-parameter negative binomial models is computationally cumbersome (due to the required numerical integration of the negative binomial function over the distribution of the random parameters), a simulation-based maximum likelihood method is used. The estimated parameters are those that maximize the simulated log-likelihood function while allowing for the possibility that the variance of $u_i$ for intersection-level parameters is significantly greater than zero. The most popular simulation approach uses Halton draws, which have been shown to provide a more efficient distribution of draws for numerical integration than purely random draws (Greene, 2007). Finally, to assess the impact of specific variables on the mean number of crashes, marginal effects are computed (Washington et al., 2011). Marginal effects are computed for each observation and then averaged across all observations. They give the effect that a one-unit change in an explanatory variable $x$ has on the expected number of crashes at each approach, $g_k$.
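The averaging of marginal effects can be sketched as follows. The sketch assumes the derivative form that follows from a log-linear mean, namely a_j times exp(a·X_k) for variable j; the paper does not spell out the formula, and indicator variables are sometimes handled with a discrete difference instead, so this is an illustration under that assumption. The data and coefficients are hypothetical.

```python
import math

def average_marginal_effects(a, X):
    """Average over observations of dE(g_k)/dx_j = a_j * exp(a . x_k)."""
    means = [math.exp(sum(aj * xj for aj, xj in zip(a, row))) for row in X]
    average_mean = sum(means) / len(means)
    return [aj * average_mean for aj in a]

# Hypothetical data (assumption): column 0 is the constant, column 1 an indicator.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
a = [2.0, 0.2]  # hypothetical parameter estimates
ame = average_marginal_effects(a, X)
print(f"average marginal effect of the indicator: {ame[1]:.3f}")
```

Because the effect scales with the expected crash count, variables with similar coefficients can have quite different marginal effects in samples with different mean crash frequencies.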

3. Estimation results

To identify the appropriate model, random-parameter Poisson and negative binomial models were developed. The models were estimated using simulation-based maximum likelihood with 200 Halton draws. This number of draws was selected because it has been shown to produce consistent and accurate parameter estimates (Agbelie, 2014; Bhat, 2003; Mannering and Bhat, 2014; Milton et al., 2008). Halton draws were used instead of random draws because fewer Halton draws are required to attain convergence (Train, 2003), which makes them generally more efficient than random draws. To select the functional form of the random-parameter densities, uniform, triangular, lognormal, and normal distributions were investigated. The normal distribution was found to yield the best statistical fit for all the parameters. Nonetheless, future studies could further investigate and compare different distributions for random-parameter models in accident analysis.
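To make the role of Halton draws concrete, the sketch below generates a base-2 Halton sequence and uses it to approximate the probability of a crash count when one parameter is normally distributed. It is a simplified illustration with a single random parameter and a scalar variable, not the estimation routine used in the study; the function names, the `skip` setting, and the numerical values are assumptions.

```python
import math
from statistics import NormalDist

def halton(n, base=2, skip=10):
    """First n Halton points in the given prime base, after discarding `skip` points."""
    points = []
    for index in range(skip + 1, skip + n + 1):
        f, value, i = 1.0, 0.0, index
        while i > 0:
            f /= base
            value += f * (i % base)
            i //= base
        points.append(value)
    return points

def poisson_pmf(g, phi):
    """Poisson probability of g crashes given mean phi."""
    return math.exp(-phi + g * math.log(phi) - math.lgamma(g + 1))

def simulated_probability(g, x, a_mean, a_sd, draws=200):
    """Average of P(g | u) over Halton draws of a parameter a_mean + a_sd * z."""
    std_normal = NormalDist()
    total = 0.0
    for h in halton(draws):
        z = std_normal.inv_cdf(h)  # map the uniform Halton point to a standard normal
        total += poisson_pmf(g, math.exp((a_mean + a_sd * z) * x))
    return total / draws

# Hypothetical values: probability of 3 crashes with a random parameter N(1.0, 0.3).
print(simulated_probability(3, 1.0, 1.0, 0.3))
```

Summing the logarithm of such simulated probabilities over all observations gives a simulated log-likelihood, which an optimizer would then maximize over the parameter means and standard deviations.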

Of the two models, the random-parameter Poisson model was found to be the more appropriate, because the overdispersion factor $b$ ($8.155 \times 10^{-7}$) was statistically insignificant (t-statistic of 0.00001). When a fixed-parameter Poisson model was initially investigated, the McFadden $\rho^2$ statistic was 0.11, while a random-effect Poisson model yielded a McFadden $\rho^2$ statistic of 0.90. The random-parameter Poisson model has the best statistical fit (McFadden $\rho^2$ statistic of 0.93), and its estimated parameters are of plausible sign and magnitude. The estimation results and marginal effects from the random-parameter Poisson model are presented in Table 2. A parameter is treated as random if the standard deviation of its density is statistically significant. If the estimated standard deviation is not statistically significant (not statistically different from zero), the parameter is considered fixed across the population of intersections. The estimation results in Table 2 reveal that eight parameters are statistically significant (including the constant term). Six of them (constant, average annual traffic on major road (in thousands), snowy weather condition, intersection without street lights in the evening, weekday evening peak periods, and defected pavement surface) yield statistically significant random parameters (the estimated standard deviation of each parameter distribution is significantly different from zero), while two parameters (work zone location and average annual traffic on minor road) are fixed across the population of intersections.
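The McFadden statistic quoted for the random-parameter Poisson model can be checked directly from the two log-likelihood values reported in Table 2, using the definition given there (one minus the ratio of the log-likelihood at convergence to the log-likelihood at zero). The variable names below are introduced for this check only.

```python
ll_zero = -111205.70       # log-likelihood at zero, LL(0), from Table 2
ll_convergence = -7752.62  # log-likelihood at convergence, from Table 2

rho_squared = 1.0 - ll_convergence / ll_zero
print(round(rho_squared, 2))  # prints 0.93
```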

Examination of the results shows that intersections located at work zones experience higher crash frequency. This indicator variable produced a positive fixed parameter: a work zone at an intersection increases the mean crash frequency by 1.184 (as shown by the marginal effect in Table 2). Intersections located at work zones are thus more likely to experience crashes than those without work zones, and work zones around intersections should have adequate signage and enforcement in order to reduce crash frequency at such locations.

Table 2 - Random-parameter Poisson regression model for annual accident frequencies (all random parameters are normally distributed).

| Variable description | Estimated parameter | t-statistic | Marginal effect |
|---|---|---|---|
| Constant (standard deviation of parameter distribution) | 2.176 (0.129) | 105.510 (28.102) | |
| Work zone location (1 if yes, 0 otherwise) | 0.064 | 5.222 | 1.184 |
| Average annual traffic on major road (in thousands) (standard deviation of parameter distribution) | 7.681 × 10^-4 (8.096 × 10^-4) | 4.624 (12.696) | 0.014 |
| Snowy weather condition (1 if yes, 0 otherwise) (standard deviation of parameter distribution) | 0.106 (0.052) | 11.540 (8.411) | 1.971 |
| Average annual traffic on minor road (in thousands) | 0.003 | 9.524 | 0.058 |
| Intersection without street lights in the evening (1 if yes, 0 otherwise) (standard deviation of parameter distribution) | 0.125 (0.040) | 13.230 (6.992) | 2.319 |
| Weekday evening peak periods (5 p.m.-7 p.m.) (1 if yes, 0 otherwise) (standard deviation of parameter distribution) | 0.328 (0.084) | 24.077 (16.951) | 6.085 |
| Defected pavement surface (1 if yes, 0 otherwise) (standard deviation of parameter distribution) | 0.225 (0.064) | 17.482 (12.981) | 4.164 |
| Number of observations | 2499 | | |
| Log-likelihood at zero, LL(0) | -111,205.70 | | |
| Log-likelihood at convergence, LL(b) | -7752.62 | | |
| $\rho^2$ [1 - LL(b)/LL(0)] | 0.93 | | |

Average annual traffic on the major road was found to have a normally distributed parameter with mean $7.681 \times 10^{-4}$ and standard deviation $8.096 \times 10^{-4}$, signifying that an increase in traffic volume on the major road would increase accident frequency for 82.86% of the intersections and decrease it for the remaining 17.14%. For most intersections, a unit increase (one thousand vehicles) in major road traffic volume would increase accident frequency by 0.014. Traffic on the major road at an intersection thus plays a significant role in crashes.

The snowfall indicator variable produced a statistically significant, normally distributed parameter with mean 0.106 and standard deviation 0.052, indicating that snowfall increases crash frequency for 97.92% of intersections and reduces it for the remaining 2.08%. On average, snowy conditions at an intersection increase crash frequency by 1.971. This suggests that under snowy conditions drivers may find it difficult to stop when approaching an intersection, resulting in a crash.

Finally, intersections with defected pavement surfaces produced a normally distributed parameter with mean 0.225 and standard deviation 0.064, suggesting that defected pavement surfaces increase accident frequency for 99.98% of intersections and decrease it for the remaining 0.02%. A defected pavement surface at an intersection increases crash frequency by 4.164. This marginal effect is high and suggests that agencies should prioritize defected surfaces at intersections in maintenance works, and should monitor pavement conditions regularly so that defected surfaces are repaired in a timely manner to prevent crashes.
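The shares of intersections with a positive effect quoted above follow from the normal distribution of each random parameter: the share is the probability mass of a normal density with the reported mean and standard deviation that lies above zero. A short check using the Table 2 estimates (the function name `share_positive` is introduced here):

```python
from statistics import NormalDist

def share_positive(mean, sd):
    """Share of a normal(mean, sd) parameter distribution lying above zero."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)

# (mean, standard deviation) of the random parameters reported in Table 2.
estimates = {
    "major road traffic": (7.681e-4, 8.096e-4),
    "snowy weather": (0.106, 0.052),
    "defected pavement": (0.225, 0.064),
}
for name, (mean, sd) in estimates.items():
    print(f"{name}: {100 * share_positive(mean, sd):.2f}% of intersections positive")
```

The computed shares agree with the 82.86%, 97.92%, and 99.98% figures reported in the text.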

4. Summary and conclusions

This paper investigates the effects of a number of factors on the safety of highway intersections. Random-parameter Poisson and negative binomial models were initially tested; however, the dispersion factor was found to be statistically insignificant. Thus, the random-parameter Poisson model was used to analyze crash frequency data from 2004 to 2010 in order to examine the effects of a number of variables on crashes at intersections. Of the eight estimated parameters (including the constant term) found to be statistically significant, six produced normally distributed random parameters, while the remaining two were fixed across intersections. The marginal effects were used to estimate the effects of explanatory variables on crash frequency. The results indicate that, of the identified factors, evening peak period traffic volume, pavement condition, and unlighted intersections have the greatest effects on crash frequencies. In relation to pavement condition, the marginal effect of defected pavement surface at intersections was high; the countermeasure is for agencies to prioritize defected surfaces at intersections in maintenance works so that they are repaired in a timely manner to prevent crashes. Overall, the results suggest that effective highway-related safety countermeasures at intersections require significant attention to ensuring that pavements are adequately maintained and intersections are well lighted.

It should be noted that projects may have been implemented at and around the study intersections during the 7-year study period, which could have affected crash frequency over time. This is an important variable that future studies could include to investigate the impacts of safety-related works at intersections and their marginal effects on crash frequency at signalized intersections.

Acknowledgments

The authors are grateful to the transportation agencies in the Chicago metropolitan area for their assistance with data collection as part of the methodology application.

REFERENCES

Agbelie, B.R.D.K., 2014. An empirical analysis of three econometric frameworks for evaluating economic impacts of transportation infrastructure expenditures across countries. Transport Policy 35, 304-310.

Agbelie, B.R.D.K., Roshandeh, A.M., 2015. Safety impacts of signal-related characteristics at urban signalized intersections. Journal of Transportation Safety & Security 7 (3), 199-207.

Bhat, C., 2003. Simulation estimation of mixed discrete choice models using randomized and scrambled Halton sequences. Transportation Research Part B: Methodological 37 (9), 837-855.

Das, A., Abdel-Aty, M., 2011. A combined frequency-severity approach for the analysis of rear-end crashes on urban arterials. Safety Science 49 (8), 1156-1163.

Dong, C., Clarke, D.B., Richards, S.H., et al., 2014a. Differences in passenger car and large truck involved crash frequencies at urban signalized intersections: an exploratory analysis. Accident Analysis & Prevention 62 (1), 87-94.

Dong, C., Clarke, D.B., Yan, X., et al., 2014b. Multivariate random-parameters zero-inflated negative binomial regression model: an application to estimate crash frequencies at intersections. Accident Analysis & Prevention 70 (5), 320-329.

Dong, C., Richards, S.H., Clarke, D.B., et al., 2014c. Examining signalized intersection crash frequency using multivariate zero-inflated Poisson regression. Safety Science 70, 63-69.

El-Basyouny, K., Sayed, T., 2013. Safety performance functions using traffic conflicts. Safety Science 51 (1), 160-164.

Gomes, S.V., Geedipally, S.R., Lord, D., 2012. Estimating the safety performance of urban intersections in Lisbon, Portugal. Safety Science 50 (9), 1732-1739.

Greene, W., 2007. Limdep, Version 9.0. Econometric Software Inc., New York.

Illinois Department of Transportation (IDOT), 2013. Motor Vehicle Crash Information. Division of Traffic Safety, Springfield.

Lord, D., Mannering, F., 2010. The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transportation Research Part A: Policy and Practice 44 (5), 291-305.

Mannering, F., Bhat, C., 2014. Analytic methods in accident research: methodological frontier and future directions. Analytic Methods in Accident Research 1, 1-22.

Milton, J., Shankar, V., Mannering, F., 2008. Highway accident severities and the mixed logit model: an exploratory empirical analysis. Accident Analysis & Prevention 40 (1), 260-266.

Nesheli, M.M., Puan, O.C., Roshandeh, A.M., 2009. Optimization of traffic signal coordination system on congestion: a case study. WSEAS Transactions on Advances in Engineering Education 7 (6), 203-212.

Oh, J., Kim, E., Kim, M., et al., 2010. Development of conflict techniques for left-turn and cross-traffic at protected left-turn signalized intersections. Safety Science 48 (4), 460-468.

Park, E.S., Lord, D., 2007. Multivariate Poisson-lognormal models for jointly modeling crash frequency by severity. Transportation Research Record 2019, 1-6.

Roshandeh, A.M., Levinson, H.S., Li, Z., et al., 2014. A new methodology for intersection signal timing optimization to simultaneously minimize vehicle and pedestrian delays. Journal of Transportation Engineering 140 (5), 382-398.

Train, K., 2003. Discrete Choice Methods with Simulation. Cambridge University Press, Cambridge.

Wang, X., Abdel-Aty, M., 2006. Temporal and spatial analyses of rear-end crashes at signalized intersections. Accident Analysis & Prevention 38 (6), 1137-1150.

Washington, S.P., Karlaftis, M.G., Mannering, F.L., 2011. Statistical and econometric methods for transportation data analysis. Technometrics 46 (4), 492-493.

World Health Organization, 2014. Global Status Report on Road Safety 2013: Supporting a Decade of Action. World Health Organization, Geneva.

Wu, Z., Sharma, A., Mannering, F.L., et al., 2013. Safety impacts of signal-warning flashers and speed control at high-speed signalized intersections. Accident Analysis & Prevention 54 (5), 90-98.

Zhou, B., Roshandeh, A.M., Zhang, S., 2014. Safety impacts of push-button and countdown timer on non motorized traffic at intersections. Mathematical Problems in Engineering 2014, 460109.

Zhou, S., Sun, J., Li, K.P., et al., 2013. Development of a root cause degree procedure for measuring intersection safety factors. Safety Science 51 (1), 257-266.