Transportation Research Procedia 9 (2015) 205 - 224

21st International Symposium on Transportation and Traffic Theory, 5 - 7 August, 2015, Kobe,

Characterization of network traffic processes under adaptive traffic control systems

Alessandra Pascale a *, Hoang Thanh Lam a, Rahul Nair a

aIBM Research - Ireland, Technology Campus, Dublin 15, Ireland

Abstract

We present a compact characterization of network-level traffic processes for a dense urban area operating under an adaptive traffic control system. The characterization is based on a state classification scheme that is employed at a detector level, and a state transition model that works with combinations of detectors that are topologically dependent. Jointly, the two models provide a concise but rich representation of traffic processes at the network level. The key insight is the identification of transient states, termed under-utilized (U) states, where network effects such as insufficient downstream capacity are captured. In such states the green time is not fully used. The approach provides the space-time evolution of states across the network, conditional probabilities of upstream traffic states that drive state propagation in the near term, and probabilistic information on congested paths on the network, where paths are described as a sequence of detectors. The paper presents empirical evidence based on the SCATS adaptive control system in Dublin, the insights provided by the proposed approach, and the importance of under-utilized states, which represent as much as 20% of unused capacity along certain corridors in peak periods. The results provide a basis for future network control procedures.

© 2015 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of the Scientific Committee of ISTTT21

Keywords: adaptive traffic control; network traffic processes; urban traffic networks

1. Introduction

Adaptive traffic control systems work by adjusting supply conditions to match locally observed traffic. Supply considerations, typically thought of as green time provided to a particular intersection approach, include a host of factors that influence vehicle throughput. In urban networks and major corridors, throughput can be increased significantly if groups of traffic lights are coordinated to serve major directional flows. Such systems are deployed in many cities across the world and include SCATS (Sims and Dobinson, 1980) and SCOOT (Hunt, Robertson, Bretherton and Royle, 1982), two of several systems proposed in the literature and available in practice.

While control mechanisms used by adaptive systems perform well for under-saturated conditions, over-saturated conditions remain challenging. This is especially the case in general urban networks where small perturbations at critical locations, such as incomplete dissipation of a queue, can quickly move the system towards gridlock. While the role of network effects, the influence of one intersection on its topologically dependent neighbors, has been identified as critical, traffic dynamics of real-world systems in these regimes are not well understood. When the intersections are adaptively regulated, supply-side actions introduce further complexity in establishing a network-level understanding of a traffic system.

* Corresponding Author

E-mail addresses: apascale@ie.ibm.com (Alessandra Pascale), t.l.hoang@ie.ibm.com (Hoang Thanh Lam), rahul.nair@ie.ibm.com (Rahul Nair).

2352-1465 © 2015 Published by Elsevier B.V.

doi:10.1016/j.trpro.2015.07.012

This paper aims to provide empirical evidence of traffic dynamics of networks under adaptive control using real-world data from Dublin, where the SCATS system is deployed. The system is characterized by a novel three-state classification scheme that describes free-flow, congested, and transient states, the last of which are driven by network effects. In over-saturated conditions, such network states are critical because strategies to increase supply (via increased green time) are ineffective when the additional green time cannot be utilized. The classification scheme relies jointly on the degree of saturation, a unit-less measure computed for each approach, and the flow in vehicles per hour. Since data from adaptive control systems represent both the supply and demand sides of traffic processes, classical state variables of flow, density, and speed are not directly applicable and cannot be interpreted in a coherent manner due to discrete changes in supply and feedback mechanisms. The state of the control system is further employed to establish the penalty for coordination, a loss in throughput at a particular location that can be solely attributed to a (sub-optimal) control decision to favour other approaches within the coordinated subsystem.

Additionally, a network-level state transition model that captures the dynamics of the three states, based on a dynamic Bayesian network (DBN), is developed (Friedman, Murphy and Russell, 1998; Neapolitan, 2003; Murphy, 2002). Bayesian networks (BNs) have previously been applied in the literature for traffic state prediction and estimation (Sun, Zhang and Yu, 2006; Pascale and Nicoli, 2011; Castillo, Menendez and Sanchez-Cambronero, 2008), although not in the specific context studied herein, namely the identification of relevant spatio-temporal patterns of traffic states. The DBN is calibrated on the classified data, yielding a compact model that represents important traffic patterns over the network. With the compact DBN model, inferences can be made, such as evaluating the likelihood of congestion propagation along any path in the network and obtaining the conditional distribution of the states at upstream and downstream detectors. Empirical results on a dataset from Dublin show that interesting transient-state patterns can be revealed by querying the learned DBN.

At the macroscopic level, the major thrust in the literature has been towards establishing relationships between the key traffic quantities of flow and density at the city-wide level. Such relationships, which capture network-wide physics, are powerful and pave the way for better network operations. The theory postulates that, independent of demand, network topology and control mechanisms define a macroscopic fundamental diagram (MFD) which systematically relates the average network-wide density and flow (Ardekani and Herman, 1987; Mahmassani, Williams and Herman, 1987; Daganzo, 2007; Geroliminis and Daganzo, 2008; Helbing, 2009). Daganzo (2007) argues for the use of MFD and neighborhood models in adaptive control.

Studies on key properties that would make the MFD applicable (Geroliminis and Sun, 2011; Buisson and Ladier, 2009) have found spatial variability of density to be a critical factor. Mazloumian, Geroliminis and Helbing (2010) assert that spatial inhomogeneity is critical in understanding poor network capacity. Network capacity is non-deterministic and highly variable. Similar conclusions on flux being dependent on topological features were reached by Mendes, Da Silva and Herrmann (2012). Studies that address the problem of homogeneously partitioning cities into neighborhoods such that MFDs are valid have been conducted (Ji and Geroliminis, 2012; Pascale, Mavroeidis and Lam, in review). Helbing (2009) analytically derives macroscopic relations based on kinematic wave theory. Assuming cyclical phases, relationships for three regimes are shown. Recently, Mahmassani, Saberi et al. (2013) conclude, via simulation experiments in Chicago, that networks tend to gridlock in many ways and that network capacity is highly influenced by demand considerations, such as adaptability of drivers and route choice.

Several studies have looked at network processes from the perspective of control engineering (see Papageorgiou et al., 2003, 2007 for reviews). These works aim to provide optimal control at the network level and devise specialized traffic control strategies that mitigate gridlock. Keyvan-Ekbatani, Papageorgiou and Papamichail (2013) demonstrate the use of the MFD within a feedback control mechanism for gating. Gating refers to a strategy whereby traffic is carefully metered into a protected area so as to disperse density, as insights from the MFD suggest. They further advance this to remote gating using fewer observations (Keyvan-Ekbatani, Papageorgiou and Papamichail, 2014) for a perimeter control strategy and show an improvement in throughput. Traffic responsive urban control (TUC) (Dinopoulou, Diakaki and Papageorgiou, 2000) is a strategy that aims to 'minimize and balance the number of vehicles on an urban link'. The model is a linear-quadratic optimal control problem that is demonstrated in Glasgow (Diakaki, Papageorgiou and Aboudolas, 2002; Dinopoulou, Diakaki and Papageorgiou, 2006). Aboudolas, Papageorgiou, Kouvelas and Kosmatopoulos (2010) formulate a quadratic program to balance link queues and report an improvement over TUC. The model explicitly aims to reduce the risk of over-saturation and shows promise for network-wide control in Chania, Greece, for a network with 16 nodes and 60 links. de Gier, Garoni and Rojas (2011) use a cellular automaton to conclude that adaptive control that relies on traffic states from topologically dependent nodes outperforms more traditional adaptive control and no adaptation.

A robust analysis using a cellular automata model by Zhang, Garoni and de Gier (2013) provides more insights into spatial heterogeneity and hysteresis for networks with adaptive control. The key insights are that network heterogeneity increases as network density increases and that control strategies directly impact network performance. Mazloumian, Geroliminis and Helbing (2010) state that inhomogeneity of spatial densities has a critical impact on throughput and can be attributed to spillovers in networks, which have a significant negative impact on throughput. Geroliminis and Skabardonis (2011) leverage the LWR framework to identify queue spillovers. The detectors are mid-block detectors and, using a triangular fundamental diagram, they derive expressions for critical blocking. The main insight in that work is that queue discharge rates are lower than saturation flows when spillovers occur. Via simulation experiments they also relate the existence of spillovers to overall network throughput.

The main motivation for our work arises from the complexity involved in managing adaptive control systems. These systems evolve over time, have a large parameter space that impacts efficiency, and rely on the field experience of traffic engineers to be configured correctly. There is therefore a large role for data-driven insights at the local and network level that can be leveraged to improve services, and potentially to assist mobility managers with tools that automate tasks so that the benefits of adaptive control are realized. As an example, Jhaveri, Perrin and Martin (2003) show that adaptive control systems are not 'plug-and-play' and that validation is critical. In their simulation experiments they show a degradation of 219%, in terms of delay, when the system is deployed without validation, and an improvement of 8% when the system is correctly validated.

The key contributions in this paper are (a) the characterization of the dynamics of arterial networks where the traffic process is governed by both demand and control processes, (b) identification of a transient state that can be attributed solely to network effects, (c) an interpretation scheme of field observations based on a local state classification model, and a network transition model used to establish network relationships. In addition, empirical insights from the SCATS system in Dublin are also presented. While the state classification model is specific to the SCATS system, other insights can be leveraged for other types of control systems where supply-side information is available.

The paper is structured as follows. Section 2 presents the interpretation scheme and the local state classification model. Sections 3 and 4 present the network transition model. Section 5 presents empirical data from the SCATS system in Dublin, followed by a discussion of the results in Section 6.

2. The State Classification Scheme

The SCATS traffic control system works by adapting supply conditions to changing traffic. The supply conditions primarily refer to green time, but also include other parameters such as offset, cycle time, and dynamic coupling/decoupling of intersections that influence vehicle throughput. In coordinated settings, a sub-system is defined as a subset of intersections that should essentially 'move together'. The cycle time for intersections within this sub-system is typically set to be equal, such that major directional flows can be efficiently served.

The system consists of stop-line detectors that report two critical quantities at the end of each cycle (usually 2-3 minutes in rush hours): the number of vehicles, or flow in vehicles per hour, denoted by f, and a unit-less measure of utilization, called degree of saturation, denoted by DS. The control mechanism is driven by DS. An optimal utilization of green time implies DS = 100, where flows are at the optimum for the green time offered. Higher values indicate over-saturated conditions, in which case the system aims to provide additional green time to that approach (or sub-system) such that DS returns to the target level, typically around 90% utilization. SCATS computes DS based on detector occupancy rates as

$$
DS = 100 \cdot \frac{g - (T - t \cdot f)}{g} \tag{1}
$$

where g is the green time for that phase, T is the time for which the sensor is not occupied during the green phase, f is the number of vehicles, and t is the temporal gap between consecutive vehicles at maximum flow. At optimal flow rates DS = 100, since the temporal gaps, t · f in total, are equal to the time T that the detector is not occupied. For larger values the system is considered over-saturated, while for smaller values the system is considered at free-flow. Using this principle, the control mechanism aims to keep DS near 90, a target utilization, by allocating additional green time to approaches with a high DS. In coordinated settings, the cycle time for a sub-system is determined by the detector with the highest DS.
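As a minimal numerical sketch of Equation (1), read here as DS = 100 · (g − (T − t · f)) / g, the function below evaluates DS for one phase. The function name and the sample values are illustrative assumptions and are not taken from SCATS or from the Dublin data.

```python
def degree_of_saturation(g, T, f, t):
    """Evaluate Equation (1) for a single phase.

    g: green time of the phase [s]
    T: time the detector is unoccupied during green [s]
    f: number of vehicles counted during the phase
    t: gap between consecutive vehicles at maximum flow [s]
    """
    if g <= 0:
        raise ValueError("green time must be positive")
    return 100.0 * (g - (T - t * f)) / g


# Hypothetical numbers, chosen only to illustrate the three cases.
at_optimum = degree_of_saturation(g=60, T=20, f=20, t=1.0)  # T equals t*f
under_sat = degree_of_saturation(g=60, T=30, f=20, t=1.0)   # extra idle time
over_sat = degree_of_saturation(g=60, T=10, f=20, t=1.0)    # less idle time
```

When the unoccupied time equals the sum of the ideal gaps (T = t · f) the function returns 100; more idle time pushes DS below 100 and less idle time pushes it above.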

Degree of saturation DS is therefore a local utilization measure that the control mechanism uses to evaluate whether the detector is over-saturated (DS > 100) or not. The expected fundamental relationship between f and DS is similar to the triangular fundamental diagram, in that flows increase with utilization up to a threshold, past which there is a severe degradation of flows due to congestion. In practice, as the data from Dublin described in more detail in Section 5 show, this relationship between the flow f and DS in dense networks is not as systematic, with high levels of dispersion in the under-saturated regime.

This dispersion in under-saturated conditions is demonstrated to be due to two distinct regimes: one in the free-flow state, where utilization of green time is as expected, and the other where vehicular flows are low but DS is high, leading to under-utilization of green time. The second state is transient in nature and can be directly explained by the traffic processes at downstream and upstream intersections. These 'network effects' lead to the paradoxical behavior of high utilization and low flows.

2.1. Interpretation of states

Figure 1 depicts the three states based on data from one detector. A free-flow (F) state is one where flows increase with utilization in proportion to the optimal service rate. This represents normal, under-saturated conditions, in which a phase serves the entire queue. A congested (C) state is detected when DS is higher than the value corresponding to maximum flow while flow is low.

These two states would describe the fundamental diagram between f and DS with the expected triangular shape. As outlined before, a dispersion in the under-saturated regime is actually observed, and the paper is devoted to modelling it. A third, under-utilized (U) state shows low flow values for under-saturated cycles (DS < 100). The consequence of such cycles is that the green time is not completely used over that cycle despite higher than normal occupancy of the detector, as seen by the high but under-saturated DS. We present empirical evidence that the existence of these states is the direct result of network effects, such as downstream capacity on one or more approaches not being available to meet throughput needs. Their existence is related to spillovers that propagate through the network and cause residual queues.

The behavior of the three states can also be described using the detector occupancy profiles shown in Figure 2. SCATS uses stop-line detectors that measure traffic for the entire cycle length. During the red phase r, the detector is completely occupied should there be vehicles waiting to be served. During the green time g it registers the passing of vehicles. An example of the temporal signature of the sensor in the three different states is shown. While the F state shows free-flowing vehicles, with low detector occupancy, the C state shows slow-moving vehicles, with long detector stays, leading to high DS. To achieve high (but under-saturated) DS and low f, the U state has a characteristic detector signature, where a platoon of vehicles is served until there is a spillover, which leads to higher DS.

Despite both having low flows, the critical difference between C and U states is that the congested regime C has low unoccupied detector time T (see Equation 1), while the U has near optimal T for part of the phase, and then is completely occupied, the average of which results in low flows for the cycle, but higher (but under-saturated) utilization DS .

Section 5 demonstrates the critical nature of the U state from an empirical standpoint. The existence of U states is symptomatic of local perturbations that are likely to amplify downstream and move the system to the congested regime, especially if these transient states are observed at critical locations at critical times of the day. When the U state occurs, the flow is lower than would be expected if an F state were observed for the same value of DS, as is clear from Figure 1. The difference between the actual flow and the corresponding value in the F state can be interpreted as a capacity loss, in terms of a portion of green time that is not efficiently used by the vehicles. A similar loss can be computed for the C state if the actual flow is compared with the maximum flow. The U state plays a key role in determining the amount of wasted capacity in the system: although it is transient, analysis of real data from Dublin city shows the U state accounting for as much as 20% of lost capacity.

Fig. 1. Observed flow and degree-of-saturation and the three states for a single detector. (Plot labels: DS [%]; Underutilization of green time - U.)

Fig. 2. Detection of F, U and C state. The detector profile in the three cases is shown.

With this interpretation of the field data, the subsequent state classification scheme implements the three-state model, which captures the under-utilized states at the detector level. Since these states indicate the existence of network effects, the next section presents a network transition model that seeks to leverage this state information to derive congestion propagation paths.

Fig. 3. Calibration of the classification parameters for a turn lane (left) and a main line lane (right)

2.2. The classification algorithm

The classification method uses four detector-specific parameters: $Q_{opt}$, $DS_{max}$, $\alpha_Q$ and $\epsilon_Q$. These parameters are calibrated from historical data as shown below (Section 2.3). The capacity rate, defined as Q = f/DS, is an important measure of the service rate of the traffic light in terms of vehicles per hour of green time (veh/h/g). $Q_{opt}$ is defined as the service rate of a detector in free-flow conditions. $DS_{max}$ is the value of DS that corresponds to the maximum achievable flow. $\epsilon_Q$ is a dispersion parameter for utilization at maximal service flow, and $\alpha_Q$ is a parameter that delineates free-flow service rates from transient ones.

Given the observed flow f, utilization DS, a derived capacity rate Q = f /DS and a set of calibrated parameters, the state of a detector can be classified based on the following expression.

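$$
s(f, DS, Q) =
\begin{cases}
U, & \text{if } f > Q - \alpha_Q \text{ and } Q < Q_{opt} - \epsilon_Q \text{ and } DS < DS_{max},\\
C, & \text{if } f > Q - \alpha_Q \text{ and } Q < Q_{opt} - \epsilon_Q \text{ and } DS > DS_{max},\\
F, & \text{otherwise.}
\end{cases} \tag{2}
$$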

The conditions can be interpreted using Figure 3. The free-flow F state occurs around an optimal capacity rate $Q_{opt}$ with a dispersion of $\epsilon_Q$. For capacity rates that fall below this band ($Q < Q_{opt} - \epsilon_Q$), the state can be either congested C or transient U. High capacity rates (beyond the theoretical limit) can occur in practice. To classify these as free-flow, the second condition restricts C and U states to cases where the capacity rate is lower than the optimal rate ($Q < Q_{opt} - \epsilon_Q$). The last condition distinguishes between congested and transient states using the degree of saturation DS.

This classification scheme is simple and can be calibrated as shown next. The three-state model explicitly captures network effects that arise in arterial networks.
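The rule in Equation (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `DetectorParams`, `alpha_q` and `eps_q` are hypothetical, the two dispersion symbols are garbled in the available text and are read here as one "alpha" and one "epsilon" parameter, the first condition is transcribed as printed, and the parameter values in the example are invented.

```python
from dataclasses import dataclass


@dataclass
class DetectorParams:
    q_opt: float    # free-flow service rate Qopt
    ds_max: float   # DS at the maximum achievable flow
    alpha_q: float  # assumed name for the parameter printed as "aQ"
    eps_q: float    # assumed name for the dispersion parameter


def classify(f, ds, p):
    """Return 'U', 'C' or 'F' following Equation (2) as printed."""
    if ds <= 0:
        raise ValueError("DS must be positive to compute Q = f / DS")
    q = f / ds  # capacity rate
    # First two conditions are shared by the U and C branches.
    low_rate = (f > q - p.alpha_q) and (q < p.q_opt - p.eps_q)
    if low_rate and ds < p.ds_max:
        return "U"
    if low_rate and ds > p.ds_max:
        return "C"
    return "F"  # includes DS exactly equal to DSmax, as in the strict inequalities


# Hypothetical parameters and observations, for illustration only.
params = DetectorParams(q_opt=6.0, ds_max=100.0, alpha_q=0.75, eps_q=0.5)
states = [classify(300, 80, params),   # low Q, under-saturated
          classify(300, 120, params),  # low Q, over-saturated
          classify(480, 80, params)]   # Q at the optimal rate
```

With these invented values the three observations fall into the U, C and F branches respectively.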

2.3. Calibration and implementation

Since real-world data can be noisy and contain outliers, a filtering procedure is employed to exclude outliers, e.g. very high values of the capacity Q observed for low flows, before calibrating the needed parameters. Depending on the traffic intensity at each detector, the parameters can be very different. Figure 3 shows the results of the classification algorithm for two sensors on a turn lane and on the main line.

The parameters can be calibrated from data as follows.

1. Qopt first estimate: A preliminary classification on DS and f can be done using the reconstituted flow $f_{rec}$, another variable reported by SCATS, defined as the expected flow for a certain value of DS. It can be considered only a preliminary estimate of $Q_{opt}$, since it tends to be too optimistic.

2. F state first classification: A preliminary identification of the F state is conducted by selecting the samples where $Q \in ((f_{rec} \pm \epsilon_{rec}) \cdot DS)$, where $\epsilon_{rec}$ is computed from the standard deviation of the reconstituted flow collected by the sensor over a sufficient interval of time. $\epsilon_{rec}$ needs to be large enough to provide a good identification of the F state.

3. Qopt second estimate: Given the samples identified as F, a least squares linear fit is applied to them in order to estimate $Q_{opt}$. The standard deviation of this distribution, $std_Q$, is then computed.

4. $\epsilon_Q$ and $\alpha_Q$ estimates: The dispersion parameters are computed from the standard deviation as $\epsilon_Q = std_Q$ and $\alpha_Q = 1.5 \cdot std_Q$. The calibration of $\epsilon_Q$ and $\alpha_Q$ depends on the shape of the diagrams in Figure 3 and can change for different areas of the city.

5. DSmax estimate: Due to the effect of outliers, taking the single highest value of f is not a reliable way to estimate $f_{max}$. Instead, we compute the average of the highest N values of f in the interval 90 < DS < 130, where N depends on the amount of available data.
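Steps 3 to 5 above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the least squares fit is assumed to be a line through the origin (f ≈ Qopt · DS), the standard deviation is assumed to be the sample standard deviation of Q = f/DS over the F samples, and the function name, argument layout and data are hypothetical.

```python
from statistics import stdev


def calibrate(free_flow, all_samples, n_top=3):
    """Estimate Qopt, the two dispersion parameters and fmax.

    free_flow:   list of (DS, f) pairs pre-classified as F (steps 1 and 2)
    all_samples: list of (DS, f) pairs for the detector, with DS > 0
    n_top:       N, the number of highest flows averaged for fmax
    """
    # Step 3: least squares slope of f against DS, line through the origin.
    q_opt = (sum(ds * f for ds, f in free_flow)
             / sum(ds * ds for ds, _ in free_flow))
    std_q = stdev([f / ds for ds, f in free_flow])

    # Step 4: dispersion parameters as multiples of the standard deviation.
    eps_q = std_q
    alpha_q = 1.5 * std_q

    # Step 5: average the N highest flows observed for 90 < DS < 130.
    window = sorted((f for ds, f in all_samples if 90 < ds < 130), reverse=True)
    top = window[:n_top]
    if not top:
        raise ValueError("no samples with 90 < DS < 130")
    f_max = sum(top) / len(top)

    return {"q_opt": q_opt, "eps_q": eps_q, "alpha_q": alpha_q, "f_max": f_max}


# Invented samples, for illustration only.
ff = [(50, 290), (50, 310), (100, 600)]
everything = ff + [(95, 560), (110, 600), (120, 580), (140, 900)]
fitted = calibrate(ff, everything, n_top=2)
```

The sample at DS = 140 is ignored by step 5 because it lies outside the 90 to 130 window, which is how the averaging limits the influence of outliers.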

This three-state classification model is specific to the SCATS adaptive system, since it leverages stop-line detectors and DS metrics that are unique to SCATS. Mid-block detectors, such as those employed by SCOOT, report occupancy fraction directly [?]. Given detector profiles similar to those shown in Figure 2, additional models to derive similar U states can be constructed directly from the detector signatures, or from model-based analytical expressions as shown in Geroliminis and Skabardonis (2011).

To translate the state and flow values in terms of lost throughput, we can write an expression for a loss function that compares the ideal flow under the current supply conditions to the current flow. This represents the magnitude of the under-utilization for U and C states.

$$
L(f, DS, Q) =
\begin{cases}
Q_{opt} \cdot DS - f, & \text{if } s(f, DS, Q) = U,\\
f_{max} - f, & \text{if } s(f, DS, Q) = C,\\
0, & \text{otherwise.}
\end{cases} \tag{3}
$$

In Section 5, L(f, DS, Q) is used to evaluate the detrimental effect of these states on traffic.
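The loss function can be written directly from its three cases: Qopt · DS − f for a U state, fmax − f for a C state, and zero otherwise. The sketch below assumes the state has already been classified; the function name and the numbers are illustrative only.

```python
def capacity_loss(state, f, ds, q_opt, f_max):
    """Lost throughput for one cycle, given its classified state."""
    if state == "U":
        # Flow expected in free-flow at this DS, minus the observed flow.
        return q_opt * ds - f
    if state == "C":
        # Maximum achievable flow minus the observed flow.
        return f_max - f
    return 0.0


# Hypothetical detector: Qopt = 6, fmax = 600, observed flow 300.
loss_u = capacity_loss("U", f=300, ds=80, q_opt=6.0, f_max=600)
loss_c = capacity_loss("C", f=300, ds=120, q_opt=6.0, f_max=600)
loss_f = capacity_loss("F", f=480, ds=80, q_opt=6.0, f_max=600)
```

Summing this quantity over cycles and detectors gives one possible aggregate of the under-utilization attributed to U and C states.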

3. Network traffic modelling

Given the three-state detector-level model, a network transition model that relates detectors that are topologically dependent is developed. The main aim of the network transition model is to combine the local three-state model with link-level processes to provide a concise characterization of network processes. The key considerations in modelling dynamics over a network are that the anisotropic properties of traffic be considered, that correlations over space and time be captured, and that the spatial decay of statistical dependence be respected, i.e. links located far apart can be treated as independent.

Therefore, a directed spatio-temporal graph model is considered via a dynamic Bayesian network (DBN). Bayesian networks (BN) are directed acyclic graphs whose nodes represent random variables and whose edges represent the conditional dependence among them. For our application, nodes denote the traffic state observed at different time instants and/or spatial locations, and edges denote their probabilistic relations. The computation of the probability distribution of traffic state over the entire network can be decomposed into smaller subnetworks, where each subnetwork defines the spatial relation on adjacent nodes (linked by edges). To capture temporal as well as spatial traffic dynamics, the BN needs to be dynamic (Murphy, 2002). DBNs are defined for this purpose, as they generalize the BN to model the evolution of stationary processes. In particular, in DBNs we make a Markov assumption on the temporal evolution of traffic, i.e. the traffic state at time t is independent of traffic at times earlier than t - 1, given that the traffic state at time t - 1 is known.

Fig. 4. A. Road network: a network graph with connections between downstream and upstream detectors. B. Dynamic Bayesian network created from the network graph.

To construct the DBN, first a spatial network that consists of topologically dependent sensors is constructed. A time-expanded graph of this base network is then considered, where each time-step is a fixed duration of time. In practice, SCATS reports data at the end of each cycle. As a result, the time series for each set of intersections are typically asynchronous. Procedures to regularize this are also needed to train the model.
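The regularization procedure is not specified in the text. One simple possibility, shown here purely as an assumption, is to sample each detector's cycle-end reports on a fixed time grid and carry the most recent reported state forward. The function name, the time unit (seconds) and the data are hypothetical.

```python
from bisect import bisect_right


def regularize(observations, start, end, step):
    """Resample asynchronous reports onto a fixed grid.

    observations: list of (timestamp_seconds, state), sorted by timestamp
    Returns one state per grid time start, start + step, ... (below end),
    taking the latest report at or before that time, or None if no
    report has been received yet.
    """
    times = [ts for ts, _ in observations]
    grid = []
    t = start
    while t < end:
        k = bisect_right(times, t)  # number of reports with timestamp <= t
        grid.append(observations[k - 1][1] if k > 0 else None)
        t += step
    return grid


# Invented cycle-end reports for one detector.
reports = [(100, "F"), (250, "U"), (430, "C")]
series = regularize(reports, start=0, end=600, step=180)
```

Applying the same grid to every detector yields aligned rows that can serve as the time steps of the time-expanded graph; other choices, such as a majority vote within each step, would be equally compatible with the text.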

Any number of higher-order models can be considered, including those with more sophisticated dependence structures, such as n-order temporal dependence. For large-scale implementations, calibration of such models is computationally prohibitive. Interpretation of such models is also complex due to higher-dimensional transition probabilities. The added value in terms of explanatory power is therefore negligible. The Markovian and stationarity assumptions made herein represent a trade-off between explanatory richness and a concise, feasible, and computationally efficient solution.

3.1. DBN definition

Denote the set of detectors as $V = \{v_1, v_2, \dots, v_n\}$. We build the DBN starting from the graph $G(E, V)$ describing the urban network. $E$ is the set of links between detectors, where a link $(v_i, v_j)$ between two detectors $v_i$ and $v_j$ exists if and only if there is direct flow from $v_i$ to $v_j$. Figure 4.A shows a set of 6 detectors $v_1, v_2, v_3, v_4, v_5$ and $v_6$ and the network created by connecting upstream and downstream sensors.

For any detector $v_i$ observed at time $t$ we define the state $v_i^t \in \{F, U, C\}$, obtained from the classification algorithm described in Section 2. $V^t$ is the set of nodes at time $t$, $\{v_1^t, v_2^t, \dots, v_n^t\}$. The DBN considered in this work is represented as a graph $G(E^t, V^t \cup V^{t+1})$, where the set of vertices is the union of the sets of states of detectors at time $t$ and $t+1$. The edges $E^t$ are built as follows:

• for any $v_i \in V$, create a directed edge from $v_i^t$ to $v_i^{t+1}$ to represent the conditional self-dependence of the detector from time $t$ to time $t+1$

• if $(v_i, v_j) \in E$, create a directed edge from $v_j^t$ to $v_i^{t+1}$ to represent the causal relationship between upstream and downstream detectors. As the traffic states defined in this paper depend strictly on the availability of downstream capacity (the presence and severity of spillover effects is a key interpretation of the classification procedure), we assume that states propagate mainly in the backward direction. The edge is therefore created from the downstream sensor to the upstream sensor.

An example of the building process is given in Figure 4. The network graph $G(E, V)$ shown in Figure 4.A is used to derive the DBN $G(E^t, V^t \cup V^{t+1})$ shown in Figure 4.B. For node $v_3^{t+1}$, the set of parents is composed of $v_3^t$ (self-influence) and $v_2^t$, $v_6^t$ (downstream influence).
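The two edge-construction rules can be sketched as a small function that returns, for every detector, the parents at time t of its node at time t + 1. The function name is hypothetical, and the link list below is only partly grounded in the text: the links out of v3 follow from the parent set stated for v3, while the link from v1 to v3 is invented for illustration.

```python
def dbn_parents(detectors, links):
    """Parents at time t of each detector's node at time t + 1.

    links: iterable of (upstream, downstream) pairs, meaning there is
           direct flow from the upstream to the downstream detector.
    """
    # Rule 1: every detector depends on its own previous state.
    parents = {v: {v} for v in detectors}
    # Rule 2: the downstream state at t influences the upstream state at t + 1.
    for upstream, downstream in links:
        parents[upstream].add(downstream)
    return {v: sorted(p) for v, p in parents.items()}


detectors = ["v1", "v2", "v3", "v4", "v5", "v6"]
links = [("v3", "v2"), ("v3", "v6"),  # implied by the stated parents of v3
         ("v1", "v3")]                # hypothetical extra link
parents = dbn_parents(detectors, links)
```

The size of each parent set determines the size of the conditional probability table for that node, since every parent can take three states.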

Network structures can also be learned directly from the data via structure learning. However, in this application, structure learning is not reliable and returns sub-optimal network structures. This is due to the exponential increase in the search space as the number of nodes in the network increases. Searching for the optimal network structure subject to different objective functions, such as the AIC or the BIC score (Neapolitan, 2003), becomes computationally expensive and, for a specific time budget, often returns local optima. Given the physical traffic processes, direct causal relationships between topologically dependent sensors can safely be assumed without the need to learn the network structure.

3.2. Parameter learning

Given the known BN structure, for a node $v_i^t$, the learning phase involves estimating the conditional probability of the traffic state $P(v_i^t \mid pa(v_i^t))$, given the states of its parents $pa(v_i^t)$ from the graph $G$. Denote $D^t = V^t \cup V^{t+1}$ as the states of detectors observed at time $t$ and $t+1$. Assume that the training data contains observations of network states in the time interval $1, 2, \dots, S$. The log-likelihood of the training data is defined as:

$$
L = \sum_{i=1}^{n} \sum_{t=1}^{S} \log P(v_i^t \mid pa(v_i^t), D^t) \tag{4}
$$

The log-likelihood function can be further decomposed into independent components, one for each node in the network. A maximum likelihood estimation procedure is employed to learn $P(v_i^t \mid pa(v_i^t))$ for each node in the network (Neapolitan, 2003).
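Because the likelihood decomposes by node, each conditional probability table can be estimated on its own. For complete discrete data, the maximum likelihood estimate is the relative frequency of each child state under each observed parent configuration. The sketch below illustrates that counting step in plain Python; it is not the BNT toolbox routine used in the paper, and the function name and data are hypothetical.

```python
from collections import Counter, defaultdict

STATES = ("F", "U", "C")


def learn_cpt(parent_rows, child_next):
    """Relative-frequency estimate of P(child at t+1 | parents at t).

    parent_rows: list of tuples, the parent states at time t
    child_next:  list of child states at time t + 1 (same length)
    Returns {parent configuration: {state: probability}} for the
    configurations that occur in the data.
    """
    counts = defaultdict(Counter)
    for config, state in zip(parent_rows, child_next):
        counts[tuple(config)][state] += 1
    cpt = {}
    for config, c in counts.items():
        total = sum(c.values())
        cpt[config] = {s: c[s] / total for s in STATES}
    return cpt


# Invented training rows: (own state, downstream state) at time t.
rows = [("F", "U"), ("F", "U"), ("F", "U"), ("C", "C")]
nxt = ["U", "U", "F", "C"]
cpt = learn_cpt(rows, nxt)
```

Parent configurations that never occur in the training data receive no estimate in this sketch, which is one reason the amount of data matters when parent sets are large.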

This procedure relies on a complete time series of traffic states for each node of the network. The state classification scheme output is used as input to this learning routine. In the case of Dublin, roughly three months of data were employed. We use the BNT Matlab toolbox1 to learn the network parameters.

In order to use the BNT toolbox, we first created a network structure as defined in Section 3.1. The network was constructed manually for the location of interest by visualizing the positions of the sensors on the map. Automatic network construction is left as future work. Subsequently, we collected historical data and created a table in which each column corresponds to a node in the DBN associated with a sensor, and each row holds the classified states of the sensors at a given timestamp. The network structure and the data table served as input for learning the parameters of the DBN with the BNT toolbox. Recall that each node in the DBN has three states. Depending on the spatial configuration of the network, the number of transition probabilities that need to be estimated is fairly limited (in the case of Dublin the maximum cardinality of the parent set was 16). The model can be trained in a reasonable amount of time on reasonable hardware.

Based on our empirical study, the behavior of traffic varies most across three different time intervals: 0:00-7:00 and 20:00-24:00, when the traffic is very sparse; 7:00-10:00 and 17:00-20:00, when the traffic is heaviest; and 10:00-17:00, when the traffic is moderate. We therefore divided the day into these three categories and learned a different DBN for each of them. Our key assumption is that within an interval of interest the data is stationary, i.e. the conditional distribution $P(v_i^t \mid pa(v_i^t))$ is independent of t. In our framework, it is possible to consider smaller interval splits in which the stationarity assumption would hold with stronger confidence. However, since the number of rows in the training data table is proportional to the length of the split intervals, more data is needed to learn a reliable DBN when the time interval length is set to a small value.
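The three-way split of the day can be sketched as a simple lookup. The half-open boundaries (for example, 7:00 belongs to the peak regime) and the names `regime` and `split_by_regime` are assumptions made for this illustration.

```python
def regime(hour):
    """Map an hour of day (0-23) to one of the three traffic regimes."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if 7 <= hour < 10 or 17 <= hour < 20:
        return "peak"
    if 10 <= hour < 17:
        return "moderate"
    return "sparse"


def split_by_regime(records):
    """Group (hour, row) pairs so that one DBN can be trained per regime."""
    groups = {"sparse": [], "peak": [], "moderate": []}
    for hour, row in records:
        groups[regime(hour)].append(row)
    return groups
```

When transitions are counted within a regime, rows on either side of a gap (for example 9:59 and 17:00 in the peak group) should not be paired as consecutive timestamps.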

4. Inference and Pattern Discovery

Armed with the local state classification and the network transition probabilities, we now focus on the evaluation of interesting traffic state patterns over the network. The basic premise is that state and transition probabilities can be queried in a variety of ways. Here we present two specific types of queries, related to link and path processes where the U state is critical. The U state is closely related to network effects: a lack of capacity downstream causes a U state upstream that propagates over the network. The analysis of propagation over links or sequences of links, i.e. paths, is therefore important for capturing these network effects. Specifically, we are interested in detecting the most critical links or paths in the network, where strong propagation of congested states (C) or transient states (U) is observed.

1 https://code.google.com/p/bnt/

Having learned the DBN, we consider different types of inference to discover interesting patterns from the data. We are interested in two types of patterns. The first concerns how likely congestion is to propagate along a link of interest in the network. The second relates to the likelihood of congestion propagation along a path of arbitrary length. Although the latter type of pattern is more general than the former, we consider the two separately because the latter requires considerably more work to query from the DBN. The following subsections discuss these patterns in detail, taking patterns of state C as an example. The analysis is not limited to the propagation of C: patterns of U states, or of a combination of the two, can be discovered using the same procedure.

4.1. Congestion propagation pattern on links

Given a link $(v_i, v_j)$, where $v_i$ is the upstream sensor and $v_j$ is the downstream sensor, we are interested in the likelihood of congestion propagating from $v_j$ to $v_i$ at time t + 1 given that $v_j$ is congested at time t. This likelihood can be formally represented as the probability of $v_i^{t+1} = C$ given that $v_j^t = C$ and all other parents of $v_i^{t+1}$ are in the free-flow state at time t.

For example, consider the link $(v_3, v_6)$ in Figure 4.A, and assume that congestion occurs at location $v_6$ at time t while no congestion is observed at location $v_2$ or $v_3$ (the other parents of $v_3^{t+1}$) at time t. The likelihood of congestion propagation along the link $(v_3, v_6)$ can be calculated as $\Pr(v_3^{t+1} = C \mid v_6^t = C, v_3^t = F, v_2^t = F)$. This conditional probability can be queried directly from the DBN.
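A link query of this kind amounts to a single lookup in the conditional probability table of the upstream node. The sketch below assumes the table is stored as a Python dictionary keyed by parent states; the function name `link_propagation` and all probability values are made up for illustration and are not results from the study.

```python
def link_propagation(cpt, parents, downstream, state="C"):
    """P(child at t+1 = state | downstream = state, every other parent = F)."""
    key = tuple(state if p == downstream else "F" for p in parents)
    return cpt[key][state]


# Hypothetical table for v3 with parents (v2, v3, v6); numbers are invented.
parents_v3 = ("v2", "v3", "v6")
cpt_v3 = {
    ("F", "F", "C"): {"F": 0.6, "U": 0.1, "C": 0.3},
    ("F", "F", "F"): {"F": 0.9, "U": 0.05, "C": 0.05},
}
p = link_propagation(cpt_v3, parents_v3, downstream="v6")
```

Passing `state="U"` would query the propagation of the transient state in the same way, provided the table contains the corresponding parent configuration.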

In the experiments, we used this method to query the congestion propagation likelihood for several links of interest with the same starting point. In doing so we can discover interesting patterns and gain insight into why some links are more vulnerable to congestion propagation than others. These patterns can be useful in exploratory analysis of network behavior at the link level.

4.2. Congestion propagation pattern on paths

We define the likelihood of congestion propagation on paths with arbitrary length. A path in this context is defined as a sequence of topologically connected detectors. To simplify the discussion, we will focus on the case when the path length is equal to 2. The discussion can be easily extended to the general case for paths of arbitrary length.

Given a path $(v_i, v_j, v_k)$, we define the likelihood of congestion propagation along this path as the joint probability of two events:

• Event 1 ($E_1$): $v_j^{t+1} = C$, $v_k^t = C$ and $P = F$ for any $P \in pa(v_j^{t+1})$ with $P \neq v_k^t$

• Event 2 ($E_2$): $v_i^{t+2} = C$, $v_j^{t+1} = C$ and $P = F$ for any $P \in pa(v_i^{t+2})$ with $P \neq v_j^{t+1}$

The former event corresponds to the congestion propagation likelihood along the link $(v_j, v_k)$ at time t and the latter event corresponds to the congestion propagation likelihood along the link $(v_i, v_j)$ at time t + 1. Calculating the likelihood of congestion propagation along a path of length 2 requires knowledge about events happening at three consecutive timestamps t, t + 1 and t + 2. Because the DBN described in subsection 3.1 only represents knowledge about events happening at two consecutive timestamps, it is not straightforward to evaluate this likelihood using the learned DBN. However, the following theorem shows that the joint probability of events 1 and 2 can be decomposed into components that can each be calculated by querying the DBN described in subsection 3.1.

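Theorem 1. The congestion propagation likelihood can be evaluated as follows:

$$
\begin{aligned}
\Pr(E_1, E_2) ={} & \Pr(v_i^{t+2} = C \mid v_j^{t+1} = C;\ P = F,\ P \in pa(v_i^{t+2}),\ P \neq v_j^{t+1}) \\
& \times \Pr(v_j^{t+1} = C \mid v_k^t = C;\ P = F,\ P \in pa(v_j^{t+1}),\ P \neq v_k^t) \\
& \times \Pr(v_k^t = C;\ P = F,\ P \in pa(v_j^{t+1}),\ P \neq v_k^t)
\end{aligned}
\tag{6-7}
$$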

Proof. We rewrite $\Pr(E_1, E_2)$ in the following format:

$$
\begin{aligned}
\Pr(E_1, E_2) ={} & \Pr(v_i^{t+2} = C;\ v_j^{t+1} = C;\ v_k^t = C; \\
& \quad P = F,\ P \in pa(v_j^{t+1}),\ P \neq v_k^t;\ P = F,\ P \in pa(v_i^{t+2}),\ P \neq v_j^{t+1}) \\
={} & \Pr(v_i^{t+2} = C \mid v_j^{t+1} = C;\ P = F,\ P \in pa(v_i^{t+2}),\ P \neq v_j^{t+1}) \\
& \times \Pr(v_j^{t+1} = C \mid v_k^t = C;\ P = F,\ P \in pa(v_j^{t+1}),\ P \neq v_k^t) \\
& \times \Pr(v_k^t = C;\ P = F,\ P \in pa(v_j^{t+1}),\ P \neq v_k^t)
\end{aligned}
\tag{8-12}
$$

The second equality is due to the Markov property of Bayesian networks. The theorem is proved.

The three components in Theorem 1 can be evaluated by querying the DBN thanks to the stationarity assumption made in this paper. For example, the congestion propagation likelihood on the path $(v_4, v_5, v_6)$ in Figure 4 can be calculated by Theorem 1 as $\Pr(v_4^{t+2} = C \mid v_5^{t+1} = C, v_4^{t+1} = F) \times \Pr(v_5^{t+1} = C \mid v_6^t = C, v_5^t = F) \times \Pr(v_6^t = C, v_5^t = F)$. The first two components in that formula correspond to the likelihood of congestion propagation along the links $(v_4, v_5)$ and $(v_5, v_6)$ respectively.
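The three-factor product can be sketched in Python as follows. This is an illustration rather than the procedure used in the study: the two link probabilities are supplied as plain numbers, and the third factor is approximated by the empirical frequency of the starting configuration in the data table, which is an assumption of this sketch. The helper names, the path starting at `v1` and all numeric values are hypothetical.

```python
def config_frequency(rows, config):
    """Empirical probability that the sensors in `config` jointly take the given states."""
    hits = sum(all(row[s] == v for s, v in config.items()) for row in rows)
    return hits / len(rows)


def path_likelihood(p_up, p_down, p_start):
    """Product of the three factors of Theorem 1 for a path of length 2."""
    return p_up * p_down * p_start


def rank_paths(scores):
    """Return the paths ordered from most to least likely."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy data table for the two sensors involved in the starting configuration.
rows = [
    {"v5": "F", "v6": "C"},
    {"v5": "C", "v6": "C"},
    {"v5": "F", "v6": "F"},
    {"v5": "F", "v6": "F"},
]
p_start = config_frequency(rows, {"v6": "C", "v5": "F"})

# Invented link probabilities for two candidate paths ending at v6.
scores = {
    ("v4", "v5", "v6"): path_likelihood(0.4, 0.5, p_start),
    ("v1", "v5", "v6"): path_likelihood(0.1, 0.5, p_start),
}
ranking = rank_paths(scores)
```

Because the result is a joint probability, values for different paths are directly comparable, which is what allows paths across the whole network to be ranked against each other.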

It is important to notice that the likelihood of congestion propagation along a path is defined slightly differently from that on a link: the former is a joint distribution, the latter a conditional one. The difference reflects different purposes. For links, we are interested in comparing the likelihood of congestion propagation between links with the same starting point, while for paths we are interested in comparing the likelihood between any paths on the network. The computed likelihoods are used to rank the paths (or links) in order to identify the most critical ones. If spillover or capacity loss is observed at specific locations on the network, the procedure presented in this section can be used to find the path most likely to be causing it. Experimental findings on critical paths and links in the Dublin city centre are presented in the next section.

A useful feature of the BNT toolbox is that it allows us to efficiently calculate the conditional distribution of a state given the observed states of any set of nodes in the DBN. This is usually done by giving an evidence set as input to the enter_evidence function and computing the conditional distribution with the marginal_nodes function of the BNT toolbox². Based on this feature we can evaluate the congestion propagation likelihood $\Pr(E_1, E_2)$ in three steps, as shown in Theorem 1.

5. Empirical evidence from Dublin

The SCATS traffic control system is deployed at over 700 intersections in Dublin, Ireland. Real-time data from the strategic monitor is used in this analysis; it provides information on phases, flows, degree-of-saturation and other measures at the end of each cycle. The system is linked and has defined sub-systems within the core downtown area, where we focus on 125 detectors in two sub-systems. For calibration and training purposes, weekday data for each of the 125 detectors spanning three months, from February 2013 to April 2013, is employed. The study area and SCATS layout are shown in Figure 5.

The state classification and network transition models are implemented on this sub-network and in this section we discuss the key characteristics observed. The critical nature of the transient U state is highlighted via the capacity loss function.

5.1. Classification results

The classification parameters are calibrated over the three months of data. The real-time procedure is then tested over a set of 10 days in May 2013. Figure 6 shows a snapshot of the classification results in the afternoon rush hour period, at 5:39 pm on the 10th of May. Each dot is a sensor and its colour identifies its state, as in Figure 3. While a subset of detectors is in free flow, areas where a capacity loss is observed can be easily identified over the network. The presence of the U state helps in the detection of corridors and links where the green time is not being completely exploited and spillover effects are observed.

2 http://bnt.googlecode.com/svn/trunk/docs/usage.htmlmarginal

Fig. 5. Dublin network showing SCATS intersections and a sub-network

Fig. 6. Snapshot of the states identification from the 10th of May at 5:39 pm.

Figure 7 shows the time series of state classification over the Eastbound Quays corridor on a typical weekday. Several features of congestion evolution along the Quays are well represented in this picture. At the last two sensors, 17-262-2 and 197-120-1, congested traffic is observed in the afternoon with a mix of C and U states; in particular, some of the C states on 17-262-1 become U when observed on 197-120-1. Identifying U states is very important in this case, as they represent congestion even though under-saturated values of DS are observed; failing to classify them as congested would mean missing a capacity loss at that location. Missing this loss can also lead to wasted green time. This phenomenon can be observed at sensor 193-108-2 by looking at the joint behavior of states and green time during the morning rush hours. In this case, even though more green time is given when congestion is observed, recovery to an F state does not take place: the system stays in the U state and the additional green time is wasted.

Fig. 7. Typical weekday along the Eastbound Quays corridor showing the traffic states along a path and supply conditions per cycle (direction of travel is top to bottom)

5.2. Critical paths identification and loss function

The identification of critical propagation paths over the network is crucial when the network effects of C and U states are analyzed. The method presented in Section 4 is used here to detect the critical paths over the network. We queried the likelihood of congestion propagation along different paths of length two in various locations of the network and carried out exploratory analysis on the results.

Fig. 8. Critical paths identification over the network. Red lines indicate sensors included along congested paths while black dots indicate other sensors.

Fig. 9. Capacity loss observed over the sensors along the paths identified in fig. 8.

One of the most interesting results is shown in Fig. 8. The likelihood of congestion propagation of every path starting from the six upstream sensors located at intersection 193, on the two approaches 135 and 108, was queried from the learned DBN. These paths lead to the three downstream sensors at intersection 26 and three other sensors at intersection 197. Among these paths, we identified 6 partially overlapping critical paths with the highest congestion propagation likelihood for both C and U states (highlighted as red lines in the figure). These paths start from sensors 193-108-0, 193-108-2 and 193-135-2. Interestingly, all of them end at a common detector, 197-120-0, located on a left-turn lane leading to O'Connell Street, one of the main roads in Dublin city centre. This detector is therefore identified as the most critical source of backward spillover for that area of the network.

Fig. 10. Capacity loss along the Eastbound Quays corridor showing the loss function with and without the U states (direction of travel along y-axis top to bottom): (a) AM peak period without loss due to U state; (b) PM peak period without loss due to U state; (c) AM peak period with U states; (d) PM peak period with U states.

Another interesting pattern is that the congestion propagation likelihood along the left lane of Ormond Quay Lower, corresponding to the path (193-108-0, 26-110-0, 197-120-0), is some orders of magnitude lower than that of the other paths. Looking more closely at this location, we observe that the left lane of Ormond Quay Lower is reserved for buses. Private vehicles that want to turn left onto O'Connell Street therefore use the other two lanes, which then become critical for traffic congestion. These patterns are automatically detected by the algorithm and give insight that might be used for planning operations on the traffic control system.

Figure 9 shows the capacity loss observed at the sensors along the paths (highlighted in red in Fig. 8). The highest under-utilization of capacity occurs during the morning hours, and the contributions of U and C states to it are almost equal. As a consequence, U and C states have a comparable impact on the performance of the control system in terms of capacity loss. Correct detection and evaluation of the effect of the U state is therefore fundamental when a specific traffic control strategy needs to be evaluated. In particular, for sensor 193-135-2 the U state contributes more to the capacity loss than the C state; a U state at that location is therefore more critical than a C state, as more spillovers are observed propagating over it.

Figure 10 shows the capacity loss computed with and without the U states over the morning and afternoon periods along the Eastbound Quays corridor. The most important insight from the figure is that including the loss due to U states gives a more complete picture of capacity loss profiles than considering only the C state. This is evident in the intervals 8:45-9:00 am and 5:30-7:30 pm. In particular, at 5:30 pm in Fig. 10(b), where the loss is computed from C states only, the major effect appears to be local, around Inns Quay. If we also exploit the information coming from the U states, see Fig. 10(d), the real extent of this phenomenon can be distinguished: a corridor of loss from Bachelor Walk to Inns Quay.

5.3. Interpretation of the states for coordinated control

U and C states have a relevant interpretation in cases of coordinated control. The control strategies of SCATS are decided and implemented over subsystems. Each subsystem is a group of intersections that share common control decisions (cycle length, plan, and coordinated offsets). At each cycle, the approach with the highest DS is defined as the 'master' because it leads the final decision on the control strategy, particularly on the cycle length. SCATS provides more green time to the approach with the highest DS. This strategy could potentially lead to large under-utilization of capacity if the approach is impacted by spillovers and the desired throughput is not realized. These situations can be easily identified by the method presented in this paper.

Figure 12 shows a set of intersections belonging to the same subsystem. At this time instant, 5:49 pm, the detector marked with the black circle is the one leading the subsystem because it has the highest value of DS. It is located on one of the left-turn lanes and its high DS is due to a spillover that starts in the top part of the road and propagates backward. The high value of DS would indicate that this sensor needs more green time in order to serve the demand. However, even if more green time is given to this approach (or to the corridor), it is not used efficiently, as the spillover, identified as the sequence of U and C states upstream, prevents the normal flow of cars during green time. This phenomenon is a direct consequence of the fact that the sensor is not the source of the spillover but is affected by it. A portion of the green time is therefore wasted on that approach. Moreover, the figure shows with yellow arrows other competing approaches that could use the green time more efficiently. On these lanes the measured DS is still high but the state is F, meaning that there are no spillovers and that the green time is currently fully utilized.
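The check described in this example can be sketched as a small Python routine. It is a hypothetical illustration, not part of SCATS or of the method's implementation: it picks the detector with the highest DS in a subsystem, flags it when its classified state is U or C, and lists the detectors in the F state as candidate approaches for the green time. Detector names and DS values are invented.

```python
def review_master(ds_by_detector, state_by_detector):
    """Identify the leading detector and check whether a spillover affects it.

    ds_by_detector    -- dict mapping detector name -> degree of saturation (DS)
    state_by_detector -- dict mapping detector name -> state in {"F", "U", "C"}
    Returns (master, affected, alternatives).
    """
    # The 'master' is the detector with the highest DS in the subsystem.
    master = max(ds_by_detector, key=ds_by_detector.get)
    # A master in the U or C state may not be able to use additional green time.
    affected = state_by_detector[master] in ("U", "C")
    # Detectors in free flow, ordered by decreasing DS, use green time fully.
    alternatives = sorted(
        (d for d in ds_by_detector
         if d != master and state_by_detector[d] == "F"),
        key=ds_by_detector.get,
        reverse=True,
    )
    return master, affected, alternatives


# Invented snapshot of one subsystem at a single cycle.
ds = {"d1": 1.10, "d2": 0.95, "d3": 0.90}
states = {"d1": "U", "d2": "F", "d3": "C"}
master, affected, alternatives = review_master(ds, states)
```

In this toy snapshot the leading detector is in the U state, so additional green time assigned to it is a candidate for being wasted, while `d2` remains in free flow with a high DS.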

5.4. Throughput over the network

We now focus on sub-network throughput and aim to relate the vehicle exit rates from the sub-network to sub-network density. To do this, a subset of 25 detectors on the sub-network that are exit sensors is identified, and throughput is measured along with network-level loss. Figure 11(a) shows the relationship between the network-level state (as described by the loss function) and the throughput. Throughput degrades as the sub-network loss function increases. Recall that the loss function is related to the existence of U and C states; this therefore shows the empirical relationship between the existence of transient and congested states and network-level throughput. Figure 11(b) shows the temporal variation in the daily throughput profile for all weekdays of February 2013.

Fig. 11. Network level throughput: (a) network-wide loss (veh/hour) versus throughput; (b) temporal profile of network-wide throughput by hour of day.

The severity of loss in throughput is less than that reported in the literature via simulation, see, for example (Geroliminis and Skabardonis, 2011). This is most likely because highly congested, gridlock-like conditions were not observed within this network during this period. The impact of spatial heterogeneity of congestion on throughput was also studied, but no clear conclusions could be drawn from the results. Several measures of spatial heterogeneity were considered, but the throughput had very high dispersion, mainly on account of different phase times at exit nodes.

Fig. 12. Effects of coordination in control analyzed using U and C states.

6. Discussion

A state classification and transition model is presented for a dense network controlled by an adaptive control system. Empirical data from Dublin is used to demonstrate the existence of transient U states that occur solely due to network effects. While such states are not classified by the control system as over-saturated, they represent major capacity losses along arterial corridors.

The main insight presented in this paper is based on an interpretation scheme that directly leverages degree-of-saturation and flow measurements to detect the three traffic states. In addition, a compact description of the network-level effects of the states is provided, and a method for the automatic identification of critical links and paths on the network is proposed.

Real-world data from Dublin, where the SCATS system is deployed, has been used to calibrate and validate the classification procedure and the path discovery. In particular, the proposed method has been shown to provide relevant insights into the detrimental effects of the U and C states at specific locations on the network. The capacity loss function has been used to quantify the consequences of the U state over corridors. Moreover, when the control strategy runs over coordinated intersections, the three-state model provides useful instruments for identifying wasted green time due to a non-optimal configuration of the subsystems.

In its current form, the classification model is specific to the SCATS system, but similar data-driven approaches can be considered if detector occupancy signatures are available. Preliminary examination of SCOOT data from London (Pascale, Mavroeidis and Lam, in review) is encouraging. Alternative schemes for state detection need to be developed for such cases. In the network transition models, the stationarity assumptions made during the learning phase can also be relaxed. A limitation of the current study is that special turning restrictions and mid-block friction, which may cause lower discharge rates, are not explicitly accounted for. These aspects will be the focus of future studies.

The main objective of the work presented in this paper is to help traffic engineers gain insight into the performance of adaptive control systems implemented in the real world. These systems have a high level of complexity, owing to the large number of variables that change dynamically over time and the large number of configuration parameters used in practice. Moreover, they need to be calibrated and validated in cities that are evolving. Our methods contribute towards pattern detection of congestion and are a starting point for improving services or planning new control strategies.

Acknowledgements

We thank Brendan O'Brien and Dublin City Council for facilitation and useful discussions, without implication of agreement or endorsement of the findings. Contributions from Dominik Dahlem related to the SCATS data are gratefully acknowledged. A subset of the data used in this paper is available via Dublin's open data initiative at http://www.dublinked.ie/.

References

Aboudolas, K, M Papageorgiou, A Kouvelas and E Kosmatopoulos. 2010. "A rolling-horizon quadratic-programming approach to the signal control problem in large-scale congested urban road networks." Transportation Research Part C: Emerging Technologies 18(5):680-694.

Ardekani, Siamak and Robert Herman. 1987. "Urban network-wide traffic variables and their relations." Transportation Science 21(1):1-16.

Buisson, Christine and Cyril Ladier. 2009. "Exploring the impact of homogeneity of traffic measurements on the existence of macroscopic fundamental diagrams." Transportation Research Record: Journal of the Transportation Research Board 2124(1):127-136.

Castillo, Enrique, Jose Maria Menendez and Santos Sanchez-Cambronero. 2008. "Predicting traffic flow using Bayesian networks." Transportation Research Part B: Methodological 42(5):482-509.

Daganzo, Carlos F. 2007. "Urban gridlock: Macroscopic modeling and mitigation approaches." Transportation Research Part B: Methodological 41(1):49-62.

de Gier, Jan, Timothy M Garoni and Omar Rojas. 2011. "Traffic flow on realistic road networks with adaptive traffic lights." Journal of Statistical Mechanics: Theory and Experiment 2011(04):P04008.

Diakaki, Christina, Markos Papageorgiou and Kostas Aboudolas. 2002. "A multivariable regulator approach to traffic-responsive network-wide signal control." Control Engineering Practice 10(2):183-195.

Dinopoulou, Vaya, Christina Diakaki and Markos Papageorgiou. 2000. Simulation investigations of the traffic-responsive urban control strategy TUC. In Intelligent Transportation Systems, 2000. Proceedings. 2000 IEEE. IEEE pp. 458-463.

Dinopoulou, Vaya, Christina Diakaki and Markos Papageorgiou. 2006. "Applications of the urban traffic control strategy TUC." European Journal of Operational Research 175(3):1652-1665.

Friedman, Nir, Kevin Murphy and Stuart Russell. 1998. Learning the structure of dynamic probabilistic networks. In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc. pp. 139-147.

Geroliminis, Nikolas and Alexander Skabardonis. 2011. "Identification and analysis of queue spillovers in city street networks." Intelligent Transportation Systems, IEEE Transactions on 12(4):1107-1115.

Geroliminis, Nikolas and Carlos F Daganzo. 2008. "Existence of urban-scale macroscopic fundamental diagrams: Some experimental findings." Transportation Research Part B: Methodological 42(9):759-770.

Geroliminis, Nikolas and Jie Sun. 2011. "Properties of a well-defined macroscopic fundamental diagram for urban traffic." Transportation Research Part B: Methodological 45(3):605-617.

Helbing, Dirk. 2009. "Derivation of a fundamental diagram for urban traffic flow." The European Physical Journal B-Condensed Matter and Complex Systems 70(2):229-241.

Hunt, PB, DI Robertson, RD Bretherton and MC Royle. 1982. "The SCOOT on-line traffic signal optimisation technique." Traffic Engineering & Control 23(4).

Jhaveri, Chintan S, Joseph Perrin and Peter Martin. 2003. Scoot Adaptive Signal Control: An Evaluation of its Effectiveness over Range of Congestion Intensities. In Transportation Research Board 2003 Annual Meeting, Compendium of Papers.

Ji, Yuxuan and Nikolas Geroliminis. 2012. "On the spatial partitioning of urban transportation networks." Transportation Research Part B: Methodological 46(10):1639-1656.

Keyvan-Ekbatani, Mehdi, Markos Papageorgiou and Ioannis Papamichail. 2013. "Urban congestion gating control based on reduced operational network fundamental diagrams." Transportation Research Part C: Emerging Technologies 33:74-87.

Keyvan-Ekbatani, Mehdi, Markos Papageorgiou and Ioannis Papamichail. 2014. "Perimeter Traffic Control via Remote Feedback Gating." Procedia-Social and Behavioral Sciences 111:645-653.

Mahmassani, H, JC Williams and R Herman. 1987. Performance of urban traffic networks. In Transportation and Traffic Theory (Proceedings of the Tenth International Symposium on Transportation and Traffic Theory, Cambridge, Massachusetts), NH Gartner, NHM Wilson, editors, Elsevier.

Mahmassani, Hani S, Meead Saberi et al. 2013. "Urban network gridlock: Theory, characteristics, and dynamics." Procedia-Social and Behavioral Sciences 80:79-98.

Mazloumian, Amin, Nikolas Geroliminis and Dirk Helbing. 2010. "The spatial variability of vehicle densities as determinant of urban network capacity." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 368(1928):4627-4647.

Mendes, GA, LR Da Silva and HJ Herrmann. 2012. "Traffic gridlock on complex networks." Physica A: Statistical Mechanics and its Applications 391(1):362-370.

Murphy, Kevin. 2002. Dynamic Bayesian Networks: Representation, Inference and Learning PhD thesis UC Berkeley, Computer Science Division.

Neapolitan, Richard E. 2003. Learning Bayesian Networks. Upper Saddle River, NJ, USA: Prentice-Hall, Inc.

Pascale, A and M Nicoli. 2011. Adaptive Bayesian network for traffic flow prediction. In Statistical Signal Processing Workshop (SSP), 2011 IEEE. IEEE pp. 177-180.

Pascale, Alessandra, Dimitrios Mavroeidis and Hoang Thanh Lam. in review. "Spatio-temporal clustering of urban networks: a real case scenario in London.".

Sims, Arthur G and KW Dobinson. 1980. "The Sydney coordinated adaptive traffic (SCAT) system philosophy and benefits." Vehicular Technology, IEEE Transactions on 29(2):130-137.

Sun, Shiliang, Changshui Zhang and Guoqiang Yu. 2006. "A Bayesian network approach to traffic flow forecasting." Intelligent Transportation Systems, IEEE Transactions on 7(1):124-132.

Zhang, Lele, Timothy M Garoni and Jan de Gier. 2013. "A comparative study of Macroscopic Fundamental Diagrams of arterial road networks governed by adaptive traffic signal systems." Transportation Research Part B: Methodological 49:1-23.