Scholarly article on topic 'Application of plume analysis to build land use regression models from mobile sampling to improve model transferability'

Application of plume analysis to build land use regression models from mobile sampling to improve model transferability Academic research paper on "Earth and related environmental sciences"

CC BY-NC-ND
Academic journal
Atmospheric Environment
Keywords
{"Air pollution" / "Black carbon" / "Exposure assessment" / "Geographic information systems" / "Land use regression"}

Abstract of research paper on Earth and related environmental sciences, author of scientific article — Yi Tan, Timothy R. Dallmann, Allen L. Robinson, Albert A. Presto

Abstract Mobile monitoring of traffic-related air pollutants was conducted in Pittsburgh, PA. The data show substantial spatial variability of particle-bound polycyclic aromatic hydrocarbons (PB-PAH) and black carbon (BC). This variability is driven in large part by pollutant plumes from high emitting vehicles (HEVs). These plumes contribute a disproportionately large fraction of the near-road exposures of PB-PAH and BC. We developed novel statistical models to describe the spatial patterns of PB-PAH and BC exposures. The models consist of two layers: a plume layer to describe the contributions of high emitting vehicles using a near-roadway kernel, and an urban-background layer that predicts the spatial pattern of other sources using land use regression. This approach leverages unique information content of highly time resolved mobile monitoring data and provides insight into source contributions. The two-layer model describes 76% of observed PB-PAH variation and 61% of BC variation. On average, HEVs contribute at least 32% of outdoor PB-PAH and 14% of BC. The transferability of the models was examined using measurements from 36 hold-out validation sites. The plume layer performed well at validation sites, but the background layer showed little transferability due to the large difference in land use between the city and outer suburbs.


ELSEVIER

Contents lists available at ScienceDirect

Atmospheric Environment

journal homepage: www.elsevier.com/locate/atmosenv

Application of plume analysis to build land use regression models from mobile sampling to improve model transferability

Yi Tan, Timothy R. Dallmann, Allen L. Robinson, Albert A. Presto*

Center for Atmospheric Particle Studies, Carnegie Mellon University, Pittsburgh, PA, 15213, United States

HIGHLIGHTS

• We observed spatial gradients of polycyclic aromatic hydrocarbons (PB-PAH) and BC.

• BC and PB-PAH variability is driven by plumes from high emitting vehicles.

• Two-layer models (plume + background) were developed to describe spatial patterns.

• The model plume layer is transferable to an independent holdout dataset.

ARTICLE INFO

Article history: Received 5 January 2016; received in revised form 11 March 2016; accepted 12 March 2016; available online 15 March 2016.

© 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Exposure to traffic-related air pollutants is linked with adverse health effects including childhood cancer and respiratory and cardiovascular diseases (Brugge et al., 2007; Heck et al., 2013). The spatial variability of traffic-related pollutants, such as black carbon (BC) and particle-bound polycyclic aromatic hydrocarbons (PB-PAH), is substantial in urban areas (Clougherty et al., 2013; Tan et al., 2014a). However, this large spatial variation cannot be characterized by sparse monitoring systems such as the U.S. EPA Air Quality System (AQS), which are designed to monitor regional compliance with the National Ambient Air Quality Standards. In many cases, markers of traffic-related emissions such as BC or PB-PAH are not widely monitored.

* Corresponding author. E-mail address: apresto@andrew.cmu.edu (A.A. Presto).

http://dx.doi.org/10.1016/j.atmosenv.2016.03.032

1352-2310/© 2016 The Authors. Published by Elsevier Ltd.

Pollutant mapping studies aim to measure pollutants at high spatial density using either saturation sampling or mobile monitoring. Saturation sampling studies collect integrated samples over multiple weeks in different seasons, but there are still large uncertainties in reproducing annual mean concentrations (Tan et al., 2014b). Nevertheless, distributed sampling data are frequently used to build land use regression (LUR) models that predict pollutant spatial patterns (Zhang et al., 2014, 2015; Wang et al., 2013; Kheirbek et al., 2012; Jedynska et al., 2014; Clougherty et al., 2008). Mobile monitoring is even more uncertain in reproducing annual mean concentrations because it collects data for shorter durations than distributed sampling. However, the ability of mobile monitoring to capture pollutant spatial patterns is comparable with that of saturation sampling (Tan et al., 2014b), and mobile monitoring data have been used to build LUR models (Larson et al., 2009; Patton et al., 2015; Saraswat et al., 2013). While most mobile sampling platforms are equipped with high time resolution (~1 s to 1 min) instrumentation, many LUR models built from mobile sampling use highly averaged data, which discards much of the information inherent in high time resolution measurements.

Highly time resolved data collected in mobile monitoring studies can provide important information on pollutant sources. For example, a mobile monitoring study in the Los Angeles area quantified particle emissions from the Los Angeles International Airport (Hudda et al., 2014). High time resolution data can be used to estimate emission factors for on-road traffic (Hudda et al., 2013), including analysis of individual vehicle plumes (Dallmann et al., 2011, 2012; Canagaratna et al., 2004). Mobile sampling can also identify pollutant hotspots not captured by stationary monitoring (Brantley et al., 2014). In our recent mobile monitoring campaign in the Pittsburgh region, Tan et al. (2014a) analyzed pollutant plumes from high emitting vehicles (HEVs), most of which were diesel trucks and buses, to partially resolve the sources of PB-PAH and BC. HEVs contributed up to 70% of the on-road PB-PAH and 30% of BC, with significant spatial variability and a strong linear correlation between the contribution of HEVs and the Average Daily Truck Traffic (ADTT) counts (Tan et al., 2014a).

LUR models are statistical relationships between land-use variables and pollutant concentrations. Land-use variables typically include traffic, zoning (e.g., industrial or residential), and elevation (Hoek et al., 2008). Some variables in typical LUR models may be indicative of pollutant sources. For example, traffic variables are related to vehicle emissions, and variables associated with industrial land use may indicate that a particular pollutant is emitted from point sources. However, LUR is not a rigorous method to apportion pollutant sources, and the regression coefficients in LUR models do not necessarily represent the contributions of specific sources (Hoek et al., 2008). LUR models may also include variables that lack obvious physical interpretability or are not directly related to sources. These shortcomings limit the ability of LUR models to directly attribute observed pollutant concentrations to specific sources and to predict potential changes in air quality due to mitigation strategies. Additionally, LUR and other statistical models often suffer from poor transferability. Models built for a specific city, or even a portion of a city, typically are not applicable outside of that region (Patton et al., 2015; Poplawski et al., 2009). Poor transferability is most likely a consequence of the purely statistical, rather than physically based, representation of pollutant spatial patterns in LUR models.

An alternative method to predict the spatial distribution of pollutant concentrations is the distance-weighted kernel algorithm (Loibl and Orthofer, 2001; Vienneau et al., 2009; Pratt et al., 2014; Gulliver and Briggs, 2011). This method explicitly links emissions to pollutant concentrations based on the proximity to sources and expected dispersion patterns. Compared to other geospatial approaches, the kernel method better represents the transport of pollutants away from sources by assuming a smooth fall-off near sources rather than the sharp cutoff created by using fixed-distance buffers, as recently demonstrated by Pratt et al. (2014). The distance-weighted kernel therefore offers the possibility of improving model transferability. The impact of potential changes in emission sources (e.g., reduction in high emitting diesel trucks) can also be readily estimated.

In this manuscript, we develop two types of spatial models based on mobile sampling data collected in Pittsburgh, PA. The first model is a traditional LUR. The second is a novel two-layer model that leverages the unique attributes of highly time resolved data to predict the spatial patterns of PB-PAH and BC with insight into source contributions. The plume layer of the two-layer model uses a previously published relationship between HEV plumes and ADTT reported by Tan et al. (Tan et al., 2014a) and a distance-weighted kernel algorithm to predict near-road contributions of HEVs. The background layer predicts the spatial variability of the non-plume background using LUR. We assess model transferability using a separate holdout dataset, and compare the performance of the two-layer model to the traditional LUR model.

2. Methods

2.1. Air pollution dataset

This paper analyzes data that were collected using the Carnegie Mellon University mobile laboratory, which is equipped with realtime instruments to measure black carbon (BC; Magee Scientific AE31 Aethalometer), air toxics (e.g., benzene and toluene), PB-PAH (EcoChem PAS2000), NOx, SO2, O3, and CH4. The mobile monitoring campaign was conducted in two phases, and Table 1 summarizes all the data.

Phase I of this study and the mobile laboratory have been described in detail previously (Tan et al., 2014a). Briefly, the Phase I monitoring domain included the city of Pittsburgh and its immediate suburbs (Fig. S1). The monitoring was conducted during the 2011-2012 winter (Nov 2011-Feb 2012) and the 2012 summer (Jun 2012-Aug 2012). A total of 42 sites were selected using random sampling stratified by elevation (valley or upland) and traffic volume (high or low traffic). Eight sites were valley sites with low traffic, 11 sites were valley sites with high traffic, 13 sites were upland sites with low traffic, and 10 sites were upland sites with high traffic. Monitoring sites included different neighborhoods within the city, suburban sites, and locations near major pollution sources.
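The stratified design can be illustrated with a short sketch. The per-stratum site counts are those reported above; the candidate pool, the `draw_sites` helper, and the random seed are hypothetical and only serve to show how a stratified random draw works.

```python
import random

# Sites per (elevation, traffic) stratum as reported for Phase I.
PHASE1_COUNTS = {
    ("valley", "low"): 8,
    ("valley", "high"): 11,
    ("upland", "low"): 13,
    ("upland", "high"): 10,
}

def draw_sites(candidates, counts, seed=0):
    """Randomly draw the requested number of sites from each stratum.

    candidates: dict mapping stratum -> list of candidate site ids
                (hypothetical input; the study's candidate pool is not given).
    """
    rng = random.Random(seed)
    return {s: rng.sample(candidates[s], n) for s, n in counts.items()}

# Hypothetical pool of 50 candidate locations per stratum.
pool = {s: [f"{s[0]}-{s[1]}-{i}" for i in range(50)] for s in PHASE1_COUNTS}
sites = draw_sites(pool, PHASE1_COUNTS)
total_sites = sum(len(v) for v in sites.values())
```

Drawing within each stratum separately ensures that every elevation and traffic combination is represented, which a simple random draw over all candidates would not guarantee.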

The mobile laboratory was driven along a prescribed driving route at each site. While some applications of mobile monitoring sampled specified intersections in a cloverleaf pattern (e.g., Larson et al., 2009), the roadway network in Pittsburgh was not always conducive to this strategy. Instead, each sampling site is defined as the centroid of a driving route consisting of local major and minor roadways. Points along the driving route were within 250 m of the sampling site, and were within the same stratum (e.g., valley and low traffic). The mobile laboratory was typically driven ~5 mph below the posted speed limit (25 mph for most roads). We avoided high-speed highway driving, and avoided following specific vehicles, such as diesel trucks and buses, whose high emissions might skew estimates of site average concentrations. Mobile measurements were performed in three periods in both seasons to cover different times of day: mornings (5 a.m.-11 a.m.), afternoons/evenings (11 a.m.-9 p.m.), and overnight (9 p.m.-5 a.m.). Each site was visited once in each time period per season, with each visit lasting one hour, for a total of 6 h of sampling at each site. Self-pollution from the generator and vehicle exhaust was minimized by monitoring with the vehicle in motion, and no instances of self-pollution were identified during QA/QC.

Table 1

Distributions of measured site average particle phase PAH (PB-PAH) and BC. "All data" indicates the measured total concentration (i.e., non-plume background and the contribution of HEVs); "plume" indicates the contribution of HEVs.

| Phase I (42 sites) | Minimum | 25th | Median | 75th | Maximum | Mean |
|---|---|---|---|---|---|---|
| PB-PAH (ng m⁻³) (all data) | 7.2 | 13.0 | 18.7 | 27.8 | 50.3 | 22.0 |
| BC (µg m⁻³) (all data) | 0.83 | 1.08 | 1.28 | 1.64 | 2.44 | 1.39 |
| PB-PAH (ng m⁻³) (plume) | 1.0 | 4.6 | 9.6 | 16.7 | 35.9 | 11.4 |
| BC (µg m⁻³) (plume) | 0.04 | 0.13 | 0.22 | 0.41 | 0.73 | 0.28 |

| Phase II (36 sites) | Minimum | 25th | Median | 75th | Maximum | Mean |
|---|---|---|---|---|---|---|
| PB-PAH (ng m⁻³) (all data) | 4.9 | 10.8 | 24.3 | 38.0 | 71.2 | 26.7 |
| BC (µg m⁻³) (all data) | 0.36 | 0.79 | 1.07 | 1.47 | 2.04 | 1.12 |
| PB-PAH (ng m⁻³) (plume) | 0.1 | 1.6 | 8.3 | 18.1 | 44.5 | 10.8 |
| BC (µg m⁻³) (plume) | 0 | 0.07 | 0.24 | 0.45 | 0.91 | 0.29 |

The pollutant time series for each one-hour visit to the sites included short-duration plumes and an underlying non-plume background (Fig. 1A). Plumes correspond to emissions from individual vehicles, as indicated by CO2 concentrations. A small fraction of the CO2 plumes display concurrent BC and PB-PAH plumes. As described in detail in Tan et al. (Tan et al., 2014a), these BC and PB-PAH plumes are attributed to emissions from high emitting vehicles (HEVs). CO2 plumes without concurrent BC and PB-PAH plumes are assumed to be emitted by low-emitting vehicles (e.g., gasoline vehicles or low-emitting diesel vehicles).

We fit a baseline (blue line in Fig. 1A) to the BC and PB-PAH data, and major plumes were integrated in Matlab using the ipeak function. Peaks in PB-PAH concentrations that were more than 15 ng m⁻³ above baseline levels were used to specify plume captures. The area below the baseline is considered the non-plume background. Fig. 1A shows a case where the BC background was roughly constant over the course of a one-hour visit to a sampling site; in other cases the background would change over the course of sampling. The contribution of HEVs to a pollutant at a particular site (Fig. 1B) is the total integrated plume area divided by the total sampling time. The mean concentration at each site was the sum of the mean non-plume background and the contribution of HEVs.
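The plume/background split described above can be sketched in a few lines. This is a simplified stand-in for the Matlab peak integration, not a reproduction of it: it assumes evenly spaced samples and a baseline supplied by the caller, and it treats any contiguous run above the baseline whose maximum excess exceeds the threshold as a plume. The function name and arguments are illustrative.

```python
def hev_contribution(conc, baseline, threshold, dt=1.0):
    """Return (plume_mean, background_mean) for one site visit.

    conc, baseline: equal-length lists of concentrations.
    threshold: minimum peak height above baseline for a run to count as a plume.
    dt: sampling interval (any time unit; it cancels in the means).
    """
    excess = [c - b for c, b in zip(conc, baseline)]
    plume_area = 0.0
    run_area, run_peak = 0.0, 0.0
    for e in excess + [0.0]:              # trailing 0 closes a final run
        if e > 0:
            run_area += e * dt            # integrate area above baseline
            run_peak = max(run_peak, e)
        else:
            if run_peak > threshold:      # keep only major plumes
                plume_area += run_area
            run_area, run_peak = 0.0, 0.0
    total_time = len(conc) * dt
    plume_mean = plume_area / total_time  # plume area / sampling time
    total_mean = sum(conc) / len(conc)
    # Minor peaks stay in the background, so background = total - plume.
    return plume_mean, total_mean - plume_mean

# Hypothetical 8-sample series: one major plume and one minor peak.
series = [10, 10, 40, 60, 10, 12, 10, 10]
plume, background = hev_contribution(series, [10] * 8, threshold=15)
```

By construction the two returned values add up to the mean measured concentration, matching the statement that the site mean is the sum of the non-plume background and the HEV contribution.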

Phase II of the mobile monitoring campaign was conducted during 2013 summer (Aug) and 2013—2014 winter (Dec 2013—Jan 2014). A total of 36 sites were selected using a similar stratified sampling design as Phase I. Six sites from the Phase I domain were repeated for comparison purposes, and 30 sites were located outside of the Phase I domain to characterize air pollutant concentrations in the outer suburbs (Fig. S1). In the Phase II study, the mobile laboratory was parked at roadside or in parking lots rather than driving along prescribed routes. Electrical power was provided to the mobile laboratory by coupling a second alternator to the vehicle engine. Thus, the mobile laboratory was left idling during stationary sampling. Self-pollution was prevented by attaching a 7-m exhaust hose that extended in the downwind direction. The data were analyzed following the same method as in Phase I.

2.2. Model development and evaluation

The Phase I data were used as the training dataset to develop both two-layer and traditional LUR models for PB-PAH and BC, and the Phase II data were held out for validation. The two-layer model consists of a plume layer that describes the contribution of emissions from HEVs, and a background layer that predicts the spatial pattern of the non-plume background. All models assume that the mean BC and PB-PAH concentrations (total concentrations, plume concentrations, and background concentrations) measured across the six visits to each sampling site represent annual average concentrations. Models were constructed to represent this assumed annual average concentration.

Fig. 1. (A) Time series of BC and CO2 measured at a valley site during the summer afternoon session. The black line shows measured BC concentrations, and the blue line is the fitted baseline. Three major plumes from high emitting vehicles were identified and integrated, each coinciding with a CO2 peak (red line). Minor peaks were considered as part of the non-plume background. (B) Total BC associated with vehicle plumes is correlated with total truck traffic (ADTT) at Phase I measurement sites (R² = 0.67), whereas background BC concentrations do not show a correlation with truck traffic. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The plume layer model is based on the regression between PB-PAH and BC in high concentration plumes attributable to HEVs and truck traffic volume (ADTT), derived by Tan et al. (2014a) and shown in Fig. 1B. The plume concentration profile above background near the roadway was assumed to have the following form:

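\[
C(d) =
\begin{cases}
C_0 \left[ 1 - \left( \dfrac{d}{D} \right)^{2} \right]^{2} & \text{for } d < D \\
0 & \text{for } d \ge D
\end{cases}
\]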

C is concentration and d is the distance from the edge of the roadway. C0 is the concentration at the road edge, and is determined from the regression between plume contributions and ADTT shown in Fig. 1B and Tables 2 and 3.

D indicates the fall-off distance where concentrations return to the background level. Fig. 2 shows the shape of the near-roadway kernel for several different values of D. The two-layer models described below used D = 100 m. This selection of D was informed by a recent analysis of near-road air quality indicating that elemental carbon decayed to background levels by 130 m from the road edge (Karner et al., 2010). The sensitivity of model predictions to the specific value of D is discussed further below.
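A minimal sketch of the near-road kernel follows. The exact functional form is an assumption here: the code uses a quartic fall-off, C0[1 - (d/D)²]², which is the kernel shape used by the ArcGIS Kernel Density tool and which satisfies the two stated constraints (C = C0 at the road edge, C = 0 at and beyond D). The function name is illustrative.

```python
def near_road_kernel(d, c0, D=100.0):
    """Plume concentration above background at distance d (m) from the road edge.

    c0: road-edge plume concentration (from the plume-ADTT regression).
    D:  fall-off distance in metres (100 m in the two-layer models).
    Assumes a quartic kernel shape; see the lead-in for the caveat.
    """
    if d < 0:
        raise ValueError("distance from the road edge must be non-negative")
    if d >= D:
        return 0.0
    return c0 * (1.0 - (d / D) ** 2) ** 2

# Profile for a hypothetical road-edge concentration of 16 units.
profile = [near_road_kernel(d, 16.0) for d in (0.0, 25.0, 50.0, 75.0, 100.0)]
```

Re-running the profile with other values of `D` (for example 50 m or 150 m) is one simple way to explore the sensitivity to the fall-off distance mentioned above.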

The near-roadway kernel interpolation was implemented in ArcMap (version 10, ESRI, Redlands, CA) to predict the near-road decay of HEV emissions. The plume layer model uses average daily truck traffic (ADTT) counts for major roads (interstates, U.S. and state routes, and major named roads/streets comprising about 30% of the road network) (PennDOT, 2012); minor roads were assumed to have no truck traffic. The value of the near-road kernel in each grid cell in the 10 m × 10 m prediction raster was dependent on all nearby roadways.
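How a single raster cell can depend on several nearby roads is sketched below. Two simplifications are assumed and are not taken from the study: contributions from different roads are summed (as a kernel density surface would do), and distance is measured to the road centreline, represented as a straight segment, rather than to the road edge. The quartic kernel shape is likewise an assumption.

```python
import math

def dist_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance (m) from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:                       # degenerate segment
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))                 # clamp projection to the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def plume_at_cell(px, py, roads, D=100.0):
    """Sum the kernel contributions of all roads within D of a cell centre.

    roads: list of (ax, ay, bx, by, c0) tuples, where c0 is the road-edge
           plume concentration for that road (hypothetical inputs).
    """
    total = 0.0
    for ax, ay, bx, by, c0 in roads:
        d = dist_to_segment(px, py, ax, ay, bx, by)
        if d < D:
            total += c0 * (1.0 - (d / D) ** 2) ** 2
    return total

# Two hypothetical parallel roads 100 m apart, each with c0 = 16.
roads = [(0.0, 0.0, 1000.0, 0.0, 16.0), (0.0, 100.0, 1000.0, 100.0, 16.0)]
between = plume_at_cell(500.0, 50.0, roads)   # 50 m from both roads
outside = plume_at_cell(500.0, 150.0, roads)  # 50 m from one, 150 m from the other
```

Evaluating `plume_at_cell` at the centre of every 10 m × 10 m cell would produce the plume-layer raster under these assumptions.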

To develop the background layer model, prescribed driving routes of mobile monitoring sites were treated as polyline features in GIS. We generated 6 different buffer polygons with buffer distances of 100, 200, 300, 500, 750, and 1000 m at each side of the driving route. The mean elevation in each buffer was calculated from a digital elevation model developed from the 2006 Allegheny County contour data (Allegheny County Contours, 2006). The annual average daily traffic (AADT) and ADTT counts from the Pennsylvania Department of Transportation were normalized by road lengths to estimate the average total and truck traffic densities in buffer polygons (PennDOT, 2012). Minor roads without official AADT and ADTT counts were assumed to have an AADT value of 100 vehicles per day and no truck traffic. We performed sensitivity tests assuming the AADT on minor streets to be 100, 500 and 1000 vehicles per day, and found that this did not change the spatial pattern of pollutants significantly. The 2012 land use data of Allegheny County were obtained from the Allegheny County GIS group. Individual properties were classified into categories based on land use codes (Table S1, Supporting Information): residential (single family houses, apartments, etc.), commercial and public (shopping plazas, commercial buildings, schools and universities, etc.), industrial, vacant (forests, cemeteries, etc.), other transportation (active railroads and ports, not including roadways), and agricultural.

In most previous LUR studies, monitoring sites were treated as point features, and the total area of specific land use categories were used as independent variables. However, in this study the driving routes at each sample site were not of identical size. Thus the area within each buffer distance varied from site to site in this study. Therefore, we used the fraction of land in each use category in the buffer as the predictor variable.
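The reason for using fractions rather than absolute areas can be shown with a small sketch. The category areas and buffer sizes below are hypothetical, and the helper name is illustrative; the point is only that two sites with differently sized buffers become comparable once each category area is divided by the buffer area.

```python
def land_use_fractions(category_areas_m2, buffer_area_m2):
    """Convert per-category land areas within a buffer into fractions of the buffer."""
    if buffer_area_m2 <= 0:
        raise ValueError("buffer area must be positive")
    return {cat: area / buffer_area_m2 for cat, area in category_areas_m2.items()}

# Hypothetical sites: the second buffer is twice as large as the first.
site_a = land_use_fractions({"commercial": 30_000.0, "vacant": 10_000.0}, 100_000.0)
site_b = land_use_fractions({"commercial": 60_000.0, "vacant": 20_000.0}, 200_000.0)
```

Here the absolute commercial area at the second site is double that of the first, yet both sites have the same commercial fraction, which is the quantity entering the regression.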

Background layer LUR models used the same stepwise regression procedure employed in the ESCAPE project (Eeftens et al., 2012). First, univariate regression models were developed for the measured pollutants using individual predictor variables. The model with the highest R² and a slope conforming to expectation was chosen as the initial model. Remaining variables were added into the initial model separately, and the p-value of an F-statistic was computed for the null hypothesis that the variable would have a zero coefficient if added to the model. Variables with p-values less than 0.05 were added to generate an intermediate model (Draper and Smith, 1998). The intermediate model was considered valid if the following criteria were met: 1) the absolute increase in R² was more than 1%, and 2) the directions of all the coefficients matched expectation. When a variable with a given buffer size (e.g., 100 m) was included in the intermediate model, outer rings of buffers (e.g., 100-200 m, 100-300 m, etc.) were calculated for this variable and these variables were also considered in the next steps. This step was repeated until no more variables could be added to the intermediate model. The ultimate two-layer model was the sum of the predictions of the background and plume layers.

The ESCAPE LUR building methodology described above does not explicitly consider co-linearity between variables. We investigated the possible effects of co-linear variables by repeating our LUR process using the method described by Larson et al. (2009). This method explicitly ranks all variables correlated with the measured pollutant and eliminates variables in each category that are correlated (Pearson's R > 0.6) with the most highly ranked variable. When applied to this dataset, the resultant models identified the same explanatory variables. Therefore, all LUR models presented herein use the ESCAPE methodology as described above.

We checked the spatial auto-correlation of LUR covariates based on Moran's I. All variables in the final models had Moran's I close to 0, and the z-scores ranged from -0.04 (industrial land in the 200 m buffer) to 0.77 (vacant land in the 750 m buffer). The values of Moran's I are not statistically significant at the p < 0.05 level. This suggests that there was no significant spatial auto-correlation.

In addition to the two-layer models, we also developed traditional LUR models. The traditional LUR models did not separate the data between plume and background layers. Models were instead fit to the mean total concentration measured at each sampling site. The models were generated using the same methodology and predictor variables described above for developing background layer models.

As with previous LUR modeling efforts, the models developed here (i.e., the background layer and the traditional models) were evaluated using the leave-one-out cross-validation (LOOCV) method (Wang et al., 2013; Eeftens et al., 2012; Basagaña et al., 2012). All the models were also evaluated by hold-out validation using the entire Phase II dataset. We extrapolated the prediction raster to the Phase II domain, and compared predictions with observations at the 36 Phase II monitoring locations. Because the Phase II sampling was conducted in a different year and covered a larger area, the hold-out validation is a test of the temporal-spatial transferability of models.

Table 2

Statistical models for PB-PAH. ADTT, average daily truck traffic; VAC750m, the fraction of vacant land within the 750 m buffer; ADTT100m, truck traffic density within the 100 m buffer; COM200m, the fraction of commercial and public land within the 200 m buffer.

Two-layer model

| Layer | Variable | Coefficient | p-value | Model R² | RMSE (ng m⁻³) | HV R² [d] |
|---|---|---|---|---|---|---|
| Plume layer (C0) | Intercept | 3.57 | 2.81E-04 | - | - | - |
| Plume layer (C0) | ADTT | 3.23E-02 | 1.79E-15 | - | - | - |
| Plume layer (C0) | | - | - | 0.80 [a] | 4.08 [a] | 0.48 (0.79) [a] |
| Background layer | Intercept | 20.8 | 1.25E-12 | - | - | - |
| Background layer | Elevation (m) | -2.85E-02 | 4.65E-04 | (0.35) [c] | - | - |
| Background layer | VAC750m | -9.82 | 3.43E-02 | (0.42) [c] | - | - |
| Background layer | | - | - | 0.42 [b] | 2.33 [b] | 0.26 [b] |
| Final model | | - | - | 0.76 | 5.35 | 0.48 (0.47) |

Traditional model

| Variable | Coefficient | p-value | Model R² | RMSE (ng m⁻³) | HV R² [d] |
|---|---|---|---|---|---|
| Intercept | 9.99 | 1.17E-06 | - | - | - |
| ADTT100m | 3.58E-02 | 2.25E-07 | (0.56) [c] | - | - |
| COM200m | 27.7 | 1.15E-03 | (0.67) [c] | - | - |
| | - | - | 0.67 | 6.25 | 0.34 (0.40) |

[a] Plume layer alone after application of the near-road kernel.
[b] Background layer alone.
[c] R² values in parentheses are partial R² for each covariate in either the background layer or traditional LUR.
[d] HV R², the hold-out validation R² value for Phase II sites. The value in parentheses is the R² value after excluding three biased sites. The HV R² also indicates the transferability of the model at Phase II sites.

Table 3

Statistical models for BC. ADTT, average daily truck traffic; ADTT100m, truck traffic density within the 100 m buffer; COM200m, the fraction of commercial and public land within the 200 m buffer; IND200m, the fraction of industrial land within the 200 m buffer.

Two-layer model

| Layer | Variable | Coefficient | p-value | Model R² | RMSE (µg m⁻³) | HV R² [d] |
|---|---|---|---|---|---|---|
| Plume layer (C0) | Intercept | 0.13 | 3.02E-06 | - | - | - |
| Plume layer (C0) | ADTT | 6.12E-04 | 2.52E-11 | - | - | - |
| Plume layer (C0) | | - | - | 0.68 [a] | 0.107 [a] | 0.36 (0.60) |
| Background layer | Intercept | 1.39 | 4.17E-08 | - | - | - |
| Background layer | COM200m | 0.77 | 2.20E-03 | (0.36) [c] | - | - |
| Background layer | Elevation | -1.58E-03 | 1.65E-02 | (0.49) [c] | - | - |
| Background layer | IND200m | 1.13 | 4.37E-02 | (0.55) [c] | - | - |
| Background layer | | - | - | 0.55 [b] | 0.175 [b] | 0.23 [b] |
| Final model | | - | - | 0.67 | 0.236 | 0.31 (0.28) |

Traditional model

| Variable | Coefficient | p-value | Model R² | RMSE (µg m⁻³) | HV R² [d] |
|---|---|---|---|---|---|
| Intercept | 1.60 | 1.16E-06 | - | - | - |
| COM200m | 1.20 | 1.75E-03 | (0.44) [c] | - | - |
| ADTT100m | 7.71E-04 | 2.07E-03 | (0.55) [c] | - | - |
| Elevation | -2.04E-03 | 2.28E-02 | (0.61) [c] | - | - |
| | - | - | 0.61 | 0.252 | 0.34 (0.31) |

[a] Plume layer alone after application of the near-road kernel.
[b] Background layer alone.
[c] R² values in parentheses are partial R² for each covariate in either the background layer or traditional LUR.
[d] HV R², the hold-out validation R² value for Phase II sites. The value in parentheses is the R² value after excluding three biased sites. The HV R² also indicates the transferability of the model at Phase II sites.

Fig. 2. Normalized near-roadway concentration profiles as a function of distance from the road edge (0-300 m), as predicted by the near-road kernel employed in the two-layer model (solid lines), using a fixed-width buffer (dashed lines), and using linear distance to the roadway (dash-dot line). Near-road kernels are shown for different values of D, the near-road fall-off distance.

3. Results and discussion

3.1. PB-PAH and BC concentrations

Fig. 1B shows a strong relationship between BC plumes and ADTT at each of the Phase I sites. A similar relationship exists for PB-PAH, as shown by Tan et al. (2014a). The regression equation between plume BC and ADTT (R² = 0.67) is used as the input to the near-road kernel for the two-layer models described below. In contrast, background BC varies across the 42 sites, but does not have a discernible relationship with ADTT.
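As a worked illustration, the plume-layer regressions reported in Tables 2 and 3 can be evaluated directly to obtain the road-edge concentration C0 for a given truck count. The intercepts and slopes below are copied from those tables; the function name and the example ADTT value are illustrative, and ADTT is assumed to be in trucks per day.

```python
# (intercept, slope per unit ADTT) for the plume layer, from Tables 2 and 3.
# Results are in the concentration units used in the corresponding table.
PLUME_REGRESSION = {
    "PB-PAH": (3.57, 3.23e-2),
    "BC": (0.13, 6.12e-4),
}

def road_edge_concentration(pollutant, adtt):
    """Road-edge plume concentration C0 predicted from truck traffic (ADTT)."""
    intercept, slope = PLUME_REGRESSION[pollutant]
    return intercept + slope * adtt

# Hypothetical road carrying 1000 trucks per day.
c0_bc = road_edge_concentration("BC", 1000)
c0_pah = road_edge_concentration("PB-PAH", 1000)
```

These C0 values are what the near-road kernel then scales down with distance from the road.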

Table 1 shows the distributions of site average PB-PAH and BC concentrations and the estimated contribution of HEVs to each pollutant. Substantial spatial variability was observed in both sampling phases. The highest concentrations were observed at sites representing busy commercial areas and sites near selected industrial facilities (e.g., coke production), while the lowest concentrations were observed in large urban parks and residential areas in the outer suburbs. For the Phase I data, PB-PAH had a coefficient of variation (CV) of 0.50 and BC had a CV of 0.29, indicating substantial spatial variability across the set of sampling sites. Phase II included several urban sites as well as many sites located far from the city center. Therefore, Phase II sites had lower average concentrations but larger spatial variability (CV = 0.63 for PB-PAH, 0.37 for BC). In general, BC showed less spatial variation than PB-PAH due to a higher background concentration.
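The coefficient of variation used above is the standard deviation of the site averages divided by their mean. A minimal sketch follows; whether the study used the sample or population standard deviation is not stated, so both are offered, and the input values are hypothetical rather than the study's site data.

```python
import statistics

def coefficient_of_variation(values, sample=True):
    """CV = standard deviation / mean of a set of site-average concentrations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / mean

# Hypothetical site averages (not from the study).
site_means = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv_population = coefficient_of_variation(site_means, sample=False)
cv_sample = coefficient_of_variation(site_means)
```

Because the CV is dimensionless, it allows the spatial variability of PB-PAH and BC to be compared even though the two pollutants are reported in different units.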

The average contribution of HEVs to total PB-PAH concentrations was higher at Phase I sites than Phase II sites, though contributions to BC were similar. HEV plumes contributed 12—73% of observed PB-PAH (mean = 52%) and 4—34% of BC (mean = 20%) at Phase I sites. HEV plumes contributed 2—73% of PB-PAH (mean = 40%) and 0—56% of BC (mean = 26%) at Phase II sites. While the mean contribution of HEVs to BC at Phase II sites was slightly higher, Phase II included sites with no impact of HEVs, which was not observed in Phase I. The estimated contributions of HEVs were more spatially variable (CV = 0.81 and 1.00 for PB-PAH, 0.69 and 0.87 for BC in Phase I and Phase II, respectively) than the mean pollutant concentration, indicative of the heterogeneous spatial impact of HEV emissions.
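The mean HEV shares quoted above can be cross-checked against the means in Table 1. The sketch below divides the mean plume concentration by the mean total concentration for each phase and pollutant. This is a ratio of means, which need not equal the mean of the per-site ratios in general, so the agreement is a consistency check rather than a reproduction of the authors' calculation.

```python
# (mean plume concentration, mean total concentration) from Table 1.
TABLE1_MEANS = {
    ("Phase I", "PB-PAH"): (11.4, 22.0),
    ("Phase I", "BC"): (0.28, 1.39),
    ("Phase II", "PB-PAH"): (10.8, 26.7),
    ("Phase II", "BC"): (0.29, 1.12),
}

# HEV share of the total concentration, in percent.
hev_share = {
    key: 100.0 * plume / total for key, (plume, total) in TABLE1_MEANS.items()
}

for (phase, pollutant), share in hev_share.items():
    print(f"{phase} {pollutant}: {share:.0f}%")
```

Rounded to whole percentages, these ratios agree with the mean contributions stated in the text for both phases.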

3.2. Two-layer models

Tables 2 and 3 summarize the two-layer models of PB-PAH and BC. Figs. 3 and 4 show corresponding prediction surfaces in the Phase I sampling domain, and exhibit large spatial gradients for both species. Near roadways, concentrations of both PB-PAH and BC are predicted to vary by a factor of 2-3 or more over distances of a few hundred meters. For example, near a major north-south highway in the northwestern portion of the domain, predicted BC concentrations fall from 2.5 µg m⁻³ at the road edge to 0.83 µg m⁻³ (the regional background is 0.76 µg m⁻³) over a spatial scale of 100-150 m. While the strongest gradients are observed near major roadways, significant variations are also evident in the background layer. Notably, elevated BC concentrations are predicted in the industrialized areas in the southeastern portion of the domain, and are not attributable to the plume layer.

The two-layer PB-PAH model described 76% of the variability in the Phase I domain. Consideration of the model layers separately reveals superior performance of the plume layer. The plume layer explained 80% of the spatial variability in the contribution of HEVs, and the background layer model explained 44% of the spatial variability in the non-plume background at the 42 Phase I sites. The LOOCV R2 of the PB-PAH background layer is 0.34.
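Leave-one-out cross-validation can be sketched as follows. For brevity the example uses a single predictor, whereas the study's models have several; the LOOCV R² is computed here as 1 - PRESS/SST, which is one common definition and is an assumption about how the reported values were obtained. All data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def loocv_r2(x, y):
    """Refit the model with each site left out and predict that site."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    my = sum(y) / len(y)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))  # prediction error
    sst = sum((yi - my) ** 2 for yi in y)                    # total variance
    return 1.0 - press / sst

# Hypothetical predictor and concentration values for six sites.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 0.9, 2.3, 2.8, 4.2, 4.9]
r2_cv = loocv_r2(x, y)
```

Because each site is predicted by a model that never saw it, the LOOCV R² is typically lower than the in-sample R², as with the 0.34 reported above for the PB-PAH background layer.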

The explanatory variables in the PB-PAH background layer model included elevation and the fraction of vacant land in the 750 m buffer. The background layer model did not identify any variables related to traffic, suggesting that the plume layer largely describes the effect of traffic emissions on the spatial variability of PB-PAH. The negative coefficient of elevation indicates that non-plume PB-PAH concentrations are elevated in river valleys. Vacant land, most of which was forests and grassland, did not have sources of PB-PAH, and PB-PAH was expected to decrease in these areas (hence a negative coefficient in Table 2).

The two-layer BC model explained 67% of the variability in the Phase I domain. The BC plume layer explained 68% of the variability in the contribution of HEVs, and the background layer explained 55% of the variability in the non-plume background BC at Phase I sites. The LOOCV R2 of the BC background layer was 0.44. The explanatory variables in the background layer model include elevation (again with a negative coefficient), the fraction of commercial and public land within the 200 m buffer, and the fraction of industrial land within the 200 m buffer. Commercial and public land was expected to positively correlate with BC because it was associated with high traffic density. Although most vehicles were not HEVs, aggregate BC emissions in higher traffic areas are expected to contribute to the non-plume background. Industrial facilities in the sampling domain, including metallurgical coke ovens and steel industries, are important sources of elemental carbon (Subramanian et al., 2006). Thus the fraction of industrial land has a positive regression coefficient in the background layer model.

3.3. Spatial-temporal transferability of two-layer models

The transferability of the two-layer models was examined using the Phase II sites. The final models explained 48% and 31% of the variability of PB-PAH and BC at the 36 Phase II sites, respectively. However, the plume layer of the model was more transferable than the background layer.

Fig. 5 shows that overall, HEV plume measurements at Phase II sites matched predictions based on the Phase I plume layer. The fact that the plume layer of the model shows good transferability may not be surprising, as it attempts to link observed concentrations with a known emissions source (traffic) and a physically informed decay away from the source. The transferability of the plume layer therefore speaks to the value of analyzing high time resolution data from mobile sampling. In this study, it allows for a direct link between traffic emissions, specifically HEV emissions, and spatial variations in resultant pollutant concentrations.

[Figure 5 panels. Axes: measured PB-PAH from HEVs (ng/m3); measured BC from HEVs (µg/m3).]

Fig. 5. Performance of the plume layer at Phase II sites. The three potential outliers are color-coded: blue, an industrially dominated site; green, downtown Pittsburgh; red, a site located near a railway and a dedicated bus-only highway. The solid lines are 1:1 lines. Dashed lines show ±5 ng/m3 (PB-PAH) and ±5 µg/m3 (BC). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The plume layers determined for Phase I describe 48% of PB-PAH variability and 36% of BC variability due to HEV emissions at Phase II sites. However, there are three obvious outliers, shown as the colored points in Fig. 5. When these three biased sites were excluded, the plume layer model predicted 79% and 60% of the variability in the contribution of HEVs to PB-PAH and BC at the remaining 33 Phase II sites, respectively. This is nearly identical to the performance of the plume layer in the Phase I domain training dataset, and demonstrates the utility of the plume analysis and modeling method employed here.
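How strongly a few biased sites can depress a hold-out statistic can be illustrated with synthetic numbers. In the sketch below, the helper `holdout_r2`, the site indices, and all values are hypothetical. R2 is assumed to be 1 - SSE/SST evaluated against the 1:1 line without refitting; the text does not state which definition was used.

```python
import numpy as np

def holdout_r2(measured, predicted):
    """Fraction of variance in the measurements explained by fixed predictions.

    Assumption: R^2 = 1 - SSE/SST, with no refitting at the validation sites.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((measured - predicted) ** 2)
    sst = np.sum((measured - measured.mean()) ** 2)
    return 1.0 - sse / sst

# Synthetic hold-out set: 36 sites, three of them deliberately biased.
predicted = np.linspace(0.0, 20.0, 36)
measured = predicted + np.sin(np.arange(36))   # small deterministic scatter
outlier_idx = np.array([5, 17, 30])            # hypothetical outlier sites
measured[outlier_idx] += np.array([25.0, -8.0, 20.0])

keep = np.ones(36, dtype=bool)
keep[outlier_idx] = False

r2_all = holdout_r2(measured, predicted)
r2_without_outliers = holdout_r2(measured[keep], predicted[keep])
print(f"all sites: {r2_all:.2f}; outliers excluded: {r2_without_outliers:.2f}")
```

The numbers printed by this sketch have no relation to the study data; they only show that three sites out of 36 are enough to change the statistic substantially.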

The three plume layer outliers in Fig. 5 are described in detail in the Supporting Information. Briefly, at an industrially dominated site (blue points in Fig. 5) and a site adjacent to a rail line and bus-only highway (red points), ADTT fails to describe the number of HEVs. At the site adjacent to the railroad and bus-only highway, ADTT does not account for emissions from those sources, and therefore HEV contributions are underpredicted. At the industrially dominated site, an incentive program to either retrofit diesel vehicles with diesel particulate filters (DPFs) or replace older vehicles was started between Phase I and II, and HEV contributions are therefore overpredicted for Phase II. At the downtown Pittsburgh site (green), the contribution of HEVs was underpredicted for both Phase I and Phase II, suggesting that the near-roadway kernel used in the plume layer does not adequately describe this area characterized by narrow streets and tall buildings.

The relationship between plume layer concentrations and ADTT assumes that HEVs are evenly distributed throughout the diesel vehicle fleet, and that no spatial variations exist in the fraction of total diesel vehicles that are HEVs. Two of the three outliers in Fig. 5 (industrial site and near railway site) exhibit cases where this assumption breaks down. At the industrial site, the diesel vehicle fleet is biased towards newer or lower-emitting vehicles than the overall fleet. At the near railway site, ADTT underestimates the total contribution of potential high emitters.

The plume layer is clearly transferable within the sampling domain considered here, and represents an improvement in model transferability over traditional LUR. The performance of the plume layer at Phase II sites suggests that it may be transferable to other areas outside of the study domain, for instance other cities. Additional measurements in other cities are required to verify this hypothesis, particularly in areas where the diesel vehicle fleet has a significantly different age distribution than our sampling domain and in areas with high densities of tall buildings such as the downtown Pittsburgh site sampled here. This land use type represents a small portion of our sample domain, but may be more prevalent in other cities.

The transferability of background layer models was considerably lower than plume layer models, only explaining 26% and 23% of the variability of the non-plume background PB-PAH and BC at Phase II sites. To attempt to account for the difference in land use, we calibrated our background layer models with Phase II data. The transferability only increased slightly after the calibration (R2 = 0.30 and 0.29 for PB-PAH and BC, respectively). Some variables had large p-values (>0.05) in the calibrated model, indicating that they were not associated with the observed spatial variation. Such variables included the fraction of vacant land in the 750 m buffer (p = 0.09) in the calibrated background PB-PAH model, and the fraction of commercial land in the 200 m buffer (p = 0.14) in the calibrated background BC model. The fraction of industrial land in the 200 m buffer had both the wrong direction and a large p-value (p = 0.24) in the calibrated background BC model. Therefore, calibrating background layer models did not significantly improve the performance at Phase II sites, and the final two-layer model does not include the recalibrated background layer.

The low transferability of background layer models was not unexpected. One possible explanation for the poor transferability of the background layer model is that the majority of Phase II sites were located in suburban areas with much lower building density and less industry than the Phase I training data set. It also likely reflects the problems with applying LUR modeling to mobile datasets with very limited sampling. As described by Tan et al. (2014b), the mean concentrations measured by short-duration mobile sampling at a specific site can deviate significantly from the true annual average concentration. In turn, the accuracy of traditional LUR models built from such data, including the background layer generated herein, will suffer, with potential implications for model transferability. The poor transferability of the background layer LUR model further demonstrates the usefulness of the plume analysis approach. The plume layer of the model offers improved accuracy and superior transferability as well as the potential to overcome short sample time issues inherent to mobile sampling.

3.4. Traditional models

We also generated traditional LUR models from the mobile data set; these are outlined in Tables 2 and 3. The traditional PB-PAH model included two explanatory variables: truck traffic density in the 100 m buffer, and the fraction of commercial and public land in the 200 m buffer. The variable of truck traffic density in the 100 m buffer is likely associated with the emission of HEVs.

The traditional model explained 67% of the variability in PB-PAH concentrations, 9% lower than the two-layer model. The LOOCV R2 of the traditional PB-PAH model is 0.59. The traditional model predicted 34% of the variability of PB-PAH concentrations at Phase II sites, which was 11% lower than the two-layer model.

The traditional model of BC included three explanatory variables: truck traffic density within the 100 m buffer, elevation, and the fraction of commercial and public land in the 200 m buffer. Elevation and the fraction of commercial and public land in the 200 m buffer were included in both traditional and two-layer models. The traditional LUR did not identify any covariates associated with industry. Industrial emissions are known to be an important source of BC emissions in Pittsburgh, and BC measurements were elevated near major industrial facilities (9 sites in Phase I, 6 sites in Phase II). The lack of an industrial covariate (such as the fraction of industrial land in the 200 m buffer found in the background layer of the two-layer model) suggests that the traditional LUR may underestimate BC concentrations in industrial areas.

The traditional model explained 61% of the variability in BC concentrations at Phase I sites and the LOOCV R2 is 0.51. The traditional model predicted 34% of the variability of BC concentrations at Phase II sites, which was slightly higher (3%) than the two-layer model. HEV plumes contribute less to total BC than to PB-PAH, so the similar transferability of the two-layer and traditional models is not surprising. Additionally, the traditional BC model had independent variables similar to those of the two-layer BC model, if the truck traffic density within the 100 m buffer is considered a surrogate for the plume layer.

3.5. Comparison of two-layer and traditional models

To compare the predictions of the two-layer and traditional models, we divided the Phase I sampling domain into 100 m × 100 m grid cells. Pollutant spatial patterns predicted by the two types of models agreed reasonably well (Fig. S4). The Pearson's R between the predictions of the two-layer and traditional models was 0.80 for PB-PAH and 0.85 for BC. The Spearman's rho was 0.65 for PB-PAH and 0.9 for BC, suggesting that the two models predict similar overall spatial patterns.
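A comparison of this kind between two gridded prediction surfaces can be sketched as follows. The grids are synthetic and the helper functions are hypothetical; only NumPy is used, with Spearman's rho computed as the Pearson correlation of average ranks.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two arrays, flattened cell by cell."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

def average_ranks(v):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    v = np.asarray(v, dtype=float).ravel()
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v))
    ranks[order] = np.arange(1, len(v) + 1)
    _, inverse, counts = np.unique(v, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman_rho(a, b):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(a), average_ranks(b))

# Synthetic stand-ins for two prediction surfaces on a 50 x 50 grid.
rng = np.random.default_rng(1)
two_layer_grid = rng.gamma(shape=2.0, scale=3.0, size=(50, 50))
traditional_grid = 1.2 * two_layer_grid + rng.normal(scale=1.0, size=(50, 50))

r = pearson_r(two_layer_grid, traditional_grid)
rho = spearman_rho(two_layer_grid, traditional_grid)
print(f"Pearson R = {r:.2f}, Spearman rho = {rho:.2f}")
```

Pearson's R is sensitive to a few cells with very high predicted values, whereas Spearman's rho depends only on the ordering of cells, so the two statistics can differ when one model predicts sharper near-road peaks than the other.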

The traditional model predicted consistently higher PB-PAH throughout the sampling domain. Two factors account for the consistently higher predictions of the traditional LUR for PB-PAH. Near roadways, the 100 m buffer associated with ADTT in the traditional model predicts higher concentrations than the near-road kernel in the two-layer model. This is because the near-road concentration profile predicted by the traditional LUR method has a step-like change at the buffer boundary, while the prediction of the kernel interpolation is bell-curved and better simulates the near-road decay of traffic emissions described by Karner et al. (2010). The difference in near-roadway concentration predictions is shown in Fig. 2. While the near-road kernel and the fixed-width buffer predict identical concentrations at both the road edge (C0 at d = 0) and the outer edge of the buffer (d = D), the fixed-width buffer predicts higher concentrations at all intermediate distances. The discrepancy between the two methods of representing near-roadway concentrations can be a factor of two or greater.

Some traditional LUR models use distance or inverse distance to the nearest road as a predictor variable instead of fixed-width buffers (Eeftens et al., 2012). These variables predict gradual decays in pollutant concentrations with distance from roadways, though with different shapes than the kernel interpolation used here and most available near-road measurements. An example of linear decay is shown for D = 100 m in Fig. 2. Linear decay predicts lower concentrations near the roadway (d < 60 m) and higher concentrations farther from the road.
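The three near-road representations discussed above (fixed-width buffer, linear decay, and a bell-shaped kernel) can be compared numerically. The sketch below assumes a quartic (biweight) kernel shape, which is not stated in this section and is chosen only for illustration; the function names and the normalization to the road-edge concentration C0 are likewise hypothetical.

```python
import numpy as np

def buffer_profile(d, D, c0=1.0):
    """Fixed-width buffer: constant c0 inside the buffer, zero at and beyond D."""
    d = np.asarray(d, dtype=float)
    return np.where(d < D, c0, 0.0)

def linear_profile(d, D, c0=1.0):
    """Linear decay from c0 at the road edge to zero at distance D."""
    d = np.asarray(d, dtype=float)
    return c0 * np.clip(1.0 - d / D, 0.0, None)

def quartic_kernel_profile(d, D, c0=1.0):
    """ASSUMED kernel shape (quartic/biweight), scaled to c0 at d = 0."""
    d = np.asarray(d, dtype=float)
    u = np.clip(d / D, 0.0, 1.0)
    return c0 * (1.0 - u ** 2) ** 2

D = 100.0  # fall-off distance in meters, as used in the text
distances = np.array([0.0, 25.0, 50.0, 75.0, 100.0])
for name, profile in [("buffer", buffer_profile),
                      ("linear", linear_profile),
                      ("quartic kernel", quartic_kernel_profile)]:
    print(f"{name:>15}: {np.round(profile(distances, D), 3)}")
```

Under this assumed kernel shape, the linear profile lies below the kernel close to the road and above it farther out, with the crossover at d/D = (√5 - 1)/2 ≈ 0.62. This is qualitatively similar to the behaviour described for Fig. 2, although the kernel actually used in the study may differ.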

In areas far from major roadways, the traditional PB-PAH LUR model predicts higher concentrations than the two-layer model because of a higher background concentration. The lowest PB-PAH concentrations predicted in the traditional LUR are around 10 ng m⁻³. Based on the measured plume and background concentrations in the raw data (e.g., Table 1), this appears to be an overestimation of the true regional background of PB-PAH. In comparison, the two-layer model predicted significantly lower PB-PAH concentrations (~4 ng m⁻³) in urban background areas.

The traditional BC model predicted much lower concentrations near major point sources than the two-layer model (Fig. S6), primarily because the traditional model does not include an industrial covariate. The 2011 National Emissions Inventory indicated that industrial facilities were important sources of elemental carbon in the sampling domain (NEI, 2011), and elevated BC concentrations were measured near the fence line of major point sources. Therefore, the traditional BC model might misleadingly indicate that industrial facilities are not important BC sources in the study domain.

3.6. Discussion and implications

This manuscript presents a novel two-layer model to predict spatial distributions of traffic-related air pollutants measured with mobile monitoring. Overall the two-layer model outperforms a traditional LUR of the same dataset, and displays greater transferability. Additionally, as discussed below, the link between emissions and concentrations inherent in the two-layer model allows it to be used to investigate potential interventions or future scenarios for traffic emissions.

The fall-off distance D used in the near-roadway kernel is an important parameter in the plume layer of the two-layer model. Following the review of Karner et al., we used D = 100 m for BC and PB-PAH (Karner et al., 2010). However, as shown by Karner et al. and other near-roadway studies (Massoli et al., 2012), D will depend on the pollutant of interest and may also depend on the specific sampling location. Changing D in the two-layer model will change the spatial distribution associated with the plume layer, with larger values of D corresponding to a wider area impacted by each individual roadway. Fig. 2 illustrates the sensitivity of the plume layer width to D. The data set collected here, which was collected either on-road (Phase I) or within ~10 m of the road edge (Phase II), is insufficient to determine whether the value of D used here is optimal for our sampling domain.

The traditional LUR models for BC and PB-PAH both identified ADTT within the 100 m buffer as the representative traffic variable. As described above, the traditional model therefore predicts higher concentrations of these species near roadways. If instead we used D = 200 m for the plume layer, the two-layer model and traditional LUR would predict nearly identical concentrations for the first 50 m from the road edge, and the two-layer model would typically predict higher concentrations for the region between 100 and 200 m from the road edge.

The plume layer in the two-layer model directly estimates the contribution of HEVs, and can therefore be used to investigate impacts of emissions reductions from this source. Although traditional models included truck traffic variables to represent the influence of HEVs, the regression coefficients cannot be directly used to estimate the rate of change of PB-PAH and BC concentrations with respect to the change in truck traffic density because other variables (e.g., elevation in the traditional BC model) are not fixed.

We used the two-layer model to investigate outdoor pollutant concentrations and the contribution of HEVs at 200,939 residential properties in the Phase I domain. These residential properties included single-family houses, multi-family houses, townhouses, and row houses. Apartments were excluded from this analysis, because the model only predicted ground-level concentrations. Based on two-layer models, the interquartile range of residential outdoor exposure was 3.34 ng m⁻³ for PB-PAH and 0.19 µg m⁻³ for BC. The highest PB-PAH concentrations occurred at properties near major roadways with high ADTT values. The contribution of HEVs to outdoor PB-PAH exceeded 20% at 197,851 (>98%) properties, and exceeded 40% at 34,680 (~17%) properties. As BC was more heavily influenced by regional background and industrial sources than HEV plumes, the contribution of HEVs was substantial only at a small number of residential properties. The contribution of HEVs to BC exceeded 20% at 18,909 (~9%) properties and exceeded 40% at 751 properties (0.4%).
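The percentages quoted above follow directly from the reported property counts. The short check below uses only numbers from the text; the variable names are illustrative.

```python
n_properties = 200_939  # residential properties in the Phase I domain

# Reported counts of properties where the HEV contribution exceeds a threshold (%).
exceedance_counts = {
    ("PB-PAH", 20): 197_851,
    ("PB-PAH", 40): 34_680,
    ("BC", 20): 18_909,
    ("BC", 40): 751,
}

shares_percent = {
    key: 100.0 * count / n_properties
    for key, count in exceedance_counts.items()
}

for (pollutant, threshold), share in shares_percent.items():
    print(f"HEV contribution to {pollutant} > {threshold}%: {share:.1f}% of properties")
```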

Installing aftertreatment devices such as diesel particulate filters (DPFs) can reduce diesel particulate emissions by 90–99% (May et al., 2014; Khalek et al., 2013). The two-layer model allows us to investigate the extreme case with no emissions from HEVs by removing the plume layer and considering only the background layer. In this case, the average residential outdoor exposure to PB-PAH and BC at all the Phase I domain residential properties can be lowered by 32% (5.23 ng m⁻³) and 14% (0.16 µg m⁻³), respectively. Actual reductions may be higher because diluted emissions from HEVs can also contribute to the non-plume background of PB-PAH and BC. In this study domain, controlling emissions from HEVs appears to be an effective method to reduce outdoor exposures to PB-PAH and BC.
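The no-HEV scenario amounts to setting the plume layer to zero and comparing mean exposures. A minimal sketch follows, assuming the total prediction at each property is the sum of the plume and background layers; the per-property values and the helper name are hypothetical, and only the final two lines use figures from the text.

```python
import numpy as np

def reduction_without_plume(plume, background):
    """Absolute and fractional drop in mean exposure when the plume layer is removed.

    Assumes total concentration = plume layer + background layer at each property.
    """
    plume = np.asarray(plume, dtype=float)
    background = np.asarray(background, dtype=float)
    mean_total = np.mean(plume + background)
    mean_drop = np.mean(plume)
    return mean_drop, mean_drop / mean_total

# Hypothetical per-property layer predictions (ng m^-3), for illustration only.
plume_demo = np.array([2.0, 6.0, 4.0, 8.0])
background_demo = np.array([10.0, 10.0, 12.0, 8.0])
drop, frac = reduction_without_plume(plume_demo, background_demo)
print(f"mean drop = {drop:.2f} ng m^-3 ({100 * frac:.0f}% of mean exposure)")

# Back-of-envelope check on the reported figures: a 32% drop of 5.23 ng m^-3
# implies a mean total PB-PAH exposure of about 5.23 / 0.32, and similarly for BC.
implied_mean_pbpah = 5.23 / 0.32  # about 16.3 ng m^-3
implied_mean_bc = 0.16 / 0.14     # about 1.1 ug m^-3
```

The implied means are approximate because the reported percentages and absolute reductions are rounded.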

3.7. Limitations

The primary limitation of this study is the relatively short sampling time at each site (6 h). As noted by Tan et al. (2014b), the sampling strategy used here may not represent long-term average concentrations at each site. However, the sampling strategy is sufficient to determine spatial patterns within the sampling domain. The limited sampling time at each site may impact model transferability between the Phase I and II domains.

Nonetheless, both the two-layer and traditional LUR models presented in this manuscript were generated for the same dataset, and the two-layer models exhibit superior performance. The transferability of the plume layer of the two-layer model also suggests that separating plumes from background concentrations can help overcome some of the limitations inherent in mobile monitoring studies that use short sampling times.

Acknowledgement

The authors acknowledge the funding from the Heinz Endowments grants C2940 and E0144. Author contributions: A.A.P. and A.L.R. designed research. Y.T. and T.R.D. collected and analyzed data. Y.T. and A.A.P. wrote the manuscript.

Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.atmosenv.2016.03.032.

References

Allegheny County Contours. ftp://www.pasda.psu.edu/pub/pasda/alleghenycounty/AlleghenyCounty_Contours2006.zip.

Basagna, X., Rivera, M., Aguilera, I., Agis, D., Bouso, L., Elousa, R., Foraster, M., de Nazelle, A., Nieuwenhuijsen, M., Vila, J., Kunzli, M., 2012. Effect of the number of measurement sites on land use regression models in estimating local air pollution. Atmos. Environ. 54, 634–642.

Brantley, H.L., Hagler, G.S.W., Kimbrough, E.S., Williams, R.W., Mukerjee, S., Neas, L.M., 2014. Mobile air monitoring data-processing strategies and effects on spatial air pollution trends. Atmos. Meas. Tech. 7, 2169–2183.

Brugge, D., Durant, J., Rioux, C., 2007. Near-highway pollutants in motor vehicle exhaust: a review of epidemiologic evidence of cardiac and pulmonary health risks. Environ. Health 6, 1–12.

Canagaratna, M.R., Jayne, J.T., Ghertner, D.A., Herndon, S., Shi, Q., Jimenez, J.L., Silva, P.J., Williams, P., Lanni, T., Drewnick, F., Demerjian, K.L., Kolb, C.E., Worsnop, D.R., 2004. Chase studies of particulate emissions from in-use New York City vehicles. Aerosol Sci. Technol. 38, 555–573.

Clougherty, J.E., Wright, R.J., Baxter, L.K., Levy, J.I., 2008. Land use regression modeling of intra-urban residential variability in multiple traffic-related air pollutants. Environ. Health 7 (17).

Clougherty, J.E., Kheirbek, I., Eisl, H.M., Ross, Z., Pezeshki, G., Gorczynski, J.E., Johnson, S., Markowitz, S., Kass, D., Matte, T., 2013. Intra-urban spatial variability in wintertime street-level concentrations of multiple combustion-related air pollutants: the New York city community air survey (NYCCAS). J. Expo. Sci. Environ. Epidemiol. 23, 232–240.

Dallmann, T.R., Harley, R.A., Kirchstetter, T.W., 2011. Effects of diesel particle filter retrofits and accelerated fleet turnover on drayage truck emissions at the port of Oakland. Environ. Sci. Technol. 45, 10773–10779.

Dallmann, T.R., DeMartini, S.J., Kirchstetter, T.W., Herndon, S.C., Onasch, T.B., Wood, E.C., Harley, R.A., 2012. On-road measurements of gas and particle phase pollutant emission factors for individual heavy-duty diesel trucks. Environ. Sci. Technol. 46 (15), 8511–8518.

Draper, N.R., Smith, H., 1998. Applied Regression Analysis. Wiley-Interscience, Hoboken, NJ.

Eeftens, M., Beelen, R., de Hoogh, K., Bellander, T., Cesaroni, G., Cirach, M., Declerq, C., Dedele, A., Dons, E., de Nazelle, A., et al., 2012. Development of land use regression models for PM2.5, PM2.5 absorbance, PM10 and PMcoarse in 20 European study areas; results of the ESCAPE project. Environ. Sci. Technol. 46, 11195–11205.

Gulliver, J., Briggs, D., 2011. STEMS-air: a simple GIS-based air pollution dispersion model for city-wide exposure assessment. Sci. Total Environ. 409 (12), 2419–2429.

Heck, J.E., Wu, J., Lombardi, C., Qiu, J.H., Meyers, T.J., Wilhelm, M., Cockburn, M., Ritz, B., 2013. Childhood cancer and traffic-related air pollution exposure in pregnancy and early life. Environ. Health Perspect. 121 (11–12), 1385–1391.

Hoek, G., Beelen, R., de Hoogh, K., Vienneau, D., Gulliver, J., Fischer, P., Briggs, D., 2008. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos. Environ. 42, 7561–7578.

Hudda, N., Fruin, S., Delfino, R.J., Sioutas, C., 2013. Efficient determination of vehicle emission factors by fuel use category using on-road measurements: downward trends on Los Angeles freight corridor I-710. Atmos. Chem. Phys. 13, 347–357.

Hudda, N., Gould, T., Hartin, K., Larson, T.V., Fruin, S.A., 2014. Emissions from an international airport increase particle number concentrations 4-fold at 10 km downwind. Environ. Sci. Technol. 48 (12), 6628–6635.

Jedynska, A., Hoek, G., Wang, M., Eeftens, M., Cyrys, J., Keuken, M., Ampe, C., Beelen, R., Cesaroni, G., Forastiere, F., Cirach, M., de Hoogh, K., De Nazelle, A., Nystad, W., Declercq, C., Eriksen, K.T., Dimakopoulou, K., Lanki, T., Meliefste, K., Nieuwenhuijsen, M.J., Yli-Tuomi, T., Raaschou-Nielsen, O., Brunekreef, B., Kooter, I.M., 2014. Development of land use regression models for elemental, organic carbon, PAH, and hopanes/steranes in 10 ESCAPE/TRANSPHORM European study areas. Environ. Sci. Technol. 48 (24), 14435–14444.

Karner, A.A., Eisinger, D.S., Niemeier, D.A., 2010. Near-roadway air quality: synthesizing the findings from real-world data. Environ. Sci. Technol. 44 (14), 5334–5344.

Khalek, I.A., Blanks, M.G., Merritt, P.M., 2013. Phase 2 of the Advanced Collaborative Emissions Study Final Report. Coordinating Research Council, Alpharetta, GA. November 2013.

Kheirbek, I., Johnson, S., Ross, Z., Pezeshki, G., Ito, K., Eisl, H., Matte, T., 2012. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study. Environ. Health 11 (51).

Larson, T., Henderson, S.B., Brauer, M., 2009. Mobile monitoring of particle light absorption coefficient in an urban area as a basis for land use regression. Environ. Sci. Technol. 43, 4672–4678.

Loibl, W., Orthofer, R., 2001. From national emission totals to regional ambient air quality information for Austria. Adv. Environ. Res. 5 (4), 395–404.

Massoli, P., Fortner, E.C., Canagaratna, M.R., Williams, L.R., Zhang, Q., Sun, Y., Schwab, J.J., Trimborn, A., Onasch, T.B., Demerjian, K.L., Kolb, C.E., Worsnop, D.R., Jayne, J.T., 2012. Pollution gradients and chemical characterization of particulate matter from vehicular traffic near major roadways: results from the 2009 Queens College air quality study in NYC. Aerosol Sci. Technol. 46, 1201–1218.

May, A.A., Nguyen, N.T., Presto, A.A., Gordon, T.D., Lipsky, E.M., Karve, M., Guitierrez, A., Robertson, W.H., Zhang, M., Chang, O., Chen, S., Cicero-Fernandez, P., Fuentes, M., Huang, S.-M., Ling, R., Long, J., Maddox, C., Massetti, J., McCauley, E., Na, K., Pang, Y., Rieger, P., Sax, T., Truong, T., Vo, T., Chattopadhyay, S., Maldonado, H., Maricq, M.M., Robinson, A.L., 2014. Primary gas and PM emissions from light and heavy duty vehicles. Atmos. Environ. 88, 247–260.

NEI, National Emissions Inventory, 2011. http://www.epa.gov/ttnchie1/net/2011inventory.html.

Patton, A.P., Zamore, W., Naumova, E.N., Levy, J.I., Brugge, D., Durant, J.L., 2015. Transferability and generalizability of regression models of ultrafine particles in urban neighborhoods in the Boston area. Environ. Sci. Technol. 49, 6051–6060.

PennDOT, Pennsylvania Department of Transportation, Pennsylvania Traffic Counts. http://www.pasda.psu.edu/data/padot/state/PaTraffic2012_01.zip.

Poplawski, K., Gould, T., Setton, E., Allen, R., Su, J., Larson, T., Henderson, S., Brauer, M., Hystad, P., Lightowlers, C., Keller, P., Cohen, M., Silva, C., Buzzelli, M., 2009. Intercity transferability of land use regression models for estimating ambient concentrations of nitrogen dioxide. J. Expo. Sci. Environ. Epidemiol. 19, 107–117.

Pratt, G.C., Parson, K., Shinoda, N., Lindgren, P., Dunlap, S., Yawn, B., Wollan, P., Johnson, J., 2014. Quantifying traffic exposure. J. Expo. Sci. Environ. Epidemiol. 24, 290–296.

Saraswat, A., Apte, J.S., Kandlikar, M., Brauer, M., Henderson, S.B., Marshall, J.D., 2013. Spatiotemporal land use regression models of fine, ultrafine, and black carbon particulate matter in New Delhi, India. Environ. Sci. Technol. 47 (22), 12903–12911.

Subramanian, R., Donahue, N.M., Bernardo-Bricker, A., Rogge, W.F., Robinson, A.L., 2006. Contribution of motor vehicle emissions to organic carbon and fine particle mass in Pittsburgh, Pennsylvania: effects of varying source profiles and seasonal trends in ambient marker concentrations. Atmos. Environ. 40, 8002–8019.

Tan, Y., Lipsky, E.M., Saleh, R., Robinson, A.L., Presto, A.A., 2014a. Characterizing the spatial variation of air pollutants and the contributions of high emitting vehicles in Pittsburgh, PA. Environ. Sci. Technol. 48, 14186–14194.

Tan, Y., Robinson, A.L., Presto, A.A., 2014b. Quantifying uncertainties in pollutant mapping studies using the Monte Carlo method. Atmos. Environ. 99, 333–340.

Vienneau, D., de Hoogh, K., Briggs, D., 2009. A GIS-based method for modelling air pollution exposures across Europe. Sci. Total Environ. 408 (2), 255–266.

Wang, M., Beelen, R., Basagana, X., Becker, T., Cesaroni, G., de Hoogh, K., Dedele, A., Declercq, C., Dimakopoulou, K., Eeftens, M., Forastiere, F., Galassi, C., Grazuleviciene, R., Hoffmann, B., Heinrich, J., Iakovides, M., Kunzli, M., Korek, M., Lindley, S., Molter, A., Mosler, G., Madsen, C., Nieuwenhuijsen, M., Phuleria, H., Pedeli, X., Raaschou-Nielsen, O., Ranzi, A., Stephanou, E., Sugiri, D., Stempfelet, M., Tsai, M.-Y., Lanki, T., Udvardy, O., Varro, M.J., Wolf, K., Weinmayr, G., Yli-Juuti, T., Hoek, G., Brunekreef, B., 2013. Evaluation of land use regression models for NO2 and particulate matter in 20 European study areas: the ESCAPE project. Environ. Sci. Technol. 47, 4357–4364.

Zhang, K., Larson, T.V., Gassett, A., Szpiro, A., Daviglus, M., Burke, G.L., Kaufman, J.D., Adar, S.D., 2014. Characterizing spatial patterns of airborne coarse particulate (PM10-2.5) mass and chemical components in three cities: the Multi-Ethnic Study of Atherosclerosis. Environ. Health Perspect. 122, 823–830.

Zhang, J.J.Y., Sun, L., Barrett, O., Bertazzon, S., Underwood, F.E., Johnson, M., 2015. Development of land-use regression models for metals associated with airborne particulate matter in a North American city. Atmos. Environ. 106, 165–177.