Sensitivity analysis of a gravity-based land use model: the importance of scenarios∗

E. Borgonovo, M. Percoco, R. Polizzi, K. Kockelman
December 2014
Abstract
Traffic and land use simulation models are complex mathematical machines that reflect social, geographic, economic and demographic factors. They rely on large datasets, and the associated computational complexity makes it difficult for analysts to readily appreciate how variations in one of the exogenous variables or parameters impact the simulation results. In such settings, computationally frugal methods need to be used. Analysts typically run the models through a very limited set of selected scenarios, which help the analyst understand what results the model produces under alternative exogenous contexts. In this work, we address the issue of tackling the scenario analysis of complex computational codes using local sensitivity analysis with interactions that allow analysts to extract new insights when running a model through scenarios.
Keywords: Sensitivity analysis; Traffic and Land use modelling; G-LUM.
∗ The authors wish to thank Laura Cavalli and Guangmin Wang for useful comments on earlier drafts. Financial support from the CerTeT Research Centre, within the research grant of Autostrade per l'Italia, is gratefully acknowledged by E. Borgonovo, M. Percoco, L. Cavalli, and R. Polizzi. Affiliations: Emanuele Borgonovo and Riccardo Polizzi, Department of Decision Sciences, Bocconi University. Marco Percoco, Department of Policy Analysis and Public Management, Bocconi University, Milan, Italy. Kara Kockelman, Department of Civil, Architectural and Environmental Engineering, University of Texas at Austin, USA.

1 Introduction
Decision-makers in the transport sector often benefit from the use of quantitative models predicting outcomes of given policy actions (the tolling of a highway, the construction of a tunnel or other interventions that impact transport costs). Land use and transportation forecasting models are usually complex and computationally intensive, since they need to account for multiple factors, ranging from the social and economic features of the region of interest to the physical layout and congestibility of the transportation network, and from the travel times and costs between locations to travellers' preferences for modes, destinations, routes, and times of the day.
Realistic traffic simulations are encoded in computer software, where a series of interactive calculations are performed. The specific relationship that binds the endogenous to the exogenous variables is not obvious to analysts. A systematic approach is then needed to gauge the response of the endogenous variables or outputs of interest to changes in the exogenous model inputs. Sensitivity analysis plays a crucial role in evaluating the reaction of model outputs to changes in the baseline scenario and/or other assumptions. Changes in model outputs are then key for evaluating the reliability of simulations and the importance of input variables and parameters.
In this paper, we propose a systematic approach for the conduct of sensitivity analysis of complex transportation and land use models. Sensitivity analysis of such models is currently gaining the attention of transportation scholars, and several recent studies have appeared in the literature. UrbanSim is an urban land use simulation model with nine sub-models describing the spatial configuration of a given city as a function of transport costs and land use regulations (Waddell (2002), Waddell et al. (2003)). Ševčíková et al. (2007) applied Bayesian melding to UrbanSim to assess uncertainty in model forecasts and found that this approach generates better predictions than frequentist model averaging. Ševčíková et al. (2011) used the same approach to assess the effects of the Alaskan Way Viaduct on downtown Seattle's land use and traffic patterns.
Here, we consider an analyst who is evaluating a complex simulation model across different scenarios. We rely on a new and different method that produces finite change sensitivity indices for the variation of exogenous variables across scenarios. Calculation of the indices is computationally

frugal and indices can be rapidly obtained for sectors and situations and plotted across the space. This feature is particularly appealing when the set of uncertain variables is large, since our procedure requires a relatively low number of model runs.
We illustrate the local sensitivity analysis applied to the case of scenarios in transport models through an analysis of the Gravity Land Use Model (G-LUM) (Zhou and Kockelman (2009); Zhou et al. (2009)). Over the years, the development of regional transportation plans has often sought to account for the interactions and feedbacks between transport and land use, prompted by the significant environmental and traffic impacts of urbanization. In response to this need, many land use models (LUMs) have been developed to forecast the future spatial distribution of households and employment across regions. Among these models, the G-LUM has been applied to Texas in order to facilitate prediction and multi-scenario analysis with different kinds and degrees of uncertainty in inputs. The model anticipates the future spatial distributions of jobs and households, by type, based on current conditions and measures of transportation access. Alternative scenarios are run for the exogenous variables of jobs, households and travel costs. This complex dataset contains several thousand data values for the Austin, Texas region. The model's response is analysed in detail in terms of the previously mentioned settings.
The remainder of the paper is structured as follows: Section 2 presents an overview of Scenario and Sensitivity Analysis, with a focus on the methods relevant for this work. Section 3 presents the sensitivity measures for scenario decomposition. Section 4 discusses the G-LUM model, Section 5 presents results, and Section 6 offers conclusions.
2 Sensitivity and Scenario Analyses: An Overview
Scenario analysis and sensitivity analysis imply different but interrelated concepts. Scenarios can be defined as "different possible future states of a system" (Tietje (2005)) or as "stories about how the future might turn out" (O'Brien (2004)). A scenario analysis¹ allows an analyst or decision-maker to forecast future outcomes in accordance with his/her state of knowledge (Jungermann and Thuring (1988)). As the final goal of a scenario analysis is the consistent elicitation of predictions, the cognitive and methodological aspects of scenario generation have been
1 If it is implemented according to the canons defined in the literature.
subjected to thorough discussion over time (Jungermann and Thuring (1988)).² O'Brien (2004) also highlights the potential pitfalls in identifying scenarios and emphasizes consistency issues, whereas Tietje (2005) deems it "a core part of a formative scenario analysis" (p. 419) and notes that the desirable generation method should lead to scenarios that are consistent (i.e., they provide realistic descriptions of the future), different (to avoid redundancy), few in number (to facilitate comparison), reliable (to ensure repeatability of the analysis) and efficient (to keep simulation cost low). In Tietje (2005), consistency analysis is followed by a scenario filtering to achieve a compact and efficient set of results for final decision-making.
Here, we consider the situation in which decision-makers rely on a mathematical model to generate results for all scenarios considered. Denote the input-output mapping as
(1) $y = f(x), \quad f : \Omega_X \to \mathbb{R}$
where $y$ is the endogenous variable of interest, $\Omega_X \subseteq \mathbb{R}^K$ and $x = (x_1, x_2, \ldots, x_K)$, $x \in \Omega_X$, is the vector of the exogenous variables (Appendix 1 contains a table with notations and symbols used in this work). Usually the simulation is first run with the exogenous variables assigned to a base-case scenario, $x^0$, to obtain the base-case output of the simulation:
(2) $y^0 = f(x^0)$
Using scenario analysis, one defines a set of alternative scenarios for the exogenous variables, and the simulation is then run over these alternative scenarios to obtain different outputs depending on the alternative values of the exogenous variables, $y^s = f(x^s)$, where $s = 1, 2, \ldots, S$. In this vein, the analyst is informed about the response of the endogenous variable in each scenario, although he/she does not have information about the sources of change. However, recent works have shown that one can make the analysis methodologically robust through the concept of sensitivity analysis setting.
Setting 1: Sign of Change. What is the sign of change implied by the changes in exogenous variables?
Setting 2: Model Response Structure. Is the change in value of the endogenous variable the direct superimposition of the individual changes in outputs, or are interaction effects relevant? Do interaction effects across exogenous variables amplify or dampen individual effects?
Setting 3: Prioritization. What are the key drivers of the change in simulation results?
Answering these questions calls for augmenting the quantitative side of scenario analysis through scenario decomposition and sensitivity analysis "settings" (Saltelli and Tarantola (2002); Saltelli et al. (2004)). A setting concerns the systematic formulation of the sensitivity question before the sensitivity analysis exercise is carried out. For models that produce spatially distributed outputs this is particularly relevant, as a methodological approach has not yet been developed. We need an extension of the sensitivity analysis framework, which we discuss in the next section.
² We refer to O'Brien (2004) and Tietje (2005) for a complete summary of literature findings.
3 Sensitivity measures for scenario decomposition of LUMs
3.1 Methodology
In a scenario analysis, the change from scenario 0 to scenario 1 of the exogenous variables induces the change $\Delta y = y^1 - y^0$ in the endogenous variable. A first way to decompose this change is to suppose that $f(x)$ is $r$ times differentiable at $x^0$ and to use a multivariate Taylor expansion of $\Delta y$:
(3) $\Delta y = y^1 - y^0 = \sum_{k=1}^{K} f'_k(x^0)\,\Delta x_k + \frac{1}{2!}\sum_{k=1}^{K}\sum_{l=1}^{K} f''_{k,l}(x^0)\,\Delta x_k\,\Delta x_l + \ldots + \frac{1}{r!}\sum_{k_1=1}^{K}\sum_{k_2=1}^{K}\cdots\sum_{k_r=1}^{K} f^{(r)}_{k_1 k_2 \ldots k_r}(x^0)\,\Delta x_{k_1}\cdots\Delta x_{k_r} + o(\|\Delta x\|^{r})$
The first-order derivatives in eq. (3) are the sensitivity measures commonly used for comparative statics (CS) (Samuelson (1947); Quirk (1997); Takayama (1993)), i.e.,
(4) $CS_k = f'_k(x^0) \quad (k = 1, 2, \ldots, K)$
Note that in our empirical analysis derivatives are computed numerically, so that we do not need to explicitly consider the generating function $f(x)$.
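As an illustration of how the comparative statics measures of eq. (4) can be obtained numerically, the following minimal sketch approximates each partial derivative with a central finite difference. The function names, the step-size rule and the two-input `toy_model` are assumptions made for this example; they are not part of G-LUM or of the authors' code.

```python
from typing import Callable, List, Sequence


def comparative_statics(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    rel_step: float = 1e-6,
) -> List[float]:
    """Approximate CS_k = df/dx_k at x0 with central finite differences."""
    base = [float(v) for v in x0]
    derivatives = []
    for k, xk in enumerate(base):
        h = rel_step * max(abs(xk), 1.0)  # step scaled to the size of x_k
        up = list(base)
        down = list(base)
        up[k] = xk + h
        down[k] = xk - h
        derivatives.append((f(up) - f(down)) / (2.0 * h))
    return derivatives


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a simulation code (not G-LUM)."""
    return 2.0 * x[0] + 3.0 / x[1]


cs = comparative_statics(toy_model, [10.0, 10.0])
print(cs)  # close to the analytic derivatives 2 and -3/10**2 = -0.03
```

Each derivative costs two model runs, so the full vector requires 2K runs; for an expensive simulator a one-sided difference (K + 1 runs) may be preferred at the price of lower accuracy.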

As conceived by Samuelson (1947), the sensitivity measure $CS_k$ provides "qualitative restrictions on slopes, curvatures etc." (Samuelson, 1947; p. 20), indicating the direction and rate of change in model output following an infinitesimal change in $x_k$. If the change in $y$ is infinitesimal, that is, $\Delta y \simeq \sum_{k=1}^{K} f'_k(x^0)\,\Delta x_k$, then the fraction of the change in $y$ across two scenarios that is associated with $x_k$ is
(5) $D_k = \frac{df_k}{df} = \frac{f'_k(x^0)\,dx_k}{\sum_{l=1}^{K} f'_l(x^0)\,dx_l} = \frac{CS_k\,dx_k}{\sum_{l=1}^{K} CS_l\,dx_l} \quad (k = 1, 2, \ldots, K)$
where $D_k$ is the differential importance measure, which coincides with the fraction of the differential change in $y$ provoked by a slight perturbation of $x_k$ [Borgonovo and Apostolakis (2001), Borgonovo (2007)].
However, application of $CS_k$ and $D_k$ may lead to only partially informative results if the changes in $y$ are not infinitesimal. Also, if $f(x)$ is not differentiable, $CS_k$ and $D_k$ are not applicable. As to input changes, we observe that they are generally not infinitesimal in a scenario analysis.
A decomposition that accounts for these features is the functional ANOVA decomposition, widely used in the field of mathematical statistics (Efron and Stein (1981), Oakley and O'Hagan (2004), Rabitz and Alis (1999), Saltelli et al. (2000), Sobol' (2003), Sobol' (2001)). The decomposition is based on a sequence of nested integrations. Therefore, the parameter space, $\Omega_X$, is complemented by a Borel algebra and a measure, $(\Omega_X, \mathcal{A}, \mu)$. The theory has been widely discussed in the literature, and a thorough overview is beyond the scope of the present work. The results relevant to this paper follow from an application of Theorem 1 [Borgonovo (2010), Borgonovo and Peccati (2011)].
Theorem 1. Let $(\Omega_X, \mathcal{A}, \mu)$ be a measure space and $f \in L^1(\mu)$ be a measurable function. Then, for any $x^0$ and $x^1$ belonging to $\Omega_X$ and for any measure $\mu$ satisfying $d\mu = \prod_{k=1}^{K} d\mu_k$, the following holds:
(6) $f(x) = f_0 + \sum_{k=1}^{K} f_k(x_k) + \sum_{k<l} f_{k,l}(x_k, x_l) + \ldots + f_{1,2,\ldots,K}(x_1, x_2, \ldots, x_K) = f_0 + \sum_{k=1}^{K}\ \sum_{i_1<i_2<\ldots<i_k} f_{i_1 i_2 \ldots i_k}(x_{i_1}, x_{i_2}, \ldots, x_{i_k})$

where
(7) $\begin{cases} f_0 = \int\cdots\int f(x)\,d\mu \\ f_k(x_k) = \int\cdots\int f(x)\prod_{s\neq k} d\mu_s - f_0 \\ f_{k,l}(x_k, x_l) = \int\cdots\int f(x)\prod_{s\neq k,l} d\mu_s - f_k(x_k) - f_l(x_l) - f_0 \\ \ldots \end{cases}$
The decomposition in (6) can then be used to decompose the results of the sensitivity analysis of land use models, identifying the importance of changes in single variables or of interactions between variables. The number of terms in eq. (6) is $2^K$. The first-order functions $f_k(x_k)$ represent the average behaviour of $y$ as a function of $x_k$ alone; the second-order terms, $f_{k,l}(x_k, x_l)$, represent the effect of the interaction between $x_k$ and $x_l$; similarly, higher order terms in the expansion represent the residual interaction of the corresponding group of exogenous variables.
Under the sole assumption of measurability for $f(x)$, and if one sets $d\mu$ equal to the Dirac-$\delta$ measure centred at the base case, $d\mu = \prod_{k=1}^{K} \delta(x_k - x_k^0)\,dx_k$, one obtains, evaluating the decomposition at $x^1$,
(8) $\Delta y = f(x^1) - f(x^0) = \sum_{k=1}^{K} \Delta_k f + \sum_{k<l} \Delta_{k,l} f + \ldots + \Delta_{1,2,\ldots,K} f$
where³
(9) $\begin{cases} \Delta_k f = f(x_k^1, x^0_{\sim k}) - f(x^0) \\ \Delta_{k,l} f = f(x_k^1, x_l^1, x^0_{\sim(k,l)}) - \Delta_k f - \Delta_l f - f(x^0) \\ \ldots \end{cases}$
Eq. (8) shows that $\Delta y$ is the sum of $2^K - 1$ terms of increasing dimensionality. All these terms are obtained via a finite difference operator. For instance, $\Delta_k f$ is the difference between a) $f(x_k^1, x^0_{\sim k})$, the value attained by the simulation model when all variables are at scenario 0 except $x_k$, which is moved to scenario 1, and b) $f(x^0)$. The second order terms, $\Delta_{k,l} f$, are obtained from a) $f(x_k^1, x_l^1, x^0_{\sim(k,l)})$, the value attained by the simulation model when all variables are at scenario 0 except for $x_k$ and $x_l$, by subtracting b) $f(x^0)$ and c) $\Delta_k f$ and $\Delta_l f$. Thus, through the subtraction of $\Delta_k f$ and $\Delta_l f$, the terms $\Delta_{k,l} f$ account for the residual part of the change caused by the interaction of $x_k$ and $x_l$. A similar interpretation applies to the higher order terms.
³ $(x_k^1, x^0_{\sim k})$ denotes that exogenous variable $x_k$ is set at the value it assumes in scenario 1, while all other variables are at the values of scenario 0. A similar interpretation holds for $(x_k^1, x_l^1, x^0_{\sim(k,l)})$.
Based on such decomposition, finite change sensitivity indices (FiCSI) can be computed:
(10) $\phi_{i_1 i_2 \ldots i_r} := \Delta_{i_1 i_2 \ldots i_r} f$
where $i_1, i_2, \ldots, i_r$ denotes a group of $r$ indices ($r \leq K$), and $\phi_{i_1 i_2 \ldots i_r}$ is the portion of $\Delta y$ due to the interaction of the exogenous variables corresponding to the selected indices. The indices $\phi_{i_1 i_2 \ldots i_r}$ play the same role in the decomposition of a finite change as Sobol' indices in the decomposition of variance. Of particular relevance are the first-order finite change sensitivity indices
(11) $\phi^1_k = \Delta_k f$
and the total order indices
(12) $\phi^T_k = \Delta_k f + \sum_{l \neq k} \Delta_{k,l} f + \ldots + \Delta_{1,2,\ldots,K} f$
$\phi^T_k$ [eq. (12)] is the total contribution of $x_k$ to $\Delta y$, and is the sum of the individual contribution of $x_k$ plus all the contributions due to the interaction of $x_k$ with the remaining exogenous variables. The index
(13) $\phi^I_k = \phi^T_k - \phi^1_k$
represents the effect of interactions associated with $x_k$. $\phi^1_k$ and $\phi^T_k$ are generalizations of the differential sensitivity measures $CS_k$ and $D_k$. In fact, it can be shown that, if $f(x)$ is smooth, then
(14) $\lim_{\Delta x \to 0} \frac{\phi^T_k}{\Delta x_k} = \lim_{\Delta x \to 0} \frac{\phi^1_k}{\Delta x_k} = CS_k$

and:
(15) $\lim_{\Delta x \to 0} \frac{\phi^T_k}{\Delta y} = \lim_{\Delta x \to 0} \frac{\phi^1_k}{\Delta y} = \frac{df_k}{df} = D_k$
As to the computation of $\phi^T_k$, it can be shown that
(16) $\phi^T_k = f(x^1) - f(x_k^0; x^1_{(\sim k)})$
where $f(x^1)$ is the value of the endogenous variable in scenario 1 and $(x_k^0; x^1_{(\sim k)})$ is the point obtained with all exogenous variables at scenario 1 except $x_k$, which remains at the base case scenario. Thus, the triplet $(\phi^1_k, \phi^T_k, \phi^I_k)$ can be computed for all $k$ at the cost of $2K$ simulations (in addition to the two runs at $x^0$ and $x^1$), instead of $2^K$. This reduction in computational burden makes the sensitivity measures applicable also to complex simulation codes.
As to the settings, as discussed in the literature, the sensitivity measures for Setting 1 are the indices $\phi$ in (10), (11) and (12). The sign of the first-order index $\phi^1_k$ is the sign of the change in $y$ due to the individual change in $x_k$. The sign of $\phi_{i_1 i_2 \ldots i_r}$ is the sign of the interaction between the exogenous variables $x_{i_1}, x_{i_2}, \ldots, x_{i_r}$. As to Setting 2, the total-order indices $\phi^T_k$ are the appropriate sensitivity measures, since they deliver not only the individual importance of the factors, but also account for interactions. As to Setting 3, the magnitudes of $\phi_{i_1 i_2 \ldots i_r}$ provide the natural sensitivity measures. If the model is complex, or several exogenous variables are present, it may not be possible to compute all the terms. In that case, one obtains a synthetic measure of the relevance of interactions by considering the magnitude of $\phi^I_k$ relative to the magnitude of $\phi^T_k$. Here, we recall that if the endogenous variable responds additively to the variation of the exogenous variables, then interactions are not present and $\phi^I_k = 0$. Conversely, if the effect of interaction is strong, we expect $\phi^I_k$ to be close to $\phi^T_k$.
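The computational shortcut described above can be sketched as follows: first-order indices come from moving one variable at a time away from the base case (eq. (11)), total indices from moving one variable at a time back from the alternative scenario (eq. (16)), and interaction indices from their difference (eq. (13)). The function and class names and the three-input `toy_model` are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, NamedTuple, Sequence


class FiniteChangeIndices(NamedTuple):
    delta_y: float
    first: List[float]        # phi^1_k, eq. (11)
    total: List[float]        # phi^T_k, eq. (16)
    interaction: List[float]  # phi^I_k = phi^T_k - phi^1_k, eq. (13)


def finite_change_indices(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    x1: Sequence[float],
) -> FiniteChangeIndices:
    """Compute first, total and interaction indices with 2K + 2 model runs."""
    y0 = f(list(x0))
    y1 = f(list(x1))
    first, total = [], []
    for k in range(len(x0)):
        only_k_moved = list(x0)      # all inputs at scenario 0 except x_k
        only_k_moved[k] = x1[k]
        all_but_k_moved = list(x1)   # all inputs at scenario 1 except x_k
        all_but_k_moved[k] = x0[k]
        first.append(f(only_k_moved) - y0)
        total.append(y1 - f(all_but_k_moved))
    interaction = [t - p for t, p in zip(total, first)]
    return FiniteChangeIndices(y1 - y0, first, total, interaction)


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical model: x_1 and x_2 interact, x_3 enters additively."""
    return x[0] * x[1] + x[2]


result = finite_change_indices(toy_model, [1.0, 1.0, 1.0], [2.0, 3.0, 5.0])
print(result)
```

In this toy case the third input has a zero interaction index, as expected for an additive term, while the first two inputs share the same interaction contribution, which therefore appears in both of their total indices.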
3.2 A numerical example
Before applying the scenario-based sensitivity analysis to G-LUM, it is useful to illustrate the method on a simpler case. To this end, let us consider the model

(17) $y_1 = 2 x_1 + 3/x_2 + x_3^4$
(18) $y_2 = x_4 \ln(y_1)/10$
Then, we have four exogenous variables and two endogenous variables. Let us evaluate the model under two scenarios: the base case and an alternative scenario. The base case is obtained with all model inputs at their mean value:
(19) $x^0 = [10, 10, 10, 10]$
and the alternative scenario is
(20) $x^+ = [19.5, 19.5, 19.5, 19.5]$
Since we have four exogenous variables, we have four individual effects, $\binom{4}{2} = 6$ interaction effects of order two, $\binom{4}{3} = 4$ interaction effects of order three and one residual effect of order four, for a total of $2^4 - 1 = 15$ terms in the decomposition of the finite change. Formally:
(21) $y_1(x^+) - y_1(x^0) = \sum_{k=1}^{4} \phi^1_k + \sum_{k<l} \phi^2_{k,l} + \sum_{k<l<m} \phi^3_{k,l,m} + \phi^4_{1234}$
To compute the first-order effects, we move one factor at a time to $x^+$. The base case value is
(22) $y_1(x^0) = 1.0020 \times 10^4$
Thus, we have
(23) $\phi^1_1 = y_1(x_1^+, x^0_{\sim 1}) - y_1(x^0) = 2x_1^+ + 3/x_2^0 + (x_3^0)^4 - 2x_1^0 - 3/x_2^0 - (x_3^0)^4 = 2(19.5 - 10) = 19$

Proceeding in a similar way, we find
(24) $\phi^1_2 = -0.1462$
and
(25) $\phi^1_3 = 1.3459 \times 10^5$
Since the model for $y_1$ is additive, we have no interaction effects. Note that
(26) $y_1(x^+) = 1.4463 \times 10^5$
so that
(27) $y_1(x^+) - y_1(x^0) = 1.4463 \times 10^5 - 1.0020 \times 10^4 = 1.3461 \times 10^5$
which is the sum $1.3459 \times 10^5 + 19 - 0.1462 = 1.3461 \times 10^5$.
For $y_2$, instead, we have interactions, because the model now has a multiplicative form. Therefore, second order terms also become significant. For instance, we have a strong interaction effect between $x_3$ and $x_4$. In detail, rounding to four decimals, the indices are
$\phi^1_1 = 0.0019$, $\phi^1_2 = 0$, $\phi^1_3 = 2.6694$, $\phi^1_4 = 8.7517$,
$\phi^2_{34} = 2.536$, $\phi^2_{14} = 0.0018$, $\phi^2_{13} = -0.0018$, $\phi^2_{23} = 0$, $\phi^2_{24} = 0$,
$\phi^3_{134} = -0.0017$.
To get, for instance, $\phi^2_{34} = 2.536$, we need to compute the following:
(28) $\phi^2_{34} = y_2(x_3^+, x_4^+, x^0_{\sim 34}) - y_2(x^0) - \phi^1_3 - \phi^1_4 = 23.1695 - 9.2124 - 2.6694 - 8.7517 = 2.536$
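The full decomposition of this example can be reproduced with a short script. The sketch below evaluates the model on every non-empty group of inputs moved to the alternative scenario and subtracts the lower-order effects, following the recursive structure of eq. (9). The helper name `decompose` and the use of 1-based tuples as labels are choices made for this illustration only.

```python
import math
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple


def y1(x: Sequence[float]) -> float:
    return 2.0 * x[0] + 3.0 / x[1] + x[2] ** 4  # eq. (17)


def y2(x: Sequence[float]) -> float:
    return x[3] * math.log(y1(x)) / 10.0  # eq. (18)


def decompose(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    x_plus: Sequence[float],
) -> Dict[Tuple[int, ...], float]:
    """Return all finite change effects, keyed by 1-based input labels."""
    n = len(x0)
    base = f(list(x0))
    effects: Dict[Tuple[int, ...], float] = {}
    for order in range(1, n + 1):
        for group in combinations(range(1, n + 1), order):
            x = list(x0)
            for k in group:  # move only the inputs in this group
                x[k - 1] = x_plus[k - 1]
            value = f(x) - base
            for size in range(1, order):  # remove lower-order effects
                for sub in combinations(group, size):
                    value -= effects[sub]
            effects[group] = value
    return effects


x0 = [10.0] * 4
x_plus = [19.5] * 4
effects_y1 = decompose(y1, x0, x_plus)
effects_y2 = decompose(y2, x0, x_plus)
for group, value in sorted(effects_y2.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(group, round(value, 4))
```

This brute-force version needs one model run per group, i.e. 2^K runs in total including the base case, which is affordable here (K = 4) but is exactly the cost that the first-order and total-order shortcut avoids for large simulators.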
4 The G-LUM Simulation Model
This section provides a concise review of the G-LUM model as introduced in Zhou et al. (2009) and Zhou and Kockelman (2009).
G-LUM consists of three sub-models: EMPLOC, RESLOC, and LUDENSITY, calibrated using entropy maximization and non-linear least-squares principles (similar to maximum likelihood), thereby providing estimates of parameter values that generate the "best fit" of forecasts for the base year based on prior year data. Figure 1 shows the model flowchart.
Figure 1: Flow-chart of the G-LUM model. [Flow-chart elements: Base Year Data; Lag Year Data; Calibration; EMPLOC; RESLOC; LUDENSITY; HH location forecast; Emp location forecast; Land Use forecast; Next time period; Travel Demand Model.]
The EMPLOC sub-model, which forecasts the spatial distribution of employment, relies on: i) accessibility to (all) households and jobs of that type, across zones, in prior time periods, ii) zone-to-zone travel costs, and iii) zone sizes. The endogenous variable of interest is the number of jobs of type $p$ (basic or commercial, retail or service types, for example) in zone $i$ at time $t$, denoted $E^p_{i,t}$. This endogenous variable is estimated from:
(29) $E^p_{i,t} = \lambda^p \sum_{j=1}^{Z} H_{j,t-1}\, A^p_{j,t-1}\, U^p_{i,t-1}\, c_{ij}^{\alpha^p} \exp(\beta^p c_{ij}) + (1 - \lambda^p)\, E^p_{i,t-1}$

where
(30) $A^p_{j,t-1} = \left[\sum_{i=1}^{Z} (E^p_{i,t-1})^{a^p}\, L_i^{b^p}\, c_{ij}^{\alpha^p} \exp(\beta^p c_{ij})\right]^{-1}$
and
(31) $U^p_{i,t-1} = (E^p_{i,t-1})^{a^p}\, (L_i)^{b^p}$
where $H$ is total households, $c$ is the travel cost, $L$ is the land area, $t$ represents the year/time period, and $U^p_{i,t-1}$ represents the attractiveness of the zone per employment type at $t - 1$. $a^p$, $b^p$, $\alpha^p$, $\beta^p$ and $\lambda^p$ are parameters obtained through the calibration discussed below.
The second sub-model, RESLOC, concerns the spatial distribution of households. It is based on: i) access to all households in all categories and jobs of all types in prior time periods, ii) zone-to-zone travel costs, and iii) land use conditions (in the prior period). (For further details see Putman (1995).) The corresponding equations are
(32) $H^d_{i,t} = \eta^d \sum_{j=1}^{Z} Q^d_j\, B^d_j\, W^d_{i,t-1}\, c_{ij}^{\alpha^d} \exp(\beta^d c_{ij}) + (1 - \eta^d)\, H^d_{i,t-1}$
where $H^d_{i,t}$ is the number of households of type $d$ residing in zone $i$ at time $t$, and
(33) $Q^d_j = \sum_{p} a_{dp}\, E^p_{j,t}, \qquad B^d_j = \left[\sum_{i=1}^{Z} W^d_{i,t-1}\, c_{ij}^{\alpha^d} \exp(\beta^d c_{ij})\right]^{-1}$
and
$W^d_{i,t-1} = (L^v_i)^{q^d}\, (\rho_i)^{r^d}\, (L^r_i)^{s^d} \prod_{d'} \left(1 + \frac{H^{d'}_i}{\sum_{d''} H^{d''}_i}\right)^{b^d_{d'}}$
where $Q$ converts employment to households, and $W^d_{i,t-1}$ represents the attractiveness of zone $i$ for household type $d$ at $t - 1$; $c$ is the impedance (travel time and/or cost) between zones; $a_{dp}$ is the number of type $d$ households per type $p$ employee in the region under investigation; $L^v_i$ is vacant developable land in zone $i$; $\rho_i$ is the proportion of developable land already developed; $L^r_i$ is residential land; and $\eta^d$, $\alpha^d$, $\beta^d$, $q^d$, $r^d$, $s^d$ and $b^d_{d'}$ are parameters estimated in model calibration.
Finally, LUDENSITY (land per household or job) relies on: i) land use conditions in the current time period, and ii) the share of households and jobs by type in the current period. It estimates

the new area of land in each use by using information on the number of employees of each type and the households by type.
At the end of the three sub-models, the travel demand model (TDM) integrates with the land use model. It is a four-step aggregate TDM whose main inputs are the roadway network with link capacities and free-flow times, and zonal attributes for each traffic analysis zone. Zonal attributes include the number of jobs and the number of households by type, provided by the land use model. Individual trips are segmented to include home- and non-home-based, direct and complex, work and non-work types, and they are modelled explicitly. Fixed person-trip rates (per household, per weekday) are used for each of the household types. The TDM consists of six modules performed in sequence; to ensure consistency in input and output travel times, a model feedback mechanism is employed.
Before proceeding with the simulation, a calibration process is performed in order to find the parameters that best predict the base year data from the lag (prior) year data. The G-LUM Matlab code uses an entropy maximization principle to estimate all parameters in the EMPLOC, RESLOC and LUDENSITY equations. The simulation then starts by using the calibrated parameters to predict base year populations, jobs and land use. The predicted totals are then scaled to match the actual totals in the base year. During this process, residuals (the differences between the actual and the predicted values in each zone) are generated. Such residuals capture other key factors that influence the spatial distribution of jobs, and for this reason they are used to adjust forward predictions. The process involves the prediction of households, jobs and land uses in each zone. Land use predictions are scaled to match the total developable land, household and job counts in each period provided by the users. The residuals are added to the estimates in order to better reflect the base-year target.
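To make the structure of the EMPLOC update in eqs. (29)-(31) more concrete, the following is a minimal sketch of one forecasting step for a single employment type. It assumes the standard singly constrained gravity form, in which the balancing factor of eq. (30) normalises the allocation of each origin zone's households across destination zones; the function name, argument names and all numerical values are hypothetical and are not taken from the G-LUM Matlab code.

```python
import math
from typing import List, Sequence


def emploc_step(
    jobs_prev: Sequence[float],        # E_{i,t-1}, jobs of one type per zone
    households_prev: Sequence[float],  # H_{j,t-1}, households per zone
    land: Sequence[float],             # L_i, land area per zone
    cost: Sequence[Sequence[float]],   # c_ij, strictly positive travel costs
    a: float,
    b: float,
    alpha: float,
    beta: float,
    lam: float,
) -> List[float]:
    """One EMPLOC-style update (assumed form of eqs. (29)-(31))."""
    zones = range(len(jobs_prev))

    def impedance(i: int, j: int) -> float:
        return cost[i][j] ** alpha * math.exp(beta * cost[i][j])

    # Attractiveness of each zone, as in eq. (31)
    attract = [jobs_prev[i] ** a * land[i] ** b for i in zones]
    # Balancing factor per origin zone j, as in eq. (30)
    balance = [
        1.0 / sum(attract[i] * impedance(i, j) for i in zones) for j in zones
    ]
    forecast = []
    for i in zones:
        allocated = sum(
            households_prev[j] * balance[j] * attract[i] * impedance(i, j)
            for j in zones
        )
        # Weighted combination with the lagged jobs, as in eq. (29)
        forecast.append(lam * allocated + (1.0 - lam) * jobs_prev[i])
    return forecast


# Hypothetical three-zone illustration (values are not G-LUM data)
jobs_prev = [100.0, 50.0, 25.0]
households_prev = [80.0, 60.0, 40.0]
land = [2.0, 1.0, 1.5]
cost = [[1.0, 4.0, 6.0], [4.0, 1.0, 3.0], [6.0, 3.0, 1.0]]
jobs_next = emploc_step(jobs_prev, households_prev, land, cost,
                        a=0.8, b=0.3, alpha=-1.0, beta=-0.1, lam=0.6)
print(jobs_next)
```

Under this assumed form the regional total is a weighted average of total households and total lagged jobs, which is one reason the predicted totals are subsequently scaled to match control totals, as described above.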
5 Sensitivity Analysis of G-LUM
5.1 A baseline example
This section describes the scenario decomposition method as applied to G-LUM. The Austin, Texas data used in Zhou and Kockelman (2009) are used here, with Z = 1074 zones. G-LUM

produces the following forecasts:
$E$: number of jobs across all zones, for each job type, at t = 1,2,3,4,5. The job types are Basic, Retail and Service;
$H$: forecasted number of households across all zones at t = 1,2,3,4,5, for each of the four household types (low, medium, medium-high and high income);
LB: amount of land allocated for basic jobs in all zones at t = 1,2,3,4,5;
LC: amount of land allocated for commercial jobs in all zones at t = 1,2,3,4,5;
LR: amount of land allocated for residential use in all zones at t = 1,2,3,4,5.
Overall, we have M = 10 endogenous variables.
As exogenous variables, we consider the following three datasets that are used as model inputs.
The employment dataset (EMP), which contains the basic, retail and service jobs for each zone in the base year; it is a matrix of size 1074 × 3.
The households dataset (HH), which contains the number of households per type in each zone in the base year; it is a matrix of size 1074 × 4.
The link impedance dataset (TT), which contains the travel times between each pair of zones in the base year; it is a 1074 × 1074 matrix.
The supplied values form our base case scenario, so that $x^0 = (EMP^0, HH^0, TT^0)$.⁴ In Paul and Kockelman (2005), the model is run over three alternative scenarios, with EMP, HH and TT increased by 50% of their distribution, one at a time. In our framework, we consider the following scenario that encompasses the three variations, namely $x^1 = (EMP^1, HH^1, TT^1)$. Then, in accordance with eq. (8), the change of each model output from $x^0$ to $x^1$ can be decomposed into seven terms that account for the individual changes in EMP, HH and TT, for their interactions in pairs, and for the residual term that contains the interaction among all three. Thus, we have the following sensitivity indices: $\phi^1_{EMP}$, $\phi^1_{HH}$, $\phi^1_{TT}$, $\phi^2_{EMP,HH}$, $\phi^2_{EMP,TT}$, $\phi^2_{HH,TT}$, and $\phi^3_{EMP,HH,TT}$, and the total indices
(34) $\phi^T_{EMP} = \phi^1_{EMP} + \phi^2_{EMP,HH} + \phi^2_{EMP,TT} + \phi^3_{EMP,HH,TT}$
$\phi^T_{HH} = \phi^1_{HH} + \phi^2_{EMP,HH} + \phi^2_{HH,TT} + \phi^3_{EMP,HH,TT}$
$\phi^T_{TT} = \phi^1_{TT} + \phi^2_{EMP,TT} + \phi^2_{HH,TT} + \phi^3_{EMP,HH,TT}$
⁴ It should be mentioned that our working scenarios are proposed only for expositional purposes and are not necessarily intended to describe specific situations.
Each of these sensitivity measures will then assume a different value for each of the ten outputs of interest. We present here as a first illustration the results obtained for the endogenous variables of the type $\langle y_{m,t} \rangle$ (the average of output $m$ across zones at time $t$), m = 1,2,...,10 and t = 1,2,...,5.
We start with first-order sensitivity indices. Figure 2 displays the first-order sensitivity indices of EMP.
Figure 2: $\phi^1_{EMP}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 2 reads as follows. On the vertical axis (z), one finds the magnitude and sign of the sensitivity measures. On the first horizontal axis (y) one finds the model outputs (from basic to LR), and on the second horizontal axis (x) one finds time, from year one to five. Figure 2 allows us to appreciate that the increase in the values in the dataset EMP, alone, impacts predictions of future jobs, especially in years one and two, more heavily than the household counts, which are almost insensitive to this variation. The amount of land for residential use (LR) is also much more impacted than the amount of land for commercial and basic jobs. We observe that the effect here is negative; that is, an increase in EMP corresponds to a decrease in the average amount of land allocated to residential use.
Figure 3 reports the first-order effects of HH on each of the endogenous variables. One notes the almost null effects on the predicted numbers of jobs and on household counts. However, we register a strong (negative) influence of HH on LR, with a decrease in the value of land allocated

to residential use.
Figure 3: $\phi^1_{HH}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 4: $\phi^1_{TT}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 4 reports the first-order effects associated with the change in the 1074 × 1074 dataset TT. One observes that the endogenous variables are almost insensitive to this variation, with the exception of a slight change in LR.
We now consider the effect of interactions. Figure 5 reports the results for the interactions between EMP and HH. We observe here that interaction effects are negligible for jobs and household counts. However, positive interactions are registered on endogenous variables LC and LR. All signs are positive,

signalling that, when varied together, EMP and HH amplify their individual effects.
Figure 5: $\phi^2_{EMP,HH}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 6: $\phi^2_{EMP,TT}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 6 shows the interaction effects between EMP and TT. They are negligible for all endogenous variables. A similar result is encountered for the interactions between HH and TT (Figure 7). Figure 7 shows non-negligible, albeit slight, interaction effects only on LR. Overall, the message of Figures 5, 6, and 7 is that the endogenous variables respond additively to the finite changes, with the sole exception of LR.
Then, we come to total effects. They are displayed in Figure 8. Figure 8 shows that EMP is the most influential exogenous group for job predictions,

with a positive effect, while HH is the most influential for the land allocation endogenous variables. Household counts are more sensitive to changes in HH than in EMP. Practically no endogenous variable but LR is influenced by the change in TT.
Figure 7: $\phi^2_{HH,TT}$, m = 1,2,...,10, t = 1,2,...,5.
Figure 8: Total order indices, $\phi^T_{EMP}$ (upper left), $\phi^T_{HH}$ (upper right), $\phi^T_{TT}$ (lower centered).
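The scenario decomposition used in this subsection, with three input datasets each scaled by +50%, can be organised as in the sketch below. The function `stand_in_model` is a hypothetical placeholder that returns a zonal average of a made-up output; it is not G-LUM and serves only to show how the seven effects and the three total indices of eq. (34) are assembled from the eight model runs. Dataset names and sizes are likewise assumptions (two zones instead of 1074).

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Table = List[List[float]]


def scale(table: Table, factor: float) -> Table:
    """Multiply every entry of a dataset by the same factor."""
    return [[factor * v for v in row] for row in table]


def stand_in_model(emp: Table, hh: Table, tt: Table) -> float:
    """Hypothetical placeholder for a model run: zonal average of one output."""
    zonal = []
    for jobs_row, hh_row, tt_row in zip(emp, hh, tt):
        jobs, households = sum(jobs_row), sum(hh_row)
        zonal.append(jobs + 0.01 * jobs * households - 0.1 * sum(tt_row))
    return sum(zonal) / len(zonal)


def scenario_decomposition(
    model: Callable[..., float],
    base: Dict[str, Table],
    alt: Dict[str, Table],
) -> Tuple[Dict[Tuple[str, ...], float], Dict[str, float]]:
    """Return all finite change effects and the total index of each dataset."""
    names = list(base)
    y0 = model(**base)
    effects: Dict[Tuple[str, ...], float] = {}
    for order in range(1, len(names) + 1):
        for group in combinations(names, order):
            inputs = {n: (alt[n] if n in group else base[n]) for n in names}
            value = model(**inputs) - y0
            for size in range(1, order):  # subtract lower-order effects
                for sub in combinations(group, size):
                    value -= effects[sub]
            effects[group] = value
    total = {n: sum(v for g, v in effects.items() if n in g) for n in names}
    return effects, total


base = {
    "emp": [[10.0, 5.0, 5.0], [20.0, 10.0, 10.0]],          # zones x job types
    "hh": [[4.0, 3.0, 2.0, 1.0], [8.0, 6.0, 4.0, 2.0]],     # zones x household types
    "tt": [[1.0, 2.0], [2.0, 1.0]],                          # zones x zones
}
alt = {name: scale(table, 1.5) for name, table in base.items()}  # +50% scenario
effects, total = scenario_decomposition(stand_in_model, base, alt)
print(effects)
print(total)
```

Because the placeholder contains a product of jobs and households but treats travel times additively, only the EMP-HH interaction is non-zero here; with the real model, the same bookkeeping would be repeated for each of the ten outputs and five periods.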
5.2 A further example: a change in total counts
Let us now consider a homogeneous change of the input variables by +50% of total counts. The sensitivity analysis is carried out through the use of finite change sensitivity indices deriving from the spatial average of zonal outputs generated, for each endogenous variable, prediction period and scenario, by changes in the exogenous variables EMP, HH and TT. The first two variables (hereinafter respectively EMP and HH) represent the control totals for jobs and households in the region under investigation, while the third variable refers to the travel times between each pair of zones.
As a result of individual variations of the three input files (i.e., the exogenous variables), the model produced output files containing the endogenous variables used to calculate first-order sensitivity indices. Results relative to the individual variation of the exogenous variable EMP are reported in Figure 9, where the horizontal axis displays the set of endogenous variables and the vertical axis shows the magnitude of the sensitivity indices. According to the figure, the individual variation of EMP has a significant effect on the number of future jobs, with an upward trend throughout the prediction periods. On the contrary, both the number of future households and the land use variables, for every category, are almost insensitive to this variation.
Figure 10 reports the first-order sensitivity indices relative to the individual variation of HH. The figure clearly shows that the increase in the exogenous variable HH has a strong effect on the forecasted number of households and the amount of land allocated to residential use (LR), while there is almost no impact on the number of future jobs.
Unlike the two previous variables, first-order effects relative to the variation of the exogenous variable TT are null, except for very slight changes in the amount of land allocated to residential use.
Similar considerations apply to second- and third-order effects; that is, changes in endogenous variables due to interactions of two or all three exogenous variables. Interactions deriving from simultaneous variations of EMP-HH and EMP-TT have only a modest impact on the land allocated to residential use, while the variation of HH-TT, similarly to the variation of all exogenous variables, also produces some small effects on the remaining endogenous variables.

Figure 9: First-order effects of EMP
Figure 10: First-order effects of HH
Figure 11: Total-order effects of EMP

The sensitivity analysis relative to second- and third-order effects demonstrates that endogenous variables respond mainly to individual variations of exogenous variables rather than to their simultaneous variations. For this reason, total-order effects, like the one of EMP represented in Figure 11, are very similar to first-order effects. This is particularly significant for the effect that the exogenous variables EMP and HH have on their respective endogenous variables. For instance, first-order effects of EMP account for essentially the entire total change (i.e., the total-order effects) in job-related endogenous variables, with shares between 98.80% and 108.22%, as highlighted in Table 1.

Table 1: First-order effects of EMP as a share of total change in job-related endogenous variables

Endogenous variable   Y 2010    Y 2015    Y 2020    Y 2025    Y 2030
Basic                 103.21%   98.80%    101.45%   100.36%   100.00%
Retail                108.22%   100.96%   101.15%   99.98%    100.00%
Service               101.82%   100.00%   100.26%   99.94%    100.00%
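The split of a scenario change into first-order, interaction and total-order effects discussed in this section can be illustrated with a minimal sketch. It assumes the usual inclusion-exclusion construction of finite change effects over all 2^K input combinations (8 runs for K = 3). The function names and the toy model are hypothetical stand-ins for illustration only; they are not the G-LUM code and do not reproduce the values in Table 1.

```python
from itertools import combinations


def finite_change_effects(model, x0, x1):
    """Decompose model(x1) - model(x0) into effects of every input subset."""
    k = len(x0)
    # Run the model once per subset of inputs moved to scenario 1 (2**k runs).
    runs = {}
    for r in range(k + 1):
        for subset in combinations(range(k), r):
            point = [x1[i] if i in subset else x0[i] for i in range(k)]
            runs[subset] = model(point)
    # Inclusion-exclusion: strip all lower-order effects from each subset's change.
    effects = {}
    for subset in runs:
        if not subset:
            continue
        size = len(subset)
        effects[subset] = sum(
            (-1) ** (size - r) * runs[inner]
            for r in range(size + 1)
            for inner in combinations(subset, r)
        )
    return effects


def total_order_effects(effects, k):
    """For each input, sum every effect in which that input takes part."""
    return [sum(v for s, v in effects.items() if i in s) for i in range(k)]


def toy_model(x):
    # Hypothetical stand-in for the simulator: additive terms plus one interaction.
    emp, hh, tt = x
    return emp + 2.0 * hh + emp * tt


x0 = [1.0, 1.0, 1.0]  # base scenario (assumed values)
x1 = [2.0, 2.0, 2.0]  # alternative scenario (assumed values)
effects = finite_change_effects(toy_model, x0, x1)
total = total_order_effects(effects, 3)
change = toy_model(x1) - toy_model(x0)
# Share of the total change explained by each first-order effect.
first_order_share = [effects[(i,)] / change for i in range(3)]
```

In this toy case the only non-zero interaction is the one between the first and third inputs, so total-order and first-order effects coincide for the second input. A first-order share close to 100%, as in Table 1, signals an almost additive response.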
6 Conclusions
Transport-land use models are widely utilized to inform decision-making. These analytical tools require the estimation and/or calibration of a large number of equations and are computationally intensive, also in the optimization stage. Once model outputs are obtained, policy makers are often interested in the effects of changes in model inputs or parameters (e.g., a change in transport costs or in the structure of demand). In this paper, we have proposed the use of a local sensitivity analysis technique that has a direct interpretation in terms of comparative statics, as it relies on the computation of derivatives and relative changes describing given scenarios.

The proposed approach has the advantage of being extremely inexpensive in terms of model runs and provides useful information on the relative importance of individual variables and groups of variables. Furthermore, by relying on a simple Taylor expansion of the equations of interest, it also quantifies the synergies between variables.

The methodology has been applied to the well-known G-LUM model of Zhou and Kockelman (2009). Using the scenario decomposition technique, we have found that, over the given scenarios, the endogenous variables respond almost additively to variations in the model inputs. Changes in the base year data concerning employment also influence future predictions of the number of jobs and of land use.

Our approach to sensitivity analysis allows the decision-maker to assess the relevance of several scenarios at a low cost in terms of model runs. In particular, in cases of extended cost-benefit analysis, when general equilibrium effects are crucial, our approach to local sensitivity analysis may provide important information for understanding the sources of changes in social welfare.

REFERENCES
Paul B. and Kockelman K., 2005: "Gravity-Based Land Use Model", A Presentation for Texas MPOs, University of Texas Austin.
Borgonovo E. and Apostolakis G.E., 2001: "A New Importance Measure for Risk-Informed Decision-Making", Reliability Engineering and System Safety, 72 (2), pp. 193-212.
Borgonovo E., 2007: "Differential Importance and Comparative Statics: an Application to Inventory Management", International Journal of Production Economics, forthcoming in 2007.
Borgonovo E., 2010: "Sensitivity Analysis with Finite Change: Application to Modified EOQ Models", European Journal of Operational Research, 200, pp. 127-138.
Borgonovo E. and Peccati L., 2011: "Managerial Insights from Service Industry Models: A New Scenario Decomposition Method", Annals of Operations Research, 185 (1), pp. 161-179.
Ciriello V., Di Federico V., Riva M., Cadini F., De Sanctis J., Zio E. and Guadagnini A., 2013: "Polynomial Chaos Expansion for Global Sensitivity Analysis Applied to a Model of Radionuclide Migration in a Randomly Heterogeneous Aquifer", Stochastic Environmental Research and Risk Assessment, 27 (4), pp. 945-954.
Efron B. and Stein C., 1981: "The Jackknife Estimate of Variance", The Annals of Statistics, 9 (3), pp. 586-596.
Eschenbach T.G., 1992: "Spiderplots versus Tornado Diagrams for Sensitivity Analysis", Interfaces, 22, pp. 40-46.
Helton J.C., 1993: "Uncertainty and Sensitivity Analysis Techniques for Use in Performance Assessment for Radioactive Waste Disposal", Reliability Engineering and System Safety, 42, pp. 327-367.
Howard R.A., 1988: "Decision Analysis: Practice and Promise", Management Science, 34 (6), pp. 679-695.
Jungermann H. and Thuring M., 1988: "The Labyrinth of Experts' Minds: Some Reasoning Strategies and Their Pitfalls", Annals of Operations Research, 16, pp. 117-130.
Kleijnen J.P.C., 2005: "An Overview of the Design and Analysis of Simulation Experiments for Sensitivity Analysis", European Journal of Operational Research, 164, pp. 287-300.
Koltai T. and Terlaky T., 2000: "The Difference between the Managerial and Mathematical Interpretation of Sensitivity Analysis Results in Linear Programming", International Journal of Production Economics, 65 (3), pp. 257-274.
Kouridis C., Kioutsioukis I., Papageorgiou T., Mills S., White L. and Ntziachristos L., 2011: "Uncertainty/Sensitivity Analysis of the Transport Model TREMOVE", Report to DG Climate Action, European Commission, Bruxelles.
Li G., Wang S.-W., Rosenthal C. and Rabitz H., 2001: "High Dimensional Model Representations Generated from Low Dimensional Data Samples. I. mp-Cut-HDMR", Journal of Mathematical Chemistry, 30 (1), pp. 1-30.
Mulvey J.M. and Ruszczynski A., 1995: "A New Scenario Decomposition Method for Large-Scale Stochastic Optimization", Operations Research, 43 (3), pp. 477-490.
Oakley J.E. and O'Hagan A., 2004: "Probabilistic Sensitivity Analysis of Complex Models: A Bayesian Approach", Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66 (3), pp. 751-769.
O'Brien F.A., 2004: "Scenario Planning - Lessons for Practice from Teaching and Learning", European Journal of Operational Research, 152, pp. 709-722.
Putman S.H., 1995: "EMPAL and DRAM Location and Land Use Models: A Technical Overview", Presented at the Land Use Modeling Conference, Dallas, Texas, 1995.
Quirk J., 1997: "Qualitative Comparative Statics", Journal of Mathematical Economics, 28, pp. 127-154.
Rabitz H. and Alis O.F., 1999: "General Foundations of High-Dimensional Model Representations", Journal of Mathematical Chemistry, 25, pp. 197-233.
Saltelli A., Tarantola S. and Campolongo F., 2000: "Sensitivity Analysis as an Ingredient of Modelling", Statistical Science, 19 (4), pp. 377-395.
Saltelli A., 2002: "Sensitivity Analysis for Importance Assessment", Risk Analysis, 22 (3), pp. 579.
Saltelli A. and Tarantola S., 2002: "On the Relative Importance of Input Factors in Mathematical Models: Safety Assessment for Nuclear Waste Disposal", Journal of the American Statistical Association, 97 (459), pp. 702-709.
Saltelli A., Tarantola S., Campolongo F. and Ratto M., 2004: "Sensitivity Analysis in Practice. A Guide to Assessing Scientific Models", John Wiley & Sons, New York, USA.
Saltelli A. and Annoni P., 2010: "How to Avoid a Perfunctory Sensitivity Analysis", Environmental Modeling and Software, 25, pp. 1508-1517.
Saltelli A., Ratto M., Tarantola S. and Campolongo F., 2012: "Update 1 of: Sensitivity Analysis for Chemical Models", Chemical Reviews, 112 (5), pp. PR1-PR21.
Samuelson P., 1947: "Foundations of Economic Analysis", Harvard University Press, Cambridge, MA.
Sevcikova H., Raftery A. and Waddell P., 2007: "Assessing Uncertainty in Urban Simulations Using Bayesian Melding", Transportation Research Part B: Methodological, 41 (6), pp. 652-659.
Sevcikova H., Raftery A. and Waddell P., 2011: "Uncertain Benefits: Application of Bayesian Melding to the Alaskan Way Viaduct in Seattle", Transportation Research Part A, 45, pp. 540-553.
Smart Mobility, 2003: "Envision Central Texas Transportation Model: Technical Documentation Prepared for Envision Central Texas", Smart Mobility, Inc., Norwich, VT.
Sobol I.M., 1993: "Sensitivity Estimates for Nonlinear Mathematical Models", Matem. Modelirovanie, 2 (1), pp. 112-118.
Sobol I.M., 2001: "Global Sensitivity Indices for Nonlinear Mathematical Models and their Monte Carlo Estimates", Mathematics and Computers in Simulation, 55 (1), pp. 271-280.
Sobol I.M., 2003: "Theorems and Examples on High Dimensional Model Representation", Reliability Engineering and System Safety, 79, pp. 187-193.
Takayama A., 1993: "Analytical Methods in Economics", The University of Michigan Press, MI, USA, ISBN 0-472-10162-5.
Tietje O., 2005: "Identification of a Small Reliable and Efficient Set of Consistent Scenarios", European Journal of Operational Research, 162, pp. 418-432.
Wagner H.M., 1995: "Global Sensitivity Analysis", Operations Research, 43 (6), pp. 948-969.
Waddell P., 2002: "UrbanSim: Modeling Urban Development for Land Use, Transportation, and Environmental Planning", Journal of the American Planning Association, 68 (3), pp. 297-314.
Waddell P., Borning A., Noth M., Freier N., Becke M. and Ulfarsson G., 2003: "Microsimulation of Urban Development and Location Choices: Design and Implementation of UrbanSim", Networks and Spatial Economics, 3 (1), pp. 43-67.
Wallace S.W., 2000: "Decision-Making Under Uncertainty: is Sensitivity Analysis of Any Use?", Operations Research, (1), pp. 20-25.
Wendell R.E., 2004: "Tolerance Sensitivity and Optimality Bounds in Linear Programming", Management Science, 50 (6), pp. 797-803.
Kockelman K. and Zhou B., 2009: "Lessons Learned in Developing & Applying Land Use Model Systems: A Parcel-Based Example", Transportation Research Record, 2133, pp. 75-82.
Zhou B., Kockelman K. and Lemp J., 2009: "Applications of Integrated Transport and Gravity-Based Land Use Models for Policy Analysis", Transportation Research Record, 2133, pp. 123-213.

Appendix 1: Notations and symbols
Symbol                                          Meaning
$y$                                             Endogenous variables (model outputs)
$y^{t}_{m,i,j}$                                 $m$-th endogenous variable at time $t$ in zone $(i,j)$
$\langle y \rangle$, $V[y]$, $\mathrm{Max}\{y\}$, $\mathrm{Min}\{y\}$    Average, variance, maximum and minimum of the $y$ values
$x = (x_1, x_2, \ldots, x_K)$                   Exogenous variables
$K$                                             Number of exogenous variables (model inputs, including parameters)
$\Omega_X$                                      Exogenous variable space
$f(x)$                                          Simulation exogenous-endogenous variable mapping
$x^0$, $x^1$                                    Any two scenarios
$\Delta x$, $\Delta y$                          Change in $x$ and $y$ across scenarios
$(x_k^1; x_{(-k)}^0)$ [$(x_k^0; x_{(-k)}^1)$]   Point of $\Omega_X$ with $x_k$ at scenario 1 [0] and the remaining variables at scenario 0 [1]
$S\{\cdot\}$                                    Local sensitivity operator
$CS_k$                                          Comparative statics sensitivity measure of $x_k$
$D_k$                                           Differential importance of $x_k$
$f_{i_1,i_2,\ldots,i_k}$                        Generic term in the functional ANOVA expansion of $f(x)$
$\Delta_{i_1,i_2,\ldots,i_k} f$                 Orthogonalized change in $f$ due to the changes in $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$
$\phi^{r}_{i_1,i_2,\ldots,i_r}$                 Finite change sensitivity index of order $r$
$\phi^{1}_{k}$                                  First-order finite change sensitivity index of $x_k$
$\phi^{T}_{k}$                                  Total-order finite change sensitivity index of $x_k$
$I^{t}_{m,i,j} := S\{f^{t}_{m,i,j}(x)\}$        Vector containing the $K$ sensitivity measures of the endogenous variable $y^{t}_{m,i,j}$
$E^{t}_{p,i}$                                   Number of jobs of type $p$ in zone $i$ at time $t$
$H^{t}_{d,i}$                                   Household count of type $d$ in zone $i$ at time $t$

Appendix 2: Local sensitivity analysis for mathematical models
We offer here an analysis of the local sensitivity analysis framework, as it helps in framing the sensitivity analysis questions that we are to address. A local sensitivity measure is a mathematical operator that acts on $f(x)$ to produce the sensitivity indicator of the exogenous variable $x_k$, $k = 1,2,\ldots,K$. Formally, we write

(35) $S : \Omega_S \to \mathbb{R}^K$

In eq. (35), $\Omega_S$ is the set of functions with domain $\Omega_X$ for which the application of the operator $S$ is well posed. That is, $\Omega_S$ is the domain of $S$. We suppose that $\Omega_S$ is contained in a linear space of functions. For instance, if $S$ is a differential operator, then $\Omega_S$ must be contained in the set of differentiable functions on $\Omega_X$. We then write:

$I := S\{f(x)\}$

where the component $I_k$ defines the importance of the $k$-th exogenous variable, $k = 1,2,\ldots,K$. That is, after application of the operator $S$, we obtain a vector of $K$ sensitivity measures. Thus, acting on $f(x)$, $S$ maps each of the indices $1,2,\ldots,K$ into a corresponding value in $\mathbb{R}$. Such a value is called the importance (sensitivity) measure of $x_k$ [Helton (1993)].

This framework applies to generic local sensitivity measures when the simulation output is a scalar quantity. However, in realistic applications, the simulation model produces several quantities of interest (endogenous variables), and often they are spatially and temporally distributed. The fact that these models are becoming increasingly popular calls for an extension of the previous framework, if we wish to set up the sensitivity analysis exercise in a rigorous context.

The first step is the extension to the case in which the model computes $M$ quantities of interest to the analyst/decision-maker. The model output is, then, a vector of quantities of interest. The mapping in eq. (1) becomes:

(36) $y = f(x), \quad f : \Omega_X \to \mathbb{R}^M$

where $f$ is a vector function of $K$ variables, with domain $\Omega_X$. Here we suppose that each of the $M$ component functions $f_m$ is in the domain of the operator $S$. Then, by application of $S$, we obtain a $K \times M$ matrix of sensitivity measures,

(37) $I = S\{f(x)\}$

whose component vectors are defined as follows:

$I_m := S\{f_m(x)\}, \quad m = 1,2,\ldots,M$

The column vector $I_m$ collects the $K$ sensitivity measures of the model inputs with respect to the $m$-th endogenous variable.

In traffic models, a further generalization is needed if the model outputs are spatially distributed. In fact, it is not uncommon that one is interested in, say, the number of jobs or the use of land in a given zone, where a zone is a subset of a geographical region. If we use two coordinates to denote zones, then the traffic simulation model computes the vector of $M$ outputs for each zone. If we consider a two-dimensional geographic region, then let $z_1$ denote the number of zones on the horizontal axis and $z_2$ the number of zones on the vertical axis. If $R$ is the region of interest, it is partitioned into $z_1 \cdot z_2$ zones. Denoting the generic zone as $Z_{ij}$, we have that $R = \bigcup_{i,j} Z_{ij}$, with $Z_{ij} \cap Z_{st} = \emptyset$ if the pair $(i,j)$ differs from the pair $(s,t)$, with $i,s = 1,2,\ldots,z_1$ and $j,t = 1,2,\ldots,z_2$, respectively. Eq. (1) becomes

(38) $y_{m,i,j} = f_{m,i,j}(x), \quad m = 1,2,\ldots,M, \; i = 1,2,\ldots,z_1 \text{ and } j = 1,2,\ldots,z_2$

where $y_{m,i,j}$ is the $m$-th output calculated in zone $(i,j)$.
Application of the operator $S$ to eq. (38) produces a $K \times M \times z_1 \times z_2$ array of sensitivity measures. We write

$I_{m,i,j} := S\{f_{m,i,j}(x)\}, \quad m = 1,2,\ldots,M, \; i = 1,2,\ldots,z_1 \text{ and } j = 1,2,\ldots,z_2$

where $I_{m,i,j}$ is the $K$-component vector of sensitivity measures of the exogenous variables with respect to the $(m,i,j)$ endogenous variable. That is, for each of the $M$ endogenous variables simulated by the model and for each zone of the spatial region of interest, we have a $K$-element vector $I_{m,i,j}$ of sensitivity measures.

Finally, the framework is further extended if the simulation produces the time evolution of the $M$ endogenous variables in each zone. If a discrete-time simulation is chosen (we restrict ourselves to this case to simplify an already complicated notation), eq. (1) also becomes parameterized by the time $t$:

(39) $y^{t}_{m,i,j} = f^{t}_{m,i,j}(x), \quad t = 1,2,\ldots,T, \; i = 1,2,\ldots,z_1, \; j = 1,2,\ldots,z_2 \text{ and } m = 1,2,\ldots,M$

Correspondingly, we have

$I^{t}_{m,i,j} := S\{f^{t}_{m,i,j}(x)\}, \quad t = 1,2,\ldots,T, \; m = 1,2,\ldots,M, \; i = 1,2,\ldots,z_1 \text{ and } j = 1,2,\ldots,z_2$

which represents the vector of sensitivities of the $(m,i,j)$ endogenous variable at time $t$ with respect to the $K$ exogenous variables.

The discussion above highlights a potential problem. The analyst/decision-maker runs the risk of becoming flooded with sensitivity measures, especially if the number of zones or model outputs of interest is high. One then needs to find a way to understand the insights that can be obtained from the sensitivity measures and how results can be represented (visually or graphically) in the most convenient way. This need is already highlighted in earlier works, such as Eschenbach (1992), which suggests that one should not "overwhelm managers with data." Therefore, one needs a systematic approach to the extraction of insights and the selection of the quantities of interest. As to insights, we examine them in the context of the following three settings.
- sign of change: the sensitivity measures in $I^{t}_{m,i,j}$ convey the sign of change in the $(m,i,j)$ endogenous variable, given the variations in the exogenous variables over the predetermined scenarios;
- key-drivers: the magnitudes of the sensitivity measures in $I^{t}_{m,i,j}$ can be used to understand which exogenous variable impacts the $(m,i,j)$ endogenous variable the most;

- model structure: the magnitudes and signs of the interaction terms allow us to appreciate whether interactions among the exogenous variables play a significant role in the endogenous response.

As to the representation of results, several options are available. First, one can visualize the results for all zones through a graphical visualization tool. The idea is to represent results "on the map" of the output. We deem that this presentation mechanism is best suited to setting 1. It has the advantage of information completeness, but the disadvantage of not providing a synthesis of results. To summarize results, one can limit the amount of information by restricting attention to the results produced in one or a few given zones of interest. One can also opt for offering the spatial average of the sensitivity measures across zones. Here, conciseness is the advantage, at the cost of a loss of detail in the results.

Alternatively, one can synthesize the spatial distribution of the endogenous variables in a meaningful statistic. These statistics are typically the spatial average, maximum, minimum and variance. In detail, the spatial average of the endogenous variable $y$ is defined as

(40) $\langle y \rangle = \dfrac{\sum_{i=1}^{z_1} \sum_{j=1}^{z_2} y^{t}_{m,i,j}}{z_1 z_2}$

The spatial variance is given by:

(41) $V[y] = \dfrac{\sum_{i=1}^{z_1} \sum_{j=1}^{z_2} \left( y^{t}_{m,i,j} \right)^2}{z_1 z_2} - \left( \langle y \rangle \right)^2$

The spatial maximum and minimum of $y$ are:

(42) $\mathrm{Max}\{y\} = \max_{i=1,2,\ldots,z_1;\; j=1,2,\ldots,z_2} \{ y^{t}_{m,i,j} \}$ and $\mathrm{Min}\{y\} = \min_{i=1,2,\ldots,z_1;\; j=1,2,\ldots,z_2} \{ y^{t}_{m,i,j} \}$
Here, we need to recognize that each of these quantities depends on the scenario on which we run the model. Each of them is therefore an endogenous variable depending on $x$. The choice of which quantity to use is made on a case-by-case basis and is driven by the overall purposes of the analysis. For instance, in hydraulic applications concerning the modelling of floods, the maximum discharge can be of interest for safety concerns. In radioactive waste management applications, the maximum dose is considered (see, for instance, the application of the LevelE model in Saltelli and Tarantola (2002)). In some other environmental applications, the average and variance are considered meaningful criteria and adopted as endogenous variables (Ciriello et al. (2013)).

The numerical values of the above-mentioned criteria depend on the scenario, i.e., on the value assigned to the exogenous variables. To apply the sensitivity measures based on the operator $S$, we need the function $\langle y \rangle(x)$ (or $V[y](x)$, $\mathrm{Max}\{y\}(x)$ or $\mathrm{Min}\{y\}(x)$, depending on the quantity chosen as endogenous variable) to belong to the domain of the operator $S$ that we are to use. For instance, if we are considering the spatial average and wish to apply a differentiation-based operator, then we need to assume that $\langle y \rangle(x)$ is at least differentiable in $x$.
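As a minimal sketch of a differentiation-based operator applied to the spatial average, the example below approximates the partial derivatives of the spatial average with respect to each exogenous variable by central finite differences. Everything in it is an assumption made for illustration: the toy zonal model, the function names and the step size `h` are hypothetical, and the central-difference scheme is one possible choice rather than the operator used in the paper.

```python
def toy_zonal_model(x):
    # Hypothetical smooth model on a 2 x 2 grid of zones (not G-LUM).
    z1, z2 = 2, 2
    return [[x[0] * (i + 1) + x[1] ** 2 * (j + 1) for j in range(z2)]
            for i in range(z1)]


def spatial_average(y):
    # Average of the zonal values over the whole grid.
    values = [v for row in y for v in row]
    return sum(values) / len(values)


def average_sensitivities(model, x, h=1e-4):
    """Central-difference derivatives of the spatial average w.r.t. each input."""
    sensitivities = []
    for k in range(len(x)):
        up, down = list(x), list(x)
        up[k] += h      # perturb only the k-th exogenous variable upwards
        down[k] -= h    # and downwards, keeping the others fixed
        slope = (spatial_average(model(up)) - spatial_average(model(down))) / (2 * h)
        sensitivities.append(slope)
    return sensitivities


# Derivatives evaluated at an assumed base scenario x = (2, 3).
sens = average_sensitivities(toy_zonal_model, [2.0, 3.0])
```

For this toy model the spatial average equals 1.5 x_1 + 1.5 x_2^2, so the derivatives at (2, 3) are 1.5 and 9, which the numerical estimates match up to rounding. The scheme costs two model runs per exogenous variable and is only meaningful where the spatial average is differentiable in x.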
