
Analytical global sensitivity analysis with Gaussian processes

Published online by Cambridge University Press: 03 August 2017

Ankur Srivastava*
Affiliation:
GE Global Research Center, Niskayuna, New York, USA
Arun K. Subramaniyan
Affiliation:
GE Global Research Center, Niskayuna, New York, USA
Liping Wang
Affiliation:
GE Global Research Center, Niskayuna, New York, USA
*
Reprint requests to: Ankur Srivastava, GE Global Research Center, 1 Research Circle, Niskayuna, NY 12309, USA. E-mail: ankur.rice@gmail.com

Abstract

Methods for efficient variance-based global sensitivity analysis of complex high-dimensional problems are presented and compared. Variance decomposition methods rank inputs according to Sobol indices, which can be computationally expensive to evaluate. Main and interaction effect Sobol indices can be computed analytically in the Kennedy and O'Hagan framework with Gaussian processes. These methods use the high-dimensional model representation concept for variance decomposition, which yields a unique model representation when inputs are uncorrelated. However, when the inputs are correlated, multiple model representations may be possible, leading to ambiguous sensitivity rankings with Sobol indices. In this work, we present the effect of input correlation on sensitivity analysis and discuss the methods presented by Li and Rabitz in the context of Kennedy and O'Hagan's framework with Gaussian processes. Results on simulated and real problems, with correlated and uncorrelated inputs, demonstrate the utility of variance decomposition methods for sensitivity analysis.

Type: Special Issue Articles

Copyright © Cambridge University Press 2017

1. INTRODUCTION

Designing large complex systems requires a thorough understanding of the relative importance of input variables and their interactions with respect to the desired system outputs. Knowledge of the relative sensitivity of input design parameters or uncertain random variables can help designers in a number of ways, such as accelerating design exploration and identifying the set of critical design or uncertain variables that need more attention. Primary methods for sensitivity analysis include screening, local, and global methods, as shown in Figure 1. Screening methods primarily include scatter plots, regression analysis, and correlation coefficients. Such methods are effective with linear models where only main effects are desired. Nonlinearity in the models or the desire to study interaction effects raises the need for more sophisticated methods. One-factor-at-a-time and the more specialized Morris methods can handle nonlinear models but are only suited for capturing main effects (Iooss & Lemaître, Reference Iooss, Lemaître, Meloni and Dellino2015).

Fig. 1. Sensitivity analysis methods.

Partial derivatives of the system response with respect to an input variable, averaged in some sense over the whole domain, can provide reasonable metrics for sensitivity analysis. However, there are two problems with these methods. Unless true gradients can be computed cheaply for the system response, partial derivatives have to be computed from a metamodel emulating the system response, and the accuracy of gradients predicted by metamodels generally cannot be quantified. One would hope that gradients become increasingly accurate as the number of training points used to build the metamodel increases; however, no lower bound on the required number of training points is available. In addition, averaging a partial derivative over the entire input domain becomes computationally expensive as the number of dimensions increases. Variance-based global sensitivity analysis methods perform an exhaustive sensitivity analysis capturing main, interaction, and total effects. However, these methods tend to be computationally expensive, and over the past decade new methods have been developed to alleviate this computational burden. Figure 2 shows a comprehensive review of sensitivity analysis methods with increasing nonlinearity and problem dimensionality. Variance-based methods are clearly more suited to problems with high-order interactions. Next, we focus on variance-based global sensitivity methods and address issues of nonuniqueness and ambiguity in interpreting results when correlation is introduced in the input variables. For a more detailed review of methods, please refer to Saltelli et al. (Reference Saltelli, Tarantola and Campolongo2000), Sudret (Reference Sudret2008), and Iooss and Lemaître (Reference Iooss, Lemaître, Meloni and Dellino2015). In Section 2, we present the theoretical details of variance-based global sensitivity analysis methods and discuss issues in the presence of input correlation. We also present methods from Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) that reformulate Sobol indices to handle input correlation. In Section 3, we present simulated examples with and without input correlation and, finally, sensitivity analysis on a 100-dimensional nonlinear thermal problem. Section 4 discusses our conclusions.

Fig. 2. Review of sensitivity analysis methods with increasing dimensionality and nonlinearity (Iooss & Lemaître, Reference Iooss, Lemaître, Meloni and Dellino2015).

2. METHODS AND TECHNIQUES

In this section we focus on variance-based global sensitivity methods. In Section 2.1, we first present the variance decomposition concept and the definition of Sobol indices, followed by the high-dimensional model representation (HDMR) method in Section 2.2. We then focus on the Kennedy and O'Hagan framework in Section 2.3 and present the computation of Sobol indices in Section 2.4. We derive higher order interaction effects in Section 2.5 and address the effect of input correlation, including the Li–Rabitz method, in Section 2.6.

2.1. Variance-based global sensitivity methods

Variance-based global sensitivity analysis investigates how the uncertainty in the system response of a mathematical/simulation model of a system can be divided among the different uncertain inputs and their interactions. In this section, we will go over the foundation of variance decomposition for sensitivity analysis and high-dimensional model representation and then present application of these methods for Gaussian processes in the popular Kennedy and O'Hagan framework (Reference Kennedy and O'Hagan2001).

In this work, the model of a system could also be an actual experiment that studies the system with pointwise evaluations. Variance-based methods rank the inputs according to the following criterion: if the true value of an uncertain input were known, by how much could the variance of the output be reduced? A larger reduction in variance means that the output is more sensitive to that input, as shown in Figure 3.

Fig. 3. Variance decomposition.

Here it is important to distinguish between polynomial regression-based analysis of variance methods and the generic variance decomposition methods. Analysis of variance methods are also variance decomposition methods but specialized for only polynomial models. In many real problems with sparse data, polynomial models are insufficient. In addition, regression coefficients of various responses may not be normalized and thus may require more sophisticated variance decomposition methods (Reuter et al., Reference Reuter, Mehmood, Gebhardt, Liebscher, Müllerschön and Lepenies2011).

Sobol indices are defined as in Eq. (1):

(1) $$S_p = \frac{{\rm Var}(E(Y \mid x_p))}{{\rm Var}(Y)}.$$

Here Y is the system response and x_p is the pth input variable. Var[·] and E[·] are the variance and expectation operators, respectively. Similarly, Sobol indices for two-way and higher order interactions can be written as in Eq. (2):

(2) $$\eqalign{ S_{ij} &= \frac{{\rm Var}(E(Y \mid x_i, x_j))}{{\rm Var}(Y)} - S_i - S_j, \cr S_\nu &= \frac{{\rm Var}(E(Y \mid x_\nu))}{{\rm Var}(Y)} - \sum_{w \subset \nu} S_w. }$$

Here ν is a multi-index ranging over subsets of the inputs, ν ⊆ {1, 2, …, d}, where d is the number of dimensions, and the sum in Eq. (2) runs over all proper subsets w of ν. The total effect of an input x_p is defined as the sum of all partial sensitivities involving x_p, or in other words, the sum of the main and interaction indices for that input. Because all the indices sum to one, the total sensitivity Sobol index of input x_i can be written as

(3) $$\sum_{i = 1}^d S_i + \sum_{i < j}^{d} S_{ij} + \cdots + S_{12 \cdots d} = 1, \qquad S_i^{\rm total} = 1 - S_{-i},$$

where S_{−i} denotes the sum of all Sobol indices not involving x_i.

The Sobol indices are known to be good descriptors of the sensitivity of the model to its input parameters, as they do not assume any kind of linearity or monotonicity in the model. The above expressions for Sobol indices can be estimated directly when the model is relatively inexpensive to evaluate. However, in reality, simulation models and physical experiments are rarely inexpensive. In such problems, metamodels such as artificial neural networks, radial basis functions, and Gaussian processes are of use, as they can be evaluated millions of times to estimate the Sobol indices in Eqs. (1) and (2).

Even though metamodels replace the black box simulation models, computing Sobol indices by directly sampling a metamodel becomes computationally expensive quickly as the number of dimensions increases. The variance of the system response Var[Y] and Var[E(Y|x_i)] are high-dimensional integrals that are computationally prohibitive to estimate by either Monte Carlo sampling or sparse-grid quadrature. The special methods discussed later are needed to tackle this computational cost.
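To make the sampling cost concrete, the following minimal sketch (ours, for illustration) estimates main and total Sobol indices with the standard pick-freeze Monte Carlo estimators, using the Ishigami function of Section 3.2 as a stand-in for an inexpensive metamodel. Note that it requires n(d + 2) model evaluations, which is exactly the cost the analytical approach developed below avoids.

```python
import numpy as np

def ishigami(x, a=0.7, b=0.1):
    # Cheap stand-in for a metamodel y(x); x has shape (n, 3) on [-pi, pi].
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1])**2 + b * x[:, 2]**4 * np.sin(x[:, 0])

def sobol_pick_freeze(f, d, n=100_000, seed=0):
    # Saltelli-style "pick-freeze" estimates of main and total Sobol indices.
    rng = np.random.default_rng(seed)
    A = rng.uniform(-np.pi, np.pi, (n, d))
    B = rng.uniform(-np.pi, np.pi, (n, d))
    yA, yB = f(A), f(B)
    var_y = np.var(np.concatenate([yA, yB]))
    S, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                              # swap only column i
        yABi = f(ABi)
        S[i] = np.mean(yB * (yABi - yA)) / var_y         # main effect S_i
        ST[i] = 0.5 * np.mean((yA - yABi)**2) / var_y    # total effect S_i^total
    return S, ST

S, ST = sobol_pick_freeze(ishigami, d=3)
print(np.round(S, 3), np.round(ST, 3))
```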

2.2. High-dimensional model representation

We can certainly estimate the Sobol indices using a metamodel y(x). The Sobol decomposition of y(x) can be written uniquely as the HDMR in Eq. (4) if the inputs x are uncorrelated:

(4) $$y(x) = z_0 + \sum_{i = 1}^d z_i(x_i) + \sum_{i < j}^{d} z_{ij}(x_i, x_j) + \cdots + z_{12 \ldots d}(x_1, \ldots, x_d).$$

Here, the effect functions z_i(x_i) are given as in Eq. (5) (Oakley & O'Hagan, Reference Oakley and O'Hagan2004):

(5) $$\eqalign{ z_0 &= E[Y], \cr z_i &= E[Y \mid x_i] - E[Y], \cr z_{ij} &= E[Y \mid x_i, x_j] - z_i(x_i) - z_j(x_j) - E[Y], \cr z_{ijk} &= E[Y \mid x_i, x_j, x_k] - z_i(x_i) - z_j(x_j) - z_k(x_k) \cr & \quad - z_{ij}(x_i, x_j) - z_{jk}(x_j, x_k) - z_{ik}(x_i, x_k) - E[Y]. }$$

It is important to note that by construction E[z_i] = 0, E[z_{ij}] = 0, and E[z_{ijk}] = 0. Finally, the Sobol indices are given by

$$S_i = \frac{{\rm Var}[z_i]}{{\rm Var}[Y]} \quad {\rm and} \quad S_{ij} = \frac{{\rm Var}[z_{ij}]}{{\rm Var}[Y]}.$$
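To make the effect functions concrete, the short sketch below (ours, for illustration) estimates z_1(x_1) = E[Y|x_1] − E[Y] by Monte Carlo averaging over the remaining input, using the polynomial model of Section 3.1, where the exact answer z_1(x_1) = (1/2)(3x_1^2 + 1) − 1 is known.

```python
import numpy as np

# Sketch: estimate the main effect function z_1(x_1) = E[Y|x_1] - E[Y] for the
# polynomial of Section 3.1, y = (1/4)(3 x1^2 + 1)(3 x2^2 + 1), x ~ U[0,1]^2.
rng = np.random.default_rng(1)
x2 = rng.uniform(0.0, 1.0, 200_000)        # Monte Carlo average over x_2
grid = np.linspace(0.0, 1.0, 11)

def y(x1, x2):
    return 0.25 * (3 * x1**2 + 1) * (3 * x2**2 + 1)

Ey = y(rng.uniform(0, 1, 200_000), x2).mean()      # E[Y], analytically 1
z1 = np.array([y(x1, x2).mean() for x1 in grid]) - Ey
print(np.round(z1, 3))                             # matches 0.5*(3 x1^2 + 1) - 1
```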

2.3. Kennedy and O'Hagan framework

Kennedy and O'Hagan (Reference Kennedy and O'Hagan2001) presented a framework to accomplish a variety of tasks such as building metamodels, performing probabilistic calibration of simulation parameters with respect to observed data, and building a metamodel for the discrepancy between a calibrated simulation model and observed data. Researchers at the Los Alamos National Laboratory first released a Matlab implementation of this framework in the gpmsa software package. At GE Global Research Center, we have built the GE Bayesian hybrid modeling framework shown in Figure 4, with some core components from gpmsa and several significant enhancements developed in-house at GE. For more details, please refer to Wang et al. (Reference Wang, Fang, Subramaniyan, Jothiprasad, Gardner, Kale, Akkaram, Beeson, Wiggs and Nelson2011), Chennimalai Kumar et al. (Reference Chennimalai Kumar, Subramaniyan and Wang2012), Subramaniyan et al. (Reference Subramaniyan, Kumar, Wang, Beeson and Wiggs2012), and Chennimalai Kumar et al. (Reference Chennimalai Kumar, Subramaniyan and Wang2013). Borrowing notation from Oakley and O'Hagan (Reference Oakley and O'Hagan2004), we first describe the Kennedy and O'Hagan theoretical framework and then delve into the computation of Sobol indices with Gaussian processes (GPs) in this framework. In Figure 4, θ_j represents any calibration parameters a simulation may have. In the Kennedy and O'Hagan framework, sensitivity analysis is done on both input variables and calibration parameters.

Fig. 4. Bayesian hybrid modeling framework. GEBHM is GE's in-house Bayesian hybrid model implementation of the Kennedy and O'Hagan framework.

Input data: Let X be the training data with m rows and d input dimensions, and let Y be the output data. Here ξ_k represents the kth row of X.

$$\eqalign{ &\hbox{Prior GP:}\ N\big(0, \sigma^2 c(x, x')\big), \cr &\hbox{Posterior GP } \eta(x):\ N\big(m^*(x), \sigma^2 c^*(x, x')\big), }$$
(6) $$\eqalign{ m^*(x) &= t(x)^T A^{-1} Y, \cr c^*(x, x') &= c(x, x') - t(x)^T A^{-1} t(x'). }$$

Here, t(x)^T = [c(x, ξ_1), c(x, ξ_2), …, c(x, ξ_m)] and

(7) $$A = \left[ \matrix{ c(\xi_1, \xi_1) & c(\xi_1, \xi_2) & \cdots & c(\xi_1, \xi_m) \cr c(\xi_2, \xi_1) & c(\xi_2, \xi_2) & \cdots & c(\xi_2, \xi_m) \cr \vdots & \vdots & \ddots & \vdots \cr c(\xi_m, \xi_1) & c(\xi_m, \xi_2) & \cdots & c(\xi_m, \xi_m) } \right],$$
(8) $$c(x, x') = \frac{1}{\lambda_z} \prod_{i = 1}^{d} \exp\big[-\beta_i \vert x_i - x'_i \vert^2\big] + \frac{1}{\lambda_{\rm WS}}.$$

Here the β_i, λ_z, and λ_WS are parameters that need to be estimated. In the Kennedy and O'Hagan framework, they are estimated with a Markov chain Monte Carlo method.
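The following minimal sketch implements the kernel of Eq. (8) and the posterior mean of Eq. (6). The fixed hyperparameter values and the small stabilizing nugget are our assumptions for illustration; in the Kennedy and O'Hagan framework the β_i, λ_z, and λ_WS would instead be drawn by MCMC.

```python
import numpy as np

def kernel(X1, X2, beta, lam_z, lam_ws):
    # c(x, x') = (1/lam_z) * prod_i exp(-beta_i |x_i - x'_i|^2) + 1/lam_ws
    sq = (X1[:, None, :] - X2[None, :, :])**2        # shape (n1, n2, d)
    return np.exp(-(sq * beta).sum(-1)) / lam_z + 1.0 / lam_ws

def posterior_mean(Xtrain, Y, Xnew, beta, lam_z, lam_ws, nugget=1e-8):
    # m*(x) = t(x)^T A^{-1} Y, per Eq. (6).
    A = kernel(Xtrain, Xtrain, beta, lam_z, lam_ws)
    A += nugget * np.eye(len(Xtrain))                # numerical stabilizer
    t = kernel(Xnew, Xtrain, beta, lam_z, lam_ws)    # rows are t(x)^T
    return t @ np.linalg.solve(A, Y)
```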

2.3.1. Deriving effect functions analytically for Gaussian process models in the Kennedy and O'Hagan framework

As shown in Oakley and O'Hagan's (Reference Oakley and O'Hagan2004) paper, the posterior mean of any linear functional of a GP, such as an integral, is also a GP. Using this principle, Oakley and O'Hagan derive linear functionals for a GP with a polynomial mean function. In the rest of this work, the posterior mean and variance of a GP are denoted E*[·] and Var*[·], respectively. Simplifying those expressions for a zero mean GP and the covariance kernel in Eq. (8), we get

(9) $$E[Y] = \int_X \eta(x)\, dG(x) = T A^{-1} Y.$$

Here T is an integral computed over all d dimensions for each of the m input data points. For the BHM GP, all inputs are assumed to be uniformly distributed and scaled to [0, 1]. For the kth row of the input data, T[ξ_k] is given as

(10) $$T[\xi_k] = \int_{[0,1]^d} c(x, \xi_k)\, dx = \frac{1}{\lambda_z} \prod_{i = 1}^d \int_0^1 \exp\big[-\beta_i \vert x_i - \xi_k^i \vert^2\big]\, dx_i + \frac{1}{\lambda_{\rm WS}}.$$

Applying the transformation (note that here y denotes the transformed variable only, whereas Y denotes the system response or output in question)

(11) $$\eqalign{ y &= \sqrt{\beta_i}\, (x_i - \xi_k^i), \cr dy &= \sqrt{\beta_i}\, dx_i, \cr T[\xi_k] &= \frac{1}{\lambda_z} \prod_{i = 1}^d \int_a^b \exp[-y^2]\, \frac{dy}{\sqrt{\beta_i}} + \frac{1}{\lambda_{\rm WS}}, \cr a &= -\sqrt{\beta_i}\, \xi_k^i, \cr b &= \sqrt{\beta_i}\, (1 - \xi_k^i), }$$

we obtain

(12) $$T[\xi_k] = \frac{1}{\lambda_z} \prod_{i = 1}^d \sqrt{\frac{\pi}{\beta_i}} \left[ \Phi\big(\sqrt{2\beta_i}\, (1 - \xi_k^i)\big) + \Phi\big(\sqrt{2\beta_i}\, \xi_k^i\big) - 1 \right] + \frac{1}{\lambda_{\rm WS}}.$$

Finally, from Eqs. (11) and (12) we can compute E[η(x)]. Φ(·) is the standard normal cumulative distribution function. Writing the per-dimension integral in Eq. (12) as c_3(ξ_k^i, β_i), so that c_3 is an m × d array whose (k, i) entry is the ith one-dimensional factor for the kth data point, T[ξ_k] is given as

(13) $$T[\xi_k] = \frac{1}{\lambda_z} \prod_{i = 1}^d c_3(\xi_k^i, \beta_i) + \frac{1}{\lambda_{\rm WS}}.$$
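A quick numerical sanity check of this building block is straightforward; the sketch below (ours) evaluates the closed form of c_3 and compares it against direct quadrature of the one-dimensional integral.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def c3(xi, beta):
    # Closed form of integral_0^1 exp(-beta (x - xi)^2) dx, per Eq. (12).
    s = np.sqrt(2.0 * beta)
    return np.sqrt(np.pi / beta) * (norm.cdf(s * (1 - xi)) + norm.cdf(s * xi) - 1)

# Quadrature check of the closed form at an arbitrary point.
xi, beta = 0.3, 4.0
num, _ = quad(lambda x: np.exp(-beta * (x - xi)**2), 0.0, 1.0)
print(c3(xi, beta), num)   # the two values agree
```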

Next, we focus on the computation of E[Y|x_p], given by

(14) $$E[Y \mid x_p] = \int_{X_{-p}} \eta(x)\, dG_{-p \mid p}(x_{-p} \mid x_p).$$

Here, X_{−p} denotes integration over all dimensions except the pth dimension, and G_{−p|p}(x_{−p}|x_p) represents the product of the input probability distributions of all the inputs except the pth input. From Oakley and O'Hagan (Reference Oakley and O'Hagan2004), the posterior mean of E[Y|x_p] is given as

(15) $$E^*[E(Y \mid x_p)] = T_p(x_p) A^{-1} Y.$$

This yields the main effect function z_i = [T_i(x_i) − T] A^{−1} Y and, similarly, z_{ij} = [T_{ij}(x_i, x_j) − T_i(x_i) − T_j(x_j) + T] A^{−1} Y. Following Eq. (12), T_p(x_p) can be written as

(16) $$T_p[\xi_k](x_p) = \frac{1}{\lambda_z}\, \exp\big[-\beta_p \vert x_p - \xi_k^p \vert^2\big] \prod_{i = 1, i \ne p}^d c_3(\xi_k^i, \beta_i) + \frac{1}{\lambda_{\rm WS}}.$$

It is important to note that the 1/λ_WS term cancels out in the computation of the effect functions z_i(x_i). Moreover, 1/λ_WS is usually close to zero, and in most problems it is acceptable to ignore it.

Oakley and O'Hagan (Reference Oakley and O'Hagan2004) also give expressions for posterior variance of E[Y|x p ], which when simplified for a zero mean GP is given as

(17) $${\rm Var}^*[E(Y \mid x_p)] = U_p - T_p(x_p) A^{-1} T_p(x_p)^T,$$

where U_p is the second-order integral of the covariance kernel c(x, x′) and is given as

(18) $$U_p = \frac{1}{\lambda_z} \prod_{i = 1, i \ne p}^d \int_0^1 c_3(x_i, \beta_i)\, dx_i + \frac{1}{\lambda_{\rm WS}}.$$

2.4. Computation of Sobol indices

Revisiting Eqs. (1) and (2), we need to compute E*[Var(E(Y|x_p))], E*[Var(E(Y|x_i, x_j))], and E*[Var(Y)]. Using the identity Var(Y) = E(Y^2) − E(Y)^2 and writing a = E(Y|x_p), the numerator decomposes as E*[Var(a)] = E*[E(a^2)] − E*[E(a)^2] = E*[E(E(Y|x_p)^2)] − E*[E(Y)^2], since E(E(Y|x_p)) = E(Y). From Oakley and O'Hagan (Reference Oakley and O'Hagan2004),

(19) $$E^*[E(E(Y \mid x_p)^2)] = U_p - {\rm tr}(A^{-1} P_p) + {\rm tr}(e^T P_p e).$$

Here, e = A^{−1} Y and U_p = (1/λ_z) ∏_{i=1, i≠p}^d c_1(β_i), with c_1(β_i) = ∫_0^1 c_3(x_i, β_i) dx_i as in Eq. (18), and P_p is given by

(20) $$P_p = \int_{X_p} \int_{X_{-p}} \int_{X_{-p}} t(x)\, t(x')^T\, dG_{-p \mid p}(x_{-p} \mid x_p)\, dG_{-p \mid p}(x'_{-p} \mid x_p)\, dG_p(x_p),$$

where P p is an m × m matrix and is given as

(21) $$\eqalign{ P_p[\xi_i, \xi_j] &= \frac{1}{\lambda_z^2} \int_{X_p} \int_{X_{-p}} \int_{X_{-p}} \prod_{k = 1}^d \exp\big[-\beta_k \vert x_k - \xi_k^i \vert^2\big] \cr & \quad \times \exp\big[-\beta_k \vert x'_k - \xi_k^j \vert^2\big]\, dG_{-p \mid p}(x_{-p} \mid x_p) \cr & \quad \times dG_{-p \mid p}(x'_{-p} \mid x_p)\, dG_p(x_p), }$$

which simplifies to

(22) $$P_p[\xi_i, \xi_j] = \frac{1}{\lambda_z^2} \prod_{k = 1, k \ne p}^d c_3(\xi_k^i, \beta_k)\, c_3(\xi_k^j, \beta_k) \int_{X_p} \exp\big[-\beta_p \vert x_p - \xi_p^i \vert^2\big]\, \exp\big[-\beta_p \vert x_p - \xi_p^j \vert^2\big]\, dG_p(x_p).$$

The integral in the above equation can be solved by first applying the transformation

$$y = x_p - \frac{\xi_p^i + \xi_p^j}{2} = x_p - m_p \quad {\rm and} \quad d_p = \frac{\xi_p^i - \xi_p^j}{2}.$$

This transformation simplifies the integral to

(23) $$C_2 = \int_{-m_p}^{1 - m_p} \exp\big[-\beta_p \vert y - d_p \vert^2\big]\, \exp\big[-\beta_p \vert y + d_p \vert^2\big]\, dy,$$

which yields

(24) $$C_2 = \sqrt{\frac{\pi}{2\beta_p}}\, \exp\big[-2\beta_p d_p^2\big] \left[ \Phi\big(2\sqrt{\beta_p}\, (1 - m_p)\big) + \Phi\big(2\sqrt{\beta_p}\, m_p\big) - 1 \right].$$
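A quadrature check of this closed form is straightforward; the sketch below (ours) compares Eq. (24) against direct numerical integration of the product kernel over [0, 1].

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def C2(beta, xi_i, xi_j):
    # Closed form of Eq. (23)-(24): integral over [0, 1] of
    # exp(-beta (x - xi_i)^2) * exp(-beta (x - xi_j)^2) dx.
    m = 0.5 * (xi_i + xi_j)
    d = 0.5 * (xi_i - xi_j)
    s = 2.0 * np.sqrt(beta)
    return (np.sqrt(np.pi / (2 * beta)) * np.exp(-2 * beta * d**2)
            * (norm.cdf(s * (1 - m)) + norm.cdf(s * m) - 1))

beta, xi_i, xi_j = 3.0, 0.2, 0.7
num, _ = quad(lambda x: np.exp(-beta * ((x - xi_i)**2 + (x - xi_j)**2)), 0, 1)
print(C2(beta, xi_i, xi_j), num)   # the two values agree
```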

Now, to compute the numerator of the Sobol index we need E*[E(Y)^2], which can be decomposed as

(25) $$E^*[E(Y)^2] = {\rm Var}^*[E(Y)] + \big(E^*[E(Y)]\big)^2.$$

From the computation of the posterior variance of the effect function we have

(26) $${\rm Var}^*[E(Y)] = \frac{1}{\lambda_z} \prod_{i = 1}^d c_1(\beta_i) - T A^{-1} T^T,$$

and

(27) $$\big(E^*[E(Y)]\big)^2 = (T A^{-1} Y)^T (T A^{-1} Y) = Y^T A^{-T} T^T T A^{-1} Y.$$

Thus, finally, we have

(28) $$e_2 = E^*[E(Y)^2] = \frac{1}{\lambda_z} \prod_{i = 1}^d c_1(\beta_i) - {\rm tr}\big[T^T T \big(A^{-1} - A^{-1} Y Y^T A^{-T}\big)\big].$$

This gives us the final expression for the numerator of the Sobol index as

(29) $$E^*[{\rm Var}(E(Y \mid x_p))] = U_p - {\rm tr}(A^{-1} P_p) + {\rm tr}(e^T P_p e) - e_2.$$

Now, to complete the calculation of the Sobol index, we need the posterior expectation of the total variance of Y, given as

(30) $$E^*[{\rm Var}(Y)] = E^*[E(Y^2)] - E^*[E(Y)^2] = E^*[E(Y^2)] - e_2.$$

The first term in the above equation is given by

(31) $$E^*[E(Y^2)] = Y^T A^{-1} P_p A^{-1} Y + \big[1 - {\rm tr}(A^{-1} P_p)\big].$$

The expressions derived above also take into account the prediction uncertainty of the GP. It is easier to derive the integrals based only on the mean function of the effect function GP, but doing so may give incorrect results. As Oakley and O'Hagan (Reference Oakley and O'Hagan2004) note, Var[E*{z_i(x_i)}] is not the same as E*[Var{z_i(x_i)}]. The term Var[E*{z_i(x_i)}] is the variance of only the mean function of the effect function GP, whereas E*[Var{z_i(x_i)}] is the posterior expectation of the variance of the effect function and includes the GP prediction uncertainty. In other words, the second term on the right-hand side of Eq. (32) is usually not zero, which makes it essential to include GP prediction uncertainty when calculating Sobol indices:

(32) $$E^*[{\rm Var}\{z_i(x_i)\}] = {\rm Var}[E^*\{z_i(x_i)\}] + E[{\rm Var}^*\{z_i(x_i)\}].$$

To summarize, Sobol indices from a Gaussian process model in the Kennedy and O'Hagan framework are given as below; U_p, P_p, and e_2 are defined in the equations above.

(33) $$S_p = \frac{E^*[{\rm Var}(E(Y \mid x_p))]}{E^*[{\rm Var}(Y)]} = \frac{U_p - {\rm tr}(A^{-1} P_p) + {\rm tr}(e^T P_p e) - e_2}{Y^T A^{-1} P_p A^{-1} Y + [1 - {\rm tr}(A^{-1} P_p)] - e_2}.$$

2.5. Derivation of higher order interaction effects

For two-factor interactions, z_{ij} = E(Y|x_i, x_j) − z_i(x_i) − z_j(x_j) − E(Y), and the variance can be broken down as

(34) $$\eqalign{ {\rm Var}(z_{ij}) &= {\rm Var}(E(Y \mid x_i, x_j)) + {\rm Var}(z_i) + {\rm Var}(z_j) \cr & \quad - 2{\rm Cov}(E(Y \mid x_i, x_j), z_i) - 2{\rm Cov}(E(Y \mid x_i, x_j), z_j) \cr & \quad + 2{\rm Cov}(z_i, z_j). }$$

In the above equation, Cov(E(Y), z_i) = 0, and assuming the inputs are uncorrelated, we can safely take Cov(z_i, z_j) = 0. The term 2Cov(E(Y|x_i, x_j), z_i) can be expanded as

(35) $$\eqalign{ 2{\rm Cov}(E(Y \mid x_i, x_j), z_i) &= 2{\rm Cov}(z_{ij} + z_i + z_j + E(Y), z_i) \cr &= 2{\rm Cov}(z_{ij}, z_i) + 2{\rm Cov}(z_i, z_i) + 2{\rm Cov}(z_j, z_i) \cr & \quad + 2{\rm Cov}(E(Y), z_i). }$$

Again Cov(E(Y), z_i) = Cov(E(Y), z_j) = 0, and for uncorrelated inputs Cov(z_j, z_i) = Cov(z_{ij}, z_i) = Cov(z_{ij}, z_j) = 0. This leaves 2Cov(E(Y|x_i, x_j), z_i) = 2Cov(z_i, z_i) = 2Var(z_i) and, similarly, 2Cov(E(Y|x_i, x_j), z_j) = 2Var(z_j), which gives us

(36) $${\rm Var}(z_{ij}) = {\rm Var}(E(Y \mid x_i, x_j)) - {\rm Var}(z_i) - {\rm Var}(z_j).$$

Thus, finally, Sobol indices for two-way interactions are given by

(37) $$S_{ij} = \frac{E^*[{\rm Var}(E(Y \mid x_i, x_j))]}{E^*[{\rm Var}(Y)]} - S_i - S_j.$$

Similarly, a higher order interaction Sobol index can be defined as

(38) $$S_{ijk} = \frac{E^*[{\rm Var}(E(Y \mid x_i, x_j, x_k))]}{E^*[{\rm Var}(Y)]} - S_{ij} - S_{jk} - S_{ik} - S_i - S_j - S_k.$$

To complete the calculation of S_{ijk}, we only need to compute E*[Var(E(Y|x_i, x_j, x_k))], as E*[Var(Y)] has already been computed. We can extend the expression for E*[Var(E(Y|x_p))] in Eq. (29) as

(39) $$E^*[{\rm Var}(E(Y \mid x_i, x_j, x_k))] = U_{ijk} - {\rm tr}(A^{-1} P_{ijk}) + {\rm tr}(e^T P_{ijk} e) - e_2.$$

In the above equation, A^{−1}, e, and e_2 do not change for the higher order interactions. The terms U_{ijk} and P_{ijk} are defined as

(40) $$U_{ijk} = \frac{1}{\lambda_z} \prod_{l = 1, l \notin \{i, j, k\}}^d \int_0^1 c_3(x_l, \beta_l)\, dx_l + \frac{1}{\lambda_{\rm WS}},$$
(41) $$\eqalign{ P_{ijk}[\xi_a, \xi_b] &= \frac{1}{\lambda_z^2} \prod_{l = 1, l \notin \{i, j, k\}}^d c_3(\xi_l^a, \beta_l)\, c_3(\xi_l^b, \beta_l) \cr & \quad \times \int_{X_i} \exp\big[-\beta_i \vert x_i - \xi_i^a \vert^2\big]\, \exp\big[-\beta_i \vert x_i - \xi_i^b \vert^2\big]\, dG_i \cr & \quad \times \int_{X_j} \exp\big[-\beta_j \vert x_j - \xi_j^a \vert^2\big]\, \exp\big[-\beta_j \vert x_j - \xi_j^b \vert^2\big]\, dG_j \cr & \quad \times \int_{X_k} \exp\big[-\beta_k \vert x_k - \xi_k^a \vert^2\big]\, \exp\big[-\beta_k \vert x_k - \xi_k^b \vert^2\big]\, dG_k, }$$

where ξ_a and ξ_b denote two training rows.

This term can easily be computed as

(42) $$P_{ijk}[\xi_a, \xi_b] = \frac{1}{\lambda_z^2} \prod_{l = 1, l \notin \{i, j, k\}}^d c_3(\xi_l^a, \beta_l)\, c_3(\xi_l^b, \beta_l)\, C_2(i)\, C_2(j)\, C_2(k),$$

where C_2 is given in Eq. (24).

2.6. Problem with input correlation

The work discussed in the previous sections assumes that the inputs are uncorrelated. This assumption renders the HDMR of the underlying function unique, with orthogonal effect functions. Orthogonality of the effect functions allows each Sobol index to be computed independently, without worrying about the covariance between different effect functions. In the presence of input correlation, sampling has to be done from joint distributions, which increases the computational burden. Copulas and sampling strategies such as replicated Latin hypercube sampling can alleviate the computational burden to an extent; however, input correlation has a more critical effect on the variance decomposition itself, which we discuss here. Saltelli and Tarantola (Reference Saltelli and Tarantola2002) give the following specific example. If Y = x_1 + x_2 + a_{23} x_2 x_3 and x_2 and x_3 are correlated, Var(E[Y|x_1]) will be a function of the correlation between x_2 and x_3. In other words, if we compute Sobol indices assuming x_2 and x_3 are uncorrelated, the calculation will be erroneous whenever a_{23} is nonzero. Saltelli and Tarantola and other authors describe this as an effect carried over due to correlation and stress the need for care when drawing conclusions from Sobol indices with correlated inputs.

Xu and Gertner (Reference Xu and Gertner2008) developed an approach for linear models by splitting the contribution of an individual input to the uncertainty of the model output into two components: the correlated contribution and the uncorrelated one. Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) extended the work of Xu and Gertner (Reference Xu and Gertner2008) with a relaxed HDMR concept to compute variance contribution due to correlation and problem structure for nonlinear models. Similarly, Chastaing et al. (Reference Chastaing, Gamboa and Prieur2014) used the hierarchical orthogonality of component functions and the traditional Gram–Schmidt method to develop a hybrid method for variance decomposition. Finally, Caniou and Sudret (Reference Caniou and Sudret2010) used polynomial chaos expansions with copula methods to model the dependence structure. Next, we discuss the Li–Rabitz framework in a little more detail.

Li and Rabitz relax the orthogonality condition by enforcing orthogonality only with nested lower order component functions. This is referred to as hierarchical orthogonality in the literature. In other words, z_{ij}(x_i, x_j) is only required to be orthogonal to z_i(x_i) and z_j(x_j), and not to any other component functions. Using the hierarchical orthogonality condition, Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) and Li and Rabitz (Reference Li and Rabitz2012) decompose the output variance as

(43) $${\rm Var}[Y] = \sum_{i = 1}^{2^d - 1} \left[ {\rm Var}[z_{p_i}] + {\rm Cov}\Big(z_{p_i}, \sum_{k = 1, k \ne i}^{2^d - 1} z_{p_k}\Big) \right].$$

Here p_i is a multi-index ranging over the 2^d − 1 main and interaction terms p_1, …, p_{12}, …, p_{12⋯d}. Using this decomposition, Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) define three sensitivity indices, total (S_{p_j}), structural (S_{p_j}^a), and correlative (S_{p_j}^b), as

(44) $$S_{p_j}^a = \frac{{\rm Var}[z_{p_j}]}{{\rm Var}[Y]},$$
(45) $$S_{p_j}^b = \frac{{\rm Cov}\Big(z_{p_j}, \sum_{k = 1, k \ne j}^{2^d - 1} z_{p_k}\Big)}{{\rm Var}[Y]},$$
(46) $$S_{p_j} = S_{p_j}^a + S_{p_j}^b.$$

It is important to note that the total index S_{p_j} is not representative of the total effect; it is only the sum of the structural and correlative contributions. For problems with correlated input variables, the component functions are nonunique, and multiple solutions are possible as long as they satisfy Eq. (4). In other words, as explained by Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) and Li and Rabitz (Reference Li and Rabitz2012), the component functions cannot be estimated independently of each other in the presence of input correlation. They suggest that the component functions be estimated simultaneously with a back-fitting method for additive models (Hastie et al., Reference Hastie, Tibshirani and Friedman2001).
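To illustrate the back-fitting idea, the following minimal sketch (ours) fits the component functions of a purely additive model with correlated Gaussian inputs by iteratively smoothing partial residuals. It uses a crude binned-average smoother and is an illustration of the general back-fitting scheme of Hastie et al. (2001), not Li and Rabitz's exact algorithm.

```python
import numpy as np

# Minimal back-fitting sketch for an additive model y = z_1(x_1) + z_2(x_2)
# with correlated inputs. True components here: z_1 = x_1, z_2 = x_2^2.
rng = np.random.default_rng(2)
cov = [[1.0, 0.6], [0.6, 1.0]]
X = rng.multivariate_normal([0, 0], cov, size=5000)
y = X[:, 0] + X[:, 1]**2

def smooth(x, r, bins=30):
    # Binned conditional mean E[r | x] as a cheap smoother.
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    means = np.array([r[idx == b].mean() for b in range(bins)])
    return means[idx]

z = np.zeros_like(X)                           # component function values
for _ in range(20):                            # back-fitting iterations
    for j in (0, 1):
        resid = y - y.mean() - z[:, 1 - j]     # remove the other component
        z[:, j] = smooth(X[:, j], resid)
        z[:, j] -= z[:, j].mean()              # enforce E[z_j] = 0
```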

3. ANALYTICAL EXAMPLES

In this section, we present a comparison of the sensitivities given by Gaussian process models in the Kennedy and O'Hagan framework on toy examples where the Sobol indices can be computed analytically.

3.1. Uncorrelated example: Polynomial model

Consider the following polynomial model for 0 ≤ x_1, x_2 ≤ 1:

(47) $$y(x_1, x_2) = \frac{1}{2^2} (3x_1^2 + 1)(3x_2^2 + 1).$$

Here, E[y] = 1 and the effect functions are given as

(48) $$z_1(x_1) = \frac{1}{2}(3x_1^2 + 1) - 1,$$
(49) $$z_2(x_2) = \frac{1}{2}(3x_2^2 + 1) - 1,$$
(50) $$z_{12}(x_1, x_2) = \frac{1}{2^2}(3x_1^2 + 1)(3x_2^2 + 1) - z_1(x_1) - z_2(x_2) - 1.$$

Sobol indices can be computed analytically for this problem and are given below, with |p| = 1 for p = 1 or p = 2 and |p| = 2 for p = {1, 2}:

(51) $$S_p = \frac{5^{-\vert p \vert}}{(6/5)^d - 1}.$$
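Eq. (51) is easy to verify numerically; the sketch below (ours) checks the d = 2 main effect index S_1 = 5/11 against a Monte Carlo estimate built from the effect function of Eq. (48).

```python
import numpy as np

# Monte Carlo check of the closed-form indices in Eq. (51) for d = 2.
rng = np.random.default_rng(3)
n = 400_000
x1, x2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
y = 0.25 * (3 * x1**2 + 1) * (3 * x2**2 + 1)

var_y = y.var()                                  # analytic: 11/25
var_z1 = (0.5 * (3 * x1**2 + 1) - 1).var()       # analytic: 1/5
print(var_z1 / var_y, 5**-1 / ((6 / 5)**2 - 1))  # both ~ 5/11 = 0.4545
```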

Figure 5 shows a comparison of the main and interaction effect functions generated by the BHM GP, and Figure 6 shows a comparison of the Sobol indices. The model was built with a 125-point Latin hypercube DOE.

Fig. 5. Comparison of main and interaction effect functions generated by Bayesian hybrid model Gaussian process.

Fig. 6. Comparison of analytical main, interaction, and total sensitivity indices with Bayesian hybrid modeling for the polynomial problem.

3.2. Uncorrelated example: Ishigami function

We generate Sobol indices for the following Ishigami function with −π ≤ x_1, x_2, x_3 ≤ π and a = 0.7, b = 0.1:

(52) $$y = \sin(x_1) + a \sin^2(x_2) + b x_3^4 \sin(x_1).$$

As described previously, Sobol indices can be computed as ratios of the following analytically computed variances (Sudret, Reference Sudret2008):

(53) $$\eqalign{ D &= \frac{a^2}{8} + \frac{b \pi^4}{5} + \frac{b^2 \pi^8}{18} + \frac{1}{2}, \cr D_1 &= \frac{b \pi^4}{5} + \frac{b^2 \pi^8}{50} + \frac{1}{2}, \cr D_2 &= \frac{a^2}{8}, \cr D_3 &= 0, \cr D_{12} &= D_{23} = 0, \cr D_{13} &= \frac{8 b^2 \pi^8}{225}, \cr D_{123} &= 0. }$$
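The indices then follow directly as ratios of these variances; a short sketch (ours) evaluating them for the values a = 0.7 and b = 0.1 used here:

```python
import numpy as np

# Sobol indices of the Ishigami function from the closed-form variances
# in Eq. (53), with a = 0.7 and b = 0.1 as used in the text.
a, b = 0.7, 0.1
D = a**2 / 8 + b * np.pi**4 / 5 + b**2 * np.pi**8 / 18 + 0.5
D1 = b * np.pi**4 / 5 + b**2 * np.pi**8 / 50 + 0.5
D2 = a**2 / 8
D13 = 8 * b**2 * np.pi**8 / 225
S = {"S1": D1 / D, "S2": D2 / D, "S3": 0.0, "S13": D13 / D}
S["S1_total"] = (D1 + D13) / D      # total = main + its only interaction
print({k: round(v, 4) for k, v in S.items()})
```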

For sensitivity analysis on this problem, a BHM model was built with 300 points. Figure 7 shows accurate computation of the main, second-order, and higher order interaction Sobol indices for the Ishigami problem.

Fig. 7. Comparison of analytical main, interaction, and total sensitivity indices with Bayesian hybrid modeling for the Ishigami problem.

3.3. Uncorrelated example: Sobol function

We generate Sobol indices for the following Sobol function (Sudret, Reference Sudret2008) with 0 ≤ x_1, x_2, …, x_q ≤ 1 and q = 8 inputs:

(54) $$y = \prod_{i = 1}^q \frac{\vert 4x_i - 2 \vert + a_i}{1 + a_i}.$$

Sobol indices can be computed as ratios of the following analytically computed variances (Sudret, Reference Sudret2008):

(55) $$\eqalign{ D &= \prod_{i = 1}^q (D_i + 1) - 1, \cr D_i &= \frac{1}{3(1 + a_i)^2}, \cr S_{i_1 i_2 \cdots i_s} &= \frac{1}{D} \prod_{k = 1}^s D_{i_k}. }$$
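The closed forms in Eq. (55) are simple to evaluate; in the sketch below (ours), the a_i values are not specified in the text, so the vector used is a common choice for this benchmark and is purely an assumption.

```python
import numpy as np

# Closed-form indices of Eq. (55). The a_i values below are an assumption
# (a common choice for this benchmark); the text does not list them.
a = np.array([1, 2, 5, 10, 20, 50, 100, 500], dtype=float)
Di = 1.0 / (3.0 * (1.0 + a)**2)
D = np.prod(Di + 1.0) - 1.0
S_main = Di / D
S_12 = Di[0] * Di[1] / D            # example two-way interaction index
print(np.round(S_main, 4), round(S_12, 4))
```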

For sensitivity analysis on this problem, a BHM model was built with 300 points. Only main and two-way interaction indices are presented. Figure 8 shows accurate computation of the main and second-order interaction Sobol indices.

Fig. 8. Comparison of analytical main and interaction sensitivity indices with Bayesian hybrid modeling for the Sobol problem. The first eight indices are main effects, and the rest are two-way interaction effects.

3.4. Correlated example: Linear function with equal structural contributions and correlated inputs

Consider the following example:

(56) $$Y = X_1 + X_2 + X_3 + X_4 + X_5. $$

Here the inputs are drawn from a multivariate normal distribution with a mean of 0.5 in each dimension and covariance matrix Σ given by

(57) $$\Sigma = \left[ \matrix{ 1.0 & 0.6 & 0.2 & 0.0 & 0.0 \cr 0.6 & 1.0 & 0.2 & 0.0 & 0.0 \cr 0.2 & 0.2 & 1.0 & 0.0 & 0.0 \cr 0.0 & 0.0 & 0.0 & 1.0 & 0.2 \cr 0.0 & 0.0 & 0.0 & 0.2 & 1.0 } \right].$$

To build the BHM models, 500 samples were randomly generated from the above multivariate normal distribution. If the inputs were uncorrelated, all main effect Sobol indices would equal 0.2, and BHM would reproduce this result, as we have shown consistently in the previous sections. However, the correlation structure hidden in the 500 training points leads BHM to produce the different Sobol indices shown in Table 1. Table 1 presents the structural, correlative, and total indices of Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) and their comparison with BHM assuming uncorrelated inputs. Interpretation with separated structural and correlative contributions is less ambiguous because, as expected, the structural contributions from all inputs are equal. In addition, the correlative contribution shows how strongly the inputs are correlated. This can be especially helpful if sensitivity analysis needs to be done in early design stages and is used to guide future tests. The correlative contribution also suggests which inputs need to be varied in groups. It is clear that when planning experiments, the inputs X_1 and X_2 should be sampled together, as they have equally high correlative contributions that should not be ignored. In contrast, the correlative contributions of X_4 and X_5 are equal and small, which means their correlation can be ignored. It is important to note that computing the structural and correlative contributions in the Li–Rabitz framework does not require a priori knowledge of the covariance structure.

Table 1. Differences between the structural, correlative, and total indices according to Li and Rabitz (Reference Li and Rabitz2012) and their comparison with BHM, assuming uncorrelated inputs

Note: BHM, Bayesian hybrid modeling.

Interpreting the Sobol indices from BHM is quite ambiguous: they show X_1 and X_4 to be much more sensitive, yet there is no physical reason that X_4 would be more sensitive, even knowing the correlation structure of the inputs beforehand. In addition, there is nothing to signal that our initial assumption of independent inputs is invalid. The joint Sobol indices computed by BHM are zero, as expected, and are not presented here. Even though the main effect indices from BHM sum to unity, that does not mean the sensitivity indices are accurate.
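The ambiguity can be reproduced directly from samples; the sketch below (ours) estimates the naive first-order index Var(E[Y|x_i])/Var(Y) by binned conditional means and shows how correlation inflates the apparent sensitivities of X_1, X_2, and X_3 away from the uncorrelated value of 0.2.

```python
import numpy as np

# Sketch: with correlated inputs, the naive first-order index
# Var(E[Y|x_i]) / Var(Y) of Y = X1 + ... + X5 is no longer 0.2 for every input.
rng = np.random.default_rng(4)
cov = np.array([[1.0, 0.6, 0.2, 0.0, 0.0],
                [0.6, 1.0, 0.2, 0.0, 0.0],
                [0.2, 0.2, 1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 1.0, 0.2],
                [0.0, 0.0, 0.0, 0.2, 1.0]])
X = rng.multivariate_normal(np.full(5, 0.5), cov, size=200_000)
Y = X.sum(axis=1)

def main_index(x, y, bins=50):
    # Binned estimate of Var(E[Y|x]) / Var(Y).
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, bins - 1)
    cond = np.array([y[idx == b].mean() for b in range(bins)])
    return cond.var() / y.var()

print([round(main_index(X[:, i], Y), 3) for i in range(5)])
# roughly 0.44 for X1 and X2, 0.26 for X3, and 0.20 for X4 and X5
```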

3.5. Correlated example: Linear function with distinct structural contributions and correlated inputs

Consider the following example:

(58) $$Y = 5X_1 + 4X_2 + 3X_3 + 2X_4 + X_5. $$

Here the inputs are drawn from a multivariate normal distribution with a mean of 0.5 in each dimension and covariance matrix Σ given by

(59) $$\Sigma = \left[ \matrix{ 1.0 & 0.6 & 0.2 & 0.0 & 0.0 \cr 0.6 & 1.0 & 0.2 & 0.0 & 0.0 \cr 0.2 & 0.2 & 1.0 & 0.0 & 0.0 \cr 0.0 & 0.0 & 0.0 & 1.0 & 0.2 \cr 0.0 & 0.0 & 0.0 & 0.2 & 1.0 } \right].$$

To build the BHM models, 500 samples were randomly generated from the above multivariate normal distribution. Table 2 presents the structural, correlative, and total indices of Li et al. (Reference Li, Rabitz, Yelvington, Oluwole, Bacon, Kolb and Schoendorf2010) and their comparison with BHM assuming uncorrelated inputs. Interpretation with separated structural and correlative contributions is again less ambiguous because, as expected, the structural contributions decrease as S_{X_1} > S_{X_2} > S_{X_3} > S_{X_4} > S_{X_5}. The Sobol indices predicted by BHM are inaccurate: they show X_2 to be more sensitive than X_1 when it should clearly be the other way around. The joint Sobol indices computed by BHM are zero, as expected, and are not presented here. Although not precise, the sensitivities computed by BHM correctly identify X_1 and X_2 as the most sensitive parameters and X_4 and X_5 as the least sensitive, with X_3 falling in the middle on a relative scale. In practical applications, this kind of relative ranking is valued highly because the critical dimensions can be narrowed very quickly.

Table 2. Differences between the structural, correlative, and total indices according to Li and Rabitz (Reference Li and Rabitz2012) and their comparison with BHM, assuming uncorrelated inputs

Note: BHM, Bayesian hybrid modeling.

3.6. Correlated example: Portfolio model

Munoz Zuniga et al. (Reference Munoz Zuniga, Kucherenko and Shah2013) presented the following example for sensitivity analysis with dependent inputs in four dimensions:

(60) $$Y = X_1 X_3 + X_2 X_4. $$

Here the inputs are drawn from a multivariate normal distribution N(μ, Σ) with μ = [0, 0, μ_3, μ_4]^T and

(61) $$\Sigma = \left[ \matrix{ \sigma_1^2 & \sigma_{12} & 0 & 0 \cr \sigma_{12} & \sigma_2^2 & 0 & 0 \cr 0 & 0 & \sigma_3^2 & \sigma_{34} \cr 0 & 0 & \sigma_{34} & \sigma_4^2 } \right].$$

The following analytical expressions for Sobol indices were provided by the authors:

(62) $$\eqalign{ S_1 &= \frac{\sigma_1^2 \Big(\mu_3 + \mu_4 \rho_{12} \frac{\sigma_2}{\sigma_1}\Big)^2}{D}, \cr S_1^T &= \frac{\sigma_1^2 (1 - \rho_{12}^2)(\sigma_3^2 + \mu_3^2)}{D}, \cr S_2 &= \frac{\sigma_2^2 \Big(\mu_4 + \mu_3 \rho_{12} \frac{\sigma_1}{\sigma_2}\Big)^2}{D}, \cr S_2^T &= \frac{\sigma_2^2 (1 - \rho_{12}^2)(\sigma_4^2 + \mu_4^2)}{D}, \cr S_3 &= 0, \cr S_3^T &= \frac{\sigma_1^2 \sigma_3^2 (1 - \rho_{34}^2)}{D}, \cr S_4 &= 0, \cr S_4^T &= \frac{\sigma_2^2 \sigma_4^2 (1 - \rho_{34}^2)}{D}, \cr \rho_{ij} &= \frac{\sigma_{ij}}{\sigma_i \sigma_j}, \cr D &= \sigma_1^2 (\sigma_3^2 + \mu_3^2) + \sigma_2^2 (\sigma_4^2 + \mu_4^2) + 2\sigma_{12} (\sigma_{34} + \mu_3 \mu_4). }$$

The authors used the following values, which we have also used:

$$\eqalign{ \mu_3 &= 250, \cr \mu_4 &= 400, \cr \sigma_1^2 &= 16, \cr \sigma_{12} &= 2.4, \cr \sigma_2^2 &= 4, \cr \sigma_3^2 &= 4 \times 10^4, \cr \sigma_4^2 &= 9 \times 10^4, \cr \sigma_{34} &= -1.8 \times 10^4. }$$
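A Monte Carlo cross-check of these closed forms is straightforward; in the sketch below (ours), the square on the conditional-mean term of S_1 follows from Var(E[Y|x_1]) for jointly Gaussian inputs, and the sample variance of Y is compared against D.

```python
import numpy as np

# Sketch checking the closed forms of Eq. (62) against Monte Carlo samples.
rng = np.random.default_rng(5)
mu = np.array([0.0, 0.0, 250.0, 400.0])
cov = np.array([[16.0, 2.4, 0.0, 0.0],
                [2.4, 4.0, 0.0, 0.0],
                [0.0, 0.0, 4.0e4, -1.8e4],
                [0.0, 0.0, -1.8e4, 9.0e4]])
X = rng.multivariate_normal(mu, cov, size=500_000)
Y = X[:, 0] * X[:, 2] + X[:, 1] * X[:, 3]

rho12 = 2.4 / (4.0 * 2.0)                      # sigma_12 / (sigma_1 sigma_2)
D = 16 * (4e4 + 250**2) + 4 * (9e4 + 400**2) + 2 * 2.4 * (-1.8e4 + 250 * 400)
S1 = 16 * (250 + 400 * rho12 * 2.0 / 4.0)**2 / D
print(round(S1, 3), round(Y.var() / D, 3))     # S1 ~ 0.507; ratio ~ 1 checks D
```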

To build the BHM models, 500 samples were randomly generated from the above multivariate normal distribution. Table 3 presents the analytical sensitivity indices of Munoz Zuniga et al. (Reference Munoz Zuniga, Kucherenko and Shah2013) and their comparison with BHM assuming uncorrelated inputs. Table 3 clearly shows that assuming the inputs to be independent can give erroneous sensitivity indices. Again, although the absolute values of the sensitivities are inaccurate, the relative ordering of the variables (as seen in the percentage contributions) is preserved. This relative ordering can be very useful in engineering design problems.

Table 3. Differences between the analytical sensitivity indices according to Munoz Zuniga et al. (Reference Munoz Zuniga, Kucherenko and Shah2013) and their comparison with BHM, assuming uncorrelated inputs

Note: BHM, Bayesian hybrid modeling.

3.7. Convergence of Sobol indices

One of the real advantages of computing probabilistic sensitivities with the Kennedy and O'Hagan framework is that it can estimate Sobol indices accurately with only a small number of data points. This is especially useful in high-dimensional complex problems where each pointwise evaluation of the underlying function is costly. In the literature, the only other method that can estimate Sobol indices analytically is the polynomial chaos expansion method. However, those methods require a large number of data points to achieve acceptable accuracy in the Sobol indices, which can be unrealistic for many problems. We use the Ishigami function to demonstrate the convergence of the main, interaction, and total Sobol indices as the number of training points increases. Figure 9 shows the convergence of the absolute error in the main, interaction, and total sensitivity indices predicted by BHM as the number of data points increases. The indices converge at 300 points. It is also worth noting that even the regression-based polynomial chaos expansion sensitivity indices for the Ishigami problem presented by Sudret (Reference Sudret2008) require more points to estimate the indices accurately, with a ninth-degree polynomial.

Fig. 9. Convergence of main, interaction, and total sensitivity indices predicted by Bayesian hybrid modeling as the number of data points increases.

3.8. Nonlinear thermal model

In this section, we present sensitivity analysis on a system-level thermal model of an aircraft engine component. This problem consisted of 100 calibration parameters and 20 outputs. The calibration parameters included heat transfer coefficients, temperatures, flow rates, and so on. The outputs were the temperatures at several locations in the component. Only 148 simulation runs were available from a nonlinear thermal finite element model. Details of the BHM model for this application can be found in Chennimalai Kumar et al. (Reference Chennimalai Kumar, Subramaniyan and Wang2012). Variance-based sensitivity analysis is especially helpful in such a high-dimensional problem, as we need to know which calibration parameters can be fixed without losing output variability. Below we present sensitivity analysis for the first output only, assuming that the calibration parameters are independent. Figures 10 and 11 show the first 25 main and total effects in descending order corresponding to the first output. Looking at these figures, it is clear that x35 is the most sensitive calibration parameter. The top five inputs according to the main and total effect Sobol indices differ in ranking, which means there is significant interaction between the calibration parameters. The ranking of the total Sobol indices can be used to discard the lower ranked inputs, as these indices account for both main and interaction effects. These main and total indices were computed within minutes by the Kennedy and O'Hagan framework, even for this 100-dimensional problem.

Fig. 10. First 25 main effect Sobol indices for the first output in a 100-dimension nonlinear thermal calibration problem.

Fig. 11. First 25 total effect Sobol indices for the first output in a 100-dimension nonlinear thermal calibration problem.

In this problem there are 4950 two-way interaction effects, and it would be computationally prohibitive to estimate all of them. We selected the inputs with high total effect (x35, x76, x5, x78, x39, x34, x77, x1, x20, x13) and computed two-way interactions only among those variables. Figure 12 shows the top 25 two-way interaction indices, and Figure 13 shows the percentage contribution of main and interaction effects to the total output variance of output 1. Clearly, the interaction between variables x76 and x78 dominates the two-way interactions. In the pie chart, blue is the main effect contribution, green is the contribution of the 45 two-way interaction effects among the top 10 most sensitive inputs, and orange is the contribution of the remaining terms, which include the other 4905 two-way interaction terms and other higher order effects.

Fig. 12. First 25 two-way interaction effect Sobol indices for the first output in a 100-dimension nonlinear thermal calibration problem.

Fig. 13. Pie chart showing percentage contribution of main and interaction effects to the total output variance of output 1.

Although separating the sensitivities into structural and correlative elements yields accurate sensitivity measures, it is seldom practical in real-world applications. This is mainly because the correlations of the input variables are generally unknown and the number of dimensions is large. Sobol indices computed through GEBHM and other metamodel approximations provide a viable solution for computing sensitivities, albeit with inaccuracies. In many cases, if the correlated inputs do not participate in a strong interaction effect function, the ranking from the Sobol indices is accurate. Only when the correlated inputs also participate in a strong interaction is the relative ranking put in question. Thus, we contend that for a wide range of practical problems, the proposed technique provides accurate engineering estimates of sensitivities.

4. CONCLUSIONS

In this work we present variance-based global sensitivity analysis methods, which are an important part of designing and exploring complex high-dimensional engineering systems. We present the Kennedy and O'Hagan framework and the computation of probabilistic sensitivities with Gaussian processes. The results in Figures 5–13 show that this method is extremely efficient and very accurate when the input variables are uncorrelated. However, due to the traditional HDMR formulation, the sensitivity indices can be inaccurate when the input variables are correlated. We present a literature survey of methods that address the issue of correlated variables and discuss the Li and Rabitz framework in detail. We explore these concepts on a number of simulated problems. Finally, we present an application of variance-based sensitivity analysis to a 100-dimensional nonlinear thermal model. It is shown that for many engineering applications, with and without correlated inputs, the variance-based global sensitivity method augmented with metamodels yields efficient and accurate results for high-dimensional nonlinear responses.

Ankur Srivastava is a Lead Engineer in the Probabilistics Lab at GE Global Research Center. Before joining GE, he completed his postdoctoral work in the Aerospace Engineering Department at MIT and obtained his PhD in mechanical engineering at Rice University, Houston. He joined GE Global Research Center, Niskayuna, in June 2012. He manages multiple projects involving development of probabilistic methods and their application to reliability and robust design of complex systems. Dr. Srivastava has published a number of journal papers and conference proceedings in this area and has filed three patents.

Arun K. Subramaniyan leads the Data Science & Analytics Team at GE Oil & Gas (O&G) Digital. He received a PhD in aerospace engineering from Purdue University. Arun joined O&G Digital after working at GE Global Research Center in Niskayuna, NY, where he led the development of the Analytics Workbench, which is the foundation for the Digital Twin framework. This framework has enabled thousands of engineers at GE to build advanced models efficiently. The asset specific cumulative damage modeling techniques his team pioneered have saved millions of dollars for several businesses. Dr. Subramaniyan developed advanced techniques and tools for efficiently modeling large scale systems like jet engines and accelerated design times by three to four times. He is a prolific researcher with over 40 international publications that have been cited more than 300 times. Arun received the Hull Award from GE, which honors early career technologists for their outstanding technical impact.

Liping Wang is currently the Technical Operation Leader for the Probabilistics & Structural Optimization Group at GE Global Research Center. She obtained her PhD in mechanical engineering from Beijing University of Aeronautics and Astronautics. She joined GE Global Research Center, Niskayuna, in January 1997. Since 2007, she has led the team in developing state-of-the-art probabilistic methods and tools to provide for world-class NPI and NTI and Services capabilities throughout the company. She and her team have been driving numerous engineering applications across multiple GE businesses. Dr. Wang has published more than 60 articles in archival journals and peer-reviewed conference proceedings and 45 internal reports and has filed six patents.

References

REFERENCES

Caniou, Y., & Sudret, B. (2010). Distribution-based global sensitivity analysis using polynomial chaos expansion. Proc. 6th Int. Conf. Sensitivity Analysis of Model Output, pp. 7625–7626, Milan, Italy, July 19–22.
Chastaing, G., Gamboa, F., & Prieur, C. (2014). Generalized Sobol sensitivity indices for dependent variables: numerical methods. Journal of Statistical Computation and Simulation 85(7), 1306–1333.
Chennimalai Kumar, N., Subramaniyan, A.K., & Wang, L. (2012). Improving high-dimensional physics models through Bayesian calibration with uncertain data. Proc. ASME Turbo Expo 2012, Copenhagen, June 11–15.
Chennimalai Kumar, N., Subramaniyan, A.K., & Wang, L. (2013). Calibrating transient models with multiple responses using Bayesian inverse techniques. Proc. ASME Turbo Expo 2013, San Antonio, TX, June 5–10.
Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer.
Haylock, R.G., & O'Hagan, A. (1996). On inference for outputs of computationally expensive algorithms with uncertainty on the inputs. In Bayesian Statistics (Bernardo, J.M., Berger, J.O., Dawid, A.P., & Smith, A.F.M., Eds.), Vol. 5, pp. 629–637. Oxford: Oxford University Press.
Iooss, B., & Lemaître, P. (2015). A review on global sensitivity analysis methods. In Uncertainty Management in Simulation-Optimization of Complex Systems: Algorithms and Applications (Meloni, C., & Dellino, G., Eds.). New York: Springer.
Kennedy, M.C., & O'Hagan, A. (2001). Bayesian calibration of computer models (with discussion). Journal of the Royal Statistical Society: Series B 63(3), 425–464.
Li, G., & Rabitz, H. (2012). General formulation of HDMR component functions with independent and correlated variables. Journal of Mathematical Chemistry 50(1), 99–130.
Li, G., Rabitz, H., Yelvington, P.E., Oluwole, O.O., Bacon, F., Kolb, C.E., & Schoendorf, J. (2010). Global sensitivity analysis for systems with independent and/or correlated inputs. Journal of Physical Chemistry: Part A 114, 6022–6032.
Munoz Zuniga, M., Kucherenko, S., & Shah, N. (2013). Metamodelling with independent and dependent inputs. Computer Physics Communications 184, 1570–1580.
Oakley, J.E., & O'Hagan, A. (2004). Probabilistic sensitivity analysis of complex models: a Bayesian approach. Journal of the Royal Statistical Society: Series B 66(3), 751–769.
Reuter, U., Mehmood, Z., Gebhardt, C., Liebscher, M., Müllerschön, H., & Lepenies, I. (2011). Using LS-OPT for meta-model based global sensitivity analysis. Proc. 8th European LS-Dyna Conf., Session 4, Strasbourg, France.
Saltelli, A., & Tarantola, S. (2002). On the relative importance of input factors in mathematical models. Journal of the American Statistical Association 97(459).
Saltelli, A., Tarantola, S., & Campolongo, F. (2000). Sensitivity analysis as an ingredient of modeling. Statistical Science 15(4), 377–395.
Subramaniyan, A.K., Kumar, N., Wang, L., Beeson, D., & Wiggs, G. (2012). Enhancing high-dimensional physics models for accurate predictions with Bayesian calibration. Proc. 2012 Propulsion-Safety and Affordable Readiness Conf., Jacksonville, FL, March 20–22.
Sudret, B. (2008). Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering and System Safety 93(7), 964–979.
Wang, L., Fang, X., Subramaniyan, A., Jothiprasad, G., Gardner, M., Kale, A., Akkaram, S., Beeson, D., Wiggs, G., & Nelson, J. (2011). Challenges in uncertainty, calibration, validation and predictability of engineering analysis models. Proc. ASME Turbo Expo 2011: Power for Land, Sea and Air, Paper No. GT2011-46554, Vancouver, June 6–10.
Xu, C., & Gertner, G.Z. (2008). Uncertainty and sensitivity analysis for models with correlated parameters. Reliability Engineering and System Safety 93, 1563–1573.