# «EMPIRICAL GROUND-MOTION MODELS FOR PROBABILISTIC SEISMIC HAZARD ANALYSIS: A GRAPHICAL MODEL PERSPECTIVE Kumulative Dissertation zur Erlangung des ...»

The BN is particularly well suited to investigating the set of possible predictor variables X. Learning the structure of the BN means learning conditional independences; hence, with complete information, PGA is influenced only by the variables that are directly connected to it. We find that only five parameters (magnitude, Joyner-Boore distance, azimuth, style of faulting and the depth to a shear-wave horizon of 2.5 km/s, $Z_{2.5}$) are directly connected to PGA in the final model, where the connection between style of faulting and PGA is included as expert knowledge. In particular, the effect of $V_{S30}$ on PGA is mediated by $Z_{2.5}$. Hence, once $Z_{2.5}$ is known, $V_{S30}$ does not provide any further information for the prediction of PGA. Thus, it is sufficient to know the five parameters for the estimation of $p(\mathrm{PGA} \mid X)$. This may have implications for future studies, as these five variables should be included in new GMMs. However, this requires that information about them is present in strong motion datasets, which is currently not the case for all datasets.
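The mediation statement can be written compactly as a conditional independence. In the expression below, $M$, $R_{JB}$, $\mathrm{Az}$ and $\mathrm{SoF}$ are shorthand introduced here (not necessarily the thesis notation) for magnitude, Joyner-Boore distance, azimuth and style of faulting, $Z_{2.5}$ is the depth to the 2.5 km/s shear-wave horizon, and $V_{S30}$ is the shear-wave velocity parameter. It assumes that $V_{S30}$ is not a descendant of PGA in the learned network:

```latex
p(\mathrm{PGA} \mid M, R_{JB}, \mathrm{Az}, \mathrm{SoF}, Z_{2.5}, V_{S30})
  = p(\mathrm{PGA} \mid M, R_{JB}, \mathrm{Az}, \mathrm{SoF}, Z_{2.5})
```

In words, conditioning additionally on $V_{S30}$ leaves the predictive distribution of PGA unchanged once the five directly connected variables are known.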

Both the regression model and the BN provide valuable insight into the functional form f(X) and/or the set of predictors X of a GMM. However, both analyses also reveal that the underlying dataset has limitations, and that a completely data-driven (assumption-free) approach is unwarranted. Furthermore, there is often good reason to make assumptions – they may be based on firm beliefs about the physics of the process. Assumptions can be made on specific forms of scaling of ground motions with the predictor variables (i.e. on the functional form of f(X)), but also on the parameters (e.g. to ensure monotonic scaling with magnitude).

Here is where the Bayesian approach to inference comes into play. Bayesian inference makes it possible to quantify prior beliefs, which are subsequently updated using data (cf. section 4.2).

Thus, even though assumptions are made/quantiﬁed, these can be ‘overridden’ by data.
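How data can override a quantified prior belief is perhaps easiest to see in a conjugate toy problem: inferring a Gaussian mean with known noise standard deviation. The sketch below is not one of the GMMs of this work; the function name `normal_posterior` and all numerical values are assumptions chosen purely for illustration.

```python
import numpy as np


def normal_posterior(prior_mean, prior_sd, data, noise_sd):
    """Posterior mean and sd of a Gaussian mean (known noise sd, Gaussian prior)."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / noise_sd**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * np.mean(data))
    return post_mean, np.sqrt(post_var)


rng = np.random.default_rng(0)
true_value = 2.0                                   # value that generated the data
prior_mean, prior_sd, noise_sd = 0.0, 0.5, 1.0     # deliberately "wrong" prior belief
observations = rng.normal(true_value, noise_sd, size=1000)

results = {}
for n in (1, 10, 1000):
    results[n] = normal_posterior(prior_mean, prior_sd, observations[:n], noise_sd)
    print(n, results[n])
```

With a single observation the posterior stays close to the prior; as the number of observations grows, the posterior mean moves towards the value supported by the data and the posterior standard deviation shrinks.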

Two Bayesian GMMs were developed in this work. Prior beliefs were not quantiﬁed on the parameters of the GMMs themselves, but on physical parameters such as stress drop and QH.

Using stochastic simulations, a synthetic dataset was created, on which prior distributions on the parameters were estimated by regression. We find that this is a good way to incorporate physical knowledge into a GMM.
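One possible reading of this procedure is sketched below: a physical parameter is drawn from its prior, synthetic ground motions are generated, and the regression coefficients fitted to each synthetic dataset are summarized as a Gaussian prior. Everything in the sketch is hypothetical. `toy_simulator` is a made-up stand-in for a stochastic simulation, and the functional form, constants and the lognormal stress-drop prior are assumptions, not those used in this work.

```python
import numpy as np

rng = np.random.default_rng(1)


def toy_simulator(mag, dist, stress_drop, rng):
    """Hypothetical stand-in for a stochastic simulation; returns ln(PGA)."""
    mean = (-1.0 + 1.2 * mag - 1.1 * np.log(dist + 5.0)
            + 0.8 * np.log(stress_drop / 100.0))
    return mean + rng.normal(0.0, 0.3, size=mag.shape)


def fit_gmm(mag, dist, ln_y):
    """Least-squares fit of ln Y = c0 + c1*M + c2*ln(R + 5)."""
    design = np.column_stack([np.ones_like(mag), mag, np.log(dist + 5.0)])
    coeffs, *_ = np.linalg.lstsq(design, ln_y, rcond=None)
    return coeffs


coeff_samples = []
for _ in range(200):
    # Prior belief on the physical parameter (assumed lognormal around 100 bar).
    stress_drop = rng.lognormal(mean=np.log(100.0), sigma=0.5)
    mag = rng.uniform(4.5, 7.5, size=300)
    dist = rng.uniform(1.0, 200.0, size=300)
    ln_y = toy_simulator(mag, dist, stress_drop, rng)
    coeff_samples.append(fit_gmm(mag, dist, ln_y))

coeff_samples = np.array(coeff_samples)
prior_mean = coeff_samples.mean(axis=0)             # prior mean of (c0, c1, c2)
prior_cov = np.cov(coeff_samples, rowvar=False)     # prior covariance of (c0, c1, c2)
print(prior_mean)
print(np.sqrt(np.diag(prior_cov)))
```

In this toy setting the uncertainty in stress drop maps mainly onto the constant term, so the resulting prior is wide for `c0` and narrow for the scaling coefficients.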

In the Bayesian GMMs, we have specified the functional form based on physical considerations, since we have found that some constraints on the scaling must be imposed. However, as we have elaborated in chapter 2, the capability of a model to generalize should be an important factor.

Therefore, we use generalization error based on cross-validation to choose between different forms of scaling – e.g., different forms of the magnitude scaling (quadratic, linear, tri-linear) can be found in the literature.
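The selection step can be sketched as k-fold cross-validation over candidate magnitude-scaling terms. The example below compares only a linear and a quadratic form on synthetic data; the helper names `design_matrix` and `cv_error`, the data-generating coefficients and the use of ordinary least squares (rather than the Bayesian fit of this work) are assumptions made to keep the sketch short.

```python
import numpy as np


def design_matrix(mag, ln_dist, form):
    """Columns: constant, M, ln R, and optionally M^2."""
    cols = [np.ones_like(mag), mag, ln_dist]
    if form == "quadratic":
        cols.append(mag**2)
    return np.column_stack(cols)


def cv_error(mag, ln_dist, ln_y, form, n_folds=5, seed=0):
    """Mean squared prediction error estimated by k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(mag)), n_folds)
    errors = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        x_train = design_matrix(mag[train], ln_dist[train], form)
        x_test = design_matrix(mag[test], ln_dist[test], form)
        coeffs, *_ = np.linalg.lstsq(x_train, ln_y[train], rcond=None)
        errors.append(np.mean((ln_y[test] - x_test @ coeffs) ** 2))
    return float(np.mean(errors))


# Synthetic data with a (made-up) quadratic magnitude term.
rng = np.random.default_rng(42)
mag = rng.uniform(4.0, 8.0, 500)
ln_dist = np.log(rng.uniform(1.0, 200.0, 500))
ln_y = (1.0 + 1.0 * mag - 0.3 * (mag - 6.0) ** 2 - 1.0 * ln_dist
        + rng.normal(0.0, 0.5, 500))

cv_linear = cv_error(mag, ln_dist, ln_y, "linear")
cv_quadratic = cv_error(mag, ln_dist, ln_y, "quadratic")
print("linear:", cv_linear, "quadratic:", cv_quadratic)
```

The form with the lower cross-validated error is preferred; because the error is evaluated on held-out records, a more flexible form is only selected if it actually predicts better.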

We find that the Bayesian approach works well for estimating the parameters of a GMM. The models we learned fit the data well, but also include (via the prior) our prior beliefs. Uncertainty on the parameters is quantified in terms of a probability distribution (the posterior $p(\theta \mid D)$). This is useful since it simplifies a full probabilistic treatment, which is desirable for PSHA.

In this work, we have developed two Bayesian GMMs. Both are similar in that they use the same base model, but they investigate different aspects which are important in the context of GMMs and PSHA. The first model directly estimates the correlation between different ground motion intensity parameters during learning. This is important, since neglecting this correlation may distort the results of a PSHA. Here, we find no large difference in the posterior distribution of the parameters estimated with and without correlation, but this might not hold in general. Hence, learning a GMM and ignoring correlation between ground motion intensity parameters can lead to distortions.

We ﬁnd that there is strong correlation between PGA, PGV and the response spectrum at three periods. The strength of the correlation depends on the period difference. There is both considerable between-event and within-event correlation, with the former being larger.

The second model takes into account potential regional differences in ground motion scaling. Here, the results are somewhat inconclusive, as we do not find the “smoking gun” evidencing the existence or absence of regional differences. There probably is no such thing as a smoking gun, as regional dependence of ground motion scaling is to all appearances not binary – it is not a GMM per se that is regionally dependent, but aspects of a model. For example, we observe regional differences for the large distance scaling, while they appear to be negligible for the other aspects of the model. Similarly, Chiou et al. (2010) find that small magnitude scaling is different between southern and central California.

Even if we do not ﬁnd evidence for regional differences for most aspects of ground motion scaling, that does not mean that we can rule them out, since the results are associated with large uncertainties. However, the Bayesian approach to inference is particularly apt to deal with that problem – we do not need to specify a model one way or the other, but can allow for the possibility of regional differences. Initially, all parameters can vary regionally, with large uncertainties associated with the degree of the regional differences. As new data becomes available, the uncertainties decrease, and regional variability in the model persists only for those parameters for which true regional differences exist.

Most of the models presented in this thesis are developed as probabilistic graphical models, with the BN being a special kind of these models. We find that graphical models are a convenient tool for reasoning under uncertainty. Their graphical structure provides easy and intuitive insight into the model, the data-generating structure and the dependencies between variables. It also makes it easy to enhance them with additional complexities. Furthermore, due to their ability to encode conditional independencies between parameters, they are an ideal instrument for setting up and analysing probabilistic models.

**Figure 7.2: Graphical model of the naive Bayes classifier of chapter 6.**

The flexibility of graphical models is demonstrated by the two Bayesian GMMs of chapters 4 and 5, which enhance the same base model to account for different complexities. The BN of chapter 3 is again a different kind of graphical model. Due to the factorization properties of graphical models, they are easy to enhance with additional nodes. For example, one can add nodes describing the magnitude distribution (e.g. the a- and b-value and the maximum magnitude of a Gutenberg-Richter distribution, together with their associated uncertainty) without changing the other local probability distributions. That way, one can arrive at a graphical model that describes all steps and uncertainties of a PSHA. This is conceptually shown in Figure 7.1. Here, the magnitude distribution depends on the values of the b- and Mmax-nodes, to each of which uncertainty can be assigned. The distribution of the ground motion parameter Y depends on the Mod-node, which describes which GMM should be used to calculate $p(Y \mid M, R)$. The conditional distribution of the ground motion parameter of interest can then be obtained using sampling or by directly employing the fast inference algorithms of BNs. The conceptual model shown in Figure 7.1 can be enhanced in different ways to include more uncertainties, submodels or variables.
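The sampling route can be sketched as ancestral sampling through such a graphical model: parent nodes (b-value, Mmax, Mod) are drawn first, then magnitude and distance, and finally the ground motion parameter Y. The sketch below is a loose illustration of the idea behind Figure 7.1, not a reproduction of it. The two GMMs `gmm_a` and `gmm_b`, their coefficients and weights, the uniform distance distribution and all prior ranges are hypothetical.

```python
import numpy as np


def sample_magnitude(b, m_max, m_min, rng):
    """Inverse-CDF sampling from a truncated Gutenberg-Richter distribution."""
    beta = b * np.log(10.0)
    u = rng.uniform(size=np.shape(b))
    norm = 1.0 - np.exp(-beta * (m_max - m_min))
    return m_min - np.log(1.0 - u * norm) / beta


def gmm_a(mag, dist):
    """Hypothetical GMM: returns mean and sd of ln Y."""
    return -3.5 + 0.9 * mag - 1.0 * np.log(dist + 10.0), 0.6


def gmm_b(mag, dist):
    """Second hypothetical GMM with different scaling and variability."""
    return -3.0 + 0.8 * mag - 1.1 * np.log(dist + 10.0), 0.7


def sample_ground_motion(n, rng, m_min=5.0):
    b = rng.normal(1.0, 0.1, size=n)              # uncertain b-value
    m_max = rng.uniform(7.0, 8.0, size=n)         # uncertain maximum magnitude
    mag = sample_magnitude(b, m_max, m_min, rng)  # M depends on b and Mmax
    dist = rng.uniform(5.0, 100.0, size=n)        # assumed distance distribution
    mod = rng.choice(2, size=n, p=[0.6, 0.4])     # Mod-node: which GMM to use
    mean_a, sd_a = gmm_a(mag, dist)
    mean_b, sd_b = gmm_b(mag, dist)
    mean = np.where(mod == 0, mean_a, mean_b)
    sd = np.where(mod == 0, sd_a, sd_b)
    ln_y = rng.normal(mean, sd)                   # Y depends on M, R and Mod
    return mag, dist, mod, ln_y


rng = np.random.default_rng(7)
mag, dist, mod, ln_y = sample_ground_motion(100_000, rng)
# Monte Carlo estimate of P(Y > 0.1), given that an event occurs.
p_exceed = float(np.mean(ln_y > np.log(0.1)))
print(p_exceed)
```

Because each node is sampled only from its local distribution given its parents, adding a further node (for example a site term) changes only the functions that depend on it.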

The model developed in chapter 6, a naive Bayes classifier connecting seismic intensities I, PGA and PGV, can also be represented by a graphical model, even though this is not explicitly mentioned in chapter 6. The graphical model which is equivalent to the naive Bayes classifier is shown in Figure 7.2. Here, the joint probability of I, PGA and PGV is encoded as

$$p(I, \mathrm{PGA}, \mathrm{PGV}) = p(I)\, p(\mathrm{PGA} \mid I)\, p(\mathrm{PGV} \mid I), \tag{7.1}$$

where all factors are learned from data. From eq. (7.1) it is straightforward to compute $p(I \mid \mathrm{PGA}, \mathrm{PGV})$, which can be used in the generation of ShakeMaps (Wald et al., 1999a) or for the selection of GMMs in regions where data are sparse (Scherbaum et al., 2009; Delavaud et al., 2009). We find that the naive Bayes classifier performs better than commonly employed linear regression models, where performance is assessed via the 0-1 loss and generalization error (cf. section 6.3).
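The step from the factorization in eq. (7.1) to the posterior over intensity can be sketched as follows. Gaussian class-conditional distributions for ln PGA and ln PGV are an assumption made here for simplicity (chapter 6 may use a different form), and the training data are synthetic with made-up coefficients; `fit_naive_bayes` and `posterior` are hypothetical helper names.

```python
import numpy as np


def fit_naive_bayes(intensity, features):
    """Learn p(I) and per-class Gaussian parameters for each feature."""
    classes = np.unique(intensity)
    priors = np.array([np.mean(intensity == c) for c in classes])
    means = np.array([features[intensity == c].mean(axis=0) for c in classes])
    sds = np.array([features[intensity == c].std(axis=0, ddof=1) for c in classes])
    return classes, priors, means, sds


def posterior(x, priors, means, sds):
    """p(I | x) proportional to p(I) * prod_j N(x_j | mean_Ij, sd_Ij)."""
    log_lik = -0.5 * np.sum(((x - means) / sds) ** 2
                            + np.log(2.0 * np.pi * sds**2), axis=1)
    log_post = np.log(priors) + log_lik
    log_post -= log_post.max()          # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


# Synthetic training data: intensities 4..7 with made-up ground-motion scaling.
rng = np.random.default_rng(3)
intensity = rng.integers(4, 8, size=2000)
ln_pga = -6.0 + 0.7 * intensity + rng.normal(0.0, 0.4, size=2000)
ln_pgv = -3.0 + 0.8 * intensity + rng.normal(0.0, 0.4, size=2000)
features = np.column_stack([ln_pga, ln_pgv])

classes, priors, means, sds = fit_naive_bayes(intensity, features)
x_new = np.array([-6.0 + 0.7 * 6, -3.0 + 0.8 * 6])   # record typical of I = 6
post = posterior(x_new, priors, means, sds)
print(dict(zip(classes.tolist(), np.round(post, 3).tolist())))
```

The output is a full probability distribution over the discrete intensity classes rather than a single predicted value, which is what makes the classifier convenient for a probabilistic treatment.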

The naive Bayes classifier also has a conceptual advantage over regression models – it treats seismic intensity as a discrete rather than a continuous variable. This leads to a better representation of uncertainty, quantified by $p(I \mid \mathrm{PGA}, \mathrm{PGV})$. Thus, the naive Bayes classifier lends itself to a fully probabilistic treatment of the conversion between instrumental ground motion parameters and seismic intensities.

## Bibliography

Abrahamson, N. A. (2000). State of the practice of seismic hazard evaluation, GeoEng 2000, Melbourne, Australia, 2000.

Abrahamson, N. A. and J. J. Bommer (2005). Probability and Uncertainty in Seismic Hazard Analysis, Earthquake Spectra 21, 603-607.

Abrahamson, N. A. and R. R. Youngs (1992). A Stable Algorithm for Regression Analyses Using the Random Effects Model, Bull. Seism. Soc. Am. 82, 505-510.

Abrahamson, N. A. and W. J. Silva (1997). Empirical response spectral attenuation relations for shallow crustal earthquakes, Seism. Res. Let. 68, 9-23.

Abrahamson, N. A. and W. J. Silva (2008). Summary of the Abrahamson & Silva NGA Ground-Motion Relations, Earthquake Spectra 24, 67-97.

Ahmad, I., M. H. El Naggar and A. M. Khan (2008). Neural Network Based Attenuation of Strong Motion Peaks in Europe, J. Earthq. Eng. 12, 663-680.

Akaike, H. (1974). A new look at the statistical model identiﬁcation, IEEE Transactions on Automatic Control AC19, 716-723.

Akkar, S. and J. J. Bommer (2010). Empirical Equations for the Prediction of PGA, PGV, and Spectral Accelerations in Europe, the Mediterranean Region, and the Middle East, Seism. Res. Let. 81, 195-206.

Akkar, S. and J. J. Bommer (2007a). Empirical Prediction Equations for Peak Ground Velocity Derived from Strong-Motion Records from Europe and the Middle East, Bull. Seism. Soc. Am. 97, 511-530.

Akkar, S. and J. J. Bommer (2007b). Prediction of elastic displacement response spectra in Europe and the Middle East, Earthq. Engng. Struct. Dyn. 36, 1275-1301.

Al Atik, L., N. Abrahamson, F. Cotton, F. Scherbaum, J. Bommer, and N. Kuehn (2010). The Variability of Ground-Motion Prediction Models and its Components, submitted to Seism. Res. Let.

Allen, T. I., D. J. Wald, P. S. Earle, K. D. Marano, A. J. Hotovec, K. Lin and M. Hearne (2009). An Atlas of ShakeMaps and population exposure catalog for earthquake loss modeling, Bull. Earthq. Eng. 7(3), 701-718, doi:10.1007/s10518-009-9120-y.

Allen, T. I. and D. J. Wald (2009). Evaluation of Ground-Motion Modeling Techniques for Use in Global ShakeMap - A Critique of Instrumental Ground-Motion Prediction Equations, Peak Ground Motion to Macroseismic Intensity Conversions, and Macroseismic Intensity Predictions in Different Tectonic Settings, U.S. Geological Survey Open-File Report 2009-1047, 114 p.

Allison, P. D. (2002). Missing Data, Sage Publications.

Allmann, B. P. and P. M. Shearer (2009). Global variations of stress drop for moderate to large earthquakes, J. Geophys. Res. 114, B01310, doi:10.1029/2008JB005821.

Ambraseys, N. N., J. Douglas, S. K. Sarma and P. M. Smit (2005). Equations for the Estimation of Strong Ground Motions from Shallow Crustal Earthquakes Using Data from Europe and the Middle East: Horizontal Peak Ground Acceleration and Spectral Acceleration, Bull. Earthq. Eng. 3, 1-53.

Anderson, J. G. (2000). Expected Shape of Regressions for Ground-Motion Parameters on Rock, Bull. Seism. Soc. Am. 90, S43-S52.

Anderson, J. G. and J. N. Brune (1999). Probabilistic seismic hazard assessment without the ergodic assumption, Seism. Res. Let. 70, 19-28.

Anderson, J. G. and Y. Lei (1994). Nonparametric Description of Peak Acceleration as a Function of Magnitude, Distance, and Site in Guerrero, Mexico, Bull. Seism. Soc. Am. 84, 1003-1017.

Arroyo, D. and M. Ordaz (2010a). Multivariate Bayesian Regression Analysis Applied to Ground-Motion Prediction Equations, Part 1: Theory and Synthetic Example, Bull. Seism. Soc. Am. 100, 1551-1567.

Arroyo, D. and M. Ordaz (2010b). Multivariate Bayesian Regression Analysis Applied to Ground-Motion Prediction Equations, Part 2: Numerical Example with Actual Data, Bull. Seism. Soc. Am. 100, 1568-1577.

Atkinson, G. M. (2008). Ground-Motion Prediction Equations for Eastern North America from a Referenced Empirical Approach: Implications for Epistemic Uncertainty, Bull. Seism. Soc. Am. 98, 1304-1318.

Atkinson, G. M. and W. J. Silva (2000). Stochastic Modeling of California Ground Motions, Bull. Seism. Soc. Am. 90, 255-274.

Atkinson, G. M. and E. Sonley (2000). Empirical relationships between modified Mercalli intensity and response spectra, Bull. Seism. Soc. Am. 90, 537-544.

Atkinson, G. M. and D. M. Boore (2006). Earthquake Ground-Motion Prediction Equations for Eastern North America, Bull. Seism. Soc. Am. 96, 2181-2205.

Atkinson, G. M. and S. I. Kaka (2007). Relationships between Felt Intensity and Instrumental Ground Motion in the Central United States and California, Bull. Seism. Soc. Am. 97, 497

Baker, J. W. (2007). Correlation of ground motion intensity parameters used for predicting structural and geotechnical response, Proceedings of the 10th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP10).

Baker, J. W. and C. A. Cornell (2006). Correlation of response spectral values for multicomponent ground motions, Bull. Seism. Soc. Am. 96, 215-227.

Baker, J. W. and N. Jayaram (2008). Correlation of Spectral Acceleration Values from NGA Ground Motion Models, Earthquake Spectra 24, 299-317.