FREE ELECTRONIC LIBRARY - Thesis, documentation, books


The generalization errors for the different classifiers and regressions are displayed in Table 6.2. The naive Bayes classifiers perform consistently better than regression, as shown by a lower generalization error. Table 6.2 also shows that PGV and PGA are predictors of similar quality when using the naive Bayes classifier, while for the regression PGV performs better (similar to Boatwright et al. (2001)). A combined predictor of PGV and PGA does not improve the predictive performance of either naive Bayes or regression. This suggests that the information content of PGA and PGV with respect to seismic intensity is similar. In the case of the naive Bayes classifiers, there is also no difference between the ones with a common standard deviation and those with a different one for each intensity class, which indicates that both classifiers generalize equally well to unseen data. However, we believe that it is preferable to use a common standard deviation, since for some intensity classes there are only a few data points, which might render the estimation of the standard deviations unstable.

In Figure 6.1, we show predictions of I given PGV. Here, for each PGV value the full conditional distribution Pr(I | PGV) is shown, color-coded from light (low Pr(I | PGV)) to dark (high Pr(I | PGV)) colors. For each value of PGV on the x-axis, the corresponding color-coded values Pr(I | PGV) along the vertical (I) axis sum up to unity. For comparison, we also plot the data points as well as the geometric means of the PGV values for each intensity class. The latter are used for the regression of the model of Faenza and Michelini (2010), which is also shown in Figure 6.1. As one can see, the most likely I predicted by the naive Bayes classifier (i.e. the intensity class with the highest Pr(I | PGV), corresponding to the darkest color for each intensity class) correlates reasonably well with the model of Faenza and Michelini (2010).

Figure 6.1 also shows the large scatter in the data, both for a given PGV value and for a given intensity value. This is very well represented by the naive Bayes classifier, which returns a relatively broad distribution Pr(I | PGV). The large scatter in intensity values for a particular PGV value indicates that it is important to treat I probabilistically, i.e. use the full distribution. This is facilitated by a naive Bayes classifier.

6.4 Discussion and Conclusions

We have presented naive Bayes classification to predict intensities from ground motion intensity parameters (PGA and PGV) as an alternative to traditional regression models. A naive Bayes classifier predicts the distribution of a discrete variable given some predictor variables using Bayes’ rule, making the naive assumption that the predictor variables are conditionally independent given the target. This assumption greatly reduces the number of parameters to learn and is, albeit not realistic from a physical perspective, often sufficient for prediction. In our case, the assumption of conditional independence only applies if we use both PGA and PGV as predictors (and it applies to regression as well). From a purely physical perspective, this assumption is not justified, since there is correlation between PGA and PGV, but analysis of the generalization error (see Table 6.2) shows that the naive Bayes classifier nevertheless outperforms regression when it comes to prediction of seismic intensities from PGA and PGV. The naive Bayes classifier, however, is not suitable as a physical model for the data generating process.
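For orientation, the conditional independence assumption described above can be written out as follows. This is a generic statement of the naive Bayes posterior in notation chosen here for illustration (p denotes a class-conditional density); it is not necessarily the notation of the equations referenced elsewhere in the text.

```latex
\Pr(I = k \mid \mathrm{PGA}, \mathrm{PGV})
  = \frac{\Pr(I = k)\; p(\mathrm{PGA} \mid I = k)\; p(\mathrm{PGV} \mid I = k)}
         {\sum_{j} \Pr(I = j)\; p(\mathrm{PGA} \mid I = j)\; p(\mathrm{PGV} \mid I = j)}
```

The product in the numerator is where the naive assumption enters: each predictor only needs its own one-dimensional class-conditional distribution, rather than a joint distribution of PGA and PGV per intensity class.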

We have built a naive Bayes classifier to estimate Pr(I = k | PGV, PGA), making the assumption that the conditional distribution of PGA and PGV, respectively, given an intensity class, is log-normal. The analysis of the generalization error, estimated via leave-one-out cross-validation, shows that the naive Bayes classifier performs better than regression when it comes to predicting unseen data. The generalization error also shows that PGV and PGA individually can both predict I similarly well, while the joint use of them does not lead to an improvement in prediction. Incidentally, we believe that this is due to the high correlation between PGA and PGV, which means that one can be used as a surrogate for the other.
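The following is a minimal sketch of such a classifier for a single predictor (PGV), with log-normal class-conditional distributions, a common standard deviation, and a leave-one-out estimate of the generalization error. The toy records are invented for illustration, and the 0-1 loss on the most probable class is an assumption; the loss actually used for Table 6.2 may differ.

```python
import math
from collections import defaultdict

def fit(records):
    """Fit priors, class means of log(PGV), and a pooled standard deviation.

    records: list of (intensity_class, pgv) pairs.
    """
    logs = defaultdict(list)
    for k, pgv in records:
        logs[k].append(math.log(pgv))
    n = len(records)
    priors = {k: len(v) / n for k, v in logs.items()}
    means = {k: sum(v) / len(v) for k, v in logs.items()}
    # Common (pooled) standard deviation over all intensity classes.
    ss = sum((x - means[k]) ** 2 for k, v in logs.items() for x in v)
    dof = max(n - len(logs), 1)
    sd = max(math.sqrt(ss / dof), 1e-6)
    return priors, means, sd

def posterior(model, pgv):
    """Return Pr(I = k | PGV) for every class k seen during fitting."""
    priors, means, sd = model
    x = math.log(pgv)
    # With a common sd, the log-normal normalizing factor is identical for
    # all classes and cancels when the posterior is normalized.
    logw = {k: math.log(priors[k]) - 0.5 * ((x - means[k]) / sd) ** 2
            for k in priors}
    m = max(logw.values())
    w = {k: math.exp(v - m) for k, v in logw.items()}
    z = sum(w.values())
    return {k: v / z for k, v in w.items()}

def loo_error(records):
    """Leave-one-out misclassification rate (assumed 0-1 loss)."""
    errors = 0
    for i, (k, pgv) in enumerate(records):
        model = fit(records[:i] + records[i + 1:])
        post = posterior(model, pgv)
        errors += max(post, key=post.get) != k
    return errors / len(records)

# Hypothetical (intensity, PGV) records, not taken from the real dataset.
data = [(4, 0.5), (4, 0.8), (4, 0.6),
        (5, 2.0), (5, 3.0), (5, 2.5),
        (6, 9.0), (6, 12.0), (6, 10.0)]

print(posterior(fit(data), 2.4))
print(loo_error(data))
```

Adding PGA as a second predictor would only add a second squared term to the unnormalized log posterior, which is what the conditional independence assumption buys.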

A particularly appealing feature of the naive Bayes classifier is that it provides a direct estimate of the discrete intensity distribution Pr(I | PGV, PGA). Compared to regression, there is no rounding or interpolation necessary, meaning that integer values are estimated directly. Since Bayes’ rule [eq. (6.2)] is used for the estimation of Pr(I | PGV, PGA), an estimate of Pr(X | I) is required, where X is either PGA or PGV. Thus, the model can be just as easily used to predict the ground motion parameters given I.
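As a small illustration of the reverse direction, the sketch below reads quantiles of a ground motion parameter off a log-normal Pr(X | I = k). The log-mean and log-standard deviation passed in are hypothetical placeholders, not fitted values from this study.

```python
import math
from statistics import NormalDist

def ground_motion_given_intensity(mu_log, sd_log, probs=(0.16, 0.5, 0.84)):
    """Quantiles of a log-normal Pr(X | I = k).

    mu_log, sd_log: mean and standard deviation of log(X) for class k.
    """
    z = NormalDist()  # standard normal, used for its inverse CDF
    return {p: math.exp(mu_log + sd_log * z.inv_cdf(p)) for p in probs}

# Hypothetical class parameters: median PGV of 2.5 (in the dataset's units).
quantiles = ground_motion_given_intensity(math.log(2.5), 0.5)
print(quantiles)
```

The 0.5 quantile is the median exp(mu_log), which for a log-normal distribution coincides with the geometric mean of the class.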

We have learned two naive Bayes classifiers, one with a common standard deviation of the distribution of the ground motion intensity parameters over the different intensity classes, and one with different standard deviations. Even though both classifiers have a similar generalization error, we believe that it is better to use the former, since it provides a more stable estimate of the standard deviation. For some intensity classes, there are only 3 or 5 records, which makes it difficult to obtain a precise estimate of the standard deviation. Other possibilities exist, e.g. one could estimate a common standard deviation for adjacent intensity classes, which is done in Ebel and Wald (2003). Nevertheless, we think that the assumption of a common standard deviation over all intensity classes is reasonable.
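The difference between the two choices can be sketched as follows. The grouped log-PGV values are invented for illustration; the point is only that a class with three records yields a noisy individual estimate, whereas the pooled estimate borrows strength from all classes.

```python
import math
from statistics import mean, stdev

def per_class_sd(groups):
    """Sample standard deviation of log ground motion for each class."""
    return {k: stdev(v) for k, v in groups.items() if len(v) > 1}

def pooled_sd(groups):
    """Common standard deviation: within-class sum of squares over pooled dof."""
    ss = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    dof = sum(len(v) - 1 for v in groups.values())
    return math.sqrt(ss / dof)

# Hypothetical log(PGV) values per intensity class; class 5 has only 3 records.
groups = {
    4: [-0.7, -0.2, -0.5, -0.4],
    5: [0.7, 1.1, 0.9],
    6: [2.2, 2.5, 2.3, 2.9, 2.0],
}

print(per_class_sd(groups))
print(pooled_sd(groups))
```

Because the pooled variance is a weighted average of the per-class variances, the pooled standard deviation always lies between the smallest and largest per-class value.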

In contrast to a regression model, which is unbounded, the naive Bayes classifier can only predict intensity values which occur in the underlying dataset. In principle, one could extrapolate a regression model to ground motion intensity values that lie beyond the extreme values found in the dataset to predict higher/lower intensity values (e.g. intensity values greater than 8 for the current dataset). This is not possible with a naive Bayes classifier. However, it is questionable if this is a disadvantage, since extrapolation of a model outside the parameter boundaries of its underlying dataset can be dangerous (see e.g. Bommer et al. (2007), for a discussion of extrapolating ground motion prediction equations).

The naive Bayes classifier learned in this study is trained on a dataset consisting of macroseismic intensities (on the Mercalli-Cancani-Sieberg scale) and PGA/PGV values from Italy, which is the same dataset used in Faenza and Michelini (2010) (see Data and Resources section). We chose it because of the good documentation of the selection and preprocessing steps. We do not claim that this automatically justifies the application of their or our model in other regions, an issue that requires careful consideration of a number of arguments (e.g. Cotton et al., 2006; Bommer et al., 2010). Certain ground shaking levels are bound to cause damage everywhere in the world, but since macroseismic intensity is a somewhat qualitative parameter that may include information on building quality, exact values/distributions might change from region to region. On the other hand, if the goal is to predict the most probable intensity value, the model may well be applicable in other regions, since this task is probably less sensitive to regional influences. As said before, we leave this issue up to the user.

In this short note, we have considered PGA and PGV as predictor variables for I. Of course, the model can be extended to include other variables such as magnitude or distance (e.g. Tselentis and Danciu, 2008). In that case, however, it is not as easy as in the case of PGA and PGV to assume a parametric distribution for each intensity class. Thus, either these variables need to be discretized, or some other method such as kernel density estimation needs to be employed. Such an analysis, however, is beyond the scope of this article.
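To indicate what the kernel density alternative could look like, the sketch below builds a one-dimensional Gaussian kernel density estimate that could stand in for a class-conditional distribution such as Pr(distance | I = k). The sample values are hypothetical, and the normal-reference bandwidth is an assumption made here for simplicity, not a recommendation from the text.

```python
import math
from statistics import stdev

def gaussian_kde(sample, bandwidth=None):
    """Return a density function estimated from sample with Gaussian kernels."""
    n = len(sample)
    # Normal-reference rule of thumb, used only if no bandwidth is given.
    h = bandwidth if bandwidth is not None else 1.06 * stdev(sample) * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)

    return density

# Hypothetical distances (km) of records in one intensity class.
distances = [12.0, 18.0, 25.0, 31.0, 40.0, 55.0]
density = gaussian_kde(distances)
print(density(30.0))
```

The resulting `density` would replace the log-normal likelihood term for that predictor in the naive Bayes posterior; the rest of the classifier is unchanged.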

Data and Resources

The dataset used in this study is the one compiled by Faenza and Michelini (2010), which is available in their electronic supplement under http://www3.interscience.wiley.


Acknowledgments

We acknowledge that this paper was helped by the discussions in the Pegasos Refinement Project workshops. We thank the reviewers Fleur Strasser and Karen Assatourians and the editor Arthur McGarr for their comments, which helped to clarify and improve the manuscript.


In this work, we have looked at uncertainty in GMMs. During that process, we also investigated some other questions that are of interest in the context of GMMs and PSHA, such as correlation between ground motion intensity parameters or regional differences in ground motion scaling.

A considerable amount of uncertainty that is associated with GMMs pertains to their functional form f(X) (cf. eq. (1.2)). Often, f(X) is determined based on physical considerations and the analysis of residuals. In chapter 2, we have taken a new stance and based f(X) on its predictive capability over the generating dataset. To this end, we introduced the concept of generalization error and cross-validation. The idea here is that for PSHA, the primary goal of a GMM is not to be a model of the physical processes in the ground motion domain, but to accurately predict future expected ground motions. Therefore, a GMM should be oriented along the lines of its predictive power. In this context, see also Breiman (2001a).
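The idea of choosing a functional form by its estimated generalization error can be sketched with two deliberately simple candidates, a constant and a straight line in magnitude, compared by leave-one-out cross-validation. Both candidates and the toy data are assumptions for illustration; the model of chapter 2 is far more flexible.

```python
def fit_constant(xs, ys):
    """Candidate 1: f(x) = c, the mean of the training targets."""
    c = sum(ys) / len(ys)
    return lambda x: c

def fit_line(xs, ys):
    """Candidate 2: f(x) = a + b * x by ordinary least squares."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def loo_mse(fit, xs, ys):
    """Leave-one-out estimate of the mean squared prediction error."""
    err = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        err += (ys[i] - model(xs[i])) ** 2
    return err / len(xs)

# Hypothetical magnitudes and log ground motion values.
mags = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
log_pga = [-2.1, -1.7, -1.4, -0.9, -0.7, -0.2]

scores = {"constant": loo_mse(fit_constant, mags, log_pga),
          "line": loo_mse(fit_line, mags, log_pga)}
print(min(scores, key=scores.get), scores)
```

Each held-out record is predicted by a model that never saw it, so a more complex candidate is preferred only if it actually predicts unseen data better.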

Based on the above considerations, a regression model was learned on the NGA dataset and optimized for its predictive capability. The model is rather complex (having many parameters), but is not overfit. We have calculated an equivalent stochastic model, which is physically interpretable (and also plausible, compared with already published models for western North America). Thus, the method we proposed is a convenient way to optimize a regression model for predictive power and to check that it makes physical sense.

A real physical interpretation is possible only for the equivalent stochastic model, since the parameters of the regression model are not tied to any physical meaning. However, partial dependence plots can reveal several characteristics of the model/data (cf. Figure 2.4, eq. (2.7) and Friedman (2001)). For example, in the partial dependence plot showing the scaling of PGA with distance there is a ‘bump’ visible in the range between RJB = 50 km and RJB = 90 km. This ‘bump’ can be associated with the so-called Moho-bounce. This effect is not modeled in the NGA models, but our analysis shows that it is supported by the data. Hence, the flexible, generalization-error optimized model shows which features are actually inherent in (or supported by) the data and can thus be helpful in choosing a functional form that models these features, thereby reducing uncertainty about f(X).
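For readers unfamiliar with partial dependence, the sketch below computes a partial dependence curve in the usual way: the variable of interest is fixed at each grid value for every record, and the model predictions are averaged. The model, field names, and records are hypothetical stand-ins, not the regression model of chapter 2.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model prediction with `feature` fixed at each grid value.

    model: callable taking a dict of predictor values.
    rows: list of dicts, one per record in the dataset.
    """
    curve = []
    for value in grid:
        preds = [model({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical additive model and records, for illustration only.
def toy_model(row):
    return 1.0 * row["mag"] - 0.02 * row["rjb"]

rows = [{"mag": 5.0, "rjb": 10.0},
        {"mag": 6.0, "rjb": 50.0},
        {"mag": 7.0, "rjb": 100.0}]

curve = partial_dependence(toy_model, rows, "rjb", [0.0, 50.0, 100.0])
print(curve)
```

For a flexible model, local features of this curve (such as a bump over some distance range) show what the data support after averaging over the other predictors.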

On the other hand, the partial dependence plots also show data ranges which are problematic. In particular, this holds for the magnitude and the depth to the top of the rupture. Since there are typically fewer earthquakes than records in a strong motion dataset, these two variables are less well sampled than e.g. distance, and thus the scaling of ground motion with them is less clearly defined by the data. This manifests itself in ‘rougher’ partial dependence plots for the earthquake-related variables. The overall scaling makes sense, but in some ranges is overly complicated. This reflects that the underlying dataset (at least for the earthquake-related variables magnitude and depth to the top of the rupture) is not a representative sample of the true underlying distribution. Thus, for these variables the model can only provide guidance on the general form of ground motion scaling.

In chapter 2, we have optimized the model with respect to generalization error for the moment magnitude, Joyner-Boore distance, VS30 and depth to the top of the rupture; the faulting style is included in the model, but is not adapted. One could also include other variables, such as directivity parameters or sediment depth, to investigate their (functional) relation to ground motion. One could also use basis functions other than polynomials, such as splines. Non-parametric regression methods such as MARS (multivariate adaptive regression splines, Friedman (1991)) or random forests (Breiman, 2001b) may also provide valuable insights.

Along the same lines as the flexible regression model, we used Bayesian networks (BNs) to investigate what can be learned (purely) from data. The BN is a representation of the joint distribution of PGA and the (potential) predictors X, Pr(PGA, X). Here, results are slightly complicated due to the need to discretize the data, but again we find that there are problems in the underlying dataset: several data ranges are not well sampled. However, in ranges with good data coverage the BN gives reasonable results. One example of a data range that is possibly not well represented is the scaling of PGA with very large magnitudes (MW > 7.5). Here, a decrease of PGA with increasing magnitude is observed (so-called oversaturation). This is also seen in the NGA models (Abrahamson and Silva, 2008; Boore and Atkinson, 2008; Campbell and Bozorgnia, 2008), but it was decided not to model this effect due to a lack of scientific consensus on that matter.
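The role of discretization and data coverage can be illustrated with the simplest possible case, a joint table of one binned predictor (magnitude) and binned PGA. This is only a two-variable sketch with invented records and bin edges, not the BN structure learned in the thesis; it shows how an unsampled bin leaves the conditional distribution undefined.

```python
from bisect import bisect_right
from collections import Counter

def discretize(value, edges):
    """Index of the bin that value falls into, given sorted bin edges."""
    return bisect_right(edges, value)

def joint_table(records, mag_edges, pga_edges):
    """Empirical joint distribution Pr(magnitude bin, PGA bin)."""
    counts = Counter((discretize(m, mag_edges), discretize(a, pga_edges))
                     for m, a in records)
    n = len(records)
    return {cell: c / n for cell, c in counts.items()}

def conditional_pga(joint, mag_bin):
    """Pr(PGA bin | magnitude bin); empty if the bin holds no records."""
    row = {a: p for (m, a), p in joint.items() if m == mag_bin}
    z = sum(row.values())
    return {a: p / z for a, p in row.items()} if z > 0 else {}

# Hypothetical (magnitude, PGA in g) records and bin edges.
records = [(5.2, 0.02), (5.8, 0.05), (6.1, 0.08),
           (6.4, 0.15), (6.9, 0.30), (5.5, 0.03)]
mag_edges = [5.5, 6.5, 7.5]
pga_edges = [0.05, 0.2]

joint = joint_table(records, mag_edges, pga_edges)
print(conditional_pga(joint, 1))  # well-sampled magnitude bin
print(conditional_pga(joint, 3))  # bin above the last edge: no records
```

Bins with few or no records give unstable or empty conditional distributions, which is the discretized counterpart of the poorly sampled data ranges discussed above.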
