In case of a non-linear relation linking the model to the data, numerical Markov Chain Monte Carlo (MCMC) algorithms (Sambridge and Mosegaard, 2002) must be employed to accurately sample the target probability density function (pdf), also called the posterior probability density (PPD). A MCMC algorithm performs a random walk in the model space in which the sampling density is designed to be proportional to the posterior, so that the sampled models is used to approximate the target PPD. The first stage of the MCMC sampling (called the burn-in period) can be viewed as a global optimization that moves from a random starting model to a high-probability region of the model space. The second stage is called the sampling stage in which the small fluctuation of the misfit value indicates that the MCMC algorithm reaches the stationary regime. The samples accepted during the burn-in period do not accurately represent the PPD and for this reason they are not considered for computing the PPD. Usually to speed up the probabilistic sampling, multiple MCMC chains are simultaneously run. This offers a robust protection against premature convergence because the chains use different trajectories to explore the target posterior distribution. For example, different chains can efficiently sample different modes of the target pdf without get trapped in a localized optimum. Determining the convergence of the MCMC algorithm to a stable posterior distribution is indeed a very crucial aspect and several strategies have been developed. For example, in cases of multiple chains we can compute the potential scale reduction factor (PSRF) for each model parameter. This number quantifies the difference between the “within-walk” and “between-walk” estimated variances. The PSRF decreases to 1 as the number of drawn samples tends to infinite. A high PSRF value indicates that the variance within the walks is small compared to that between the walks and that longer walks are needed to converge to a stable distribution. Usually, a PSRF lower than 1.2 for a given unknown proves that convergence has been achieved for that model parameter. Although the increasing computational power provided by modern parallel architectures has considerably encouraged the applications of MCMC methods to solve geophysical problems, it is always crucial adopting a specific MCMC recipe to guarantee an efficient sampling of the PPD. To this end many strategies have been proposed over the last years: For examples adopting self-adaptive MCMC algorithms, hybridizing MCMC algorithms with global search methods, or employing peculiar updating schemes that ensure an efficient exploration of the model space. In this work, we test the performances of five MCMC algorithms on analytic pdfs and a strongly non-linear and ill-conditioned geophysical inverse problem: a synthetic cross-hole traveltime tomography. The performances are evaluated by comparison of the target and the estimated pdfs, by comparing the evolutions of the chains, and by computing the PSRF.
A comparison of Markov Chain Monte Carlo algorithms on analytical probability density functions and synthetic cross-hole traveltime tomography
Alessandro Salusti;Aleardi Mattia
2019-01-01
Abstract
In case of a non-linear relation linking the model to the data, numerical Markov Chain Monte Carlo (MCMC) algorithms (Sambridge and Mosegaard, 2002) must be employed to accurately sample the target probability density function (pdf), also called the posterior probability density (PPD). A MCMC algorithm performs a random walk in the model space in which the sampling density is designed to be proportional to the posterior, so that the sampled models is used to approximate the target PPD. The first stage of the MCMC sampling (called the burn-in period) can be viewed as a global optimization that moves from a random starting model to a high-probability region of the model space. The second stage is called the sampling stage in which the small fluctuation of the misfit value indicates that the MCMC algorithm reaches the stationary regime. The samples accepted during the burn-in period do not accurately represent the PPD and for this reason they are not considered for computing the PPD. Usually to speed up the probabilistic sampling, multiple MCMC chains are simultaneously run. This offers a robust protection against premature convergence because the chains use different trajectories to explore the target posterior distribution. For example, different chains can efficiently sample different modes of the target pdf without get trapped in a localized optimum. Determining the convergence of the MCMC algorithm to a stable posterior distribution is indeed a very crucial aspect and several strategies have been developed. For example, in cases of multiple chains we can compute the potential scale reduction factor (PSRF) for each model parameter. This number quantifies the difference between the “within-walk” and “between-walk” estimated variances. The PSRF decreases to 1 as the number of drawn samples tends to infinite. A high PSRF value indicates that the variance within the walks is small compared to that between the walks and that longer walks are needed to converge to a stable distribution. Usually, a PSRF lower than 1.2 for a given unknown proves that convergence has been achieved for that model parameter. Although the increasing computational power provided by modern parallel architectures has considerably encouraged the applications of MCMC methods to solve geophysical problems, it is always crucial adopting a specific MCMC recipe to guarantee an efficient sampling of the PPD. To this end many strategies have been proposed over the last years: For examples adopting self-adaptive MCMC algorithms, hybridizing MCMC algorithms with global search methods, or employing peculiar updating schemes that ensure an efficient exploration of the model space. In this work, we test the performances of five MCMC algorithms on analytic pdfs and a strongly non-linear and ill-conditioned geophysical inverse problem: a synthetic cross-hole traveltime tomography. The performances are evaluated by comparison of the target and the estimated pdfs, by comparing the evolutions of the chains, and by computing the PSRF.File | Dimensione | Formato | |
---|---|---|---|
Comparison of Markov chain.pdf
accesso aperto
Tipologia:
Versione finale editoriale
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
2.68 MB
Formato
Adobe PDF
|
2.68 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.