A list of Python-based MCMC & ABC packages
A list of Python-based MCMC & ABC packages. Also here’s a nice list of MCMC algorithms.
A general ABC framework to accommodate any type of model for parameter inference.
A Python Approximate Bayesian Computing (ABC) Population Monte Carlo (PMC) implementation based on Sequential Monte Carlo (SMC) with Particle Filtering techniques.
Features:
- Entirely implemented in Python and easy to extend
- Follows Beaumont et al. 2009 PMC algorithm
- Parallelized with multiprocessing or message passing interface (MPI)
- Extendable with k-nearest neighbour (KNN) or optimal local covariance matrix (OLCM) perturbation kernels (Filippi et al. 2012)
- Detailed examples in IPython notebooks
ABCpy is a scientific library written in Python for Bayesian uncertainty quantification in the absence of a likelihood function. It parallelizes existing approximate Bayesian computation (ABC) algorithms and other likelihood-free inference schemes. It presently includes:
- RejectionABC
- PMCABC (Population Monte Carlo ABC)
- SMCABC (Sequential Monte Carlo ABC)
- RSMCABC (Replenishment SMC-ABC)
- APMCABC (Adaptive Population Monte Carlo ABC)
- SABC (Simulated Annealing ABC)
- ABCsubsim (ABC using subset simulation)
- PMC (Population Monte Carlo) using approximations of likelihood functions
- Random Forest Model Selection Scheme
- Semi-automatic summary selection (with neural networks)
- Summary selection using distance learning (with neural networks)
ABCpy addresses the needs of domain scientists and data scientists by providing
- a fully modularized framework that is easy to use and easy to extend,
- a quick way to integrate your generative model into the framework (from C++, R etc.),
- a non-intrusive, user-friendly way to parallelize inference computations (from your laptop to clusters, supercomputers and AWS), and
- an intuitive way to perform inference on hierarchical models or, more generally, on Bayesian networks.
ABrox is a python package for Approximate Bayesian Computation accompanied by a user-friendly graphical interface.
ABC-SysBio implements likelihood-free parameter inference and model selection in dynamical systems. It is designed to work with both stochastic and deterministic models written in Systems Biology Markup Language (SBML). ABC-SysBio is a Python package that combines three algorithms: an ABC rejection sampler, ABC SMC for parameter inference, and ABC SMC for model selection.
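To make the ABC rejection sampler named above concrete, here is a minimal pure-Python sketch. It does not use the ABC-SysBio API; the function `abc_rejection`, the toy Gaussian model, the uniform prior, and the tolerance value are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, distance, tolerance, n_accept, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)            # draw a candidate parameter from the prior
        simulated = simulate(theta, rng)     # run the forward model; no likelihood needed
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical toy problem: infer the mean of a unit-variance Gaussian
# from the sample mean of 50 observations.
rng = random.Random(0)
observed = statistics.fmean(rng.gauss(2.0, 1.0) for _ in range(50))

posterior = abc_rejection(
    observed=observed,
    simulate=lambda mu, r: statistics.fmean(r.gauss(mu, 1.0) for _ in range(50)),
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    distance=lambda a, b: abs(a - b),
    tolerance=0.1,
    n_accept=200,
    rng=rng,
)
abc_mean = statistics.fmean(posterior)
```

Shrinking `tolerance` brings the accepted sample closer to the true posterior at the cost of more rejected simulations, which is the inefficiency that the SMC variants listed here try to reduce.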
astroABC is a Python implementation of an Approximate Bayesian Computation Sequential Monte Carlo (ABC SMC) sampler for parameter estimation.
Key features
- Parallel sampling using MPI or multiprocessing
- MPI communicator can be split so both the sampler, and simulation launched by each particle, can run in parallel
- A Sequential Monte Carlo sampler (see e.g. Toni et al. 2009, Beaumont et al. 2009, Sisson & Fan 2010)
- A method for iteratively adapting tolerance levels using the qth quantile of the distance for t iterations (Turner & Van Zandt 2012)
- Scikit-learn covariance matrix estimation using Ledoit-Wolf shrinkage for singular matrices
- A module for specifying particle covariance using the method proposed by Turner & Van Zandt (2012), an optimal covariance matrix for a multivariate normal perturbation kernel, a local covariance estimate using the scikit-learn KDTree method for nearest neighbours (Filippi et al. 2013), or a weighted covariance (Beaumont et al. 2009)
- Restart files output frequently so an interrupted run can be resumed at any iteration
- Output and restart files are backed up every iteration
- User defined distance metric and simulation methods
- A class for specifying heterogeneous parameter priors
- Methods for drawing from any non-standard prior PDF, e.g. using Planck/WMAP chains
- A module for specifying a constant, linear, log or exponential tolerance level
- Well-documented examples and sample scripts
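The quantile-based tolerance adaptation in the feature list can be sketched in a few lines. This is not astroABC's code: `next_tolerance` is a hypothetical helper, and the nearest-rank quantile convention is an assumption made here.

```python
import math


def next_tolerance(distances, q):
    """Return the q-th quantile (nearest-rank convention) of the accepted distances.

    The result can serve as the tolerance for the following SMC iteration.
    """
    ordered = sorted(distances)
    # Nearest-rank index, clamped so that q=0 and q=1 stay inside the list.
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


# Distances of the particles accepted in the current iteration (made-up values).
distances = [0.9, 0.2, 0.5, 0.7]
tolerance = next_tolerance(distances, q=0.75)  # 0.7 under this convention
```

A smaller `q` tightens the tolerance faster but lowers the acceptance rate of the next iteration.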
A-NICE-MC is a framework that trains a parametric Markov Chain Monte Carlo proposal. It achieves higher performance than traditional nonparametric proposals, such as Hamiltonian Monte Carlo (HMC).
A-NICE-MC stands for Adversarial Non-linear Independent Component Estimation Monte Carlo, in that:
- The framework utilizes a parametric proposal for Markov Chain Monte Carlo (MC).
- The proposal is represented through Non-linear Independent Component Estimation (NICE).
- The NICE network is trained through adversarial methods (A); see jiamings/markov-chain-gan.
bmcmc is a general-purpose MCMC package which should be useful for Bayesian data analysis. It uses an adaptive scheme for automatic tuning of proposal distributions. It can also handle hierarchical Bayesian models via a Metropolis-within-Gibbs scheme.
CheKiPEUQ is a Python MCMC code for parameter estimation for complex physical problems. The CheKiPEUQ software provides tools for finding physically realistic parameter estimates, graphs of the parameter estimate positions within parameter space, and plots of the final simulation results.
Package which enables parameter inference using an Approximate Bayesian Computation (ABC) algorithm. The code was originally designed for cosmological parameter inference from galaxy cluster number counts based on Sunyaev-Zel’dovich measurements. In this context, the cosmological simulations were performed using the NumCosmo library.
Parallel nested sampling in Python. CPNest is a Python package for performing Bayesian inference using the nested sampling algorithm. It is designed to make it simple for the user to provide a model via a set of parameters, their bounds and a log-likelihood function. An optional log-prior function can be given for non-uniform prior distributions.
A Dynamic Nested Sampling package for computing Bayesian posteriors and evidences.
dyPolyChord implements dynamic nested sampling using the efficient PolyChord sampler to provide state-of-the-art nested sampling performance. Any likelihoods and priors which work with PolyChord can be used (Python, C++ or Fortran), and the output files produced are in the PolyChord format.
Edward2 is a probabilistic programming language in TensorFlow and Python. It extends the TensorFlow ecosystem so that one can declare models as probabilistic programs and manipulate a model’s computation for flexible training, latent variable inference, and predictions.
ELFI is a statistical software package written in Python for likelihood-free inference (LFI) such as Approximate Bayesian Computation (ABC). The term LFI refers to a family of inference methods that replace the use of the likelihood function with a data generating simulator function. ELFI features an easy to use generative modeling syntax and supports parallelized inference out of the box.
emcee is an MIT licensed pure-Python implementation of Goodman & Weare’s Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. It’s designed for Bayesian parameter estimation and it’s really sweet!
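For readers unfamiliar with the affine-invariant ensemble idea, the following is a serial, pure-Python sketch of the Goodman & Weare stretch move. It is not emcee's code or API; the function name, the toy target, and the walker and sweep counts are assumptions for illustration.

```python
import math
import random
import statistics


def stretch_move_sweep(walkers, log_prob, rng, a=2.0):
    """Update every walker once with the stretch move, one walker at a time."""
    n_walkers, ndim = len(walkers), len(walkers[0])
    for k in range(n_walkers):
        # Pick a different walker j to define the stretch direction.
        j = rng.randrange(n_walkers - 1)
        if j >= k:
            j += 1
        # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
        log_accept = (ndim - 1) * math.log(z) + log_prob(proposal) - log_prob(walkers[k])
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_accept:
            walkers[k] = proposal
    return walkers


# Toy target: an axis-aligned Gaussian with variances 100 and 1.
rng = random.Random(1)
log_prob = lambda x: -0.5 * (x[0] ** 2 / 100.0 + x[1] ** 2)
walkers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]

samples = []
for sweep in range(3000):
    stretch_move_sweep(walkers, log_prob, rng)
    if sweep >= 1000:  # discard the first sweeps as burn-in
        samples.extend(list(w) for w in walkers)

var_x0 = statistics.pvariance([s[0] for s in samples])
var_x1 = statistics.pvariance([s[1] for s in samples])
```

Because proposals are built from the positions of other walkers, the sampler needs no hand-tuned step size even though the two dimensions differ in scale by a factor of ten.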
A simple Hamiltonian MCMC sampler.
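As a rough idea of what such a sampler does, here is a minimal Hamiltonian Monte Carlo step with a leapfrog integrator and unit mass matrix. It is a generic sketch, not the code of the package described above; `hmc_step`, the step size, and the number of leapfrog steps are assumptions.

```python
import math
import random
import statistics


def hmc_step(x, log_prob, grad_log_prob, rng, step_size=0.2, n_leapfrog=10):
    """One HMC transition; returns the new state and whether the proposal was accepted."""
    p = [rng.gauss(0.0, 1.0) for _ in x]                  # resample the momentum
    current_h = -log_prob(x) + 0.5 * sum(pi * pi for pi in p)

    q = list(x)
    grad = grad_log_prob(q)
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, grad)]   # initial half step
    for i in range(n_leapfrog):
        q = [qi + step_size * pi for qi, pi in zip(q, p)]        # full position step
        grad = grad_log_prob(q)
        if i < n_leapfrog - 1:
            p = [pi + step_size * gi for pi, gi in zip(p, grad)]  # full momentum step
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, grad)]   # final half step

    proposed_h = -log_prob(q) + 0.5 * sum(pi * pi for pi in p)
    # Metropolis correction for the integration error in the Hamiltonian.
    if math.log(1.0 - rng.random()) < current_h - proposed_h:
        return q, True
    return x, False


# Toy target: a two-dimensional standard normal.
rng = random.Random(2)
log_prob = lambda x: -0.5 * sum(xi * xi for xi in x)
grad_log_prob = lambda x: [-xi for xi in x]

x, draws, n_accepted = [3.0, -3.0], [], 0
for _ in range(3000):
    x, accepted = hmc_step(x, log_prob, grad_log_prob, rng)
    n_accepted += accepted
    draws.append(x[0])

kept = draws[200:]  # drop burn-in
hmc_mean = statistics.fmean(kept)
hmc_var = statistics.pvariance(kept)
acceptance_rate = n_accepted / 3000
```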
An adaptive basin-hopping Markov-chain Monte Carlo algorithm for Bayesian optimisation. Python implementation of the hoppMCMC algorithm aiming to identify and sample from the high-probability regions of a posterior distribution. The algorithm combines three strategies: (i) parallel MCMC, (ii) adaptive Gibbs sampling and (iii) simulated annealing. Overall, hoppMCMC resembles the basin-hopping algorithm implemented in the optimize module of scipy, but it is developed for a wide range of modelling approaches including stochastic models with or without time-delay.
kombine is an ensemble sampler built for efficiently exploring multimodal distributions. By using estimates of the ensemble’s instantaneous distribution as a proposal, it achieves very fast burn-in, followed by sampling with very short autocorrelation times.
Multi-Core Markov-Chain Monte Carlo (MC3) is a powerful Bayesian-statistics tool that offers:
- Levenberg-Marquardt least-squares optimization.
- Markov-chain Monte Carlo (MCMC) posterior-distribution sampling following the:
  - Metropolis-Hastings algorithm with Gaussian proposal distribution,
  - Differential-Evolution MCMC (DEMC), or
  - DEMCzs (Snooker).
Flexible and efficient Python implementation of the nested sampling algorithm. This implementation is geared towards allowing statistical physicists to use this method for thermodynamic analysis but is also being used by astrophysicists.
This implementation uses the language of statistical mechanics (partition function, phase space, configurations, energy, density of states) rather than the language of Bayesian sampling (likelihood, prior, evidence). This is simply for convenience; the method is the same.
The package goes beyond the bare implementation of the method providing:
- built-in parallelisation on a single computing node (max total number of CPU threads on a single machine)
- built-in Pyro4-based parallelisation by distributed computing, ideal to run calculations on a cluster or across a network
- ability to save and restart from checkpoint binary files, ideal for very long calculations
- scripts to compute heat capacities and perform error analysis
- integration with the MCpele package to implement efficient Monte Carlo walkers.
Pure Python, MIT-licensed implementation of nested sampling algorithms. Nested Sampling is a computational approach for integrating posterior probability in order to compare models in Bayesian statistics. It is similar to Markov Chain Monte Carlo (MCMC) in that it generates samples that can be used to estimate the posterior probability distribution. Unlike MCMC, the nature of the sampling also allows one to calculate the integral of the distribution. It also happens to be a pretty good method for robustly finding global maxima.
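The claim that nested sampling yields the integral of the distribution (the evidence) can be illustrated with a deliberately naive sketch. It is not the package's API: replacement points are drawn by plain rejection sampling from the prior, which only works for cheap toy problems, and all names and settings below are assumptions.

```python
import math
import random


def log_add_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def nested_sampling(log_like, prior_sample, rng, n_live=100, n_iter=500):
    """Estimate the log-evidence with the basic nested sampling recursion."""
    live = [prior_sample(rng) for _ in range(n_live)]
    live_logl = [log_like(x) for x in live]
    log_z, log_x_prev = -math.inf, 0.0
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        log_x = -i / n_live                         # expected log prior volume remaining
        log_width = math.log(math.exp(log_x_prev) - math.exp(log_x))
        log_z = log_add_exp(log_z, live_logl[worst] + log_width)
        threshold = live_logl[worst]
        while True:                                 # naive constrained draw from the prior
            candidate = prior_sample(rng)
            candidate_logl = log_like(candidate)
            if candidate_logl > threshold:
                break
        live[worst], live_logl[worst] = candidate, candidate_logl
        log_x_prev = log_x
    # Spread the remaining prior volume evenly over the final live points.
    for logl in live_logl:
        log_z = log_add_exp(log_z, logl + log_x_prev - math.log(n_live))
    return log_z


# Toy problem: normalised unit Gaussian likelihood, uniform prior on [-5, 5].
# The exact evidence is very close to 1/10.
rng = random.Random(3)
log_like = lambda x: -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)
log_z = nested_sampling(log_like, lambda r: r.uniform(-5.0, 5.0), rng)
```

Real implementations replace the rejection loop with a constrained sampler (MCMC walks, ellipsoids, slice sampling), since the acceptance rate of prior draws shrinks with the remaining prior volume.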
No-U-Turn Sampler (NUTS) for Python. This package implements the No-U-Turn Sampler (NUTS), Algorithm 6 from the NUTS paper (Hoffman & Gelman, 2011).
Python library for working with Probabilistic Graphical Models.
Fork of Daniel Foreman-Mackey’s emcee to implement parallel tempering more robustly. As far as possible, it is designed as a drop-in replacement for emcee. If you’re trying to characterise awkward, multi-modal probability distributions, then ptemcee is your friend.
MPI enabled Parallel Tempering MCMC code written in Python.
Python class that coordinates an MPI implementation of parallel tempering. Supports a fully parallelised implementation of parallel tempering using mpi4py (message passing interface for python). Each replica runs as a separate parallel process and they communicate via an mpi4py object. To minimise message passing the replicas stay in place and only the temperatures are exchanged between the processes. It is this exchange of temperatures that ptmpi handles.
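The temperature-exchange step described above can be sketched without MPI. The snippet below is not ptmpi's interface: `maybe_swap_temperatures` is a hypothetical helper that works on plain lists, with `betas` holding inverse temperatures and `log_likes` the current log-likelihood of each replica.

```python
import math
import random


def maybe_swap_temperatures(betas, log_likes, i, j, rng):
    """Propose exchanging the inverse temperatures of replicas i and j in place.

    The replica states stay where they are; only the temperature labels move.
    """
    # Log of the Metropolis ratio for swapping the temperatures of two tempered targets.
    log_accept = (betas[i] - betas[j]) * (log_likes[j] - log_likes[i])
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_accept:
        betas[i], betas[j] = betas[j], betas[i]
        return True
    return False


# Replica 0 is the cold chain (beta = 1) but currently sits at a worse likelihood
# than the hot replica 1, so this swap is always accepted.
rng = random.Random(4)
betas = [1.0, 0.5]
log_likes = [-10.0, -2.0]
swapped = maybe_swap_temperatures(betas, log_likes, 0, 1, rng)
```

In an MPI setting only the two scalars `betas[i]` and `betas[j]` would need to be communicated, which is why exchanging temperatures instead of full states keeps message passing small.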
pyABC is a framework for distributed, likelihood-free inference. That means, if you have a model and some data and want to know the posterior distribution over the model parameters, i.e. you want to know with which probability which parameters explain the observed data, then pyABC might be for you.
All you need is some way to numerically draw samples from the model, given the model parameters. pyABC “inverts” the model for you and tells you which parameters were well matching and which ones not. You do not need to analytically calculate the likelihood function.
pyABC runs efficiently on multi-core machines and distributed cluster setups. It is easy to use and flexibly extensible.
A Python implementation of the MT-DREAM(ZS) algorithm.
This package is a straightforward port of the functions hmc2.m and hmc2_opt.m from the MCMCstuff MATLAB toolbox written by Aki Vehtari. The code is originally based on the function hmc.m from the Netlab toolbox written by Ian T Nabney. The portion of the algorithm involving “windows” is derived from the C code for this function included in the Software for Flexible Bayesian Modeling written by Radford Neal.
PyJAGS provides a Python interface to JAGS, a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation.
PyMC3 is a probabilistic programming module for Python that allows users to fit Bayesian models using a variety of numerical methods, most notably Markov chain Monte Carlo (MCMC) and variational inference (VI). Its flexibility and extensibility make it applicable to a large suite of problems. Along with core model specification and fitting functionality, PyMC3 includes functionality for summarizing output and for model diagnostics.
Simple implementation of the Metropolis-Hastings algorithm for Markov Chain Monte Carlo sampling of multidimensional spaces. The implementation is minimalistic. All that is required is a function which accepts an iterable of parameter values and returns the positive log likelihood at that point.
A Python module implementing some generic MCMC routines. The main purpose of this module is to serve as a simple MCMC framework for generic models. Probably the most useful contribution at the moment is that it can be used to train Gaussian process (GP) models implemented in the GPy package.
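A random-walk Metropolis-Hastings sampler built around such a log-likelihood function might look like the sketch below. It is not the package's own code; `metropolis_hastings`, the Gaussian proposal, and the toy target are assumptions made for this example.

```python
import math
import random
import statistics


def metropolis_hastings(log_like, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler driven only by a log-likelihood function."""
    current = list(start)
    current_ll = log_like(current)
    chain = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = [c + rng.gauss(0.0, step_size) for c in current]
        proposal_ll = log_like(proposal)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain


# Toy target: unit-variance Gaussian centred on (1, -2).
rng = random.Random(5)
log_like = lambda x: -0.5 * ((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)
chain = metropolis_hastings(log_like, start=[0.0, 0.0], n_steps=20000, step_size=1.0, rng=rng)

kept = chain[2000:]  # drop burn-in
mh_mean_0 = statistics.fmean(s[0] for s in kept)
mh_mean_1 = statistics.fmean(s[1] for s in kept)
```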
The code features the following things at the moment:
- Fully object oriented. The models can be of any type as long as they offer the right interface.
- Random walk proposals.
- Metropolis Adjusted Langevin Dynamics.
- The MCMC chains are stored in fast HDF5 format using PyTables.
- A mean function can be added to the (GP) models of the GPy package.
The pymcmcstat package is a Python program for running Markov Chain Monte Carlo (MCMC) simulations. Included in this package is the ability to use different Metropolis-based sampling techniques:
- Metropolis-Hastings (MH): Primary sampling method.
- Adaptive-Metropolis (AM): Adapts covariance matrix at specified intervals.
- Delayed-Rejection (DR): Delays rejection by sampling from a narrower distribution. Capable of n-stage delayed rejection.
- Delayed Rejection Adaptive Metropolis (DRAM): DR + AM
This package is an adaptation of the MATLAB toolbox mcmcstat.
MultiNest is a program and a sampling technique. As a Bayesian inference technique, it allows parameter estimation and model selection. Recently, MultiNest added Importance Nested Sampling which is now also supported. The efficient Monte Carlo algorithm for sampling the parameter space is based on nested sampling and the idea of disjoint multi-dimensional ellipse sampling. For the scientific community, where Python is becoming the new lingua franca (luckily), I provide an interface to MultiNest.
pysmc is a Python package for sampling complicated probability densities using the celebrated Sequential Monte Carlo method.
PyStan provides an interface to Stan, a package for Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo.
Sampyl is a Python library implementing Markov Chain Monte Carlo (MCMC) samplers. It’s designed for use in Bayesian parameter estimation and provides a collection of distribution log-likelihoods for use in constructing models.
PyTorch package for simulation-based inference. Simulation-based inference is the process of finding parameters of a simulator from observations. sbi takes a Bayesian approach and returns a full posterior distribution over the parameters, conditional on the observations. This posterior can be amortized (i.e. useful for any observation) or focused (i.e. tailored to a particular observation), with different computational trade-offs.
A Python package for Approximate Bayesian Computation.
A Statistical Parameter Optimization Tool for Python. SPOTPY is a Python framework that enables the use of computational optimization techniques for calibration, uncertainty and sensitivity analysis of almost every (environmental) model.
UltraNest is intended for fitting complex physical models with slow likelihood evaluations, with one to hundreds of parameters. UltraNest intends to replace heuristic methods like multi-ellipsoid nested sampling and dynamic nested sampling with more rigorous methods. UltraNest also attempts to provide feature parity compared to other packages (such as MultiNest).
zeus is a pure-Python implementation of the Ensemble Slice Sampling method.
- Fast & Robust Bayesian Inference,
- No hand-tuning,
- Excellent performance in terms of autocorrelation time and convergence rate,
- Scales to multiple CPUs without any extra effort.