Goals
In this tutorial, we look at the Metropolis-Hastings sampler. This tutorial will cover how to:
- Implement the Metropolis-Hastings sampler using two different variations of acceptance probabilities:
- An independence chain
- A random walk chain
- Calculate the distribution's mean and variance-covariance matrix.
Introduction
The Metroplis-Hastings sampler is an iterative algorithm which produces a sequence of draws $\theta^{(r)}$ which can be used to estimate sample parameters such that
$$\hat{g} = \frac{\sum_{r=1}^{R} g(\theta^{(r)})}{R}$$
The intuition is relatively simple. The algorithm:
- Picks a candidate draw from the specified candidate generating density.
- Determines if the draw is accepted or rejected based on a computed acceptance probability.
- For each draw $r = 1, ..., R$ sets $\theta^{(r)}$ equal to $\theta^{(r-1)}$, if the draw is rejected.
In this particular tutorial, we will examine a single parameter $\theta$ with the posterior distribution given by
$$ p(\theta|y) \propto exp\bigg(-\frac{1}{2}|(\theta)|\bigg) $$
The Metropolis-Hastings steps
The general Metropolis-Hastings algorithm can be broken down into simple steps:
- Set up sampler specifications, including number of iterations and number of burn-ins draws.
- Choose a starting value for $\theta$.
- Draw $\theta^*$ from the candidate generating density.
- Calculate the acceptance probability $\alpha(\theta^{(r-1)}, \theta^*)$.
- Set $\theta^{(r)} = \theta^*$ with probability $\alpha(\theta^{(r-1)}, \theta^*)$, or else set $\theta^{(r)} = \theta^{(r-1)}$.
- Repeat steps 3-5 for $r = 1, ... , R$.
- Average the draws to produce the Metropolis-Hastings estimates for posterior functions of interest (mean, variance, etc).
The independent chain
The candidate generating density
We will first consider an independent chain which uses a normal distribution to generate the candidate draws
$$ N\big(0,\text{ }d^2\big)$$
The acceptance probability
The acceptance probability for this algorithm depends on the previous draw, $\theta^{(r-1)}$, the candidate draw, $\theta^*$, and the performance parameter, $d$ :
$$ \alpha\big(\theta^{(r-1)}, \theta^*\big) = min \bigg[exp\Big[\frac{1}{2}\Big( | \theta^{(r-1)}| - | \theta^*| - \bigg(\frac{\theta^{(r-1)}}{d}\bigg)^2 + \bigg(\frac{\theta^*}{d}\bigg)^2\Big) \Big],1\bigg]$$
Metropolis-Hastings settings
Our first step is to set up the specifications of the Metropolis-Hastings algorithm. We will :
- Keep 10,000 total draws
- Set a burn-in sample of 100 draws.
- Set the starting values of $\theta$ to 1.
- Set the standard deviation of the candidate draw to 6.
new;
/*
** Set up Metropolis Hastings chain
** Discard r0 burnin replications
*/
burn_in = 100;
// Specify the total number of replications
keep = 10000;
/*
** Standard deviation for the increment
** in the independent chain M-H algorithm
*/
sd_ic = 6;
// Initialize MH sums to zero
theta_mean_ic = 0;
th2_mo_ic = 0;
theta_draw_ic = zeros(keep+burn_in, 1);
// Specify starting value for independent chain theta draw
theta_draw_ic[1, 1] = 1;
// Set count of the number of accepted draws
count_ic = 0;
Implementing the algorithm
Next, we will implement the Metropolis-Hastings algorithm using a for loop. Within each iteration of our loop we need to :
- Draw a candidate $\theta$ using the GAUSS rndn function.
- Compute the acceptance probability.
- Set $\theta^{(r)} = \theta^*$ with probability $\alpha(\theta^{(r-1)}, \theta^*)$, or else set $\theta^{(r)} = \theta^{(r-1)}$.
// Set random seed for repeatable random numbers
rndseed 97980;
for i(2, burn_in+keep, 1);
// Front multiply by sd_ic
theta_can_ic = sd_ic*rndn(1, 1);
// Calculate acceptance probability
acc_prob_ic = minc(exp(.5*(abs(theta_draw_ic[i-1, 1]) - abs(theta_can_ic) + (theta_can_ic/sd_ic)^2 - (theta_draw_ic[i-1, 1]/sd_ic)^2))|1);
// Accept candidate draw with probability acc_prob_ic
if acc_prob_ic > rndu(1, 1);
theta_draw_ic[i, 1] = theta_can_ic;
count_ic = count_ic + 1;
else;
theta_draw_ic[i, 1] = theta_draw_ic[i-1, 1];
endif;
/*
** Discard r0 burnin draws
** Store remainder for computing
** sample parameters.
*/
if i>burn_in;
/*
** This keeps a running sum of
** of theta. It can be used to
** find the mean of theta.
*/
theta_mean_ic = theta_mean_ic + theta_draw_ic[i, 1];
/*
** This keeps a running sum of
** of square of theta. It will
** be used to find the standard
** deviation of theta.
*/
th2_mo_ic = th2_mo_ic + theta_draw_ic[i, 1].^2;
endif;
endfor;
The random walk chain
As an alternative to the independent chain, let's also consider the random walk chain. The random walk chain Metropolis-Hastings algorithm generates candidate draws from the process
$$\theta^* = \theta^{(r-1)} + z$$
where $z$ is a random variable that follows a normal increment.
The candidate generating density
The iteration candidate, $\theta^*$ is drawn from $N(\theta^{(r-1)},\text{ } c^2)$ where, $c$ is a performance parameter. Note how this differs from the independent chain which is drawn from $N(0,\text{ } c^2)$.
The acceptance probability
In the random walk case, the acceptance probability is given by
$$ \alpha\big(\theta^{(r-1)}, \theta^*\big) = min \bigg[exp\Big[\frac{1}{2}\Big( | \theta^{(r-1)}| - | \theta^*|\Big) \Big],1\bigg]$$
Implementing the algorithm
We will again use a for
loop to complete the steps of our Metropolis-Hastings sampler
// Set random seed
rndseed 10385;
// Standard deviation for RW M-H algorithm
sd_rw = 4;
// Initialize MH variables
theta_mean_rw = 0;
th2_mo_rw = 0;
theta_draw_rw = zeros(keep+burn_in, 1);
// Specify starting value for theta draw
theta_draw_rw[1, 1] = 1;
// Set count of the number of accepted draws
count_rw = 0;
for i(2, burn_in+keep, 1);
// Draw a candidate for random walk chain
theta_can_rw = theta_draw_rw[i, 1] + sd_rw*rndn(1, 1);
// Calculate acceptance probability
acc_prob_rw = minc(exp(.5*(abs(theta_draw_rw[i, 1]) - abs(theta_can_rw)))|1);
// Accept candidate draw with probability acc_prob_rw
if acc_prob_rw > rndu(1, 1);
theta_draw_rw[i, 1] = theta_can_rw;
count_rw = count_rw+1;
else;
theta_draw_rw[i, 1] = theta_draw_rw[i-1, 1];
endif;
/*
** Discard r0 burnin draws
** Store remainder for computing
** sample parameters.
*/
if i>burn_in;
/*
** This keeps a running sum of
** of theta. It can be used to
** find the mean of theta.
*/
theta_mean_rw = theta_mean_rw + theta_draw_rw[i,1];
/*
** This keeps a running sum of
** of square of theta. It will
** be used to find the standard
** deviation of theta.
*/
th2_mo_rw = th2_mo_rw + theta_draw_rw[i, 1].^2;
endif;
endfor;
for
loop. This minimizes the amount of looping in our program. However, we compute a separate for
loop for the for the sake of demonstration.Compute sample statistics
Finally, we use our stored data to find our sampler mean and variance.
print "Number of burn in replications";
burn_in;
print "Number of included replications";
keep;
print "Proportion of accepted candidate draws: Independence chain M-H";
count_ic/(keep+burn_in);
print "Proportion of accepted candidate draws: Random walk chain M-H";
count_rw/(keep+burn_in);
theta_mean = (theta_mean_ic~theta_mean_rw)./keep;
th2_mo = (th2_mo_ic~th2_mo_rw)/keep;
thvar = th2_mo - theta_mean.^2;
print "Posterior Mean and Variance, Ind Chain then RW Chain";
theta_mean;
thvar;
The above code should print the following output:
Number of burn in replications 100.0000 Number of included replications 10000.0000 Proportion of accepted candidate draws: Independence chain M-H 0.48455446 Proportion of accepted candidate draws: Random walk chain M-H 0.52603960 Posterior Mean and Variance, Ind Chain then RW Chain 0.038943707 0.075589115 8.1069691 8.0770516
Conclusion
Congratulations! You have:
- Implemented the Metropolis-Hastings sampler using as an independent chain.
- Implemented the Metropolis-Hastings sampler using as a random walk chain.
- Calculated the distribution's mean and variance-covariance matrix.
For your convenience, the entire code is below.
new;
/*
** Set up Metropolis Hastings chain
** Discard r0 burnin replications
*/
burn_in = 100;
// Specify the total number of replications
keep = 10000;
/*
** Standard deviation for the increment
** in the independent chain M-H algorithm
*/
sd_ic = 6;
// Initialize MH sums to zero
theta_mean_ic = 0;
th2_mo_ic = 0;
theta_draw_ic = zeros(keep+burn_in, 1);
/*
** Specify starting value for
** independent chain theta draw
*/
theta_draw_ic[1, 1] = 1;
// Set count of the number of accepted draws
count_ic = 0;
// Set random seed for repeatable random numbers
rndseed 97980;
for i(2, burn_in+keep, 1);
// Candidate draw from normal distribution
theta_can_ic = sd_ic*rndn(1, 1);
// Calculate acceptance probability
acc_prob_ic = minc(exp(.5*(abs(theta_draw_ic[i-1, 1]) - abs(theta_can_ic) +
(theta_can_ic/sd_ic)^2 - (theta_draw_ic[i-1, 1]/sd_ic)^2))|1);
/*
** Accept candidate draw with probability
** theta_can_ic
*/
if acc_prob_ic > rndu(1, 1);
theta_draw_ic[i, 1] = theta_can_ic;
count_ic = count_ic + 1;
else;
theta_draw_ic[i, 1] = theta_draw_ic[i-1, 1];
endif;
// Discard r0 burn-in draws
if i>burn_in;
/*
** This keeps a running sum of
** of theta. It can be used to
** find the mean of theta.
*/
theta_mean_ic = theta_mean_ic + theta_draw_ic[i, 1];
/*
** This keeps a running sum of
** of square of theta. It will
** be used to find the standard
** deviation of theta.
*/
th2_mo_ic = th2_mo_ic + theta_draw_ic[i, 1].^2;
endif;
endfor;
// Set random seed for repeatable random numbers
rndseed 10385;
/*
** Standard deviation for the
** RW M-H algorithm
*/
sd_rw = 4;
// Initialize MH sums to zero
theta_mean_rw = 0;
th2_mo_rw = 0;
theta_draw_rw = zeros(keep+burn_in, 1);
// Specify starting value for random walk chain theta draw
theta_draw_rw[1, 1] = 1;
// Set count of the number of accepted draws
count_rw = 0;
for i(2, burn_in+keep, 1);
// Draw a candidate for random walk chain
theta_can_rw = theta_draw_rw[i-1, 1] + sd_rw*rndn(1, 1);
// Calculate acceptance probability
acc_prob_rw = minc(exp(.5*(abs(theta_draw_rw[i-1, 1]) - abs(theta_can_rw)))|1);
// Accept candidate draw with probability acc_prob_rw
if acc_prob_rw > rndu(1, 1);
theta_draw_rw[i, 1] = theta_can_rw;
count_rw = count_rw+1;
else;
theta_draw_rw[i, 1] = theta_draw_rw[i-1, 1];
endif;
//Discard r0 burn-in draws
if i>burn_in;
/*
** This keeps a running sum of
** of theta. It can be used to
** find the mean of theta.
*/
theta_mean_rw = theta_mean_rw + theta_draw_rw[i,1];
/*
** This keeps a running sum of
** of square of theta. It will
** be used to find the standard
** deviation of theta.
*/
th2_mo_rw = th2_mo_rw + theta_draw_rw[i, 1].^2;
endif;
endfor;
print "Number of burn in replications";
burn_in;
print "Number of included replications";
keep;
print "Proportion of accepted candidate draws: Independence chain M-H";
count_ic/(keep+burn_in);
print "Proportion of accepted candidate draws: Random walk chain M-H";
count_rw/(keep+burn_in);
// Calculate mean using stored sum of theta
theta_mean = (theta_mean_ic~theta_mean_rw)./keep;
// Calculate standard deviation
th2_mo = (th2_mo_ic~th2_mo_rw)/keep;
thvar = th2_mo - theta_mean.^2;
print "Posterior Mean and Variance, Ind Chain then RW Chain";
theta_mean;
thvar;