Metropolis-Hastings Sampler

Goals

In this tutorial, we look at the Metropolis-Hastings sampler. This tutorial will cover how to:

Implement the Metropolis-Hastings sampler using two different variations of acceptance probabilities:
- An independence chain
- A random walk chain
Calculate the distribution's mean and variance-covariance matrix.

Introduction

The Metroplis-Hastings sampler is an iterative algorithm which produces a sequence of draws $\theta^{(r)}$ which can be used to estimate sample parameters such that

$$\hat{g} = \frac{\sum_{r=1}^{R} g(\theta^{(r)})}{R}$$

The intuition is relatively simple. The algorithm:

Picks a candidate draw from the specified candidate generating density.
Determines if the draw is accepted or rejected based on a computed acceptance probability.
For each draw $r = 1, ..., R$ sets $\theta^{(r)}$ equal to $\theta^{(r-1)}$, if the draw is rejected.

In this particular tutorial, we will examine a single parameter $\theta$ with the posterior distribution given by

$$ p(\theta|y) \propto exp\bigg(-\frac{1}{2}|(\theta)|\bigg) $$

The Metropolis-Hastings steps

The general Metropolis-Hastings algorithm can be broken down into simple steps:

Set up sampler specifications, including number of iterations and number of burn-ins draws.
Choose a starting value for $\theta$.
Draw $\theta^*$ from the candidate generating density.
Calculate the acceptance probability $\alpha(\theta^{(r-1)}, \theta^*)$.
Set $\theta^{(r)} = \theta^*$ with probability $\alpha(\theta^{(r-1)}, \theta^*)$, or else set $\theta^{(r)} = \theta^{(r-1)}$.
Repeat steps 3-5 for $r = 1, ... , R$.
Average the draws to produce the Metropolis-Hastings estimates for posterior functions of interest (mean, variance, etc).

The independent chain

The candidate generating density

We will first consider an independent chain which uses a normal distribution to generate the candidate draws

$$ N\big(0,\text{ }d^2\big)$$

The acceptance probability

The acceptance probability for this algorithm depends on the previous draw, $\theta^{(r-1)}$, the candidate draw, $\theta^*$, and the performance parameter, $d$ :

$$ \alpha\big(\theta^{(r-1)}, \theta^*\big) = min \bigg[exp\Big[\frac{1}{2}\Big( | \theta^{(r-1)}| - | \theta^*| - \bigg(\frac{\theta^{(r-1)}}{d}\bigg)^2 + \bigg(\frac{\theta^*}{d}\bigg)^2\Big) \Big],1\bigg]$$

Note : In this example we will set $d = 6$. However, in practice convergence diagnostics should be used to help choose $d$ to optimize the sampler.

Metropolis-Hastings settings

Our first step is to set up the specifications of the Metropolis-Hastings algorithm. We will :

Keep 10,000 total draws
Set a burn-in sample of 100 draws.
Set the starting values of $\theta$ to 1.
Set the standard deviation of the candidate draw to 6.

new;

/*
** Set up Metropolis Hastings chain
** Discard r0 burnin replications
*/
burn_in = 100;

// Specify the total number of replications
keep = 10000;

/*
** Standard deviation for the increment 
** in the independent chain M-H algorithm
*/
sd_ic = 6;

// Initialize MH sums to zero
theta_mean_ic = 0;
th2_mo_ic = 0;
theta_draw_ic = zeros(keep+burn_in, 1);

// Specify starting value for independent chain theta draw
theta_draw_ic[1, 1] = 1;

// Set count of the number of accepted draws
count_ic = 0;

Implementing the algorithm

Next, we will implement the Metropolis-Hastings algorithm using a for loop. Within each iteration of our loop we need to :

Draw a candidate $\theta$ using the GAUSS rndn function.
Compute the acceptance probability.
Set $\theta^{(r)} = \theta^*$ with probability $\alpha(\theta^{(r-1)}, \theta^*)$, or else set $\theta^{(r)} = \theta^{(r-1)}$.

// Set random seed for repeatable random numbers
rndseed 97980;

for i(2, burn_in+keep, 1);
    // Front multiply by sd_ic 
    theta_can_ic = sd_ic*rndn(1, 1);

    // Calculate acceptance probability
    acc_prob_ic = minc(exp(.5*(abs(theta_draw_ic[i-1, 1]) - abs(theta_can_ic) + (theta_can_ic/sd_ic)^2 - (theta_draw_ic[i-1, 1]/sd_ic)^2))|1);

    // Accept candidate draw with probability acc_prob_ic
    if acc_prob_ic > rndu(1, 1);
        theta_draw_ic[i, 1] = theta_can_ic;
        count_ic = count_ic + 1;
    else;
        theta_draw_ic[i, 1] = theta_draw_ic[i-1, 1];
    endif;

    /*
    ** Discard r0 burnin draws
    ** Store remainder for computing
    ** sample parameters.
    */
    if i>burn_in;
        /*
        ** This keeps a running sum of
        ** of theta. It can be used to 
        ** find the mean of theta.
        */
        theta_mean_ic = theta_mean_ic + theta_draw_ic[i, 1];  
        /*
        ** This keeps a running sum of
        ** of square of theta. It will 
        ** be used to find the standard
        ** deviation of theta.
        */
        th2_mo_ic = th2_mo_ic + theta_draw_ic[i, 1].^2;
    endif;
endfor;

The random walk chain

As an alternative to the independent chain, let's also consider the random walk chain. The random walk chain Metropolis-Hastings algorithm generates candidate draws from the process

$$\theta^* = \theta^{(r-1)} + z$$

where $z$ is a random variable that follows a normal increment.

The candidate generating density

The iteration candidate, $\theta^*$ is drawn from $N(\theta^{(r-1)},\text{ } c^2)$ where, $c$ is a performance parameter. Note how this differs from the independent chain which is drawn from $N(0,\text{ } c^2)$.

The acceptance probability

In the random walk case, the acceptance probability is given by

$$ \alpha\big(\theta^{(r-1)}, \theta^*\big) = min \bigg[exp\Big[\frac{1}{2}\Big( | \theta^{(r-1)}| - | \theta^*|\Big) \Big],1\bigg]$$

Implementing the algorithm

We will again use a for loop to complete the steps of our Metropolis-Hastings sampler

// Set random seed 
rndseed 10385;

// Standard deviation for RW M-H algorithm
sd_rw = 4;

// Initialize MH variables  
theta_mean_rw = 0;
th2_mo_rw = 0;
theta_draw_rw = zeros(keep+burn_in, 1);

// Specify starting value for theta draw
theta_draw_rw[1, 1] = 1;

// Set count of the number of accepted draws
count_rw = 0;

for i(2, burn_in+keep, 1);
    // Draw a candidate for random walk chain
    theta_can_rw = theta_draw_rw[i, 1] + sd_rw*rndn(1, 1);

    // Calculate acceptance probability
    acc_prob_rw = minc(exp(.5*(abs(theta_draw_rw[i, 1]) - abs(theta_can_rw)))|1);

    // Accept candidate draw with probability acc_prob_rw
    if acc_prob_rw > rndu(1, 1);
        theta_draw_rw[i, 1] = theta_can_rw;
        count_rw = count_rw+1;
    else;
        theta_draw_rw[i, 1] = theta_draw_rw[i-1, 1];
    endif;

    /*
    ** Discard r0 burnin draws
    ** Store remainder for computing
    ** sample parameters.
    */
    if i>burn_in;
        /*
        ** This keeps a running sum of
        ** of theta. It can be used to 
        ** find the mean of theta.
        */
        theta_mean_rw = theta_mean_rw + theta_draw_rw[i,1];  
        /*
        ** This keeps a running sum of
        ** of square of theta. It will 
        ** be used to find the standard
        ** deviation of theta.
        */
        th2_mo_rw = th2_mo_rw + theta_draw_rw[i, 1].^2;
    endif;
endfor;

Note: It is more efficient to program the random and independent chain sampler in the same for loop. This minimizes the amount of looping in our program. However, we compute a separate for loop for the for the sake of demonstration.

Compute sample statistics

Finally, we use our stored data to find our sampler mean and variance.

print "Number of burn in replications";
burn_in;

print "Number of included replications";
keep;

print "Proportion of accepted candidate draws: Independence chain M-H";
count_ic/(keep+burn_in);

print "Proportion of accepted candidate draws: Random walk chain M-H";
count_rw/(keep+burn_in);

theta_mean = (theta_mean_ic~theta_mean_rw)./keep;
th2_mo = (th2_mo_ic~th2_mo_rw)/keep;
thvar = th2_mo - theta_mean.^2;

print "Posterior Mean and Variance, Ind Chain then RW Chain";
theta_mean;

thvar;

The above code should print the following output:

Number of burn in replications
100.0000
Number of included replications
10000.0000
Proportion of accepted candidate draws: Independence chain M-H
  0.48455446
Proportion of accepted candidate draws: Random walk chain M-H
  0.52603960
Posterior Mean and Variance, Ind Chain then RW Chain
  0.038943707      0.075589115
  8.1069691        8.0770516

Conclusion

Congratulations! You have:

Implemented the Metropolis-Hastings sampler using as an independent chain.
Implemented the Metropolis-Hastings sampler using as a random walk chain.
Calculated the distribution's mean and variance-covariance matrix.

For your convenience, the entire code is below.

new;

/*
** Set up Metropolis Hastings chain
** Discard r0 burnin replications
*/
burn_in = 100;

// Specify the total number of replications
keep = 10000;

/*
** Standard deviation for the increment 
** in the independent chain M-H algorithm
*/
sd_ic = 6;

// Initialize MH sums to zero
theta_mean_ic = 0;
th2_mo_ic = 0;
theta_draw_ic = zeros(keep+burn_in, 1);

/* 
** Specify starting value for 
** independent chain theta draw
*/
theta_draw_ic[1, 1] = 1;

// Set count of the number of accepted draws
count_ic = 0;

// Set random seed for repeatable random numbers
rndseed 97980;

for i(2, burn_in+keep, 1);
    // Candidate draw from normal distribution
    theta_can_ic = sd_ic*rndn(1, 1);

    // Calculate acceptance probability
    acc_prob_ic = minc(exp(.5*(abs(theta_draw_ic[i-1, 1]) - abs(theta_can_ic) +
    (theta_can_ic/sd_ic)^2 - (theta_draw_ic[i-1, 1]/sd_ic)^2))|1);

    /*
    ** Accept candidate draw with probability 
    ** theta_can_ic
    */
    if acc_prob_ic > rndu(1, 1);
        theta_draw_ic[i, 1] = theta_can_ic;
        count_ic = count_ic + 1;
    else;
        theta_draw_ic[i, 1] = theta_draw_ic[i-1, 1];
    endif;

    // Discard r0 burn-in draws 
    if i>burn_in;
        /*
        ** This keeps a running sum of
        ** of theta. It can be used to 
        ** find the mean of theta.
        */
        theta_mean_ic = theta_mean_ic + theta_draw_ic[i, 1];  

        /*
        ** This keeps a running sum of
        ** of square of theta. It will 
        ** be used to find the standard
        ** deviation of theta.
        */
        th2_mo_ic = th2_mo_ic + theta_draw_ic[i, 1].^2;
    endif;
endfor;

// Set random seed for repeatable random numbers
rndseed 10385;

/*
** Standard deviation for the 
** RW M-H algorithm
*/
sd_rw = 4;

// Initialize MH sums to zero
theta_mean_rw = 0;
th2_mo_rw = 0;
theta_draw_rw = zeros(keep+burn_in, 1);

// Specify starting value for random walk chain theta draw
theta_draw_rw[1, 1] = 1;

// Set count of the number of accepted draws
count_rw = 0;

for i(2, burn_in+keep, 1);
    // Draw a candidate for random walk chain
    theta_can_rw = theta_draw_rw[i-1, 1] + sd_rw*rndn(1, 1);

    // Calculate acceptance probability
    acc_prob_rw = minc(exp(.5*(abs(theta_draw_rw[i-1, 1]) - abs(theta_can_rw)))|1);

    // Accept candidate draw with probability acc_prob_rw
    if acc_prob_rw > rndu(1, 1);
        theta_draw_rw[i, 1] = theta_can_rw;
        count_rw = count_rw+1;
    else;
        theta_draw_rw[i, 1] = theta_draw_rw[i-1, 1];
    endif;

    //Discard r0 burn-in draws
    if i>burn_in; 
        /*
        ** This keeps a running sum of
        ** of theta. It can be used to 
        ** find the mean of theta.
        */
        theta_mean_rw = theta_mean_rw + theta_draw_rw[i,1];  
        /*
        ** This keeps a running sum of
        ** of square of theta. It will 
        ** be used to find the standard
        ** deviation of theta.
        */
        th2_mo_rw = th2_mo_rw + theta_draw_rw[i, 1].^2;
    endif;
endfor;

print "Number of burn in replications";
burn_in;

print "Number of included replications";
keep;

print "Proportion of accepted candidate draws: Independence chain M-H";
count_ic/(keep+burn_in);

print "Proportion of accepted candidate draws: Random walk chain M-H";
count_rw/(keep+burn_in);

// Calculate mean using stored sum of theta
theta_mean = (theta_mean_ic~theta_mean_rw)./keep;

// Calculate standard deviation
th2_mo = (th2_mo_ic~th2_mo_rw)/keep;
thvar = th2_mo - theta_mean.^2;

print "Posterior Mean and Variance, Ind Chain then RW Chain";
theta_mean;

thvar;

Metropolis-Hastings Sampler

Goals

Introduction

The Metropolis-Hastings steps

The independent chain

The candidate generating density

The acceptance probability

Metropolis-Hastings settings

Implementing the algorithm

The random walk chain

The candidate generating density

The acceptance probability

Implementing the algorithm

Compute sample statistics

Conclusion

Have a Specific Question?

Need Support?