10 November 2008

Cool Plot :: Gibbs Sampling and a (mostly) independant Bayes Network

Here's a cool plot that came up while I was doing a homework assignment. It is the result of the Gibbs Sampling algorithm approximating a 50-50% variable over a billion iterations on a probability space of dimensionality close to 2^25.

Imagine you have a random coin toss which we'll call the leader. Now imagine you have one or more followers who choose to agree with the leader x% of the time, or disagree otherwise. Suppose that you can't "see" what the leader chose, but only what the followers have chosen. Determine the probability distribution of the leader's coin. We'll call this the leader-follower problem.

In this particular instance of the problem, the leader's coin is unbiased (50-50%), followers agree with 90% probability, and we have 25 followers. Each of the followers depends only on the leader and not on each other, and so each follower represents a nearly independent dimension. I claim that the probability space P(Leader|Followers) is nearly 2^25 in size.

When we put a Gibbs Sampler to the task, it's performance kind of sucks. Effectively, it is running a Monte-Carlo simulation of the scenario, and attempting to count the probability distributions over the network. But, since the space is so large, it cannot effectively explore the whole space, and so there is a lot of error.

What I find so intriguing about this plot is its smooth upward or downward trends, with sharp turns.

You can think of any trend away from the solution (50-50%) as a situation in which more than half of the followers disagree with the leader, and any trend towards it as a situation in which more than half agree with the leader. Since Gibbs Sampling changes one variable at a time, we effectively have a "momentum" effect; once more than half disagree, each one needs to be independently fixed, on at a time, before we can turn around.

Also observe that the slope of these traces is proportional to the number of followers in disagreement. This is why the slope tends to flatten out as we reach the extrema.

Enjoy.

No comments: