Bayesian methods in ranking

2013-06-25 01:37:10

Many websites use a ranking system to determine the display order of content, especially for Q&A or news websites like Quora, Reddit, zhihu, that for a post/question, there might be multiple comments/answers, and the users could use the upvote or downvote buton to gradually change the displaying order. From the simplest perspective, the number of upvote and down vote should decide the ranking of the response/answer. (There may be some extra factors needed to consider for ranking though.)

In this case, a Bayesian approach could be useful. Let us assume that the ranking is purely decided by the number of upvotes, while the number of upvotes is a Binomial distribution with n total users and voting probability p.

Now the question is: how to decide p appropriately? The Bayesian answer would be to calculate the conjugate prior probability and then update with data.

Before going further, let us take a look at the conjugate prior distribution diagram: From this we know that the conjugate prior probability of p follows Beta distribution, \$\mbox{Beta}(a,b)\$, which pdf is proportional to the \$p^{(a-1)}\$ and \$(1-p)^{(b-1)}\$. And to update p with data we calculate the posterior probability: