Comments on Surprise and Coincidence - musings from the long tail: Learning to Rank, in a Very Bayesian Way

@Terry, Yes. Many communities do choose to hear ...

2013-04-28T15:53:25.618-07:00

@Terry,

Yes. Many communities do choose to hear only from people that they agree with. It is hardly limited to people who downvote Republicans or religionists since Republicans and religionists do the same thing.

I would not demean teenage girls by assuming that they are the archetypal example of this behavior. They do exhibit this behavior, as do pretty much all humans.

None of this changes the mathematics involved in trying to predict down-voting behavior and that is all that is addressed in my blog. Rectifying all of human misbehavior is not the goal of the mathematics I do.

2013-04-28T14:35:16.026-07:00

This comment has been removed by a blog administrator.

James, In the implementation I give, the order is...

2013-04-26T08:24:45.986-07:00

James,

In the implementation I give, the order is heavily randomized. That should mean that context effects are randomized as well.
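A minimal sketch of how that randomization can arise, assuming each comment's score is a single draw from a beta posterior over its upvote rate (Thompson sampling). The function name `sample_ranking`, the uniform prior, and the example vote counts are illustrative assumptions, not the code from the post.

```python
import random


def sample_ranking(votes, prior=(1.0, 1.0), rng=random):
    """Return comment ids ordered by one random draw per comment.

    votes maps a comment id to (upvotes, downvotes). Each comment's score is
    drawn from Beta(prior_a + upvotes, prior_b + downvotes), so comments with
    few votes move around a lot while heavily voted ones stay put.
    """
    prior_a, prior_b = prior
    scores = {
        cid: rng.betavariate(prior_a + up, prior_b + down)
        for cid, (up, down) in votes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical vote counts, for illustration only.
    votes = {"well-liked": (40, 2), "disliked": (3, 30), "brand-new": (0, 0)}
    for _ in range(5):
        print(sample_ranking(votes))
```

Because every page view draws a fresh ordering, a comment with few votes lands in many different positions and next to many different neighbours.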

If the process preserves order and merely changes visibility, then what you say is fair, but I think that it won't matter in practice, since the first comment of a duplicative high-quality pair will tend to dominate the second if only because it will get more training by appearing first. If the second appears very soon and is significantly better, then the early training of the first can be overridden.

Nicely done, the analysis makes for a pretty compe...

2013-04-26T00:43:14.026-07:00

Nicely done, the analysis makes for a pretty compelling case altogether.

Although I'm certainly an advocate of this approach over any existing implementations, I do have one small word of caution, though it's more of a theoretical issue. There is a very subtle assumption being made here, namely that the vote percentage of each comment is independent of its position and of the comments surrounding it, which, of course, is not true in practice.

In this context, this is best illustrated by considering the case where two people post roughly the same joke (something that happens a lot on reddit, hah). The comment that appears second in the ordering shown to the user will obviously suffer more downvotes. In this situation, this would likely be addressed by one of the comments slowly losing out over time. However, there is certainly a nonzero chance that this approach will converge to some suboptimal ordering.

While this final ordering is probably going to be pretty close to the best one in practice, in any theoretical analysis, this fact directly implies the algorithm must have O(n) regret.

We could, though, find the optimal ordering in O(log n) time by simply making each "arm" define an entire ordering (the payoff would be the total ratio of up/down votes for that sample). Of course, this would create K! arms, which is a rather large constant multiplier on that log term.
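To get a feel for how large that factorial multiplier is, here is a small sketch that treats every full ordering as one arm. The helper name `ordering_arms` and the example comment ids are assumptions made for illustration.

```python
import itertools
import math


def ordering_arms(comment_ids):
    """Enumerate every full ordering of the comments; each one is an arm."""
    return list(itertools.permutations(comment_ids))


if __name__ == "__main__":
    arms = ordering_arms(["a", "b", "c"])
    print(len(arms), "arms for 3 comments")
    # The number of arms is k! for k comments, which grows very quickly.
    for k in (5, 10, 20):
        print(k, "comments:", math.factorial(k), "arms")
```

Enumerating the arms is only practical for a handful of comments; for anything larger, the count alone shows why this stays a theoretical device.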

Yes. Decaying feedback is very easy to implement,...

2013-04-25T12:02:51.346-07:00

Yes. Decaying feedback is very easy to implement, though a bit dangerous as well since eventually everything can decay to complete ignorance (which you probably don't want).

With the beta distribution, there are two parameters which can be equated with the number of positive and the number of negative ratings. If these decay with time, the system reverts to its prior.
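A minimal sketch of that reversion, assuming exponential decay with a fixed half-life and a Beta(1, 1) prior. The class name `DecayingBeta` and its parameters are hypothetical, and time is assumed never to run backwards.

```python
import math


class DecayingBeta:
    """Beta posterior whose vote counts decay exponentially with time."""

    def __init__(self, half_life, prior=(1.0, 1.0)):
        self.rate = math.log(2.0) / half_life
        self.prior = prior
        self.up = 0.0      # decayed count of positive ratings
        self.down = 0.0    # decayed count of negative ratings
        self.t = 0.0       # time of the last decay step

    def _decay_to(self, now):
        # Shrink both counts by the factor implied by the elapsed time.
        elapsed = max(0.0, now - self.t)
        factor = math.exp(-self.rate * elapsed)
        self.up *= factor
        self.down *= factor
        self.t = max(self.t, now)

    def vote(self, now, positive):
        self._decay_to(now)
        if positive:
            self.up += 1.0
        else:
            self.down += 1.0

    def params(self, now):
        """Beta parameters (alpha, beta) as of time `now`."""
        self._decay_to(now)
        return self.prior[0] + self.up, self.prior[1] + self.down


if __name__ == "__main__":
    item = DecayingBeta(half_life=5.0)
    for _ in range(10):
        item.vote(0.0, positive=True)
    for now in (0.0, 5.0, 50.0, 500.0):
        print(now, item.params(now))
```

As the elapsed time grows, the printed parameters drift back toward the prior, which is the "complete ignorance" state mentioned above.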

My own feeling is that the preferable way to do this decay is to use a mixture of very long-term and medium- to short-term decays. That way the system won't revert to complete ignorance, but will revert to a less emphatic state.
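One possible reading of such a mixture, sketched under assumptions: each vote's weight is split between a short and a very long half-life, so the short part fades quickly while the long part lingers. The function name `mixed_decay_counts`, the half-lives, and the 50/50 split are illustrative choices only.

```python
def mixed_decay_counts(votes, now, half_lives=(1.0, 1000.0), weights=(0.5, 0.5)):
    """Return decayed (up, down) counts under a mixture of half-lives.

    votes is an iterable of (timestamp, is_positive) pairs. Each vote
    contributes a weight that is a weighted sum of exponential decays,
    one per half-life.
    """
    up = 0.0
    down = 0.0
    for timestamp, is_positive in votes:
        age = max(0.0, now - timestamp)
        weight = sum(w * 0.5 ** (age / hl) for w, hl in zip(weights, half_lives))
        if is_positive:
            up += weight
        else:
            down += weight
    return up, down


if __name__ == "__main__":
    votes = [(0.0, True)] * 10
    for now in (0.0, 5.0, 20.0, 200.0):
        print(now, mixed_decay_counts(votes, now))
```

With these example settings, the ten upvotes fall to roughly half their weight within a few time units and then fade only slowly, so the estimate becomes less emphatic without collapsing to the prior.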

It is also possible to restart new bandits at intervals and use a meta-bandit to decide which bandit is better, the brand-new naive one or the wise old one. The meta-bandit can have a simple forgetting strategy. This effectively implements change-point detection.
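A rough sketch of the restart idea, under several assumptions that are not spelled out above: rewards are 0 or 1, both bandits learn from every observation, the meta-bandit forgets by shrinking its counts by a constant factor, and at each restart the bandit the meta-bandit currently favours is kept as the "old" one. All class and parameter names are hypothetical.

```python
import random


class BetaArm:
    """Beta posterior over one success rate, with optional forgetting."""

    def __init__(self):
        self.up = 0.0
        self.down = 0.0

    def sample(self, rng):
        return rng.betavariate(1.0 + self.up, 1.0 + self.down)

    def mean(self):
        return (1.0 + self.up) / (2.0 + self.up + self.down)

    def update(self, reward, forget=1.0):
        # forget < 1 shrinks old evidence before adding the new observation.
        self.up = self.up * forget + reward
        self.down = self.down * forget + (1.0 - reward)


class Bandit:
    """Plain Thompson-sampling bandit over n_arms options."""

    def __init__(self, n_arms):
        self.arms = [BetaArm() for _ in range(n_arms)]

    def choose(self, rng):
        draws = [arm.sample(rng) for arm in self.arms]
        return draws.index(max(draws))

    def update(self, arm, reward):
        self.arms[arm].update(reward)


class RestartingBandit:
    """Old and new bandits, arbitrated by a forgetful meta-bandit."""

    def __init__(self, n_arms, restart_every=200, forget=0.95, rng=None):
        self.n_arms = n_arms
        self.restart_every = restart_every
        self.forget = forget
        self.rng = rng or random.Random()
        self.old = Bandit(n_arms)
        self.new = Bandit(n_arms)
        self.meta = [BetaArm(), BetaArm()]  # index 0: old, index 1: new
        self.steps = 0

    def choose(self):
        # The meta-bandit picks which bandit to trust for this round.
        draws = [m.sample(self.rng) for m in self.meta]
        which = draws.index(max(draws))
        bandit = self.new if which == 1 else self.old
        return which, bandit.choose(self.rng)

    def update(self, which, arm, reward):
        self.meta[which].update(reward, self.forget)
        self.old.update(arm, reward)
        self.new.update(arm, reward)
        self.steps += 1
        if self.steps % self.restart_every == 0:
            # Assumed policy: keep whichever bandit the meta-bandit favours,
            # then start a fresh naive bandit and a fresh meta-bandit.
            if self.meta[1].mean() > self.meta[0].mean():
                self.old = self.new
            self.new = Bandit(self.n_arms)
            self.meta = [BetaArm(), BetaArm()]
```

In use, call `choose()` to get `(which, arm)`, show the arm, and pass the observed 0/1 reward back through `update(which, arm, reward)`. If the world changes, the fresh bandit tends to earn better rewards than the old one, and the meta-bandit shifts toward it.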

Your idea to use this sort of system for app rating is an excellent one.

Can this be modified so the weight of votes decays...

2013-04-25T11:41:23.546-07:00

Can this be modified so the weight of votes decays over time? While comment quality isn't going to change over time (and all your votes are going to happen in a short period of time), this isn't true for voting on things like apps. In the case of apps, you'd like to give an app that has a poor score a chance to re-assert itself if it hasn't received a vote for a while. Similarly, a highly scored app should be able to drop if all the recent votes are negative (even if they have been positive for a long time).