Thursday, July 3, 2008

Why the long tail isn't as long as expected

Several bloggers and authors are finding out (belatedly relative to people involved in the industry) that the economic returns in systems that are supposedly governed by long-tail distributions are unexpectedly concentrated in the highly popular items. See here for a blogger's commentary on this article.

In practice, the long tail model does predict certain kinds of consumption very well. I speak from experience analyzing view data at Veoh where, except for the very few top titles, a unit power law described the number of views pretty well. This means that the total number of views coming from sparsely watched videos is surprisingly large, adding to the woes of anybody trying to review submissions.

Economically speaking, though, you have to factor in a few kinds of friction. These are reasonably modeled as per unit sale costs and per stock unit costs. The per unit sale costs don't affect the distribution of revenue and profit, but the per stock unit costs definitely do. If you draw the classic Zipf curve and assume that revenue is proportional to consumption for all items, then a per stock unit cost shifts the curve down by a constant amount. The resulting profit curve tells you pretty much immediately how things will fall out.

The graph to the right illustrates. The solid line is the classic long-tail distribution and the dashed lines represent different per stock unit costs. Wherever the solid line is above the dashed line, the model implies positive returns; where it is below, it implies loss. Clearly, the lower the stock unit costs, the deeper into the distribution you can go and still make money.
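The same picture can be sketched in a few lines of Python. This is only an illustration of the model described above, not the code behind the graph: the names `zipf_views`, `profit` and `break_even_rank`, the scale of one million views for the top item, and the revenue of 1.0 per view are all assumptions made for the example.

```python
def zipf_views(rank, scale=1_000_000.0):
    """Views for the item at a 1-based rank under a unit power law (assumed scale)."""
    return scale / rank


def profit(rank, stock_cost, revenue_per_view=1.0, scale=1_000_000.0):
    """Revenue proportional to views, minus a flat cost for stocking the item."""
    return revenue_per_view * zipf_views(rank, scale) - stock_cost


def break_even_rank(stock_cost, revenue_per_view=1.0, scale=1_000_000.0):
    """Deepest rank whose profit is still non-negative."""
    return int(revenue_per_view * scale // stock_cost)


if __name__ == "__main__":
    # Hypothetical per stock unit costs, highest first.
    for cost in (10_000.0, 1_000.0, 100.0):
        print(cost, break_even_rank(cost))
```

Cutting the per stock unit cost by a factor of ten pushes the break-even rank ten times deeper into the distribution, which is the crossing point of the solid and dashed lines.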

If we assume that you will only be stocking items that have non-negative return, then the percentage of profit that comes from a particular part of the distribution will vary with the threshold. For high thresholds, almost all of the profit will be from mega-hits but for very low thresholds, a much larger percentage will come from low consumption items.

This sort of situation is shown in this second graph which shows the percentage of total profit achieved for different thresholds and ranks. The concentration of profit in the low rank items for high thresholds is quite clear.
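A rough way to reproduce this kind of calculation is sketched below. It assumes the same unit power law and flat per stock unit cost as above; the function name `profit_share_from_top` and the particular scale and cost values are hypothetical choices for the example, not the figures used in the graph.

```python
def profit_share_from_top(top_k, stock_cost, scale=1_000_000.0):
    """Fraction of total profit earned by the top_k items.

    Only items with non-negative profit are stocked, so the catalog
    is cut off at the break-even rank for the given stock cost.
    """
    last_rank = int(scale // stock_cost)  # deepest rank worth stocking
    profits = [scale / r - stock_cost for r in range(1, last_rank + 1)]
    total = sum(profits)
    if total <= 0:
        return 0.0  # nothing is profitable at this cost
    return sum(profits[:top_k]) / total


if __name__ == "__main__":
    # Assumed thresholds: a high stock cost and a low one.
    for cost in (10_000.0, 10.0):
        share = profit_share_from_top(10, cost)
        print(f"stock cost {cost}: top 10 items earn {share:.0%} of profit")
```

With the high cost, the top ten items account for most of the profit; with the low cost, their share drops substantially because so many more items clear the threshold.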

A good example of a per stock unit cost is given in p2p download schemes. The first few downloads of each item have to be paid for by the original source, while additional downloads make use of sunk network resources that impose minimal marginal cost on the source. For systems like BitTorrent, you need more than a hundred downloads before you get to high p2p efficiency, which makes the system useful for mainstream content and less useful for body and tail content. The Veoh p2p system has a much lower threshold and is thus useful far down into the tail. Either system, though, leads to an economic return different from the theoretical zero-cost long-tail model.
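To get a feel for how much the threshold matters, here is a deliberately crude sketch. It assumes the source pays for every download of an item up to some threshold and nothing after that, which is a simplification of how real p2p systems behave. The function name `source_paid_fraction`, the catalog size, and the thresholds of 100 and 10 are illustrative assumptions, not measurements of BitTorrent or Veoh.

```python
def source_paid_fraction(seed_threshold, catalog_size=100_000, scale=1_000_000.0):
    """Fraction of all downloads that the original source has to pay for.

    Downloads per item follow a unit power law; the source is assumed to
    serve the first seed_threshold downloads of each item itself.
    """
    downloads = [scale / r for r in range(1, catalog_size + 1)]
    paid = sum(min(d, seed_threshold) for d in downloads)
    return paid / sum(downloads)


if __name__ == "__main__":
    # Hypothetical thresholds: one high, one much lower.
    for threshold in (100, 10):
        frac = source_paid_fraction(threshold)
        print(f"threshold {threshold}: source pays for {frac:.1%} of downloads")
```

Under these assumptions the lower threshold leaves the source paying for a much smaller share of traffic, and most of what it does pay for sits in the body and tail of the catalog.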

None of these observations is earth-shaking and, as far as I know, they are all common wisdom among people trying to make money out of the long tail. Why this is news to the Harvard Business Review is the real mystery to me.
