Published: 2016-12-31
From the point of view of a data scientist there is a good deal of coin flipping taking place on the Internet. By this I mean sites where visitors can register a like or dislike, thumbs up or thumbs down, and so on. For purposes of statistical analysis binary responses like these can be thought of as instances of coin flipping.
Coin flipping on the Internet should not be analyzed in the same way we analyze coin flipping in a formal experiment. For our purposes a formal experiment proceeds through three stages: the experiment is designed (including fixing the number of trials in advance), the data are collected, and the results are analyzed.
The analysis of a formal experiment usually assumes that the trials are independent. On the Internet the opposite holds true. When a visitor clicks a thumbs up or thumbs down, they typically act while viewing the tally of prior responses. This may well affect their own response. (There are a few web sites that collect responses before revealing the average response, but they are the exception.)
In addition, on the Internet, the number of trials is not under the control of the data scientist. Some products, services, or opinions may collect few or no responses, while apparently similar items collect hundreds. Does this mean that the items are different, or is herding at work?
Imagine two identical, empty restaurants located side by side. Assume that a couple going for dinner pick the restaurant to the right purely by chance. The next person or group to come along is likely to view the restaurant on the right as better, because it has patrons, and to patronize it themselves. Soon, the restaurant on the right will be bustling while the restaurant on the left is shunned.
Following the intuition gained from the restaurant scenario, I contend that to model coin flips on the Internet we need to model two related phenomena: which item a visitor chooses to respond to, and whether that response is a thumbs up or a thumbs down. In both cases I assume the visitor arrives with no predisposition and is guided by the responses already on display.
A simple model can be applied to coin flips on the web. The model has two stages. In the first stage, an arriving visitor chooses which of the two items to respond to; the probability that they respond to Item 1 is estimated as
`p_1 = (y_1 + 1) / (n_1 + 2)`
where `y_1` is the observed number of heads and `n_1` is the sample size. (See "An Introduction to Categorical Data Analysis, 2nd Edition" by Alan Agresti, page 17.) In the present case `n_1` is the total number of prior responses and `y_1` is the number of responses to Item 1. The visitor then responds to Item 1 with probability `p_1`; if they do not respond to Item 1 then they respond to Item 2.

There will, of course, be many visitors who do not respond to either item. For our purposes they are invisible. A deeper model would account for the non-responders, because a web site with a high percentage of responders among its visitors can be viewed as more engaging than one with a low percentage of responders.
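As code, the smoothed estimate might look like the following (a minimal sketch in Python; `laplace_estimate` is my name for it, not anything taken from the project's source):

```python
def laplace_estimate(successes, trials):
    """The smoothed estimate used in the text: (y + 1) / (n + 2).

    With no data it returns (0 + 1) / (0 + 2) = 0.5, and it approaches the
    raw proportion successes / trials as the counts grow.
    """
    return (successes + 1) / (trials + 2)
```

For the first stage, `successes` is the number of responses to Item 1 and `trials` is the total number of prior responses.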
In the second stage, the visitor decides whether to give the chosen item an up-vote or a down-vote; the probability of an up-vote is estimated as

`p_2 = (y_2 + 1) / (n_2 + 2)`
where `y_2` is the observed number of up-votes and `n_2` is the total number of votes for the item (that is, the sum of the up-votes and down-votes). The visitor then up-votes with probability `p_2`; if they do not provide an up-vote then they provide a down-vote.

Notice that for the first visitor to the website (before any responses have been recorded)
`p_1 = 1/2, p_2 = 1/2`
This makes intuitive sense, given our assumption that visitors arrive with no predispositions.
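Putting the two stages together, a single visitor's behavior could be sketched as follows (again a sketch under my own naming; `one_visitor` and its arguments are not taken from the project):

```python
import random

def one_visitor(ups, downs):
    """Simulate one visitor's response, given the votes recorded so far.

    ups[i] and downs[i] are the up- and down-votes for item i (0 or 1).
    Returns the index of the chosen item and True for an up-vote.
    """
    # Stage 1: choose an item, weighted by how many responses each has drawn.
    y1 = ups[0] + downs[0]                 # responses to Item 1 so far
    n1 = y1 + ups[1] + downs[1]            # total responses so far
    p1 = (y1 + 1) / (n1 + 2)               # 1/2 for the very first visitor
    item = 0 if random.random() < p1 else 1

    # Stage 2: vote the chosen item up or down, weighted by its own record.
    y2 = ups[item]                         # up-votes for the chosen item
    n2 = ups[item] + downs[item]           # total votes for the chosen item
    p2 = (y2 + 1) / (n2 + 2)               # also 1/2 if the item has no votes
    return item, random.random() < p2
```

Calling `one_visitor([0, 0], [0, 0])` reproduces the first-visitor case: both probabilities are exactly 1/2.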
We can simulate visitors arriving at a web page and responding, either to Item 1 or Item 2 and either thumbs up or thumbs down, as a dynamically updating decision tree coupled with a dynamically updating set of probabilities. The probabilities are updated according to the Bayesian inference equations exhibited above.
Please try pressing the various Visit buttons, and the Clear button, a few times.
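A non-interactive version of the same process can be sketched in a few lines of Python (the structure and names are mine, not the project's):

```python
import random

def laplace_estimate(successes, trials):
    """Repeated from the sketch above so this block runs on its own."""
    return (successes + 1) / (trials + 2)

def simulate(num_visitors, seed=None):
    """Simulate visitors responding to one of two items with a thumbs up or
    a thumbs down, every choice weighted by the responses already on record."""
    rng = random.Random(seed)
    ups, downs = [0, 0], [0, 0]
    for _ in range(num_visitors):
        # Stage 1: which item draws this visitor's response.
        p1 = laplace_estimate(ups[0] + downs[0], sum(ups) + sum(downs))
        item = 0 if rng.random() < p1 else 1
        # Stage 2: thumbs up or thumbs down for that item.
        p2 = laplace_estimate(ups[item], ups[item] + downs[item])
        if rng.random() < p2:
            ups[item] += 1
        else:
            downs[item] += 1
    return ups, downs

if __name__ == "__main__":
    ups, downs = simulate(1000)
    total = sum(ups) + sum(downs)
    print("estimated P(respond to Item 1):",
          laplace_estimate(ups[0] + downs[0], total))
    for i in (0, 1):
        print(f"estimated P(thumbs up | Item {i + 1}):",
              laplace_estimate(ups[i], ups[i] + downs[i]))
```

Because every step is random, each run ends with a different set of counts and estimates.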
The simulation above is random, so what you observe is going to be different, in detail, from what I have observed, but we are likely to observe the same general pattern. The pattern I have observed after running the simulation a few times is that one or more of the probabilities end up far from 0.5. This might suggest to the naive observer a population with strong proclivities regarding the choice of items and/or the choice of thumbs up or thumbs down. In fact, in the present simulation, the apparent proclivities arise entirely as the result of herd following similar to the restaurant situation described earlier.
I am not saying that all ratings on the Internet are the result of herd following, only that it may play a role. In other words, people's "preferences" as recorded on the Internet may only partly reflect the opinions and desires they bring to the party.
The simulation also suggests that Bayesian inference, used incautiously, could lead to spurious conclusions: the estimates settle down as responses accumulate, even though what they settle on depends largely on the accidents of the earliest responses.
You can view the source code for the present project on GitHub.