Accounting Predictions

In my Prediction Accountability post, I ranted about how no one really knows whether predictions are accurate, and ended by saying it does not matter, because no one is going to stop using these services just because they are usually wrong. Basically, I thought it futile to even try. In retrospect, that is probably the perfect reason to do it.

So I came up with a scoring system:

    • Good Recommendation = 3 points
    • Not interested = -1 point
    • Wishlist/Queue = -2 points
    • Dislike = -3 points

Would you score these differently? Why?

My reasoning goes something like this. Something I agree I should watch should earn the opposite number of points from something I know from previous experience I will dislike. Anything I am not really interested in definitely is not a win, so it should be a negative, but not too close to a dislike. Suggesting something already on that company’s records as something I am interested in wastes my time, because they already know I am interested in it, so it loses two points.

First pass: Amazon sent me an email today saying,

Are you looking for something in our <x> department? If so, you might be interested in these items.

One item I have thought I should watch based on TV ads but have not put on my wishlist yet, so I agree with Amazon that I might be interested in it. It gets three points. (3) Five items were already in my wishlist, so that is negative two points each. (3 - 10 = -7) One item is the 6th season of a television series of which I have only seen part of the first season, and I have not gotten around to completing even that, so not interested and negative one point. (-7 - 1 = -8) Another item is the 3rd season of a TV series where I have not watched even the first yet. If the recommendation had been the first season, I would count it as a good one, so instead I’ll award one point, halfway between good and not interested. (-8 + 1 = -7) Out of eight items in the email, the score is -7. That is just one email. I will track this for a couple of months and see where it goes, and do the same for Netflix.
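For anyone who wants to keep the same tally, here is a minimal Python sketch of the scoring above. The category names and the `score_email` helper are my own labels for illustration, and the "wrong_season" value of 1 is the ad hoc halfway score from this one email, not part of the original four-point scale.

```python
from collections import Counter

# Point values from the scoring system in this post.
POINTS = {
    "good": 3,
    "not_interested": -1,
    "wishlist": -2,
    "dislike": -3,
    # Ad hoc: right series, wrong season. Halfway between good (3)
    # and not interested (-1).
    "wrong_season": 1,
}


def score_email(counts):
    """Return the total score for one email.

    `counts` maps a category name to the number of items in that category.
    """
    return sum(POINTS[category] * n for category, n in counts.items())


# The Amazon email described above: eight items in total.
amazon_email = Counter(good=1, wishlist=5, not_interested=1, wrong_season=1)

total = score_email(amazon_email)
print(f"{sum(amazon_email.values())} items, score {total}")
# prints "8 items, score -7"
```

Keeping one `Counter` per email makes it easy to add them together over a couple of months and score the running total the same way.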

I think this exercise points out the possibility that these “predictions” are basically just nudges to buy something.

If your Learning Management System vendor claimed a 90% plus correct prediction rate for whether students will fail a class, how would you assess it? The obvious start would be to track the predictions for classes but not provide the predictions to instructors, then compare the predictions to actual results. Of course, these things are designed around looking at past results. What is the statement investment companies have to put in so they do not get sued for fraud? Oh, right, “Past performance is no guarantee of future results.” So I would not rely too much on just historical data. I would want a real-world test that the system is working accurately.
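The comparison itself could be as simple as the sketch below. This is an assumption about how one might do it, not anything a vendor provides: the `assess` function is hypothetical and the ten students are made-up data. It reports accuracy alongside precision and recall for the "will fail" flag, because when most students pass, a system that never flags anyone can still post a high accuracy number.

```python
def assess(predicted_fail, actually_failed):
    """Compare withheld fail predictions with end-of-term outcomes.

    Both arguments are equal-length lists of booleans, one per student.
    """
    if len(predicted_fail) != len(actually_failed):
        raise ValueError("need one prediction and one outcome per student")
    pairs = list(zip(predicted_fail, actually_failed))
    correct = sum(p == a for p, a in pairs)
    true_pos = sum(p and a for p, a in pairs)  # flagged and really failed
    flagged = sum(predicted_fail)
    failed = sum(actually_failed)
    return {
        "accuracy": correct / len(pairs),
        # Of the students flagged, how many actually failed?
        "precision": true_pos / flagged if flagged else None,
        # Of the students who failed, how many were flagged?
        "recall": true_pos / failed if failed else None,
    }


# Made-up data for ten students; True means "fail".
predicted = [True, False, False, False, False, False, False, False, True, False]
actual = [True, False, False, False, False, False, False, True, False, False]

result = assess(predicted, actual)
baseline = assess([False] * len(actual), actual)  # never flag anyone
print(result)
print(baseline)
```

In this toy data the system and the never-flag baseline both score 80% accuracy, yet the baseline catches none of the students who failed, which is why a single "percent correct" claim deserves a closer look.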

Prediction Accountability

The technology buzzword standard for prediction appears to be Netflix and Amazon. Everyone wants to get to where they make recommendations customers will buy. But are these predictions any good?

Out of the slew of emails you get from Amazon, what percentage do you actually buy from? How many do you sneer at and delete in disgust that they could get it that wrong? For me, the latter is more common than the former. Certainly it is not from a lack of data: I buy more from that site than I do from all brick-and-mortar stores combined, excepting groceries. (And that makes me re-think how I buy groceries.) Maybe Amazon has too much misleading data mixed in with the correct data. I look at things I have no interest in buying, such as when someone mentions having problems with a product. Though I have to question Amazon recommending I buy the camera I bought from them a couple of months prior.

Netflix really is not any better. Their top 10 recommendations change weekly for me. In my current top 10, one was already rated 5 stars. Another four were already in my queue. The remaining five predicted I would like them between about 3.0 and 3.3 stars. That is out of five. There are 27 items in my queue with higher predictions than these.

Before I start tracking these predictions to gauge their effectiveness, do I even really care? Am I going to stop consuming from companies that overstate their claims? Or should I close my ears when clueless people spout the prediction buzzword? Not really. No. Guess that is what I am left doing.

I think the standard comes not from them being any good. Instead, decision makers are aware of them, so they understand wanting to emulate them.