Rants, Raves, and Rhetoric v4

Percentages

This is an interesting opinion piece. I kind of think of the Bank of America commercial where the CTO or CIO says their goal is not to get right almost every time but to get it right once and replicate it every time.

Wired News: Why Data Mining Won’t Stop Terror

Let’s look at some numbers. We’ll be optimistic — we’ll assume the system has a one in 100 false-positive rate (99 percent accurate), and a one in 1,000 false-negative rate (99.9 percent accurate). Assume 1 trillion possible indicators to sift through: that’s about 10 events — e-mails, phone calls, purchases, web destinations, whatever — per person in the United States per day. Also assume that 10 of them are actually terrorists plotting.

This unrealistically accurate system will generate 1 billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999 percent and you’re still chasing 2,750 false alarms per day — but that will inevitably raise your false negatives, and you’re going to miss some of those 10 real plots.

This is exactly the sort of thing we saw with the NSA’s eavesdropping program: the New York Times reported that the computers spat out thousands of tips per month. Every one of them turned out to be a false alarm.

Finding terrorism plots is not a problem that lends itself to data mining. It’s a needle-in-a-haystack problem, and throwing more hay on the pile doesn’t make that problem any easier. We’d be far better off putting people in charge of investigating potential plots and letting them direct the computers, instead of putting the computers in charge and letting them decide who should be investigated.