Dan Gardner is the New York Times best-selling author of Risk, Future Babble, Superforecasting (co-authored with Philip E. Tetlock), and How Big Things Get Done (co-authored with Bent Flyvbjerg). His books have been published in 26 countries and 20 languages. Prior to becoming an author, Gardner was an award-winning investigative journalist.

Big Data and Big Brother

In the 1921 novel We, Yevgeny Zamyatin imagined a future where every building is made of glass so the authorities can see what citizens are doing at all times. Is that the world Big Data will construct? Some pessimists worry that it could. I worry that the pessimists are too optimistic.

If you're scratching your head, "Big Data" is a term that describes the accumulation and analysis of information. Lots of information. Oceans of information.

Every time someone clicks on something at Amazon, it's recorded and another drop is added to the ocean. Every time a scanner beeps at the Loblaws checkout. Every time a home electricity meter reports a reading. Every time a parcel passes a FedEx checkpoint. Every time a customs officer checks a passport, every time someone posts to Facebook, every time someone does a Google search - the ocean swells.

And new forms of data are being developed all the time. Increasingly powerful and clever computer algorithms are able to sift through things we wouldn't think of as "data" - still photos, video images, text - and extract data that can then be analysed as easily as tallies of mouse clicks and scanner beeps.

But Big Data is much more than big data. It's also the ability to extract meaning: to sort through masses of numbers and find the hidden pattern, the unexpected correlation, the surprising connection. That ability is growing at astonishing speed. It won't be long before Amazon's ability to dazzle customers by suggesting just the right book will seem as quaint as our ancestors' amazement at horseless carriages.

In many ways, this is all to the good. Indeed, it could do wonders. Look at medicine. At one time, doctors made decisions based on nothing more than experience and hunches. The shift to proper data collection and analysis - science, in other words - improved medicine spectacularly. Similar advances could be made in other fields where decisions continue to be based mostly on experience and hunch.

Probably the biggest impact could be in business, a field where the manager's gut continues to be the prime source of wisdom and direction. A recent study found that companies that adopted "data-driven decision making" enjoyed significantly greater productivity gains than those that did not. That may sound a little dry. But remember that productivity gains are the foundation of prosperity. Big Data is in its infancy. If it lives up to its promise, the future will be bright.

But as humans learned when we invented fire - a blessing for cooking meat and keeping people warm, a curse when it burned down your hut - technological promise always comes with peril.

In that ocean of data is a frighteningly complete picture of you. Where you live, where you go, what you buy. What you say. What you feel and believe. It's all there. With access to even a small portion of that data, corporations and governments can know far more about you than you might wish them to know.

"If we wanted to figure out if a customer is pregnant, even if she didn't want us to know, can you do that?" As Charles Duhigg reported in the New York Times, that's the question marketers at Target, the American department store chain, asked one of Target's dozens of data analysts. "Specifically," Duhigg writes, "the marketers said they wanted to send specially designed ads to women in their second trimester, which is when expectant mothers begin buying all sorts of new things, like prenatal vitamins and maternity clothing."

Although Target refused to officially discuss the story, Duhigg found that the company succeeded. One man actually complained to Target when his teenage daughter was sent maternity ads, only to apologize later when she admitted she was pregnant.

And this is where we are with existing technology and data sources. If you've seen the video of Google's interactive glasses - imagine a pair of Foster Grants with a high-speed Internet connection - you have a small idea of what's coming. Refrigerators that restock themselves. Clothing with sensors and Internet connectivity. Imagine every photo you've ever taken, every email you've ever sent, every purchase you've ever made, every website visited, every book read, every song listened to, every prescription filled - all stored on giant servers scattered around the globe.

Strict regulatory control to protect privacy may be enough to keep Big Data from becoming Big Brother. Or it may be futile, and we will live in the world of glass buildings imagined by Yevgeny Zamyatin almost a century ago.

But awful as that dystopian vision is, it doesn't quite capture what such a world would be like.

The most basic insight of modern psychology is that most of our mental life happens outside consciousness. By definition, we are not aware of it. And so we are, to an unsettling extent, strangers to ourselves.

This is why, when people are asked what choice they would make in a certain situation, they are often wrong. It's also why, when they are asked why they made a particular choice, the reasons they give are often not what actually motivated the choice but are, rather, post-facto rationalizations that disguise the unsettling truth: We don't know the answer.

Big Data can't tap into our unconscious thought processes directly, of course. But with a vast storehouse of our past decisions to analyse, it could detect patterns of behaviour we are not aware of, and those patterns could reveal the unconscious thought processes that drive the behaviour. In a very real sense, Big Data could know us better than we know ourselves.

In that world, not only the buildings would be made of glass. So would our skulls.