You’re So Predictable

For all the hopes, fears, and predictions about the future, the digital age may have less to do with how we read books than with how our digital books may one day read us. That’s because, thanks to everything from mobile phones to e-mail accounts, ATM and debit cards, and social network activity, humans today generate an astonishing amount of data, with the most connected of us leaving behind an electronic trail of breadcrumbs that enables researchers to piece together not only where we’ve been but where we will be. In fact, writes network researcher Albert Laszlo Barabasi in his mind-bending, entertaining new book, Bursts: The Hidden Pattern Behind Everything We Do, human activity is remarkably—and mathematically—predictable.

“We have a perception that human activity is random,” Barabasi says, when asked to explain the concept of a burst. “But when we look at the data, we see that activity actually comes in short, intensive bursts, preceded by and followed by nothing.” For example, e-mail and cellphone activity, he notes, follow a very predictable pattern—no surprise, perhaps. What is surprising, however, is that these bursts of activity, Barabasi says, follow very precise mathematical laws, not unlike those that pertain to cell biology and quantum physics—and not just for our technology-related behaviors. In gliding prose, Barabasi explains the burst principle from a bloody medieval crusade through the Pentagon’s post-9/11 “Total Information Awareness” efforts, and how analyzing the past is yielding algorithms that can divine our future. “Once you understand the origins of these bursts,” Barabasi says, “it can really change your perspective on how you do things, and how you expect other people to respond.”

So what does the ability to predict the future mean for, well, the future? Barabasi isn’t quite sure. But our current personal data explosion will have major ramifications for all of us, especially those in the information world: from Google’s data-driven search products to e-readers and the information they can gather about reading and buying habits, with uneasy implications for privacy and personal freedom. “Secure firewalls and privacy laws might protect our past,” Barabasi writes in Bursts, but “our futures, predicted by sophisticated algorithms, are up for grabs.” PW caught up with Barabasi to talk about human activity in the digital age.

In Bursts you suggest that electronic communication has turned the world into a great human research laboratory. How so?

It’s true. Whether we realize it or not, much of our life is now recorded. We carry mobile phones every time we leave the house so mobile phone companies have a record of our motion. If you are like me, the first thing you do in the morning is check e-mail, so your e-mail provider has a pretty good record again of when you start working, and what you work on. Add to that all the surveillance cameras around major cities, ATM and credit card activity, and so on. If you were to collect all that information, you could pretty much piece together what a person does, let’s say, over a year, almost at an hourly resolution. In the past, social science has relied mostly on interviews and observations, but now it is possible to look at everything we do in real-time.

Internet behavior is still relatively new and emerging, but have you seen any surprising, digital truths emerge from your data?

About a decade ago my research group had a big surprise. We saw, quantitatively, that the Web is not as democratic as everybody thinks. The general perception, both in the scientific world and outside, was that on the Web every voice could be heard, that you can put your information out and if your information is relevant, people will access it. But once we started to have access to maps of the Web, we realized that, yes, you can put anything out there, but most likely nobody will pay attention because the Web is not a democratic network. Rather, it is dominated by a few major hubs that everybody connects to. There are many—and by many I mean hundreds of millions—of nodes that nobody ever connects to because the Web’s structure essentially channels much of the attention, and much of the discoverability, through these major hubs.

So when Google says it searches 11 billion pages, users are really only ever accessing a tiny percentage of those pages?

Well, Google probably does search 11 billion pages, but it actually serves up, on one screen, just 10 to 100 results. So the question really is: what makes Google a better search engine? The reason Google became so popular is that it had a much better methodology to rank what it showed you, what they call PageRank. PageRank essentially exploits the network topology: that is, if a lot of nodes link to you, you must be important. It favors the highly connected nodes or, to be more precise, the nodes to which the highly connected nodes connect. On the Web, that’s how the rich kind of get richer, because Google basically ranks a page based on how many pages point to that page.
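
For readers who want to see the ranking idea in miniature, here is a rough sketch of the textbook link-based ranking calculation. It is an illustration only, not Google's production system: the five-page toy web, the page names, the 0.85 damping factor, and the fixed 50 iterations are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook-style ranking sketch. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of incoming links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: almost everyone links to "hub", and "hub" links to "a".
toy_web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub", "a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web the heavily linked "hub" comes out on top, and "a" ranks well above "b", "c", and "d" largely because the hub itself points to it, which is the "nodes to which the highly connected nodes connect" effect described in the answer.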

At one point in the book, you discuss Google’s data collection. Does Google need all that data? And what are they—or what could a scientist like you—do with all that data?

[Laughs]. Obviously, I don’t know what Google is doing with their data, but we can make guesses. I think they are mainly trying to improve their service. I really don’t look at Google as playing the Big Brother; I don’t think it is. I know there are perceptions lately along those lines, but I think [Google is] really very focused on the engineering aspects, and the truth is, if they want to improve their products, they really do need data. The question in everybody’s mind, of course, is how do you balance the need for data to improve services with the need for privacy? That’s a question that, at this point, nobody really has had a good answer for.

Has the explosion of smartphones and handheld devices had an effect on patterns of human activity?

That’s a good question. When we first discovered this burst in e-mail patterns, the question came up: what did people do before e-mail? So we accessed the whole correspondence of Darwin and Einstein. And when we looked at their correspondence pattern, we found it was just as bursty as our e-mail patterns are today. Obviously, communication has become faster. But while the timescales have accelerated, the underlying patterns have not changed. We follow exactly the same mathematical rules today when it comes to our e-mail and call patterns as Einstein and Darwin did in the 19th century regarding their letter writing.

I ask, because there have been so many predictions in the publishing world about the future of immersive reading in the digital age, and with mobile devices, questions whether...

That’s interesting: Whether our reading and activity patterns will become bursty? And whether we will package information in ways that will be meaningful? What the future relationship will be between our human behavior patterns and reading, I certainly don’t have a clear answer on. But I’m one of those people who said for years there is no way I could see myself reading e-books. And now, I’m reading books on my iPhone.

One of the interesting phrases from the book is “the future is not searchable.” What do you mean—not yet searchable?

This is a question of trying to understand the limits of predictability. Are there fundamental limits that, no matter how much data you collect, you will never be able to predict what will happen tomorrow, or if you collect enough data, can you reasonably predict the future? In some cases, we see the answer is yes to the latter. When it comes to our location patterns, we recently published an article in Science that showed that individuals are 93% predictable. That means that if I collect enough data about your past locations, I can write a piece of software that can tell you where you’re going to be tomorrow at 3 p.m. with 93% accuracy. With advances in data availability, we see predictability. So now what we have to do is pause and say, what does this mean?
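
To make the "where will you be tomorrow at 3 p.m." idea concrete, here is a deliberately naive sketch: it guesses the place a person has most often been at the same weekday and hour in the past. This is not the method behind the 93% figure, which the interview does not describe; the function names, the data layout, and the sample history are all hypothetical.

```python
from collections import Counter, defaultdict


def build_model(visits):
    """Count how often each place was seen at each (weekday, hour) slot."""
    model = defaultdict(Counter)
    for weekday, hour, place in visits:
        model[(weekday, hour)][place] += 1
    return model


def predict(model, weekday, hour):
    """Return the most frequent past place for this slot, or None if there is no data."""
    counts = model.get((weekday, hour))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical location history: (weekday, hour of day, place).
history = [
    ("Tue", 15, "office"),
    ("Tue", 15, "office"),
    ("Tue", 15, "cafe"),
    ("Tue", 9, "office"),
    ("Sat", 15, "park"),
]

model = build_model(history)
guess = predict(model, "Tue", 15)
print(guess)
```

Even a baseline this simple tends to do well for people with regular routines, which is the intuition behind the claim; how close any real predictor gets to a figure like 93% depends on the data and the method used.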

Are we humans really that predictable? What about our much vaunted capacity for randomness?

Well, there is always fundamental randomness in our behavior, but people are at the same time very constrained. We have daily patterns, we have weekly patterns. We have jobs that we have to go to. So while I may not be able to predict that you’re going to bump into somebody and fall in love, because that’s a rare event, can I still predict very relevant things if I learn to manage randomness at a microscopic level? That’s really the question at the heart of the book.

On the book jacket for Bursts there is a tantalizing statement: “the way you think about your potential to do something extraordinary will never be the same.” Really?

Let me answer this way. With my book Linked, everybody, depending on their background, took a completely different message away from it. I’m hoping that similar things will happen with Bursts. For example, I never meant to write about the book industry, yet clearly some of the messages from Bursts apply to what you do on a daily basis and what you do professionally. With Linked, understanding the processes that govern networks, or, in the case of Bursts, understanding the processes that govern human activity patterns, can force people to have a different perspective. So, in that respect, things really may never be the same again.

In his new book, Bursts, Albert Laszlo Barabasi says the future may not be so unwritten after all