Having Won a Pulitzer for Exposing Data Mining, Times Now Eager to Do Its Own Data Mining

Keach Hagey

April 24, 2007

Barely a year after their reporters won a Pulitzer Prize for exposing data mining of ordinary citizens by a government spy agency, New York Times officials had some exciting news for stockholders last week: The Times Company plans to do its own data mining of ordinary citizens, in the name of online profits.

The news didn’t make everyone all googly-eyed. In fact, some people at the paper’s annual stockholders meeting in the New Amsterdam Theatre exchanged confused looks when Janet Robinson, the company’s president and CEO, uttered the phrase “data mining.” Wasn’t that the nefarious, 21st-century sort of snooping that the National Security Agency was doing without warrants on American citizens? Wasn’t that the whole subject of the prizewinning work in December 2005 by Times reporters Eric Lichtblau and James Risen?

And hadn’t the company’s chairman and publisher, Pinch Sulzberger, already trotted out Pulitzers earlier in the program?

Yes, yes, and yes. But Robinson was talking about money this time. Data mining, she told the crowd, would be used “to determine hidden patterns of uses to our website.” This was just one of the many futuristic projects in the works by the newspaper company’s research and development department. Heck, she added, the R&D department, when it was founded several years back, was “a concept unique in the industry.”

These days, of course, all media outlets—not just the Times—are trying to bulk up their online presence, and many are desperately attempting to learn more about their readers’ habits and then target ads to them. The old-line newspaper companies in particular are under immense pressure to figure out how to make double-digit leaps in profits annually—something they didn’t have to worry about doing before websites spirited away huge chunks of newspapers’ classified advertisers.

Not that anyone would confuse an old-line media company like the Times with a modern data expert like Google, but Sulzberger himself made kind of a comparison earlier in the stockholders’ meeting. Morgan Stanley and other investors have ragged on the Times for having a two-tiered stock structure that protects the powerful voting shares from falling into the “wrong” hands. Sulzberger reminded the crowd that Google stock, that most coveted of Wall Street delicacies, also comes in two tiers.

But that’s business. Do readers really want data-mining behavior from their newspapers—not just the Times but every other big media outlet? Do they want newspaper databases to store reading histories, minute by minute, until one day the government shows up to examine ordinary citizens’ shopping and viewing and chatting habits in detail? If you think it can’t happen, ask the librarians who’ve been told to hand over readers’ checkout records under the Patriot Act.

Jim Harper, director of information policy studies at the libertarian Cato Institute, agrees that the prospect of a media-compiled reader-habit database is worrisome.

“My concern is, what happens when the government comes in and subpoenas it?” he says. “It’s bad news to keep long, deep storehouses of information about how people use the Internet.”

Harper notes that the Justice Department has been pushing since last spring for a “data retention” law that would require Internet service providers to warehouse their customers’ online activity for the convenience of government investigators.

Ancient Times man Arthur Gelb made this hardly surprising observation to the Observer the other day: “Some day we’ll all be reading our papers electronically.” But the problem with reading papers electronically is that they can also read you.
