Late last week, as we pondered Nielsen’s new plan to monitor all video content (as opposed to simply tracking what people watch on a television screen), Jason Miller over at WebProNews was engaged in a little Nielsen speculation of his own. His point is that Nielsen’s expansive new strategy could come into conflict with Google’s stated goal of organizing all the world’s information. The Big G has something TV-related in the works, but it’s not clear just what. After reading the tea leaves, Miller has an idea: Google might get into the TV-ratings biz.
“Interestingly, this can provide a whole new market for Google. The company will be able not only to extend its advertising offerings, but it suddenly will be able to foray in the media measurement market—a market that has been exclusively held by Nielsen.”
Lucrative as such “foraying” might be, is this really a possibility or just another Groomer (that’s “Google rumor” to those not in the know)? The above-linked Google job posting aside, a paper cowritten by some Google researchers makes the idea sound a little less crazy, and even illustrates how it might work. Keep in mind that Google’s approach to categorizing information is to prefer broad algorithmic solutions over approaches that require the company to, say, recruit a statistically meaningful pool of several thousand families who keep paper diaries of their TV watching or use special boxes that track their viewing.
The approach outlined in the paper is radically different from Nielsen’s current scheme. Michael Fink, Michele Covell, and Shumeet Baluja (who was last seen on Ars discussing pornography on mobile phones) have drafted a plan that calls for using the built-in microphones on people’s laptops to sample television audio and to report in on what’s currently being watched. Though it sounds a bit crazy at first, the system makes more sense when you read the paper, which pitches the idea primarily as a way of offering contextual services alongside television programs.
Here’s how it works: software on the laptop samples the ambient audio in five-second chunks, then “irreversibly compresses” it into summary statistics (this means that no one is eavesdropping on you; the actual audio never leaves your laptop). The statistics are transmitted to an audio database server that matches this unique audio fingerprint to a massive database of television programs, advertisements, and movies. Once a match is made, the database server passes this information to a social applications web server, which can then provide additional information about the show being watched or create ad-hoc chat rooms with others watching the same show (this of course would be viewed on the laptop, not the television). It’s a way of bringing automatic contextual information to a more passive medium without requiring people to buy new televisions or set-top boxes, or even to fire up a Web browser and manually enter an address (the current system).
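The pipeline above can be sketched in a few lines of code. This is only a toy stand-in for the paper's actual descriptor (the authors use a far more robust audio-matching scheme): here the hypothetical `summarize_chunk` function reduces a five-second chunk to coarse per-segment energy statistics, and only an irreversible hash of those statistics would ever leave the laptop. All names and parameters are illustrative assumptions, not the paper's.

```python
import hashlib
import math

SAMPLE_RATE = 8000      # assumed rate; the paper's parameters differ
CHUNK_SECONDS = 5       # matches the five-second windows described above
N_SEGMENTS = 32         # hypothetical number of summary segments

def summarize_chunk(samples):
    """Irreversibly reduce an audio chunk to coarse per-segment energies.

    A toy stand-in for the paper's descriptor: only summary statistics
    survive this step, so the raw audio cannot be reconstructed."""
    seg_size = len(samples) // N_SEGMENTS
    energies = []
    for i in range(N_SEGMENTS):
        seg = samples[i * seg_size:(i + 1) * seg_size]
        energies.append(sum(s * s for s in seg) / len(seg))
    # Quantize so tiny noise differences don't change the fingerprint.
    return tuple(round(math.log10(e + 1e-12), 1) for e in energies)

def fingerprint(samples):
    """Hash the summary statistics into a compact, irreversible token
    that a server could match against a database of known programs."""
    digest = hashlib.sha1(repr(summarize_chunk(samples)).encode())
    return digest.hexdigest()

# Simulate two laptops hearing the same broadcast: identical audio
# yields identical fingerprints that the server can then match up.
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
        for t in range(SAMPLE_RATE * CHUNK_SECONDS)]
print(fingerprint(tone) == fingerprint(list(tone)))  # True
```

In a real system the descriptor would need to tolerate room noise, volume differences, and clock drift, which is exactly the hard part the paper's authors tackle; an exact-match hash like this would break on any of those.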
Assuming such a system works, the possibilities would be staggering. Think about how Google Maps and Google Earth have gained popularity by allowing users to overlay contextual information on maps of the world. Imagine if the same thing were possible with television. When watching a Seinfeld rerun, for instance, you might choose to view a user-edited Wiki about that particular episode. Someone might write an app that popped small bubbles of text up on the laptop screen at various moments in the show (think user-created “Pop-Up Video”). As you surf, your laptop could provide you with a user-generated rating of every show on the dial, in real-time, potentially alerting you to excellent new shows you might never have discovered. And on and on.
What does all this have to do with Nielsen? The authors of the paper point out that, if they collect data on millions of TV viewers in real-time, Google (or whoever ran the system) would suddenly have access to a flood of ratings data.
Having real-time, fine-grain ratings is more valuable than ratings achieved by the Nielsen system. Real-time ratings can be used by viewers to “see what’s hot” while it is still ongoing (for example, by noticing an increased rating during the 2004 super bowl [sic] half-time). They can be used by advertisers and content providers to dynamically adjust what material is being shown to respond to drops in viewership. This is especially true for ads: the unit length is short, and unpopular ads are easily replaced by other versions from the same campaign, in response to viewer rating levels.
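The ratings side needs little more than counting: once the server has matched fingerprints to programs, each match is effectively a (time, show) event to tally. A minimal sketch, with hypothetical event shapes and show names that are not from the paper:

```python
from collections import Counter, defaultdict

def live_ratings(match_events):
    """Roll (minute, show_id) match events into per-minute viewer counts.

    The event format is an assumption for illustration; the paper does
    not specify how matches would be reported to a ratings service."""
    ratings = defaultdict(Counter)
    for minute, show in match_events:
        ratings[minute][show] += 1
    return ratings

# Toy feed of server-side matches: each tuple is (minute, show).
events = [(0, "halftime"), (0, "halftime"), (0, "news"),
          (1, "halftime"), (1, "halftime"), (1, "halftime")]

ratings = live_ratings(events)
print(ratings[1].most_common(1))  # [('halftime', 3)]
```

Because the tallies update minute by minute, an advertiser could watch viewership drop during an ad break and swap in a different version of the spot, exactly the scenario the paper describes.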
Or, to put it another way, such a system would be a license to print money. Because it would open up so many possibilities for users as well as for advertisers, it should be easy to gain widespread adoption. Whether people will ultimately enjoy multitasking between their laptop and the TV screen remains an open question, though many no doubt do it now. Though such a system sounds impractical to implement, the paper’s authors have already done it and have achieved excellent success at recognizing TV shows, even with a conversation going on in close proximity to the laptop microphone.
Of course, the rise of networked devices (like home theater PCs) in the living room could make the same functionality even simpler to achieve. Rather than mess about with television audio and laptop microphones, Google could simply write software that provides the same benefits to users and just grabs the television data from the tuner card. The advantages of this approach should be obvious: Google has to do less work, the system is more accurate, people don’t need to use or own a laptop, and the contextual information can easily be overlaid on the television picture. Google is already involved in delivering video services to the Viiv platform, but HTPCs are still less common than laptops, so the audio system might be the way to go at the moment.