Nearly everyone has been affected by Google and its sheer ubiquity, to the point where “to google” has even become an officially recognized verb. Google gained this dominance by providing the best web search service, which required the ability to quickly “crawl” the web, finding and indexing all the content it could get its hands on. Anyone who has set up a web server and peered at the access logs knows that the Google spiders come quickly and often.
One thing that most people aren’t aware of, however, is that Google indexes more than just text and images. The search engine is also capable of indexing and searching binary files, a feature that the security firm Websense has been taking advantage of to uncover malicious and hacked web sites all over the world.
The company used a little-known feature of Google to search for binary strings representing Windows-based worms such as W32.Bagle and W32.Mytob. “They [Google] actually look inside the internals of an executable and index that information,” said Dan Hubbard, senior director of security at Websense.
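To make the idea of searching for a “binary string” concrete, here is a minimal Python sketch of matching a byte signature inside a file's raw contents. It is only an illustration of the general technique, not Websense's tool and not anything built on the Google APIs. The function name `find_signature` and the four-byte signature are made up for the example; real worm signatures are not reproduced here.

```python
from __future__ import annotations


def find_signature(data: bytes, signature: bytes) -> list[int]:
    """Return every byte offset at which `signature` occurs in `data`."""
    offsets = []
    start = 0
    while True:
        idx = data.find(signature, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        # Step forward one byte so overlapping matches are also reported.
        start = idx + 1


# Hypothetical placeholder signature; NOT a real worm pattern.
SIGNATURE = bytes.fromhex("deadbeef")

# Fake "executable": a 4-byte header, padding, then the signature twice.
sample = b"MZ\x90\x00" + b"\x00" * 8 + SIGNATURE + b"\x00" * 4 + SIGNATURE

print(find_signature(sample, SIGNATURE))  # [12, 20]
```

A search engine that indexes the internals of executables would in effect let a researcher run this kind of match across many hosted files at once, rather than downloading each one first.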
Websense plans to share the code it has developed using the Google APIs with other security researchers, but does not plan to release it to the public. Hubbard fears that would-be virus authors could use the tool to jump-start their activities. “Instead of buying them on the black market [an attacker] could search for them and download them on his own,” he said.
Hubbard isn’t the only one concerned about the possibilities of searching for binary patterns on the web. Claudiu Spulber, of the Homemade Computer Tutorials blog, pointed out that hackers could embed common search terms in a malicious binary, and then hope that users looking for a particular page would find a link to the program, click on it, and run the executable. The blog post includes an example of this, an illegitimate version of the shareware “Backup4all” program, but interestingly the malicious version no longer shows up on the results page. According to a spokesperson for Google, the company continues to keep an eye out for this practice.
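The trick of embedding search terms works because plain text stored inside an executable can be read back out of its raw bytes. The Python sketch below pulls runs of printable ASCII out of a byte string, much as the Unix `strings` utility does. It is an assumption-laden illustration only: how Google actually extracts text from binaries is not described here, and the function name `printable_strings` and the sample bytes are invented for the example.

```python
from __future__ import annotations

import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII characters in `data`."""
    # Bytes 0x20 through 0x7e cover the printable ASCII range, space included.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Invented sample: readable phrases surrounded by non-printable bytes.
sample = b"\x00\x01backup software\x00\xff\x02abc\x00free download\x90"

print(printable_strings(sample))  # ['backup software', 'free download']
```

Any phrase recovered this way could be matched against a user's query, which is why popular search terms planted in a binary might cause it to appear in ordinary results.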
Should Google continue to index binary files, despite the potential drawbacks? The company’s position is that the more of the Internet that is searchable, the better things are for everyone, and that people shouldn’t worry too much about any possible misuse. Still, the more powerful a tool becomes, the greater the potential for abuse. This applies not only to Google, but to the Internet in general. As always, skeptical computing is the best defense.