Invasion of the Content, Privacy Snatchers
by Marc Wilson
More than half the traffic on newspaper websites - at least on TownNews.com's 1,500 sites - comes from non-humans.
In July, 60 percent of the 782 million page views in our Chicago data center came from electronic spiders and "bots."
Our tracking showed that 14 percent of the traffic came from Yahoo, nine percent from Microsoft, seven percent from Google, two percent from Vocus and one percent from the QRobo ENews robot. The rest came from sources we can't accurately identify, for a variety of reasons.
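As a quick check on those figures, the identified shares can be totaled and subtracted from the 60 percent bot share. The sketch below assumes every percentage is a share of total page views; the variable names are illustrative and are not taken from any TownNews.com system.

```python
# Shares of total page views, in percent, as reported in the column.
# Assumption: every figure is a share of ALL traffic, not of bot traffic only.
BOT_SHARE = 60

identified = {
    "Yahoo": 14,
    "Microsoft": 9,
    "Google": 7,
    "Vocus": 2,
    "QRobo ENews": 1,
}

# Total share attributed to named crawlers.
identified_total = sum(identified.values())

# Whatever is left of the bot share has no known source.
unidentified = BOT_SHARE - identified_total

print(f"Identified bots:   {identified_total} percent")
print(f"Unidentified bots: {unidentified} percent")
```

Under that assumption the remainder works out to 27 percent, which matches the share the column later attributes to unidentified "others."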
What are these spiders and bots doing? We know that the major search engines - Google, Yahoo! and Microsoft - are indexing information to make it easier for consumers to find data.
Vocus.com is new on our radar screen, and averaged 260,000 page views a day on TownNews.com sites in July - over 8 million page views for the month. According to its website, Vocus.com offers "On-Demand Software for Public Relations Management."
The company says it allows its customers to "monitor any and all coverage ... across traditional and social media - so you can capture broadcast, print, online as well as millions of quality blogs, Twitter, Facebook, LinkedIn, and more.... What took hours is now done automatically."
Think of Vocus as a Digital Age clipping service. (Remember when virtually all newspaper associations ran clipping services?)
Some 27 percent of our traffic is coming from unidentified "others" whom we try to unmask, but who don't seem to want to be found. The unidentified spiders and robots come from various locations (or I.P. addresses), and they thwart our efforts to identify and/or block them by continually changing I.P. addresses. Most of them don't honor the Internet protocols that govern data usage.
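The column does not name the protocols it has in mind. One widely used convention is the Robots Exclusion Protocol, in which a site publishes a robots.txt file listing what crawlers may fetch. The sketch below uses Python's standard urllib.robotparser to show how a well-behaved crawler would consult such a file. The rules, the bot name "ExampleBot" and the news.example.com URLs are hypothetical, not actual TownNews.com settings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out entirely,
# and every other crawler is asked to stay out of the archives.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /archives/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before fetching each URL.
for agent, url in [
    ("ExampleBot", "https://news.example.com/story.html"),
    ("OtherBot", "https://news.example.com/story.html"),
    ("OtherBot", "https://news.example.com/archives/2010.html"),
]:
    verdict = "allowed" if parser.can_fetch(agent, url) else "disallowed"
    print(f"{agent} -> {url}: {verdict}")
```

Compliance with robots.txt is voluntary. A crawler that ignores the file is not stopped by it, which is why operators fall back on blocking by I.P. address.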
We can't even track down who QRobo ENews is. It accounts for 1 percent of our total traffic (or 130,000 page views a day), and we don't have a clue what it is doing. The web crawlers were coming from Internet servers in Asia. (Our systems team blocked QRobo ENews from having access to our servers as of Aug. 10. We have blocked numerous other Internet companies that are in direct competition with our customers, such as Topix.)
All these companies use content generated by newspapers for various purposes. Google, Yahoo! and Microsoft have sold millions of dollars in contextual and behaviorally targeted advertising around this and other Internet content.
This is not necessarily a symbiotic relationship. As Internet advertising has grown, newspaper advertising has dwindled. Start-up companies funded by venture capital and angel money crop up regularly with hopes of using new technology and others' content to become the next Internet success story - and damn the content providers.
Consumers, too, should have concerns. The unbridled development of technology and the parsing of data raise privacy concerns. Data tracking and parsing enable companies of all sorts to know huge amounts of information about individuals.
This technology and data mining enable the creation of great products, but they also open the door to potential problems. Data collected by spiders, cookies and emails, along with information voluntarily submitted to social media sites, allow technology companies to "know" the preferences and desires of consumers. The data, potentially, can be shared and used in ways detrimental to the unwitting consumer by companies intent on becoming the next Internet titans.
The largest Internet company, Google, is looking at new ways to marry data knowledge with user needs. Google co-founder Larry Page recently said: "We are moving very fast. There are lots of new uses for the data." This coming from the head of one of the most scrupulous Web companies, whose famous motto is "Don't be evil." Other companies, I fear, know no boundaries.
Governments want access to the data. China has battled with Google over data access. Research in Motion (maker of BlackBerry phones) is fighting with Saudi Arabia and Qatar over data access and security.
In the U.S., the Obama administration recently asked Congress to give the FBI clear authority to obtain the context of emails and other Internet-based communications without first obtaining a warrant from a judge. The Democratic chairman of the Senate Judiciary Committee, Sen. Patrick Leahy, said the proposal raises "serious privacy and civil liberties concerns."
Digital Age technology has given us many great advances, but not without raising concerns about copyright protections, privacy protection and possible governmental snooping and other abuses.
What we don't know can hurt us. Technology wizards - backed by angel investors and investment bankers - will always be one step ahead of those who potentially suffer collateral damage.
We will continue to be vigilant. So should you.