Monday, November 21, 2005

Article Note: On Google's Value

Citation for the article:

Caufield, James. "Where did Google Get Its Value?" portal: Libraries and the Academy 5.4 (2005): 555-572.

I usually stay away from the whole talk about whether Google is good for libraries or if Google is going to become their demise. I personally see Google as a tool, and like any other tool, it depends on how it is used. I do read up on the topic now and then. Also, I am finding more questions about using tools like Google from my students and even a faculty member or two, so I have an incentive to keep up on this. This article's title seemed intriguing, and its thesis is an interesting one.

James Caufield says that "the thesis of this paper is that Google has succeeded mostly because it has adopted many library values" (557). He then goes on to discuss some library values, drawing on documents like the ALA's Code of Ethics and the Library Bill of Rights. What Caufield then does is look at what Google has done over time, showing how the company's actions are often analogous to values that librarians hold. He acknowledges that there are reasons to criticize the company, but he makes clear in the paper that he is looking at how Google has embodied some library values. He lays out the plan for the paper as follows: ". . .it is now possible to delineate two ways in which Google has brought library values to the Web environment. First, Google has adopted many of the precepts that guide librarians in their work. Second, Google has created systems that replicate (or at least are analogous to) some of the valuable functions that libraries provide" (557). I am sure this will rub some librarians the wrong way, but I think Caufield at least deserves to be heard.

Caufield begins by summarizing the early days of the Web, when search engines were not effective and mostly limited to keyword searching. He then describes how Google innovated with its use of the PageRank algorithm. He argues, drawing on documents such as the ones listed above, that the early Web environment fell short of many librarian values such as maintaining a balanced collection and facilitating access to the materials. Without an effective search tool, there could be no effective access. Google changed this. Caufield writes, "while the Web provides physical access to materials, search engines and directories provide intellectual access. To do this, any search engine must perform two functions that are essential to a library: it must index the materials in the collection, and it must provide a retrieval system for matching search queries to the index" (559). Libraries provide the physical access, and cataloguers and other technical services staff make sure users can find the materials. Search engines do this through their crawling, indexing, and then facilitating searches of what they have indexed.
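
To make the two functions Caufield names a bit more concrete, here is a minimal sketch in Python of an index plus a retrieval step. This is my own toy illustration, not anything from the article and certainly not how Google works: the page names, the sample text, and the functions build_index and search are all made up for the example, and matching is a plain "every word must appear" test.

```python
from collections import defaultdict


def build_index(pages):
    """Indexing step: map each lowercase word to the set of pages containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Retrieval step: return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the pages matching the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical three-page "collection" used only for illustration.
pages = {
    "a.html": "library catalog and reference desk",
    "b.html": "search engine index and crawler",
    "c.html": "library search tips",
}
index = build_index(pages)
print(sorted(search(index, "library search")))
```

Note what the sketch does not do: it finds matching pages but says nothing about which match is best. That ordering problem is where link analysis comes in.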

First, Caufield argues that Google shows value by providing better indexing. To illustrate this, he gives a basic explanation of PageRank and the uses of link analysis. What is interesting here is that he connects this to an idea that operates in the library world: the concept of citation analysis. To illustrate, Caufield states, "for example, an otherwise thorough review of search engine history attributes Google's success to the 'groundbreaking insight . . . that the Web is a giant popularity contest--and that the most-cited pages will probably be the most useful.' Only rarely is it recognized that this is not groundbreaking at all but rather a translation of a traditional library (or academic) value to the Web environment. 'What has made Google special is that, in assessing the quality of sites, it takes note of how many other pages link to any given page. This is an old idea from academia, called citation analysis'" (562).
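
For readers curious about what "taking note of how many other pages link to any given page" can look like in practice, here is a small Python sketch of the simplified PageRank iteration as it is usually described in textbooks. It is only an illustration under my own assumptions: the four-page link graph is invented, the pagerank function is my own naming, the damping value of 0.85 is the commonly cited default, and Google's actual ranking surely involves far more than this.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    # Collect every page, including ones that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages "cite" page c, and c cites only a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the most-cited page, c, comes out on top, and page a outranks b and d because its single inbound link comes from a highly ranked page. That second effect is the part that resembles weighted citation analysis: who cites you matters, not just how many do.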

Second, Caufield suggests that Google brings value through better access with a simple user interface. Readers who use Google probably know this is one of Google's strengths. "Simplicity of interface improves access in several ways. First, physical access is improved because the time required to load a page is shortened. Second, an uncluttered interface implies the user's experience and so facilitates intellectual access" (562). When I read that sentence, I had to ask myself how this is changing given that Google is now jumping to offer a variety of other services: RSS reader, e-mail, news pages, and so on. So far, they have maintained the clean interface, but given the apparent move to have people set up accounts and use other services, I can't help but wonder if the problems that often affect portals will be affecting them down the road.

Third, Caufield claims that Google brings value with (relatively) unbiased selections. This is definitely open to debate. He justifies it on the basis of Google's use of targeted advertising, which in search results shows up as the sponsored links on the side. It is not really an innovation, according to Caufield, who sees it as analogous to what journalistic enterprises do when they set up a wall between the news section and the advertising section. "In the same way, Google has erected a barrier between advertising and research" (564). Again, for me, it raises a question given targeted advertising in things like Gmail. This is not to mention the privacy issues, which Caufield addresses later in the article.

Fourth, Caufield says that Google provides an uncorrupted index. He looks at Google's efforts to counter manipulation of the search results by advertisers and other unscrupulous parties. He does point out that what some may see as objective others might see as a tyranny on Google's part. However, it does seem a positive for Google that "when confronted with deliberate efforts of this sort [like extreme manipulation of link rankings], Google now takes steps to punish these sites, reducing their ranking or barring the site completely. Google also makes modifications to its ranking process, and it seems that these are also in part intended to defeat the most dubious aspects of search engine optimization" (565).

Fifth, Caufield adds that search engines are aiming to at least be able to emulate the reference interview. This is done through the ways in which they gather personal information from creating personalized accounts to the use of cookies. The idea is to use such information to then customize search results and make them more relevant to an individual user. Caufield observes that "generally, Google's desire to gather personal information has been thought to be motivated by an interest in targeting advertising. While it is certainly true, user profiling can also render searches more relevant by providing a context for an otherwise isolated query" (566). This then leads to the question of privacy, which Caufield uses as his one illustration of critical issues that face Google.

Readers need to keep in mind that there are many incentives for companies not to respect your privacy. Caufield looks at Google's Gmail privacy policy and the way in which Google reserves the right to change it at any time and without notice. It raises the question: if Google really does not intend to sell or share this information, why do they need to state up front that they may change this policy? This brings into play the question of ethics, which, to be honest, librarians tend to uphold better. For librarians, patron privacy is practically a sacred trust not to be violated. Down the road, companies may find it in their interest to uphold your privacy. "For instance, outrage over privacy violations could conceivably provoke a massive boycott of certain search engines in favor of others that are more scrupulous" (567). In this case, I think the author is more optimistic than I could be. For one, respect for privacy is clearly dependent on whether the company sees it as a good business practice (read: profitable in the long run). Also, I don't think that many people would be outraged, because awareness is very low. A lot of users on the Web barely know what a cookie is or what it does. It would take some serious malfeasance, a la a hacker stealing a bunch of credit card numbers from a bank for instance, before some serious outrage actually happened. Again, this is my opinion. I wonder what other librarians out there may think.

The article concludes with some speculations about what the future may hold. On profit and quality, the author provides a good example. "For example, Google does not create recommendations of relevance, it only gathers them. Compared to the labor-intensive work performed by librarians, this automated gathering is relatively inexpensive and so makes profit possible. But while an automated system is cheaper than one directly managed by human judgment, it is far less reliable and far more open to manipulation" (568). I think this is where librarians should concentrate their strength and efforts. Every time they worry that Google is going to replace them, they should remind themselves and the powers that be that just because Google gives fast results, it does not mean those results are relevant or good for the user. Short-sighted administrators and communities may want to keep in mind that saving a few bucks can actually hurt the quality of services. Just an idea to consider. This is why I say Google is a tool. Notice that it gives you the results. What a user does with the results is up to that user, who may need assistance figuring out if they are the best results or not. In the case of students, yes, they may choose the first two results from a Google list, but if they did not take the time to actually evaluate those results, they will likely pay for it with their grade down the road. Again, just a thought.
