Recently I've had some conversations about the way that search engine companies use predictive analytics to suggest full search phrases as you type. In what is being promoted as a feature with clear benefits to the user, companies like Google have enhanced their search engines to suggest completions of frequently sought-after phrases to speed the search process.
As an example, I navigated my browser to Google’s search page and started typing in the word “land” on my MacBook laptop. I was presented with four options (in the following order):
- “land rover”
- “land of nod”
- “land for sale”
- “land for sale in Virginia”
At the same time, I remotely logged into a different machine, a Windows 8 desktop that I had recently bought, brought up Google, and started typing the word “land.” This time the options included (in the following order):
- “lands end”
- “lands end coupon”
- “land of nod”
- “land rover”
Note that the two lists differed both in the phrases suggested as completions and in the order in which the shared phrases were presented. My first reaction to the differences was that the predictive results must be based on some combination of frequency analysis (how often words appear together) and contextual information associated with the individual performing the search. I have had my MacBook for a relatively long time, I use Google for most of my web searches, and I occasionally sign into my Google account from the machine, so there is a high likelihood that my home’s IP address is recognized by the search engine’s servers. On the other hand, a fresh machine with little search history linked to it may not be recognized, leading to a different set of predicted choices.
Repeating the experiment with different starting words provides similar results: a variety of choices with some overlap, but in different orders and, on occasion, with different phrases.
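To make that guess concrete, here is a minimal sketch of how a global frequency ranking might be blended with a per-user history. It is an illustration of the idea only, not Google's actual method, which is not public. The query counts, the `boost` weight, and the function name `suggest` are all invented for the example.

```python
from collections import Counter

# Hypothetical global query counts; the numbers are invented for illustration.
GLOBAL_COUNTS = Counter({
    "land rover": 900,
    "lands end": 850,
    "land of nod": 700,
    "lands end coupon": 400,
    "land for sale": 300,
    "land for sale in virginia": 120,
})


def suggest(prefix, history=None, boost=1000, limit=4):
    """Rank completions by global count plus a bonus for the user's own past queries.

    `history` maps a query to how often this user has searched for it.
    `boost` is an assumed weight, not a value taken from any real system.
    """
    history = history or {}
    prefix = prefix.lower()
    candidates = [q for q in GLOBAL_COUNTS if q.startswith(prefix)]

    def score(query):
        return GLOBAL_COUNTS[query] + boost * history.get(query, 0)

    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(candidates, key=lambda q: (-score(q), q))[:limit]


# A fresh machine with no history sees the purely global ranking.
print(suggest("land"))

# A machine whose user has searched for land listings sees a different list.
known_user = {"land for sale": 1, "land for sale in virginia": 1}
print(suggest("land", history=known_user))
```

Under these made-up numbers, the two calls return overlapping but differently ordered lists, which is the same kind of difference I saw between the two machines. Whether the real system works this way is an assumption on my part.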
According to Google’s website describing its “Instant Search” features, the benefits of this predictive search enhancement include:
- Faster searching, allowing the searchers to “save 2-5 seconds per search”
- Smarter predictions to “… help guide your search”
- Instant results to “help you see where you’re headed…”
These appear to be sound benefits. In particular, if you are not sure what you are looking for, the suggestions may guide you toward a better refinement of your intended research. On the other hand, though, what does the person using this feature sacrifice in order to derive the benefit?
There are a few fundamental byproducts of this feature that may tarnish its presumed shine. The first is the presumption of consistent meaning that is masked by an algorithmic semantic organization that is purely based on frequency and the underlying predictive models. In other words, each letter you type corresponds to a hierarchy of high-frequency character strings that is pre-organized (presumably) by frequency of search and augmented by some set of demographic attributes associated with the individual presumed to be doing the searching. That hierarchy has nothing to do with what you might be searching for. The meanings of those character strings are irrelevant to any specific intended search for content.
For example, when I type the character “c” into the search field, the choices for “craigslist,” “cnn,” and “costco” all are suggested, even though I might have intended to look for “cooking classes.”
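The frequency-driven behavior can be sketched in a few lines. This is a toy model built on my presumption about how such a system is organized, not a description of any real search engine: every prefix is mapped in advance to its most frequently searched completions, and the meaning of the strings never enters into it. The counts and the names `build_prefix_index` and `complete` are hypothetical.

```python
from collections import defaultdict

# Hypothetical search counts; the numbers are invented for illustration.
QUERY_COUNTS = {
    "craigslist": 5000,
    "cnn": 4000,
    "costco": 3000,
    "cooking classes": 50,
    "cooking games": 40,
}


def build_prefix_index(counts, limit=3):
    """Map every prefix to its `limit` most frequent completions."""
    index = defaultdict(list)
    # Visit queries from most to least frequent so each bucket fills in rank order.
    for query in sorted(counts, key=lambda q: (-counts[q], q)):
        for end in range(1, len(query) + 1):
            bucket = index[query[:end]]
            if len(bucket) < limit:
                bucket.append(query)
    return dict(index)


INDEX = build_prefix_index(QUERY_COUNTS)


def complete(prefix):
    """Return the precomputed suggestions for what has been typed so far."""
    return list(INDEX.get(prefix, []))


for typed in ("c", "co", "coo"):
    print(typed, "->", complete(typed))
# c -> ['craigslist', 'cnn', 'costco']
# co -> ['costco', 'cooking classes', 'cooking games']
# coo -> ['cooking classes', 'cooking games']
```

In this sketch, “cooking classes” does not surface until the second keystroke, simply because three more popular strings outrank it for the prefix “c.”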
This leads to the second byproduct, which is what I might describe as a “cognitive jolt” that may derail the searcher’s train of thought as they are performing their search. To continue my example, while I was searching for “cooking classes,” after I typed the letter ‘c’ I was presented with the various high-frequency alternatives. All of a sudden I remember that I wanted to check out Costco’s website for an item I was looking to purchase. Or I decide to visit the CNN website to check out the current news. Or I decide to go to Craigslist to see if anyone is selling something I might be interested in buying. But what I didn't do is finish up the search that I originally intended to execute. Entertaining perhaps, but productive? Not really.
And once we have traversed the boundary between productive work and entertainment, we can ask a different question: if the actual algorithms employed are completely opaque, are we only presuming that the model is predictive and is based on frequency analysis? Or are there other factors that contribute to the determination of what phrases are presented to each user? For example, when I type the letter ‘b,’ I am presented with choices for “Bank of America” and “Best Buy.” Is there some underlying value exchange that positions these particular businesses as instant search results to subtly encourage more users to visit their websites? Try this experiment: type any letter of the alphabet into the Google search field and count how many retail businesses or product names are presented as the first or second choice, as opposed to more generic results such as location names, book titles, or celebrities.
As with many other benevolent time-saving “shortcuts” that are presented as providing benefits to the user, the user yields some degree of freedom when accepting the gift. It is valuable to maintain some awareness of what you really get when you cede control to opaque analytics.
SOURCE: Predictive Search Enhancements: The Presumed Benefits
Recent articles by David Loshin