The next generation of Web search is coming sooner than expected. And with it we will see several shifts in the way people search, and the way major search engines provide search functionality to consumers.
Web 1.0, the first decade of the Web (1989 – 1999), was characterized by a distinctly desktop-like search paradigm. The overriding idea was that the Web is a collection of documents, not unlike the folder tree on the desktop, that must be searched and ranked hierarchically. Relevancy was considered to be how closely a document matched a given query string.
Web 2.0, the second decade of the Web (1999 – 2009), ushered in the beginnings of a shift towards social search. In particular blogging tools, social bookmarking tools, social networks, social media sites, and microblogging services began to organize the Web around people and their relationships. This added the beginnings of a primitive “web of trust” to the search repertoire, enabling search engines to begin to take the social value of content (as evidences by discussions, ratings, sharing, linking, referrals, etc.) as an additional measurment in the relevancy equation. Those items which were both most relevant on a keyword level, and most relevant in the social graph (closer and/or more popular in the graph), were considered to be more relevant. Thus results could be ranked according to their social value — how many people in the community liked them and current activity level — as
well as by semantic relevancy measures.
In the coming third decade of the Web, Web 3.0 (2009 – 2019), there will be another shift in the search paradigm. This is a shift to from the past to the present, and from the social to the personal.
Established search engines like Google rank results primarily by keyword (semantic) relevancy. Social search engines rank results primarily by activity and social value (Digg, Twine 1.0, etc.). But the new search engines of the Web 3.0 era will also take into account two additional factors when determining relevancy: timeliness, and personalization.
Google returns the same results for everyone. But why should that be the case? In fact, when two different people search for the same information, they may want to get very different kinds of results. Someone who is a novice in a field may want beginner-level information to rank higher in the results than someone who is an expert. There may be a desire to emphasize things that are novel over things that have been seen before, or that have happened in the past — the more timely something is the more relevant it may be as well.
These two themes — present and personal — will define the next great search experience.
To accomplish this, we need to make progress on a number of fronts.
First of all, search engines need better ways to understand what content is, without having to do extensive computation. The best solution for this is to utilize metadata and the methods of the emerging semantic web.
Metadata reduces the need for computation in order to determine what content is about — it makes that explicit and machine-understandable. To the extent that machine-understandable metadata is added or generated for the Web, it will become more precisely searchable and productive for searchers.
This applies especially to the area of the real-time Web, where for example short “tweets” of content contain very little context to support good natural-language processing. There a little metadata can go a long way. In addition, of course metadata makes a dramatic difference in search of the larger non-real-time Web as well.
In addition to metadata, search engines need to modify their algorithms to be more personalized. Instead of a “one-size fits all” ranking for each query, the ranking may differ for different people depending on their varying interests and search histories.
Finally, to provide better search of the present, search has to become more realtime. To this end, rankings need to be developed that surface not only what just happened now, but what happened recently and is also trending upwards and/or of note. Realtime search has to be more than merely listing search results chronologically. There must be effective ways to filter the noise and surface what’s most important effectively. Social graph analysis is a key tool for doing this, but in addition, powerful statistical analysis and new visualizations may also be required to make a compelling experience.