Business Standard

Rajesh Jain: Searching the net - new technologies

FUTURE TECH/ Innovation in internet search is far from over; the game has just begun


Rajesh Jain New Delhi
One of the defining battles in the mid-1990s was between Netscape and Microsoft over control of the desktop. Netscape threatened Microsoft's Windows lock with its web browser.
 
Microsoft fought back with a vengeance and finally won, as a marginalised Netscape was bought by AOL. Another battle is now shaping up which could be equally defining for the future of computing.
 
This time, the attacker is Google. Over the past few years, Google has become the search engine of choice. As its dominance has soared, so have its ambitions.
 
Over the past couple of years, Google has extended itself beyond search into other areas, both organically and via acquisitions. In recent times, the wheel has come full circle, with speculation rife that Google may be planning to launch its own browser.
 
There are two parts to the story we see unfolding as Google, Yahoo and Microsoft, along with a host of others, work to define tomorrow's interface to the information web.
 
The two parallel threads consist of building better search engines and creating richer interfaces. The search engines are the backend to solve the information overload problem, while the interfaces are the doorways to the world of content and applications.
 
We will first discuss advances in search technologies. Later, we will look at how we will access this emerging world of "service-based computing."
 
The problem of search is one of plenty. There is a lot of data on the web that needs to be converted into useful information. Search is one of the solutions to the proliferation of data that has taken place with the growth of the internet. As John Battelle of Searchblog put it recently: "Search is our response to the extraordinary info-abundance in which we're all awash."
 
Google's PageRank technology helped it separate the wheat from the chaff. In a recent article on Google's history, The Economist (Technology Quarterly, Sep 16, 2004) explained how the algorithm works: "PageRank works by analysing the structure of the web itself. Each of its billions of pages can link to other pages, and can also, in turn, be linked to. [Google's founders] Mr Brin and Mr Page reasoned that if a page was linked to by many other pages, it was likely to be important.
 
Furthermore, if the pages that linked to a page were important, then that page was even more likely to be important. There is, of course, an inherent circularity to this formula: the importance of one page depends on the importance of pages that link to it, the importance of which depends in turn on the importance of pages that link to them. But using some mathematical tricks, this circularity can be resolved, and each page can be given a score that reflects its importance."
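One common way to resolve the circularity described above is to start every page with an equal score and repeatedly pass scores along links until they settle. The sketch below illustrates that idea on a toy four-page web. It is not Google's production system: the `toy_web` graph, the function name and the iteration count are assumptions made for illustration, and the damping value of 0.85 is the figure usually quoted from Brin and Page's original paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing importance along links.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Score held by pages with no outgoing links is shared by everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        # Each page splits its current score equally among the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links only to "a".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy web, "c" ends up with the highest score because the most pages link to it. "a" comes a close second despite having only one inbound link, because that link comes from the important page "c".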
 
Today's search is still in its C-prompt era, and needs an upgrade. So, what will be the Windows of the search era? In an interview with ACM Ubiquity, Ramesh Jain, professor of computer science at Georgia Institute of Technology, explains what needs to be done: "Current search engines like Google do not give me a 'steering wheel' for searching the internet. The search engines get faster and faster, but they're not giving me any control mechanism.
 
The only control mechanism, which is also a stateless control mechanism, asks the searcher to put in key words, and if I put in key words I get this huge monstrous list. I have no idea how to refine this list.
 
The only way is to come up with a completely new key word list. I also don't know what to do with the 8 million results that Google threw at me. So when I am trying to come up with those key words, I don't know really where I am.
 
That means I cannot control that list very easily because I don't have a holistic picture of that list. That's very important. When I get these results, how do I get some kind of holistic representation of what these results are, how they are distributed among different dimensions... Two common dimensions that I find very useful in many general applications are time and space.
 
If I can be shown how the items are distributed in time and space, I can start controlling what I want to see over this time period or what I want to see in that space."
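The "steering wheel" described above can be pictured as two operations: summarise how a result list is spread along a dimension, then narrow the list by picking a value. The following is a minimal sketch of that idea, not any search engine's actual interface. The result records, the `year` and `region` fields, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical search results, each tagged with a time and a place.
results = [
    {"title": "Search engine launch", "year": 2003, "region": "North America"},
    {"title": "Broadband rollout", "year": 2004, "region": "Asia"},
    {"title": "Browser rumours", "year": 2004, "region": "North America"},
    {"title": "Portal redesign", "year": 2004, "region": "Asia"},
    {"title": "Mobile search trial", "year": 2002, "region": "Europe"},
]


def distribution(items, dimension):
    """Count how many results fall under each value of one dimension."""
    return Counter(item[dimension] for item in items)


def refine(items, **constraints):
    """Keep only the results that match every chosen dimension value."""
    return [
        item for item in items
        if all(item[key] == value for key, value in constraints.items())
    ]


by_year = distribution(results, "year")      # the holistic picture over time
by_region = distribution(results, "region")  # the holistic picture over space
asia_2004 = refine(results, year=2004, region="Asia")  # steering the list
```

Unlike typing a fresh set of key words, each refinement here starts from the list already on screen, so the searcher always knows where he is.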
 
One glimpse of search innovation comes from Amazon with its A9 search engine, which is built around Google's search results, and also integrates Amazon's own book search results.
 
John Battelle explained A9's approach in a column for Business 2.0: "A9 has broken search into its two most basic parts. Recovery is everywhere you've been before (and might want to go again); discovery is all that you may wish to find but have yet to encounter.
 
A9 attacks recovery through its original Search History feature and its integrated toolbar, which tracks every site you visit. But new to this version of the site is a feature A9 calls 'Discover,' which finds sites you might be interested in based on your click stream and (here's the neat part) the click streams of others... A9 is more of a Web information management interface, with search as its principal navigational tool. [It is] betting that over time, Web users will come to recognise, then demand, that their search service not only find sites based on queries but also remember where they have been and what they have clicked on."
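How a discovery feature of this kind might work can be sketched with a simple overlap count: sites visited by people whose click streams resemble yours are suggested to you. This is an illustrative assumption, not A9's published method. The site names and the `discover` function are hypothetical.

```python
from collections import Counter


def discover(my_clicks, other_clickstreams, top_n=3):
    """Suggest unvisited sites favoured by users with similar click streams."""
    mine = set(my_clicks)
    scores = Counter()
    for clicks in other_clickstreams:
        theirs = set(clicks)
        overlap = len(mine & theirs)  # how similar this user is to me
        if overlap == 0:
            continue  # nothing in common, so ignore this user
        for site in theirs - mine:
            scores[site] += overlap  # weight suggestions by similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda site: (-scores[site], site))[:top_n]


# Hypothetical click streams.
my_clicks = ["news.example", "books.example"]
others = [
    ["news.example", "books.example", "reviews.example"],
    ["news.example", "maps.example"],
    ["games.example", "music.example"],
]
suggestions = discover(my_clicks, others)
print(suggestions)
```

Recovery, by contrast, needs no such inference: it only requires storing `my_clicks` and making that history searchable.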
 
A few years ago, it seemed the search game was over: results were inaccurate, and portals were the place to be. Google's link-analysis technology resurrected the industry. Yet the innovation in search is far from over. The game has just begun.
 
In the next column, we will look at some key ideas which will define tomorrow's search.
 
Rajesh Jain is managing director of Netcore Solutions Pvt Ltd. His weblog is http://www.emergic.org. He can be contacted at rajesh@netcore.co.in

 
 


First Published: Oct 06 2004 | 12:00 AM IST
