Pohjolan Tyo (Finnish newspaper) on 21 September 2001


translation:
From a jungle of text to reliability and precision
Information retrieval is developing


Over the last decade, information retrieval systems on the Internet have made searching for information easier for scientists and the general public alike, but acquiring information is by no means unproblematic. On the contrary: it is hard to find the desired web page among billions of alternatives, and the reliability of sources is often questionable.
Information scientists today are busy developing systems that can pick out relevant information, including the new media types on the Internet, from countless sources.
Professor Alan F. Smeaton of Dublin City University in Ireland believes that in five years' time searching for information on the Internet will be completely different from what it is today. Retrieved information will no longer be mainly text, as it is now, but will also include sound, images, TV programmes, and radio broadcasts.
- The older, adult generation is still strongly text-oriented, but for example 12-year-olds tend to see things more through images, Smeaton thinks.
Although a lot of information is carried in the most developed mobile phones today, it is highly unlikely that a mobile phone will be the final terminal device. Devices keep getting smaller and better, and it is obvious that a terminal device must be wireless.
In the future we will probably be using Personal Digital Assistants, also called palm computers, which function as efficiently as home computers.
- The development of retrieval systems and hardware will go hand in hand, because they both have the same interests, Smeaton believes.

People in the midst of development

Most search engines today regard the nearest search term database as the most important one, although the most important information may be far away, behind many links.
Smeaton and Professor Peter Ingwersen from Copenhagen, Denmark, are trying to solve this problem. Experts in information retrieval are gathered in Oulu from Wednesday to Friday for the International Workshop on Information Retrieval.
- At the moment we have to put a lot of time into identifying the web pages that can be called reliable. These include for example government or university pages, says Ingwersen.
In the future we may develop a system in which several search engines form an entity that can pick out the relevant pages. With modern engines, a flood of query results shoots onto the screen in a few nanoseconds. Smeaton would not mind if the search took ten seconds, as long as the result was more precise than what is currently possible.
Space in commercial search engines is taken up by advertisements, which get in the way of retrieving actual information. Some retrieval services do not need much advertising for funding, because they sell their search systems on to others.
On the other hand, another great challenge in developing retrieval systems is that new systems are naturally trade secrets.
- Actually only half the challenge is to make people accustomed to this development in information retrieval, because it happens by itself along with everything else, Smeaton concludes.

