Kaleva (Finnish newspaper) on 21 September 2001


translation:
Information retrieval running behind the flood of information

One of the greatest challenges for information retrieval is to develop better and more comprehensive search engines and systems for the general public to use. The current systems on the Internet do not produce sufficiently precise results, says Professor Alan Smeaton, an information retrieval expert from Dublin City University in Ireland.
Smeaton is participating in an international workshop on information retrieval at the University of Oulu. The event started on Wednesday and runs until Friday. There are about 100 participants from ten different countries. "Information retrieval and the production of retrieval systems are always running behind the producers of information, and now even more than before, because the flood of information onto the Internet is enormous," says Professor Peter Ingwersen from Denmark.
Smeaton regrets the fact that a large part of information retrieval is restricted to text only. However, the Internet offers information in a variety of forms: images, sound, and moving video.

Wrong answers

According to Smeaton, what annoys people about using the Internet is that retrieval engines give them false answers. Furthermore, the capability of search engines to separate irrelevant and relevant information is not sufficient.
One of the problems is that the goal of search engines on the Internet is to respond to a surfer's query in a few seconds. "If the web happens to be crowded at the time, the engine cannot go through as many millions of pages as at quieter times. This means that some pieces of information are not accessed," Smeaton states.
Search engines with better coverage do exist, but they are unacceptable because of their slowness.
However, Smeaton envisions, and is currently developing, retrieval systems that would combine different sources and the web pages of different organisations more efficiently, and would not disregard as much information as the current systems do.
He hopes that the future will bring systems like this into public use, for example on the web pages of libraries and ministries. "If a person knows that the retrieved information will be reliable and relevant, he or she will not mind if the search goes on for more than ten seconds."
The reliability of information has traditionally been a problem on the Internet. Professor Ingwersen says that the most sought-after topic on the Internet today is health, and that the reliability of information in that field is essential.
"A web user must consider what kind of names he or she trusts as producers and publishers of information," Smeaton says.

