Tutorial: Robot Technology and Applications

The Two-Step Filtering Process

Finding exactly the information a user wants is a two-step process:

1. An information server offers tools for selecting information. This selection may be as primitive as a listing of document names or as sophisticated as handling natural-language questions. In either case, the tools make it possible to retrieve less than the entire contents of the server.
2. From the retrieved information, the user has to decide which documents are relevant and kept, and which are irrelevant and discarded. A filtering process can automate this decision, at least partially.

The biggest problem with information retrieval on the Internet is limited network bandwidth. It is therefore important to be very selective in step one, which also leaves less work for step two. Retrieving as little data from the Internet as possible is crucial for finding the information the user wants.
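
The two steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the InfoServer class is a hypothetical in-memory stand-in for a remote information server, its select method models step one as a simple keyword match, and is_relevant models step two as a check for required terms. A real server and a real filter would be far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Document:
    name: str
    text: str


class InfoServer:
    """Hypothetical in-memory stand-in for a remote information server."""

    def __init__(self, documents):
        self._documents = list(documents)

    def select(self, keyword):
        """Step one: server-side selection. Only matching documents are sent."""
        keyword = keyword.lower()
        return [d for d in self._documents if keyword in d.text.lower()]


def is_relevant(doc, required_terms):
    """Step two: client-side filter. Keep a document only if it mentions every term."""
    text = doc.text.lower()
    return all(term.lower() in text for term in required_terms)


def transferred_bytes(docs):
    """Rough measure of how much data crossed the network."""
    return sum(len(d.text.encode("utf-8")) for d in docs)


all_documents = [
    Document("robots.txt", "Web robots traverse hypertext documents automatically."),
    Document("cooking.txt", "A recipe for bread with flour, water and yeast."),
    Document("arms.txt", "Industrial robots weld car bodies in factories."),
]
server = InfoServer(all_documents)

# Step one: be selective at the server, so less data is retrieved.
retrieved = server.select("robots")

# Step two: discard retrieved documents that turn out to be irrelevant.
kept = [d for d in retrieved if is_relevant(d, ["web", "hypertext"])]

print("retrieved:", [d.name for d in retrieved])
print("kept:", [d.name for d in kept])
print("bytes retrieved:", transferred_bytes(retrieved), "of", transferred_bytes(all_documents))
```

In this sketch the server-side query already excludes one of the three documents, so only two cross the network, and the client-side filter then discards one more. The more selective the query in step one, the fewer bytes are transferred and the less work remains for step two.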