School of Information Management & Systems. Spring 2001.
Organization of Information Management in Collections. M. Buckland.

SEARCHERS' OBJECTIVES

"Known-item search" and "Subject search": a traditional over-simplification.
A Definition: "Information Storage and Retrieval. A generic term for activities, usually using computers, in which data of some sort are stored in an organized way so that they may be recovered in response to enquiries. The expression is used for two quite distinct activities. In one (sometimes known as data retrieval) the complexity arises from the detailed structure of the data and from their bulk, all enquiries being unambiguous as are the encodings of the data. In the other (sometimes known as document retrieval or reference retrieval) the complexity arises from the impossibility of describing the content of a document, or the intent of a request, precisely or unambiguously. In the first case the difficult question is 'What is the thing I am looking for?' and in the second 'Is this thing the one I am looking for?'" R. M. Needham. Encyclopedia of Modern Thought, 1977.
Note: The word "information" does not appear. The definition draws a clear distinction between what the Information Systems people worry about (data retrieval) and what the Information Science people worry about (document or reference retrieval).

TAXONOMY OF SEARCHES. How to formulate the query? How to reformulate the query? And when to stop searching?
Types of search I: Instances.
"Every instance": The search ends when every different document with the specified characteristics has been found. Usually this concerns different documents (types) rather than duplicate copies (tokens) of the same document, because duplicates are redundant.
"Census search": Every copy (token) of each different document (type).
"Any single instance": The search ends when any one document with the specified characteristics has been found.
"Any N instances": The search ends when any N documents with the specified characteristics have been found.
"Extreme instance": The document with an extreme value for one or more attributes, e.g. the most recent.
Types of search II: "Good" (or "preferred") documents. Conventional Boolean search systems operate on primitive, unambiguous binary distinctions. In real life searchers would like one or a few "good" documents, rather than just any. Complex specifications are inconvenient to formulate, and searchers have a low tolerance for complexity in search specification. Empirical studies of online library catalog usage have consistently shown that functionality for specifying complex searches is little used. Searches for "some good" documents are characterized by preferences, e.g. a three-fold approach to attribute values:
- Required (i.e. mandatory);
- Conditional ("Given a choice,..."); and
- Indifferent.
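The three-fold approach can be illustrated with a small ranking sketch. It assumes, purely for illustration, that documents are dictionaries of attribute values, that required attributes act as a filter, that conditional ("given a choice") attributes only affect ordering, and that any attribute not mentioned is treated as indifferent. The function name `rank_good` and the sample data are hypothetical.

```python
def rank_good(documents, required, preferred):
    """Filter on required attribute values, then order by preferred ones.

    required:  attribute -> value that must match (mandatory)
    preferred: attribute -> value favoured "given a choice" (conditional)
    Attributes in neither mapping are ignored (indifference).
    """
    eligible = [
        d for d in documents
        if all(d.get(attr) == value for attr, value in required.items())
    ]

    def preference_score(d):
        # Count how many conditional preferences this document satisfies.
        return sum(d.get(attr) == value for attr, value in preferred.items())

    # sorted() is stable, so equally preferred documents keep their input order.
    return sorted(eligible, key=preference_score, reverse=True)


# Hypothetical sample data.
documents = [
    {"title": "B", "subject": "retrieval", "language": "de", "binding": "cloth"},
    {"title": "A", "subject": "retrieval", "language": "en", "binding": "paper"},
    {"title": "C", "subject": "cataloging", "language": "en", "binding": "cloth"},
]

ranked = rank_good(
    documents,
    required={"subject": "retrieval"},  # mandatory
    preferred={"language": "en"},       # "given a choice, ..."
)                                       # binding: indifferent
```

Here document C is excluded because it fails the required subject, and A is ranked ahead of B because it satisfies the conditional preference. The searcher states only two simple preferences rather than a complex Boolean expression.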
Searches for "n good" instances proceed adaptively: preferences take effect according to the situation as results emerge. The more difficult it is to predict the outcome of searches, the more desirable it is to develop systems that support and encourage adaptive search strategies.
Sameness and substitutability: There is no such thing as "the same". In practice, two or more objects are treated as the same when they are equally acceptable for some purpose.

SEARCH THEORY.
Likelihood of finding: Look first at the source most likely to contain what is sought.
Cost of searching: Start with the source that is the least expensive to search.
Cost-effective searching: Search sources in decreasing order of the ratio of expected search success to search cost.
Stopping the search: Compare the marginal cost-benefit with some other, alternative use of resources.
Search diseconomy: Satisficing; Mooers' Law; elasticity of demand.
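The ordering and stopping rules above can be sketched numerically. This is an illustrative simplification, not a method given in the notes: the `Source` record, the probability and cost figures, and the threshold `alternative_ratio` (standing in for the return from some alternative use of resources) are all assumptions, and the estimates are treated as fixed rather than revised after each unsuccessful search.

```python
from dataclasses import dataclass
from itertools import takewhile


@dataclass
class Source:
    name: str
    p_success: float  # estimated likelihood that the source holds what is sought
    cost: float       # estimated cost of searching it (must be > 0)


def success_per_cost(source):
    return source.p_success / source.cost


def search_order(sources):
    """Cost-effective order: decreasing expected success / cost ratio."""
    return sorted(sources, key=success_per_cost, reverse=True)


def search_plan(sources, alternative_ratio):
    """Stop at the first source whose ratio no longer beats the alternative."""
    return list(takewhile(
        lambda s: success_per_cost(s) > alternative_ratio,
        search_order(sources),
    ))


# Hypothetical figures for illustration only.
sources = [
    Source("catalog", p_success=0.6, cost=2.0),
    Source("open shelves", p_success=0.3, cost=0.5),
    Source("interlibrary loan", p_success=0.9, cost=10.0),
]

order = [s.name for s in search_order(sources)]
plan = [s.name for s in search_plan(sources, alternative_ratio=0.1)]
```

In this example the open shelves come first despite having the lowest likelihood of success, because they are cheap to search. Interlibrary loan has the highest likelihood but is searched last, and it drops out of the plan altogether once the alternative use of resources offers a better return.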

For a fuller discussion see Searches in a Collection.