School of Information Management & Systems.   Spring 2003.
245 Organization of Information in Collections.   Michael Buckland.

Description: Objects, Attributes, and Values.

We organize information in order to find (retrieve). Finding includes searching (ascertaining what, if anything, is present), selecting (picking one or more from what is available), and locating (finding something that has been identified). These processes are based on "fitting descriptions". How well does the searcher's description of what is wanted fit the system's descriptions provided in the bibliography, catalog, directory, index, or other information system? These descriptions, to be effective, should anticipate how searchers' queries will be expressed. Searchers need to pose their queries in terms of the descriptions used in the system.

The description of an object is composed of attributes ("color of eyes") and values ("brown"). Which attributes will be most useful to provide in any given retrieval system? How can attributes and values be expressed in mutually intelligible and mutually convenient ways? How much description is worthwhile given limited resources? Some features of attributes are:

1. "Derived" description: Parts of what is being described, e.g. title, original abstract, thumbnail copy of an image, text. Objects can represent themselves or be represented by versions of themselves (or images or copies or fragments of them).

2. "Tidiness", e.g. can fit a fixed standard format (e.g. year of publication); can use standardized codes (e.g. language: ENG SPA JPN); can be expressed as YES or NO (Is it a biography?). Examples: Social Security No.; Book ID# (e.g. International Standard Book Number ISBN; International Standard Serial Number ISSN); Catalog record ID# (e.g. Library of Congress Card Number LCCN). Each information storage system ordinarily has its own ID numbers for each record.

3. "Messiness", e.g.:
  --   of variable length (e.g. title, person's name)
  --   absent, present once, or present many times (e.g. authorship: Anonymous? Many authors?)
  --   complex structure (e.g. complicated names)
  --   open-ended (e.g. descriptive notes)
  --   ambiguous (e.g. many names and subject descriptions)

4. Variety, e.g. (a) Copy of (part of) the object (e.g. picture of it, text); (b) Description of the object (physical dimensions, where it came from); (c) Description of what the object signifies (what the document is "about"); (d) Description of how one object is related to another (e.g. continuation of, cure for, commentary on, contradiction of).
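
The attribute-and-value view of description, and the contrast between "tidy" and "messy" attributes, can be illustrated with a small sketch. This is only one possible representation, not a standard record format: the class name Description, its field names, the helper is_tidy_year, and the sample book are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Description:
    """A hypothetical description: each field is an attribute, each setting a value."""
    title: str                                         # derived from the object; variable length
    year: Optional[int] = None                         # tidy: fits a fixed format
    language: Optional[str] = None                     # tidy: standardized code such as "ENG"
    authors: List[str] = field(default_factory=list)   # messy: absent, once, or many times
    notes: str = ""                                    # messy: open-ended free text
    is_biography: bool = False                         # tidy: YES or NO


def is_tidy_year(value) -> bool:
    """Check that a year value fits a fixed four-digit format (an assumed rule)."""
    return isinstance(value, int) and 1000 <= value <= 9999


# An invented example record; an empty author list models an anonymous work.
record = Description(
    title="A Hypothetical History of Albania",
    year=1999,
    language="ENG",
    authors=[],
)
```

Tidy attributes such as year and language are easy to validate and compare mechanically. Messy ones such as authors and notes need more permissive structures (a list that may be empty, free text), which is part of why they cost more to describe and to search.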

Distinguish between (i) the description of an object, which is mostly a selective summary of details derived from the object but can include description derived from other sources; and (ii) the headings, the index or "access points", used to avoid having to search through all of the descriptions. For a book on Albania look for the subject heading ALBANIA, then look at the descriptions of the books listed under ALBANIA. Consider alternatives. A standardized list of headings is called a "Thesaurus" or "Authority File".
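
The distinction between descriptions and headings can be sketched as a simple index that maps each heading to the records filed under it, so that a search for ALBANIA does not have to read every description. This is a minimal sketch under stated assumptions: the records, their titles, and the function names build_index and lookup are invented for illustration and do not come from any real catalog.

```python
from collections import defaultdict

# Hypothetical descriptions, keyed by a system-assigned record ID.
descriptions = {
    1: {"title": "Hypothetical Guide to Albania", "subjects": ["ALBANIA"]},
    2: {"title": "Hypothetical Balkan Cookbook", "subjects": ["COOKING", "ALBANIA"]},
    3: {"title": "Hypothetical Birds of Peru", "subjects": ["BIRDS", "PERU"]},
}


def build_index(descriptions):
    """Map each subject heading (access point) to the IDs of records filed under it."""
    index = defaultdict(list)
    for record_id, desc in descriptions.items():
        for heading in desc["subjects"]:
            index[heading].append(record_id)
    return dict(index)


def lookup(index, descriptions, heading):
    """Go to the heading first, then fetch only the descriptions listed under it."""
    return [descriptions[record_id] for record_id in index.get(heading, [])]


index = build_index(descriptions)
hits = lookup(index, descriptions, "ALBANIA")
```

The lookup only succeeds if the searcher uses the same heading the indexer assigned (ALBANIA, not "Albanian Republic"), which is why a standardized list of headings matters.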

245 examines conventions and standardized methods for making and handling (a) descriptions (attributes, values) and (b) headings (access points, searchable values of some attributes). Different domains have different practices. US Federal trade statistics, for example, use geo-political areas and the North American Industry Classification System (NAICS). The US Patent and Trademark Office uses names, text, and a classification of patents based on their purpose. Libraries use the Anglo-American Cataloguing Rules, 2nd ed. (AACR2), a highly stylized, agreed standard for describing documents, used in conjunction with other standards for mark-up and communication.
Revised Jan 26, 2003.