School of Information Management & Systems.
Spring 2003.
245 Organization of Information in Collections.
Michael Buckland.
Description: Objects, Attributes, and Values.
We organize information in order to find (retrieve) it.
Finding includes searching
(ascertaining what, if anything, is present),
selecting (picking one or more from what is available), and
locating (finding something that has been identified).
These processes are based on "fitting descriptions". How
well does the searcher's description of what is wanted fit the
system's descriptions provided in the bibliography, catalog, directory,
index, or other information system? These descriptions, to be effective,
should anticipate how searchers' queries will be expressed.
Searchers need to pose their queries in terms of the descriptions
used in the system.
The description of an object is composed of attributes ("color of eyes")
and values ("brown").
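As a rough illustration of "fitting descriptions", a description can be sketched as a set of attribute/value pairs, and a query as a partial description that a stored description either fits or does not. The attribute names, the sample values, and the function name `fits` are hypothetical, chosen only for this sketch; real systems usually match far less strictly.

```python
# A description is modeled as attribute -> value pairs (all sample data is hypothetical).
descriptions = [
    {"color of eyes": "brown", "color of hair": "black"},
    {"color of eyes": "blue", "color of hair": "black"},
]


def fits(query, description):
    """Return True if every attribute named in the query has the same value in the description."""
    return all(description.get(attr) == value for attr, value in query.items())


# The searcher's description of what is wanted, in the system's own terms.
query = {"color of eyes": "brown"}

matches = [d for d in descriptions if fits(query, d)]
print(matches)
```

Exact matching only works when searcher and system use the same attribute names and the same form of value, which is why the wording of descriptions matters.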
Which attributes will be most useful to provide in any given retrieval system?
How can attributes and values be expressed in mutually intelligible and
mutually convenient ways? How much description is worthwhile given limited
resources? Some features of attributes are:
1. "Derived" description: Parts of what is being described, e.g. title,
original abstract, thumbnail copy of an image, text.
Objects can represent themselves or be represented by versions of
themselves (or images or copies or fragments of them).
2. "Tidiness", e.g. Can fit a fixed standard format (e.g. year of
publication); use standardized codes (e.g. language: ENG SPA JPN).
Examples: Social Security No.; Book ID# (e.g. International Standard Book Number
ISBN; International Standard Serial Number ISSN); Catalog record ID#
(e.g. Library of Congress Card Number LCCN).
Each information storage system ordinarily has its own ID numbers for each record. A tidy attribute may also be expressible as YES or NO (e.g. Is it a biography?).
3. "Messiness", e.g.:
-- of variable length (e.g. title, person's name)
-- absent, present once, or present many times (e.g. authorship: Anonymous? Many authors?)
-- complex structure (e.g. complicated names)
-- open-ended (e.g. descriptive notes)
-- ambiguous (e.g. many names and subject descriptions)
4. Variety, e.g. (a) Copy of (part of) the object (e.g. picture of it,
text);
(b) Description of the object (physical dimensions, where it came from);
(c) Description of what the object signifies (what the document is "about");
(d) Description of how one object is related to another (e.g. continuation of,
cure for, commentary on, contradiction of).
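The contrast between "tidy" and "messy" attributes can be sketched as a record structure. This is a minimal sketch, not a cataloging standard: the class name `Description`, its field names, the sample records, and the two-code language list are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Illustrative subset of standardized language codes (assumption for this sketch).
LANGUAGE_CODES = {"ENG", "SPA"}


@dataclass
class Description:
    record_id: str                  # the system's own ID number for the record
    title: str                      # derived from the object; variable length
    year: int                       # tidy: fits a fixed standard format
    language: str                   # tidy: standardized code
    is_biography: bool              # tidy: YES or NO
    authors: List[str] = field(default_factory=list)   # messy: absent, once, or many times
    notes: Optional[str] = None     # messy: open-ended
    # How this object relates to others, as (relation, other record_id) pairs.
    related: List[Tuple[str, str]] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the tidy attributes whose values break their expected format."""
        found = []
        if not 1000 <= self.year <= 9999:
            found.append("year is not a four-digit number")
        if self.language not in LANGUAGE_CODES:
            found.append("language is not a known code")
        return found


book = Description("rec-001", "A Hypothetical Book", 2001, "ENG", False,
                   authors=["A. Author", "B. Author"],
                   related=[("continuation of", "rec-000")])
pamphlet = Description("rec-002", "An Anonymous Pamphlet", 199, "english", False)

print(book.problems())      # no problems expected
print(pamphlet.problems())  # both tidy checks fail
```

Tidy attributes are easy to check mechanically, as `problems` does here. The messy ones (names, notes, relationships) resist this kind of validation, which is part of why they cost more to describe.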
Distinguish between (i) the description of an object, which is mostly a
selective summary of details derived from the object but can include
description derived from other sources; and (ii) the headings,
the index or "access points", used to avoid having to search through
all of the descriptions. For a book on Albania look for the
subject heading ALBANIA, then look at the descriptions of the books
listed under ALBANIA. Consider alternatives. A standardized list of
headings is called a "Thesaurus" or "Authority File".
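The difference between descriptions and headings can be sketched as a small index. In this hypothetical example an authority file maps variant terms to one preferred heading, and the index maps each heading to record IDs, so a search consults the heading instead of scanning every description. The record IDs, titles, variant terms, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical authority file: variant term (lowercase) -> preferred heading.
AUTHORITY = {"albania": "ALBANIA", "shqiperia": "ALBANIA", "greece": "GREECE"}

# Hypothetical descriptions, keyed by the system's own record ID.
DESCRIPTIONS = {
    "rec-001": {"title": "A History of Albania", "subjects": ["Albania"]},
    "rec-002": {"title": "Travels in Shqiperia", "subjects": ["Shqiperia", "Greece"]},
    "rec-003": {"title": "Greek Islands", "subjects": ["Greece"]},
}


def preferred(term):
    """Map a term to its preferred heading; unknown terms are simply uppercased."""
    cleaned = term.strip()
    return AUTHORITY.get(cleaned.lower(), cleaned.upper())


def build_index(descriptions):
    """Build heading -> list of record IDs (the access points)."""
    index = defaultdict(list)
    for record_id, description in descriptions.items():
        for subject in description["subjects"]:
            heading = preferred(subject)
            if record_id not in index[heading]:
                index[heading].append(record_id)
    return index


def lookup(index, term, descriptions):
    """Find the heading first, then return the titles of the records listed under it."""
    return [descriptions[rid]["title"] for rid in index.get(preferred(term), [])]


index = build_index(DESCRIPTIONS)
print(lookup(index, "Albania", DESCRIPTIONS))
```

Because both variant terms resolve to the heading ALBANIA, a search under either one reaches the same two records. Bringing variants together under one heading is the job of the standardized list.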
245 examines conventions and standardized methods for making and handling
(a) descriptions (attributes, values) and (b) headings (access points,
searchable values of some attributes).
Different domains have different practices. US Federal trade statistics,
for example, use geo-political areas and
the North American Industry Classification System (NAICS).
The US Patent and Trademark Office uses names, text, and a classification of
patents based on their purpose.
Libraries, for example, use
the Anglo-American Cataloguing Rules, 2nd ed. (AACR2), a highly
stylized, agreed standard for describing documents, used in conjunction with
other standards for mark-up and communication.
Revised Jan 26, 2003.