IS290-rmm Syllabus


Resource and Metadata Management in Museums, Archives, and Research Collections: Tools and Practices

(Mon 16. Jan No Class - holiday)

Wed 18. Jan Lecture 1: Intro to class, concepts, goals.

  1. My background and reasons for being here
  2. Introduction to course themes
  3. Course structure and projects



I School students will already have read Bob's Chapter 1, but it is worth reviewing for those who have not.

Otherwise, just review this syllabus, etc.

Mon 23. Jan Lecture 2 CollectionSpace

  1. History and heritage
  2. Business and cultural context
  3. Goals and vision (for the community, and for UCB)



CollectionSpace website - Read about background (About, FAQ).

CollectionSpace wiki. This is a morass of documentation. Do explore these areas (but do not feel you have to absorb everything!):

  • Technical Development. Read about the Team structure, and browse the architecture section (and linked pages).
  • Design approach - read over the intro, and approaches. Explore further if you are interested.
  • Functional Team - Again, focus on the intro, and get a sense of the scope. Much of the museum jargon may be confusing right now - don't worry about that.

Assignment 1 due.

Wed 25. Jan Lecture/Lab 3 Basic metadata tools

  1. XML, XSD as basis for modern info exchange, management
  2. JSON in WebUI, jsonlint
  3. XSD as development tool (JAXB, REST, SOAP overview).
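As a rough illustration of the first two topics, the sketch below parses a small XML record and checks whether a string is well-formed JSON, in the spirit of what jsonlint reports. The record, its element names, and the helper names are invented for this example and are not taken from any CollectionSpace schema. It uses only the Python standard library, which checks well-formedness but does not validate against an XSD; an XML editor or a schema-aware library is needed for that.

```python
import json
import xml.etree.ElementTree as ET


def lint_json(text):
    """Return (True, parsed_value) for well-formed JSON, else (False, message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        # Report the position of the first problem, much as a linter would.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


def xml_to_dict(xml_text):
    """Flatten the direct children of the root element into a dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Hypothetical record, invented for illustration only.
record_xml = """<specimen>
  <accessionNumber>2012.0001</accessionNumber>
  <taxon>Quercus agrifolia</taxon>
</specimen>"""

record = xml_to_dict(record_xml)

# Round-trip the record through JSON, then lint a deliberately broken string.
ok, parsed = lint_json(json.dumps(record))
bad_ok, bad_msg = lint_json('{"taxon": "Quercus agrifolia",}')  # trailing comma
```

The same data moves between the two notations without loss here only because the record is flat; nested or repeating XML elements need a more deliberate mapping.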


Preparation (do this before class meets):

Choose and install an XML editor capable of handling XML Schemas. See resources below for XML Editor info.

Optional, but highly recommended: Choose and install a browser add-on for handling REST and HTTP data calls. See Poster and Chrome


How I Explained REST to My Wife


XML Foundations class materials - Especially if you are not familiar with XML, XSD, Processing XML. Ignore DTD stuff, XQuery, XSLT. XPath is optional.

Web Architecture class materials - If you are unfamiliar with the basics of how a web browser accesses and displays pages. Focus on topics around Web browsers, HTML, CSS, and what JavaScript is. You must have a basic understanding of URI/URL structure and HTTP.

XML Editor info from Bob's 202 class - You will need a basic XML editor. XMLSpy and Oxygen are quite good.

Accessing CollectionSpace Services from Your Programs or Scripts
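For a sense of what a scripted REST call involves, here is a minimal sketch that builds (but does not send) an HTTP GET request with Basic authentication. The host, port, resource path, user name, and password are all assumptions made up for this example; consult the linked page for the actual service URLs and credentials for your deployment.

```python
import base64
import urllib.request


def build_request(base_url, resource, user, password):
    """Build a GET request with HTTP Basic auth that asks for an XML response."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(f"{base_url.rstrip('/')}/{resource}")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/xml")
    return req


# Hypothetical server, resource, and credentials (assumptions, not real values).
req = build_request(
    "http://localhost:8180/cspace-services",
    "collectionobjects",
    "reader@example.org",
    "secret",
)

# To actually send it against a running server:
#     with urllib.request.urlopen(req) as resp:
#         body = resp.read().decode("utf-8")
```

A browser REST add-on does the same thing interactively: you supply the URL, the method, and the headers, and inspect the response body.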

REST support add-ons for your browser:

JSON utilities:

Assignment 2: XML, REST, Started in class, due next Wednesday

Mon 30. Jan Lecture 4 Intro to Museum, Archives, and Research Collections practices

  1. Stuff vs. Activity
  2. Common themes across collections, and outliers
  3. Internal activities, research activities, outreach and other uses



Fragments of the World, Chapters 2, 3, and 4 (through pg 62).

Dry Storeroom #1, Chapter 1 (through page 30).

A Legal Primer on Managing Museum Collections. Only the table of contents is included, to provide some idea of the scope of concerns.

Wed 1. Feb Lecture 5 Intro to Museum, Archives, and Research Collections practices

Slides (Lectures 4 and 5 share same slides)


Dry Storeroom #1, Chapter 5 (pp 154-171).

Dry Storeroom #1, Multum in Parvo (pp 185-205).

Spectrum diagram and data requirements, from CollectionsLink UK


Spectrum 3.1. Note that this is long and very dry. I find it tilted to the arts and cultural heritage, and weak on the life sciences and physical sciences. Nevertheless, it is a useful reference.

Mon 6. Feb Lecture 6 Introducing our customer for the practical projects (UC Bot Garden)

  1. Museum and collections overview and history
  2. Larger context of migration
  3. Users, activities

(No slides)


UC Botanical Garden website. Read the Collections section, and try the plant browser.

UCBG (BSCIT) About the Data model

BNHM Collections statistics

Query Help page (provides some idea of how diverse schemas and usage are even within Natural History domain)

Wed 8. Feb Lecture 7 CollectionSpace model for configuration and extensibility, and deployment tools lab

  1. Architecture and function
  2. Shared semantics, domain and local extensions 
  3. IT Architecture and community dynamics
  4. Multi-tenancy and its implications
  5. Extensions, overlays, and replacements
  6. Communications and project workflows (using the wiki, IRC, email lists)



Successful Strategies for a Multi-tenant Architecture

HP Service Oriented Architecture White Paper. Through pg 13.

Real World SOA. pp 1-6 (you can scan the rest as you wish)


If you want to learn more about SOA, check out the series by Thomas Erl. They are very good.

Mon 13. Feb Lecture 8 Authorities - shared, local, models, uses, & Metadata mapping in practice: Background and principles


  1. Reference and search issues.
  2. Management and policy.
  3. Standards, common resources, ontology.



Metadata Mapping

  1. Business process analysis and UCD
  2. UI concepts and workflow, versus data models
  3. The data model, and the data model
  4. Active listening and critical analysis
  5. Mapping needs and desires to the possible
  6. Identify resources in CSpace wikis on project focus



Dry Storeroom #1,  Chapter 2 The Naming of Names (pp 31-72), House of the Muses (pp 304-307).

Wed 15. Feb Lecture 9 Botanical Garden Visit

  1. Discussion of background and tools, in the context of this museum
  2. Considering choices for the mapping project

(No slides)


Review materials from Monday's discussion:

Assignment 3 discussed, due Wed 22nd Feb. 

Mon 20. Feb No Class - holiday

Wed 22. Feb Lecture 10: Review of legacy system metadata models and authorities, for the chosen projects

Review of schemas, data samples, proposed procedures for attention

  1. Understanding the project-specific schemas
  2. Relationships to other resources in the model
  3. Managing scope for the project



Assignment 4 discussed, due Mon 27th Feb.

Mon 27. Feb Lecture 11: Discussion of proposed mappings, gaps, UX issues

  1. Review and discussion with Informatics staff
  2. Introduction of sample data sets for migration


Please also note the UCBG site on the CSpace wiki, where we will begin to gather our work.

Wed 29. Feb (Cancelled due to illness) BPA Session 1 (Metadata mapping in practice)

Mon 5. Mar Lecture 12: BPA Session 1 (Metadata mapping in practice)

Interview session with collection experts (probably the UCBG collections manager and IST Informatics staff). Students will conduct a BPA session with the domain experts, and must identify the current usage, requirements, and desires for the new system. Each team will concentrate on a single procedure (activity record) or authority. We will break into groups for the interviews and for review of the proposed mappings.

Assignment 5 discussed at end of class, due Wed 7. Mar

Wed 7. Mar Lecture 13: Review of BPA session, & ETL Intro

BPA review

  1. Presentations of Observations
  2. Open Questions
  3. Follow-up strategies
  4. Embracing ambiguity

ETL Intro

  1. Data warehousing activity
  2. Metadata migration as a discipline
  3. Principles and techniques of mapping


Mon 12. Mar Lecture 14: Extract, Transform, and Load (ETL) tools intro

  1. Intro to our tool of choice - Talend Open Studio
    1. Resources and tutorials
    2. Adding XML or XSD models
    3. Simple example map, mapping columns from CSV into XML schema
    4. String manipulation
  2. Advanced ETL (discussion only)
    1. Configuring a CSpace DB as a data source
    2. Merging and mapping sources
    3. Generating OAI-PMH from CSpace
    4. Limitations of the ETL tools, and alternate approaches
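The "simple example map" and "string manipulation" steps above can be sketched outside of Talend as well. The following is only an illustration of the idea in plain Python, not how Talend Open Studio works internally. The column names, the target element names, and the sample rows are all invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from legacy CSV column names to target XML element names.
COLUMN_MAP = {
    "ACC_NO": "accessionNumber",
    "TAXON": "taxon",
    "COLLECTOR": "collector",
}


def csv_to_xml(csv_text, column_map=COLUMN_MAP):
    """Map each CSV row to a <voucher> element, trimming stray whitespace."""
    root = ET.Element("vouchers")
    for row in csv.DictReader(io.StringIO(csv_text)):
        voucher = ET.SubElement(root, "voucher")
        for column, tag in column_map.items():
            value = (row.get(column) or "").strip()  # basic string clean-up
            if value:  # leave out empty fields rather than emit empty elements
                ET.SubElement(voucher, tag).text = value
    return root


# Invented sample data with typical blemishes: padding and a missing value.
sample = (
    "ACC_NO,TAXON,COLLECTOR\n"
    "2012.0001, Quercus agrifolia ,J. Smith\n"
    "2012.0002,Sequoia sempervirens,\n"
)

tree = csv_to_xml(sample)
xml_out = ET.tostring(tree, encoding="unicode")
```

An ETL tool adds the parts this sketch leaves out: visual wiring of components, schema import from XSD, error routing, and connectors for databases.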

Preparation (Must be done before class, as we will work with it in class):

  1. Install TOS (free). Do not go to the Talend download page. Instead go to the Softpedia pages for Talend TOS.
    1. Choose the version for your computer OS. Choose the "stable" version. Note that this is a pig, and can take a while (their servers were slow for me). The installation instructions (really simple) are embedded in section 2.2.1 of the User Manual (see #3 below).
    2. Mac users: launch the only Mac app among the files extracted from the zip file.
    3. You can experiment with starting it up, and getting started, but we'll cover it in the Lab on Monday.
  2. View the "Data Integration" demo. Some of the steps are done quickly and may not be clear - don't worry about this as much as the overall feel of the application, the way that they wire together components, etc.
  3. In the Talend User Guide:
    1. Read Appendix A. This describes the UI of the tool.
    2. Read chapters 1, and 2.1 through 2.4. Sections 2.5 and 2.6 are about configuration, and you can skip them for now. 
    3. Chapter 3 is about Business modeling, and you can also ignore that.
    4. Read Chapter 4, sections 4.1, 4.2, and 4.3 up through You can skim the other sections if you are curious.
    5. Chapter 6 will be very useful to have read, but may be slightly overwhelming before we do the lab. You might want to skim it once to see the concepts, and then review it once we have played with the tool.
    6. Read Chapter 7, sections 7.1, 7.5, and 7.8
    7. Appendix B has several tutorials. Read them if you're feeling ambitious ;-)

Assignment 6 begun in class, due Mon 19. Mar. Note changed scope!

Wed 14. Mar Lecture 15: Diving into data: Managing scope and complexity

  1. Deeper look at sample data and the schemas
  2. The implications of dirty, un-normalized data
  3. Noise detection and noise reduction
  4. Normalization as the goal - and the enemy
  5. Deduplication
  6. Ex post facto strategies (procrastination as a strategy)
  7. Michael and John for examples?
  8. Examples in the UCBG data
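To make the normalization and deduplication topics above concrete, here is a small sketch that groups name variants under a normalized key. The normalization rules and the sample names are assumptions chosen for illustration; they are not drawn from the UCBG data, and real data will need rules worked out with the collection experts.

```python
import re
import unicodedata


def normalize_name(raw):
    """Reduce a name string to a comparison key (not a display value)."""
    text = unicodedata.normalize("NFKC", raw)   # unify Unicode variants
    text = re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace
    return text.rstrip(".,;").casefold()        # drop trailing punctuation, ignore case


def deduplicate(values):
    """Group the original values by normalized key so variants can be reviewed."""
    groups = {}
    for value in values:
        groups.setdefault(normalize_name(value), []).append(value)
    return groups


# Invented sample values showing typical noise in hand-entered names.
names = ["Smith, J.", "smith,  j", " Smith, J. ", "Jones, A."]
groups = deduplicate(names)
```

Keeping the original values in each group, rather than overwriting them, leaves room for a person to reject a merge. That matters because aggressive normalization can fold together records that are genuinely different.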


Mon 19. Mar Lecture 16: UX Workshop 1

  1. How the Voucher UI fits into the CSpace context
  2. Overview of how UI is supported and built by the framework
  3. Brainstorming on the Voucher UI (collaborative, led by Leslie)

First of a two-day workshop to create wireframes, describe how the wireframes are bound to a data model (conceptually), and document this for review


Assignment 6 DUE. Note changed scope!

Wed 21. Mar Lecture 17: UX Workshop 2

  1. Wind up UX brainstorming
  2. How to formalize UX proposals for community review.
    1. Goals
    2. Structure
  3. Traditions in CollectionSpace (and what we may ignore)
    1. Capturing requirements
    2. Preparing hi-fi wireframes, sharing on wiki
    3. Using workflow diagrams
    4. User stories

Assignment 7 DUE.

(Mon 26. Mar, and Wed 28. Mar No Class Spring break)

Mon 2. Apr Lecture 18: Voucher workflows in context

  1. Considering the context of the voucher work
    1. What related workflows intersect with vouchers?
    2. How do we think about and ensure support for this context?
    3. How do we relate and document the old and new workflows?
  2. Considering context as we migrate data from the old system to the new one
    1. Linking voucher imports to authorities (Person, Organization, Location)
  3. What's our story on Voucher labels?
    1. Proposal for review: prepare the voucher label with the same schema we use for Vouchers.
    2. Proposal for review: generate the labels with a report.
    3. Preparing a story for customer review

Assignment 8 DUE.

Wed 4. Apr Lecture 19: Customer session to review mappings, data issues, authorities, etc.

  1. Present wireframes, data mappings, behavior aspects
  2. Identify issues and limitations
  3. Negotiate scope

Mon 9. Apr Lecture 20: Reporting Intro, examples from customer

  1. Enterprise Reporting, whys and wherefores, issues
  2. Reporting use-cases in Museums and Archives
  3. UCBG example reports


Additional Resources:

Preparation (do this before class meets):

Install the iReport authoring tool.

Assignment 9 Discussed - be prepared to demonstrate progress.

Wed 11. Apr Lecture 21: Reporting lab, and installing a report into CSpace

  1. BIRT, Jasper, Commercial tools
  2. Configuring Jasper for CollectionSpace
  3. Authoring a basic report
  4. Handling parameters in reports, passed and default values
  5. User model vs. Services model vs. DB model
  6. Denormalizing tables to produce a report
  7. String manipulation (refName as example, and gathering or truncating fields for title)
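The string manipulation item above can be sketched as follows. In a Jasper report this logic would normally live in a report expression or in the SQL query; the Python below only illustrates the idea. It assumes that a refName ends with the display name in single quotes after a closing parenthesis, and the sample refName, including its tenant and identifiers, is invented for this example.

```python
import re


def display_name(ref_name):
    """Return the quoted display name at the end of a refName, if present."""
    match = re.search(r"\)'(.*)'$", ref_name)
    # Fall back to the raw value when the string is not in the assumed form.
    return match.group(1) if match else ref_name


def truncate(text, limit=20):
    """Shorten text to at most `limit` characters, marking the cut with '...'."""
    if len(text) <= limit:
        return text
    return text[: limit - 3].rstrip() + "..."


# Hypothetical refName, in the form assumed above.
ref = "urn:cspace:museum.example.org:personauthorities:name(person):item:name(JaneDoe1234)'Jane Doe'"

collector = display_name(ref)
title = truncate("Quercus agrifolia var. oxyadenia", limit=20)
```

Here `collector` holds the human-readable name rather than the full identifier, and `title` is a shortened string suitable for a report heading.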


  • iReport Ultimate Guide Read:
    • Chapter 5, intro and sections 5.1 and 5.3
    • Chapter 6, up through section 6.1.2
    • Scan Ch 8, ignoring the technical bits
    • Intro to Ch 9 and 9.1
    • Ch. 11 through 11.3

Mon 16. Apr Lecture 22: Language and I18N issues + Lab

We will spend half the class in lecture, and half in Lab (as needed) to continue with the ETL and UI work for vouchering, and consider some of the L10N issues for the vouchering UI.

  1. Marking language
  2. Considering translations
  3. UI versus data language
  4. Cross-lingual and multi-lingual search, reporting, access
  5. Localization in CollectionSpace
    1. Changing labels and titles in UI
    2. Changing themes and styling
    3. Changing layout and organization

Assignment 11 First pass due.

Wed 18. Apr Lecture 23: Standards and models for metadata management (Spectrum, CDWA, OAI, CIDOC-CRM, & friends) + Lab

We will spend half the class in lecture, and half in Lab (as needed) to continue with the ETL and UI work for vouchering.

  1. Purpose and goals, reality and usage
  2. Examples and specifics

Mon 23. Apr Lecture 24: Metadata interchange and sharing (OAI-PMH, MARC/MODS, DiGiR, Dublin Core, Darwin Core, etc.) + Lab

We will spend half the class in lecture, and half in Lab (as needed) to continue with the ETL and UI work for vouchering.

  1. Modeling versus harvesting/sharing
  2. Museums vs. Archives (Mary Elings guest?)
  3. Domain specific standards
  4. UCBG examples
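As a small illustration of modeling versus sharing, the sketch below maps a local record onto simple Dublin Core elements inside an oai_dc container, the baseline metadata format used for OAI-PMH harvesting. The local field names, the choice of which Dublin Core element each one maps to, and the sample record are assumptions made for this example; a real crosswalk would be agreed with the collection staff.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Hypothetical crosswalk from local field names to Dublin Core elements.
FIELD_MAP = {
    "taxon": "title",
    "collector": "creator",
    "collected": "date",
    "accession": "identifier",
}


def to_oai_dc(record):
    """Build an oai_dc:dc element from the mapped, non-empty local fields."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for local_field, dc_term in FIELD_MAP.items():
        value = record.get(local_field)
        if value:
            ET.SubElement(root, f"{{{DC_NS}}}{dc_term}").text = value
    return root


# Invented sample record; "habitat" has no Dublin Core target in this map.
record = {
    "accession": "2012.0001",
    "taxon": "Quercus agrifolia",
    "collector": "J. Smith",
    "habitat": "oak woodland",
}

dc_record = to_oai_dc(record)
dc_xml = ET.tostring(dc_record, encoding="unicode")
```

Note what is lost: the habitat field has nowhere to go, and the distinction between a taxon and a generic title disappears. That loss is the trade-off between a rich local model and a lowest-common-denominator sharing format.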


To come...

Wed 25. Apr Lecture 25: Exposing collections + Lab

We will spend half the class in lecture, and half in Lab (as needed) to continue with the ETL and UI work for vouchering.

  1. Data access for research
  2. Public access for service
  3. Portals for discovery
  4. Varying points of view, and implications for data models, authorities, UI
  5. Examples and approaches:
  6. Community curation and feedback


Dry Storeroom #1, (pp 174-177).

Mon 30. Apr Lecture 26: Review week

Wed 2. May Lecture 27: Review week

Mon 7. May Exams week

Assignment 12 Final write up due.