Levels of Representation, cont.

(Complications & Countermeasures)

INLS 525: Managing Electronic Records

Week 4 (2/5)

Mini-Assignment 3:

Files on Your Computer


What were the main things you discovered from this exercise?

Recordkeeping Implications

RIM should "ensure:

Stephens, David O. "Introduction: Status and Trends." Records Management: Making the Transition from Paper to Electronic. Lenexa, KS: ARMA, 2007. p.1 (emphasis in original)

Pretend someone whose records are important to you has just allowed you to use WinDirStat / Disk Inventory X to analyze the contents of a computer that he/she uses.

Representation Information

Every digital object is concurrently:

Thibodeau, Kenneth. "Overview of Technological Approaches to Digital Preservation and Challenges in Coming Years." In The State of Digital Preservation: An International Perspective, 4-31: Council on Library and Information Resources, 2002.

Bitstreams mean nothing without context

A bitstream can represent any type of information.

Rothenberg, Jeff. "Ensuring the Longevity of Digital Information." Washington, DC: Council on Library and Information Resources, 1999.
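To make the point concrete, here is a minimal Python sketch that reads the same four bytes under several different interpretations. The byte values and the function name `interpret` are chosen only for illustration; they are not from the lecture.

```python
import struct

def interpret(bitstream: bytes) -> dict:
    """Return several readings of the same four bytes."""
    return {
        # Read as ASCII characters
        "ascii_text": bitstream.decode("ascii"),
        # Read as one unsigned integer, most significant byte first
        "uint_big_endian": int.from_bytes(bitstream, "big"),
        # Read as one unsigned integer, least significant byte first
        "uint_little_endian": int.from_bytes(bitstream, "little"),
        # Read as a 32-bit IEEE 754 floating-point number (big-endian)
        "float32": struct.unpack(">f", bitstream)[0],
        # Read as four separate 8-bit values, e.g. grey levels of four pixels
        "byte_values": tuple(bitstream),
    }

# Illustrative bitstream: four arbitrary bytes
readings = interpret(bytes([0x54, 0x65, 0x78, 0x74]))
for name, value in readings.items():
    print(f"{name}: {value!r}")
```

Nothing in the bytes themselves says which reading is the intended one. That choice has to come from somewhere else.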

Representation Information (RI)

Data object -Interpreted using-> Representation Information -Yields-> Information Object

Reference Model for an Open Archival Information System (OAIS). Consultative Committee for Space Data Systems, 2002, Figure 2-2.

RI can Reside in Many Places

Let's look inside some files...

A Web Page

Hex/ASCII view of a webpage


Hex/ASCII view of a PDF
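A view like the ones above can be produced with a few lines of Python. This is a rough sketch, not the tool used for the slides; the function name `hex_view` and the sample bytes are assumptions made for illustration.

```python
def hex_view(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values, and printable ASCII, one row per `width` bytes."""
    pad = width * 3 - 1  # room for a full row of "xx " groups without the trailing space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is; replace everything else with a dot
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)

# Illustrative input: the start of a very small web page
sample = b"<html>\n<head><title>Hi</title></head>\n</html>\n"
print(hex_view(sample))
```

To inspect a real file, read it in binary mode (for example `open(path, "rb").read(256)`) and pass the result to `hex_view`.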

Identifying File Types

Magic Numbers & File Signatures

Examples of file signatures: Word, PDF, JPEG, & ZIP
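Signature-based identification can be sketched as a lookup on the first few bytes of a file. The table below covers only the four formats named above, and the names `SIGNATURES` and `identify` are hypothetical. Real identification tools use much larger signature sets and also check offsets and internal structure.

```python
# Leading bytes ("magic numbers") for the four formats named on the slide
SIGNATURES = [
    (b"%PDF", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Word .doc)"),
    (b"PK\x03\x04", "ZIP (also .docx and other ZIP-based formats)"),
]

def identify(data: bytes) -> str:
    """Return a label for the first signature that matches the start of `data`."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"

print(identify(b"%PDF-1.4\n"))
print(identify(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(identify(b"plain text with no signature"))
```

Note that a .docx file matches the ZIP signature, not the legacy Word one, so a signature alone identifies only the outermost layer of a layered format.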

File Extensions

Layered Formats

Change a docx extension to zip; open in a zip viewer; open "document.xml".
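The same exercise can be done without renaming anything, since Python's standard `zipfile` module will open any ZIP container regardless of its extension. In a real .docx the main body is stored at `word/document.xml`. The sketch below builds a tiny stand-in archive in memory so that it runs on its own; that stand-in is an assumption for illustration and is not a valid Word document.

```python
import io
import zipfile

def list_members(container: bytes) -> list:
    """List the files stored inside a ZIP-based container."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return archive.namelist()

def read_document_xml(container: bytes) -> str:
    """Return the text of word/document.xml from a ZIP-based container."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return archive.read("word/document.xml").decode("utf-8")

def make_stand_in_docx() -> bytes:
    """Build a minimal ZIP with one member at the path a real .docx uses (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", "<document><body>Hello</body></document>")
    return buffer.getvalue()

container = make_stand_in_docx()
print(container[:4])               # the ZIP signature at the outer layer
print(list_members(container))
print(read_document_xml(container))
```

For a real file, replace `make_stand_in_docx()` with `open("example.docx", "rb").read()`, where the filename is whatever document you want to inspect.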

Finding RI Outside the File

Registries of representation information types (file formats):

Tools & frameworks...

...for monitoring, identifying & addressing obsolescence of representation information:



Those who forget the past are condemned to reload it.
— Nick Montfort, July 2000

All layers undergo change over time, at varying rates.


"A period of time long enough for there to be concern about the impacts of changing technologies, including support for new media and data formats, and of a changing user community, on the information being held in a repository." (OAIS, emphasis added)

Risks Associated with Obsolescence

Compression (e.g. Run Length Encoding)

Run-length compression example.

Rothenberg, Jeff. "Ensuring the Longevity of Digital Information." Washington, DC: Council on Library and Information Resources, 1999.
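Run-length encoding replaces each run of repeated symbols with a count and the symbol. The sketch below is one simple variant, written for illustration. It is not necessarily the exact scheme shown in Rothenberg's example, and the function names are hypothetical.

```python
def rle_encode(text: str) -> list:
    """Collapse runs of identical characters into (count, character) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][1] == ch:
            # Same character as the current run: extend the run by one
            runs[-1] = (runs[-1][0] + 1, ch)
        else:
            # A different character starts a new run
            runs.append((1, ch))
    return runs

def rle_decode(runs: list) -> str:
    """Expand (count, character) pairs back into the original string."""
    return "".join(ch * count for count, ch in runs)

encoded = rle_encode("WWWWBBBW")
print(encoded)
print(rle_decode(encoded))
```

The encoded form is only useful to someone who knows the scheme. The rule "count, then symbol" is itself representation information that must survive along with the data.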

3 Levels of Compression


Encryption at Various Levels

LOC's 7 Sustainability Factors

  1. Disclosure. Degree to which complete specifications and tools for validating technical integrity exist and are accessible to those creating and sustaining digital content.
  2. Adoption. Degree to which the format is already used by the primary creators, disseminators, or users of information resources.
  3. Transparency. Degree to which the digital representation is open to direct analysis with basic tools, such as human readability using a text-only editor.
  4. Self-documentation. Self-documenting digital objects contain basic descriptive, technical, and other administrative metadata.
  5. External Dependencies. Degree to which a particular format depends on particular hardware, operating system, or software for rendering or use and the predicted complexity of dealing with those dependencies in future technical environments.
  6. Impact of Patents. Degree to which the ability of archival institutions to sustain content in a format will be inhibited by patents.
  7. Technical Protection Mechanisms. Implementation of mechanisms such as encryption that prevent the preservation of content by a trusted repository.


"To describe or delineate the character or peculiar qualities of (a person or thing)"

Oxford English Dictionary, Second Edition, 1989

Characterizations = Surrogate Representations
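As a rough illustration of a characterization acting as a surrogate, the sketch below reduces a bitstream to a small record of properties. The choice of properties (size, SHA-256 checksum, leading bytes) and the function name `characterize` are assumptions for illustration. Which properties are worth recording is exactly the question the following slides raise.

```python
import hashlib

def characterize(data: bytes) -> dict:
    """Return a small surrogate record describing a bitstream."""
    return {
        "size_bytes": len(data),                       # how large the object is
        "sha256": hashlib.sha256(data).hexdigest(),    # fixity value for later comparison
        "leading_bytes_hex": data[:8].hex(),           # first bytes, useful for signature checks
    }

record = characterize(b"%PDF-1.4\nillustrative content only\n")
for name, value in record.items():
    print(f"{name}: {value}")
```

The record is far smaller than the object it describes and can be stored, searched, and compared without opening the object itself.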

What to Characterize?

Significant Properties

Whoever takes the decision that a particular digital object should be preserved will have to decide what properties are to be regarded as significant. The submission agreement could usefully specify a list of significant properties. (CEDARS)

Holdsworth, David, and Derek M. Sergeant. "A Blueprint for Representation Information in the OAIS Model." Paper presented at the IEEE Symposium on Mass Storage Systems, College Park, Maryland, USA, March 27-30, 2000.

"properties of digital objects that affect their quality, usability, rendering, and behaviour" (CAMiLEON)

Hedstrom, Margaret, and Christopher A. Lee. "Significant Properties of Digital Objects: Definitions, Applications, Implications." In Proceedings of the DLM-Forum 2002, Barcelona, 6-8 May 2002: @ccess and Preservation of Electronic Information: Best Practices and Solutions, 218-27. Luxembourg: Office for Official Publications of the European Communities, 2002.

Essence = "characteristics that must be preserved for the record to maintain its meaning over time"

Heslop, Helen, Simon Davis, and Andrew Wilson. "An Approach to the Preservation of Digital Records." National Archives of Australia, 2002.

Defining Significant Properties

Emulation vs. Migration (Traditionally)


"To reproduce the action of or behave like (a different type of computer) with the aid of hardware or software designed to effect this; to run (a program, etc., written for another type of computer) by this means."

Oxford English Dictionary, Second Edition, 1989


Not Just "Emulation vs. Migration"