1
User-Centered Evaluation of Digital Libraries
  • Gary Marchionini
  • University of North Carolina
  • march@ils.unc.edu


2
Evaluation Perspective
  • Need to choose
    • product testing
    • controlled comparisons
  • Need to assess
    • system performance
    • outcome research (e.g., social programs)
  • Need to understand
    • basic research
3
 
4
Existing Models
  • Library Effectiveness
    • circulation
    • collection size
    • reference encounters
    • satisfaction
  • Information Retrieval
    • recall/precision tradeoff
    • satisfaction
5
Library Effectiveness
  • Count stuff
    • Volumes, circulations, reference questions


  • Transaction log equivalents in DLs?
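
One possible transaction-log equivalent of these counts is sessions per user and session length. The sketch below is illustrative only: the `(user, timestamp)` event format, the `sessionize` helper, and the 30-minute inactivity cutoff are assumptions, not anything taken from the BLS or LC logs.

```python
from datetime import datetime, timedelta

def sessionize(events, gap=timedelta(minutes=30)):
    """Group (user, timestamp) events into sessions split by inactivity gaps.

    The 30-minute default gap is an assumed convention, not a source value.
    """
    sessions = []
    open_session = {}  # user -> index of that user's most recent session
    for user, ts in sorted(events, key=lambda e: (e[0], e[1])):
        idx = open_session.get(user)
        if idx is not None and ts - sessions[idx]["end"] <= gap:
            # Same visit: extend the session and count the request
            sessions[idx]["end"] = ts
            sessions[idx]["requests"] += 1
        else:
            # First event for this user, or the gap was too long: new session
            sessions.append({"user": user, "start": ts, "end": ts, "requests": 1})
            open_session[user] = len(sessions) - 1
    return sessions

# Hypothetical log entries, for illustration only
events = [
    ("a", datetime(2000, 10, 2, 10, 0)),
    ("a", datetime(2000, 10, 2, 10, 5)),
    ("b", datetime(2000, 10, 2, 10, 2)),
    ("a", datetime(2000, 10, 2, 11, 30)),
]
sessions = sessionize(events)
lengths = [s["end"] - s["start"] for s in sessions]
```

Session counts play the role of circulation counts, and `lengths` gives the distribution behind a length-of-session chart. Note that a single-request session has length zero under this definition, which is one reason such counts need context.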
6
Increased Usage at BLS
7
Decreased Rates of Growth at LC and BLS
8
Length of session at BLS October 2000
9
User Centered Library Evaluation
  • D’Elia & Walsh Library Quarterly article (physical libraries)
    • Satisfaction complexity (direct, indirect)
    • Results must be contextualized
  • See LibQual (www.libqual.org) for ARL/TAMU
  • See http://www.vuw.ac.nz/~agsmith/evaln/
  • See http://www.library.ucla.edu/libraries/college/help/critical/
  • MIT Press book Fall 03



10
IR Evaluation
  • Recall and Precision metrics
  • System performance (e.g., response time, broken links)
  • Satisfaction
  • Usability?
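
For reference, the two standard metrics can be computed per query as below. This is a minimal sketch using the usual set-based definitions (precision = relevant retrieved / retrieved; recall = relevant retrieved / relevant); the function name and the document IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query, set-based definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical document IDs, for illustration only
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall about 0.67). Both numbers depend on relevance judgments being available, which is part of why they say little about satisfaction or usability.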
11
Claims
  • Today’s IR systems are not comparable to paper-based systems.
    • bibliographic, full-text, and multimedia IR systems are not comparable
  • Complex systems are greater than the sum of their respective components.
    • systems that include human components are inherently complex
  • Information seeking is an interactive process.
    • different users, domains, and settings require distinct IR system capabilities
12
Retrieval as Matching Documents to Queries
13
Information-Seeking Process
14
Evaluate Systems
  • TREC ad hoc and routing evaluations
  • TREC interactive track
    • introduces the user as a component but not the problems, perceived needs, and actions
  • Hybrid solutions
    • human + automatic
    • statistical + natural language processing
15
Evaluate Actions: Medical Case
  • Does the patient recover?
  • Were good decisions made?
    • patient, physician, hospital, HMO views?
  • Difficult (impossible?) to disambiguate component effects
  • Task-oriented studies (e.g., Hersh’s medical student decisions)
16
Evaluate Interactions
  • Think aloud protocols
  • Observations, Transaction log analysis
  • Interviews, Stimulated recall
  • Error analysis
  • Time on task
  • Cost-benefit analysis
  • Questionnaires
  • Simulations
17
New, User-oriented Questions
  • Given many relevant documents, which can be most easily processed/understood?
  • What are the cost-benefits to different stakeholders?
  • What are the organizational/institutional changes due to a system?
  • What are the most useful surrogates (representations) for multimedia objects?
  • How can results best be integrated?
    • multiple retrieved sets
    • multiple evaluation efforts
18
Alternative Strategies
  • Consider the information seeker’s context
    • Cognitive accessibility (it does not matter how good the results are if the information cannot be easily understood)
    • Cost-benefit assessment (it does not matter how good the results are if there is no time to use them)
  • Study special populations (cell biologist vs. practicing physician)
  • Usability testing approach (iterative, impressionistic)
  • Systematic case studies
  • Epidemiology approach (start with outcomes and trace influences)
  • Develop an IR interaction model
19
The Perseus Case
  • Multiple stakeholders, methods, and components
  • A set of evaluation questions (learning, teaching, system, publishing)
  • Longitudinal effects
    • mechanical advantages
    • side effects
    • new types of learning and teaching
    • systemic change
20
Evaluating New Systems
  • “We may never know quantitatively the impact of these combined effects, partly because we don’t know what would have happened without the collaboratory.”


  • William Wulf, The National Collaboratory--A White Paper, 1989