INLS 235

Day  12

4/3/2002

 

  1. One Minute Papers

Big Points

      No accepted definition of DL [I’ll know it when I see it]

      Evolution as design vs original design

      Importance of public collections

Questions

      Does a broad definition of DL devalue the term ‘library’?  [compare to IA]

      Could we put together a set of DL design tools?  [Greenstone does some of this; see below for wish lists]

      Is content management driving DL R&D today? [compare to early technology push; is content now king?]

      DLs for children?

      Are pointer services DLs?

      What are distinctions between digitized collections, reused objects, and born digital objects for DLs?

      Can a DL of federal materials be built? Especially one that interoperates? [we aim to work toward this for gov stats]

     

 

  2. DL Reviews
    1. Scott
    2. Erica
    3. Matt

 

  3. DL Architecture

 

The Infobus concept.  (Paepcke et al.)

            Consider the list of 10 processing services [any one of these is a big R&D effort in itself]

            Consider the translation services across systems (e.g., Dialog and Google) from the user's perspective

 

            Distributed resources

            Maintaining state

            Documents

            Deployment

            User traditions
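
A rough sketch of the translation-service idea, to make it concrete. This is not the InfoBus implementation: the CommonQuery type, the two proxy classes, and both target syntaxes are simplified assumptions invented for this example (the real Dialog and Google query languages are much richer). The point is only that the user writes one query and a per-system proxy rewrites it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonQuery:
    """Hypothetical system-neutral query: every term must match."""
    terms: tuple
    field: str = "any"  # "any" or "title"


class DialogStyleProxy:
    """Assumed, simplified command-style syntax (not the full Dialog grammar)."""

    def translate(self, query):
        suffix = "/TI" if query.field == "title" else ""
        return "SELECT " + " AND ".join(term + suffix for term in query.terms)


class WebStyleProxy:
    """Assumed, simplified web-search syntax."""

    def translate(self, query):
        prefix = "intitle:" if query.field == "title" else ""
        return " ".join(prefix + term for term in query.terms)


def broadcast(query, proxies):
    """Run one common query through every proxy; return the native strings."""
    return {name: proxy.translate(query) for name, proxy in proxies.items()}


# One query from the user, two native forms for two systems.
query = CommonQuery(terms=("digital", "library"), field="title")
native = broadcast(query, {"dialog": DialogStyleProxy(), "web": WebStyleProxy()})
for system, text in native.items():
    print(system, "->", text)
```

Even this toy version shows why translation is hard from the user's side: anything one system supports and the other lacks (fields, proximity, ranking) has to be dropped or approximated by the proxy.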

 

See http://www.omg.org/gettingstarted/corbafaq.htm for intro to CORBA (Common Object Request Broker Architecture)

 

The Open Video architecture (see PP)

 

Toward a digital librarian's toolkit

Michael Levi's BLS wish list for backend tools

            Guiding principles: don't release early, be correct, don't release late, release equitably

1. Hardware: automatic failure detection and switch-over (need cheap, easy-to-configure solutions)

2. Database: data replication across machines & backups, data loading schedules, query optimization (many concurrent users running complex queries)

3. Configuration management: testing tools; version control (system AND apps) including fixes/patches; cross-platform!; installation tools (all or nothing--finish all machines or back out), uninstall tools

4. Security: intrusion prevention; intrusion detection & analysis (how did they get in and what did they change?  Right now, 3-7 logs must be examined manually); safer defaults

5. Site analysis tools: log analysis; session tracking; site map creation; search analysis (e.g., parse queries)
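
To illustrate item 5, here is a minimal sketch of log analysis, session tracking, and query parsing in one pass. Assumptions, none of which come from the BLS list: the server writes Common Log Format, a session is a run of hits from one host with no gap longer than 30 minutes, and the search term arrives in a URL parameter named q. The sample log lines are made up.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlparse, parse_qs

# Common Log Format: host ident user [time] "METHOD url PROTO" status bytes
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LOG_RE.match(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "url": match.group("url"),
        "status": int(match.group("status")),
    }


def sessionize(hits, gap=timedelta(minutes=30)):
    """Group hits per host into sessions split wherever the gap is exceeded."""
    sessions = {}
    for hit in sorted(hits, key=lambda h: h["time"]):
        runs = sessions.setdefault(hit["host"], [])
        if runs and hit["time"] - runs[-1][-1]["time"] <= gap:
            runs[-1].append(hit)
        else:
            runs.append([hit])
    return sessions


def search_terms(hit, param="q"):
    """Return the decoded values of the (assumed) search parameter."""
    return parse_qs(urlparse(hit["url"]).query).get(param, [])


# Made-up sample data for illustration only.
SAMPLE = [
    '10.0.0.1 - - [03/Apr/2002:10:00:00 -0500] "GET /search?q=cpi+data HTTP/1.0" 200 512',
    '10.0.0.1 - - [03/Apr/2002:10:05:00 -0500] "GET /cpi/home.htm HTTP/1.0" 200 2048',
    '10.0.0.1 - - [03/Apr/2002:11:30:00 -0500] "GET /search?q=unemployment HTTP/1.0" 200 300',
]
hits = [h for h in map(parse_line, SAMPLE) if h is not None]
for host, runs in sessionize(hits).items():
    print(host, "sessions:", [len(run) for run in runs])
```

Identifying a user by host alone is crude (proxies and shared machines blur it), which is part of why session tracking is on the wish list at all.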

 

Komlodi, Marchionini, & Plaisant wish list

1.      Objects/items

CD tools: filtering, validating accuracy, authority, authentication

Loading, exporting

Digitization: scanning, OCR, keyframe extraction, imaging

Object naming and addressing

Redundancy checking

Storage/refreshing/migrating

File format helps (e.g., Unicode)

File helps: format (e.g., gif vs tiff), version number, item format (gif can be image of text or picture), item level (bib record, note, picture, etc.)
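
As one small example from this group, redundancy checking can be sketched with content hashes. This is an illustration under an assumption, not a tool from the wish list: two objects count as redundant only if their bytes are identical. The object names and contents below are made up.

```python
import hashlib
from collections import defaultdict


def find_duplicates(objects):
    """objects maps object name -> bytes. Return groups of names whose
    content is byte-for-byte identical, each group sorted by name."""
    by_digest = defaultdict(list)
    for name, data in objects.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


# Made-up collection: two copies of one file and one distinct file.
collection = {
    "map_001.gif": b"GIF89a-sample-bytes",
    "map_001_copy.gif": b"GIF89a-sample-bytes",
    "map_002.tif": b"II*\x00-other-bytes",
}
print(find_duplicates(collection))
```

A hash check like this misses the harder cases in the list above, such as the same picture stored as both gif and tiff, which would need format-aware comparison.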

2.      Working with objects and collections

Directory structure tools (e.g., IBM DL separates object server from metadata server); WebToc

Browsers for special types (image browser, page image browser)

Tools for special types (key frame extraction, speech to text, text to speech)

Document conversion: GIF converters, SGML to HTML, etc.

Indexing (text, multimedia)

Link metadata with primary data (multiple layer dbms)

3.      Metadata

Standards (e.g., Dublin Core)

Conversion (e.g., EAD to MARC, PostScript to PDF, RTF)

Self-describing objects
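
A minimal sketch of what a metadata tool might emit for the Dublin Core case. The element names and the namespace URI are the 15 simple Dublin Core elements; the wrapping record element, the function name, and the sample values are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of simple Dublin Core.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_dublin_core(fields):
    """fields maps a DC element name -> list of values (elements may repeat).
    Return an XML string; the <record> wrapper is a hypothetical container."""
    record = ET.Element("record")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(record, encoding="unicode")


# Made-up sample record.
print(to_dublin_core({
    "title": ["Sample page image"],
    "creator": ["Example, A."],
    "format": ["image/gif"],
}))
```

Storing a record like this alongside (or inside) the object is one simple reading of "self-describing objects"; conversion tools such as EAD to MARC are harder because the source and target schemes do not line up one-to-one.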

4.      Users

Needs assessment tools and procedures

User profile builders/manager

Logging tools, client side? Standard formats? Analysis tools

Reference services

            FAQ

            FAQ with updates

            Listserv scanners for local, community service

            Help/suggestions

            Tours/paths/guides

            Public scheduler

            Query parsers and forwarding schedules

            Referral tools

            User communication (online discussion, collab filtering, suggestions, shared ps)

5.  Management (backend)

editors (HTML, SGML, XML, etc.)

templates for style guides

style checkers

automatic platform simulators (browsers, settings, etc.)

item gathering and labeling tools

site mapping with alternative views: relationships such as function, in and out links, user behavior

version control (backups, new versions, auto what's new, old versions, archives, broken links, etc.)

link checker (broken, updated)

bug reporter (email, auto content analysis?)

move web sites across servers

log analysis (summary + sequential)

renaming pages, moving pages (auto update all related)

site reorg tools

alerts for errors

garbage collection storage routines

encryption/decryption

watermarking tools

authority control tools (names, dates)
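
To make one of these concrete, here is a minimal offline sketch of the link checker item. Assumptions for the example: the site is available as a mapping from absolute path to HTML text, only internal links are checked, and a link is "broken" when its target path is not in that mapping. The function and class names and the sample pages are made up; a real tool would also fetch external links and detect updated targets.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def broken_internal_links(site):
    """site maps absolute path -> HTML text.
    Return a sorted list of (page, missing_target) pairs."""
    broken = []
    for page, html in site.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page, then drop any #fragment.
            target, _fragment = urldefrag(urljoin(page, href))
            if target.startswith("/") and target not in site:
                broken.append((page, target))
    return sorted(broken)


# Made-up two-page site with one broken link on each page.
site = {
    "/index.html": '<a href="docs/intro.html">Intro</a> <a href="missing.html">Gone</a>',
    "/docs/intro.html": '<a href="../index.html#top">Home</a> <a href="old.html">Old</a>',
}
for page, target in broken_internal_links(site):
    print(page, "->", target)
```

The same page-to-links map is the starting point for several other items above: in and out links for site mapping, and finding every page that must change when a page is renamed or moved.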

 

 

  4. One-minute paper

      What was the main point you learned in class today?

      What is the main unanswered question you leave class with today?