SILS, U. of North Carolina, Chapel Hill
INLS-210(40) – Library (and Digital Library) Operations and Effectiveness
Bob Losee
Manning 302
962-7150 (voice and voicemail)
962-8071 (fax)
losee@unc.edu
Spring 2005
Brief Description:
An introductory survey of models of library
operations and effectiveness. We will consider both traditional paper libraries
and digital libraries and the integration of the two. There will be significant
discussion of both non-mathematical and mathematical models of basic issues in
library performance: What is the optimal number of copies of a book? How many licenses should a digital library
have for a given title? Why do 20% of
the books get 80% of the use? How many
people should be working at the circulation desk? When should books be discarded?
The course will begin with a discussion of several mathematical methods
(including decision theory, systems of equations, regressions, queueing theory), with brief exercises using these
methods. After this, we will progress
through operations topics, with a major emphasis on selection and collections
(including circulation). There will be
a strong bias toward proactive, rather than reactive, management. This will be a small,
discussion-oriented class.
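As an illustration of the kind of model mentioned above, the circulation desk question can be sketched with a textbook M/M/c queue. This is only a sketch, not necessarily the method used in the course: it assumes random (Poisson) patron arrivals, exponentially distributed service times, and identical staff members. The function names and the example rates are hypothetical.

```python
import math


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving patron has to wait (Erlang C formula).

    Assumes an M/M/c queue: Poisson arrivals, exponential service times,
    and `servers` identical staff members.
    """
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / servers         # rho = a / c
    if utilization >= 1:
        return 1.0  # desk cannot keep up; every patron waits
    top = offered_load ** servers / math.factorial(servers)
    below = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    return top / ((1 - utilization) * below + top)


def mean_wait(arrival_rate, service_rate, servers):
    """Average time a patron spends in line, in the same time unit as the rates."""
    if arrival_rate >= servers * service_rate:
        return math.inf  # unstable queue: the line grows without bound
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)


def min_staff(arrival_rate, service_rate, max_mean_wait):
    """Smallest number of staff that keeps the average wait within the target."""
    if max_mean_wait <= 0:
        raise ValueError("max_mean_wait must be positive")
    # Start from the smallest staffing level that gives a stable queue.
    servers = math.floor(arrival_rate / service_rate) + 1
    while mean_wait(arrival_rate, service_rate, servers) > max_mean_wait:
        servers += 1
    return servers


if __name__ == "__main__":
    # Hypothetical numbers: 30 patrons arrive per hour, each staff member
    # serves 20 patrons per hour, and the target average wait is 2 minutes.
    print(min_staff(arrival_rate=30, service_rate=20, max_mean_wait=2 / 60))
```

With these made-up rates, two staff members would leave patrons waiting almost four minutes on average, while a third brings the average wait under half a minute. The useful point is how sharply waiting time falls once capacity comfortably exceeds demand.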
Prerequisites for Master’s and
Doctoral Students:
INLS 111, 151, and 153. The course is meant primarily for master’s
students who want to understand what makes libraries effective and are
comfortable learning mathematical techniques for developing performance models
(mathphobes should avoid this course). There is no math prerequisite, but a willingness
to learn is essential.
Doctoral
students are at UNC to learn to do research and benefit most from taking courses
taught by their research advisor and faculty with whom their advisor regularly
collaborates on research, as well as by conducting their own, publishable
research with their advisor. Doctoral
students should carefully consider whether, instead of taking this course (or
any other course), they might benefit more from conducting research with their
advisor or their advisor’s close collaborators.
Course WWW links:
http://LibraryOperations.com
(contains course schedule)
(if you forget, there is a link from my home
page)
Course Outline
Readings below are required except for those
preceded by an asterisk (*). Note that
students are never expected to absorb
all the material or understand all the mathematics in the
articles.
Introduction: Retrieval
and Filtering
Losee, Lecture Notes (available in
bookstore), Chapter 1.
Sparck-Jones and Willett, Readings in Information Retrieval ("RIR" below), Morgan Kaufmann Publishers, 1997. Chapter 1.
* Baeza-Yates and Ribeiro-Neto, Chapters 4, 10
* Case, Donald, Looking for Information: A
Survey of Research on Information Seeking, Needs, and Behavior, Academic Press, 2002.
* Sugar, “User-centered Perspectives of
Information Retrieval Research and Analysis Methods,” Annual Review of Information Science and Technology, 1995, 77-109.
Probability
Losee, Lecture Notes, Chapter 2.
Students may wish to consult one or more of
the "management science" books in the UNC libraries.
Indexing, Document, and Media Representation
Losee, Lecture Notes, Chapter 3
RIR, Chapter 2, articles by Joyce and Needham
(p. 15); Luhn (p. 21); Doyle (p. 25); Cleverdon (p. 47); Salton and Lesk (p.
60.)
* Iivonen and Sonnenwald, “From Translation to
Navigation of Different Discourses: a Model of Search Term Selection during the
Pre-online Stage of the Search Process,” Journal of the American Society for Information
Science, 49 (Apr. 1 '98), 312-26.
* Svenonius, "Access to Nonbook
Materials: The Limits of Subject Indexing for Visual and Aural Languages,"
Journal of the American Society for
Information Science, 45(8) Sept. 94,
600-606.
* Salton and McGill, Introduction to Modern Information Retrieval, McGraw-Hill, 1983,
Chapter 3.
* Salton, Automatic
Text Processing, Addison-Wesley, 1989, Chapter 9.
Retrieval Performance
RIR, Chapter 3, article by Saracevic (p. 143.)
RIR, Chapter 4, articles by Saracevic, Kantor,
Chamis, and Trivison (p. 175); Cooper (p. 191); Tague-Sutcliffe (p. 205); Keen
(p. 217.)
* Baeza-Yates and Ribeiro-Neto, Chapter 3.
Losee, Lecture Notes,
Chapter 4.
* Losee, Lecture Notes,
Chapter 6.
* Van Rijsbergen, Information Retrieval, 2nd ed., Butterworths, 1979, Chapter 7.
Similarity and Retrieval Decisions
RIR, Chapter 5, articles by Cooper (p.
265); Belkin, Oddy, and Brooks (p. 299.)
RIR, Chapter 6, articles
by Salton and Buckley (p. 355); Croft and Harper (p. 339.)
RIR, Chapter 7, article
by Tenopir and Cahn (p. 446.)
Losee, Lecture Notes, Chapter 5
* Van Rijsbergen,
Chapters 5 & 6.
Relationships between Terms, Natural Language
Processing
Losee, Lecture Notes, Chapters 8, 9, and 11.
RIR, Chapter 5, article by Turtle and Croft
(p. 287.)
RIR, Chapter 6, article by Porter (p. 313.)
RIR, Chapter 8, articles by Salton, Allan,
Buckley and Singhal (p. 478); Rau (p. 527); Johnson, Paice, Black, and Neal (p.
538.)
* Chowdhury, “Natural Language Processing,” in
Annual Review of Information Science and Technology, 2003.
Rule Based and Logical Systems
Losee, Lecture Notes, Chapter 10.
* Forsyth and Rada, Machine Learning: Applications in Expert Systems and Information
Retrieval, Wiley, 1986, Chapters 6-14.
Coding and Compression
* Salton, 1989, Chapters
5 & 6.
* Losee, Science of Information, 1990, Chapter 2.
Course Evaluation:
Quality of class participation (includes discussions of homework) 60%
Written projects, including a final project 40%
Late papers or homework will be penalized.
Mission:
The mission statements of the school (“to advance the profession and
practice of librarianship and information science, to prepare students for
careers in the field of information and library science, and to make
significant contributions to the study of information”) and of the
university (“The University is a research university. Fundamental to this designation
is a faculty actively involved in research, scholarship, and creative work,
whose teaching is transformed by discovery….[to] provide graduate and
professional programs of national distinction at the doctoral and other
advanced levels to future generations of research scholars, educators,
professionals, and informed citizens…”) should be read
and reread by students new to our school in order to understand our focus on information
science and library science and our emphasis on research and
understanding.
Each student is expected to conduct a
small research project and write up the project in a paper of 4 to 10 pages of
text, single spaced, to be handed in on paper.
You may use any widely accepted paper style (e.g., Chicago, APA,
MLA). The project should begin with a
question whose answer would be of value to the information retrieval community. The question is best phrased in the form “Is
X better than Y for Z?” rather than “How and why does Z work?” or “How does X impact Z?” There should be a brief discussion of the
literature addressing areas around the question, possibly citing 3 to 6 related
articles. The question should be clearly
stated in the paper and the paper should focus on answering this question by
drawing conclusions based primarily on the data collected and analyzed. The research should involve either the manual
or automated analysis of data to be gathered by the student (not from the
literature), and it may be either quantitative or qualitative. Studies must focus on more than one system
(or multiple distributed systems) or more than one user; the focus should be on
knowledge and techniques applicable to a wide range of systems and/or
users. Do not base your data analysis
primarily on published data. Implementing a system or software, or planning to
implement such a system, is not acceptable as the course project; you may wish
to perform a study to gain knowledge that might help outside the course to
develop a system, or you might use software you have developed to test out a
hypothesis. The paper should describe
and analyze the results, with an emphasis on interpretation (“why”) leading to
an understanding of the results. Insight
into the strengths and weaknesses of the different techniques or situations is
more important than raw performance improvement. The last paragraph of the paper should
contain specific recommendations for professional practice, as well as
summaries of the reasons for these recommendations.
Criteria for Leadership Proposals (and Class Participation) Evaluation
This is a required course for the SILS Master’s degree in
Information Science. You are here to
learn, not to worry. Anyone who puts in
a reasonable effort should expect to pass the course.
An H paper includes a question whose answer will improve
the operation of more than one information retrieval system. The paper should include strong reasons for
considering the problem important to ILS professionals, a brief literature
review, and a methods section, as well as a clear explanation or argument about
why these results occurred. The
question to be answered should be topically similar to those questions
addressed in journals such as JASIS and IP&M. An H course grade indicates clear
excellence and leadership in the course.
A P paper is a good solid piece of work, at the normal
graduate level, that may be less effective in explaining why the question’s
answer would be useful or in connecting it to central issues in the field; or
it may lack references to relevant literature; or it may lack an obvious
connection between the question and the methods to be used; or it may not
describe the question or the methodology precisely; or it may overlook some
minor methodological problems or fail to discuss or resolve them
satisfactorily. There may be little
explanation about why these particular results occurred. P is the most commonly awarded course
grade in graduate level courses such as this.
An L paper may fail to explain the utility of the research
or it may fail to connect the question to the methods to be used or the
different aspects of methods to each other.
Major methodological problems may have been overlooked. There may be little or no understanding
provided as to the cause of the results.
An F paper is lacking a required element (the question,
relevant literature, research site and/or sources and/or subjects, data
collection and analysis). Any plagiarism
or other violation of the Honor Code will also result in an F and the
likelihood of further action.
Each student will develop three informal IR
Leadership proposals. The Leadership
proposal areas and due dates (late proposals are penalized!) are:
Wed. Oct. 11: Individual users' information needs, expressions of needs as queries.
Wed. Nov. 8: Univariate feature matching and term independence, indexing.
Wed. Dec. 13: Multivariate systems, reasoning systems, natural language processing.
The first 2 proposals are due at the start of
class on the day indicated, and the last proposal is due at noon. Each proposal should be a total of 2 to 4
pages, single spaced. State clearly what
question you are asking, formulated as an English language question with
a question mark at the end. The proposal
should address the nature of the problem, a discussion of how results and
theory in the literature "support" the problem, methodology, the
kinds of results you expect to find, and the importance of your question and
approach. The focus of each proposal
needs to be on a question closely related to the topic for the date, with other
information retrieval system considerations being secondary. Grading will be based upon how well the
proposal addresses the question related to the topic, the usefulness of the
proposed research, its feasibility as a 3-credit student project or master's
paper, and the quality of the proposed methodology. Proposing a small project that leads to
definite knowledge and possible improvement of practice is always better than proposing a
larger project that merely amasses data but doesn’t lead to much understanding
or the improvement of practice.
For the first proposal, your question should
not discuss or evaluate a particular information system or information
resource. Propose a study of information
needs independent of how the need might be satisfied or how searching for an
answer takes place. You might want to
think about psychological studies of individuals, to learn how needs are
formulated, felt, or expressed, or you might wish to focus on a particular
functional group and their particularly different needs or expressions of
needs. If you start writing about how a
system serves people, stop.
For the second proposal, your question should
address matters associated with individual terms, either in the area of
indexing or retrieval. You can address
multiple term systems; however, the terms should be treated as independent of
each other (as they are in most of the retrieval models discussed up to this point in
the course).
For the third proposal, your question should
explicitly address systems using the relationships that exist between document
features and consider how this would impact retrieval performance. Methods of looking at these relationships
might include statistical dependencies, linguistic (syntactic or semantic)
information, or a logical system based on a thesaurus.
Warning: Don’t write on a topic. You should be writing to show how the
methodology will answer the question you provide. If your methodology won’t provide a
definitive (or at least solid) answer to the question, the question may be too
broad and might be narrowed further.
Doing a good job on a professionally relevant but narrow question is
always better than giving a much weaker answer to a broader question.
Honor Code:
Students should familiarize themselves with the University of North
Carolina at Chapel Hill Honor Code that is
described in University publications. It
should be noted that in this course, students are expected to receive (and
provide) some assistance regarding the use of hardware and software in the
laboratories and general problem solving techniques for homework
assignments. Students should NOT receive
(or provide) major creative assistance or continuing minor support for
projects.
Plagiarism:
Student assignments containing more than 5
consecutive words that the instructor feels were taken from another source
without proper attribution (that is, without quotation marks and citations)
will definitely be referred to the appropriate administrative authorities who address
issues of academic integrity (e.g., the Honor Court). I assume that all students are equally
likely to be honest and will put an equal amount of effort into considering the
possibility of plagiarism for each student’s paper.
Classroom Behavior:
Separate from the Honor Code but related to respect for classmates is
classroom behavior, which will be a factor in your class participation
grade. Students are expected to behave
in a professional manner in class.
Students in class are expected to focus on classroom materials. Students are expected to avoid
student-to-student conversations during class.
Use of laptop computers should be limited to taking notes for class
or for using class related materials.
Students who appear to be involved in non-class related activities
during class time will be graded as not participating in class. Similarly, materials being read should be
limited to those appropriate for the classroom lecture or discussion. Cellular telephones and computers should have
speakers or other audio devices muted before class begins so
as to not disturb others.