|Description:||The field of information retrieval (IR) is concerned with the analysis, organization, storage, and retrieval of unstructured and semi-structured data. In this course, we will focus mostly on text. While IR systems are often associated with Web search engines (e.g., Google), IR applications also include digital library search, patent search, search for local businesses, and expert search, to name a few. Likewise, IR techniques (the underlying technology behind IR systems) are used to solve a wide range of problems, such as organizing documents into an ontology, recommending news stories to users, detecting spam, and predicting reading difficulty. This course will provide an overview of the theory, implementation, and evaluation of IR systems and IR techniques. In particular, we will explore how search engines work, how they "interpret" human language, what different users expect from them, how they are evaluated, why they sometimes fail, and how they might be improved in the future.|
|Prerequisites & Expectations:||
There are no prerequisites for this course. Information retrieval is the
study of computer-based solutions to a human problem. Thus, the
first half of the course will be system-focused, while the second half will be
user-focused. During the first half, you should expect to see some math (e.g.,
basic probability and statistics and some linear algebra). However, we will
focus on the concepts rather than the details.
Students will have an opportunity to explore their interests with an open-ended literature review.
|Time & Location:||M, W 9:30-10:45 am, Manning 304|
|Instructor:||Jaime Arguello (email, web)|
|Office Hours:||T, Th 9:30-10:30 am, Manning 305|
|Required Textbook:||Search Engines - Information Retrieval in Practice, W. B. Croft, D. Metzler, and T. Strohman. Addison-Wesley. 2009. Available at the bookstore.|
|Additional Resources:||Foundations of Statistical Natural Language Processing. C. Manning and H
Introduction to Information Retrieval. C. Manning, P. Raghavan and H. Schutze. 2008.
|Other Readings:||Selected papers and chapters from other books will sometimes be assigned for reading. These will be available online.|
|Course Policies:||Attendance, Participation, Collaboration, Plagiarism & Cheating, Late Policy|
|Grading:||30% homework (10% each)
15% midterm exam
15% final exam
30% literature review (5% proposal, 10% presentation, 15% paper)
|Grade Assignments:||Letter grades will be assigned using the following scale: H 95-100%, P 80-94%, L 60-79%, and F 0-59%. All homework, exams, and the literature review will be graded on a curve.|
Subject to change! The required textbook (Croft, Metzler, and Strohman) is
denoted as CMS below.