Course in Information Retrieval
Description
Searching for information is an everyday process. People search to close gaps in their knowledge and to make progress on a task. Information systems that enable fast search over unstructured data are known as search engines. They help searchers find what they need as quickly as possible. In contrast to searching structured data stored in databases, searching unstructured data is usually characterized by vague queries and uncertain or incomplete knowledge. The role of search engines in the transfer of knowledge from producers to consumers of information is the subject of research in the field of information retrieval. The module teaches basic concepts and methods of information retrieval as well as the corresponding formal background. Essential contents are the architecture of search engines, acquisition and crawling, pre-processing and information extraction from unstructured text data, algorithms and data structures for indices and the processing of queries, retrieval models, learning-to-rank, retrieval axioms, and (online) evaluation methods.
General Information
Lecturer | Prof. Dr. Martin Potthast |
Teaching Assistants | Tim Hagen |
Workload | 6 Credits: 2 SWS Lecture, 2 SWS Exercises |
Prerequisites | Passing grades in Grundbereich A and B (Examination regulations § 7 (2) and (3)) |
Lecture | Tuesday, 10:00 - 14:00, Hörsaal 0446 |
Exercise | Tuesday, 10:00 - 14:00, Hörsaal 0446 |
Contact | via email or office hours |
Exam | TBD |
Organization
- Lectures take place in person and are also available as pre-recorded videos. The videos can be accessed via the lecture notes below or on the Webis YouTube channel. [playlist]
- The lab and its corresponding material consist of a project in which you develop your own domain-specific information retrieval system. Regular tutorial sessions will be held from TBD on.
- The examination will take place as either a written exam (90 min) or an oral exam (30 min).
- Communication
  - Discord: direct communication with the teaching staff; announcements will also be posted here. Please email the teaching staff for a link to join.
  - Lecture website: materials and organizational announcements will be uploaded to this website.
  - Email: important announcements will be sent out via email.
Lecture Notes
- Information Retrieval » Introduction » Organization, Literature
  Pre-recorded: [video 1]
- Information Retrieval » Introduction » Retrieval Problems
  Pre-recorded: [video 2] [video 3] [video 4] [video 5] [video 6]
- Information Retrieval » Introduction » Architecture of a Search Engine
  Pre-recorded: [video 7] [video 8] [video 9] [video 10] [video 11] [video 12] [video 13] [video 14]
- Information Retrieval » Evaluation » Laboratory Experiments
  Pre-recorded: [video 30]
- Information Retrieval » Evaluation » Effectiveness Measures
  Pre-recorded: [video 31] [video 32] [video 33]
- Information Retrieval » Evaluation » Training and Testing
  Pre-recorded: [video 34]
- Information Retrieval » Indexing » Indexing Basics
  Pre-recorded: [video 17]
- Information Retrieval » Indexing » Inverted Index
  Pre-recorded: [video 18] [video 19] [video 20] [video 21]
- Natural Language Processing » Words » Text Preprocessing [excerpt]
  Pre-recorded: [video 15]
- Natural Language Processing » Words » Morphological Analysis [excerpt]
  Pre-recorded: [video 16]
- Information Retrieval » Retrieval Models » Overview of Retrieval Models
  Pre-recorded: [video 22]
- Information Retrieval » Retrieval Models » Unigram Models 1
  Pre-recorded: [video 23]
- Machine Learning » Bayesian Learning » Probability Basics
  Pre-recorded: [video 24] [video 25]
- Machine Learning » Bayesian Learning » Bayes Classifier
  Pre-recorded: [video 26]
- Information Retrieval » Retrieval Models » Unigram Models 2
  Pre-recorded: [video 27] [video 28]
- Information Retrieval » Retrieval Models » Sequence Models
  Pre-recorded: [video 29]
Lab Project
The lab project consists of building and evaluating an information retrieval system for a specific domain. This entails data processing, implementing retrieval methods, and analyzing the resulting retrieval system.
Lab project material will be published here over the course of the semester.
Literature
- W.B. Croft, D. Metzler, T. Strohman. Search Engines: Information Retrieval in Practice.
- C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. [view]
Further Resources
- 01 — Introduction to Python [view] [download]
- 02 — Introduction to Jupyter [view] [download]
- 03 — How to commandline [MIT's missing-semester]