Course in Machine Learning for Language Technologies
Description
Given a task and a measure of success, a computer program (and thus a machine) learns if it solves the task better with increasing experience. In this module, students get to know machine learning for language technologies as a targeted search in a space of potential hypotheses. A hypothesis is a function, or its parameters, that describes a relationship to be learned between predictor and response variables. Students should gain a broad overview of the learning paradigms relevant to language technologies and understand the fundamental concepts and theories of each. They trace the development of modern language technologies up to language models, try out in practice what they have learned, and evaluate the success achieved.
General Information
Lecturer | Prof. Dr. Martin Potthast |
Lab Advisors | Niklas Deckers |
Workload | 2 SWS Lecture, 2 SWS Lab |
Language | Materials in English; lecture and lab in German; exam in German (but writing answers in English is possible). |
Requirements | See below: Requirements. |
Lecture | Thursday, 10:15-11:45, Hörsaal 0446 (Wilhelmshöher Allee 73), starting 24.04.2025. |
Lab | Thursday, 14:15-15:45, Hörsaal 1114 (Wilhelmshöher Allee 71), starting 24.04.2025. |
Exam | Written exam at the end of the semester. |
Organization
- The lecture will take place in person. It will also be recorded and uploaded afterwards.
- In addition to the lecture, you will be provided with both theoretical and practical exercises. We will not collect or grade these exercises, but provide you with solutions and helpful tutorial videos (released at the same time). Nevertheless, you will still need to solve the exercises on your own/with a partner. You are responsible for your own learning success.
- We expect you to have studied the lecture, solved the exercises and reviewed the solutions before the corresponding lab session takes place. The lab sessions will be done in a flipped classroom format. This only works if you are well prepared. Please bring a laptop, tablet or smartphone to participate. The lab sessions will not be streamed or recorded.
- Please also talk to your fellow students to discuss the course topics, the exercises, and solution approaches.
Exam
- There will be a take-home mock exam that we are planning to release on 17.07.2025.
- It will be discussed in the lecture slot on 24.07.2025.
- The exam will take place on 12.08.2025, starting at 10:15, in Room 0425 (Wilhelmshöher Allee).
- The exam will be in German, but you are also allowed to write your answers in English.
- Please remember to bring your passport or national ID card (Personalausweis) and your student ID.
- You are allowed to bring one handwritten sheet (DIN A4, one side only) with useful information into the exam. You must write it by hand yourself. The sheet will be collected together with your exam paper at the end of the exam.
- You are also allowed to bring a non-programmable calculator.
- If needed (international students), you may bring a dictionary.
Requirements
Prerequisites for this course are the basic modules on algorithms and data structures, theoretical computer science, and mathematics. We will use Python as the programming language. It might be helpful to refresh the following topics, as they will be required in the course. This list might be incomplete.
- Mathematical set notation
- Boolean formulas
- Matrix multiplication and transposition, matrix-vector multiplication
- Partial derivatives, differentiation rules, Taylor's formula
- Straight line equation: Formulating the equation, drawing and reading a plot; quadratic functions
- Laws of exponents, laws of logarithms
- Probabilities: Product rule, conditional probabilities, independence, total probability, Bayes' theorem
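As a quick self-check for the probability and matrix items above, here is a minimal Python sketch. It is not part of the official course material: the function names `posterior`, `mat_vec`, and `transpose` as well as the spam numbers are made up for illustration. It computes a posterior with Bayes' theorem, using the law of total probability for the denominator, and performs a matrix-vector multiplication and a transposition on plain lists.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_given_not):
    """Return P(A | B) using Bayes' theorem.

    prior                = P(A)
    likelihood           = P(B | A)
    likelihood_given_not = P(B | not A)
    """
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical numbers: 1% of e-mails are spam (A); a keyword (B) occurs
# in 90% of spam e-mails and in 5% of non-spam e-mails.
p_spam_given_keyword = posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(p_spam_given_keyword)                  # 2/13
print(mat_vec([[1, 2], [3, 4]], [5, 6]))     # [17, 39]
print(transpose([[1, 2, 3], [4, 5, 6]]))     # [[1, 4], [2, 5], [3, 6]]
```

Exact fractions are used here so the result can be checked by hand; in the practical exercises, NumPy arrays would be the more natural choice for the matrix operations.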
Lecture Notes
- Machine Learning » Introduction » Organization, Literature [video]
- Machine Learning » Introduction » Learning Tasks [video]
- Machine Learning » Introduction » Elements of Machine Learning [video]
- Machine Learning » Introduction » Syntax & Model Overview
- Machine Learning » Machine Learning Basics » Concept Learning [video from 2023 (Niklas)] [video from 2022]
- Machine Learning » Machine Learning Basics » From Regression to Classification [video]
- Machine Learning » Machine Learning Basics » Evaluating Effectiveness [video part 1] [video part 2]
- Machine Learning » Linear Models » Logistic Regression [video part 1] [video part 2]
- Machine Learning » Linear Models » Overfitting and Regularization [video]
- Machine Learning » Linear Models » Gradient Descent [video]
- Machine Learning » Bayesian Learning » Probability Basics [video]
- Machine Learning » Bayesian Learning » Bayes Classifier [video]
- Machine Learning » Bayesian Learning » Frequentist versus Subjectivist
- Machine Learning » Decision Trees » Decision Trees Basics [video]
- Machine Learning » Decision Trees » Impurity Functions [video part 1] [video part 2]
- Machine Learning » Decision Trees » Decision Trees Algorithms [video]
- Machine Learning » Decision Trees » Decision Trees Pruning [video]
- Machine Learning » Neural Networks » Perceptron Learning [video]
- Machine Learning » Neural Networks » Multilayer Perceptron [video part 1] [video part 2]
- Machine Learning » Neural Networks » Advanced MLPs [video]
Logbook
- 24.04.2025: Until the end of Introduction > Elements of Machine Learning
- 08.05.2025: Until the end of Machine Learning Basics > Concept Learning
Lab
Topic 1: Intro, Python and Maths Basics
- Relevant lectures: Until Introduction > Learning Tasks
- Homework (must be solved before your lab session on 08.05.2025):
  - Exercise sheet 1 [solution - please review before the lab session] [explanatory videos: ex. 2 | ex. 3c | ex. 4b | ex. 5a | ex. 5b | ex. 5c | ex. 5d]
  - Practical exercise 1 [setup | video | requirements.txt] [python (ipynb)] [jupyter (ipynb)] [numpy (ipynb) | video]
- Lab exercises (to be solved by you during your lab sessions on 08.05.2025 and 15.05.2025):
  - Lab sheet 1 [solution - please only view after the lab session]
  - PINGO - please only view after the lab session
Topic 2: Concept Learning & Evaluation
- Material will be published on: 15.05.2025.
- Relevant lectures: Until Machine Learning Basics > Evaluating Effectiveness
Topic 3: Linear Models
- Material will be published on: 05.06.2025.
- Relevant lectures: Until Linear Models > Overfitting and Regularization
Topic 4: Bayesian Classification
- Material will be published on: 12.06.2025.
- Relevant lectures: Until Bayesian Learning > Bayes Classifier
Topic 5: Tree-Based Classification
- Material will be published on: 03.07.2025.
- Relevant lectures: Until Decision Trees > Pruning
Topic 6: Neural Networks
- Material will be published on: 10.07.2025.
- Relevant lectures: Until Neural Networks > Multilayer Perceptron